Hugging Face Transformers provides pre-trained models for common NLP tasks, so most applications do not need to train a model from scratch.

Installation

  pip install transformers torch sentencepiece datasets accelerate
  

Sentiment Analysis — Zero Setup

from transformers import pipeline

# With no model specified, the pipeline downloads a default sentiment model.
classifier = pipeline("sentiment-analysis")

texts = [
    "I love this product!",
    "Terrible experience, would not recommend.",
    "It was okay, nothing special.",
]
results = classifier(texts)

# Each result is a dict with a "label" and a confidence "score".
for text, result in zip(texts, results):
    print(f"{text!r} -> {result['label']}: {result['score']:.3f}")
  

Text Classification with Custom Model

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_name = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

def classify(text):
    """Return (label, confidence) for a single string."""
    # Truncate long inputs to at most 512 tokens.
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():  # inference only, so no gradients are needed
        outputs = model(**inputs)
    # Convert logits of shape (1, num_labels) into probabilities.
    probs = torch.softmax(outputs.logits, dim=-1)
    label = model.config.id2label[probs.argmax().item()]
    confidence = probs.max().item()
    return label, confidence

label, conf = classify("This movie was absolutely fantastic!")
print(f"{label} ({conf:.2f})")
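
The classify function above turns raw logits into a label and a confidence. As a minimal sketch of that step without any model download, the same arithmetic can be written in plain Python. The logit values here are made up for illustration, and the id2label mapping is written by hand rather than read from model.config.

```python
import math

def softmax(logits):
    """Convert a list of raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def decode(logits, id2label):
    """Return (label, confidence) for the highest-probability class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]

# Illustrative values only: hand-written logits and label mapping.
id2label = {0: "NEGATIVE", 1: "POSITIVE"}
label, confidence = decode([-2.0, 3.0], id2label)
print(f"{label} ({confidence:.2f})")  # prints "POSITIVE (0.99)"
```

The confidence is simply the largest softmax probability, so it describes how strongly the model prefers one class over the others and is not a calibrated measure of correctness.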
  

Named Entity Recognition (NER)

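from transformers import pipeline

# aggregation_strategy="simple" merges sub-word tokens into whole entities
# (it replaces the deprecated grouped_entities=True).
ner = pipeline("ner", aggregation_strategy="simple")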

text = "Apple Inc. was founded by Steve Jobs in Cupertino, California."
entities = ner(text)

for entity in entities:
    print(f"{entity['word']}: {entity['entity_group']} ({entity['score']:.2f})")
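
The NER pipeline returns a flat list of dicts, which is often easier to use once grouped by entity type. The sketch below shows one way to do that. The group_by_type helper and its min_score threshold are hypothetical, and the sample records are hand-written in the shape the pipeline returns, not real model output.

```python
from collections import defaultdict

def group_by_type(entities, min_score=0.0):
    """Group entity words by entity_group, skipping low-confidence hits."""
    grouped = defaultdict(list)
    for entity in entities:
        if entity["score"] >= min_score:
            grouped[entity["entity_group"]].append(entity["word"])
    return dict(grouped)

# Hand-written sample records (illustrative scores, not model output).
sample = [
    {"entity_group": "ORG", "word": "Apple Inc", "score": 0.99},
    {"entity_group": "PER", "word": "Steve Jobs", "score": 0.98},
    {"entity_group": "LOC", "word": "Cupertino", "score": 0.97},
    {"entity_group": "LOC", "word": "California", "score": 0.60},
]

print(group_by_type(sample, min_score=0.9))
```

With a real pipeline, the list returned by ner(text) can be passed in place of sample.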
  

Text Generation

from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

prompt = "Python is a programming language that"
output = generator(prompt, max_length=50, num_return_sequences=1)
print(output[0]["generated_text"])
  

Question Answering

from transformers import pipeline

qa = pipeline("question-answering")

context = """
Python was created by Guido van Rossum and first released in 1991.
It emphasizes code readability and supports multiple programming paradigms.
"""

result = qa(question="Who created Python?", context=context)
print(f"Answer: {result['answer']} (confidence: {result['score']:.2f})")
  

Fine-Tuning on Custom Data

For domain-specific tasks, fine-tune a pre-trained model:

from transformers import TrainingArguments, Trainer
from datasets import load_dataset

dataset = load_dataset("imdb")  # movie reviews

training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=3,
    per_device_train_batch_size=16,
    eval_strategy="epoch",  # named evaluation_strategy in older transformers releases
    logging_steps=100,
)

# Define model, tokenizer, data collator, then:
# trainer = Trainer(model=model, args=training_args, train_dataset=..., eval_dataset=...)
# trainer.train()
  

See Hugging Face docs for full fine-tuning tutorials.
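
With eval_strategy="epoch", the Trainer runs evaluation after each epoch, but it reports only the loss unless it is given a metric function. A minimal sketch of an accuracy metric follows, assuming a classification model whose predictions are logits. The name compute_metrics matches the Trainer argument it would be passed to, and the sample arrays are illustrative.

```python
import numpy as np

def compute_metrics(eval_pred):
    """Return accuracy from a (logits, labels) pair."""
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)  # highest-scoring class per row
    return {"accuracy": float((predictions == labels).mean())}

# Illustrative check with hand-written logits for three examples.
sample_logits = np.array([[0.1, 0.9], [0.8, 0.2], [0.3, 0.7]])
sample_labels = np.array([1, 0, 0])
print(compute_metrics((sample_logits, sample_labels)))
```

The function would be supplied as Trainer(..., compute_metrics=compute_metrics), and the returned values then appear in the evaluation logs.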

Model Hub

Browse 500,000+ models at huggingface.co/models:

Task             Example Model
Sentiment        distilbert-base-uncased-finetuned-sst-2-english
Translation      Helsinki-NLP/opus-mt-en-fr
Summarization    facebook/bart-large-cnn
NER              dslim/bert-base-NER
Code generation  bigcode/starcoder2-7b

Production Tips

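  1. Cache models locally: the first download is slow, and later loads reuse the local cache.
  2. Use a GPU when available: pass device=0 to pipeline().
  3. Batch inputs for throughput: pass a list of texts and set batch_size.
  4. Set max_length (with truncation=True) to bound memory usage.
  5. Consider distilled models such as DistilBERT for faster inference.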
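
The batching tip can be sketched as a small helper that feeds fixed-size chunks to any predictor. The names batched and run_in_batches are hypothetical, and the stand-in predictor only exists so that the sketch runs without downloading a model.

```python
from typing import Callable, Iterator, List, Sequence

def batched(items: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])

def run_in_batches(predict: Callable[[List[str]], list],
                   texts: Sequence[str],
                   batch_size: int = 32) -> list:
    """Call predict on each chunk and collect the results in input order."""
    results = []
    for batch in batched(texts, batch_size):
        results.extend(predict(batch))
    return results

def text_lengths(batch: List[str]) -> List[int]:
    """Stand-in predictor: returns the length of each text."""
    return [len(text) for text in batch]

texts = ["a", "bb", "ccc", "dddd", "eeeee"]
print(list(batched(texts, 2)))
print(run_in_batches(text_lengths, texts, batch_size=2))
```

A pipeline object can be passed as predict, since it accepts a list of strings and returns one result per input. Pipelines also accept a batch_size argument directly, which is usually the simpler option; a helper like this is mainly useful when a batch size needs to be controlled outside the pipeline call.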

Hugging Face has made NLP broadly accessible: tasks that once required a research team now take roughly five lines of Python.