Active Learning
Iteratively selects the most informative samples for labeling to maximize model improvement.
Intent & Description
📋 Context
Labeling all data is expensive. Active learning asks humans to label only the samples the model is most uncertain about, or that are otherwise most informative, which cuts labeling cost while keeping model performance high.
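One way to make "uncertain" concrete is to score each sample from the model's predicted class probabilities. The sketch below shows three common scores (least confidence, margin, and entropy). The function names and the `pool_probs` values are illustrative assumptions, not part of any particular library.

```python
import math

def least_confidence(probs):
    """1 minus the top class probability; higher means more uncertain."""
    return 1.0 - max(probs)

def margin(probs):
    """Gap between the two most likely classes; smaller means more uncertain."""
    top, second = sorted(probs, reverse=True)[:2]
    return top - second

def entropy(probs):
    """Shannon entropy in nats; higher means more uncertain."""
    return -sum(p * math.log(p) for p in probs if p > 0)

# Hypothetical predicted class probabilities for three unlabeled samples.
pool_probs = {
    "sample_a": [0.98, 0.01, 0.01],  # confident prediction
    "sample_b": [0.50, 0.45, 0.05],  # two classes compete
    "sample_c": [0.34, 0.33, 0.33],  # close to uniform
}

# Rank samples from most to least uncertain by entropy.
ranked = sorted(pool_probs, key=lambda name: entropy(pool_probs[name]), reverse=True)
```

Here `ranked` puts the near-uniform sample first, so it would be sent for labeling before the one the model already predicts confidently.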
Real-world Use Case
ML projects with a limited labeling budget, where the goal is the best possible model from as few labeled samples as possible.
Advantages
- Reduced labeling cost
- Faster model improvement per labeled sample
- Focuses on informative samples
- Efficient use of annotator time
Disadvantages
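- Choosing and tuning a selection strategy adds complexity
- Computational overhead from rescoring the unlabeled pool each round
- May miss rare classes
- The label, retrain, and reselect loop is harder to implement than one-off training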
Implementation Example
# Active Learning Pattern
import numpy as np

class ActiveLearner:
    def __init__(self, model, strategy):
        self.model = model
        self.strategy = strategy

    def select_samples(self, unlabeled_pool, n_samples):
        # Score each unlabeled sample; a higher score means more uncertain
        scores = self.strategy.score(unlabeled_pool, self.model)
        # Select the n_samples most uncertain samples
        selected_indices = np.argsort(scores)[-n_samples:]
        return unlabeled_pool[selected_indices]

    def update_model(self, features, labels):
        # Retrain on the labeled samples (features plus their labels)
        self.model.fit(features, labels)

# Use uncertainty sampling strategy
# (model and UncertaintySampling are assumed to be defined elsewhere)
learner = ActiveLearner(model, UncertaintySampling())
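The learner above leaves the model, the scoring strategy, and the labeling step undefined. As a minimal, self-contained sketch of how the full loop could fit together, the example below uses only the standard library. `CentroidModel`, `oracle`, and `active_learning_loop` are hypothetical names invented for illustration: the model is a toy 1-D nearest-centroid classifier, and `oracle` stands in for the human annotator.

```python
import math

class CentroidModel:
    """Toy 1-D classifier with one centroid per class (illustrative only)."""

    def fit(self, xs, ys):
        self.centroids = {}
        for label in set(ys):
            members = [x for x, y in zip(xs, ys) if y == label]
            self.centroids[label] = sum(members) / len(members)
        return self

    def predict_proba(self, x):
        # Softmax over negative distances to each class centroid.
        labels = sorted(self.centroids)
        weights = [math.exp(-abs(x - self.centroids[l])) for l in labels]
        total = sum(weights)
        return [w / total for w in weights]

    def predict(self, x):
        labels = sorted(self.centroids)
        probs = self.predict_proba(x)
        return labels[probs.index(max(probs))]


def oracle(x):
    """Stand-in for the human annotator: the true boundary is x = 5."""
    return int(x >= 5.0)


def active_learning_loop(pool, seed, rounds, batch_size):
    # Start from a small labeled seed set; everything else is unlabeled.
    labeled_x = list(seed)
    labeled_y = [oracle(x) for x in seed]
    pool = [x for x in pool if x not in seed]
    model = CentroidModel().fit(labeled_x, labeled_y)
    for _ in range(rounds):
        if not pool:
            break
        # Least-confidence score: 1 - max class probability, most uncertain first.
        pool.sort(key=lambda x: 1.0 - max(model.predict_proba(x)), reverse=True)
        batch, pool = pool[:batch_size], pool[batch_size:]
        # "Label" the selected batch, then retrain on all labeled data.
        labeled_x += batch
        labeled_y += [oracle(x) for x in batch]
        model.fit(labeled_x, labeled_y)
    return model, labeled_x


pool = [float(i) for i in range(11)]  # 0.0 through 10.0
model, queried = active_learning_loop(pool, seed=[0.0, 10.0], rounds=2, batch_size=2)
```

In this toy setup the queried points cluster around the class boundary rather than being spread over the whole pool, which is the behavior uncertainty sampling is meant to produce. A real project would swap in an actual model and a real annotation step.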