Project

Completed

Prompt Classifier

A Python application that trains a SetFit model to classify LLM prompt complexity and recommend the appropriate model tier.

Tech Stack

Python, SetFit, Hugging Face Transformers, scikit-learn

Overview

Prompt Classifier tackles a practical problem in LLM applications: not every prompt needs the most powerful model. Simple queries can be handled by faster, cheaper models such as Haiku, while complex reasoning tasks benefit from Opus. This tool automates that routing decision.

The application trains a SetFit model on labelled examples of prompts across different complexity levels. SetFit is a framework for few-shot text classification, and it suits this task because it achieves strong performance with limited training data, which matters when curating quality examples is time-consuming.

Once trained, the model predicts the complexity of an incoming prompt and recommends an appropriate model tier. This enables routing in production systems that optimises for both cost and quality by matching prompt difficulty to model capability.
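
The training step described above can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the label names (`simple`, `moderate`, `complex`), the base encoder, the hyperparameters and the helper names are all assumptions. The `setfit` and `datasets` imports are deferred into the training function so that the data-preparation helper can run without them.

```python
from typing import Dict, List, Tuple

# Hypothetical complexity labels; the project's real labelling scheme is not stated.
LABELS = ["simple", "moderate", "complex"]


def to_training_columns(examples: List[Tuple[str, str]]) -> Dict[str, list]:
    """Convert (prompt, label_name) pairs into "text" and integer "label" columns."""
    return {
        "text": [prompt for prompt, _ in examples],
        "label": [LABELS.index(name) for _, name in examples],
    }


def train_classifier(examples: List[Tuple[str, str]]):
    """Fine-tune a SetFit model on a small labelled set and return it."""
    # Imported lazily: both packages are heavy and download a base model on first use.
    from datasets import Dataset
    from setfit import SetFitModel, Trainer, TrainingArguments

    train_dataset = Dataset.from_dict(to_training_columns(examples))
    model = SetFitModel.from_pretrained(
        "sentence-transformers/paraphrase-mpnet-base-v2",  # assumed base encoder
        labels=LABELS,
    )
    args = TrainingArguments(batch_size=16, num_epochs=1)  # assumed settings
    trainer = Trainer(model=model, args=args, train_dataset=train_dataset)
    trainer.train()
    return model


# A tiny illustrative dataset; a real run would use more examples per label.
examples = [
    ("What is the capital of France?", "simple"),
    ("Summarise this paragraph in two sentences.", "moderate"),
    ("Design a fault-tolerant job scheduler and justify the trade-offs.", "complex"),
]
columns = to_training_columns(examples)
```

Calling `train_classifier(examples)` would return a model whose `predict` method accepts a list of prompt strings.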

Features

  • SetFit model training for few-shot classification
  • Prompt complexity prediction
  • Model tier recommendation (Haiku/Sonnet/Opus)
  • Efficient inference for production use
  • Training pipeline with labelled examples
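
The tier recommendation listed above could be as simple as a lookup from predicted complexity to model tier. The sketch below assumes three complexity labels that map one-to-one onto Haiku, Sonnet and Opus; the label names and the `recommend_tier` function are hypothetical.

```python
from typing import Dict

# Assumed one-to-one mapping from complexity label to model tier.
TIER_BY_COMPLEXITY: Dict[str, str] = {
    "simple": "Haiku",
    "moderate": "Sonnet",
    "complex": "Opus",
}


def recommend_tier(complexity: str) -> str:
    """Return the model tier for a predicted complexity label."""
    try:
        return TIER_BY_COMPLEXITY[complexity]
    except KeyError:
        # Fail loudly on labels the mapping does not know about.
        raise ValueError(f"Unknown complexity label: {complexity!r}") from None


recommended = recommend_tier("moderate")
```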

Challenges

Defining what makes a prompt "complex" is inherently subjective. Building a consistent labelling scheme and curating training data that captured meaningful distinctions required careful thought. Balancing the categories to avoid bias toward simple prompts was also important.
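
One way to check and correct category balance is to count examples per label and downsample every category to the size of the smallest one. The sketch below is an assumption about how that might be done, not the approach the project necessarily used; the label names and the `downsample_to_balance` helper are hypothetical.

```python
import random
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

Example = Tuple[str, str]  # (prompt text, complexity label)


def downsample_to_balance(examples: List[Example], seed: int = 0) -> List[Example]:
    """Randomly keep the same number of examples for every label."""
    if not examples:
        return []
    by_label: Dict[str, List[Example]] = defaultdict(list)
    for example in examples:
        by_label[example[1]].append(example)
    # The smallest category sets the size of every category.
    target = min(len(group) for group in by_label.values())
    rng = random.Random(seed)  # seeded so the selection is reproducible
    balanced: List[Example] = []
    for group in by_label.values():
        balanced.extend(rng.sample(group, target))
    return balanced


# Deliberately skewed toy data: simple prompts dominate.
raw = (
    [(f"simple prompt {i}", "simple") for i in range(6)]
    + [(f"moderate prompt {i}", "moderate") for i in range(3)]
    + [(f"complex prompt {i}", "complex") for i in range(2)]
)
before = Counter(label for _, label in raw)
after = Counter(label for _, label in downsample_to_balance(raw))
```

Downsampling discards data, so with a small curated set it may be preferable to collect more examples for the under-represented categories instead.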

What I Learned

SetFit is impressive for few-shot scenarios: it reached usable accuracy with far fewer examples than traditional fine-tuning needs. The project also reinforced that data quality matters more than quantity; a hundred well-labelled examples beat thousands of noisy ones.

Future Plans

Add confidence thresholds for edge cases. Implement online learning to improve from production feedback. Explore multi-dimensional classification beyond just complexity.
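
The planned confidence threshold might look something like the sketch below. It assumes the classifier can return a probability per complexity label (SetFit models expose a `predict_proba` method for this) and that low-confidence prompts should fall back to the most capable tier. The threshold value, the fallback policy, the label names and the `route_with_threshold` function are all illustrative assumptions.

```python
from typing import Dict

# Assumed mapping from complexity label to model tier.
TIER_BY_COMPLEXITY: Dict[str, str] = {
    "simple": "Haiku",
    "moderate": "Sonnet",
    "complex": "Opus",
}
FALLBACK_TIER = "Opus"  # assumed policy: when unsure, prefer quality over cost


def route_with_threshold(probabilities: Dict[str, float], threshold: float = 0.6) -> str:
    """Pick a tier from per-label probabilities, falling back when confidence is low."""
    label, confidence = max(probabilities.items(), key=lambda item: item[1])
    if confidence < threshold:
        return FALLBACK_TIER
    return TIER_BY_COMPLEXITY[label]


confident = route_with_threshold({"simple": 0.85, "moderate": 0.10, "complex": 0.05})
uncertain = route_with_threshold({"simple": 0.45, "moderate": 0.40, "complex": 0.15})
```

Here the first call has a clear winner and follows the mapping, while the second falls below the threshold and is escalated. A cost-sensitive deployment might instead fall back to the middle tier.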