Domain classification strategies for routing to multiple models

Classifying a prompt through logic, ML, LLM, or hybrid strategies for cost efficiency, latency reduction, privacy, and accuracy

In our first article on routing as an AI developer’s superpower, we introduced several strategies for domain classification that a Large Language Model (LLM) orchestrator can use to route a prompt to one of several LLMs.

When you’re working with a single LLM, domain classification is often unnecessary—unless you want to pre-process prompts for efficiency, cost control, or to enforce specific business logic. These techniques may still be useful in such a scenario, but today we’re focusing on multi-model use cases.

While many strategies exist, Aurora Toolkit focuses on four key classification strategies:

Logic-based
ML-based
LLM-based
Ensemble hybrid

Domain classification strategies

The domain classification strategy you choose for routing affects:

Cost efficiency
Latency reduction
Privacy and security
Robust accuracy

Logic-based

Using in-app deterministic logic to determine the domain of a prompt is the fastest way to classify it on-device. Your logic could be as simple as estimating the number of tokens in a prompt, or as complex as switching logic that evaluates multiple conditions. It could also be a regex-based check for SSNs or credit card numbers to decide whether the prompt should be evaluated locally or in the cloud.

Logic-based classification typically gives you strong cost efficiency, latency reduction, and privacy and security. It can be accurate for well-defined patterns, such as standard credit card number formats, but broad accuracy is not one of its core strengths.
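As a rough illustration of the idea, here is a minimal Python sketch of a deterministic router. It is not Aurora Toolkit's API: the function names, the regex patterns, the 512-token budget, and the four-characters-per-token estimate are all assumptions made for this example, and real PII detection would need much broader coverage.

```python
import re

# Hypothetical patterns for illustration only.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
# 13 to 16 digits, each optionally followed by a single space or hyphen.
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){12,15}\d\b")

MAX_LOCAL_TOKENS = 512  # assumed budget for the on-device model


def estimate_tokens(prompt: str) -> int:
    """Rough heuristic: about four characters per token for English text."""
    return max(1, len(prompt) // 4)


def route_by_logic(prompt: str) -> str:
    """Return "local" or "cloud" using deterministic rules only."""
    # Privacy rule first: anything that looks like PII stays on-device.
    if SSN_PATTERN.search(prompt) or CARD_PATTERN.search(prompt):
        return "local"
    # Size rule: short prompts fit the local model, long ones go to the cloud.
    if estimate_tokens(prompt) <= MAX_LOCAL_TOKENS:
        return "local"
    return "cloud"


if __name__ == "__main__":
    print(route_by_logic("What is the capital of France?"))
    print(route_by_logic("My SSN is 123-45-6789, please fill in this form."))
```

The privacy check runs before the size check on purpose: a long prompt that contains something resembling an SSN should still stay on the device.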

For this technique, Aurora Toolkit provides LogicDomainRouter.

ML-based

Modern operating systems typically provide built-in support for running Machine Learning (ML) models on-device. Android and iOS/macOS, in particular, offer powerful native frameworks: LiteRT (Google’s rebranded successor to TensorFlow Lite) and Core ML. These let developers take models trained or fine-tuned with open-source tools like PyTorch, convert them, and run them on-device with hardware acceleration. Task-specific ML classifiers are typically far smaller than LLMs, which makes them fast and easy to integrate into mobile and desktop applications.

ML-based classification gives you latency reduction, privacy and security, and cost efficiency. With well-trained models, it can also offer strong accuracy.
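To make the idea concrete, here is a small Python sketch of a text classifier that returns a domain together with a confidence score. It is a hand-rolled Naive Bayes model used purely for illustration, not a Core ML or LiteRT model and not Aurora Toolkit's API. The class name, the two domains, and the six training prompts are all made up for this example; a production model would be trained on far more data.

```python
import math
from collections import Counter, defaultdict


class TinyNaiveBayes:
    """Multinomial Naive Bayes over lowercase word counts, with add-one smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of training prompts
        self.vocab = set()

    def fit(self, examples):
        for text, label in examples:
            words = text.lower().split()
            self.word_counts[label].update(words)
            self.doc_counts[label] += 1
            self.vocab.update(words)
        return self

    def predict(self, text):
        """Return (label, confidence), where confidence is the normalized posterior."""
        words = [w for w in text.lower().split() if w in self.vocab]  # skip unseen words
        total_docs = sum(self.doc_counts.values())
        log_scores = {}
        for label in self.doc_counts:
            total_words = sum(self.word_counts[label].values())
            score = math.log(self.doc_counts[label] / total_docs)  # class prior
            for w in words:
                count = self.word_counts[label][w] + 1  # add-one smoothing
                score += math.log(count / (total_words + len(self.vocab)))
            log_scores[label] = score
        # Normalize the log scores into probabilities (numerically stable softmax).
        top = max(log_scores.values())
        exp_scores = {k: math.exp(v - top) for k, v in log_scores.items()}
        best = max(exp_scores, key=exp_scores.get)
        return best, exp_scores[best] / sum(exp_scores.values())


# Hypothetical training prompts, for illustration only.
TRAINING = [
    ("what is my account balance", "finance"),
    ("transfer money to my savings account", "finance"),
    ("how do I pay my credit card bill", "finance"),
    ("what is the score of the game", "sports"),
    ("who won the football match last night", "sports"),
    ("when does the basketball season start", "sports"),
]

model = TinyNaiveBayes().fit(TRAINING)
label, confidence = model.predict("who won the match")
```

The confidence value matters as much as the label: it is what lets a router decide whether to trust this classifier or hand the prompt to something more capable.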

For this technique, Aurora Toolkit provides CoreMLDomainRouter.

LLM-based

In late 2022, LLMs were compelling. By 2023, their capabilities had grown so quickly that entirely new businesses were being built on top of them. In 2024, the conversation shifted—questions about AI’s impact on the job market started to get serious. And in 2025, we’re seeing it play out in real time, with memos from companies like Shopify and Duolingo indicating they’re now “AI-first” when it comes to hiring.

The point is: Large Language Models are now among the most capable classifiers available, matching or exceeding human-level performance on many tasks.

LLM-based classification gives you robust accuracy, and varying levels of cost efficiency, latency reduction, privacy, and security.
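In practice, LLM-based classification usually comes down to a constrained prompt plus defensive parsing of the reply. The Python sketch below shows one way that could look. It is not Aurora Toolkit's API: the function names and the prompt wording are assumptions, and the model call is passed in as a plain `complete` callable so the example does not depend on any particular provider.

```python
from typing import Callable, Sequence


def build_classification_prompt(prompt: str, domains: Sequence[str]) -> str:
    """Ask the model to answer with exactly one domain name."""
    options = ", ".join(domains)
    return (
        "Classify the user prompt into exactly one of these domains: "
        f"{options}.\n"
        "Reply with the domain name only.\n\n"
        f"User prompt: {prompt}"
    )


def classify_with_llm(
    prompt: str,
    domains: Sequence[str],
    complete: Callable[[str], str],
    default: str,
) -> str:
    """Return the domain named in the model's reply, or `default` if unrecognized."""
    reply = complete(build_classification_prompt(prompt, domains))
    # Models often add whitespace, quotes, or a trailing period; strip them.
    normalized = reply.strip().strip(".\"'").lower()
    for domain in domains:
        if normalized == domain.lower():
            return domain
    return default


def fake_complete(llm_prompt: str) -> str:
    """Stand-in for a real model call, so the sketch runs offline."""
    return " Finance.\n"


if __name__ == "__main__":
    print(classify_with_llm("How do I pay my bill?", ["sports", "finance"], fake_complete, "general"))
```

Falling back to a default domain when the reply does not match any known option keeps a chatty or off-format model response from breaking the router.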

For this technique, Aurora Toolkit provides LLMDomainRouter.

Ensemble hybrid

While the most capable general classifiers are LLMs, they are frequently not cost-effective for such a task. Using an LLM to classify a prompt—only to then pass it to another LLM for processing—can introduce unnecessary latency and cost.

A great way to get better, more balanced results than any single strategy offers is to build an ensemble router that combines multiple techniques. For example, combining a logic-based router with an ML-based router can raise overall accuracy from the 80% range to the mid-90s. At scale, this could save thousands of dollars or more, especially when routing avoids your most expensive fallback LLM.

Ensemble-based classification with a highly capable fallback option gives you the best balance of cost efficiency, latency reduction, privacy and security, and robust accuracy.
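One way to picture a confidence-based ensemble is the Python sketch below. It is an assumption about how such a router could work, not a description of Aurora Toolkit's implementation: the function name, the agreement rule, and the 0.8 threshold are all hypothetical. Two cheap classifiers each return a domain and a confidence, and the expensive fallback runs only when they disagree and neither is confident.

```python
from typing import Callable, Tuple

Prediction = Tuple[str, float]  # (domain, confidence in [0, 1])
Classifier = Callable[[str], Prediction]


def route_with_ensemble(
    prompt: str,
    primary: Classifier,
    secondary: Classifier,
    fallback: Callable[[str], str],
    threshold: float = 0.8,  # assumed cutoff; tune it against your own data
) -> str:
    """Combine two cheap classifiers and escalate only when both are unsure."""
    primary_domain, primary_conf = primary(prompt)
    secondary_domain, secondary_conf = secondary(prompt)

    # Agreement between the two classifiers is treated as sufficient evidence.
    if primary_domain == secondary_domain:
        return primary_domain

    # On disagreement, trust the more confident classifier if it clears the threshold.
    best_domain, best_conf = max(
        [(primary_domain, primary_conf), (secondary_domain, secondary_conf)],
        key=lambda prediction: prediction[1],
    )
    if best_conf >= threshold:
        return best_domain

    # Neither is confident: escalate to the expensive fallback (for example, an LLM).
    return fallback(prompt)
```

Because the fallback is called only on the last branch, its cost and latency are paid for the uncertain minority of prompts rather than for every request.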

For this technique, Aurora Toolkit provides a confidence-based DualDomainRouter.

In the next article, we’ll take a deep dive into several use cases based on these domain routing strategies:

PII filtering
Cost and latency benchmarks
On-device vs. cloud tradeoffs
Building custom routers