Model management

There are many reasons why a user may want to switch the model they are using for generation. Different models come with different capabilities and tradeoffs. Giving users the ability to change their model on the fly, or at least to know which one the AI is using at any given time, has become a standard pattern across AI products.

Why users might switch models

  • Accuracy and reliability: Some models are more prone to hallucinations or errors depending on their training data and foundational prompts.
  • Recency of knowledge: Newer models often contain more recent references and larger training sets, which can lead to more relevant outputs.
  • Cost considerations: Advanced models typically carry higher token or subscription costs. Users may prototype prompts on cheaper models before scaling them on premium ones.
  • Aesthetic differences: In image generation, different models carry distinct styles. Users may choose a specific model for its “look,” much like preferring vinyl for its character even if digital audio is technically higher fidelity.
  • Remixing across models: Some tools allow users to generate in one model for its style and then refine or re-render in another for structure or predictability.
  • Security concerns: Users may avoid certain models when handling sensitive or proprietary data depending on how the provider manages training, storage, and compliance.
  • Research and comparison: Engineers, analysts, and researchers often run the same task across multiple models to benchmark performance.

Model management is no longer optional. As models proliferate and differentiate, users expect the ability to select, compare, and control which system is powering their results.

Model tiers

Model choice is often structured into tiers, each with its own tradeoffs:

  • Free models: Typically smaller, cheaper, and more limited. Useful for casual exploration, onboarding, or testing prompts before scaling.
  • Pro models: Larger, more up-to-date, and more capable. These are often gated behind subscriptions or pay-per-use pricing.
  • Enterprise models: Scoped for compliance, security, and governance. Organizations may lock users into enterprise models to enforce retention, privacy, or regional restrictions.
  • Domain-specialized models: Built or fine-tuned for specific domains such as coding, medicine, law, or creative work. These offer focused strengths but may sacrifice general-purpose ability.

The tier system creates a predictable progression for users, but it also shapes how they perceive value. A free or lighter model may be “good enough,” saving money and compute, while a more powerful option can unlock higher quality if the product makes that difference visible.
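One way a product might represent these tiers internally is as plain metadata that the picker UI can surface in human terms. A minimal sketch; the tier names, fields, and relative costs below are hypothetical, not any vendor's actual catalog:

```python
# Hypothetical tier metadata a product could surface in its model picker.
# Names, fields, and values are illustrative, not any provider's real API.
MODEL_TIERS = {
    "free":        {"gated": False, "relative_cost": 1,  "notes": "casual exploration, prompt testing"},
    "pro":         {"gated": True,  "relative_cost": 10, "notes": "larger, more recent, more capable"},
    "enterprise":  {"gated": True,  "relative_cost": 25, "notes": "compliance, retention, regional controls"},
    "specialized": {"gated": True,  "relative_cost": 15, "notes": "domain fine-tunes: code, legal, medical"},
}

def describe(tier: str) -> str:
    """Translate a tier into the human terms a picker might show."""
    meta = MODEL_TIERS[tier]
    gate = "subscription" if meta["gated"] else "free"
    return f"{tier}: {gate}, ~{meta['relative_cost']}x base cost ({meta['notes']})"
```

Keeping this as data rather than hard-coded UI copy makes it easy to show the same tradeoffs consistently at the point of generation, in billing screens, and in admin controls.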

Design considerations

  • Make the active model visible at the point of generation. Users should always know which model is powering their output, especially when multiple options are available. Hiding this detail saves space but risks confusion when results vary in quality, tone, or speed.
  • Explain model differences in human terms. “GPT-4o, Claude Sonnet, Llama-3” means little to most users. Translate distinctions into accuracy, recency of knowledge, creativity, cost, or latency.
  • Support seamless model switching without loss of context. Restarting a conversation or re-uploading files just to change models is a major usability hit. Let users change models in between prompts wherever possible to ramp up (or drop down) compute where needed.
  • Provide routing as a default, but keep manual override. Many users prefer the system to choose the “best” model for their task. But power users and enterprises demand control. Offering both reduces friction while avoiding lock-in to opaque model behavior.
  • Encourage cost-aware prototyping. Let users run drafts on cheaper, lighter models before escalating to premium ones. This lowers experimentation costs but requires surfacing where fidelity may drop. Be explicit about what corners are cut on lower-tier runs.
  • Expose usage and cost impacts before selection. Token limits, latency differences, and billing tiers should appear when choosing a model, not after. Hiding costs makes onboarding smoother, but it also undermines trust when unexpected bills or truncations occur.
  • Align permissions with context and audience. In enterprise, admins may need to restrict which models can be used to meet compliance. In consumer tools, explore defaults that protect against runaway costs while still letting curious users try new options.
  • Consider aesthetic and brand alignment. In creative domains, models have “styles” as much as capabilities. Presenting them only as technical tiers misses the emotional and brand implications of choosing one over another.
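The routing, manual-override, and cost-surfacing considerations above can be sketched together. A minimal example, assuming a hypothetical two-model catalog and a naive task-kind router (real products use learned routers and live pricing):

```python
from typing import Optional

# Hypothetical model catalog; names and prices are illustrative only.
CATALOG = {
    "light": {"cost_per_1k_tokens": 0.001, "good_for": {"chat", "drafts"}},
    "heavy": {"cost_per_1k_tokens": 0.03,  "good_for": {"reasoning", "code"}},
}

def route(task_kind: str, override: Optional[str] = None) -> str:
    """Pick a model for the task, but always honor an explicit user override."""
    if override is not None:
        return override  # manual control beats auto-routing
    for name, meta in CATALOG.items():
        if task_kind in meta["good_for"]:
            return name
    return "light"  # cheap default when nothing matches

def estimate_cost(model: str, expected_tokens: int) -> float:
    """Surface expected cost *before* the user commits to a model."""
    return CATALOG[model]["cost_per_1k_tokens"] * expected_tokens / 1000

model = route("code")  # auto-router picks "heavy" for a coding task
print(model, estimate_cost(model, 8000))
```

The design point is the order of operations: the override is checked first so power users are never locked into opaque routing, and the cost estimate is available at selection time rather than after the bill arrives.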

Examples

ChatGPT relies on an auto-router to select the model it thinks is best for the task, but users can override it. The details it provides, however, are not especially helpful to the user.
Claude provides some information about each model in its selector to help users decide which one fits their task.
FloraFauna combines the model selector with processing costs, so users can weigh a model against their goal and against their budget. While drafting, they may reach for a model that gets them close to the gist of what they are going for, then use a more expensive model to fine-tune it.
Gemini provides some information about the differences between models in its selector, particularly setting apart models for math and coding.
Midjourney allows users to select from its historic models in the creator input. It also offers a draft mode, which enables its conversational mode.
Perplexity relies on an auto-router but otherwise offers no guidance on which model to pick, leaving users solely responsible for figuring out how to choose from a long list of options.
Vercel allows you to switch between model weights in the flow of work, so you can save compute power for more challenging tasks or for getting unstuck.