Expand

The Expand action lets AI build on an existing piece of content while still respecting its original form or intent. Typical use cases include extending the boundaries or aspect ratio of an image, constructing a longer generation from a clip, or building upon a concept.

People don't always have an end goal in mind when working generatively. Expanding takes a starting point and builds upon it, often in multiple directions. Throughout, the initial seed is kept intact, allowing creative exploration while maintaining a connection with the original.

General flow

Expand begins from existing material, which might be a previous generation, a rough draft, audio or video snippets, or a placeholder prompt. Users should be able to lengthen the original content while remaining true to its form, or steer the new content through revised prompts, references, and parameters.

This supports an iterative process and limits how much compute is spent on early concepts or on verifying details. Users might use smaller or less refined generations as drafts and work towards a larger construction through multiple expand actions, paired with inpainting and other more convergent actions to fine-tune details as the work grows.

Variations and forms

Expansion takes different shapes depending on the medium:

  • Images: Extending boundaries to meet new aspect ratios, or widening the focal lens and filling the artboard.
  • Video: Starting from a short clip, additional frames are added to lengthen the video, often guided by an artboard or script.
  • Audio: An audio sample may be extended based on the central prompt, or structured sections may be added like intros, outros, or choruses without altering the existing track.
  • Text: From a rough outline or first draft, the text is modified to go deeper into a topic, lengthen the original reference, or move from draft to higher-fidelity form.
  • Code: From an existing code snippet or comment, code is written to expand upon a function or extend the functionality.
  • Prompts: Essentially the prompt enhancer pattern, where a short input is refined into a longer, more structured instruction.
  • Conversation: From the flow of conversation, AI is directed to go deeper into a topic, consider different perspectives, or expand to include new references.

At its core, expand turns fragments into wholes. It is a copilot pattern that keeps users in control while exploring what comes next.
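
For the image case above, extending boundaries to meet a new aspect ratio comes down to a small piece of arithmetic: find the smallest canvas with the target ratio that still contains the original, then record where the original sits on it. The sketch below is one way to do that, assuming the original is centered and untouched; the names ExpandedCanvas and expand_to_ratio are hypothetical and do not come from any specific tool.

```python
import math
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class ExpandedCanvas:
    width: int
    height: int
    offset_x: int  # left edge of the original on the new canvas
    offset_y: int  # top edge of the original on the new canvas


def expand_to_ratio(width: int, height: int, ratio_w: int, ratio_h: int) -> ExpandedCanvas:
    """Return the smallest canvas of roughly ratio_w:ratio_h that contains the original.

    The original pixels are never scaled or cropped; only empty space is added
    around them for the model to fill. Sizes are rounded up to whole pixels,
    so the final ratio can be off by a fraction of a pixel.
    """
    if min(width, height, ratio_w, ratio_h) <= 0:
        raise ValueError("dimensions and ratio must be positive")
    target = Fraction(ratio_w, ratio_h)
    if Fraction(width, height) < target:
        # Original is too narrow for the target: keep the height, grow the width.
        new_w, new_h = math.ceil(height * target), height
    else:
        # Original is wide enough (or too wide): keep the width, grow the height.
        new_w, new_h = width, math.ceil(width / target)
    return ExpandedCanvas(new_w, new_h, (new_w - width) // 2, (new_h - height) // 2)
```

A square 1024 x 1024 image expanded to 16:9 yields a 1821 x 1024 canvas with the original placed 398 pixels from the left edge. Everything outside that rectangle is the region to generate, and everything inside it is the seed to preserve.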

Design considerations

  • Anchor the expansion to what matters. Keep the original draft or image intact and make clear which parts are being elaborated. This prevents the “good parts” from being overwritten and reassures users that expansion won’t undo progress.
  • Let users define scope. Expansion should not always apply to the entire output. Give people the option to expand a paragraph, a selected image region, or a clip segment. Targeted scope keeps control with the user and avoids ballooning irrelevant content.
  • Show how much more is coming. Indicate the length, size, or duration of the planned expansion before it runs. Users should know whether “expand” means two more sentences or doubling the word count, so they can decide if the trade-off is worth the cost.
  • Make added content visible. If not all changes are visible in a single view, highlight or annotate what has been expanded so people can scan the differences quickly.
  • Expose costs early. Expansion often increases token use, render time, or GPU cost. Show these trade-offs up front so users can weigh richness against efficiency.
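
Several of these considerations can be illustrated together for the text case: scope the expansion to a selection, estimate how much will be added before running it, and mark what is new afterwards. The following is a minimal sketch under stated assumptions. The names ExpansionPreview, preview_expansion, and mark_added are hypothetical, the 1.3 tokens-per-word figure is only an illustrative rule of thumb rather than a value for any particular model, and the [+ +] markers stand in for whatever highlighting the interface actually uses.

```python
import difflib
from dataclasses import dataclass


@dataclass(frozen=True)
class ExpansionPreview:
    scope_words: int       # words in the user-selected scope
    added_words: int       # estimated words the expansion will add
    estimated_tokens: int  # rough token estimate for the added words


def preview_expansion(selection: str, growth: float, tokens_per_word: float = 1.3) -> ExpansionPreview:
    """Estimate the size of an expansion before it runs.

    growth is the planned multiple of the selection's length (2.0 = double).
    tokens_per_word is an assumed rule of thumb, not a model-specific value.
    """
    if growth < 1.0:
        raise ValueError("an expansion cannot shrink the selection")
    scope_words = len(selection.split())
    added_words = round(scope_words * (growth - 1.0))
    return ExpansionPreview(scope_words, added_words, round(added_words * tokens_per_word))


def mark_added(original: str, expanded: str) -> str:
    """Wrap words that are new in `expanded` in [+ ... +] markers."""
    before, after = original.split(), expanded.split()
    matcher = difflib.SequenceMatcher(a=before, b=after, autojunk=False)
    parts = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        chunk = " ".join(after[j1:j2])
        if tag in ("insert", "replace"):
            parts.append(f"[+{chunk}+]")
        elif tag == "equal":
            parts.append(chunk)
        # "delete" adds nothing: those words are absent from `expanded`.
    return " ".join(parts)


if __name__ == "__main__":
    draft = "The cat sat."
    print(preview_expansion(draft, growth=2.0))
    print(mark_added(draft, "The cat sat. It purred loudly."))
```

Showing the preview before the run lets the user decide whether doubling a paragraph is worth the cost, and the marked output lets them scan the additions without rereading the part they already approved.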

Examples

Canva’s “Magic Expand” feature expands an image to the size of the artboard (or a custom size, if the user chooses).
Midjourney supports a basic expand function that zooms out the image, as well as pan options that expand the image in a specific direction.
Text editors like Notion tend to offer an action to expand text from the canvas. Users are asked to verify the result before it is adopted.
Replit’s initial CTA features an option to refine the prompt, which takes the simple input and converts it into a PRD that the AI can work from.