# Image Generation Best Practices

Understanding the core mental model of Bria's generation pipeline will help you achieve the best, most deterministic results when integrating our APIs.

## The Big Picture: Bria's Mental Model

**Q: What’s the simplest mental model for Bria image generation?**

Think of it as a two-step system: `Prompt / reference image` → `structured_prompt (JSON "blueprint")` → `deterministic render`.

The big win is that you can store and reuse the structured blueprint (plus a seed) to get consistent results and controlled variations.

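For intuition, a blueprint is a JSON description of the scene. A minimal sketch as a Python dict (the field names here are illustrative only, not Bria's actual schema):

```python
# Hypothetical blueprint shape, for intuition only; inspect a real
# structured_prompt response to see the actual schema.
blueprint = {
    "subject": "matte black insulated travel mug",
    "environment": "white seamless studio background",
    "composition": "3/4 angle, centered, negative space on right",
    "lighting": "soft diffused key light from upper left",
}
```
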
> **ℹ️ ASYNCHRONOUS ENDPOINTS:** By default, all Bria v2 endpoints are asynchronous - the API returns a `request_id` and a `status_url` immediately, and you poll that URL until the result is ready. Pass `sync: true` only if you need a synchronous response for simple integrations.

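Since most integrations will poll, it helps to see the pattern end to end. A minimal sketch in Python, assuming a placeholder base URL, an `api_token`-style auth header, and simple status values (check the API reference for the exact scheme and field names):

```python
import time

import requests

BASE = "https://api.bria.example"          # placeholder; use your Bria API base URL
HEADERS = {"api_token": "YOUR_API_TOKEN"}  # assumed auth header; check the API reference

def generate_and_wait(payload: dict, poll_interval: float = 2.0) -> dict:
    """Submit an async generation request, then poll its status_url until done."""
    resp = requests.post(f"{BASE}/v2/image/generate", json=payload, headers=HEADERS)
    resp.raise_for_status()
    job = resp.json()  # returns request_id and status_url immediately

    while True:
        status = requests.get(job["status_url"], headers=HEADERS).json()
        if status.get("status") in ("completed", "failed"):  # assumed status values
            return status
        time.sleep(poll_interval)
```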


## Global Prompting Best Practices (Across All Endpoints)

**Q: What’s the best general way to write prompts for high-quality images?**

Use a "shot spec" prompt, structured like a mini creative brief:

- **Subject:** What is it?
- **Environment:** Where is it?
- **Composition:** Close-up/wide, centered, negative space.
- **Lighting:** Soft studio, golden hour, neon, etc.
- **Style/medium:** Photoreal, illustration, 3D, etc.
- **Constraints:** What must stay the same, what to avoid, required text.

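Filled in, a shot spec reads like the copy-ready examples at the end of this page, e.g.:

> “Subject: matte black insulated travel mug. Environment: white seamless studio background. Composition: 3/4 angle, centered, negative space on the right. Lighting: soft diffused key light from upper left. Style: photoreal product shot. Constraints: no visible text, keep the mug matte.”
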
**Q: Should I write long prompts with lots of adjectives (“ultra, 8K, masterpiece”)?**

Usually no. Better results come from concrete visual controls (composition, lighting, camera feel, materials, and clear constraints) rather than stacks of vague superlatives.

**Q: How do I get better photorealism?**

Add specifics that photographers care about:

- **Lighting:** Type and direction ("soft diffused key light from upper left").
- **Composition:** ("3/4 angle, centered, negative space on right").
- **Material cues:** ("matte ceramic, crisp reflections").
- **Background:** ("white seamless, subtle shadow").
- **Resolution:** Set the `resolution` parameter to `4MP`.

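Combined into a request body for `/v2/image/generate`, shown as a Python dict (`prompt` and `resolution` are parameters referenced on this page):

```python
payload = {
    "prompt": (
        "Studio product photo of a matte ceramic vase on a white seamless "
        "background, 3/4 angle, centered, negative space on right, "
        "soft diffused key light from upper left, subtle shadow, crisp reflections"
    ),
    "resolution": "4MP",  # higher resolution for maximum detail
}
```
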
**Q: How do I reliably generate images with text in them?**

Put the exact text verbatim in quotes and specify:

- Placement (centered, top-left).
- Typography expectations (bold sans-serif, balloon lettering).
- Clarity ("crisp edges, high contrast").

> **💡 SEED MANAGEMENT:**
> **Lock the seed** when you’re evaluating prompt changes and want apples-to-apples comparisons.
> **Change the seed** when the “blueprint” is good and you want visual variety.


**Q: What’s the best way to refine without the image “drifting”?**

This is the pro workflow:

1. Generate once from text (or text+image).
2. Save the returned `structured_prompt` and `seed`.
3. Refine using `structured_prompt` + `prompt` + `seed`, changing one thing at a time.

This keeps composition and style stable while you make targeted improvements.

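A sketch of that loop, reusing the polling helper from the asynchronous-endpoints note above (`structured_prompt`, `prompt`, and `seed` are documented parameters; the exact response field names are assumptions):

```python
# 1) Generate once and keep the blueprint + seed.
first = generate_and_wait({"prompt": "cyclist crossing a city bridge at sunrise, wide shot"})
blueprint = first["structured_prompt"]  # assumed response field name
seed = first["seed"]                    # assumed response field name

# 2) Recreate exactly: blueprint + seed reproduce the same image.
recreated = generate_and_wait({"structured_prompt": blueprint, "seed": seed})

# 3) Refine: blueprint + seed + one small text delta at a time.
refined = generate_and_wait({
    "structured_prompt": blueprint,
    "seed": seed,
    "prompt": "warmer golden-hour lighting",
})
```
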
## Endpoint-by-Endpoint Guides

### POST `/v2/image/generate` (Standard)

Use the Standard pipeline when you want the highest quality, nuance, and instruction adherence.

**Q: What input combinations does `/v2/image/generate` support?**

- **Text to image:** Prompt only (starting from scratch).
- **Image to image:** Single image only (reference image drives visual direction).
- **Image + text:** Single image + prompt (reference image as base, text steers output).
- **Recreate:** `structured_prompt` + seed (reproduce a previous image exactly).
- **Refine:** `structured_prompt` + prompt + seed (targeted changes without drift).

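As request bodies, the five combinations look roughly like this (the reference-image parameter name and value are placeholders; check the endpoint reference for the real ones):

```python
text_to_image   = {"prompt": "studio photo of a ceramic vase"}
image_to_image  = {"image": "<reference image>"}  # placeholder param name and value
image_plus_text = {"image": "<reference image>", "prompt": "make it golden hour"}
recreate        = {"structured_prompt": blueprint, "seed": seed}
refine          = {"structured_prompt": blueprint, "seed": seed, "prompt": "add soft shadow"}
```
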
**Q: What’s the best workflow to curate top results?**

- **Phase A (Explore quickly):** Set `resolution: "1MP"`, fewer steps (35–40), and use a strong “shot spec” prompt.
- **Phase B (Lock composition):** Save the `structured_prompt` and seed from the best result, then refine with small text deltas.
- **Phase C (Final render):** Bump to `resolution: "4MP"` and increase steps to 50 for maximum detail.

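The three phases as payload deltas (a sketch; `resolution` and `steps_num` are parameters referenced on this page):

```python
# Phase A: cheap, fast exploration.
explore = {
    "prompt": "studio product photo of a travel mug, 3/4 angle, soft key light",
    "resolution": "1MP",
    "steps_num": 38,
}

# Phase B: save blueprint + seed from the best result, refine with small text deltas.

# Phase C: final render at full quality, using the saved blueprint and seed.
final = {"structured_prompt": blueprint, "seed": seed, "resolution": "4MP", "steps_num": 50}
```
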
**Q: How should I choose `aspect_ratio` for better composition?**

Choose it based on the real output placement: `1:1` for product tiles, `4:5` for social feeds, `9:16` for stories/reels, and `16:9` for banners. (Other ratios like `2:3`, `3:4`, etc., are also supported.)

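If placements are known ahead of time, that mapping can live in code (a sketch using the ratios above):

```python
# Map output placement to the aspect ratios suggested above.
ASPECT_BY_PLACEMENT = {
    "product_tile": "1:1",
    "social_feed": "4:5",
    "story_reel": "9:16",
    "banner": "16:9",
}

payload = {"prompt": "city skyline at dusk, wide shot",
           "aspect_ratio": ASPECT_BY_PLACEMENT["banner"]}
```
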
### POST `/v2/image/generate/lite`

Choose Lite when speed or data privacy is the priority, or when the workflow requires an on-prem deployment.

**Q: How is Lite different from the Standard pipeline?**

Input combinations are identical, but under the hood:

- **VLM bridge:** Uses the open-source FIBO-VLM instead of Gemini 2.5 Flash (lower fidelity on complex inputs).
- **Image model:** Uses distilled Fibo Lite (faster inference, lower quality ceiling).
- **Deployment:** Fully local on-prem deployment is supported.
- **Missing parameters:** `resolution`, `steps_num`, and `negative_prompt` are not available in Lite.

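Because those three parameters are Standard-only, strip them when switching a payload to Lite. A sketch, reusing `BASE` and `HEADERS` from the polling example above:

```python
# Lite accepts the same input combinations, but not these Standard-only knobs.
LITE_UNSUPPORTED = ("resolution", "steps_num", "negative_prompt")

lite_payload = {k: v for k, v in standard_payload.items() if k not in LITE_UNSUPPORTED}
resp = requests.post(f"{BASE}/v2/image/generate/lite", json=lite_payload, headers=HEADERS)
```
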
### POST `/v2/structured_prompt/generate`

Runs only the first half of the Standard generation pipeline (the translation step). It returns a JSON blueprint without rendering an image.

**Q: Why generate a structured prompt separately?**

- **Human-in-the-loop review:** Inspect or edit the JSON for brand consistency before spending generation time.
- **Programmatic editing:** Build UIs where users tweak a JSON spec, or maintain prompt libraries.
- **Hybrid deployment:** Use Bria's state-of-the-art VLM bridge via API while self-hosting the open-source FIBO image model on-prem.

**Q: How do I write inputs that produce better structured prompts?**

Treat your input like a creative brief. The VLM bridge responds well to specifics (Subject, Composition, Lighting, Constraints). The more concrete your input, the less the bridge has to guess, and the less you'll need to correct the JSON.

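A minimal human-in-the-loop sketch, reusing the helpers above (`review_blueprint` is a hypothetical stand-in for your own review step; the response field name is an assumption):

```python
# Translate only: get the JSON blueprint without rendering an image.
resp = requests.post(f"{BASE}/v2/structured_prompt/generate",
                     json={"prompt": "studio photo of a ceramic vase"},
                     headers=HEADERS)
blueprint = resp.json()["structured_prompt"]  # assumed response field name

# Review or edit the JSON for brand consistency, then render it deterministically.
blueprint = review_blueprint(blueprint)  # hypothetical review/edit step
image = generate_and_wait({"structured_prompt": blueprint, "seed": 42})
```
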
### POST `/v2/structured_prompt/generate/lite`

The Lite equivalent. Supports the same four input flows but uses the open-source FIBO-VLM bridge instead of Gemini 2.5 Flash. Use this endpoint when your actual image generation will run on the Lite pipeline.

### POST `/v2/structured_prompt/generate_from_diff`

Used to repair or update structured prompts.

**Q: When do I use `generate_from_diff`?**

Use it when your product allows users to directly edit structured prompt JSON, and you want Bria to interpret the semantic diff and output an optimized structured prompt reflecting the user’s change.

> **💡 STRUCTURED PROMPT EDITING:** Always encourage small, targeted edits (lighting, composition) rather than full rewrites. Lock the seed while testing "did this change do what I expect?" to maintain disentangled control.

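A hypothetical sketch of the diff flow (every request and response field name below is an assumption, not taken from this page; consult the endpoint reference for the real schema):

```python
resp = requests.post(f"{BASE}/v2/structured_prompt/generate_from_diff",
                     json={
                         "original_structured_prompt": blueprint,        # assumed field
                         "edited_structured_prompt": user_edited_json,   # assumed field
                     },
                     headers=HEADERS)
optimized_blueprint = resp.json()["structured_prompt"]  # assumed field
```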


## Example Prompts (Copy-Ready)

**📸 Photoreal Product**

> “Studio product photo of a matte black insulated travel mug on a white seamless background. 3/4 angle, centered subject with clean negative space on the right. Soft diffused key light from upper left, subtle shadow, crisp edges, realistic materials.”

**🌆 Lifestyle Ad**

> “A young professional cycling across a modern city bridge at sunrise, wide shot, subject slightly left of center with negative space on the right for copy. Warm golden-hour light, soft lens flare, realistic motion blur on background, crisp subject focus.”

**🔤 Text in Image**

> “Balloon lettering spelling exactly 'HAPPY NEW YEAR 2026', centered, high contrast on white background, soft studio lighting, crisp edges, realistic foil reflections.”