Stable Diffusion Prompt Guide 2026: From Basics to Advanced Techniques

Stable Diffusion is the most powerful and most demanding AI image generator available. Unlike Midjourney's managed cloud or DALL-E 3's conversational interface, SD gives you direct access to the model's internals: token weights, samplers, CFG scale, negative prompts, and a vast ecosystem of fine-tuned models. With that power comes a steeper learning curve.

This guide covers everything: from the basic comma-separated token syntax to advanced weight manipulation, model selection, sampler choice, and the differences between SD 1.5, SDXL, and SD 3.5. You'll also learn how to use our Stable Diffusion prompt generator and tools like ImageToPrompt to generate SD-ready prompts from reference images.

Why Stable Diffusion Prompting Is the Most Complex — and Most Powerful

SD's complexity is a feature. Every parameter is exposed because the community demanded it. Here's what SD offers that most cloud-based generators don't:

Negative prompts: A separate field for explicitly excluding unwanted elements — critical for avoiding common SD artifacts
Token weighting: Assign numerical importance to any concept in your prompt
CFG scale: Control how closely the model follows your prompt (vs. creative freedom)
Sampler selection: Choose the denoising algorithm, each with different quality/speed/style tradeoffs
Model ecosystem: Thousands of fine-tuned models specialized for anime, photorealism, architecture, product photography, etc.
LoRA and embeddings: Inject specific styles, characters, or concepts with a few extra tokens

If you're coming from Midjourney or DALL-E 3, expect a 2–4 week adjustment period before SD results match your expectations. The ceiling is much higher, but the floor requires work to reach.

Prompt Syntax Basics: Token Lists vs Natural Language

SD was originally trained on LAION datasets with short, descriptive alt-text captions. As a result, comma-separated short phrases (tokens) work more reliably than full sentences — especially on SD 1.5 and SDXL.

Token-based (recommended for SD 1.5 and SDXL)

        masterpiece, best quality, 1girl, long silver hair, blue eyes, white dress, standing in a moonlit forest, ethereal lighting, bokeh background, photorealistic
      

Natural language (works better on SD 3.5)

        A photorealistic portrait of a young woman with long silver hair and blue eyes, wearing a flowing white dress, standing in a moonlit forest with soft ethereal lighting and a blurred background.
      

SD 3.5 was specifically trained with natural language prompts and handles them much better than previous versions. For SD 1.5 and SDXL, stick to comma-separated tokens for best results.

The Token Weight System

Token weights tell the model which elements of your prompt matter most. Without explicit weights, every token carries the default weight of 1.0, so the model may de-emphasize something you care about in favor of a generic element.

Increasing weight (parentheses)

        (important concept:1.3)    ← 30% more emphasis
        ((very important))         ← each extra pair of parentheses multiplies by 1.1 (1.1 × 1.1 = 1.21x)
        (normal importance)        ← single parentheses = slight boost (1.1x)
      

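Decreasing weight (values below 1.0 or square brackets)

        (less important:0.8)      ← 20% less emphasis
        (barely noticeable:0.5)   ← 50% less emphasis
        [slightly less]           ← square brackets divide by 1.1 (~0.91x)

In A1111-style UIs, numeric weights always go in parentheses. Square brackets take no number: a form like [text:0.8] is read as prompt-editing syntax (add "text" after 80% of the steps), not as a weight.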

Practical weight examples

        masterpiece, (photorealistic:1.4), portrait of a woman, (red hair:1.3), green eyes, (background:0.7), soft lighting, sharp focus
      

This prompt prioritizes photorealism and red hair, while slightly de-emphasizing the background so it doesn't compete with the subject.

Caution with high weights: Weights above 1.5 often cause artifacts, overexposure, or color distortion. Stay between 0.7 and 1.4 for most use cases. Extreme weights (above 1.8) will break image quality.
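If you assemble prompts in a script, a small helper keeps weights consistent. The sketch below is a hypothetical Python helper (the names weighted and risky_weights are ours, not part of any SD tool). It assumes A1111-style (text:weight) syntax and flags weights outside the conservative 0.7–1.4 band.

```python
def weighted(text: str, weight: float = 1.0) -> str:
    """Format a phrase in A1111-style (text:weight) attention syntax."""
    if not 0.0 < weight <= 2.0:
        raise ValueError(f"weight {weight} is outside the usable range")
    if weight == 1.0:
        return text  # the default weight needs no markup
    return f"({text}:{weight:g})"


def risky_weights(weights: dict[str, float],
                  low: float = 0.7, high: float = 1.4) -> list[str]:
    """Return the phrases whose weight falls outside the conservative band."""
    return [text for text, w in weights.items() if not low <= w <= high]


weights = {"photorealistic": 1.4, "red hair": 1.3,
           "background": 0.7, "neon glow": 1.8}
prompt = ", ".join(weighted(text, w) for text, w in weights.items())
print(prompt)
print("check these:", risky_weights(weights))
```

Here only "neon glow" is flagged, since 1.8 sits in the range where artifacts become likely.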

Structuring Your Positive Prompt

SD responds to token position: earlier tokens tend to influence the image more than later ones. For maximum control, order your prompt as follows:

Recommended token order

Quality tokens (most important)
Medium/format (photography, painting, illustration, etc.)
Subject (main focus of the image)
Subject details (appearance, attributes)
Action/pose
Setting/environment
Lighting
Style and artist references
Camera/technical details
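The ordering above can be enforced in code so that a prompt always comes out in the same sequence, whatever order you fill the parts in. This is a minimal Python sketch; the slot names and the build_prompt function are illustrative choices, not an SD convention.

```python
# Slot names mirror the recommended order; they are our own labels.
SLOT_ORDER = ["quality", "medium", "subject", "details", "action",
              "setting", "lighting", "style", "camera"]


def build_prompt(parts: dict[str, list[str]]) -> str:
    """Join prompt parts in the recommended order, skipping empty slots."""
    unknown = set(parts) - set(SLOT_ORDER)
    if unknown:
        raise KeyError(f"unknown slots: {sorted(unknown)}")
    tokens: list[str] = []
    for slot in SLOT_ORDER:
        tokens.extend(parts.get(slot, []))
    return ", ".join(tokens)


parts = {
    "lighting": ["soft lighting"],          # filled in any order...
    "subject": ["portrait of a woman"],
    "quality": ["masterpiece", "best quality"],
    "camera": ["85mm lens", "sharp focus"],
}
print(build_prompt(parts))                  # ...emitted in slot order
```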

Quality Tokens: The SD Vocabulary for "Make It Good"

Quality tokens are phrases that SD associates with high-quality training images. Including them at the start of your prompt biases the model toward better outputs. Not all of these work equally well on all models — test and remove any that don't improve your results.

Universal quality tokens

masterpiece, best quality, ultra-detailed, high resolution

Photography-specific quality tokens

        photorealistic, hyperrealistic, photograph, RAW photo, 8k uhd, DSLR, sharp focus, high detail, professional photography
      

Illustration/art quality tokens

        highly detailed illustration, intricate details, sharp lines, beautiful artwork, trending on artstation, detailed digital painting
      

Tokens to use cautiously

Some tokens (like "award winning" or "featured on artstation") have become diluted from overuse. Test their effect by running with and without them — on some models they add noise rather than quality.
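A with-and-without test is easy to script. The sketch below (plain Python, with a hypothetical ablation_variants helper) produces a baseline prompt plus one variant per removed token. Render each variant with the same seed and settings, otherwise you are comparing random noise rather than the token's effect.

```python
def ablation_variants(base: list[str], candidates: list[str]) -> dict[str, str]:
    """Return a baseline prompt plus one prompt per removed candidate token."""
    variants = {"baseline": ", ".join(base)}
    for token in candidates:
        if token not in base:
            continue  # nothing to remove for this candidate
        variants[f"without {token}"] = ", ".join(t for t in base if t != token)
    return variants


base = ["masterpiece", "award winning", "portrait of a woman",
        "trending on artstation", "soft lighting"]
for label, prompt in ablation_variants(
        base, ["award winning", "trending on artstation"]).items():
    print(f"{label}: {prompt}")
```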

Negative Prompts: Your Most Powerful SD Tool

Negative prompts are what separates experienced SD users from beginners. While positive prompts tell the model what to include, negative prompts tell it what to exclude. The effect is dramatic: a well-crafted negative prompt can prevent many of the most common SD failure modes.

For a complete treatment of negative prompts, see our Negative Prompts in Stable Diffusion guide. Here's the essential starter negative prompt:

Universal negative prompt (copy and use)

        lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry, artist name, deformed, ugly, duplicate, morbid, mutilated, out of frame, extra fingers, mutated hands, poorly drawn hands, poorly drawn face, mutation, bad proportions, gross proportions, malformed limbs, missing arms, missing legs, extra arms, extra legs, fused fingers, too many fingers, long neck
      

Photography-specific negative additions

illustration, painting, cartoon, anime, drawing, sketch, 3d render, CGI

Portrait-specific negative additions

        bad eyes, asymmetrical eyes, crossed eyes, bad teeth, bad lips, uneven skin tone, skin blemishes, acne
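When you stack a universal list with photography or portrait additions, repeated entries creep in quickly. The following Python sketch (merge_negative is a hypothetical helper name) merges several comma-separated lists and drops repeats while keeping the original order. It splits on commas only, so it assumes no individual token contains a comma.

```python
def merge_negative(*groups: str) -> str:
    """Merge comma-separated negative prompts, dropping repeats in order."""
    seen: set[str] = set()
    merged: list[str] = []
    for group in groups:
        for token in group.split(","):
            token = token.strip()
            key = token.lower()          # treat "Blurry" and "blurry" alike
            if token and key not in seen:
                seen.add(key)
                merged.append(token)
    return ", ".join(merged)


universal = "lowres, bad anatomy, blurry, deformed, watermark"
photo = "illustration, painting, cartoon, 3d render, CGI"
portrait = "bad eyes, crossed eyes, blurry, deformed"
print(merge_negative(universal, photo, portrait))
```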
      

CFG Scale: Following Your Prompt vs Creative Freedom

CFG (Classifier-Free Guidance) scale controls how strictly the model adheres to your prompt. Low values give the model creative freedom but may drift from your description. High values force strict adherence but cause artifacts and oversaturation.

CFG Value	Behavior	Best For
1–4	Very loose, model does its own thing	Artistic experimentation, happy accidents
5–7	Balanced, some creative interpretation	Most art styles, illustrations, creative images
7–9	Good prompt adherence, standard quality	Default for most use cases
10–12	Strong prompt adherence, some saturation	Precise compositions, complex scenes
13–17	Very strict, increasing artifacts	Rarely recommended
18+	Overcooked, flat, posterized	Avoid

Recommended default: CFG 7 for most art styles, CFG 6 for photorealistic, CFG 8–9 for complex prompts with many elements.
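Under the hood, CFG is simple arithmetic. At each step the model makes two predictions, one conditioned on your prompt and one on the unconditional input (in most SD UIs, your negative prompt takes that slot), and the CFG scale extrapolates from the second toward the first. The Python sketch below uses toy two-number lists in place of real tensors, and apply_cfg is an illustrative name.

```python
def apply_cfg(uncond: list[float], cond: list[float], scale: float) -> list[float]:
    """Classifier-free guidance: uncond + scale * (cond - uncond), per element."""
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


uncond = [0.0, 1.0]   # toy prediction for the negative/empty prompt
cond = [1.0, 1.5]     # toy prediction for the positive prompt

for scale in (0.0, 1.0, 7.0, 18.0):
    print(scale, apply_cfg(uncond, cond, scale))
```

At scale 1 the result is exactly the prompt-conditioned prediction. Anything higher pushes past it in the same direction, which is one way to see why very high CFG values overshoot into oversaturated, artifact-prone images.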

Sampler Selection: Which One to Use and When

A sampler is the denoising algorithm that generates your image step by step. Different samplers can produce noticeably different results, and a different image character, at the same step count.

Sampler	Speed	Quality	Best For
DPM++ 2M Karras	Fast	Excellent	Default recommendation, photorealism
DPM++ SDE Karras	Slow	Excellent, detailed	High-detail portraits, fine textures
Euler a	Fast	Good, varied	Exploration, getting diverse results
Euler	Fast	Good, consistent	Consistent iterations on the same prompt
DDIM	Fast	Good, smooth	Inpainting, img2img workflows
LMS Karras	Medium	Good	Artistic styles, illustrations
Heun	Slow (2x steps)	High quality	Final renders when quality matters most

Recommendation for beginners: Start with DPM++ 2M Karras at 20–30 steps. It produces excellent results efficiently and is the community default for good reason.

Step Count Recommendations

More steps generally produce more detailed, refined images — up to a point. Beyond that point, you waste compute time with diminishing returns or slight degradation.

10–15 steps: Rough draft, quick experiments, exploring composition
20–25 steps: Good quality, efficient, covers 90% of use cases
30–40 steps: High quality, more detail, slower generation
50+ steps: Diminishing returns for most samplers. Ancestral samplers such as Euler a do not converge, so the image keeps changing at high step counts, but that is variation rather than added quality

Rule of thumb: 25 steps with DPM++ 2M Karras is the standard. Increase to 35 for final renders; decrease to 15 for rapid exploration.
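If you script your generations, it can help to encode these rules of thumb once rather than retyping them. This is a small Python sketch; the Preset class, the stage names, and preset_for are hypothetical, and the numbers are simply the defaults suggested in this guide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Preset:
    sampler: str
    steps: int
    cfg: float


# Values taken from the guide's rules of thumb, not from any SD tool.
PRESETS = {
    "explore": Preset("DPM++ 2M Karras", 15, 7.0),
    "standard": Preset("DPM++ 2M Karras", 25, 7.0),
    "final": Preset("DPM++ 2M Karras", 35, 7.0),
}


def preset_for(stage: str, photorealistic: bool = False) -> Preset:
    """Pick a preset by workflow stage; photorealism uses a slightly lower CFG."""
    base = PRESETS[stage]
    cfg = 6.0 if photorealistic else base.cfg
    return Preset(base.sampler, base.steps, cfg)


print(preset_for("explore"))
print(preset_for("final", photorealistic=True))
```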

SD 1.5 vs SDXL vs SD 3.5: Key Prompt Differences

Stable Diffusion 1.5

The original widely used version. It has the largest ecosystem of fine-tuned models, LoRAs, and embeddings. Prompt token limit: 75 tokens per chunk (77 including the start and end tokens); A1111-style UIs split longer prompts into additional 75-token chunks. Responds best to comma-separated tokens. Negative prompts are essential. Default resolution: 512×512 (upscale after).

        masterpiece, best quality, photorealistic, (portrait of a young woman:1.2), brown hair, brown eyes, white blouse, sitting at a cafe, warm lighting, bokeh background, sharp focus, 8k
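To get a feel for the 75-token budget without loading a tokenizer, you can approximate the count. The Python sketch below is only a rough estimate: it counts words, single digits, and punctuation runs, while the real CLIP tokenizer can split rare words into several tokens, so treat the result as a floor. The function names are hypothetical, and the chunk arithmetic assumes a UI that splits prompts into 75-token chunks.

```python
import math
import re

# Letter runs, single digits, punctuation runs, or underscores.
PIECE = re.compile(r"[^\W\d_]+|\d|[^\s\w]+|_+")


def estimate_tokens(prompt: str) -> int:
    """Rough token estimate; the real tokenizer may produce more."""
    return len(PIECE.findall(prompt))


def chunks_needed(token_count: int, chunk_size: int = 75) -> int:
    """Number of 75-token chunks a prompt of this length would occupy."""
    return max(1, math.ceil(token_count / chunk_size))


prompt = ("masterpiece, best quality, photorealistic, "
          "(portrait of a young woman:1.2), brown hair, brown eyes, "
          "white blouse, sitting at a cafe, warm lighting, "
          "bokeh background, sharp focus, 8k")
n = estimate_tokens(prompt)
print(f"about {n} tokens, {chunks_needed(n)} chunk(s)")
```

Note that commas and weight syntax count toward the budget too, which is why long token lists fill up faster than their word count suggests.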
      

Stable Diffusion XL (SDXL)

Larger model with significantly improved understanding of complex prompts and human anatomy. Default resolution: 1024×1024. Handles longer prompts better than SD 1.5. Still benefits from quality tokens at the start. Ships as a base model plus an optional refiner model for final detail.

        professional portrait photography, young woman, brown hair pulled back, warm smile, sitting at a sunlit coffee shop, shallow depth of field, golden hour light from window, Canon 5D Mark IV, 85mm lens, f/1.8, photorealistic, high detail
      

Stable Diffusion 3.5

The latest generation. Understands natural language much better than previous versions. Produces more accurate text in images. Better prompt coherence for complex multi-element scenes. Uses a different architecture (Multimodal Diffusion Transformer). Natural language works well alongside or instead of token lists.

        A professional portrait of a young woman with warm brown hair, sitting in a sunlit coffee shop with golden afternoon light coming through the window. She's wearing a white blouse and has a natural, warm smile. Shot with shallow depth of field, photorealistic, high detail.
      

LoRA and Embedding Integration in Prompts

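LoRAs (Low-Rank Adaptations) are small model add-ons that inject a specific style, character, or concept. In A1111 and Forge, a LoRA is loaded by a <lora:filename:strength> tag in the prompt, and many LoRAs also expect one or more trigger words to bring out the trained concept.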

Using a LoRA in A1111/Forge

        masterpiece, best quality, [LoRA trigger word], [rest of your prompt] <lora:lora_filename:0.8>
      

The number after the colon (0.8) is the LoRA strength. Values between 0.6 and 1.0 work for most LoRAs. Higher values amplify the style but can cause artifacts. Lower values blend the style more subtly with the base model.
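When prompts are built programmatically, it helps to generate and audit LoRA tags in one place. The Python sketch below formats the A1111/Forge tag shown above and extracts every tag from a finished prompt. The helper names, the file name my_style_lora, and the trigger word mystyle are all made up for illustration.

```python
import re

# Matches <lora:filename:strength> and captures both parts.
LORA_TAG = re.compile(r"<lora:([^:>]+):([0-9]*\.?[0-9]+)>")


def lora_tag(filename: str, strength: float = 0.8) -> str:
    """Format an A1111/Forge-style LoRA tag."""
    return f"<lora:{filename}:{strength:g}>"


def find_loras(prompt: str) -> list[tuple[str, float]]:
    """Return (filename, strength) for every LoRA tag in the prompt."""
    return [(name, float(s)) for name, s in LORA_TAG.findall(prompt)]


prompt = f"masterpiece, best quality, mystyle, portrait of a woman {lora_tag('my_style_lora')}"
print(prompt)
for name, strength in find_loras(prompt):
    if not 0.6 <= strength <= 1.0:
        print(f"{name}: strength {strength} is outside the usual 0.6-1.0 range")
```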

Textual Inversions (Embeddings)

Embeddings are activated simply by typing their name as a token in your prompt:

        EasyNegative, bad_prompt_version2, [rest of negative prompt]
      

EasyNegative and bad_prompt_version2 are popular negative embeddings trained to stand in for a long list of exclusion terms. Place them in the negative prompt, not the positive one, for a quick quality boost, and use them with the model family they were trained on.

Figure: SDXL output from a structured prompt with quality tokens and negative guidance.

Figure: Token weight variation, showing the same subject with different emphasis.

Anatomy of a Complete SD Prompt

Here's a fully annotated example showing every element in action:

        POSITIVE:
        masterpiece, best quality, ultra-detailed,      ← quality tokens
        photorealistic, RAW photo,                      ← medium/format
        (1girl:1.1),                                    ← subject (weighted)
        long auburn hair, amber eyes, freckles,         ← subject details
        wearing a leather jacket, sitting on a          ← action/setting
        motorcycle on a rain-slicked city street,
        night scene, neon reflections on wet asphalt,   ← environment details
        (volumetric lighting:1.2), (rim light:1.1),     ← lighting (weighted)
        cinematic, film grain, 35mm,                    ← style/technical
        bokeh, f/2.8, sharp focus on face               ← camera details

        NEGATIVE:
        lowres, bad anatomy, bad hands, extra fingers,
        worst quality, low quality, blurry, watermark,
        poorly drawn face, deformed, ugly, bad proportions
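To send a prompt like this from a script, you first need to strip the ← annotations, which are notes for the reader and not part of the prompt. The Python sketch below does that and builds a request body in the shape used by the A1111 web UI's txt2img API. It assumes you run the UI with its API enabled and POST the JSON to /sdapi/v1/txt2img; field and sampler names can differ between versions and forks, so check your local API docs. The helper names are ours.

```python
import json


def strip_annotations(block: str) -> str:
    """Drop '← note' annotations and join the lines into one prompt string."""
    lines = [line.split("←")[0].strip() for line in block.splitlines()]
    return " ".join(line for line in lines if line).rstrip(", ")


def txt2img_payload(prompt: str, negative: str, steps: int = 25,
                    cfg_scale: float = 7.0,
                    sampler_name: str = "DPM++ 2M Karras",
                    width: int = 512, height: int = 512,
                    seed: int = -1) -> dict:
    """Build the request body; seed -1 asks the server for a random seed."""
    return {
        "prompt": prompt,
        "negative_prompt": negative,
        "steps": steps,
        "cfg_scale": cfg_scale,
        "sampler_name": sampler_name,
        "width": width,
        "height": height,
        "seed": seed,
    }


annotated = """masterpiece, best quality, ultra-detailed,     ← quality tokens
(1girl:1.1),                                    ← subject (weighted)
wearing a leather jacket, sitting on a          ← action/setting
motorcycle on a rain-slicked city street"""

payload = txt2img_payload(strip_annotations(annotated),
                          "lowres, bad anatomy, bad hands")
print(json.dumps(payload, indent=2, ensure_ascii=False))
```

The defaults here (512×512, 25 steps, CFG 7) match the SD 1.5 recommendations in this guide; for SDXL you would pass width=1024 and height=1024.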
      

15 Complete Example Prompts Across Categories

1. Photorealistic Female Portrait

        masterpiece, best quality, photorealistic, RAW photo, portrait of a beautiful woman in her 30s, dark red hair, green eyes, natural makeup, white silk blouse, soft studio lighting, rembrandt lighting, shallow depth of field, bokeh, 85mm lens, sharp focus, 8k
      

2. Fantasy Landscape

        masterpiece, best quality, ultra-detailed, epic fantasy landscape, ancient ruined temple overgrown with vines, glowing magical stones, misty atmosphere, golden light through forest canopy, digital painting, concept art, artstation trending, cinematic composition, volumetric lighting
      

3. Cyberpunk City

        masterpiece, best quality, cyberpunk city street at night, rain-slicked asphalt, neon signs in Japanese and English, holographic advertisements, crowded with pedestrians in futuristic clothing, (atmospheric fog:1.2), volumetric neon lighting, cinematic, ultra-detailed, 8k
      

4. Product Photography

        masterpiece, best quality, product photography, perfume bottle on white marble surface, soft studio lighting, water droplets on glass, luxury editorial style, clean and minimal, sharp focus, commercial photography, white background
      

5. Anime Character

        masterpiece, best quality, 1girl, anime style, silver hair, blue eyes, school uniform, cherry blossom background, sunlight filtering through petals, detailed illustration, artstation, beautiful detailed eyes, sharp focus, vibrant colors
      

6. Abstract Digital Art

        masterpiece, best quality, abstract digital art, flowing liquid metal, iridescent colors, fractal patterns, dark background, (bioluminescent:1.3), surreal, cinematic lighting, ultra-detailed, 8k, trending on artstation
      

7. Architecture Exterior

        masterpiece, best quality, architectural photography, modern minimalist house, large glass windows, concrete and wood exterior, surrounded by lush garden, golden hour lighting, (dramatic sky:1.2), architectural visualization, ultra-detailed, sharp focus
      

8. Food Photography

        masterpiece, best quality, food photography, overhead shot of ramen bowl with rich broth, soft-boiled egg, chashu pork, green onions, nori, steam rising, (warm lighting:1.2), natural light, wooden table surface, editorial food styling, shallow depth of field
      

9. Historical Portrait

        masterpiece, best quality, oil painting portrait, 17th century Dutch master style, merchant in dark clothes, fur collar, holding a globe, dark background, (Rembrandt lighting:1.3), detailed brushwork, museum quality, highly detailed
      

10. Sci-Fi Spaceship

        masterpiece, best quality, ultra-detailed, sci-fi spacecraft approaching a ringed gas giant, (hard science fiction:1.2), NASA concept art style, photorealistic rendering, solar light from the right, stars and nebula in background, cinematic composition
      

11. Nature Macro

        masterpiece, best quality, macro photography, dew drops on spider web at sunrise, (golden hour light:1.3), bokeh background, sharp focus on droplets, (prismatic light refraction:1.2), nature photography, canon 100mm macro lens
      

12. Character Concept Art

        masterpiece, best quality, character concept art, female warrior in ornate dark plate armor, (battle-worn:1.1), scarred face, determined expression, full body shot, neutral background, detailed armor design, artstation trending, sharp focus, professional concept art
      

13. Cozy Interior

        masterpiece, best quality, interior photography, cozy home library, floor-to-ceiling bookshelves, leather armchair, warm fireplace light, (golden hour window light:1.2), Persian rug, plants, candles, atmospheric, soft shadows, inviting atmosphere
      

14. Watercolor Illustration

        masterpiece, best quality, watercolor illustration, charming street cafe in Paris, loose expressive brushwork, soft washes of color, warm afternoon light, people at outdoor tables, flowering window boxes, impressionist influence, detailed watercolor painting
      

15. Portrait with Environment

        masterpiece, best quality, photorealistic, environmental portrait of an old fisherman, weathered face, kind eyes, rain slicker, harbor in fog behind him, (moody overcast light:1.2), documentary photography style, Sebastião Salgado influence, deep focus, grain, black and white
      

Using ImageToPrompt to Generate SD Prompts from References

The fastest way to develop your SD prompting skills is to analyze images that already look the way you want. ImageToPrompt uses Claude Vision to extract detailed, SD-compatible prompts from any uploaded image.

When you upload an image, the tool identifies:

Subject and composition elements
Lighting type and direction
Color palette and mood
Art style and medium
Technical characteristics (depth of field, grain, etc.)

You can select the Stable Diffusion output format to get a prompt already structured in the comma-separated token format SD expects. This eliminates guesswork when trying to replicate a specific visual style.

For comparison with other generators, see our Stable Diffusion vs Midjourney vs DALL-E 3 comparison. For negative prompt strategies, see the negative prompts deep dive. For general prompt engineering principles, see prompt engineering for AI art.
