How to Write AI Image Prompts for Beginners: Complete 2026 Tutorial

You open Midjourney, type "a cool dragon," and hit enter. The result is... fine. Generic, even. Your friend types something completely different and gets a stunning cinematic masterpiece. What's the difference? The prompt.

Writing effective AI image prompts is a learnable skill. It's not magic, and you don't need to be an artist or a programmer. This tutorial will take you from writing one-word prompts that produce mediocre results to crafting detailed, structured prompts that far more reliably produce the image you have in mind.

By the end of this guide, you'll understand the five core elements every great prompt contains, how to build prompts step by step, and how to use a tool like ImageToPrompt to reverse-engineer prompts from images you love.

Why Good Prompts Matter (And What Makes a Bad One)

AI image generators like Midjourney, Stable Diffusion, DALL-E 3, and Flux are not mind readers. They're pattern-matching engines trained on billions of images and their captions. When you type a prompt, the model draws on the associations it learned during training and generates an image that statistically matches what you described.

A bad prompt fails in one of three ways:

Too vague: "a landscape" could be anything — a watercolor painting, a photograph, a pixel art scene, day or night, mountains or beach. The model will guess.
Contradictory: "dark bright neon photorealistic cartoon" sends the model in multiple directions at once. The output will be confused.
Missing context: "a woman" tells the model nothing about age, ethnicity, expression, clothing, setting, lighting, or style. You'll get the most average possible woman in the most average possible setting.

A good prompt is specific, consistent, and layered. It tells the model what you want to see, how it should look, and the technical parameters it needs to match your vision.
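The "consistent" requirement lends itself to a rough mechanical check. The sketch below flags pairs of terms that pull in opposite directions. The `CONFLICTING_PAIRS` table and the `find_conflicts` helper are hypothetical and only for illustration; the pairs are a small hand-picked sample, and the plain substring match will miss synonyms and can misfire on longer words.

```python
# Hand-picked pairs of terms that tend to contradict each other.
# This table is an illustrative assumption, not an exhaustive list.
CONFLICTING_PAIRS = [
    ("photorealistic", "cartoon"),
    ("photorealistic", "watercolor"),
    ("dark", "bright"),
]


def find_conflicts(prompt: str) -> list[tuple[str, str]]:
    """Return every pair whose two terms both appear in the prompt."""
    text = prompt.lower()
    return [(a, b) for a, b in CONFLICTING_PAIRS if a in text and b in text]


print(find_conflicts("dark bright neon photorealistic cartoon"))
print(find_conflicts("a golden retriever puppy sitting in autumn leaves"))
```

A check like this only catches the obvious cases, so it is a prompt for a second look and does not replace reading the prompt yourself.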

Quick Test: Before writing your next prompt, ask yourself: "Could this description apply to 1,000 different images?" If yes, it's too vague. Aim for something that could only reasonably apply to 10–20 images.

The 5 Elements of Every Great AI Image Prompt

Great prompts are built from five building blocks. You don't always need all five — sometimes a strong two-element prompt is more effective than a weak five-element one — but understanding all five gives you full control.

1. Subject

The subject is the main thing in your image: a person, an object, a creature, a place, or an abstract concept. This is the most critical element. Be specific.

Weak: "a dog"
Better: "a golden retriever puppy"
Strong: "a golden retriever puppy sitting in autumn leaves, looking up at the camera with tongue out"

2. Style

Style tells the model what visual language to use. Without a style, the model picks one for you — usually photorealistic or whatever was most common in its training data for that subject.

Photography styles: portrait photography, street photography, macro photography, aerial photography
Illustration styles: watercolor, ink illustration, flat design, editorial illustration
Painting styles: oil painting, impressionist, acrylic painting, gouache
Digital art styles: concept art, digital painting, 3D render, pixel art
Specific artists or studios (use carefully): "in the style of Studio Ghibli," "impressionist like Monet"

3. Composition

Composition describes how the subject is framed within the image. This is something many beginners skip, but it dramatically affects the final output.

Shot types: close-up, medium shot, full body, wide shot, establishing shot
Camera angle: eye level, low angle, high angle, bird's eye view, worm's eye view, Dutch angle
Framing techniques: rule of thirds, centered composition, golden ratio, negative space
Depth: shallow depth of field, deep focus, bokeh background

4. Lighting

Lighting can transform an image from flat and boring to emotionally powerful. Professional photographers obsess over light because it defines how everything looks. Your AI model understands lighting language.

Time of day: golden hour, blue hour, midday, nighttime, overcast
Light source: studio lighting, natural light, candlelight, neon lighting, bioluminescence
Quality: soft light, hard light, diffused light, dramatic shadows, high contrast
Direction: front-lit, backlit (silhouette), side-lit (Rembrandt lighting), rim light

5. Technical Parameters

Technical parameters are model-specific instructions that control output quality and format. These vary by platform but typically include aspect ratio, quality modifiers, and rendering style.

Aspect ratio: 16:9 (landscape), 9:16 (portrait/stories), 1:1 (square), 4:5 (Instagram portrait)
Quality markers (Midjourney): --quality 2, --stylize 750 (the values each flag accepts depend on the model version, so check the current documentation)
Quality tokens (Stable Diffusion): "masterpiece, best quality, ultra-detailed"
Rendering keywords: 8K resolution, photorealistic, hyperrealistic, cinematic (these steer the look of the image; they do not change the actual output resolution)
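As a minimal sketch, the five elements can be treated as fields of a small record that are joined in a fixed order. The `PromptParts` class and its field names are hypothetical and exist only for this illustration; they are not part of any generator's API.

```python
from dataclasses import dataclass


@dataclass
class PromptParts:
    """The five prompt elements; only the subject is required."""
    subject: str
    style: str = ""
    composition: str = ""
    lighting: str = ""
    technical: str = ""

    def build(self) -> str:
        # Join the non-empty elements in a fixed order, comma-separated.
        parts = [self.subject, self.style, self.composition,
                 self.lighting, self.technical]
        return ", ".join(p.strip() for p in parts if p.strip())


prompt = PromptParts(
    subject="a golden retriever puppy sitting in autumn leaves",
    style="portrait photography",
    composition="close-up, shallow depth of field",
    lighting="golden hour, soft light",
).build()
print(prompt)
```

Leaving a field empty simply drops it from the result, which matches the point above that a strong two-element prompt is a valid prompt.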

Starting Simple: Single Subject Prompts and How to Expand Them

The best way to learn prompt writing is to start with a single subject and layer complexity progressively. Here's a live example:

Iteration	Prompt	What Changed
1	a lighthouse	Starting point
2	a lighthouse on rocky cliffs	Added environment
3	a lighthouse on rocky cliffs during a storm	Added weather/mood
4	a lighthouse on rocky cliffs during a storm, dramatic waves crashing, oil painting	Added style
5	a lighthouse on rocky cliffs during a storm, dramatic waves crashing, oil painting, golden light breaking through clouds, low angle shot	Added lighting and composition
6	a lighthouse on rocky cliffs during a storm, dramatic waves crashing, oil painting by J.M.W. Turner, golden light breaking through storm clouds, low angle wide shot, highly detailed, impasto texture	Added artist reference and texture detail

Each iteration adds specificity without contradicting the previous elements. The final prompt will produce a dramatically more impressive result than the first — and you can see exactly why at each step.
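The first five iterations in the table are purely additive, so they can be reproduced by accumulating fragments. This is a small sketch of that idea; the `fragments` list is just the table's additions split out by hand. Iteration 6 is left out because it rewrites earlier wording instead of only appending.

```python
from itertools import accumulate

# Each fragment is what one iteration adds to the previous prompt.
fragments = [
    "a lighthouse",
    " on rocky cliffs",
    " during a storm",
    ", dramatic waves crashing, oil painting",
    ", golden light breaking through clouds, low angle shot",
]

# accumulate() concatenates the strings step by step, giving one prompt per iteration.
iterations = list(accumulate(fragments))

for number, text in enumerate(iterations, start=1):
    print(number, text)
```

Keeping the fragments separate also makes it easy to drop one and regenerate when a particular addition made the image worse.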

Understanding How Different AI Models Interpret Prompts

Not all AI image generators work the same way. The same prompt will produce very different results across platforms, and understanding these differences saves you hours of frustration.

Midjourney

Midjourney responds well to aesthetic and emotional language. It's trained on high-quality curated art and photography, so it has strong aesthetic defaults. It uses parameter flags (--ar, --style, --chaos) after the main prompt and weights with double colons (::). Natural language descriptions work well. See our complete Midjourney prompt guide for deeper coverage.
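To illustrate the "flags after the main prompt" layout, here is a small string-building sketch. The `midjourney_prompt` helper is hypothetical: it only formats text for pasting into Midjourney and does not call any Midjourney API. It also does not validate values, and the values each flag accepts depend on the model version.

```python
def midjourney_prompt(text: str, **params: object) -> str:
    """Append --flag value pairs after the main prompt text."""
    flags = " ".join(f"--{name} {value}" for name, value in params.items())
    return f"{text} {flags}".strip()


print(midjourney_prompt(
    "a lighthouse on rocky cliffs during a storm, oil painting",
    ar="16:9",
    chaos=10,
    stylize=750,
))
```

Keyword arguments keep their order, so the flags appear in the order you pass them.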

Stable Diffusion

Stable Diffusion prompts are typically written as comma-separated token lists rather than natural language sentences. Quality tokens at the start of the prompt heavily influence output. It has a separate negative prompt field for excluding unwanted elements. Token weights like (important:1.3), a syntax supported by popular Stable Diffusion front ends, give you fine-grained control. See our SD vs Midjourney vs DALL-E comparison for more.
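A short sketch of how a token list, weights, and a negative prompt fit together. The `weighted` and `sd_prompt` helpers are hypothetical, the dictionary keys are chosen for illustration, and the negative terms are example values. Whether a given front end honors the (token:weight) form is an assumption you should check against its documentation.

```python
def weighted(token: str, weight: float = 1.0) -> str:
    """Wrap a token as (token:weight); leave it bare at the default weight."""
    return token if weight == 1.0 else f"({token}:{weight})"


def sd_prompt(tokens: list[tuple[str, float]], negative: list[str]) -> dict[str, str]:
    """Build the positive and negative prompt strings as comma-separated lists."""
    return {
        "prompt": ", ".join(weighted(token, weight) for token, weight in tokens),
        "negative_prompt": ", ".join(negative),
    }


request = sd_prompt(
    tokens=[
        ("masterpiece", 1.0),
        ("best quality", 1.0),
        ("lighthouse on rocky cliffs", 1.3),  # emphasized token
        ("storm", 1.0),
    ],
    negative=["blurry", "text", "watermark"],  # illustrative exclusions
)
print(request["prompt"])
print(request["negative_prompt"])
```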

DALL-E 3

DALL-E 3 (used in ChatGPT) understands natural language extremely well and follows instructions closely. It's a good starting point for beginners because you can write conversational prompts. It automatically refuses certain content and rewrites prompts internally, both for safety and to add detail, so the result may reflect wording you did not type.

Flux

Flux (developed by Black Forest Labs) handles natural language like DALL-E 3 but produces images with more photographic realism. It's excellent for complex compositional scenes described in plain English. See our Flux AI prompt guide for model-specific tips.

Subject Vocabulary: What to Call Things

Using the right vocabulary in your prompts activates specific associations in the model's training data. Here are the terms that produce the most consistent results:

People

Age: toddler, child, teenager, young adult, middle-aged, elderly
General: person, man, woman, figure, silhouette, portrait subject
Roles: warrior, scientist, merchant, explorer, chef, musician
Expressions: smiling, contemplative, stoic, joyful, melancholy, fierce
Clothing: casual, formal, medieval armor, futuristic suit, Victorian dress

Places and Environments

Natural: forest, mountain range, ocean cliff, desert dunes, arctic tundra, tropical jungle
Urban: city street, rooftop, alleyway, subway station, market square
Interior: cozy cabin, gothic cathedral, minimalist apartment, ancient library, space station
Scale cues: vast, intimate, towering, cramped, sprawling

Objects and Props

When including objects, specify material, condition, and context: "a weathered leather journal with gold clasps" versus "a book." "A glowing orb of swirling blue energy, floating" versus "a ball."

Style Vocabulary: Photography, Illustration, Painting, 3D

Style vocabulary is where beginners can make the biggest gains. Here are specific terms that reliably produce distinct visual aesthetics:

Photography Styles

Portrait: studio portrait, environmental portrait, candid portrait, headshot
Landscape: landscape photography, long exposure, HDR photography
Documentary: street photography, photojournalism, documentary style
Commercial: product photography, editorial photography, fashion photography
Technical: macro photography, aerial photography, underwater photography

Illustration and Painting

Watercolor: loose watercolor, detailed watercolor illustration, botanical watercolor
Ink: pen and ink illustration, crosshatching, brush ink painting, manga style
Oil painting: classical oil painting, impressionist oil, alla prima, plein air
Digital illustration: flat vector illustration, character concept art, children's book illustration

3D and Digital

3D render: octane render, Blender 3D, Cinema 4D, unreal engine 5
Game art: pixel art, low poly art, isometric game art, concept art
Sci-fi/fantasy: digital painting, matte painting, cinematic concept art

Composition Terms That Actually Work

These compositional keywords reliably change how the subject is framed in the output:

Term	What It Does	Best For
close-up / extreme close-up	Fills frame with subject detail	Portraits, textures, details
medium shot / waist up	Shows subject from waist to head	Portrait, character art
full body shot	Shows entire person head to toe	Fashion, character design
wide shot / establishing shot	Subject small in environment	Landscapes, scenes
bird's eye view / top-down	Looking straight down	Maps, food, flat lay
worm's eye view	Looking up from ground level	Architecture, heroic poses
Dutch angle	Camera tilted diagonally	Tension, unease, action
rule of thirds	Subject off-center	Natural-feeling compositions
shallow depth of field	Background blurred (bokeh)	Portraits, product shots
symmetrical composition	Perfect mirror balance	Architecture, formal portraits

Lighting Terms That Transform Images

Lighting is the single most underused element in beginner prompts. Adding one specific lighting term can transform a flat, generic image into something cinematic.

Natural Lighting

Golden hour: warm orange-gold light, long shadows, romantic and cinematic feel
Blue hour: cool blue twilight just before sunrise or after sunset, atmospheric and moody
Overcast: soft diffused light, no harsh shadows, great for portraits
Harsh midday sun: high contrast, strong shadows, intense and energetic
Moonlight: cool silver-blue light, mysterious, low visibility

Artificial and Special Lighting

Studio lighting: controlled, professional, even light with fill and key lights
Rembrandt lighting: dramatic side light with triangular highlight on cheek
Neon lighting: colorful urban glow, cyberpunk aesthetic, color reflections
Candlelight / firelight: warm flickering orange, intimate and primal
Bioluminescence: glowing blue-green underwater or forest light
Volumetric light / god rays: visible light beams through atmosphere
Backlit / silhouette: subject dark against bright background

Adding Mood and Atmosphere

Mood words work as semantic shorthand that activates entire clusters of visual associations. A single mood word can change color palette, contrast, composition tendency, and even subject expression.

Epic / cinematic: wide shot, dramatic lighting, high contrast, sweeping scope
Serene / peaceful: soft light, muted palette, open space, gentle subject
Melancholy / somber: desaturated colors, overcast light, isolated subject
Whimsical / magical: pastel colors, sparkles, fantasy elements, soft focus
Gritty / raw: high grain, desaturated, urban, worn textures
Mysterious / ethereal: fog, mist, diffused light, ambiguous depth
Vibrant / energetic: saturated colors, dynamic composition, motion blur
Cozy / warm: warm tones, soft light, intimate framing, comfortable setting
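If a single mood word is not steering the image enough, one option is to spell out the associations from the list above explicitly. The sketch below does that with a lookup table. `MOOD_DESCRIPTORS` and `with_mood` are hypothetical names, and the table copies only three of the moods for brevity.

```python
# A few moods from the list above, mapped to the descriptors they imply.
MOOD_DESCRIPTORS = {
    "serene": ["soft light", "muted palette", "open space"],
    "gritty": ["high grain", "desaturated", "worn textures"],
    "cozy": ["warm tones", "soft light", "intimate framing"],
}


def with_mood(prompt: str, mood: str) -> str:
    """Append the mood word plus any known descriptors for it."""
    # Moods missing from the table are appended as-is, with no extras.
    extras = MOOD_DESCRIPTORS.get(mood, [])
    return ", ".join([prompt, mood, *extras])


print(with_mood("a cabin in the woods", "cozy"))
```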

Figure (Stage 1 result): basic prompt, producing a generic, flat, uninteresting image.

Figure (Stage 5 result): developed prompt with style, lighting, composition, and mood added, producing a dramatic improvement.

Your First Prompt: Step-by-Step Walkthrough

Let's build a complete prompt from scratch. The goal: a cinematic portrait of a female astronaut on an alien planet.

Step 1: Define the subject

"a female astronaut in a worn spacesuit"

Step 2: Add the environment

"standing on the surface of a red alien planet, jagged rock formations in the background, two moons visible in the sky"

Step 3: Choose a composition

"medium shot, low camera angle looking slightly up at her, rule of thirds"

Step 4: Define the lighting

"warm orange sunset light from the left, long shadows, rim light from a distant star"

Step 5: Pick a style

"cinematic photography, hyperrealistic, 8K, sharp focus"

Step 6: Add mood

"epic, solitary, awe-inspiring"

The Complete Prompt

        a female astronaut in a worn spacesuit standing on the surface of a red alien planet, jagged rock formations in the background, two moons visible in the sky, medium shot, low camera angle looking slightly up at her, rule of thirds, warm orange sunset light from the left, long shadows, rim light from a distant star, cinematic photography, hyperrealistic, 8K, sharp focus, epic, solitary, awe-inspiring

This prompt will produce dramatically more impressive results than "an astronaut on a planet." Every word earns its place.
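The assembly itself is mechanical: the subject and environment read as one phrase, and the remaining steps are appended as comma-separated fragments. A minimal sketch, using the step text from the walkthrough (the `steps` dictionary and its keys are just labels chosen for this example):

```python
# One entry per walkthrough step, in the order the steps were written.
steps = {
    "subject": "a female astronaut in a worn spacesuit",
    "environment": ("standing on the surface of a red alien planet, "
                    "jagged rock formations in the background, "
                    "two moons visible in the sky"),
    "composition": "medium shot, low camera angle looking slightly up at her, rule of thirds",
    "lighting": "warm orange sunset light from the left, long shadows, rim light from a distant star",
    "style": "cinematic photography, hyperrealistic, 8K, sharp focus",
    "mood": "epic, solitary, awe-inspiring",
}

# Subject and environment form one phrase; the rest are comma-separated.
scene = f"{steps['subject']} {steps['environment']}"
complete = ", ".join([scene, steps["composition"], steps["lighting"],
                      steps["style"], steps["mood"]])
print(complete)
```

Keeping the steps as separate entries makes it easy to swap one of them, such as the lighting, without retyping the rest.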

Common Beginner Mistakes and How to Avoid Them

Mistake 1: Using adjectives without nouns

"Beautiful, amazing, stunning" — these don't tell the model what looks beautiful. Instead: "beautiful detailed oil painting" or "stunning golden hour portrait photography."

Mistake 2: Asking for what you don't want

"A portrait without sunglasses" forces the model to think about sunglasses. Instead, just describe what you want: "a portrait, eyes visible and expressive." In Stable Diffusion, move unwanted elements to the negative prompt.

Mistake 3: Stacking contradictory styles

"Photorealistic watercolor 3D render illustration" — pick one or two compatible styles. Photorealistic and watercolor are opposites.

Mistake 4: Ignoring aspect ratio

A landscape scene in a square format will lose half its impact. Always specify aspect ratio when you know the intended use. In Midjourney, that means --ar 16:9 for landscape, --ar 9:16 for vertical portraits and stories, and --ar 1:1 for square social posts.
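Some tools ask for a width and height in pixels instead of a ratio flag. The arithmetic is simple enough to sketch. The `dimensions` helper is hypothetical. The 1024-pixel long side is an example value, and rounding each side down to a multiple of 8 is an assumption that suits some pipelines; adjust both to whatever your tool expects.

```python
def dimensions(aspect_ratio: str, long_side: int = 1024, multiple: int = 8) -> tuple[int, int]:
    """Convert a ratio like '16:9' into (width, height) in pixels."""
    w_ratio, h_ratio = (int(n) for n in aspect_ratio.split(":"))
    longest = max(w_ratio, h_ratio)
    # Integer arithmetic avoids floating-point rounding surprises.
    width = long_side * w_ratio // longest
    height = long_side * h_ratio // longest
    # Round each side down to the nearest multiple (assumed requirement).
    return width // multiple * multiple, height // multiple * multiple


for ratio in ("16:9", "9:16", "1:1", "4:5"):
    print(ratio, dimensions(ratio))
```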

Mistake 5: Changing everything at once

When an image doesn't turn out right, changing 10 things in your prompt makes it impossible to know what fixed it. Change one element at a time and iterate.
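One way to enforce the one-change rule is to generate the variants programmatically, so each candidate differs from the base prompt in exactly one element. A minimal sketch; `one_change_variants` and the element names are hypothetical:

```python
def one_change_variants(base: dict[str, str],
                        alternatives: dict[str, list[str]]) -> list[tuple[str, str]]:
    """Return (label, prompt) pairs that each differ from base in one element."""
    variants = []
    for element, options in alternatives.items():
        for option in options:
            # Copy the base and override a single element; key order is preserved.
            changed = {**base, element: option}
            variants.append((f"{element} -> {option}", ", ".join(changed.values())))
    return variants


base = {
    "subject": "a lighthouse on rocky cliffs",
    "style": "oil painting",
    "lighting": "golden hour",
}
alternatives = {"style": ["watercolor"], "lighting": ["blue hour", "overcast"]}

for label, text in one_change_variants(base, alternatives):
    print(f"{label}: {text}")
```

The label records which element changed, so when one variant looks better you know what to keep.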

Mistake 6: Relying only on text descriptions

If you have a reference image in mind, use it. Tools like ImageToPrompt can analyze any image and extract the exact prompt elements that define its style — which you can then adapt for your own project.

Practice Exercises: 5 Prompts to Try Right Now

The best way to internalize prompt writing is to practice. Here are five exercises that will stretch different skills:

Exercise 1: The Portrait Challenge

Write a portrait prompt using: one person type + one setting + one lighting type + one style. Then generate it, identify what you'd change, and iterate twice.

Starter: elderly fisherman, harbor at dawn, golden hour backlight, documentary photography
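If you want more starters in the same four-part shape, a random picker is enough. This is a small sketch: the word lists are short samples drawn from the vocabulary sections of this guide, and `portrait_starter` is a hypothetical helper.

```python
import random

# One list per slot: person type, setting, lighting type, style.
PEOPLE = ["elderly fisherman", "young scientist", "teenage musician"]
SETTINGS = ["harbor at dawn", "ancient library", "city rooftop"]
LIGHTING = ["golden hour backlight", "candlelight", "overcast soft light"]
STYLES = ["documentary photography", "loose watercolor", "classical oil painting"]


def portrait_starter(rng: random.Random) -> str:
    """Pick one entry from each slot and join them into a starter prompt."""
    slots = (PEOPLE, SETTINGS, LIGHTING, STYLES)
    return ", ".join(rng.choice(options) for options in slots)


# A seeded generator makes the pick repeatable between runs.
starter = portrait_starter(random.Random(0))
print(starter)
```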

Exercise 2: The Style Swap

Take the same subject and generate it in 3 completely different styles. Note how much the style alone changes the feeling.

Subject: a cat sitting on a windowsill in rain → try: watercolor illustration, dark moody photography, neon-lit digital art

Exercise 3: The Lighting Study

Take one simple subject ("a wooden table with a vase of flowers") and generate it with 5 different lighting conditions. Compare the emotional difference.

Exercise 4: The Detail Escalation

Start with a 3-word prompt. Add elements one by one, generating after each addition, until you have 8+ elements. Document how each addition changed the output.

Exercise 5: The Reverse Engineer

Find an image online that you love. Use ImageToPrompt to extract its prompt. Study the extracted prompt to understand what makes that image work, then adapt it for a different subject.

Using ImageToPrompt to Learn From Images You Love

One of the fastest ways to level up your prompt writing is to analyze images that already look the way you want your images to look. ImageToPrompt does exactly this: you upload any image, and Claude Vision analyzes it to extract a detailed, usable AI generation prompt.

Here's how to use it as a learning tool:

Find images with aesthetics you want to replicate (on Behance, Pinterest, Artstation, etc.)
Upload them to ImageToPrompt
Read the extracted prompt carefully — note which elements create the style you love
Build a prompt template from the pattern you notice across multiple similar images
Adapt that template to your new subject
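The template-building step can be sketched as finding the phrases that every extracted prompt shares. The three prompts below are made-up examples written for this sketch and are not real ImageToPrompt output; `shared_phrases` is a hypothetical helper that expects at least one prompt.

```python
def shared_phrases(prompts: list[str]) -> list[str]:
    """Comma-separated phrases present in every prompt, in first-prompt order."""
    split = [[p.strip().lower() for p in prompt.split(",")] for prompt in prompts]
    common = set(split[0]).intersection(*map(set, split[1:]))
    return [phrase for phrase in split[0] if phrase in common]


# Made-up extracted prompts for three images with a similar look.
extracted = [
    "a fox in a snowy forest, loose watercolor, soft diffused light, muted palette",
    "a teapot on a windowsill, loose watercolor, muted palette, negative space",
    "an old bridge at dawn, loose watercolor, soft diffused light, muted palette",
]

# The shared phrases become the fixed part of the template.
template = "{subject}, " + ", ".join(shared_phrases(extracted))
print(template.format(subject="a red bicycle leaning on a wall"))
```

Exact phrase matching is strict, so near-synonyms such as "soft light" and "soft diffused light" will not be merged; treat the result as a starting point to edit by hand.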

This workflow turns beautiful images into a personal prompt vocabulary. Within a week of consistent practice, you'll have a library of phrases that reliably produce the aesthetics you're after.

For model-specific guidance, see our deep dives: Midjourney Prompt Guide 2026, Stable Diffusion Prompt Guide, and our advanced prompt engineering guide.
