You have a photo — a landscape you shot on vacation, a portrait with perfect light, a product image from a competitor, a screenshot from a film. You want to generate something similar, or use that photo's aesthetic as the foundation for AI-generated images. But you don't know what prompt would produce that look.

This is one of the most common problems for anyone using AI image generation seriously, and there are three distinct methods for solving it. This guide covers all three — automatic extraction, manual analysis, and a hybrid approach — with specific guidance for different types of photos.

Why Real Photos Are Great References for AI Generation

Photographs contain information that's hard to specify from imagination alone:

The goal isn't to copy the photo — it's to extract the visual language so you can apply it to new subjects.

Method 1: Automatic Extraction Using ImageToPrompt

The fastest method: upload your photo to ImageToPrompt and let Claude Vision analyze it. The tool examines the image's visual elements and returns a structured prompt that captures its key characteristics.

Step-by-Step Walkthrough

  1. Prepare your photo. Any common image format works (JPEG, PNG, WebP). The tool handles everything from phone photos to high-resolution DSLR files. Clear images produce better prompts than blurry or highly compressed ones.
  2. Navigate to imagetoprompt.dev. The tool is free and requires no account.
  3. Upload your image. Drag and drop or use the file selector. The upload processes in a few seconds.
  4. Select your target model. Choose the AI generator you're planning to use — the prompt format differs between Midjourney, Stable Diffusion, DALL-E 3, and Flux. The tool adjusts its output accordingly.
  5. Review the generated prompt. The output is organized by visual element: subject, lighting, style, composition, and technical characteristics. You'll see what the model identified as the defining features of your photo.
  6. Copy and adapt. Use the prompt as-is for a similar image, or modify the subject while keeping the style, lighting, and composition elements to create something new.
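
Steps 5 and 6 can be pictured as editing a small record. The sketch below is only an illustration: the `ExtractedPrompt` class and its field names are hypothetical, chosen to mirror the five elements listed in step 5, and are not ImageToPrompt's actual output format. The sample values are made up for the example.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ExtractedPrompt:
    """Hypothetical container mirroring the five elements from step 5."""
    subject: str
    lighting: str
    style: str
    composition: str
    technical: str

    def to_text(self) -> str:
        # Join the non-empty elements into one comma-separated prompt string.
        parts = [self.subject, self.lighting, self.style,
                 self.composition, self.technical]
        return ", ".join(p for p in parts if p)


original = ExtractedPrompt(
    subject="elderly man with a weathered, kind face",
    lighting="late afternoon golden light",
    style="documentary portrait photography",
    composition="shallow depth of field, blurred buildings behind",
    technical="85mm lens feel, slight grain",
)

# Step 6: swap only the subject; every other element carries over unchanged.
adapted = replace(original, subject="young woman with a confident expression")

print(adapted.to_text())
```

Keeping the elements separate until the last moment makes it easy to see which part of the look you are reusing and which part you are changing.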

What the Tool Extracts Well

What It Extracts Less Reliably

Method 2: Manual Photo Analysis

Manual analysis takes longer but teaches you the skill of visual reading — which ultimately makes you better at both prompting and photography. It's especially valuable when you want to deeply understand what makes a particular photo work.

The Systematic Analysis Framework

Work through these six dimensions in order. Take notes as you go — your final prompt is built from these notes.

1. Subject

What is the primary subject? Describe it without assuming context:

2. Environment

3. Lighting Analysis

This is the most important dimension. Train yourself to identify:

4. Color Analysis

5. Camera and Technical

6. Style and Medium Assessment

Building the Manual Prompt

Once you've worked through all six dimensions, assemble your notes into a prompt:

[STYLE/MEDIUM] + [SUBJECT DESCRIPTION] + [ENVIRONMENT] + [LIGHTING] + [CAMERA/TECHNICAL] + [MOOD]
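
The template can be applied mechanically once the notes exist. A minimal sketch, assuming the notes are kept in a plain dictionary: the key names and the `build_prompt` helper are hypothetical, and the sample notes are illustrative rather than taken from a real photo.

```python
# Slot order follows the template above.
TEMPLATE_ORDER = ["style", "subject", "environment", "lighting", "camera", "mood"]


def build_prompt(notes: dict[str, str]) -> str:
    """Join the filled-in slots in template order, skipping empty ones."""
    parts = [notes.get(slot, "").strip() for slot in TEMPLATE_ORDER]
    return ", ".join(part for part in parts if part)


notes = {
    "subject": "fishing boat pulled up on a pebble beach",
    "style": "documentary photography",
    "environment": "overcast coastline, low tide",
    "lighting": "soft diffused daylight",
    "camera": "35mm lens, deep depth of field",
    "mood": "quiet and melancholic",
}

prompt = build_prompt(notes)
print(prompt)
```

The template has no separate slot for color notes, so one option is to fold them into the lighting or mood entry before assembling.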

Method 3: Hybrid — Auto-Extract Then Refine Manually

This method combines the speed of automatic extraction with the accuracy of manual analysis. It's the recommended approach for most use cases because it is fast and still produces the highest-quality prompts.

The Hybrid Workflow

  1. Upload your photo to ImageToPrompt and generate an initial prompt
  2. Read the extracted prompt carefully — identify what it got right and what it missed or mis-described
  3. Apply the manual analysis framework (above) to the same image, focusing on the dimensions the auto-extract seemed weakest on
  4. Merge: keep the auto-extracted elements that seem accurate, replace or supplement with your manual observations
  5. Test the combined prompt, evaluate the output, iterate
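
Step 4 amounts to a per-dimension override. The sketch below assumes both the auto-extracted prompt and your manual notes have already been split by dimension into dictionaries. The `merge_prompts` helper, the key names, and the sample values are all hypothetical.

```python
def merge_prompts(auto: dict[str, str], manual: dict[str, str]) -> dict[str, str]:
    """Keep auto-extracted entries unless a non-empty manual note replaces them."""
    merged = dict(auto)
    for dimension, note in manual.items():
        if note.strip():
            merged[dimension] = note.strip()
    return merged


auto_extracted = {
    "subject": "glass serum bottle with water droplets",
    "lighting": "bright studio lighting",
    "style": "luxury product photography",
}

# Manual notes only for the dimensions the auto-extract seemed weakest on.
manual_notes = {
    "lighting": "soft diffused light from left, soft shadow to right",
    "color": "cool white and pale grey palette",
    "style": "",  # empty: keep the auto-extracted value
}

merged = merge_prompts(auto_extracted, manual_notes)
print(", ".join(merged.values()))
```

Manual notes win wherever they exist, and dimensions the tool missed entirely (color, in this example) are appended rather than dropped.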

This hybrid approach typically produces prompts 30–40% more accurate than auto-extraction alone, in a fraction of the time of full manual analysis.

Photo Categories: Specific Tips for Each Type

Portrait Photos → Character Prompts

Portraits carry enormous amounts of information that AI can use. The key elements to extract:

What to adapt: Replace the specific person's description with your desired character description, but keep all the lighting, style, and technical information.

Landscape Photos → Environment Prompts

Landscapes yield rich environmental vocabulary. Focus on:

Architecture Photos → Building and Scene Prompts

Food Photos → Product and Food Photography Prompts

Abstract and Detail Photos → Texture and Pattern Prompts

What Gets Lost in Translation

No prompt — manually written or auto-extracted — perfectly captures everything in a photograph. Understanding the limitations helps you compensate for them:

Emotion and Human Presence

A photograph of a real person carries the weight of that person's genuine emotion, history, and presence. AI prompts describe only the visual surface. The "feel" of genuine emotion in a photo is extremely hard to prompt for, and attempts often produce AI-generated faces that look pleasant but hollow. Compensate by being very specific about expression and using mood language.

Specific People

AI cannot reproduce specific individuals from a prompt (without LoRA or reference image workflows). A prompt extracted from a photo of a specific person will produce a similar-feeling image with a different, AI-generated face.

Copyrighted or Trademarked Elements

Brand logos, trademarks, and copyrighted characters visible in a photo cannot and should not be included in prompts. Remove these from the extracted prompt or substitute generic descriptions.
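
Removing or substituting such terms can be done by hand, or with a small filter like the one sketched below. The brand names here are invented placeholders and the `scrub_prompt` helper is hypothetical. You would supply your own list of terms based on what is visible in the photo.

```python
import re

# Hypothetical blocklist: fill in the brands or characters visible in your photo.
SUBSTITUTIONS = {
    "AcmeCola": "soda",
    "Zentro": "",  # empty string: drop the term entirely
}


def scrub_prompt(prompt: str, substitutions: dict[str, str]) -> str:
    """Replace or drop listed terms, then tidy leftover commas and spaces."""
    for term, generic in substitutions.items():
        pattern = r"\b" + re.escape(term) + r"\b"
        prompt = re.sub(pattern, generic, prompt, flags=re.IGNORECASE)
    # Split on commas, collapse internal whitespace, drop fragments left empty.
    fragments = (" ".join(f.split()) for f in prompt.split(","))
    return ", ".join(f for f in fragments if f)


raw = "red AcmeCola can on a diner counter, Zentro, neon sign glow, 35mm film look"
cleaned = scrub_prompt(raw, SUBSTITUTIONS)
print(cleaned)
# → red soda can on a diner counter, neon sign glow, 35mm film look
```

A filter like this only catches the terms you list, so it is still worth reading the final prompt before using it.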

Location-Specific Uniqueness

The specific character of a real place (the exact quality of the light in Santorini, the particular stone of Florence) can be approximated but not precisely replicated. Use the description as a guide and accept some variation.

Improving Photo-Derived Prompts for Each AI Model

Model | Adaptation Tip
Midjourney | Add --style raw for maximum fidelity to the prompt. Add --ar to match the original photo's proportions. Consider pairing the text prompt with a reference image: --sref supplies a style reference, and --iw sets the weight of an image prompt.
Stable Diffusion | Convert the extracted description to comma-separated tokens. Add quality tokens at the front. Move unwanted elements to the negative prompt. Add photography-specific tokens: "RAW photo, DSLR, photorealistic."
DALL-E 3 | Convert to a descriptive paragraph rather than a list; DALL-E 3 handles natural language well. Add "photograph" or "photography" to anchor the output style.
Flux | Natural language works well. Be specific about technical elements: Flux handles descriptions such as "shot on Canon 5D at f/1.8, 85mm, golden hour" effectively. See our Flux prompt guide.
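
Two of these adaptations are mechanical enough to sketch in code: deriving the --ar value from the photo's pixel dimensions, and reshaping a description into Stable Diffusion's token style. The helper names below are hypothetical, and placing the photography tokens at the very front is an assumption made for this example.

```python
from math import gcd


def aspect_ratio(width: int, height: int) -> str:
    """Reduce pixel dimensions to the smallest whole-number ratio, e.g. 16:9."""
    divisor = gcd(width, height)
    return f"{width // divisor}:{height // divisor}"


def for_midjourney(description: str, width: int, height: int) -> str:
    # --style raw and --ar are the parameters named in the Midjourney tip.
    return f"{description} --style raw --ar {aspect_ratio(width, height)}"


def for_stable_diffusion(description: str, unwanted: list[str]) -> dict[str, str]:
    # Photography tokens go first (an assumption); unwanted elements move to
    # the negative prompt instead of staying in the positive one.
    tokens = ["RAW photo", "DSLR", "photorealistic"]
    tokens += [t.strip() for t in description.split(",") if t.strip()]
    return {"prompt": ", ".join(tokens), "negative_prompt": ", ".join(unwanted)}


description = "dramatic mountain peaks with snow-capped summits, blue hour, dark valley below"

mj = for_midjourney(description, 1920, 1080)
sd = for_stable_diffusion(description, ["people", "text"])

print(mj)
print(sd["prompt"])
print(sd["negative_prompt"])
```

Cropped photos can reduce to awkward ratios (1921 by 1080 stays 1921:1080), so rounding to the nearest common ratio by hand may be the better choice in those cases.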
[Figure: Step 1: The original photo uploaded to ImageToPrompt]
[Figure: Step 3: AI-recreated version using the extracted prompt]

Real Examples: 5 Photos with Extracted Prompts

Example 1: Street Portrait

Photo: A man in his 60s photographed on a narrow European street, late afternoon, looking directly at camera, slight smile, shallow depth of field.

Extracted prompt:

environmental portrait, elderly man in his 60s, weathered kind face, slight warm smile, direct eye contact, standing on a narrow cobblestone street in a European old town, late afternoon golden light, warm golden hour, shallow depth of field, blurred buildings behind, documentary portrait photography, candid realism, 85mm lens feel, slight grain, film photography aesthetic

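Adaptation (new subject): Replace "elderly man in his 60s, weathered kind face, slight warm smile" with "young woman in her 30s, dark hair, dark eyes, confident expression" and keep everything else. Swapping the face and expression phrases along with the subject avoids leaving descriptors that contradict the new character.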

Example 2: Mountain Landscape

Photo: Jagged mountain peaks at blue hour, snow on peaks, dark valley below, one star visible, deep blue-purple sky.

Extracted prompt:

landscape photography, dramatic mountain peaks with snow-capped summits, blue hour, deep blue-purple twilight sky, dark valley below, first stars appearing, silhouetted foreground rocks, cold and serene atmosphere, long exposure feel, no people, National Geographic quality --ar 16:9

Example 3: Product Shot

Photo: A skincare serum bottle on white marble, soft diffused light from left, water droplets on bottle, minimal white background.

Extracted prompt:

luxury product photography, glass serum bottle, water droplets on surface, white marble surface with subtle veining, soft diffused light from left, soft shadow to right, white background, minimal styling, editorial beauty photography, clean and premium, shallow depth of field, sharp focus on bottle, commercial quality --ar 1:1

Example 4: Food Overhead

Photo: Overhead shot of a bowl of ramen, steam rising, wooden surface, chopsticks, warm ambient light.

Extracted prompt:

overhead food photography, bird's eye view, bowl of ramen with rich golden broth, soft-boiled egg halved, chashu pork slices, green onion, nori, steam rising, chopsticks resting on bowl edge, dark wooden table surface, warm ambient lighting, no harsh shadows, food editorial styling, rustic Japanese restaurant aesthetic --ar 1:1

Example 5: Abstract Texture

Photo: Close-up of weathered concrete wall, peeling paint layers in teal and orange, cracks, aged texture.

Extracted prompt:

macro texture photography, weathered concrete wall surface, layers of peeling paint in teal and orange, exposed concrete underneath, cracks and imperfections, aged and worn surface, flat even lighting, no strong shadows, texture reference, close-up detail, abstract photography, film grain, muted palette --ar 1:1

Privacy and Ethics of Using Photos of Real People

Before uploading a photo containing identifiable people:

The most ethical and most effective use of photo-to-prompt tools is to extract aesthetic vocabulary — lighting, style, composition — from images and apply that vocabulary to new, fictional subjects.

For more on extracting prompts from images, see our complete image-to-prompt guide and our guide to reverse-engineering AI art prompts. For help choosing a tool, see our best image-to-prompt tools comparison.