I Built One Claude Skill That Generates 20 Paid Ads From One Product Photo (For My E-Commerce Brand)

May 20, 2026

TL;DR: I built a Claude Code skill that wraps GPT Image 2 and generates 20 realistic iPhone-UGC-style ads from a single product photo. Took 50 hours of testing to crack the realism formula. Now I run my entire Mogano ad pipeline through it in 10 minutes instead of paying $300 per UGC video. This is the framework — 4 files, full realism rules, 10 natural-language commands.


The Problem With Prompting GPT Image 2

If you’ve ever tried to use GPT Image 2 (or any image gen model) for product ads, you know the failure mode. You prompt once. The output is okay. You prompt again. The output is different in a way you didn’t want. The skin gets too smooth. The lighting changes. The product looks subtly wrong. You cherry-pick the few good ones from a batch of 20 and call it a day.

Then you do that across 50 products. Now you have a full-time job.

I ran Mogano (a jewelry e-commerce brand) on that workflow for about three months. Each product needed creative for paid social. I’d prompt, regenerate, cherry-pick, edit, send to Meta. The unit economics worked because the alternative — UGC creators at $300 per video — was worse. But the time cost was brutal.

So I stopped prompting and built a skill instead.

The Shift: Stop Prompting, Start Building

This is the move I’ve been making across my entire AI workflow this year. Stop writing one-shot prompts. Start building Claude Code skills that hold the brand identity, the realism rules, and the natural-language interface in one place.

The difference is structural. A prompt is a one-time instruction. A skill is a wrapper that runs the same way every time. You build it once. It runs forever.

For ad creative, that means brand identity, color rules, lighting preferences, model demographics, composition style, and product fidelity are all locked into files Claude reads on every run. The model drifts far less because the rules no longer sit in conversational memory, where they get diluted or dropped. They sit in files that are reloaded each time.

The 4-File Skill Structure

Every brand skill I’ve built has the same 4 files. Drop them in a folder, point Claude Code at the folder, and the skill is built.

1. about-brand.md — Brand identity. Who the customer is, what the product solves, the aesthetic, the tone. 200-400 words. This is what stops Claude from generating generic ads.

2. style-rules.md — Visual rules. Color palette (actual hex codes, not “modern”), lighting style (specific descriptions, not “good lighting”), composition preferences, model demographics, locations, props.

3. prompt-templates.md — Natural-language commands. “Generate 20 lifestyle ads / Generate 10 close-ups / Generate 5 unboxing shots.” Each command maps to a full prompt under the hood.

4. realism-rules.md — The realism enforcement layer. The non-negotiables that GPT Image 2 forgets unless you remind it on every call. This is where the iPhone-UGC magic lives.
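
Before the first run it helps to confirm that all four files are actually in the folder. Here is a minimal sketch in Python, assuming the four file names above. The folder name and both helper functions are hypothetical and are not part of Claude Code itself.

```python
from pathlib import Path

# File names taken from the list above.
REQUIRED_FILES = (
    "about-brand.md",
    "style-rules.md",
    "prompt-templates.md",
    "realism-rules.md",
)


def missing_skill_files(folder: Path) -> list[str]:
    """Return the required file names that are not present in `folder`."""
    return [name for name in REQUIRED_FILES if not (folder / name).is_file()]


def word_count(path: Path) -> int:
    """Count whitespace-separated words, e.g. to keep about-brand.md near 200-400."""
    return len(path.read_text(encoding="utf-8").split())


if __name__ == "__main__":
    skill_dir = Path("mogano-skill")  # hypothetical folder name
    for name in missing_skill_files(skill_dir):
        print(f"missing: {name}")
```

An empty result from `missing_skill_files` means the skeleton is complete. It says nothing about whether the contents are specific enough.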

The Realism Rules (This Is The Whole Game)

Without the realism file locked into your skill, GPT Image 2 will produce stock-photo-looking output. Smooth skin. Editorial lighting. Perfect framing. Looks like a Shutterstock result, not a phone shot.

The realism rules force the model to imitate the constraints of an actual phone camera.

  • Skin: “Uneven skin tone. Visible pores. Natural blemishes. No retouching. No glossy highlights.”
  • Lighting: “Available light only. No studio softbox. Soft shadows from window or overhead lamp. Mixed warm/cool light sources.”
  • Compression: “iPhone-grade compression artifacts. Slight noise in low-light areas. Not editorial quality.”
  • Composition: “Slightly off-center framing. Hand-held perspective. Not perfectly straight horizon.”
  • Product fidelity: “Product must be the exact item in the reference photo. Same colors, same proportions, same logo placement. Do not stylize or ‘improve’ the product.”
  • Model: “Model demographics match target customer. Natural expressions. Mid-action shots, not posed.”

Once those rules are in the skill file, every call carries them, so the model has far less room to drift toward polished, beauty-shot output. Your eye reads the constraints as real before your brain catches up.
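
One way to make sure the rules ride along on every call is to append them to whatever prompt is about to be sent. Here is a minimal sketch, assuming the six rules above are stored as plain strings. The dict layout and the `with_realism_rules` helper are hypothetical, and this is not a description of how Claude Code or GPT Image 2 handles the file internally.

```python
# Rule text copied from the list above; the dict layout is an assumption.
REALISM_RULES = {
    "Skin": "Uneven skin tone. Visible pores. Natural blemishes. "
            "No retouching. No glossy highlights.",
    "Lighting": "Available light only. No studio softbox. Soft shadows from "
                "window or overhead lamp. Mixed warm/cool light sources.",
    "Compression": "iPhone-grade compression artifacts. Slight noise in "
                   "low-light areas. Not editorial quality.",
    "Composition": "Slightly off-center framing. Hand-held perspective. "
                   "Not perfectly straight horizon.",
    "Product fidelity": "Product must be the exact item in the reference photo. "
                        "Same colors, same proportions, same logo placement. "
                        "Do not stylize or 'improve' the product.",
    "Model": "Model demographics match target customer. Natural expressions. "
             "Mid-action shots, not posed.",
}


def with_realism_rules(prompt: str, rules: dict[str, str] = REALISM_RULES) -> str:
    """Append every realism rule to the prompt as a labeled bullet line."""
    lines = [f"- {label}: {text}" for label, text in rules.items()]
    return prompt.rstrip() + "\n\nRealism rules (non-negotiable):\n" + "\n".join(lines)


if __name__ == "__main__":
    print(with_realism_rules("Lifestyle ad featuring the product in the reference photo."))
```

Keeping the rules in one place means a quarterly tweak touches one file rather than every template.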

How You Use It

The skill turns the workflow into conversation. You don’t write prompts. You just talk.

“Generate 20 lifestyle ads for [product]. Mix of indoor and outdoor. iPhone UGC style.”

That’s the entire input. The skill matches the command against prompt-templates.md, reads about-brand.md, style-rules.md, and realism-rules.md, assembles the full GPT Image 2 prompt under the hood, and runs the batch. Output: 20 images with the product locked in every frame, different faces, different angles, different settings, same realism baseline.
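
The assembly step can be pictured roughly like this. It is a sketch under stated assumptions: the section order, the `assemble_prompt` name, and the plain-text concatenation are illustrative guesses. The image-generation call itself is left out because its API is not described here.

```python
from pathlib import Path

# The three context files named above; the order is an assumption.
CONTEXT_FILES = ("about-brand.md", "style-rules.md", "realism-rules.md")


def assemble_prompt(skill_dir: Path, command: str) -> str:
    """Join the user's command with the contents of the three context files."""
    sections = [f"REQUEST\n{command.strip()}"]
    for name in CONTEXT_FILES:
        body = (skill_dir / name).read_text(encoding="utf-8").strip()
        sections.append(f"{name}\n{body}")
    return "\n\n".join(sections)


if __name__ == "__main__":
    skill_dir = Path("mogano-skill")  # hypothetical folder name
    command = "Generate 20 lifestyle ads. Mix of indoor and outdoor. iPhone UGC style."
    if skill_dir.is_dir():
        print(assemble_prompt(skill_dir, command))
```

The point of the sketch is that the one-line request stays short while the full context is rebuilt from disk on every run.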

The 10 commands I keep in prompt-templates.md:

  1. “Generate 20 lifestyle ads. Mix of indoor and outdoor.”
  2. “Generate 10 close-up product shots. Hands holding only.”
  3. “Generate 10 user-generated selfies with [product]. iPhone front camera angle.”
  4. “Generate 10 morning routine shots featuring [product]. Soft window light.”
  5. “Generate 10 review-style shots. Person showing [product] to camera, casual setting.”
  6. “Generate 10 social-feed style shots. Composition matches IG Reels aspect.”
  7. “Generate 10 unboxing shots. Hands, packaging, natural background.”
  8. “Generate 10 before-after pairs. Person using [product] over time.”
  9. “Generate 10 group/social shots. Multiple people, [product] visible naturally.”
  10. “Generate 10 problem-state shots. Person before they had [product], frustrated.”

Every command is stable. Run the same command 6 months from now and you get the same quality of output, as long as realism-rules.md keeps pace with model updates. That’s the structural advantage.
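
To make the `[product]` placeholder concrete, here is a small sketch of how a stored command could be expanded for a specific product and how the batch size could be read back out of it. The short keys, the `expand` and `batch_size` helpers, and the example product are all hypothetical. In the skill itself Claude interprets the templates directly.

```python
import re

# Three of the ten commands above, keyed by made-up short names.
TEMPLATES = {
    "selfies": "Generate 10 user-generated selfies with [product]. "
               "iPhone front camera angle.",
    "morning": "Generate 10 morning routine shots featuring [product]. "
               "Soft window light.",
    "unboxing": "Generate 10 unboxing shots. Hands, packaging, natural background.",
}


def expand(key: str, product: str) -> str:
    """Fill the [product] placeholder in the chosen template."""
    return TEMPLATES[key].replace("[product]", product)


def batch_size(command: str) -> int:
    """Read the requested image count from a 'Generate N ...' command."""
    match = re.match(r"Generate (\d+)\b", command)
    if match is None:
        raise ValueError(f"no image count found in: {command!r}")
    return int(match.group(1))


if __name__ == "__main__":
    command = expand("selfies", "gold hoop earrings")  # hypothetical product
    print(command, "->", batch_size(command), "images")
```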

The Cost Math

Before the skill: $300 per UGC video, 3-5 day turnaround, 1 video at a time, cherry-pick the takes that work.

After the skill: $0.06 per image via kie.ai’s gpt-image-2 wrapper, 10-minute turnaround, 20 ads at a time, all usable because the realism rules are locked in.

For Mogano, that math worked out to roughly $5,000/month saved on creative production, plus speed gains that let me run 5x more variations through Meta’s algorithm and find winners faster. The skill paid for the 50 hours of building it within the first month.
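
The arithmetic behind those figures, worked in integer cents to avoid floating-point rounding. Only the $300, $0.06, 20-image, and $5,000 numbers come from the text above. The implied number of videos per month is my own back-of-envelope inference. The comparison also sets still images against videos, which is not like-for-like, and it ignores regenerations and editing time.

```python
UGC_VIDEO_CENTS = 300 * 100          # $300 per UGC video
IMAGE_CENTS = 6                      # $0.06 per image
MONTHLY_SAVINGS_CENTS = 5_000 * 100  # roughly $5,000/month saved


def batch_cost_cents(images: int) -> int:
    """Cost of one batch at the per-image price, in cents."""
    return images * IMAGE_CENTS


batch = batch_cost_cents(20)                               # 120 cents
images_per_video = UGC_VIDEO_CENTS // IMAGE_CENTS          # images one video's budget buys
videos_implied = MONTHLY_SAVINGS_CENTS / UGC_VIDEO_CENTS   # inferred, not from the source

print(f"20-image batch: ${batch / 100:.2f}")
print(f"Images for the price of one UGC video: {images_per_video}")
print(f"UGC videos per month implied by the savings: about {videos_implied:.1f}")
```

At these prices a 20-image batch costs $1.20, and the budget for a single UGC video covers 5,000 images.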

Why The Skill Wins Long-Term

The deeper reason this works isn’t the cost savings. It’s the compounding edge.

Most brands build one ad campaign at a time. Each campaign is a one-shot creative effort. Skill-built brands build one skill and run unlimited campaigns. The first 50 hours of building the skill save you the next 500 hours of prompting one ad at a time.

This is the difference between Level 3 (Operator) and Level 5 (Ghost) on the AI Adoption Ladder. Level 3 uses Claude to do work. Level 5 builds skills that do work while you sleep.

FAQ

Do I need to know how to code to build this?

No. The 4 files are all markdown. You write them like you’d write a brief for a contractor. The hardest part is being specific enough — “modern aesthetic” doesn’t work, but “warm-toned, kitchen-table flatlays with morning light” does. Plain English, just specific.

What if I’m using Nano Banana (Gemini) instead of GPT Image 2?

Same structure works. Rewrite realism-rules.md around that model’s quirks (Gemini handles text rendering differently than gpt-image-2) and adjust prompt-templates.md to match. The 4-file skeleton is model-agnostic.

How long until the skill is “done”?

You’ll iterate for 30-50 hours over the first 2-3 weeks. Then it stabilizes. Mine is in maintenance mode now — I tweak realism-rules.md once a quarter when the model updates, and that’s it.

Will this work for a brand that’s not e-commerce?

The 4-file structure works for any visual-content workflow. SaaS landing page screenshots, course content, info-product graphics, real estate listings, dating app photos. Anywhere you need consistent visual output across many variations.

What’s the catch?

You have to actually build the skill. Most operators read about this approach, agree it’s better, then go back to prompting one ad at a time because even a first version takes a weekend they don’t want to spend. The compounding edge is doing it now.


Want The Full Build Template?

Comment SKILL on the Reel for the full Brand Skill Builder template — the 4 files with copy-paste structure, the realism rules in full, the 10 natural-language commands, and the kie.ai/gpt-image-2 API setup. Or join the Actionable AI community to copy my entire ad creative stack (brand skill + ad copy skill + campaign launcher + creative auditor) running on Claude Code.