OpenAI released ChatGPT Images 2.0 on April 21, 2026, the company’s first image model built on its O-series reasoning architecture. The model plans compositions and can search the web for context before generating a single pixel, and OpenAI claims it renders text at 99% accuracy across all scripts. Within 12 hours of launch, it claimed the #1 spot on the Image Arena leaderboard with a 1,512 Elo score, beating Google’s Nano Banana 2 by 242 points, the largest margin ever recorded on the benchmark. DALL-E 2 and DALL-E 3 will both be retired on May 12, 2026.
ChatGPT Images 2.0 researches prompts, plans spatial relationships, and verifies output quality before generating any visual. OpenAI describes it as a “visual thought partner” that uses the same reasoning layer powering its most advanced language models.
This reasoning comes from the O-series architecture. Before producing pixels, the model breaks down complex prompts into composition plans, identifies spatial relationships between elements, and can search the web for real-time reference material. The result is better handling of multi-element scenes, accurate text placement, and consistent visual identity across batched outputs.
Two access tiers exist. Instant mode ships to all ChatGPT users (including free accounts) with core quality improvements like better layouts and sharper text. Thinking mode unlocks the full reasoning pipeline: web search, multi-image batching (up to 8 coherent images per prompt), and output verification. Thinking mode requires a Plus ($20/month), Pro ($200/month), Business, or Enterprise subscription.
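The plan-to-mode mapping above can be sketched as a small lookup. This is only an illustration of the gating described in the article, not OpenAI's implementation: the function names are hypothetical, and the one-image limit for Instant mode is an assumption, since the article states a batch size only for Thinking mode.

```python
# Hypothetical feature gate mirroring the tiers described in the article.
# Plan names and the 8-image cap come from the article; the rest is illustrative.
THINKING_PLANS = {"plus", "pro", "business", "enterprise"}
MAX_BATCH_IMAGES = 8  # Thinking mode: up to 8 coherent images per prompt


def available_mode(plan: str) -> str:
    """Return the most capable mode a plan can use: 'thinking' or 'instant'."""
    return "thinking" if plan.strip().lower() in THINKING_PLANS else "instant"


def max_images_per_prompt(plan: str) -> int:
    """Return the batch limit for a plan.

    ASSUMPTION: Instant mode is treated as one image per prompt; the article
    does not give a number for it.
    """
    return MAX_BATCH_IMAGES if available_mode(plan) == "thinking" else 1
```

For example, `available_mode("free")` returns `"instant"`, while `max_images_per_prompt("Pro")` returns `8`.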
- Plans composition, researches prompt context, and verifies output before creating any image
- Renders text with near-flawless accuracy across Japanese, Korean, Chinese, Hindi, Bengali, and Latin scripts
- Generates up to 8 images from one prompt with consistent character and object identity
- Pulls real-time context for current events, products, and people (Thinking mode only)
- Generates UI mockups, prototypes, and visual assets inside Codex, OpenAI's coding environment
- Embeds provenance information in all generated images for content authenticity tracking
The multi-image capability is the one most likely to save time in practice. A single prompt can produce a set of social media assets, a storyboard sequence, or a product shot series where characters and objects stay visually consistent. Previously, each image had to be prompted individually and assembled by hand.
ChatGPT Images 2.0 is available across all ChatGPT subscription tiers, with capabilities scaling by plan. API access follows token-based pricing with per-image costs between $0.04 and $0.35 depending on prompt complexity and output resolution (up to 2K).
The API is expected to open to developers in early May 2026.
| Access Level | Monthly Cost | Capabilities |
|---|---|---|
| Free | $0 | Instant mode: improved quality, better text rendering |
| Plus | $20/mo | Thinking mode: web search, multi-image, verification |
| Pro | $200/mo | Full capabilities, priority access |
| API (gpt-image-2) | Token-based | $8/M input, $30/M output, ~$0.04-$0.35/image |
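
To make the token-based pricing concrete, here is a rough cost estimator. It uses the $8 and $30 per-million rates from the table and the $2 cached-input rate quoted later in this article. The function name and the example token counts are hypothetical, and the way cached tokens are metered is an assumption; OpenAI's actual billing may differ.

```python
# Rates in USD per million tokens, as quoted in this article.
INPUT_RATE = 8.00
CACHED_INPUT_RATE = 2.00
OUTPUT_RATE = 30.00


def estimate_image_cost(input_tokens: int, output_tokens: int,
                        cached_input_tokens: int = 0) -> float:
    """Estimate the USD cost of one image request.

    ASSUMPTION: cached_input_tokens is the portion of input_tokens billed
    at the cached rate; the remainder is billed at the full input rate.
    """
    if not 0 <= cached_input_tokens <= input_tokens:
        raise ValueError("cached_input_tokens must be between 0 and input_tokens")
    fresh_input_tokens = input_tokens - cached_input_tokens
    total = (fresh_input_tokens * INPUT_RATE
             + cached_input_tokens * CACHED_INPUT_RATE
             + output_tokens * OUTPUT_RATE)
    return total / 1_000_000
```

With made-up figures of 500 input tokens and 4,000 output tokens, the estimate is (500 × 8 + 4,000 × 30) / 1,000,000 = $0.124, which falls inside the quoted $0.04 to $0.35 range. Output tokens dominate the total, which is consistent with cost scaling with output resolution.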
OpenAI did not disclose the model’s architecture, describing it only as a “generalist model” without specifying whether it uses diffusion, autoregressive, or hybrid approaches. The knowledge cutoff is December 2025.
Images 2.0 cannot accurately render events, people, or products that appeared after December 2025 unless it supplements its training knowledge with live web search (Thinking mode only).
OpenAI is retiring both DALL-E 2 and DALL-E 3 on May 12, 2026, consolidating around Images 2.0 as the sole image generation model in ChatGPT. GPT-Image-1.5, the intermediate upgrade released in December 2025, stays available through the API for legacy integrations but is no longer the default.
The deprecation marks a clean architectural break. Instead of maintaining separate image models alongside its language models, OpenAI is unifying both under the same reasoning framework. Image generation becomes a built-in capability of GPT rather than a parallel system.
Multi-image batching with character consistency removes a friction point from design workflows. A marketing team can generate a family of social media assets or a storyboard spread from a single instruction without manually stitching separate outputs together.
The Codex integration is worth watching. Image generation now sits inside the same environment developers use for code, slides, and browser automation. That puts OpenAI in competition with Midjourney and Google on image quality and, separately, with Canva and Figma on workflow integration.
The benchmark results shift the competitive math. Midjourney, Stability AI, and Google now face a model with leading quality scores distributed across ChatGPT’s 200-million-plus user base. For most of 2026, OpenAI and Google had been trading the top leaderboard position within tight margins. A 242-point gap is a different kind of lead.
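For a sense of scale, a rating gap can be converted into an expected head-to-head preference rate. The sketch below assumes Image Arena scores behave like standard Elo ratings on the conventional 400-point logistic scale; the article does not describe the leaderboard's exact rating method, so treat the result as an approximation.

```python
def elo_expected_score(rating_gap: float) -> float:
    """Expected win rate of the higher-rated model under the standard Elo formula.

    ASSUMPTION: the leaderboard uses the conventional 400-point logistic scale.
    """
    return 1.0 / (1.0 + 10 ** (-rating_gap / 400.0))


# A 242-point gap implies the leader is preferred in roughly 80% of
# head-to-head comparisons under this assumption.
win_rate = elo_expected_score(242)
```

By comparison, a 20-point gap under the same formula corresponds to a preference rate of about 53%, which is why earlier leads within tight margins amounted to near coin flips.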
The model’s safety architecture (content filtering, C2PA metadata, and what OpenAI described as “ongoing monitoring”) also sets expectations for provenance standards. As regulatory scrutiny of synthetic media intensifies globally, embedding authenticity metadata at the generation stage may become the baseline, not a differentiator.
ChatGPT Images 2.0 is OpenAI's latest image generation model, released April 21, 2026. It is the first image model built on OpenAI's O-series reasoning architecture, which plans compositions and searches the web for context before generating images. It renders text at 99% accuracy across all languages and took the #1 spot on the Image Arena leaderboard within 12 hours of launch with a record 242-point lead.
Core quality improvements are available to all ChatGPT users, including free accounts, through Instant mode. Advanced features like reasoning, web search, multi-image generation (up to 8 images per prompt), and output verification require a ChatGPT Plus subscription ($20/month) or Pro subscription ($200/month). Business and Enterprise plans also include full capabilities.
DALL-E 2 and DALL-E 3 will both be retired on May 12, 2026. GPT-Image-1.5 (released December 2025) remains available through the API for legacy integrations. ChatGPT Images 2.0 replaces DALL-E as OpenAI's primary image generation system going forward.
ChatGPT Images 2.0 topped the Image Arena leaderboard with a 242-point lead, the largest margin ever recorded. Unlike Midjourney, which operates through Discord and a web interface without a public API, Images 2.0 is integrated into ChatGPT and Codex. Midjourney offers stronger community features and style presets, while Images 2.0 has advantages in text rendering, reasoning-driven composition, and ecosystem integration.
The API model identifier is gpt-image-2, with token-based pricing: $8 per million tokens for image input, $2 per million tokens for cached input, and $30 per million tokens for image output. Per-image costs typically range from $0.04 to $0.35 depending on prompt complexity and resolution (up to 2K). The API is expected to open to developers in early May 2026.
OpenAI claims 99% text rendering accuracy across any language and script, including Japanese, Korean, Chinese, Hindi, and Bengali. This is a major improvement over DALL-E 3 and other AI image generators, which frequently distorted letterforms and produced gibberish. If this figure holds in independent testing, Images 2.0 becomes viable for production graphic design and marketing assets.