GPT Image 1.5
Model quality & comparisons
- Many compare GPT Image 1.5 unfavorably to Nano Banana Pro (Gemini Pro Image), though views vary:
  - Common view: Nano Banana Pro is still best for realism and editing; GPT 1.5 feels “70% as good” or “1–2 generations behind” in some tests.
  - Some users find GPT 1.5 roughly on par in image fidelity but clearly worse in prompt adherence and “world model” (e.g., wrong boat types, broken geometry, malformed clocks).
  - Others highlight GPT 1.5’s strong showing on user-voted leaderboards and specific benchmarks (e.g., image-edit and text-to-image arenas, GenAI Showdown), especially for steerability and localized edits.
- Midjourney is still preferred by several for style, creativity, and aesthetic polish; OpenAI and Google are seen as skewed toward photorealism.
- Seedream models are cited as strongest for aesthetics and Nano Banana Pro for editing/realism; GPT Image 1.5 is perceived as OpenAI doing “enough” to keep users from defecting.
Workflows and capabilities
- Strong enthusiasm around “previz-to-render” workflows: rough 3D/blockout → high-quality render while preserving layout, blocking, poses, and set reuse.
- GPT Image 1.x models praised for understanding scene structure and upscaling/repairing rough previz; Nano Banana Pro often preserves low-fidelity stand-ins instead of refining them.
- Desired future: precise visual editing like “molding clay” (pose dragging, object moving, relighting, image→3D and Gaussian workflows), consistent characters/styles, and better use of reference images.
- Some users report impressive niche capabilities: sprite sheets, pseudo-UV maps, app UI theming, image edits from textual design references.
Technical issues & rollout problems
- Complaints about API availability: the announcement said “available today,” but many users hit 500s or model-not-found errors; the staggered rollout was not clearly communicated.
- Latency: GPT 1.5 often ~20–25s vs <10s for competitors.
- The prior models’ “yellow tint” / “urine filter” is widely discussed; theories include style-tuning artifacts, training-data bias, or intentional branding. The new model seems less affected, but color grading still looks “off” to some.
- Models still fail at basic visual reasoning (counting triangles, drawing 13-hour clock faces, rendering chessboards from FEN notation, specific spatial relationships).
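The rollout errors above are the kind of transient failures that callers typically wrap in retry logic. A minimal sketch of exponential backoff with jitter, assuming a hypothetical `generate` callable standing in for the real image-API request (names and error type are illustrative, not from the OpenAI SDK):

```python
import random
import time


class TransientAPIError(Exception):
    """Stand-in for a 500 / model-not-found response during a staggered rollout."""


def with_backoff(fn, retries=4, base_delay=1.0, sleep=time.sleep):
    """Call fn(), retrying on TransientAPIError with exponential backoff + jitter.

    Delays grow as base_delay * 2**attempt (1s, 2s, 4s, ...) plus a small
    random jitter so many clients don't retry in lockstep. The final failure
    is re-raised to the caller.
    """
    for attempt in range(retries):
        try:
            return fn()
        except TransientAPIError:
            if attempt == retries - 1:
                raise
            sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.25))


if __name__ == "__main__":
    calls = {"n": 0}

    def flaky_generate():
        # Fails twice, then succeeds -- simulating intermittent 500s.
        calls["n"] += 1
        if calls["n"] < 3:
            raise TransientAPIError("model not found")
        return "image-bytes"

    # sleep is stubbed out so the demo runs instantly.
    print(with_backoff(flaky_generate, sleep=lambda _: None))
```

The `sleep` parameter is injected so the backoff schedule can be tested without real delays; a production client would also cap the maximum delay and distinguish retryable from permanent errors.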
Safety filters, bias & usability
- Nano Banana Pro’s safety training has made some image edits unusable (over-triggering on “public figures” or benign photos). GPT Image is sometimes seen as more usable here but remains very strict on copyrighted content.
- Some report racial bias in competing models (e.g., auto-“Indianizing” a face), while GPT Image preserved appearance better in that case.
- Debate over generating images of children: allowed in both systems but heavily constrained; concerns about misuse vs benign family/“imagined children” use cases.
Watermarking, detection & authenticity
- OpenAI embeds C2PA provenance metadata; users can see AI tags via EXIF, but metadata is easy to strip or to bypass via img2img.
- Some argue watermarking creates a false sense of trust: absence of a watermark may be misread as “real”.
- Others want the opposite: cryptographically signed outputs from cameras and hardware-level provenance to confirm authenticity of real photos/videos.
- Consensus that robust detection of fakes at scale is likely impossible; best hope is partial mitigations and provenance for trustworthy sources.
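The “cryptographically signed outputs” idea above can be sketched in a few lines. This is an illustrative toy, not the C2PA design: real provenance manifests use public-key certificates and signed assertions, whereas this HMAC version assumes a shared `device_key` known to both the camera and the verifier, just to show the shape of the check:

```python
import hashlib
import hmac


def sign_capture(image_bytes: bytes, device_key: bytes) -> str:
    """Return a hex signature over the raw image bytes.

    In the thread's scenario, a camera holding a secret key would compute
    this at capture time and embed it alongside the file.
    """
    return hmac.new(device_key, image_bytes, hashlib.sha256).hexdigest()


def verify_capture(image_bytes: bytes, device_key: bytes, sig: str) -> bool:
    """True iff the bytes match the signature (constant-time comparison)."""
    return hmac.compare_digest(sign_capture(image_bytes, device_key), sig)


if __name__ == "__main__":
    key = b"per-device-secret"          # hypothetical hardware-held key
    img = b"\xff\xd8fake-jpeg-bytes"    # stand-in for real image data

    sig = sign_capture(img, key)
    print(verify_capture(img, key, sig))            # untouched file verifies
    print(verify_capture(img + b"edit", key, sig))  # any edit breaks it
```

Note the asymmetry the thread also points out: any re-encode (img2img, a screenshot) yields new bytes with no valid signature, so a *missing* signature proves nothing; the scheme can only vouch for files that still carry one.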
Copyright, ownership & legal anxieties
- Strong backlash from some artists and photographers against their work being used in training without consent; they emphasize agency, association, and discomfort with venture-backed companies monetizing their work.
- Counter-voices dismiss the concern (“if it’s online, expect it to be reused”) or see this as Schumpeterian creative destruction of a broken IP regime.
- One photographer found GPT output closely mimicked a rare photo they had taken, reinforcing fears about derivation.
- Speculation that large IP holders (Disney, etc.) will respond with aggressive licensing and platform-level demands, possibly restricting fan content and UGC.
- Others predict a “post-copyright era,” though this is contested; entrenched rights-holders are expected to fight hard.
Cultural impact, “fake memories” & trust
- Many are disturbed by the product framing: “make images from memories that aren’t real,” fabricate life events, or insert yourself with others (including celebrities or dead relatives).
- Concerns about parasocial uses, disingenuous social media, and deepening confusion between real and fake; some call this “terrifying” and say “truth is dead.”
- A minority is euphoric: sees this as “magical,” democratizing visual expression for non-artists and akin to a new computing era.
- Others foresee widespread use but mostly for “slop” (presentations, LinkedIn posts, propaganda), not deep creativity.
- Nostalgia for authenticity appears: hopes for analog photography comeback, imperfect hand-drawn aesthetics, and in-person verification of reality.
Ecosystem, business & energy
- Users welcome competition but question OpenAI’s long-term angle: is image/video just an expensive way to retain users in the AI “wars”?
- Skepticism that the current $20/month flat pricing will last; some expect price hikes, consolidation, and tacit collusion at the high end.
- Ethical/environmental unease about “burning gigawatts” for synthetic imagery vs arguments that energy should be made abundant so such uses don’t matter.