2024-12-16

Veo 2: Our video generation model

Model quality and comparisons

Many find Veo 2 visually impressive, calling some examples “stunning” and often preferring its output over Sora Turbo's on shared prompts (e.g., a pelican on a bicycle).
Others note clear artifacts: morphing/skating legs, non-physical object motion, uncanny faces, weird slow-motion feel, inconsistent adherence to prompts.
Benchmarks show Sora not clearly leading; Kling and Tencent’s Hunyuan are cited as competitive or better on some prompts.
Some argue this is “the worst it will ever be”; others doubt linear/exponential improvement will automatically lead to full movies or “holodecks.”

Access, openness, and cherry-picking

Frustration that Veo 2 / VideoFX are geo-restricted or behind waitlists; some say we should ignore closed, demo-only releases.
Several recall earlier Google models where internal access revealed heavy cherry-picking compared to glossy demos.
Others argue demos still meaningfully indicate progress, akin to early JWST images.

Compute and open-source ecosystem

Hunyuan, LTX, and other open(-ish) models already run on high-end consumer GPUs (e.g., 24 GB), though often with constraints and tricky setups.
Debate over whether open models (like Stable Diffusion/Flux in images) will dominate video versus closed players (Midjourney/ChatGPT-style).

Use cases and practical value

Near-term uses: b-roll, backgrounds, ads, memes, auto-generated music videos, stock-like footage, and filler for games and websites.
Some say they already use generated video at TV stations and for public advertising spots.
Skeptics argue that current limitations in continuity, character consistency, and control make it unsuitable for coherent narratives or serious production.

Creators, labor, and value of human-made work

Strong concern that video gen tools will displace videographers, VFX artists, animators, and YouTubers, shifting value and control to platforms like Google.
Disagreement over whether audiences will keep valuing “human-made” content once AI becomes indistinguishable, or whether non-synthetic will become a premium/artisanal niche.

Training data, platforms, and legality

Google’s access to YouTube is seen as a huge advantage; others note everyone can scrape it, legally or not.
Debate over whether humans learning from YouTube and corporations training models on it are morally or legally different, especially regarding copyright and consent.

Misinformation, trust, and safety

Many worry hyperrealistic video will supercharge propaganda, election interference, and cults of personality, further eroding trust in media.
Suggestions include cryptographic signing of camera output and public education campaigns that “you can’t trust images/videos/audio anymore,” though others see this as technically or socially fragile.
Some argue similar fears existed for earlier technologies (print, photography, TV, Photoshop); others counter that ease, scale, and speed of modern generation are qualitatively new.

Societal and philosophical reactions

Threads explore accelerationism, capitalism-as-AI, and whether life is getting “worse” despite tech progress.
Split between those excited by democratized creativity and those who see “zero-effort slop,” porn, and ad content as the main outcome.
Persistent disagreement over whether these systems “understand” anything versus being sophisticated pattern predictors—and what that means for future AGI claims.
