2024-08-01

Flux: Open-source text-to-image model with 12B parameters

Model variants, licensing & availability

Three variants discussed:
- FLUX.1 [schnell]: 4‑step, Apache 2.0, open weights, “fast” but slightly lower quality.
- FLUX.1 [dev]: open weights with non‑commercial license, guidance‑distilled.
- FLUX.1 [pro]: highest quality, API-only, closed weights.
Commenters criticize calling “dev” open source given its usage restrictions; several argue “open weights” is the more accurate term.
Some uncertainty over what “guidance-distilled” means and how exactly dev/pro differ in practice.
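
For readers unsure what “guidance-distilled” refers to: in the general technique, classifier-free guidance combines a conditional and an unconditional prediction at every denoising step, and guidance distillation trains a student model to reproduce that combined output in a single evaluation. The sketch below illustrates the idea with toy stand-ins. `toy_teacher`, `toy_student`, and the numbers are hypothetical, and nothing here is taken from Flux's actual training recipe, which the thread does not describe.

```python
from typing import Callable, List

Vector = List[float]


def guided_prediction(
    teacher: Callable[[Vector, bool], Vector],
    x: Vector,
    scale: float,
) -> Vector:
    """Classifier-free guidance: two teacher evaluations per denoising step."""
    uncond = teacher(x, False)  # prediction without the prompt
    cond = teacher(x, True)     # prediction with the prompt
    # Move away from the unconditional prediction, toward the conditional one.
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


def toy_teacher(x: Vector, use_prompt: bool) -> Vector:
    """Hypothetical stand-in for a diffusion model; not a real network."""
    factor = 2.0 if use_prompt else 1.0
    return [factor * v for v in x]


def toy_student(x: Vector, scale: float) -> Vector:
    """Hypothetical distilled model: takes the guidance scale as an input and
    returns the guided prediction in one evaluation. A real student would be
    trained to match guided_prediction; here the match is written by hand."""
    return [(1.0 + scale) * v for v in x]


if __name__ == "__main__":
    x = [1.0, 2.0]
    print(guided_prediction(toy_teacher, x, scale=3.0))  # two teacher calls
    print(toy_student(x, scale=3.0))                     # one student call
```

In this toy setup both calls produce the same vector; the point of distillation in general is that the student needs one model evaluation per step where guided sampling needs two.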

Image quality, prompt adherence & comparisons

Many commenters find quality “remarkably good,” some saying dev/pro rival or exceed Midjourney 6.x and SD3, especially for photorealism and text-in-image.
Schnell is praised for speed and surprisingly good text rendering, though it also reproduces watermarks/logos from its training data more clearly.
Others note weak prompt adherence in the official examples (beach, cooking), with requested elements missing, and criticize the vague, “artsy” wording of the prompts.
Flux often fails at complex compositional prompts, spatial relations, negation (“no X”), engineering diagrams, precise layouts, and specific stylistic requests (e.g., certain fine‑art painters).

Hardware, local use & tooling

Official guidance: 12B parameters, ~24–33 GB VRAM typical; A100 not strictly required.
Reports of workable setups: 24–32 GB gaming cards, 32 GB V100, Jetson AGX Orin (slow), and even 8–12 GB VRAM with heavy offloading (very slow).
Mixed results on Apple Silicon due to bfloat16/MPS issues.
Popular frontends: ComfyUI, StableSwarmUI, Automatic1111; schnell/dev already integrated.
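
The lower end of the VRAM range quoted above follows from simple arithmetic on the parameter count. The sketch below estimates memory for the weights alone, assuming 12B parameters and standard bytes-per-parameter for each dtype. The helper name `weight_memory_gb` is made up for illustration, and the estimate deliberately ignores other pipeline components and activations, which is one plausible reason real usage lands above the weights-only figure.

```python
def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Memory for the weights alone, in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


FLUX_PARAMS = 12e9  # parameter count quoted in the thread

if __name__ == "__main__":
    for name, nbytes in [("float32", 4), ("bfloat16", 2)]:
        print(f"{name}: {weight_memory_gb(FLUX_PARAMS, nbytes):.0f} GB")
```

At 2 bytes per parameter the weights alone come to 24 GB, which is why cards with 8–12 GB can only run the model by moving weights on and off the GPU.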

Censorship, NSFW & bias

Hosted endpoints apply NSFW filters and sometimes return black images. Commenters attribute this to post‑inference classifiers rather than the core model, which implies the model itself can generate NSFW content.
Commenters also note political bias: generic prompts like “a president” yield specific, similar-looking figures.
Some users explicitly seek uncensored local use and expect fast NSFW fine‑tunes.
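
The post-inference filtering that commenters describe for hosted endpoints can be pictured as a wrapper around generation: the model produces an image, a separate classifier inspects the finished output, and a flagged image is swapped for a black one. This is a minimal sketch of that pattern under stated assumptions, not any provider's implementation. The `Image` type, `generate_with_filter`, and the toy generator and classifier are all hypothetical.

```python
from typing import Callable, List

Image = List[List[int]]  # toy grayscale image: rows of 0-255 pixel values


def black_image_like(image: Image) -> Image:
    """Return an all-black image with the same dimensions."""
    return [[0 for _ in row] for row in image]


def generate_with_filter(
    generate: Callable[[str], Image],
    is_flagged: Callable[[Image], bool],
    prompt: str,
) -> Image:
    image = generate(prompt)   # the model itself runs unfiltered
    if is_flagged(image):      # the classifier only sees the finished output
        return black_image_like(image)
    return image


# Hypothetical stand-ins so the sketch runs end to end.
def toy_generate(prompt: str) -> Image:
    value = 200 if "bright" in prompt else 50
    return [[value, value], [value, value]]


def toy_is_flagged(image: Image) -> bool:
    return max(max(row) for row in image) > 128


if __name__ == "__main__":
    print(generate_with_filter(toy_generate, toy_is_flagged, "bright scene"))
    print(generate_with_filter(toy_generate, toy_is_flagged, "dim scene"))
```

Because the filter sits outside the model, running the same weights locally without the wrapper would skip it, which matches the interest in local use noted above.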

Data, IP & “open source” debate

Strong debate over whether models are copyrightable, whether licenses on weights are enforceable, and whether models are derivative works of their training data.
Concerns about learned logos/watermarks suggesting copyrighted sources; others argue training likely uses publicly visible, already‑quoted material.
Broader argument over misuse of the term “open source” for models without training data and with restrictive terms.

fal.ai UX, pricing & positioning

fal.ai clarifies it did not build Flux, only hosts optimized inference.
Mixed feedback: initial no‑login access later gated; GitHub-only login; prompts lost on sign-in; “low balance” emails despite free credits; unclear free-tier limits.
