Update on Llama adoption
LLaVA and Local Tooling
- Several commenters praise LLaVA (a vision-capable LLaMA variant) and note that it is easy to run locally via tools like llama.cpp, Ollama, and various UIs.
- Use cases mentioned include image description, accessibility (alt-text generation), and experimentation with multimodal models.
- Cloud options (e.g., Cloudflare, Replicate) are mentioned, but many commenters stress that self-hosting is straightforward.
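
As a rough illustration of the local alt-text use case, here is a minimal sketch that sends an image to a locally running Ollama server. It assumes Ollama is listening on its default port (11434) and that a `llava` model has already been pulled; the helper names `build_payload` and `describe_image` and the default prompt are hypothetical, chosen only for this example.

```python
import base64
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(image_bytes: bytes, prompt: str, model: str = "llava") -> dict:
    """Build the JSON body for Ollama's generate endpoint (hypothetical helper)."""
    return {
        "model": model,
        "prompt": prompt,
        # Images are sent as base64-encoded strings.
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        # Ask for a single JSON response instead of a token stream.
        "stream": False,
    }


def describe_image(path: str, prompt: str = "Write concise alt text for this image.") -> str:
    """Send one image to the local model and return its text reply."""
    with open(path, "rb") as f:
        payload = build_payload(f.read(), prompt)
    request = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

With the server running and the model pulled (for example via `ollama pull llava`), calling `describe_image("photo.jpg")` should return a short description; the file name is a placeholder. Nothing leaves the machine, which is the property commenters cite in favor of self-hosting.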
How “Open” Is Llama? Weights, Data, and EULAs
- Major debate centers on Meta calling Llama “open source” while:
  - Weights are downloadable only after accepting a custom license/EULA.
  - Training data and full training pipeline details are not released.
- Critics compare weights to compiled binaries: useful but not “source,” so this is at best “open weights” or “source-available,” not open source.
- Others argue that for most users, weights + inference/finetuning code is effectively enough, and full training reproducibility is impractical anyway.
Definitions of Open Source and Language Drift
- One camp insists on OSI-style definitions: no use restrictions, access to the full “preferred form for modification,” and clear licensing. Under this view, anything else is misuse of the term or “open-washing.”
- Another camp claims “open source” for AI is still unsettled; for LLMs, weights-available-with-some-restrictions may become the de facto meaning.
- There is a meta-debate over whether redefining “open source” (especially when large corporations do it) is manipulative marketing or natural language evolution.
Meta’s Motives and Ecosystem Strategy
- Supporters highlight Meta’s large contributions to developer tooling (frameworks, infra) and argue Llama is far more open than proprietary rivals, enabling local, offline, and confidential use.
- Skeptics see a strategic “dumping” move: commoditize the model layer, erode competitors’ business models, and centralize ecosystem control around Meta’s stack.
Licensing, Enforcement, and Risk
- Some argue licenses are toothless because it’s hard to prove which model produced an output, especially after finetuning or merging.
- Others counter that subpoenas, discovery, and leaks (employees or hackers) make willful violations risky, especially for larger entities.
Open Data, Copyright, and Fully Open Models
- Several comments note that truly open models (including training data) are likely impossible under current copyright regimes.
- There is frustration that copyright and proprietary datasets block transparent, fully reproducible “open AI,” and concern that this permanently handicaps genuinely open alternatives.
Regulation (California SB 1047)
- Brief side discussion on SB 1047: some fear it will chill open releases and entrench only a few large, regulated players; others argue regulation can be updated and that big markets like California can dictate compliance.