Facebook scraped every Australian adult user's public posts to train AI
Scope of Meta’s Data Use
- Meta trained AI models on many years of Facebook/Instagram public posts, including Australian adults’ content; some wonder if deactivated/deleted or restricted-audience posts are included (unclear from thread).
- Several argue this is unsurprising and consistent with Meta’s long‑standing business model and broad ToS licenses; others emphasize that public acknowledgment is new and important.
Public vs. Private, and “Consent”
- Strong debate over what “public” means:
  - Some say if you post publicly, anyone (including Meta) can read and reuse it.
  - Others counter that “Facebook public” is behind a login, not web‑public, and users didn’t foresee large‑scale AI training as a use.
- Many point out that click‑through ToS is not informed consent; people rarely grasp technical possibilities or legal implications.
Children’s Data and Age Issues
- Concern that self‑reported age is unreliable; under‑13 data may have been collected and used.
- Some highlight stricter legal/ethical standards for children’s data (e.g., COPPA‑type rules) and describe use of kids’ content as “icky” or “monstrous.”
Legal and Copyright Debates
- Split views on legality:
  - One side: Meta has explicit license via its terms; training on public content should be allowed and is analogous to humans learning from reading.
  - Other side: training on full works without permission is framed as mass copyright infringement, especially harmful to small creators whose styles can be replicated at scale.
- Multiple comments argue that existing copyright law should apply, with liability attaching to infringing outputs rather than to training itself; others call for new laws tailored to AI.
Ethics, Privacy, and Power Imbalance
- Critics stress that “can view” ≠ “can do anything with,” comparing it to reusing a billboard; they also point to the asymmetry of Meta exploiting users’ data while forbidding users from scraping its platforms.
- Some see this as another step in pervasive surveillance and data exploitation, with users unable to meaningfully opt out.
Data Quality and AI Outcomes
- Some mock the idea that 16 years of Facebook posting is “intelligence,” citing memes, low‑quality content, and engagement‑driven toxicity.
- Others reply that large social datasets are valuable for learning real-world language and behavior, though they also embed social biases.