Lies I was told about collab editing, Part 1: Algorithms for offline editing
Nature of conflicts: syntax vs semantics
- Many comments stress the difference between “mathematically conflict‑free” and “semantically correct” results.
- CRDTs/OT can ensure eventual consistency, but cannot ensure that the merged text says what authors intended.
- Some argue true resolution requires shared human context, goals, and even “politics” between collaborators; algorithms can only approximate.
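To make the gap between convergence and intent concrete, here is a minimal sketch of a last-writer-wins register, one of the simplest CRDTs. The class name `LWW`, the sample sentences, and the integer timestamps are illustrative assumptions, not something taken from the discussion. The sketch also assumes every write carries a unique (timestamp, replica) tag.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LWW:
    """Last-writer-wins register: a value tagged with (timestamp, replica id)."""
    value: str
    timestamp: int
    replica: str


def merge(a: LWW, b: LWW) -> LWW:
    # Keep the write with the larger (timestamp, replica) tag. Taking a maximum
    # is commutative, associative, and idempotent, so all replicas converge.
    return a if (a.timestamp, a.replica) >= (b.timestamp, b.replica) else b


# Two authors edit the same sentence while offline (hypothetical data).
alice = LWW("The deadline is Friday.", timestamp=5, replica="alice")
bob = LWW("The deadline is firm.", timestamp=5, replica="bob")

# Both merge orders give the same state, so the replicas agree...
assert merge(alice, bob) == merge(bob, alice)

# ...but one author's edit is silently dropped, and nobody was asked.
print(merge(alice, bob).value)
```

The merge is "conflict-free" in the mathematical sense, yet the surviving sentence is chosen by a tie-break on replica id, which has nothing to do with what either author meant.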
Limits of CRDTs and “conflict-free”
- Overlapping edits (e.g., editing a word that another user deleted) are highlighted as a fundamental hard case.
- Several note that CRDTs guarantee convergence and commutativity, not meaning; a merge that is conflict‑free at the data level often yields clearly wrong text.
- There is interest in CRDTs that explicitly preserve conflicts (multi-value registers, conflict annotations) rather than silently auto‑merging.
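A multi-value register is one way to preserve a conflict instead of hiding it. The following is a minimal sketch, assuming version vectors stored as sorted (replica, counter) pairs. The names `Version`, `make`, `dominates`, and `merge` are hypothetical and not taken from any particular CRDT library.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    value: str
    clock: tuple  # sorted (replica, counter) pairs, kept hashable


def make(value: str, clock: dict) -> Version:
    return Version(value, tuple(sorted(clock.items())))


def dominates(a: Version, b: Version) -> bool:
    """True if a has seen every write b has seen, plus at least one more."""
    ca, cb = dict(a.clock), dict(b.clock)
    keys = ca.keys() | cb.keys()
    at_least = all(ca.get(k, 0) >= cb.get(k, 0) for k in keys)
    strictly = any(ca.get(k, 0) > cb.get(k, 0) for k in keys)
    return at_least and strictly


def merge(xs: frozenset, ys: frozenset) -> frozenset:
    # Keep every version that no other version supersedes. Concurrent writes
    # do not dominate each other, so both survive as visible alternatives.
    both = xs | ys
    return frozenset(v for v in both if not any(dominates(w, v) for w in both))


# Hypothetical scenario: Alice fixes a typo while Bob deletes the phrase.
base = make("teh cat", {"alice": 1})
alice = make("the cat", {"alice": 2})
bob = make("", {"alice": 1, "bob": 1})

state = merge(frozenset({alice}), frozenset({bob}))
print(sorted(v.value for v in state))  # both alternatives are kept

# A later write that has seen both alternatives resolves the conflict.
resolved = make("the cat", {"alice": 3, "bob": 1})
print(merge(state, frozenset({resolved})) == frozenset({resolved}))
```

The register still converges, because the merge is a set union followed by dropping superseded versions. What changes is that the overlapping edit stays in the state as two alternatives until someone writes a value that has seen both.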
Offline editing as UX/semantics problem
- Long‑lived offline branches (e.g., edits on a plane later auto‑merged) cause surprising and unwanted results.
- Multiple commenters suggest offline collaboration should look more like Git/Word: show diffs, mark conflicts, and require explicit human acceptance.
- For many domains (legal, journalism, scientific writing), review workflows and explicit sign‑off are seen as essential.
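The Git-style behaviour described above can be sketched as a line-wise three-way merge that applies one-sided changes automatically and marks everything else for a human. This is a deliberately simplified illustration: it assumes all three versions have the same number of lines (edits happen in place), which real merge tools do not require. The function name `merge3` and the sample text are hypothetical.

```python
def merge3(base, ours, theirs):
    """Line-wise three-way merge. Returns (merged_lines, has_conflict)."""
    if not (len(base) == len(ours) == len(theirs)):
        raise ValueError("this sketch only handles in-place line edits")
    merged, has_conflict = [], False
    for b, o, t in zip(base, ours, theirs):
        if o == t:      # both sides agree, or neither side changed the line
            merged.append(o)
        elif o == b:    # only their side changed the line
            merged.append(t)
        elif t == b:    # only our side changed the line
            merged.append(o)
        else:           # both sides changed it differently: surface it
            has_conflict = True
            merged += ["<<<<<<< ours", o, "=======", t, ">>>>>>> theirs"]
    return merged, has_conflict


base = ["Title", "The contract ends in May.", "Signed, A."]
ours = ["Title", "The contract ends in June.", "Signed, A."]
theirs = ["New Title", "The contract ends in July.", "Signed, A."]

merged, has_conflict = merge3(base, ours, theirs)
print("\n".join(merged))
print("needs review:", has_conflict)
```

The title change merges cleanly because only one side touched it. The contract date is wrapped in conflict markers and the `has_conflict` flag is set, so an application can block publication until a reviewer picks a version.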
Git, semantic diff, and alternative algorithms
- Git is praised for explicit conflict marking but criticized for poor, low‑level diffs that ignore AST/semantic structure.
- There is interest in semantic diff/merge for code, circuits, layouts, and rich documents; some report that past attempts (e.g., AST merges) proved very complex.
- Alternative approaches mentioned: event‑graph‑based CRDTs, custom text sync algorithms, differential sync, and server‑ordered “rebase/prediction” instead of pure CRDT/OT.
Bringing conflicts into the data model
- Several propose representing conflicts structurally inside the data (e.g., conflict ranges, lattice‑based models, conflict types like XOR/aggregate).
- This would allow collaborative resolution over time and possibly better tooling, while retaining CRDT convergence guarantees.
LLMs for merge assistance
- Some advocate using LLMs to merge conflicting edits, arguing they can infer intent better than traditional algorithms.
- Others counter that LLMs are unreliable, hard to analyze, and don’t satisfy the properties a CRDT merge function needs, such as determinism and associativity.
- Emerging consensus: LLMs might be helpful as a layer on top of explicit conflict detection, but not as a replacement for deterministic merge algorithms or human review.
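One way to read the layering in the last point is that a model may attach a suggestion to a conflict that has already been detected, while only a human decision changes the document. The sketch below assumes that reading. `Conflict`, `annotate`, and `accept` are hypothetical names, and `stub_suggest` is a stand-in for whatever model call an application might make; no real LLM API is used.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Conflict:
    ours: str
    theirs: str
    suggestion: Optional[str] = None   # advisory only, never auto-applied
    resolution: Optional[str] = None   # set only by an explicit human decision
    resolved_by: Optional[str] = None


def annotate(conflict: Conflict, suggest: Callable[[str, str], str]) -> Conflict:
    # The suggester may be non-deterministic. That is acceptable here because
    # its output is stored as a hint and does not change the merged document.
    conflict.suggestion = suggest(conflict.ours, conflict.theirs)
    return conflict


def accept(conflict: Conflict, reviewer: str, text: Optional[str] = None) -> Conflict:
    # A reviewer either accepts the suggestion or supplies their own text.
    chosen = text if text is not None else conflict.suggestion
    if chosen is None:
        raise ValueError("nothing to accept: no suggestion and no text given")
    conflict.resolution = chosen
    conflict.resolved_by = reviewer
    return conflict


def stub_suggest(ours: str, theirs: str) -> str:
    # Placeholder for a model call (an assumption): it simply prefers "ours".
    return ours


conflict = annotate(Conflict("ends in June", "ends in July"), stub_suggest)
print(conflict.resolution)   # still unresolved after the suggestion
accept(conflict, reviewer="editor")
print(conflict.resolution, conflict.resolved_by)
```

Conflict detection stays deterministic, and the reviewer's choice is the only thing that becomes part of the document.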
Infrastructure and practicality
- Storing CRDT data is reported to put a heavy load on relational databases (especially Postgres); key‑value or LSM‑tree stores are suggested as better fits.
- Several note that long‑running offline merges are a niche but demanding use case; short‑term offline plus good conflict UI may be the pragmatic target.