Lies I was told about collab editing, Part 1: Algorithms for offline editing

Nature of conflicts: syntax vs semantics

  • Many comments stress the difference between “mathematically conflict‑free” and “semantically correct” results.
  • CRDTs/OT can ensure eventual consistency, but cannot ensure that the merged text says what authors intended.
  • Some argue true resolution requires shared human context, goals, and even “politics” between collaborators; algorithms can only approximate.
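
The gap between "converged" and "correct" can be made concrete with a toy example. The sketch below is a deliberately simplified tombstone-based sequence, not any particular library's algorithm; the names Doc, type_after, and ROOT are hypothetical and exist only for this illustration. Two replicas merge to the same text in either order, yet the text is not what either author meant.

```python
from dataclasses import dataclass, field

ROOT = (0, "")  # sentinel id; the first character is inserted after it


@dataclass
class Doc:
    inserts: dict = field(default_factory=dict)  # id -> (parent id, char)
    deleted: set = field(default_factory=set)    # tombstoned ids

    def copy(self):
        return Doc(dict(self.inserts), set(self.deleted))

    def merge(self, other):
        # Union of grow-only sets: commutative, associative, idempotent.
        return Doc({**self.inserts, **other.inserts}, self.deleted | other.deleted)

    def text(self):
        children = {}
        for id_, (parent, _) in self.inserts.items():
            children.setdefault(parent, []).append(id_)
        out, stack = [], [ROOT]
        while stack:
            node = stack.pop()
            if node != ROOT and node not in self.deleted:
                out.append(self.inserts[node][1])
            # Push siblings in ascending id order so the newest is visited first.
            stack.extend(sorted(children.get(node, [])))
        return "".join(out)


def type_after(doc, replica, after, s):
    """Insert string s after element `after`; return the id of the last char."""
    counter = max((c for c, _ in doc.inserts), default=0)
    for ch in s:
        counter += 1
        new_id = (counter, replica)
        doc.inserts[new_id] = (after, ch)
        after = new_id
    return after


base = Doc()
type_after(base, "base", ROOT, "the quick fox")

alice = base.copy()  # Alice deletes the word "quick " while offline
alice.deleted |= {(n, "base") for n in range(5, 11)}

bob = base.copy()    # Bob turns "quick" into "quicker" while offline
type_after(bob, "bob", (9, "base"), "er")

merged_ab = alice.merge(bob)
merged_ba = bob.merge(alice)
print(alice.text())      # the fox
print(bob.text())        # the quicker fox
print(merged_ab.text())  # the erfox
```

Both merge orders produce "the erfox": the data structure has done its job, and the sentence is still broken.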

Limits of CRDTs and “conflict-free”

  • Overlapping edits (e.g., editing a word that another user deleted) are highlighted as a fundamental hard case.
  • Several note that CRDTs guarantee convergence (via commutative merges), not meaning; a result that is conflict‑free at the data level can still be clearly wrong as text.
  • There is interest in CRDTs that explicitly preserve conflicts (multi-value registers, conflict annotations) rather than silently auto‑merging.
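
A multi-value register is one way to keep a conflict visible instead of picking a winner. What follows is a minimal sketch under stated assumptions: the class name MVRegister, its methods, and the version-vector bookkeeping are hypothetical illustrations, not the API of any real CRDT library. Concurrent writes both survive the merge; a later write that has seen them replaces them.

```python
def dominates(a, b):
    """True if version vector a has seen everything in b and strictly more."""
    keys = set(a) | set(b)
    return (all(a.get(k, 0) >= b.get(k, 0) for k in keys)
            and any(a.get(k, 0) > b.get(k, 0) for k in keys))


class MVRegister:
    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.entries = set()  # {(value, frozenset(version_vector.items()))}

    def _clock(self):
        # Pointwise maximum of every version vector this replica has seen.
        merged = {}
        for _, vv in self.entries:
            for k, n in vv:
                merged[k] = max(merged.get(k, 0), n)
        return merged

    def set(self, value):
        # A local write supersedes everything currently visible here.
        clock = self._clock()
        clock[self.replica_id] = clock.get(self.replica_id, 0) + 1
        self.entries = {(value, frozenset(clock.items()))}

    def merge(self, other):
        # Keep every entry that no other entry causally dominates.
        combined = self.entries | other.entries
        self.entries = {
            (v, vv) for v, vv in combined
            if not any(dominates(dict(o), dict(vv)) for _, o in combined)
        }

    def values(self):
        return sorted(v for v, _ in self.entries)


a, b = MVRegister("a"), MVRegister("b")
a.set("colour")
b.merge(a)

a.set("color")  # concurrent with the next line
b.set("hue")
a.merge(b)
b.merge(a)
conflict = a.values()
print(conflict)  # ['color', 'hue']

a.set("tone")   # written after seeing both alternatives
b.merge(a)
print(b.values())  # ['tone']
```

The application, not the register, decides how to show the two surviving values to a user.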

Offline editing as UX/semantics problem

  • Long‑lived offline branches (e.g., edits on a plane later auto‑merged) cause surprising and unwanted results.
  • Multiple commenters suggest offline collaboration should look more like Git/Word: show diffs, mark conflicts, and require explicit human acceptance.
  • For many domains (legal, journalism, scientific writing), review workflows and explicit sign‑off are seen as essential.
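
The "show diffs and require acceptance" idea can be sketched with Python's standard difflib. This is only an illustration of the workflow, assuming a line-based document; the function names pending_changes and apply_accepted are hypothetical. Nothing from the offline copy reaches the shared copy unless a reviewer accepts that specific change.

```python
import difflib


def pending_changes(current, offline):
    """List the non-trivial opcodes that turn `current` into `offline`."""
    sm = difflib.SequenceMatcher(a=current, b=offline, autojunk=False)
    return [op for op in sm.get_opcodes() if op[0] != "equal"]


def apply_accepted(current, offline, accepted):
    """Rebuild the document, taking offline text only for accepted changes."""
    sm = difflib.SequenceMatcher(a=current, b=offline, autojunk=False)
    out = []
    for op in sm.get_opcodes():
        tag, i1, i2, j1, j2 = op
        if tag != "equal" and op in accepted:
            out.extend(offline[j1:j2])   # reviewer signed off on this change
        else:
            out.extend(current[i1:i2])   # keep the shared version
    return out


current = ["Title", "The contract lasts one year.", "Signed: A"]
offline = ["Title", "The contract lasts two years.", "Signed: A", "Appendix"]

changes = pending_changes(current, offline)
only_inserts = [op for op in changes if op[0] == "insert"]

rejected_all = apply_accepted(current, offline, [])
accepted_all = apply_accepted(current, offline, changes)
accepted_some = apply_accepted(current, offline, only_inserts)
```

Here accepting only the insertion adds the appendix while leaving the contract term untouched until someone signs off on it.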

Git, semantic diff, and alternative algorithms

  • Git is praised for explicit conflict marking but criticized for low‑level, line‑based diffs that ignore AST/semantic structure.
  • Interest in semantic diff/merge for code, circuits, layouts, and rich documents; some report past attempts (e.g., AST merges) proved very complex.
  • Alternative approaches mentioned: event‑graph‑based CRDTs, custom text sync algorithms, differential sync, and server‑ordered “rebase/prediction” instead of pure CRDT/OT.

Bringing conflicts into the data model

  • Several propose representing conflicts structurally inside the data (e.g., conflict ranges, lattice‑based models, conflict types like XOR/aggregate).
  • This would allow collaborative resolution over time and possibly better tooling, while retaining CRDT convergence guarantees.
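
One possible shape for "conflicts inside the data" is sketched below, assuming a document modeled as a mapping from paragraph id to text. The Conflict type and the merge3 and resolve functions are hypothetical names for this illustration, not a proposal from the thread. A disagreement becomes an ordinary value that can be stored, synced, and resolved later, and the merge gives the same result whichever side is merged first.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Conflict:
    base: Optional[str]   # the common ancestor, None if the key was new
    options: frozenset    # competing versions; None means "deleted"


def merge3(base, left, right):
    """Three-way merge per key; disagreements become Conflict values."""
    merged = {}
    for key in sorted(set(base) | set(left) | set(right)):
        b, l, r = base.get(key), left.get(key), right.get(key)
        if l == r:
            value = l                       # both sides agree
        elif l == b:
            value = r                       # only the right side changed
        elif r == b:
            value = l                       # only the left side changed
        else:
            value = Conflict(b, frozenset({l, r}))
        if value is not None:
            merged[key] = value
    return merged


def resolve(doc, key, choice):
    """Replace a Conflict with one of its options (None deletes the key)."""
    conflict = doc[key]
    if not isinstance(conflict, Conflict) or choice not in conflict.options:
        raise ValueError("choice is not one of the recorded options")
    resolved = {k: v for k, v in doc.items() if k != key}
    if choice is not None:
        resolved[key] = choice
    return resolved


base = {"p1": "Intro", "p2": "The quick fox"}
left = {"p1": "Intro"}                             # deleted p2
right = {"p1": "Intro", "p2": "The quicker fox"}   # edited p2

merged = merge3(base, left, right)
final = resolve(merged, "p2", "The quicker fox")
```

The edit-versus-delete case is recorded as a Conflict whose options are the edited text and None, so a person (or a later tool) makes the call explicitly.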

LLMs for merge assistance

  • Some advocate using LLMs to merge conflicting edits, arguing they can infer intent better than traditional algorithms.
  • Others counter that LLMs are unreliable, hard to analyze, and don’t satisfy the properties CRDT merges rely on, such as determinism and associativity.
  • Emerging consensus: LLMs might be helpful as a layer on top of explicit conflict detection, but not as a replacement for deterministic merge algorithms or human review.
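
The algebraic properties at issue can be spot-checked mechanically. The harness below is a rough sketch with hypothetical names (check_merge_laws, union, concat); passing it on a handful of samples is evidence, not proof. Set union stands in for a well-behaved state-based merge, and string concatenation stands in for any order-dependent merge. No LLM is called here; an LLM-backed merge function would simply be another candidate to run through the same checks.

```python
from itertools import product


def check_merge_laws(merge, samples):
    """Test a binary merge function against the usual state-based CRDT laws."""
    return {
        "idempotent": all(merge(a, a) == a for a in samples),
        "commutative": all(
            merge(a, b) == merge(b, a)
            for a, b in product(samples, repeat=2)
        ),
        "associative": all(
            merge(merge(a, b), c) == merge(a, merge(b, c))
            for a, b, c in product(samples, repeat=3)
        ),
    }


def union(a, b):
    return a | b


def concat(a, b):
    return a + b


set_samples = [frozenset(), frozenset({"x"}), frozenset({"x", "y"})]
str_samples = ["", "a", "b"]

union_report = check_merge_laws(union, set_samples)
concat_report = check_merge_laws(concat, str_samples)
print(union_report)   # {'idempotent': True, 'commutative': True, 'associative': True}
print(concat_report)  # {'idempotent': False, 'commutative': False, 'associative': True}
```

A merge that fails any of these checks can give different replicas different results depending on delivery order or retries.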

Infrastructure and practicality

  • Storing CRDT data is reported to put a heavy load on relational databases (especially Postgres); key‑value or LSM‑tree stores are suggested as better fits.
  • Several note that long‑running offline merges are a niche but demanding use case; short‑term offline plus good conflict UI may be the pragmatic target.