XML is a cheap DSL

What “cheap” means for XML as a DSL

  • Many commenters interpret “cheap” as low setup cost, not computational efficiency.
  • XML is seen as a ready-made parser/AST with ubiquitous libraries, XPath/XSLT/XQuery tooling, and widespread platform support.
  • For custom DSLs that must run in many environments, reusing XML parsing and tooling is viewed as a major win versus inventing and porting a bespoke language.
  • Others argue the “cheapness” is illusory when you factor in schema design, tooling complexity, and team learning curve.
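
As a rough illustration of the "ready-made parser/AST" point, here is a minimal sketch that loads a tiny rule DSL using only Python's standard-library ElementTree. The element names, attribute names, and the `gte`/`lt` operators are hypothetical, invented for this example and not taken from any system mentioned in the discussion.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule DSL: every element and attribute name here is made up.
RULES_XML = """
<rules>
  <rule id="free-shipping" priority="2">
    <when field="total" op="gte" value="50"/>
    <then action="set-shipping" amount="0"/>
  </rule>
  <rule id="bulk-discount" priority="1">
    <when field="quantity" op="gte" value="10"/>
    <then action="discount" amount="5"/>
  </rule>
</rules>
"""

# Comparison operators the hypothetical DSL supports.
OPS = {"gte": lambda a, b: a >= b, "lt": lambda a, b: a < b}


def load_rules(xml_text):
    """Parse the DSL into plain dicts, sorted by ascending priority."""
    root = ET.fromstring(xml_text)
    rules = []
    for rule in root.findall("rule"):
        when, then = rule.find("when"), rule.find("then")
        rules.append({
            "id": rule.get("id"),
            "priority": int(rule.get("priority")),
            "field": when.get("field"),
            "op": when.get("op"),
            "value": float(when.get("value")),
            "action": then.get("action"),
        })
    return sorted(rules, key=lambda r: r["priority"])


def fired_actions(rules, order):
    """Return the actions of every rule whose condition holds for `order`."""
    return [r["action"] for r in rules
            if OPS[r["op"]](order[r["field"]], r["value"])]


rules = load_rules(RULES_XML)
# ElementTree's limited XPath subset covers simple attribute queries.
top = ET.fromstring(RULES_XML).findall("rule[@priority='1']")
```

Calling `fired_actions(rules, {"total": 60, "quantity": 3})` returns `["set-shipping"]`. The parser and tree come for free; the vocabulary, the operator table, and the error handling (omitted here) are the part the sceptics say still has to be designed.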

XML vs JSON/YAML and other formats

  • One strong camp holds that JSON “just works” for APIs and data interchange: it has simple types, maps naturally onto in-memory structures, and needs less boilerplate than XML with SAX/DOM/XSD.
  • Counterpoint: JSON’s lack of a built-in schema, comments, richer types, and streaming/query tools shifts complexity into ad hoc validation code.
  • YAML is widely used but heavily criticized as footgun-prone, underspecified, and hard to parse safely.
  • Several mention S-expressions, EDN, Lisp, or eDSLs in languages like Haskell/OCaml/Scala as cleaner foundations for DSLs.
  • Some propose constrained or profiled subsets of XML or JSON to avoid “kitchen sink” complexity.

Schemas, validation, and correctness

  • XML with XSD/RELAX NG/Schematron is praised for structural guarantees and catching typos/shape errors early.
  • Others note schema validation can’t express all business rules; domain logic still needs code-level validation.
  • JSON Schema and tools like Zod are seen as bringing some of this rigor to JSON, though usage is uneven.
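
The split between shape checks and business rules can be sketched as follows. This is an assumption-laden example: the timesheet fields (`employee`, `start`, `end`, `hours`) and both rules are hypothetical, and the structural check is hand-written here only to keep the sketch within the standard library; it stands in for what a schema language would do.

```python
import json
from datetime import date

# Hypothetical timesheet record: field names and expected types are invented.
REQUIRED = (("employee", str), ("start", str), ("end", str), ("hours", (int, float)))


def check_shape(doc):
    """Structural checks: the kind of constraint a schema typically expresses."""
    errors = []
    for key, expected in REQUIRED:
        if key not in doc:
            errors.append(f"missing field: {key}")
        elif not isinstance(doc[key], expected):
            errors.append(f"wrong type for {key}")
    return errors


def check_rules(doc):
    """Cross-field business rules that usually end up in code."""
    start = date.fromisoformat(doc["start"])
    end = date.fromisoformat(doc["end"])
    if end < start:
        return ["end precedes start"]
    days = (end - start).days + 1  # inclusive period length
    if doc["hours"] > 24 * days:
        return ["hours exceed period length"]
    return []


def validate(text):
    """Run shape checks first; only well-formed records reach the rules."""
    doc = json.loads(text)
    errors = check_shape(doc)
    return errors if errors else check_rules(doc)
```

Whether the first function is replaced by XSD, RELAX NG, JSON Schema, or Zod, something like the second function tends to remain, which is the point made by those who say schemas cannot carry all the domain logic.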

Parsing complexity and performance

  • Multiple comments say fully correct XML parsing (DTD, entities, namespaces, security hardening) is non-trivial and often slow or memory-heavy, especially with DOM.
  • Streaming approaches (SAX, pull parsers, visibly pushdown automata) help but are harder to program against.
  • Some argue most DSLs are small enough that performance isn’t the bottleneck; others blame “it’s fast enough” thinking for modern latency bloat.
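
To make the DOM-versus-streaming trade-off concrete, here is a small sketch that computes the same total both ways with Python's standard-library ElementTree. The `<item amount="..."/>` document shape is a made-up example. Neither function is hardened against hostile input, so treat this as an illustration of the programming model only, not of safe parsing of untrusted XML.

```python
import io
import xml.etree.ElementTree as ET


def total_dom(xml_bytes):
    """Build the whole tree in memory, then walk it. Simple, but memory-heavy."""
    root = ET.fromstring(xml_bytes)
    return sum(float(item.get("amount")) for item in root.iter("item"))


def total_streaming(stream):
    """Handle each <item> as soon as its end tag is seen."""
    total = 0.0
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == "item":
            total += float(elem.get("amount"))
            # clear() drops the element's attributes and children. The emptied
            # element stays attached to its parent, so very large inputs may
            # also need the parent pruned periodically.
            elem.clear()
    return total


# Hypothetical sample document.
SAMPLE = b'<items><item amount="1.5"/><item amount="2.5"/></items>'
dom_total = total_dom(SAMPLE)
stream_total = total_streaming(io.BytesIO(SAMPLE))
```

Both calls produce `4.0` for the sample. The streaming version already needs explicit state (`total`) and manual cleanup for a trivial aggregate, which hints at why commenters find streaming APIs harder to program against once the logic depends on nesting or context.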

DSL ergonomics, debugging, and alternatives

  • Angle-bracket DSLs are often seen as noisy and hard to author/debug compared to command/argument syntaxes or embedded DSLs in general-purpose languages.
  • Debugging XML-based DSLs and XSLT/XQuery is widely described as painful; many teams eventually rewrote the logic in Python, Java, or another general-purpose language.
  • There is a recurring concern about DSLs growing into full programming languages (Greenspun’s tenth rule) and accumulating hidden complexity.
  • Nonetheless, several real systems (tax engines, payroll formulas, e-invoicing, enterprise integration) already encode logic in XML, showing the approach is practical if not pleasant.
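
For a sense of what "logic encoded in XML" looks like, here is a toy formula evaluator. It is a sketch only: the `<formula>`, `<var>`, `<const>`, `<add>`, `<sub>`, and `<mul>` vocabulary is hypothetical and not modelled on any real tax, payroll, or e-invoicing format.

```python
import xml.etree.ElementTree as ET

# Hypothetical formula: net pay = gross - gross * 0.25
FORMULA = """
<formula name="net-pay">
  <sub>
    <var name="gross"/>
    <mul>
      <var name="gross"/>
      <const value="0.25"/>
    </mul>
  </sub>
</formula>
"""

# Binary operators of the made-up vocabulary.
OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
}


def evaluate(node, env):
    """Recursively evaluate one expression element against variable bindings."""
    if node.tag == "const":
        return float(node.get("value"))
    if node.tag == "var":
        return env[node.get("name")]
    if node.tag in OPS:
        # Each operator element must have exactly two child expressions.
        left, right = (evaluate(child, env) for child in node)
        return OPS[node.tag](left, right)
    raise ValueError(f"unknown element: {node.tag}")


def run(formula_xml, env):
    """Evaluate the single expression nested inside <formula>."""
    root = ET.fromstring(formula_xml)
    return evaluate(root[0], env)


net = run(FORMULA, {"gross": 1000.0})
```

Here `net` is `750.0`. The interpreter is short, which supports the "practical" half of the claim; the seven lines of markup needed to say `gross - gross * 0.25` illustrate the "not pleasant" half, and adding conditionals or loops to `OPS` is how such a DSL drifts toward a full language.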