I ported JustHTML from Python to JavaScript with Codex CLI and GPT-5.2 in hours

LLM-assisted porting and the power of conformance tests

  • The thread sees this as a prime example of a specific new capability: automated porting of libraries when there’s a large, implementation-independent conformance suite.
  • The 9k+ html5lib tests are viewed as the key “oracle”: an agent can iterate until every test passes, which yields bug-for-bug compatibility with the original.
  • Several commenters argue this pattern could generalize to many other ports and ecosystems, especially where solid specs and tests already exist.
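
To make the "oracle" idea concrete, here is a minimal sketch of a harness that reads tests in the style of the html5lib tree-construction `.dat` files and reports which ones a candidate implementation fails. It is a simplification under stated assumptions: the section parser does not handle every corner of the real format (for example, input data containing a line that looks like a section marker), and `parse_dat`, `run_suite`, and the `implementation` callable are hypothetical names for illustration, not part of JustHTML or html5lib.

```python
from typing import Callable, Dict, List

# Section markers recognised by this simplified reader (assumption: a line
# equal to one of these always starts a new section).
SECTIONS = {
    "#data", "#errors", "#new-errors", "#document-fragment",
    "#script-on", "#script-off", "#document",
}


def parse_dat(text: str) -> List[Dict[str, str]]:
    """Split .dat-style text into tests, each a dict of section name -> body."""
    tests: List[Dict[str, List[str]]] = []
    current = None
    section = None
    for line in text.splitlines():
        if line == "#data":
            # Every test begins with a #data section.
            if current is not None:
                tests.append(current)
            current = {"data": []}
            section = "data"
        elif line in SECTIONS and current is not None:
            section = line[1:]
            current[section] = []
        elif current is not None:
            current[section].append(line)
    if current is not None:
        tests.append(current)
    # Join lines and drop the blank separator line that ends each test.
    return [{k: "\n".join(v).rstrip("\n") for k, v in t.items()} for t in tests]


def run_suite(tests: List[Dict[str, str]],
              implementation: Callable[[str], str]) -> List[int]:
    """Return the indices of tests whose serialized tree differs from #document."""
    failures = []
    for index, test in enumerate(tests):
        try:
            actual = implementation(test["data"])
        except Exception as exc:  # a crash counts as a failure, not an abort
            actual = f"<raised {type(exc).__name__}>"
        if actual != test.get("document", ""):
            failures.append(index)
    return failures


SAMPLE = (
    "#data\n<p>hi\n#errors\n#document\n"
    "| <html>\n|   <head>\n|   <body>\n|     <p>\n|       \"hi\"\n"
    "\n"
    "#data\n<b>x\n#errors\n#document\n"
    "| <html>\n|   <head>\n|   <body>\n|     <b>\n|       \"x\"\n"
)
```

An agent loop would call `run_suite` after each change and keep going until the returned list is empty; the list of failing indices is the feedback signal.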

Language-independent test formats & test generation

  • People ask whether there are standard, language-agnostic test formats; suggestions include TAP, Cucumber, tape, etc., though nothing emerges as a clear universal standard.
  • Others propose pipelines: use LLMs (and possibly fuzzing) to generate high-coverage tests from an existing implementation, then give another agent those tests to clone the behavior in a new language.
  • Skeptics note that achieving thorough coverage and “necessary and sufficient” tests is much harder than it sounds.
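
The proposed pipeline can be sketched as "golden" test recording: feed generated inputs to the existing implementation and store the input/output pairs in a language-neutral file that a port in any language can replay. The sketch below is an illustration only. `reference_normalize` is a hypothetical stand-in for the implementation being cloned, the random generator is a crude substitute for real fuzzing, and JSON is one possible interchange format, not one the thread settled on.

```python
import json
import random
from typing import Callable, Dict, List


def reference_normalize(text: str) -> str:
    """Hypothetical stand-in for the existing implementation being cloned."""
    return " ".join(text.split()).lower()


def random_input(rng: random.Random, max_len: int = 12) -> str:
    """Generate a short random string from a small, edge-case-heavy alphabet."""
    alphabet = "aB \t<>/"
    return "".join(rng.choice(alphabet) for _ in range(rng.randint(0, max_len)))


def record_golden_cases(reference: Callable[[str], str],
                        count: int,
                        seed: int = 0) -> List[Dict[str, str]]:
    """Run the reference on generated inputs and record what it returns."""
    rng = random.Random(seed)  # fixed seed keeps the suite reproducible
    seen = set()
    cases = []
    for _ in range(count):
        candidate = random_input(rng)
        if candidate in seen:
            continue  # skip duplicates so each case adds information
        seen.add(candidate)
        cases.append({"input": candidate, "expected": reference(candidate)})
    return cases


# The JSON text is the language-independent artifact handed to the porting agent.
golden_json = json.dumps(record_golden_cases(reference_normalize, 50), indent=2)
```

This also illustrates the skeptics' point: the recorded cases only pin down behavior on the inputs the generator happened to produce, so coverage is bounded by how well the generator explores the input space.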

Open source, tests, and AI-era incentives

  • Some now keep tests private, partly to make AI-powered cloning harder; others take the opposite stance, seeing language-independent test suites as high-leverage public goods.
  • There’s a broader debate over whether sharing in the AI era empowers collaboration or enables “IP theft,” with SQLite’s private tests cited as a protective strategy.

Licensing, copyright, and derivative works

  • Multiple commenters argue that LLM ports are clearly derivative works: original licenses (especially MIT-style) must be preserved and original authors credited.
  • GPL is discussed as a moral line: even if “copyright laundering” via specs might be technically possible, some consider it ethically off-limits unless the port remains GPL.
  • Others highlight unresolved questions: whether LLM-assisted outputs are copyrightable at all, what counts as “sufficient human authorship,” and who owns code largely written by an AI vendor’s model.

Impact on software work and cost

  • Some readers see this as evidence that certain categories of coding (especially mechanical ports) are getting dramatically cheaper, potentially reducing demand for junior/mid engineers.
  • Others push back: this is an idealized case with a great spec, test suite, and preexisting API; most real-world projects lack these, involve evolving, fuzzy requirements, and still require deep human understanding.

Limits, generalization, and quality concerns

  • Commenters warn that success here depends on HTML being a very well-known domain for models and on small, well-scoped prompts; large, messy inputs still degrade quality.
  • Not all AI-assisted ports work this well; examples are shared where agents produced unshippable, subtly broken ports.
  • There’s concern about non-idiomatic target code: mechanical translation can produce ugly, flag-heavy structures that don’t fit the destination language’s style.

Specs, tests, and maintainability vs disposable code

  • Several people argue this reinforces a shift: specs + tests become the true source of truth; code is disposable and regenerable.
  • Others caution that maintainability still matters: in real systems, requirements and inputs evolve, and constant rewrites would be too risky and disruptive.
  • There’s also a fear that if tests/specs become the primary long-lived artifact, the “fun” and craft of coding may give way to writing tests and specs while agents write the code.

HTML5 parser ecosystem side-notes

  • Discussion highlights that multiple HTML5 parsers (Rust, Python, OCaml) share the same html5lib test corpus, underlining how powerful shared conformance suites can be.
  • Some note that html5lib tests sometimes diverge from real browser behavior (e.g., SVG namespacing), suggesting another avenue: systematically comparing the suite’s expected output against what Chrome and Firefox actually produce.
  • The long-standing Firefox pipeline of maintaining an HTML parser in Java and mechanically translating it to C++ is raised as a similar, pre-LLM example of “code as compiled artifact,” with speculation that future toolchains could use TypeScript or other high-level sources similarly.

Ethics, responsibility, and disclosure

  • One commenter criticizes raising “Is it ethical/legal?” only after publishing, arguing that ethics should precede action.
  • The author’s position, as interpreted in the thread, is that this was a conscious line-walking experiment to demonstrate what’s now possible and spark debate, not a claim that this model of development is unambiguously good.