2024-06-19

AI-powered conversion from Enzyme to React Testing Library

Project Overview & Approach

Slack migrated ~15k React tests from Enzyme to React Testing Library (RTL) using a pipeline that combines AST-based transforms and LLMs.
Rule-based codemods were tried first; complexity and edge cases exploded, so LLMs were added as a higher-level transformer on AST-structured input.
Manual review of converted tests remains required.

Automation Effectiveness (22% vs 80%)

In one subset of ~2,300 tests, ~500 were fully auto-converted and passing (about 22%). Slack equates this to ~22% developer-time savings for that slice.
Another evaluation: for selected files, about 80% of content was judged “accurately converted,” with ~20% needing manual intervention.
Commenters debate whether “80% automatically converted tests” is a fair summary, or whether the more honest headline is that only ~22% of tests were fully auto-migrated.
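The gap between the two figures is easier to see with the arithmetic written out. The sketch below uses the approximate counts quoted above for the test-level rate. The second part is a purely hypothetical model, not something Slack published: it assumes a test is made of independent "chunks" that are each converted correctly 80% of the time, and asks how often every chunk in a test comes out right. The function name and chunk counts are illustrative assumptions.

```python
# Test-level rate, using the approximate counts quoted in the discussion.
total_tests = 2300
fully_converted = 500
test_level_rate = fully_converted / total_tests
print(f"fully auto-converted tests: {test_level_rate:.1%}")  # 21.7%


def all_chunks_correct(per_chunk_accuracy: float, chunks_per_test: int) -> float:
    """Hypothetical model: probability that every chunk of a test converts
    correctly, assuming chunks succeed independently (an assumption)."""
    return per_chunk_accuracy ** chunks_per_test


# With 80% content-level accuracy, the share of tests needing no manual
# fix shrinks quickly as tests get longer.
for chunks in (1, 3, 5, 7):
    share = all_chunks_correct(0.8, chunks)
    print(f"{chunks} chunks per test -> {share:.1%} of tests fully correct")
```

Under these assumptions, a high content-level accuracy and a low share of fully converted tests can describe the same run, which is why the choice of headline number matters.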

Codemods, Vim Macros, and “Dumb” Tools

Some argue that interactive text tools (e.g., Vim macros) or well-crafted AST codemods could achieve 60–80% coverage for this kind of mechanical migration.
Others counter that codemods were explicitly tried and hit complexity limits, and Vim macros are not more powerful than AST-based codemods, just more interactive.
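To make the "edge cases exploded" argument concrete, here is a minimal sketch of a single rule-based rewrite. It is a regex rule written in Python, which is deliberately simpler than an AST codemod and is not Slack's tooling. The rule name, the chosen mapping from an Enzyme-style `find('button').simulate('click')` call to RTL's `fireEvent.click(screen.getByRole('button'))`, and the sample input lines are all assumptions made for illustration.

```python
import re

# Hypothetical rule: matches only the simplest form of an Enzyme-style click,
# e.g. wrapper.find('button').simulate('click')
CLICK_RULE = re.compile(
    r"""(\w+)\.find\((['"])button\2\)\.simulate\((['"])click\3\)"""
)


def rewrite_click(line: str) -> str:
    """Rewrite the simple click pattern; leave every other line untouched."""
    return CLICK_RULE.sub("fireEvent.click(screen.getByRole('button'))", line)


simple = "wrapper.find('button').simulate('click');"
# A component selector plus a chained call is not covered by the rule.
edge_case = "wrapper.find(SubmitButton).first().simulate('click');"

print(rewrite_click(simple))
print(rewrite_click(edge_case))  # unchanged: would need another rule
```

Each uncovered variant (component selectors, chained `.first()` or `.at(n)` calls, wrappers passed through helper functions) needs its own rule, which is the kind of growth the counter-argument describes.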

Enzyme vs React Testing Library & Migration Strategy

Enzyme is considered effectively abandoned and tightly coupled to React internals; it doesn’t support newer React versions without major effort.
Some suggest it might have been cheaper to fund React 18 support in Enzyme or to adopt community adapters.
Others argue maintaining such a deep-internals test framework is riskier and more expensive long term than moving to RTL’s more “user-centric” API.

Test Quality & “Are These Tests Still Testing Anything?”

Multiple commenters question using “test passes” as the main success metric; a migrated test might pass while no longer asserting the same behavior.
Mutation testing is raised as a way to evaluate whether the test suite still catches real defects.
Some share experience that RTL makes it easy to write tests that always pass unless the assertions are carefully designed.

AST vs CST Terminology

One thread questions whether “AST” is used correctly, suggesting a CST-like structure is needed to preserve formatting.
Responses note that real-world syntax trees blur AST/CST boundaries; tools often keep only the syntax details they care about and rely on formatters to normalize whitespace.
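A small experiment shows what an abstract syntax tree typically discards. The sketch uses Python's standard `ast` module as a stand-in, since the thread concerns JavaScript tooling and no specific parser is named there; the sample source line is made up for illustration.

```python
import ast

# Source with a comment, redundant parentheses, and irregular spacing.
source = "total = (1 +  2)  # keep this note\n"

tree = ast.parse(source)
regenerated = ast.unparse(tree)  # requires Python 3.9+

print(regenerated)  # total = 1 + 2
```

The comment, the parentheses, and the original spacing are gone after the round trip. A transform that must preserve them needs a tree that records those details (CST-like), or it has to accept the loss and rely on a formatter afterwards, as the responses describe.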

Media & Hype Skepticism

Several comments criticize the secondary InfoQ article as spinning or misreading Slack’s numbers to hype AI.
Others point out that the InfoQ framing isn’t obviously wrong given Slack’s own ambiguous statistics and marketing tone.
Broader frustration is expressed with AI hype cycles and management pressure to showcase AI use, even when it’s only part of a larger, fairly standard migration effort.
