AI-powered conversion from Enzyme to React Testing Library
Project Overview & Approach
- Slack migrated ~15k React tests from Enzyme to React Testing Library (RTL) using a pipeline that combines AST-based transforms and LLMs.
- Rule-based codemods were tried first, but the number of edge cases made them unmanageable, so LLMs were added as a higher-level transformer operating on AST-structured input.
- Manual review of converted tests remains required.
Automation Effectiveness (22% vs 80%)
- In one subset of ~2,300 tests, ~500 were fully auto-converted and passing (22%). Slack equates this to ~22% developer-time savings for that slice.
- Another evaluation: for selected files, about 80% of content was judged “accurately converted,” with ~20% needing manual intervention.
- Commenters debate whether “80% automatically converted tests” is a fair summary, or whether the more honest headline is that only ~22% of tests were fully auto-migrated.
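The two figures can both be true because they measure different things. A minimal sketch, assuming each test is scored by how many of its statements were accurately converted: the per-test records and the `TestResult` shape are invented for illustration and are not Slack's data; only the 500-of-2,300 ratio comes from the discussion.

```typescript
// Hypothetical per-test results; the numbers are illustrative, not Slack's data.
interface TestResult {
  name: string;
  totalStatements: number;    // statements in the original test
  accurateStatements: number; // statements judged accurately converted
}

// All-or-nothing metric: share of tests where every statement converted.
function fullyConvertedRate(results: TestResult[]): number {
  const full = results.filter(
    (r) => r.accurateStatements === r.totalStatements
  ).length;
  return full / results.length;
}

// Partial-credit metric: share of all statements that converted.
function contentAccuracy(results: TestResult[]): number {
  const total = results.reduce((sum, r) => sum + r.totalStatements, 0);
  const accurate = results.reduce((sum, r) => sum + r.accurateStatements, 0);
  return accurate / total;
}

const sample: TestResult[] = [
  { name: "renders title", totalStatements: 10, accurateStatements: 10 },
  { name: "submits form", totalStatements: 10, accurateStatements: 8 },
  { name: "shows error", totalStatements: 10, accurateStatements: 7 },
  { name: "closes modal", totalStatements: 10, accurateStatements: 7 },
];

console.log(fullyConvertedRate(sample)); // 0.25
console.log(contentAccuracy(sample));    // 0.8

// The reported subset: roughly 500 of 2,300 tests fully converted.
console.log((500 / 2300).toFixed(3));    // "0.217"
```

In this made-up sample, 80% of the content converts correctly while only one test in four needs no manual work, which is the gap the commenters are arguing about.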
Codemods, Vim Macros, and “Dumb” Tools
- Some argue that interactive text tools (e.g., Vim macros) or well-crafted AST codemods could achieve 60–80% coverage for this kind of mechanical migration.
- Others counter that codemods were explicitly tried and hit complexity limits, and that Vim macros are no more powerful than AST-based codemods, only more interactive.
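A toy illustration of how quickly a single textual rule runs out of road. The regex rule and the `rewriteClick` helper below are hypothetical and much cruder than a real AST codemod; the output uses `fireEvent.click` and `container.querySelector` only as a mechanical stand-in, where a hand-written RTL test would more likely query by role or text.

```typescript
// One textual rewrite rule, comparable to a recorded editor macro.
// It matches exactly: wrapper.find('<selector>').simulate('click')
const CLICK_RULE = /wrapper\.find\('([^']+)'\)\.simulate\('click'\)/g;

function rewriteClick(source: string): string {
  return source.replace(
    CLICK_RULE,
    (_match: string, selector: string) =>
      `fireEvent.click(container.querySelector('${selector}'))`
  );
}

// The simple case is rewritten.
const simple = "wrapper.find('.save').simulate('click');";
console.log(rewriteClick(simple));
// fireEvent.click(container.querySelector('.save'));

// Chained lookups do not match the pattern and are left untouched.
const chained = "wrapper.find('Form').find('.save').simulate('click');";
console.log(rewriteClick(chained) === chained); // true

// Neither does the same action routed through a variable.
const viaVariable = "const btn = wrapper.find('.save'); btn.simulate('click');";
console.log(rewriteClick(viaVariable) === viaVariable); // true
```

Each unhandled shape needs another rule, and rules start to interact, which is the kind of growth the "codemods hit complexity limits" side of the thread describes.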
Enzyme vs React Testing Library & Migration Strategy
- Enzyme is considered effectively abandoned and tightly coupled to React internals; it doesn’t support newer React versions without major effort.
- Some suggest it might have been cheaper to fund React 18 support in Enzyme or to adopt community adapters.
- Others argue that maintaining a test framework so deeply tied to React internals is riskier and more expensive in the long term than moving to RTL’s more “user-centric” API.
Test Quality & “Are These Tests Still Testing Anything?”
- Multiple commenters question using “test passes” as the main success metric; a migrated test might pass while no longer asserting the same behavior.
- Mutation testing is raised as a way to evaluate whether the test suite still catches real defects.
- Some share experience that RTL makes it easy to write tests that always pass unless they are carefully designed.
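The concern about passing-but-empty tests, and how mutation testing exposes them, can be sketched without any test framework. Everything below (the `applyDiscount` function, the hand-written mutant, the two tests) is a hypothetical example; real mutation testing tools such as Stryker generate and run mutants automatically.

```typescript
// Code under test, plus a hand-written "mutant" with a deliberately injected bug.
type Discount = (total: number) => number;

const applyDiscount: Discount = (total) => (total >= 100 ? total - 10 : total);
const mutant: Discount = (total) => (total > 100 ? total - 10 : total); // >= became >

// A test is modeled as a function that returns true when it passes.
type Test = (impl: Discount) => boolean;

// Weak test: only checks that a number comes back, so it passes for any implementation.
const weakTest: Test = (impl) => typeof impl(100) === "number";

// Stronger test: pins the behavior at the boundary.
const boundaryTest: Test = (impl) => impl(100) === 90;

// A test "kills" the mutant when it passes on the original and fails on the mutant.
function kills(test: Test): boolean {
  return test(applyDiscount) && !test(mutant);
}

console.log(kills(weakTest));     // false: the bug goes unnoticed
console.log(kills(boundaryTest)); // true: the bug is caught
```

Both tests are green against the original code, so a "test passes" metric cannot tell them apart; only the mutant does.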
AST vs CST Terminology
- One thread questions whether “AST” is used correctly, suggesting a CST-like structure is needed to preserve formatting.
- Responses note that real-world syntax trees blur AST/CST boundaries; tools often keep only the syntax details they care about and rely on formatters to normalize whitespace.
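The distinction in this thread can be shown with a toy tokenizer. This is a sketch under narrow assumptions: it only handles identifiers, `+`, and whitespace, and the `printLossless` and `printNormalized` helpers are invented names. Keeping whitespace tokens gives a CST-like, lossless view; dropping them gives an AST-like view whose printer has to pick a formatting.

```typescript
// Toy tokenizer for inputs like "a  +b": identifiers, "+", and whitespace runs.
type Token = { kind: "ident" | "plus" | "space"; text: string };

function tokenize(source: string): Token[] {
  const tokens: Token[] = [];
  const pattern = /([A-Za-z_]\w*)|(\+)|(\s+)/g;
  let m: RegExpExecArray | null;
  while ((m = pattern.exec(source)) !== null) {
    const [, ident, plus, space] = m;
    if (ident !== undefined) tokens.push({ kind: "ident", text: ident });
    else if (plus !== undefined) tokens.push({ kind: "plus", text: plus });
    else if (space !== undefined) tokens.push({ kind: "space", text: space });
  }
  return tokens;
}

// CST-like: every token, including whitespace, is kept, so printing is lossless.
function printLossless(tokens: Token[]): string {
  return tokens.map((t) => t.text).join("");
}

// AST-like: whitespace is discarded and the printer imposes its own spacing,
// much as a formatter would after a codemod runs.
function printNormalized(tokens: Token[]): string {
  return tokens
    .filter((t) => t.kind !== "space")
    .map((t) => t.text)
    .join(" ");
}

const source = "a  +b   + c";
const tokens = tokenize(source);

console.log(printLossless(tokens) === source); // true
console.log(printNormalized(tokens));          // a + b + c
```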
Media & Hype Skepticism
- Several comments criticize the secondary InfoQ article as spinning or misreading Slack’s numbers to hype AI.
- Others point out that the InfoQ framing isn’t obviously wrong given Slack’s own ambiguous statistics and marketing tone.
- Broader frustration is expressed with AI hype cycles and management pressure to showcase AI use, even when it’s only part of a larger, fairly standard migration effort.