List animals until failure
Potential LLM / cognition benchmark
- Several comments suggest using the game as an LLM benchmark: how many unique animals a model can list without repetition or invalid entries.
- Ideas for harder variants: require no token reuse at all, or enforce patterns (e.g., each token is an anagram of one N steps back) to test planning over long context windows.
- Commenters speculate that “thinking” models might adopt strategies such as working in alphabetical order or calling tools to track past outputs.
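A minimal sketch of how such a benchmark could be scored, assuming the simplest rule set: the run ends at the first repeated or unrecognized entry. The function name `score_listing` and the tiny `VALID` set are hypothetical and do not come from the discussion or the game.

```python
def score_listing(outputs, valid_animals):
    """Count unique valid animals listed before the first repeat or invalid entry."""
    seen = set()
    for raw in outputs:
        name = raw.strip().lower()  # normalize case and surrounding whitespace
        if name not in valid_animals or name in seen:
            break  # assumed rule: the first failure ends the run
        seen.add(name)
    return len(seen)


# Hypothetical reference list; a real benchmark would need a far larger one.
VALID = {"cat", "dog", "emu", "yak"}

print(score_listing(["Cat", "dog", "emu", "dog", "yak"], VALID))  # prints 3
```

A stricter variant (for example, the "no token reuse" idea) would only need a different check inside the loop.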
Implementation and data source
- The game is explicitly non-LLM: basic text parsing plus key–value tables, the main ones being a map keyed by lowercased titles and a taxonomy tree.
- Data ultimately comes from Wikidata, which explains deep coverage (e.g., tardigrades, obscure insects, dinosaurs) and oddities (joke entries like “drop bear”).
- There are extra tables for easter eggs and special responses; some users inspect the hashes in them and work out which strings they correspond to.
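The lookup described above can be sketched roughly as follows. This is an illustration only: the table names `TITLE_TO_ID` and `PARENT`, the identifiers, and the sample rows are all assumptions, not the game's actual data or code.

```python
# Hypothetical stand-ins for the game's tables (the real data comes from Wikidata).
TITLE_TO_ID = {
    "bear": "bear",
    "polar bear": "polar_bear",
    "ursus maritimus": "polar_bear",  # several titles can point at one entry
    "dog": "dog",
}
PARENT = {"polar_bear": "bear", "bear": "mammal", "dog": "mammal"}  # child -> parent


def resolve(text):
    """Map free-form input to an entry id, or None if the title is unknown."""
    key = " ".join(text.lower().split())  # lowercase and collapse whitespace
    return TITLE_TO_ID.get(key)


def ancestors(node):
    """Walk the taxonomy tree upward and return the chain of parents."""
    chain = []
    while node in PARENT:
        node = PARENT[node]
        chain.append(node)
    return chain


print(resolve("  Polar   Bear "))  # prints polar_bear
print(ancestors("polar_bear"))     # prints ['bear', 'mammal']
```

With plain dictionaries, both the title lookup and each step up the tree are constant-time operations, which fits the description of the game as simple parsing over key–value tables.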
Easter eggs and personality
- Numerous special responses delight players: “Are you Australian?” for dingoes, special handling of “human,” jokes for unicorn, haggis, Obama, car, etc.
- Visual touches (background shifts, title color changes, clown and animal emojis) and a playful “JS disabled” message make it feel hand-crafted and personal.
Gameplay strategies and user experience
- Players report a wide range of scores (tens to a few hundred) and strong mental fatigue under the timer.
- Common strategies: alphabet (A–Z), grouping by biome (sea/forest/jungle), taxonomic groups (reptiles, birds, insects), extinct animals, or even using Pokémon as cues.
- Some use it as language practice; others cite mobile input and UI lag as the main difficulties.
Taxonomy, semantics, and inaccuracies
- Heated debates arise over equivalences: chipmunks vs squirrels, pigeon vs dove, frogs vs toads, buffalo vs bison, elk vs deer, dingo vs dog, parrot vs budgie, jellyfish vs Portuguese man o’ war.
- The system often treats general common names as parents of more specific ones, sometimes in ways users find wrong or unintuitive (e.g., “panther,” jellyfish vs siphonophore).
- This leads to semantic arguments about common vs scientific names, what counts as a “vegetable,” and whether colonies of zooids are “one animal.”
Reverse engineering and maximum score
- One commenter fully analyzes the internal dataset and rules: deduplications, “too specific” species, unreachable entries, and parent–child relationships.
- They compute a theoretical maximum score (~322k animals) and show that a custom script with data-structure optimizations can reach it in seconds of compute, though the in-game timer would still run for weeks.
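One way to reason about a maximum score is sketched below. It rests on a loudly stated assumption that the discussion does not confirm: an entry is taken to conflict with its own ancestors and descendants, so the best score is the largest set of entries in which none is an ancestor of another. The function `max_non_overlapping` and the sample tree are hypothetical, and the sketch ignores the deduplication, "too specific," and unreachable-entry rules the commenter analyzed, so it is not their script.

```python
from collections import defaultdict


def max_non_overlapping(parent):
    """Largest set of nodes in a forest with no ancestor/descendant pairs.

    `parent` maps child -> parent. ASSUMPTION: an entry overlaps with its
    ancestors and descendants only; the game's real rules may differ.
    """
    children = defaultdict(list)
    nodes = set(parent)
    for child, par in parent.items():
        children[par].append(child)
        nodes.add(par)
    roots = [n for n in nodes if n not in parent]

    best = {}
    stack = [(root, False) for root in roots]
    while stack:  # iterative post-order walk, safe for very deep trees
        node, expanded = stack.pop()
        kids = children.get(node, [])
        if expanded:
            # Either take this node alone, or the best picks from its subtrees.
            best[node] = max(1, sum(best[k] for k in kids))
        else:
            stack.append((node, True))
            stack.extend((k, False) for k in kids)
    return sum(best[root] for root in roots)


# Hypothetical miniature taxonomy.
TREE = {"polar_bear": "bear", "grizzly": "bear", "bear": "mammal", "dog": "mammal"}
print(max_non_overlapping(TREE))  # prints 3 (polar_bear, grizzly, dog)
```

Under this assumed rule the answer equals the number of leaf entries, and a single linear pass over the tree is enough to compute it.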