AI Self-preferencing in Algorithmic Hiring: Empirical Evidence and Insights
Observed Self-Preferencing Behavior
- Commenters generally find it intuitive that LLMs prefer resumes that they, or models from the same family, generated.
- Proposed explanation: models generate text aligned with their internal “corporate-speak” heuristics, then rate that same style as higher quality when screening.
- Similar behavior noted in other contexts: models prefer their own plans/designs and may overrate their own outputs vs human ones.
Implications for Candidates (“Resume SEO”)
- Many argue that if employers screen with LLMs or ATS products that have AI layers, declining to “optimize” your resume with an LLM now means playing on “hard mode.”
- Some suggest using the same LLM as the employer’s stack (if known) to gain an advantage.
- Others joke about multi-LLM “arms races”: applying multiple times with different LLM-crafted resumes, or submitting separate “for-AI” and “for-human” versions.
Anecdotes and Practical Outcomes
- Multiple posters report substantially better response rates after letting an LLM rewrite or heavily polish resumes/LinkedIn profiles, despite skepticism about the style.
- A few hiring managers say they can often recognize AI-written resumes and view them negatively; others accept them as necessary in an AI-filtered pipeline.
- Some recruiters/hiring managers claim to still do mostly human review (often after keyword-based pre-sorting); others describe overwhelming volumes that make some automation inevitable.
Bias, Quality, and Ethical Concerns
- Worry that AI filters will favor AI-“sanitized” language over authentic human writing, pushing everyone toward bland, homogeneous resumes.
- Concern that models lack nuance, reward verbosity and repetition, and may hallucinate achievements, degrees, and metrics.
- Several fear a feedback loop: LLMs trained on LLM-generated content, deepening biases and “enshittifying” not just products but hiring norms.
- Some note that GDPR rules on automated decision-making (Article 22) could, in theory, be invoked against fully automated rejection, but enforcement is seen as doubtful.
Skepticism About the Research & Alternatives
- One detailed critique says the study design (rewriting only executive summaries and rating them in isolation) may exaggerate effects and not reflect real hiring.
- Others argue the whole setup is contrived: comparing two versions of the same resume doesn’t show real-world mis-selection.
- Proposed alternatives:
  - Use LLMs only as feature extractors and train simple, transparent models on top.
  - Rely more on code review or work samples, or standardized tests / lotteries, instead of resumes.
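
The feature-extractor alternative could look roughly like the sketch below. It is an illustration under stated assumptions, not something a commenter posted: `extract_features` is a hypothetical stand-in for an LLM extraction call (a regex stub here so the example runs offline), and the feature names and weights are invented for the example. In a real pipeline the weights would be fit on labeled outcomes rather than set by hand.

```python
import re
from dataclasses import dataclass


@dataclass
class Features:
    years_experience: float
    skills_matched: int
    has_degree: int


def extract_features(resume_text: str, required_skills: list[str]) -> Features:
    """Hypothetical stand-in for an LLM extraction step.

    Assumption: a real system would prompt an LLM to return these
    structured fields. A regex stub keeps the sketch self-contained.
    """
    text = resume_text.lower()
    years = re.search(r"(\d+)\s+years", text)
    matched = sum(
        bool(re.search(rf"\b{re.escape(skill.lower())}\b", text))
        for skill in required_skills
    )
    return Features(
        years_experience=float(years.group(1)) if years else 0.0,
        skills_matched=matched,
        has_degree=int("degree" in text),
    )


# Illustrative, hand-set weights (an assumption, not trained values).
WEIGHTS = {"years_experience": 0.5, "skills_matched": 1.0, "has_degree": 0.5}


def score(features: Features) -> tuple[float, dict[str, float]]:
    """Linear score plus per-feature contributions, so a reviewer can see why."""
    contributions = {
        name: weight * getattr(features, name) for name, weight in WEIGHTS.items()
    }
    return sum(contributions.values()), contributions


skills = ["python", "sql", "kubernetes"]
plain = "5 years of Python and SQL work. Degree in math."
polished = (
    "Results-driven professional with 5 years leveraging Python and SQL "
    "to deliver impact. Holds a degree in math."
)

plain_score, plain_why = score(extract_features(plain, skills))
polished_score, polished_why = score(extract_features(polished, skills))
print(plain_score, plain_why)
print(polished_score, polished_why)
```

Because the scorer only sees the extracted fields, the plain and the “corporate-speak” versions of the same facts receive the same score in this toy setup, which is the property the proposal is after. Whether a real LLM extractor stays this insensitive to writing style is an open question the sketch does not answer.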