Fighting Fire with Fire: Scalable Oral Exams
Cheating, Take‑Home Work, and Motivation for AI Oral Exams
- Many see the core problem as take‑home work becoming trivial to complete with LLMs; thoughtful submissions often don’t reflect a student’s own understanding.
- Some hiring anecdotes mirror this: candidates submit polished take‑home work they can’t later explain.
- Supporters of the experiment frame AI‑run oral exams as a way to (a) tie assessment to each student’s project, and (b) force real‑time reasoning that’s harder to outsource to an LLM/friend.
Student Experience, Stress, and “Dehumanization”
- Commenters highlight that most students in the article preferred written exams and found the AI oral exam much more stressful.
- Many call the experience dehumanizing or disrespectful, especially given high tuition: paying six figures to be interrogated by a synthetic voice feels like an abdication of the professor's role.
- Others note that oral exams are inherently stressful but argue that such pressure mirrors real‑world expectations; several commenters from countries with longstanding oral‑exam traditions report both benefits and harms, especially for anxious or less extroverted students.
Validity, Fairness, and Technical Concerns
- Several worry that LLMs are non‑deterministic “black boxes” whose converging scores may be precise but not necessarily accurate or unbiased.
- There’s skepticism that LLM‑driven questioning truly assesses understanding, especially when students can potentially route answers through their own AI (voice, teleprompters, hidden devices).
- Some are concerned about bias against certain speech patterns, IRB/ethics oversight, and the lack of robust validation of grading quality beyond LLM self‑agreement.
Scalability vs. Human Teaching
- One camp argues oral exams scale fine with TAs and reasonable staffing; the barrier is institutional priorities (admin, sports, amenities) rather than feasibility.
- Others, especially those in high‑load teaching environments or online programs, say hand‑graded work and human‑run oral assessments don’t scale with current enrollment and workloads; for them, AI is a survival tool.
Alternative Approaches
- Revert to in‑person, invigilated written exams (often handwritten) and accept that as the “AI‑proof” baseline.
- Use oral exams, but with human examiners, at least for a subset (e.g., high grades or project defenses).
- Allow AI freely and curve grades so “LLM‑level” work is the floor; evaluate added value on top.
- Focus on culture and enforcement: treat AI plagiarism as serious cheating with real penalties, rather than redesigning assessment around it.
Larger Reflections on Education
- Some see the entire arms race (students using AI, teachers countering with AI) as emblematic of universities drifting toward credential vending and “customer” mentality.
- Others are cautiously optimistic about AI as a personalized teaching tool, but view using it as a high‑stakes examiner as premature and misaligned with educational goals.