Should We Respect LLMs? A Cross-Lingual Study on the Influence of Prompt Politeness on LLM Performance

Effect of Politeness on LLM Performance

  • Several commenters highlight the paper’s core claim: prompt politeness measurably affects LLM performance, with impolite prompts often yielding worse answers, more refusals, or stronger bias.
  • Extremely respectful language doesn’t always help; “moderate” politeness tends to work best, and the optimum varies by language and model.
  • A common hypothesis: because models are trained on human text, polite prompts may steer them toward training examples where people gave more careful, higher‑quality answers.
  • Some suggest this could be automated: a system could rewrite user prompts into an optimally polite form before sending them to the model (a sketch follows this list).
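
A minimal sketch of that middleware idea, not from the paper: the model first normalizes the user’s tone to “moderately polite,” then answers the rewritten prompt. The OpenAI Python SDK and the rewrite-instruction wording are illustrative assumptions; any chat-completion API would work the same way.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Assumed rewrite instruction; the "moderate politeness" target echoes the
# paper's finding, but the exact wording here is a guess.
REWRITE_INSTRUCTION = (
    "Rewrite the following request in a moderately polite, specific tone. "
    "Do not change its meaning or add requirements. "
    "Return only the rewritten request.\n\nRequest: "
)

def call_model(prompt: str) -> str:
    # One chat-completion round trip; the model choice is arbitrary here.
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

def answer_with_politeness_middleware(user_prompt: str) -> str:
    # Pass 1: normalize the tone without altering the request's substance.
    polite_prompt = call_model(REWRITE_INSTRUCTION + user_prompt).strip()
    # Pass 2: answer the normalized prompt.
    return call_model(polite_prompt)
```

The obvious cost is an extra model call per query; a small local rewriter or a cached polite template could fill the same role more cheaply.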

Anthropomorphism vs. “Just a Tool”

  • One side strongly rejects anthropomorphizing LLMs: they are “word calculators,” not sentient beings, and don’t merit respect in a moral sense.
  • Others counter that anthropomorphism is unavoidable and partly the point: the entire interface is human language, and models actively present as human‑like.
  • There’s debate over whether treating LLMs like people is misleading skeuomorphism or a deliberate, useful UI choice.

User Psychology, Manners, and Habits

  • Many say they remain polite (“please,” “thank you”) not for the model’s sake but to maintain their own habits of courtesy.
  • Concern: getting used to barking orders at LLMs might bleed into how people treat baristas, colleagues, or smart speakers with human voices.
  • Others argue humans can context‑switch just fine (terminal vs. email vs. chat) and that rudeness toward a machine need not generalize.
  • Some frame politeness as self‑discipline or “practicing good manners in private to be well mannered in public.”

Ethics, Rights, and Social Risks

  • A minority worry that over‑politeness contributes to a cultural push to grant AI “human‑like” standing or rights, despite no evidence of consciousness.
  • Others note that if AI ever does become conscious, rights claims will be inevitable, just as views evolved about animals.
  • A few jokingly invoke “future AI judging us” or Roko’s basilisk–style scenarios, while others critique this as Pascal’s wager–style reasoning.

Prompt Style, Structure, and Tactics

  • Multiple commenters report that polite‑but‑firm, specific instructions often yield better, more focused code or text than either harsh abuse or vague brevity.
  • Some find explicit positive feedback (“this part is good, now tweak X”) prevents unnecessary rewrites (sketched after this list).
  • Others say structure and role‑play (e.g., military hierarchy, emotional framing) matter more than raw politeness level for steering behavior.
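
A small sketch of the “keep what works, change only X” feedback tactic above, expressed as a prompt builder; the wording is illustrative, not something the commenters or the paper prescribe.

```python
def revision_prompt(good_parts: list[str], change_request: str) -> str:
    # Name what should stay fixed before asking for the change; this is
    # the "explicit positive feedback" tactic from the thread.
    kept = "\n".join(f"- {part}" for part in good_parts)
    return (
        "Thanks, this is close. Please keep the following exactly as is:\n"
        f"{kept}\n"
        f"Now change only this: {change_request}. "
        "Do not rewrite anything else."
    )

# Example: lock in two aspects of generated code and request one targeted edit.
print(revision_prompt(
    ["the error handling", "the function names"],
    "make the parser accept trailing commas",
))
```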