2025-12-24

Asterisk AI Voice Agent

Perceived Uses & Abuse Potential

Many commenters immediately associate an Asterisk AI voice agent with more spam, scams, and “horrible customer support lines.”
A counter-use is proposed: running it as a honeypot to waste spam callers’ time (e.g., “Lenny”-style scripts).
Some fear businesses will use such systems to justify cutting remaining human support staff.

Caller Experience & Interface Preferences

Strong dislike for current voice-driven IVRs: unreliable in noisy environments, socially awkward to speak commands in public, and inconsistent UX.
Several prefer simple DTMF menus; many systems still accept keypad input even when pushing voice.
One practitioner notes that when IVR trees get complex, callers just mash “0” and demand a human; AI intent capture can be “less bad” than long menus in that scenario.

Legitimate Use Cases vs. “No Value”

People in the industry report large fractions of calls are simple or already self-serviceable (password resets, FAQ-style questions, “did you power cycle it?”, checking amenities) where an AI agent could help.
Concrete positive examples: a dealership service line where an AI instantly books oil changes instead of putting callers on hold, and potential hands-free tools for field staff or drivers.
Skeptics respond that if something is simple enough for AI, it should be a web/app flow instead; many only call when self-service has already failed or can’t be trusted.

Ethics, Deception & Expectations

Debate over background noise/“on brand” ambience: some see it as harmless branding; others call it deceptive, especially when it nudges callers to think it’s a human.
Strong sentiment that phone calls implicitly promise a human; using human-like voices without clear disclosure is framed as a “switcheroo” or scam.
Others push back: users mainly want to express needs in natural language; they don’t necessarily care about human vs machine if problems get solved.

Latency, Turn-Taking & Technical Challenges

Concern about the repo’s stated 2–3s latencies being “rage inducing.” Multiple commenters say state-of-the-art systems can respond in under a second, even ~300ms, and that delays of 2s or more cause hangups.
Practitioners report 500–1000ms as common and acceptable today, with major effort going into interruption handling and turn detection rather than raw speed.
Techniques discussed: streaming partial LLM output, chunking at punctuation, using fast TTS on short fragments, canceling/resyncing when the user interrupts, and blending with “thinking” sounds to mask latency.
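The punctuation-chunking and cancel-on-interrupt ideas above can be sketched roughly as follows. This is a minimal illustration, not code from the repo or from any commenter: the function name `chunk_at_punctuation`, the `min_chars` threshold, and the `cancelled` callback are all hypothetical, and a real agent would hand each fragment to a TTS engine rather than collect it in a list.

```python
import re
from typing import Callable, Iterable, Iterator

# A boundary is whitespace that directly follows clause or sentence punctuation.
_BOUNDARY = re.compile(r"(?<=[.!?;:,])\s+")


def chunk_at_punctuation(
    tokens: Iterable[str],
    min_chars: int = 12,
    cancelled: Callable[[], bool] = lambda: False,
) -> Iterator[str]:
    """Yield speakable fragments from a stream of partial LLM output.

    min_chars avoids sending tiny fragments (e.g. "Sure,") to TTS.
    cancelled is polled before each new token; when it returns True
    (for example, the caller started talking), the rest is dropped.
    """
    buffer = ""
    for token in tokens:
        if cancelled():
            return  # user interrupted: discard whatever is still buffered
        buffer += token
        while True:
            # First boundary that leaves a fragment of at least min_chars.
            match = next(
                (m for m in _BOUNDARY.finditer(buffer) if m.start() >= min_chars),
                None,
            )
            if match is None:
                break
            yield buffer[: match.start()].strip()
            buffer = buffer[match.end():]
    tail = buffer.strip()
    if tail:
        yield tail  # flush the final fragment once the stream ends


if __name__ == "__main__":
    stream = ["Sure, ", "I can ", "book that. ", "What day ", "works for you?"]
    for fragment in chunk_at_punctuation(stream):
        print(fragment)
```

The design choice here is to trade a little prosody for responsiveness: the first fragment can be synthesized while the model is still generating the rest of the reply.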

Stacks, Deployment & VAD

Twilio integration: possible via SIP trunks or Twilio Media Streams over WebSockets.
Pipecat mentioned as an open-source framework with many integrations (STT/LLM/TTS, turn detection model, state machine library); compared against proprietary players like Vapi/Retell/Sierra.
Deployment complexity is a pain point; some prefer Cloudflare Workers + Durable Objects with external STT (AssemblyAI/Deepgram with built-in VAD) and LLM/TTS for low-latency, low-cost scaling.
Discussion touches on where to keep conversation state (e.g., in Durable Objects) and compatibility with OpenAI-style realtime APIs.
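For the Twilio WebSocket route mentioned above, the receiving side mostly amounts to decoding JSON frames. The sketch below assumes the message shape described in Twilio's Media Streams documentation (an `event` field, with audio carried as base64 in `media.payload`, encoded as 8 kHz mu-law); the helper name `extract_mulaw_audio` is hypothetical, and the WebSocket server itself is left out.

```python
import base64
import json
from typing import Optional


def extract_mulaw_audio(message: str) -> Optional[bytes]:
    """Return raw mu-law audio bytes from one WebSocket text frame.

    Non-audio events (such as "connected", "start", or "stop")
    yield None so the caller can handle or ignore them separately.
    """
    frame = json.loads(message)
    if frame.get("event") != "media":
        return None
    # Assumption: the payload is base64-encoded 8 kHz mu-law audio.
    return base64.b64decode(frame["media"]["payload"])
```

The decoded bytes would then be forwarded to whichever STT service the stack uses, typically after any resampling that service requires.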

Asterisk-Specific Notes & Nostalgia

Commenters are pleased to see Asterisk back on HN; default music-on-hold is widely recognized.
One person asks how to correlate CDRs (call detail records) with voicemail recordings for a unified dashboard; others suggest using channel variables, voicemail metadata files, or AGI/ARI logging.
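One way the metadata-file suggestion could work is sketched below. It assumes the voicemail `msgNNNN.txt` file is INI-style with a `[message]` section containing `callerchan` and `origtime` (epoch seconds), as commonly seen in Asterisk installs, and that CDR rows have already been loaded into dicts. The row keys `channel`, `start_epoch`, `end_epoch`, and `uniqueid` are hypothetical names chosen for this example, not a fixed Asterisk schema.

```python
import configparser
from typing import Iterable, Optional


def parse_voicemail_metadata(text: str) -> dict:
    """Parse the INI-style voicemail metadata text into a small dict."""
    # interpolation=None so a literal '%' in a caller ID is not misread.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    section = parser["message"]
    return {
        "callerchan": section.get("callerchan", ""),
        "origtime": int(section.get("origtime", "0")),
    }


def find_cdr_for_voicemail(meta: dict, cdr_rows: Iterable[dict]) -> Optional[dict]:
    """Return the first CDR row on the same channel whose call span
    contains the time the voicemail was recorded, or None."""
    for row in cdr_rows:
        same_channel = row["channel"] == meta["callerchan"]
        in_call = row["start_epoch"] <= meta["origtime"] <= row["end_epoch"]
        if same_channel and in_call:
            return row
    return None
```

Matching on channel plus time window is a heuristic; setting a channel variable in the dialplan and logging it to both places, as other commenters suggest, would give a more reliable join key.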

Repository Style & Trust

Several note the GitHub repo looks heavily AI-generated: emoji-heavy headings, AI-like commit logs, Cursor traces.
This style triggers distrust for some: they assume docs may not be carefully reviewed, making them hesitant to rely on the project.

Overall Sentiment

Split roughly between:
- Enthusiasm from builders and some users who see real operational value in handling simple, high-volume calls.
- Deep skepticism from others who see little benefit to customers, expect more hostile support experiences, and are uncomfortable with human-mimicking automation on the phone.
