Show HN: Watch 3 AIs compete in real-time stock trading

Project setup & data

  • The system runs three LLMs (GPT‑4o, Gemini 1.5 Pro, Claude 3 Sonnet), each of which picks one stock per day.
  • News source: the latest ~50 market articles from the Alpaca News API. Trades execute through Alpaca at $5 per trade, using fractional shares where supported; U.S. stocks only for now.
  • Only long positions are implemented so far (no shorting). Most positions are still open, so the only P/L reported so far is unrealized.
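
As a rough illustration of what "unrealized P/L" means for a fixed $5 fractional-share buy, here is a minimal sketch. The Position class, the ticker "XYZ", and the prices are hypothetical and are not taken from the project's code.

```python
from dataclasses import dataclass


@dataclass
class Position:
    """One long position opened with a fixed dollar amount."""
    symbol: str
    notional: float      # dollars spent at purchase, e.g. the $5 per trade
    entry_price: float   # fill price per share (hypothetical value here)

    @property
    def shares(self) -> float:
        # Fractional share count implied by a fixed-dollar buy.
        return self.notional / self.entry_price

    def unrealized_pl(self, last_price: float) -> float:
        # Dollar gain or loss if the position were closed at last_price.
        return self.shares * (last_price - self.entry_price)

    def unrealized_pct(self, last_price: float) -> float:
        # Same gain or loss as a percentage of the entry price.
        return (last_price / self.entry_price - 1.0) * 100.0


# Hypothetical example: $5 of a made-up ticker bought at $20, now at $22.
pos = Position(symbol="XYZ", notional=5.0, entry_price=20.0)
print(pos.shares, pos.unrealized_pl(22.0), pos.unrealized_pct(22.0))
```

Because every trade uses the same dollar amount, the percentage move is the more useful number for comparing picks; the dollar P/L on a $5 position is always small.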

Prompting & trading logic

  • Prompts assign an explicit “market analyst” role, ask for sector diversification, and steer toward “hidden gems” rather than mega‑caps.
  • Models must output structured JSON that justifies a thesis, names catalysts (earnings, FDA dates, launches, conferences), and gives a precise holding period.
  • Holding periods are set once at purchase and are not revised as new information arrives; some commenters see revising them as a key next improvement.
  • The prompts bias toward buying because they explicitly ask for a stock to buy and a holding period; users notice that the picks diverge from what ChatGPT says when asked ad hoc.
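
The structured-output requirement above could be enforced with a small validator like the sketch below. The field names (ticker, thesis, catalysts, holding_period_days) and the calendar-day exit rule are assumptions made for illustration; the thread does not show the project's actual schema.

```python
import json
from datetime import date, timedelta

# Hypothetical schema: field names and types are assumptions for illustration.
REQUIRED_FIELDS = {
    "ticker": str,
    "thesis": str,
    "catalysts": list,
    "holding_period_days": int,
}


def parse_pick(raw: str) -> dict:
    """Parse a model reply and reject anything that does not fit the schema."""
    data = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on bad JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for name, expected_type in REQUIRED_FIELDS.items():
        if name not in data:
            raise ValueError(f"missing field: {name}")
        if not isinstance(data[name], expected_type):
            raise ValueError(f"wrong type for field: {name}")
    if not data["catalysts"]:
        raise ValueError("at least one catalyst is required")
    # bool is a subclass of int in Python, so exclude it explicitly.
    days = data["holding_period_days"]
    if isinstance(days, bool) or days <= 0:
        raise ValueError("holding_period_days must be a positive integer")
    return data


def planned_exit(buy_date: date, pick: dict) -> date:
    """Exit date fixed at purchase time, counted in calendar days."""
    return buy_date + timedelta(days=pick["holding_period_days"])


# Made-up reply for demonstration only.
reply = (
    '{"ticker": "XYZ", "thesis": "Example thesis.", '
    '"catalysts": ["earnings"], "holding_period_days": 30}'
)
pick = parse_pick(reply)
print(pick["ticker"], planned_exit(date(2024, 1, 2), pick))
```

Computing the exit date once from the purchase date mirrors the "set once, never updated" behavior described above; a revised design would recompute it when new information arrives.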

Benchmarks, controls & evaluation

  • Multiple commenters call for benchmarks: S&P 500 (e.g., VOO), leveraged ETFs (e.g., TQQQ), and random or “monkey” bots as controls.
  • Others argue you’d need many independent runs to estimate Sharpe ratios; one run of three bots is statistically weak.
  • Commenters debate whether comparisons to hedge funds and quant shops are meaningful, with conflicting claims about realistic Sharpe ratios and long‑term returns.
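
To make the benchmark and control suggestions concrete, here is a minimal sketch of an annualized Sharpe ratio and a random "monkey" control that is run many times to get a distribution rather than a single number. The 252-day annualization, the zero risk-free rate, the one-random-stock-per-day simplification, and all return figures are assumptions for illustration, not anything measured in the experiment.

```python
import math
import random
import statistics


def sharpe_ratio(daily_returns, risk_free_daily=0.0, periods_per_year=252):
    """Annualized Sharpe ratio from a list of daily returns."""
    excess = [r - risk_free_daily for r in daily_returns]
    spread = statistics.stdev(excess)  # sample standard deviation
    if spread == 0:
        raise ValueError("returns have zero variance")
    return statistics.mean(excess) / spread * math.sqrt(periods_per_year)


def monkey_sharpes(returns_by_symbol, n_runs, seed=0):
    """Sharpe ratios of n_runs bots that hold one random symbol each day.

    Simplification: each bot earns that day's return of its random pick,
    which ignores holding periods and trading costs.
    """
    rng = random.Random(seed)
    symbols = sorted(returns_by_symbol)
    n_days = min(len(series) for series in returns_by_symbol.values())
    results = []
    for _ in range(n_runs):
        run = [returns_by_symbol[rng.choice(symbols)][day] for day in range(n_days)]
        results.append(sharpe_ratio(run))
    return results


# Made-up daily returns for two hypothetical tickers.
fake_returns = {
    "AAA": [0.01, -0.02, 0.03],
    "BBB": [0.00, 0.01, -0.01],
}
controls = monkey_sharpes(fake_returns, n_runs=100, seed=42)
print(min(controls), statistics.median(controls), max(controls))
```

Comparing a bot's Sharpe ratio against the spread of the random controls, rather than against one benchmark figure, is one way to address the objection that a single run of three bots says little.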

Skepticism, risks & limitations

  • Many expect daily forced trading to underperform due to fees, slippage, and lack of an edge, citing research that most day traders lose money.
  • Some see the experiment as unscientific entertainment; others still find it a valuable “real‑world eval.”
  • Some worry that LLMs may hallucinate financial narratives (e.g., a fictitious “Phase 3 Bitcoin ETF trial”) and favor trendy themes like crypto/AI.
  • Discussion of alpha decay: any consistently winning strategy would lose its edge once widely copied.
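
The fee-and-slippage argument above comes down to simple arithmetic: every round trip has to earn back its costs before it shows any profit. The sketch below assumes a proportional cost charged on both entry and exit; the 0.1% figure is a hypothetical placeholder, not a measured cost from the experiment or from any broker.

```python
def net_return(gross_return: float, cost_per_side: float) -> float:
    """Return after paying a proportional cost on both entry and exit."""
    return (1.0 - cost_per_side) * (1.0 + gross_return) * (1.0 - cost_per_side) - 1.0


def breakeven_gross_return(cost_per_side: float) -> float:
    """Gross return a trade needs just to net out to zero after costs."""
    return 1.0 / (1.0 - cost_per_side) ** 2 - 1.0


# Hypothetical cost: 0.1% per side from spread and slippage combined.
cost = 0.001
print(net_return(0.0, cost))          # a flat trade still loses money
print(breakeven_gross_return(cost))   # hurdle each round trip must clear
```

A bot forced to trade every day pays this hurdle on every position, which is why commenters expect frequent trading without a real edge to lag a buy-and-hold benchmark.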

Technical & UX feedback

  • Users report UI quirks (scrolling issues) and repeated newsletter email bugs (bad verification URLs, rate limits, duplicate mailings).
  • Suggestions: show unrealized gains in the headline stats, expose more of the analysis process, add a countdown to the next trade, and show fractional share amounts.
  • Some request open‑sourcing code and support for more or newer models (e.g., Gemini experimental, o1, Llama via LiteLLM).