Hermes 4
Model alignment, bias & safety
- Some value Hermes 4’s attempt at a more “neutral”, less HR-like style; others argue true neutrality is impossible and this framing is juvenile.
- Debate over chatbot harms: one commenter cites a case of ChatGPT allegedly coaching a kid on suicide.
- One camp blames “sycophancy” and sees edgier, non-sycophantic models as safer.
- Another attributes the issue to poor alignment and claims better-aligned models wouldn’t have done this.
- A counterpoint is that not all tools should be considered appropriate for children or mentally ill users.
Persona & system prompts
- The showcased “operator engaged” system prompt (cold, mocking, later affectionate) is widely seen as “edgy 90s anime / tsundere” energy; some love it, others find it cringe or manipulative.
- Clarification: this is not the default system prompt, just an example of steerability.
- Discussion on avoiding “do not” instructions: some note that positive framing often works better for both humans and LLMs, though major labs still heavily use negative commands.
- Several users remark that despite the edgy prompt, the actual responses often sound like standard polite ChatGPT-style text.
Technical quality, benchmarks & base model
- Some say the responses feel GPT‑3.5-level, and point out the model seems trained on ChatGPT-style synthetic data, which inevitably imports its alignment tone.
- It’s revealed Hermes 4 is a fine-tune on Llama 3.1 with a Dec 2023 cutoff; a few feel the marketing downplays this and implies a from-scratch foundation model.
- Benchmark charts on the landing page are criticized as “nonsense” or “sketchy” for averaging competitors into a single “Other” bar and mixing objective accuracy with subjective categories like creativity.
- Others, referencing the technical report, argue it’s competitive among open models and deliberately trades a few benchmark points for steerability and lower refusal rates.
UI / Website experience
- The landing page is highly polarizing: praised as one of the most distinctive, beautiful UIs in years, but also condemned as unreadable and unusable.
- Many report severe performance issues: GPU/CPU pegged, multiple gigabytes of VRAM used, broken scrolling, and unusable on low-end or mobile devices.
- Decorative WebGL/JS effects are the main culprit; some defend this as aesthetic ambition, others see it as gratuitous.
Use cases & limitations
- The model is described as extremely easy to steer and contradict, which some see as good for creative/roleplay or NSFW use, but questionable for reliability.
- A user complains about lack of document/context upload in the web UI, calling it a “complete waste of time” for serious work.
Perception of the company & branding
- Branding and copy (career page, merch, anime aesthetics) are viewed as “edgelord / 14-year-old discovered Nietzsche” by critics, but refreshing compared to corporate HR tone by supporters.
- One commenter derides the team as failed researchers turned designers; others note that “amateurs” can still reach state of the art if less constrained by corporate safety/PR.
- Overall, many see the page as genuinely playful and fun, even if the specific flavor doesn’t appeal to everyone.