Hypnex Bench

Public eval leaderboard for the Morpheus AI inference network

A reproducible eval suite that runs each LLM on the public Morpheus inference API (api.mor.org/api/v1) against a fixed probe set covering coding, math, and JSON adherence. MIT-licensed, with a full audit trail in latest.json.
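As a rough illustration of what one probe call looks like, the sketch below builds a request against api.mor.org/api/v1. It assumes an OpenAI-style chat-completions endpoint and uses a made-up model name; check the Morpheus API docs for the exact schema before relying on it.

```python
import json

API_BASE = "https://api.mor.org/api/v1"  # public Morpheus inference API

def build_probe_request(model: str, prompt: str, api_key: str) -> dict:
    """Build an HTTP request spec for a single probe.

    Assumes an OpenAI-compatible chat-completions route; the payload
    shape is an illustration, not the hypnex-bench source.
    """
    return {
        "url": f"{API_BASE}/chat/completions",
        "headers": {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        "body": json.dumps({
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
            "temperature": 0,  # pin sampling for reproducibility
        }),
    }

# "example-model" is a placeholder, not a real Morpheus model id
req = build_probe_request("example-model", "Return only valid JSON.", "mor_xxx")
```

Sending the request (with any HTTP client) then just means POSTing `body` to `url` with `headers`.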

Models: 0 · Probes run: 19 · Generated: pending

Leaderboard columns: # · Model · Pass · Coding · Math · JSON · p50 · p95 · Tokens
First canonical bench run pending

This page will populate after the first hypnex-bench run against api.mor.org. Run it yourself with pip install hypnex-bench && hypnex-bench run.

Run it yourself

pip install hypnex-bench
HYPNEX_API_KEY=mor_xxx hypnex-bench run
hypnex-bench leaderboard

A full run is 19 probes per model (6 + 8 + 5), roughly $0.20 of MOR for the live LLM set. Get an API key at app.mor.org.

What's measured

  • Coding — 6 HumanEval-style probes; we execute the model's Python and assert correctness.
  • Math — 8 GSM8K-style word problems; deterministic numeric extraction.
  • JSON — 5 strict-schema probes; parseability + key/value match.
  • Latency — p50 / p95 wall-clock, including network.
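The four checks above can be sketched roughly as follows. This is a minimal illustration, not the hypnex-bench source; all function names are made up.

```python
import json
import re
import statistics

def score_coding(model_code: str, tests: str) -> bool:
    """Exec the model's Python, then run assert-based tests in the same namespace."""
    ns: dict = {}
    try:
        exec(model_code, ns)  # run the candidate solution
        exec(tests, ns)       # asserts raise on failure
        return True
    except Exception:
        return False

def score_math(answer_text: str, expected: float) -> bool:
    """Deterministic numeric extraction: take the last number in the reply."""
    nums = re.findall(r"-?\d+(?:\.\d+)?", answer_text.replace(",", ""))
    return bool(nums) and float(nums[-1]) == expected

def score_json(answer_text: str, expected: dict) -> bool:
    """Parseability plus key/value match against the expected object."""
    try:
        got = json.loads(answer_text)
    except json.JSONDecodeError:
        return False
    return all(got.get(k) == v for k, v in expected.items())

def latency_stats(samples_ms: list[float]) -> tuple[float, float]:
    """p50 / p95 wall-clock latency from per-probe timings (ms)."""
    qs = statistics.quantiles(samples_ms, n=100)
    return qs[49], qs[94]  # 50th and 95th percentile cut points
```

Verifiable scoring is the point: every probe either passes a hard check (asserts, exact number, schema match) or fails, with no judge model in the loop.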

Hypnex is community-built and not affiliated with the Morpheus AI Foundation. Probe sets are intentionally small and verifiable; for canonical claims, swap in the official suites. The runner architecture is the same.