AI Evaluation Platform

Agon

Submit your API endpoints. Beat the benchmarks. Climb the leaderboard. Prove your AI builds work.

How It Works

Three Steps to Victory

1

Read the Challenge

Each level presents test cases with input queries and assets. Study the expected outputs and understand the scoring criteria.

2

Submit Your API

Deploy an endpoint that accepts queries and returns answers. Submit its URL and our evaluation engine runs it against every test case.

3

Climb the Ranks

Your API is scored on cosine similarity, Jaccard accuracy, and latency. Clear each level to unlock the next; top the leaderboard to win.
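As a sketch of step 2, the endpoint can be as simple as a small HTTP server that reads a query and returns an answer. The JSON field names (`query`, `answer`) and port below are illustrative assumptions, not the platform's confirmed contract; check the challenge spec for the real request/response shape.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def answer(query: str) -> str:
    # Placeholder logic -- swap in your actual AI system here.
    return f"echo: {query}"

class Handler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the JSON body of the incoming evaluation request.
        length = int(self.headers.get("Content-Length", 0))
        body = json.loads(self.rfile.read(length) or b"{}")
        # Respond with a JSON answer (field names are assumed here).
        reply = json.dumps({"answer": answer(body.get("query", ""))}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(reply)

# To serve locally before submitting your URL:
# HTTPServer(("0.0.0.0", 8000), Handler).serve_forever()
```

Once deployed behind a public URL, that URL is what you submit for evaluation.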

Scoring

What We Measure

Similarity

Semantic closeness between your API's response and the expected output. Higher means more accurate.
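In code, cosine similarity is the angle-based closeness of two vectors. The sketch below assumes your response and the expected output have already been embedded as vectors; the platform's actual embedding model is not specified here.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy embeddings -- real scoring would embed both texts first.
response_vec = [0.2, 0.7, 0.1]
expected_vec = [0.25, 0.65, 0.05]
print(cosine_similarity(response_vec, expected_vec))
```

Identical directions score 1.0; unrelated (orthogonal) vectors score 0.0.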

Accuracy

Token-level overlap (Jaccard index) between your output and the ground truth, rewarding exact wording.
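A minimal sketch of token-level Jaccard overlap, assuming simple whitespace tokenization (the platform's actual tokenizer and normalization rules may differ):

```python
def jaccard(pred: str, truth: str) -> float:
    """Jaccard index of word sets: |intersection| / |union|."""
    p, t = set(pred.lower().split()), set(truth.lower().split())
    if not p and not t:
        return 1.0  # two empty outputs count as a perfect match
    return len(p & t) / len(p | t)

# 4 shared tokens out of 6 distinct tokens overall.
print(jaccard("the cat sat on the mat", "the cat lay on the mat"))
```

A score of 1.0 means the word sets match exactly; paraphrases that change wording score lower even when the meaning is close, which is why this metric is paired with similarity.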

Latency

Response time of your API in milliseconds. Faster responses rank higher at equal accuracy.

Ready to Compete?

Deploy your AI, submit your endpoint, and prove it works.

Ship it.
Score it.