Autonomous Test Code Generation That Actually Works
Early's Repository Agent generated a year's worth of tests in hours on a real open-source project, and we measured the quality with EQS, the Early Quality Score.
1,876 tests
76% coverage
53%
91%
tests
EQS
29%
91%

Why Test Coverage Isn’t Enough
Results: What the Agent Did on ts-morph
No prompts, no human edits, just one autonomous CI run

EQS: One Number for Real Test Quality
What EQS Measures
Code Coverage
How much of the code do the tests execute
Mutation Score
How well tests catch real faults
Method-Scope Coverage (Early’s metric)
How many public methods have full, direct test coverage
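
The three components above can be sketched as simple ratios. This is a minimal illustration, not Early's implementation: the report shapes (`MethodReport`, `MutationReport`), the `directlyTested` flag, and all function names are hypothetical, and line-level coverage is assumed for simplicity. The source does not say how the three numbers are combined into EQS, so the sketch stops at the components.

```typescript
// Hypothetical input shapes; none of these names come from Early's tooling.
interface MethodReport {
  name: string;
  isPublic: boolean;
  totalLines: number;
  coveredLines: number;    // lines executed by at least one test
  directlyTested: boolean; // assumption: some test calls this method directly
}

interface MutationReport {
  totalMutants: number;  // faults injected into the code
  killedMutants: number; // faults that made at least one test fail
}

// Safe division: an empty denominator counts as 0 rather than NaN.
function ratio(part: number, whole: number): number {
  return whole === 0 ? 0 : part / whole;
}

// Code coverage: how much of the code the tests execute.
function codeCoverage(methods: MethodReport[]): number {
  const total = methods.reduce((sum, m) => sum + m.totalLines, 0);
  const covered = methods.reduce((sum, m) => sum + m.coveredLines, 0);
  return ratio(covered, total);
}

// Mutation score: how many injected faults the tests catch.
function mutationScore(report: MutationReport): number {
  return ratio(report.killedMutants, report.totalMutants);
}

// Method-scope coverage: share of public methods that are both
// fully covered and exercised directly by a test.
function methodScopeCoverage(methods: MethodReport[]): number {
  const publicMethods = methods.filter((m) => m.isPublic);
  const fullyCovered = publicMethods.filter(
    (m) => m.directlyTested && m.coveredLines === m.totalLines
  );
  return ratio(fullyCovered.length, publicMethods.length);
}
```

Keeping the three ratios separate makes the gap visible: a suite can execute most lines while killing few mutants, or while leaving many public methods without a direct test.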


Scan
Analyzes the repo and finds testable code
Plan
Write comprehensive test cases
Generate & Fix
Implement the test code and fix failures
Measure
Report coverage, history, progress
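
The four stages above can be read as a loop with a bounded repair step. The sketch below is only an assumed shape for such a pipeline: the `AgentSteps` interface, `runPipeline`, and the `maxFixAttempts` limit are hypothetical names chosen for illustration and do not describe how Early's agent is built.

```typescript
// Hypothetical types; not taken from Early's agent.
interface TestCase { target: string; source: string; }
interface RunResult { passed: boolean; error?: string; }

interface AgentSteps {
  scan(): string[];                                    // Scan: find testable code
  plan(target: string): string[];                      // Plan: list test cases to write
  generate(target: string, plan: string[]): TestCase;  // Generate: implement the tests
  run(test: TestCase): RunResult;                      // execute the generated test
  fix(test: TestCase, error: string): TestCase;        // Fix: repair a failing test
}

interface PipelineReport {
  kept: TestCase[];    // tests that passed, possibly after fixes
  discarded: string[]; // targets whose tests never passed
  targets: number;     // Measure: total targets processed
}

function runPipeline(steps: AgentSteps, maxFixAttempts = 3): PipelineReport {
  const kept: TestCase[] = [];
  const discarded: string[] = [];

  for (const target of steps.scan()) {
    const plan = steps.plan(target);
    let test = steps.generate(target, plan);
    let result = steps.run(test);

    // Retry the fix step a bounded number of times so one bad target cannot stall the run.
    for (let attempt = 0; !result.passed && attempt < maxFixAttempts; attempt++) {
      test = steps.fix(test, result.error ?? "unknown failure");
      result = steps.run(test);
    }

    if (result.passed) {
      kept.push(test);
    } else {
      discarded.push(target);
    }
  }

  return { kept, discarded, targets: kept.length + discarded.length };
}
```

Only passing tests are kept in this sketch, so the final report reflects tests that actually run green rather than everything that was generated.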
Why This Benchmark Matters
Scale
Scale test code generation faster than code generation
Quality
Shows that the agent's high coverage comes with a high mutation score and a high EQS
Confidence
Gives engineering leaders a data-backed way to trust AI-generated tests.
The Coding Corner
Every PR tested. Every Repository Guarded.
Automate high-quality test code generation, organization-wide











