AI Setup Score
Rate your AI coding setup from 0 to 100
How good is your AI coding setup? Most developers have no idea: config files may be missing, skills may reference paths that don't exist, and MCP servers may be misconfigured. Caliber's caliber score command rates your AI setup on a 0-100 scale with a letter grade (A-F), so you know exactly where you stand.
What the score measures
The Caliber score evaluates five categories, each weighted by its impact on AI tool effectiveness:
| Category | Points | What it checks |
|---|---|---|
| Files & Setup | 25 pts | Are config files, skills, and MCPs present? |
| Quality | 25 pts | Benchmarked against SkillsBench for content quality |
| Grounding | 20 pts | Do references match real paths in your codebase? |
| Accuracy | 15 pts | Do commands and file paths actually exist? |
| Freshness & Safety | 10 pts | No leaked secrets, permissions set, configs updated recently |
Each category is scored independently, and the five scores are summed to produce your total. The letter grade maps directly: A (90-100), B (70-89), C (50-69), D (30-49), F (0-29).
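The sum-then-map logic above can be sketched as a small shell function. The thresholds come from the grade table in this section; the function itself is an illustration, not Caliber's implementation:

```shell
# Map a 0-100 total to the letter grade Caliber reports.
# Thresholds match the grade bands documented above.
grade() {
  s=$1
  if   [ "$s" -ge 90 ]; then echo A
  elif [ "$s" -ge 70 ]; then echo B
  elif [ "$s" -ge 50 ]; then echo C
  elif [ "$s" -ge 30 ]; then echo D
  else echo F
  fi
}

grade 92   # A
grade 35   # D
```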
Before and after Caliber
Most projects start with a poor AI setup score, even if they already use AI coding tools daily. Here's what we typically see:
- Before Caliber: scores of 20-40 (D-F range). Missing config files, no skills defined, MCP servers not configured, stale instructions that reference deleted files
- After Caliber bootstrap: scores of 85-95 (A range). All configs generated, skills benchmarked, references grounded against real code, secrets excluded, permissions set
The jump from D to A happens because caliber bootstrap doesn't just create files — it validates every reference, benchmarks quality against SkillsBench, and ensures nothing is stale or insecure.
How to score your project
Two commands to go from unscored to optimized:
```shell
$ npx @rely-ai/caliber bootstrap
# Generates configs, skills, CLAUDE.md, .cursorrules, AGENTS.md

$ caliber score
# Output: Score 92/100 (A) — Files: 25/25, Quality: 23/25, Grounding: 19/20, Accuracy: 15/15, Safety: 10/10
```
Standardize AI readiness across your team
The score isn't just for individual developers. It gives teams a measurable way to track AI setup quality:
- Set a team threshold — require a minimum score (e.g., 70) before merging PRs that modify AI configs
- Run in CI — add caliber score --ci to your pipeline to catch config drift automatically
- Track over time — monitor how your score changes as the project evolves and new developers join
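In a real pipeline the --ci and --min flags handle threshold enforcement directly. As an illustration of the same idea, here is a minimal sketch that parses the human-readable score line (format taken from the example output earlier on this page) and gates on a minimum; the helper name is hypothetical:

```shell
# min_score: pass (exit 0) only if the numeric score in a Caliber score
# line meets the given minimum. $1 = score line, $2 = minimum.
min_score() {
  score=$(printf '%s' "$1" | sed -n 's/.*Score \([0-9][0-9]*\)\/100.*/\1/p')
  [ -n "$score" ] && [ "$score" -ge "$2" ]
}

min_score "Score 92/100 (A)" 70 && echo "gate passed"
```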
Related guides
- AI setup — configure all your AI coding tools with one command
- Claude Code setup — CLAUDE.md, skills, AGENTS.md, MCP servers
- AI skills — reusable task definitions for AI coding agents
Frequently asked questions
How is the Caliber score calculated?
Caliber scores your AI setup on a 0-100 scale across five categories: Files & Setup (25 pts) checks whether config files, skills, and MCPs are present; Quality (25 pts) benchmarks your configs against SkillsBench; Grounding (20 pts) verifies that references in your configs match real paths in your codebase; Accuracy (15 pts) confirms that commands and file paths actually exist; and Freshness & Safety (10 pts) checks for leaked secrets, correct permissions, and recently updated configs. The total maps to a letter grade from A to F.
What score should my team aim for?
Most projects without deliberate AI setup score between 20 and 40 (D-F range). After running caliber bootstrap, scores typically jump to 85-95 (A range). We recommend a minimum score of 70 (B) for teams that rely on AI coding tools daily. You can run caliber score in CI to enforce a threshold and prevent config drift over time.
Can I run caliber score in CI/CD?
Yes. Run 'npx @rely-ai/caliber score --ci' in your pipeline to get a machine-readable output with the numeric score and letter grade. You can set a minimum threshold (e.g., --min 70) and the command will exit with a non-zero code if the score falls below it. This helps teams enforce AI setup quality as part of their standard checks.
Score your AI setup in 30 seconds.
Get Started