AI Setup Score

Rate your AI coding setup from 0 to 100

How good is your AI coding setup? Most developers have no idea. Config files might be missing, skills might reference paths that don't exist, and MCP servers might be misconfigured. Caliber's caliber score command rates your AI setup on a 0-100 scale with a letter grade (A-F), so you know exactly where you stand.

What the score measures

The Caliber score evaluates five categories, each weighted by its impact on AI tool effectiveness:

| Category | Points | What it checks |
| --- | --- | --- |
| Files & Setup | 25 pts | Are config files, skills, and MCPs present? |
| Quality | 25 pts | Is content quality up to par, benchmarked against SkillsBench? |
| Grounding | 20 pts | Do references match real paths in your codebase? |
| Accuracy | 15 pts | Do commands and file paths actually exist? |
| Freshness & Safety | 10 pts | No leaked secrets, permissions set, configs updated recently |

Each category is scored independently, then summed to produce your total score. The letter grade maps directly: A (90-100), B (70-89), C (50-69), D (30-49), F (0-29).
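The grade mapping above is simple enough to reproduce in a few lines. A minimal sketch (the `grade` helper is hypothetical, not part of the Caliber CLI):

```shell
# grade <score>: print the letter grade for a 0-100 Caliber score,
# using the thresholds above: A 90+, B 70+, C 50+, D 30+, F below 30.
grade() {
  s=$1
  if   [ "$s" -ge 90 ]; then echo A
  elif [ "$s" -ge 70 ]; then echo B
  elif [ "$s" -ge 50 ]; then echo C
  elif [ "$s" -ge 30 ]; then echo D
  else echo F
  fi
}

grade 92   # prints A
```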

Before and after Caliber

Most projects start with a poor AI setup score, even if they already use AI coding tools daily. In practice, projects without deliberate setup typically score in the 20-40 range (D-F), then jump to 85-95 (A) after running caliber bootstrap.

The jump from D to A happens because caliber bootstrap doesn't just create files — it validates every reference, benchmarks quality against SkillsBench, and ensures nothing is stale or insecure.

How to score your project

Two commands to go from unscored to optimized:

$ npx @rely-ai/caliber bootstrap

# Generates configs, skills, CLAUDE.md, .cursorrules, AGENTS.md

$ caliber score

# Output: Score 92/100 (A) — Files: 25/25, Quality: 23/25, Grounding: 19/20, Accuracy: 15/15, Safety: 10/10
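For scripting, the numeric score can be pulled out of that output line with standard tools. A minimal sketch, assuming the output format matches the example above:

```shell
# Extract the numeric score from a `caliber score` output line.
line="Score 92/100 (A)"
score=$(printf '%s\n' "$line" | sed -n 's/^Score \([0-9][0-9]*\)\/100.*/\1/p')
echo "$score"   # prints 92
```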

Standardize AI readiness across your team

The score isn't just for individual developers. It gives teams a measurable way to track AI setup quality: run caliber score across repos to compare where each project stands, and enforce a minimum threshold in CI to prevent config drift.

Frequently asked questions

How is the Caliber score calculated?

Caliber scores your AI setup on a 0-100 scale across five categories: Files & Setup (25 pts) checks whether config files, skills, and MCPs are present; Quality (25 pts) benchmarks your configs against SkillsBench; Grounding (20 pts) verifies that references in your configs match real paths in your codebase; Accuracy (15 pts) confirms that commands and file paths actually exist; and Freshness & Safety (10 pts) checks for leaked secrets, correct permissions, and recently updated configs. The total maps to a letter grade from A to F.

What score should my team aim for?

Most projects without deliberate AI setup score between 20 and 40 (D-F range). After running caliber bootstrap, scores typically jump to 85-95 (A range). We recommend a minimum score of 70 (B) for teams that rely on AI coding tools daily. You can run caliber score in CI to enforce a threshold and prevent config drift over time.

Can I run caliber score in CI/CD?

Yes. Run 'npx @rely-ai/caliber score --ci' in your pipeline to get a machine-readable output with the numeric score and letter grade. You can set a minimum threshold (e.g., --min 70) and the command will exit with a non-zero code if the score falls below it. This helps teams enforce AI setup quality as part of their standard checks.
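As a pipeline fragment, the gate described above can look like this (the --ci and --min flags are as documented in this FAQ):

```shell
# CI step: fail the build if the Caliber score drops below 70 (B).
npx @rely-ai/caliber score --ci --min 70
# The command exits non-zero below the threshold, failing the step.
```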

Score your AI setup in 30 seconds.

Get Started