AI Setup Score

Rate your AI coding setup from 0 to 100

How good is your AI coding setup? Most developers have no idea. Config files might be missing, skills might reference paths that don't exist, and MCP servers might be misconfigured. Caliber's caliber score command rates your AI setup on a 0-100 scale with a letter grade (A-F), so you know exactly where you stand.

What the score measures

The Caliber score evaluates five categories, each weighted by its impact on AI tool effectiveness:

| Category | Points | What it checks |
| --- | --- | --- |
| Files & Setup | 25 pts | Are config files, skills, and MCPs present? |
| Quality | 25 pts | Is content quality up to par, benchmarked against SkillsBench? |
| Grounding | 20 pts | Do references match real paths in your codebase? |
| Accuracy | 15 pts | Do commands and file paths actually exist? |
| Freshness & Safety | 10 pts | No leaked secrets, permissions set, configs updated recently |

Each category is scored independently, then summed to produce your total score. The letter grade maps directly: A (90-100), B (70-89), C (50-69), D (30-49), F (0-29).
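The grade mapping above is simple enough to reproduce in a few lines. A minimal sketch (the `grade` helper is hypothetical, not part of the Caliber CLI):

```shell
# grade <score>: print the letter grade for a 0-100 Caliber score,
# using the thresholds above: A 90+, B 70+, C 50+, D 30+, F below 30.
grade() {
  s=$1
  if   [ "$s" -ge 90 ]; then echo A
  elif [ "$s" -ge 70 ]; then echo B
  elif [ "$s" -ge 50 ]; then echo C
  elif [ "$s" -ge 30 ]; then echo D
  else echo F
  fi
}

grade 92   # prints A
```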

Before and after Caliber

Most projects start with a poor AI setup score, even if they already use AI coding tools daily. In practice, projects without deliberate setup typically score in the 20-40 range (D-F), then jump to 85-95 (A) after running caliber bootstrap.

The jump from D to A happens because caliber bootstrap doesn't just create files — it validates every reference, benchmarks quality against SkillsBench, and ensures nothing is stale or insecure.

How to score your project

Two commands to go from unscored to optimized:

$ npx @rely-ai/caliber bootstrap

# Generates configs, skills, CLAUDE.md, .cursorrules, AGENTS.md

$ caliber score

# Output: Score 92/100 (A) — Files: 25/25, Quality: 23/25, Grounding: 19/20, Accuracy: 15/15, Safety: 10/10
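For scripting, the numeric score can be pulled out of that output line with standard tools. A minimal sketch, assuming the output format matches the example above:

```shell
# Extract the numeric score from a `caliber score` output line.
line="Score 92/100 (A)"
score=$(printf '%s\n' "$line" | sed -n 's/^Score \([0-9][0-9]*\)\/100.*/\1/p')
echo "$score"   # prints 92
```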

Standardize AI readiness across your team

The score isn't just for individual developers. It gives teams a measurable way to track AI setup quality: run caliber score across repos to compare where each project stands, and enforce a minimum threshold in CI to prevent config drift.

Frequently asked questions

How is the Caliber score calculated?

Caliber scores your AI setup on a 0-100 scale across five categories: Files & Setup (25 pts) checks whether config files, skills, and MCPs are present; Quality (25 pts) benchmarks your configs against SkillsBench; Grounding (20 pts) verifies that references in your configs match real paths in your codebase; Accuracy (15 pts) confirms that commands and file paths actually exist; and Freshness & Safety (10 pts) checks for leaked secrets, correct permissions, and recently updated configs. The total maps to a letter grade from A to F.

What score should my team aim for?

Most projects without deliberate AI setup score between 20 and 40 (D-F range). After running caliber bootstrap, scores typically jump to 85-95 (A range). We recommend a minimum score of 70 (B) for teams that rely on AI coding tools daily. You can run caliber score in CI to enforce a threshold and prevent config drift over time.

Can I run caliber score in CI/CD?

Yes. Run 'npx @rely-ai/caliber score --ci' in your pipeline to get a machine-readable output with the numeric score and letter grade. You can set a minimum threshold (e.g., --min 70) and the command will exit with a non-zero code if the score falls below it. This helps teams enforce AI setup quality as part of their standard checks.
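As a pipeline fragment, the gate described above can look like this (the --ci and --min flags are as documented in this FAQ):

```shell
# CI step: fail the build if the Caliber score drops below 70 (B).
npx @rely-ai/caliber score --ci --min 70
# The command exits non-zero below the threshold, failing the step.
```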

Score your AI setup in 30 seconds.

Get Started