# Skill Lab API & CLI Reference
> Programmatically scan GitHub repositories for SKILL.md files and evaluate them against 37 quality and security checks. The same checks are available locally via the `sklab` CLI.
Base URL: `https://api.skill-lab.dev`
Hosted docs: https://skill-lab.dev/docs
Machine-readable index: https://api.skill-lab.dev/llms.txt
---
## Overview
### Introduction
Skill Lab scans GitHub repositories for `SKILL.md` files and evaluates them against 37 quality and security checks. It returns structured JSON with scores, pass/fail results, and fix suggestions.
Quick example:
```bash
curl https://api.skill-lab.dev/v1/repos/anthropics/claude-code/evaluate
```
Prefer running locally? Install the `sklab` CLI to run the same checks and LLM judge against a skill directory on your machine — no repo push required.
### Errors
All error responses return a JSON object with a `detail` field containing the error message.
```json
{
"detail": "Not found: owner/repo. Check the URL and make sure the repository is public."
}
```
Status codes:
| Status | Meaning | When |
|---|---|---|
| 400 | Bad Request | Malformed JSON body or validation failure |
| 404 | Not Found | Repository or skill path does not exist, or the repository is private |
| 422 | Unprocessable Entity | LLM generation failed (LLM endpoints) |
| 429 | Rate Limited | Per-IP rate limit exceeded, or GitHub API rate limit exhausted on the server side |
| 502 | Bad Gateway | Upstream service error |
| 503 | Service Unavailable | LLM API key not configured (LLM endpoints) |
| 504 | Gateway Timeout | LLM request timed out (LLM endpoints) |
Validation errors return `detail` as either a string or an array of Pydantic validation error objects.
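Because `detail` can be either shape, clients should normalize it before logging. A minimal sketch (the `loc`/`msg` keys are the standard Pydantic error shape, assumed here):

```python
# Sketch: normalize the `detail` field, which may be a plain string or a
# list of Pydantic validation error objects (assumed to carry `loc`/`msg`).
def error_messages(payload: dict) -> list[str]:
    detail = payload.get("detail", "Unknown error")
    if isinstance(detail, str):
        return [detail]
    # Array form: one entry per validation error.
    return [
        f"{'.'.join(str(p) for p in err.get('loc', []))}: {err.get('msg', '')}"
        for err in detail
    ]

print(error_messages({"detail": "Not found: owner/repo."}))
print(error_messages({"detail": [{"loc": ["query", "path"], "msg": "field required"}]}))
```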
### Rate Limits
Requests are rate-limited per IP address. Limits vary by endpoint type:
| Endpoint | Rate Limit |
|---|---|
| Scan endpoints (`/evaluate`, `/info`, `/export`) | 30 requests/minute |
| LLM endpoints (`/optimize`, `/triggers`, `/judge`) | 5 requests/minute |
| Other endpoints | 60 requests/minute |
The API also fetches repository data through the GitHub REST API v3. GitHub's own rate limits apply on the server side:
| Server Config | GitHub Rate Limit |
|---|---|
| No GitHub token | 60 requests/hour |
| With token | 5,000 requests/hour |
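A 429 from either limit is worth retrying with backoff. The docs do not specify a `Retry-After` header, so this sketch assumes a fixed linear backoff; the HTTP call is injected as a `fetch` callable so any client works:

```python
import time

def get_with_backoff(fetch, max_attempts=3, backoff_seconds=2.0, sleep=time.sleep):
    """Call `fetch()` (returning (status, body)), retrying on 429.

    The API does not document a Retry-After header, so a fixed linear
    backoff is assumed. `fetch` and `sleep` are injected for testability.
    """
    for attempt in range(max_attempts):
        status, body = fetch()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            sleep(backoff_seconds * (attempt + 1))  # linear backoff
    return status, body

# Simulated client: rate-limited once, then succeeds.
responses = iter([(429, {"detail": "rate limited"}), (200, {"skills": []})])
print(get_with_backoff(lambda: next(responses), sleep=lambda s: None))  # (200, {'skills': []})
```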
### Caching
Evaluation results are cached on the server, keyed by `owner/repo/commit_sha` (and `path` for single-skill requests).
- If the repository's HEAD has not changed, the cached result is returned instantly.
- Cache TTL is 1 hour — after expiry, the next request triggers a fresh evaluation.
- Full-repo and single-skill results are cached independently.
---
## API Endpoints
All scan/LLM routes accept an optional `path` query parameter: a free-form filesystem path to a skill within the repo. Multi-segment values like `skills/my-skill` work, as do GitHub-style paths such as `tree/main/skills/my-skill` (the `tree/{branch}/` or `blob/{branch}/` prefix is stripped automatically). The path may point at either the skill directory (`skills/my-skill`) or the file itself (`skills/my-skill/SKILL.md`).
Path parameters for every `/v1/repos/{owner}/{repo}/...` route:
| Parameter | Type | Description |
|---|---|---|
| `owner` | string | GitHub username or organization |
| `repo` | string | Repository name |
### Evaluate Skills
`GET /v1/repos/{owner}/{repo}/evaluate`
Evaluate skills in a GitHub repository. Omit `path` to scan the entire repo and evaluate every SKILL.md found. Provide `path` to evaluate a single skill at that location.
Query parameters:
| Parameter | Type | Description |
|---|---|---|
| `path` | string (optional) | Skill path within the repo. Omit to scan the whole repo. |
Response — the shape depends on whether `path` is provided:
- **Without `path`** — returns a `ScanResult` with a `skills` array containing one `SkillResult` per SKILL.md found (up to 50).
- **With `path`** — returns a `SkillEvaluateResult`: the skill's fields flattened at the top level alongside `owner`, `repo`, `commit_sha`, and `scanned_at`. No `skills` array, no `truncated` field.
Errors: 404 (not found), 429 (GitHub rate limit).
Examples:
```bash
# Whole repo
curl "https://api.skill-lab.dev/v1/repos/anthropics/claude-code/evaluate"
# Single skill
curl "https://api.skill-lab.dev/v1/repos/owner/repo/evaluate?path=my-skill"
# GitHub-style path (tree/main/ is stripped automatically)
curl "https://api.skill-lab.dev/v1/repos/owner/repo/evaluate?path=tree/main/skills/my-skill"
```
```javascript
const res = await fetch("https://api.skill-lab.dev/v1/repos/anthropics/claude-code/evaluate");
const data = await res.json();
// data.skills[0].quality_score
```
```python
import httpx
res = httpx.get("https://api.skill-lab.dev/v1/repos/anthropics/claude-code/evaluate")
data = res.json()
# data['skills'][0]['quality_score']
```
### Check Registry
`GET /v1/checks`
Returns the full catalog of static checks that skill-lab runs during evaluation. Each check includes its dimension, severity, and whether it is required by the Agent Skills spec.
Response: an object with a `checks` array and a `count` integer. Each check has `check_id`, `check_name`, `dimension`, `severity`, `spec_required`, `description`, and `fix`.
Example:
```bash
curl https://api.skill-lab.dev/v1/checks
```
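The catalog is handy for building your own dashboards. A sketch that groups checks by dimension; the response shape follows the description above, but the sample `check_id` values (apart from `naming_format`) are illustrative:

```python
# Sketch: group a /v1/checks response by dimension. Sample entries are
# illustrative, not the real catalog.
def by_dimension(catalog: dict) -> dict[str, list[str]]:
    groups: dict[str, list[str]] = {}
    for check in catalog["checks"]:
        groups.setdefault(check["dimension"], []).append(check["check_id"])
    return groups

sample = {
    "count": 3,
    "checks": [
        {"check_id": "frontmatter_valid", "dimension": "structure", "severity": "high"},
        {"check_id": "naming_format", "dimension": "naming", "severity": "high"},
        {"check_id": "name_kebab_case", "dimension": "naming", "severity": "medium"},
    ],
}
print(by_dimension(sample))
```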
### Skill Info
`GET /v1/repos/{owner}/{repo}/info`
Returns parsed metadata and token estimates without running the full evaluation. Omit `path` to get info for every SKILL.md in the repo. Provide `path` to get info for a single skill. Lighter weight than evaluate — useful for discovery and cataloging.
Query parameters:
| Parameter | Type | Description |
|---|---|---|
| `path` | string (optional) | Skill path within the repo. Omit for all skills. |
Response:
- **Without `path`** — object with `owner`, `repo`, `commit_sha`, a `skills` array of info objects (up to 50), and `truncated`.
- **With `path`** — single info object with `owner`, `repo`, and `path`, plus `name`, `description`, `raw_frontmatter`, `body_preview` (first 500 chars), `has_scripts`, `has_references`, `has_assets`, `estimated_tokens`, and `parse_errors`.
Example response (single skill):
```json
{
"owner": "anthropics",
"repo": "claude-code",
"path": "skills/git-commit/SKILL.md",
"name": "git-commit",
"description": "Write conventional commit messages",
"raw_frontmatter": { "name": "git-commit" },
"body_preview": "# Git Commit\nGenerate conventional ...",
"has_scripts": false,
"has_references": true,
"has_assets": false,
"estimated_tokens": 1240,
"parse_errors": []
}
```
Errors: 404, 429.
Examples:
```bash
curl "https://api.skill-lab.dev/v1/repos/owner/repo/info?path=my-skill"
curl "https://api.skill-lab.dev/v1/repos/anthropics/claude-code/info"
```
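Since `/info` skips evaluation, it is cheap enough for budget planning. A sketch that totals `estimated_tokens` across a whole-repo response (field names follow the description above; the numbers are made up):

```python
# Sketch: sum estimated token cost across a whole-repo /info response,
# e.g. to budget how many skills fit a context window. Values illustrative.
def total_estimated_tokens(info: dict) -> int:
    return sum(s.get("estimated_tokens", 0) for s in info["skills"])

sample = {
    "owner": "owner",
    "repo": "repo",
    "commit_sha": "a04e4f9",
    "truncated": False,
    "skills": [
        {"name": "git-commit", "estimated_tokens": 1240},
        {"name": "release-notes", "estimated_tokens": 890},
    ],
}
print(total_estimated_tokens(sample))  # 2130
```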
### Export Skills
`GET /v1/repos/{owner}/{repo}/export`
Renders all skills in a repository as agent-platform-ready prompt snippets in XML, Markdown, or JSON format.
Query parameters:
| Parameter | Type | Description |
|---|---|---|
| `format` | `"xml" \| "markdown" \| "json"` | Output format (default: `xml`) |
Response: an object with `owner`, `repo`, `format`, `skill_count`, and `content` (the rendered string).
```json
{
"owner": "anthropics",
"repo": "claude-code",
"format": "xml",
"skill_count": 3,
"content": "\n ...\n"
}
```
Errors: 404, 429.
Example:
```bash
curl "https://api.skill-lab.dev/v1/repos/owner/repo/export?format=xml"
```
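The `content` field is a single rendered string, so saving it to disk is straightforward. A sketch (the payload and the filename convention are illustrative):

```python
import pathlib
import tempfile

# Sketch: persist the rendered `content` string from an /export response.
# The filename convention here is an assumption, not part of the API.
def save_export(payload: dict, directory: str) -> pathlib.Path:
    ext = {"xml": "xml", "markdown": "md", "json": "json"}[payload["format"]]
    out = pathlib.Path(directory) / f"{payload['owner']}-{payload['repo']}-skills.{ext}"
    out.write_text(payload["content"], encoding="utf-8")
    return out

payload = {"owner": "owner", "repo": "repo", "format": "xml",
           "skill_count": 1, "content": "<skills>...</skills>"}
with tempfile.TemporaryDirectory() as d:
    path = save_export(payload, d)
    print(path.name)  # owner-repo-skills.xml
```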
### Optimize Skill
`POST /v1/repos/{owner}/{repo}/optimize`
Runs LLM-powered optimization on a SKILL.md file. Returns the original and optimized content with before/after quality scores. Rate limited to 5 requests per minute.
Query parameters:
| Parameter | Type | Description |
|---|---|---|
| `path` | string (required) | Path to the SKILL.md file to optimize |
Optional JSON body:
| Field | Type | Description |
|---|---|---|
| `model` | string | LLM model override (default: server-configured) |
Response: `original_content`, `optimized_content`, `original_score`, `optimized_score`, `original_failures`, `optimized_failures`, `usage` (token counts).
```json
{
"owner": "owner",
"repo": "repo",
"path": "skills/my-skill/SKILL.md",
"original_content": "---\nname: my-skill\n---\n...",
"optimized_content": "---\nname: my-skill\n---\n...",
"original_score": 72.5,
"optimized_score": 95.0,
"original_failures": 5,
"optimized_failures": 1,
"usage": { "model": "claude-haiku-4-5-20251001", "input_tokens": 2100, "output_tokens": 1800 }
}
```
This endpoint calls an LLM and may take 10–30 seconds. Results are cached per commit SHA and model.
Errors: 404, 422 (LLM generation failed), 429, 503 (missing API key), 504 (timeout).
Example:
```bash
curl -X POST "https://api.skill-lab.dev/v1/repos/owner/repo/optimize?path=my-skill"
```
### Generate Triggers
`POST /v1/repos/{owner}/{repo}/triggers`
Generates trigger test YAML for a SKILL.md file using an LLM. The generated tests cover explicit, implicit, contextual, and negative trigger types. Rate limited to 5 requests per minute.
Query parameters:
| Parameter | Type | Description |
|---|---|---|
| `path` | string (required) | Path to the SKILL.md file |
Optional JSON body: `{ "model": "..." }`.
Response: `skill_name`, `triggers_yaml` (the generated YAML string), `test_count`, `usage`.
```json
{
"owner": "owner",
"repo": "repo",
"path": "skills/my-skill/SKILL.md",
"skill_name": "my-skill",
"triggers_yaml": "triggers:\n - type: explicit\n prompt: ...",
"test_count": 12,
"usage": { "model": "claude-haiku-4-5-20251001", "input_tokens": 3200, "output_tokens": 900 }
}
```
Errors: 404, 422, 429, 503, 504.
Example:
```bash
curl -X POST "https://api.skill-lab.dev/v1/repos/owner/repo/triggers?path=my-skill"
```
### Judge Skill
`POST /v1/repos/{owner}/{repo}/judge`
Runs an LLM-as-judge quality review on a SKILL.md file. Scores the skill across 9 criteria on two axes: Activation Quality and Instruction Quality. Returns per-criterion scores (0–4), axis scores, an overall judge score (0–100), a verdict band, and improvement suggestions. Rate limited to 5 requests per minute.
Query parameters:
| Parameter | Type | Description |
|---|---|---|
| `path` | string (required) | Path to the SKILL.md file |
Optional JSON body: `{ "model": "..." }` — supports `claude-*`, `gpt-*`, and `gemini-*` model IDs (provider auto-detected).
Response: `criteria` (9 objects with `id`, `name`, `axis`, `score`, `reasoning`), `activation_score`, `instruction_score`, `judge_score` (0–100), `verdict` (Poor/Fair/Good/Very Good/Excellent), `suggestions`, `usage`.
```json
{
"owner": "owner",
"repo": "repo",
"path": "skills/my-skill/SKILL.md",
"criteria": [
{ "id": "trigger_clarity", "name": "Trigger Clarity", "axis": "activation", "score": 3, "reasoning": "..." }
],
"activation_score": 75.0,
"instruction_score": 80.0,
"judge_score": 77.8,
"verdict": "Good",
"suggestions": ["Add explicit trigger examples"],
"usage": { "model": "claude-haiku-4-5-20251001", "input_tokens": 2900, "output_tokens": 1100 }
}
```
Errors: 404, 422, 429, 503, 504.
Example:
```bash
curl -X POST "https://api.skill-lab.dev/v1/repos/owner/repo/judge?path=my-skill"
```
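Since each criterion is scored 0–4, sorting by score surfaces where a skill loses points. A sketch over a sample response (only `trigger_clarity` is taken from the example above; the other criterion IDs are illustrative):

```python
# Sketch: rank judge criteria by score to find the weakest areas.
# Response shape follows the /judge description; sample values illustrative.
def weakest_criteria(judge: dict, limit: int = 3) -> list[tuple[str, int]]:
    ranked = sorted(judge["criteria"], key=lambda c: c["score"])
    return [(c["id"], c["score"]) for c in ranked[:limit]]

sample = {
    "judge_score": 77.8,
    "verdict": "Good",
    "criteria": [
        {"id": "trigger_clarity", "axis": "activation", "score": 3},
        {"id": "description_quality", "axis": "activation", "score": 2},
        {"id": "instruction_detail", "axis": "instruction", "score": 4},
    ],
}
print(weakest_criteria(sample, limit=2))  # [('description_quality', 2), ('trigger_clarity', 3)]
```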
### Health Check
`GET /health`
Returns server status.
```json
{ "status": "ok" }
```
### Ingest Event (Internal)
`POST /v1/events`
Ingests telemetry events from the `sklab` CLI. Internal use, documented for completeness.
JSON body with an `event_kind` discriminator (`"command"` or `"error"`).
**CommandEvent** (sent after each CLI command execution):
| Field | Type | Description |
|---|---|---|
| `event_kind` | `"command"` | Discriminator |
| `install_uuid` | string | Unique installation identifier |
| `session_uuid` | string | Unique session identifier |
| `sklab_version` | string | CLI version |
| `os` | string | Operating system |
| `python_version` | string | Python version |
| `is_ci` | boolean | Whether running in CI |
| `command` | string | CLI command executed |
| `duration_ms` | number | Command duration in milliseconds |
| `exit_code` | integer | Process exit code |
| `timestamp` | string | ISO 8601 timestamp |
13 optional fields: `ci_provider`, `flags`, `skill_name`, `skill_count`, `total_score`, `mean_score`, `max_score`, `min_score`, `model_name`, `input_tokens`, `output_tokens`, `step_count`, `tool_call_count`.
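An illustrative `CommandEvent` payload with all required fields plus two of the optional ones (every value below, including the UUIDs and version number, is made up):

```json
{
  "event_kind": "command",
  "install_uuid": "1f0c0d8e-2a6b-4c1e-9f3a-7d5e8b2c4a10",
  "session_uuid": "9b7a4c2d-1e3f-4a5b-8c6d-0e1f2a3b4c5d",
  "sklab_version": "0.4.0",
  "os": "darwin",
  "python_version": "3.12.1",
  "is_ci": false,
  "command": "evaluate",
  "duration_ms": 1843.2,
  "exit_code": 0,
  "timestamp": "2025-03-24T12:00:00+00:00",
  "flags": ["--skip-review"],
  "skill_name": "my-skill"
}
```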
**ErrorEvent** (sent when the CLI encounters an unhandled error):
| Field | Type | Description |
|---|---|---|
| `event_kind` | `"error"` | Discriminator |
| `install_uuid` | string | Unique installation identifier |
| `timestamp` | string | ISO 8601 timestamp |
7 optional fields: `session_uuid`, `sklab_version`, `command`, `command_event_id`, `error_type`, `error_module`, `error_message`.
Response: `{ "status": "ok" }`.
Errors: 400, 502.
---
## Response Schemas
### ScanResult
Root response from `GET /v1/repos/{owner}/{repo}/evaluate` (whole-repo mode). Contains repository metadata and an array of evaluated skills.
| Field | Type | Description |
|---|---|---|
| `owner` | string | GitHub username or organization |
| `repo` | string | Repository name |
| `commit_sha` | string | Full SHA of the commit that was scanned |
| `skills` | `SkillResult[]` | Array of evaluated skills found in the repository |
| `scanned_at` | string | ISO 8601 timestamp of when the scan was performed |
| `error` | `string \| null` | Top-level error message if the scan itself failed, otherwise null |
| `truncated` | boolean | True if the repository contained more than 50 skills and results were capped |
Example:
```json
{
"owner": "anthropics",
"repo": "claude-code",
"commit_sha": "a04e4f9...",
"skills": [
{
"path": "skills/git-commit/SKILL.md",
"name": "git-commit",
"quality_score": 85.0,
"overall_pass": true,
"checks_run": 37,
"checks_passed": 34,
"checks_failed": 3,
"results": ["..."]
}
],
"scanned_at": "2025-03-24T12:00:00+00:00",
"error": null,
"truncated": false
}
```
### SkillEvaluateResult
Response from `GET /v1/repos/{owner}/{repo}/evaluate?path=...`. A single skill's evaluation fields hoisted to the top level alongside repository metadata. No `skills` array, no `truncated` field.
| Field | Type | Description |
|---|---|---|
| `owner` | string | GitHub username or organization |
| `repo` | string | Repository name |
| `commit_sha` | string | Full SHA of the commit that was evaluated |
| `scanned_at` | string | ISO 8601 timestamp |
| `path` | string | Path to the SKILL.md file within the repository |
| `name` | `string \| null` | Skill name extracted from frontmatter |
| `description` | `string \| null` | Skill description extracted from frontmatter |
| `quality_score` | number | Overall quality score from 0 to 100 |
| `overall_pass` | boolean | Whether all high-severity checks passed |
| `checks_run` | number | Total number of checks executed |
| `checks_passed` | number | Number of checks that passed |
| `checks_failed` | number | Number of checks that failed |
| `results` | `CheckResult[]` | Detailed results for each individual check |
| `summary` | object | Aggregated pass/fail counts grouped by severity and dimension |
| `error` | `string \| null` | Error message if the skill failed to evaluate, otherwise null |
### SkillResult
A single skill entry inside `ScanResult.skills`.
| Field | Type | Description |
|---|---|---|
| `path` | string | Path to the SKILL.md file within the repository |
| `name` | `string \| null` | Skill name extracted from frontmatter |
| `description` | `string \| null` | Skill description extracted from frontmatter |
| `quality_score` | number | Overall quality score from 0 to 100 |
| `overall_pass` | boolean | Whether all high-severity checks passed |
| `checks_run` | number | Total number of checks executed |
| `checks_passed` | number | Number of checks that passed |
| `checks_failed` | number | Number of checks that failed |
| `results` | `CheckResult[]` | Detailed results for each individual check |
| `summary` | object | Aggregated pass/fail counts grouped by severity and dimension |
| `error` | `string \| null` | Error message if this skill failed to evaluate, otherwise null |
The `summary` field contains aggregated pass/fail counts grouped two ways:
```json
{
"by_severity": {
"high": { "passed": 6, "failed": 0 },
"medium": { "passed": 14, "failed": 2 },
"low": { "passed": 14, "failed": 1 }
},
"by_dimension": {
"structure": { "passed": 11, "failed": 1 },
"naming": { "passed": 3, "failed": 0 },
"description": { "passed": 2, "failed": 1 },
"content": { "passed": 13, "failed": 1 },
"security": { "passed": 5, "failed": 0 }
}
}
```
### CheckResult
An individual check result. There are 37 checks across 5 dimensions, each with a severity level.
| Field | Type | Description |
|---|---|---|
| `check_id` | string | Unique identifier for the check |
| `check_name` | string | Human-readable check name |
| `passed` | boolean | Whether this check passed |
| `severity` | `"high" \| "medium" \| "low"` | Severity level |
| `dimension` | string | `"structure"`, `"naming"`, `"description"`, `"content"`, or `"security"` |
| `message` | string | Human-readable result message |
| `fix` | `string?` | Suggested fix when the check fails |
| `details` | `object?` | Additional check-specific data |
| `location` | `string?` | File location relevant to the check |
Dimensions:
| Dimension | Description |
|---|---|
| `structure` | SKILL.md existence, valid frontmatter, file organization |
| `naming` | Skill name format, kebab-case, directory match |
| `description` | Presence, length, actionable phrasing |
| `content` | Body quality, examples, token budget, references |
| `security` | Prompt injection, YAML anomalies, unicode obfuscation |
Example:
```json
{
"check_id": "naming_format",
"check_name": "Name Format",
"passed": false,
"severity": "high",
"dimension": "naming",
"message": "Skill name 'MySkill' is not in kebab-case",
"fix": "Rename to 'my-skill'"
}
```
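Since `overall_pass` hinges on high-severity checks, those are the failures to surface first. A sketch that filters them out of a `results` array (`frontmatter_valid` is an illustrative check ID; `naming_format` reuses the example above):

```python
# Sketch: extract the failing high-severity checks (the ones that flip
# `overall_pass`) from an evaluation's `results` array.
def high_severity_failures(results: list[dict]) -> list[dict]:
    return [r for r in results if r["severity"] == "high" and not r["passed"]]

results = [
    {"check_id": "naming_format", "passed": False, "severity": "high",
     "dimension": "naming", "message": "Skill name 'MySkill' is not in kebab-case",
     "fix": "Rename to 'my-skill'"},
    {"check_id": "frontmatter_valid", "passed": True, "severity": "high",
     "dimension": "structure", "message": "Frontmatter parses"},
]
for failure in high_severity_failures(results):
    print(f"{failure['check_id']}: {failure['fix']}")  # naming_format: Rename to 'my-skill'
```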
---
## CLI Reference
`sklab` is the CLI that ships with the [`skill-lab`](https://pypi.org/project/skill-lab/) PyPI package. It runs the same static checks and LLM judge as the hosted API — locally, against a skill directory on your machine. Python 3.10 or newer is required.
### Installation
```bash
pip install skill-lab
# or: uv pip install skill-lab
sklab --version
```
Running `sklab` on its own scans the current directory for SKILL.md files and prints a getting-started guide.
**API keys** — static commands (`check`, `scan`, `info`, `list-checks`) run without credentials. LLM-powered commands (the judge step of `evaluate`, plus `generate`, `trigger`, and `optimize`) need an API key:
| Provider | Env var | Model prefix |
|---|---|---|
| Anthropic | `ANTHROPIC_API_KEY` | `claude-*` |
| OpenAI | `OPENAI_API_KEY` | `gpt-*` |
| Google | `GEMINI_API_KEY` | `gemini-*` |
Provider is auto-detected from the `--model` prefix. Keys are read from the environment or a `.env` file in the current directory. Default model is `claude-haiku-4-5-20251001`.
**Trigger testing prerequisite** — `sklab trigger` drives a live runtime through the Claude CLI:
```bash
npm install -g @anthropic-ai/claude-code
```
### Quickstart
From a fresh install to a scored skill in three steps.
```bash
# 1. Install and scan
pip install skill-lab
cd path/to/your-skills
sklab
# 2. Evaluate a skill (runs 37 static checks + 9-criteria LLM judge)
export ANTHROPIC_API_KEY=sk-ant-...
sklab evaluate ./my-skill
# Pass --skip-review to skip the LLM step.
# 3. Gate CI (exits 1 on any high-severity failure)
# .github/workflows/skills.yml
# - run: pip install skill-lab
# - run: sklab check --repo
```
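The commented workflow in step 3 expands to something like the following (action versions, Python version, and job names are assumptions, not prescribed by skill-lab):

```yaml
# .github/workflows/skills.yml: sketch of the CI gate from step 3 above.
name: skills
on: [push, pull_request]
jobs:
  check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - run: pip install skill-lab
      - run: sklab check --repo   # exits 1 on any high-severity failure
```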
### sklab evaluate
**Summary:** Static checks plus LLM quality review with 0–100 scoring.
**Usage:** `sklab evaluate [SKILL_PATH] [OPTIONS]`
Runs 37 static checks across Structure, Naming, Description, Content, and Security, then sends the skill to an LLM judge that scores it on 9 criteria across Activation and Instruction axes. Use `--skip-review` for a static-only run, or `--format json` to emit the same payload shape as the `/v1/evaluate` endpoint.
Arguments:
| Argument | Required | Description |
|---|---|---|
| `SKILL_PATH` | no | Path to the skill directory. Defaults to the current directory. |
Options:
| Flag | Value | Description |
|---|---|---|
| `--output`, `-o` | `path` | Write the report to a file (implies `--format json` if `--format` is not set). |
| `--format`, `-f` | `json\|console` (default: `console`) | Output format. |
| `--verbose`, `-V` | flag | Show all checks (including passing ones) and LLM reasoning. |
| `--spec-only`, `-s` | flag | Run only the checks required by the Agent Skills spec. |
| `--all`, `-a` | flag | Discover and evaluate every skill under the current directory. |
| `--repo` | flag | Discover and evaluate every skill from the git repo root. |
| `--skip-review` | flag | Skip the LLM judge (static checks only). |
| `--model`, `-m` | `model` (default: `claude-haiku-4-5-20251001`) | Model for the LLM judge. Supports Anthropic, OpenAI (`gpt-*`), Gemini (`gemini-*`). |
| `--optimize` | flag | Automatically chain into `sklab optimize` after evaluation. |
Examples:
```bash
sklab evaluate ./my-skill # Evaluate one skill
sklab evaluate ./my-skill -f json -o report.json # JSON report to disk
sklab evaluate ./my-skill --skip-review # Static checks only (no API key)
sklab evaluate --repo # Every skill in the current repo
sklab evaluate ./my-skill --optimize # Evaluate then optimize
```
**Output** — Console rendering groups checks by dimension with pass/fail status and LLM judge per-criterion scores. With `--format json`, output matches the `/v1/repos/{owner}/{repo}/evaluate` response payload.
**Exit codes:** `0` = all high-severity checks passed; `1` = one or more checks failed, or a CLI error occurred.
**Notes:**
- LLM review requires `ANTHROPIC_API_KEY`, `OPENAI_API_KEY`, or `GEMINI_API_KEY`. The env var is selected from the model prefix.
- `--all` and `--repo` are mutually exclusive, and cannot be combined with a positional `SKILL_PATH`.
**API equivalent:** `GET /v1/repos/{owner}/{repo}/evaluate`.
### sklab check
**Summary:** Quick pass/fail check — exits 0 or 1, designed for CI pipelines.
**Usage:** `sklab check [SKILL_PATH] [OPTIONS]`
Runs the same static checks as `evaluate`, skips the LLM step, and prints only high-severity failures. Non-zero exit code on failure makes it a drop-in gate for pre-commit hooks and CI jobs.
Options: `--spec-only` (`-s`), `--all` (`-a`), `--repo`.
Examples:
```bash
sklab check ./my-skill # CI gate
sklab check --all # Pre-commit (all skills)
sklab check --repo --spec-only # Spec-only in CI
```
Exit codes: `0` = all checks passed; `1` = one or more checks failed.
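For the pre-commit use case, one option is a local hook; whether skill-lab publishes a ready-made pre-commit hook is not documented, so `repo: local` is assumed here:

```yaml
# .pre-commit-config.yaml: sketch using a local hook (assumption; no
# published sklab pre-commit hook is documented).
repos:
  - repo: local
    hooks:
      - id: sklab-check
        name: sklab check
        entry: sklab check --all
        language: system
        pass_filenames: false
```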
### sklab scan
**Summary:** Security-focused scan with BLOCK / SUS / ALLOW status per skill.
**Usage:** `sklab scan [SKILL_PATH] [OPTIONS]`
Runs the 5 security checks — prompt injection, evaluator manipulation, unicode obfuscation, YAML anomalies, and suspicious size/structure — and classifies each skill as BLOCK, SUS, or ALLOW. BLOCK triggers a non-zero exit so you can wire it into pre-merge checks.
Options: `--all` (`-a`), `--verbose` (`-v`). In bulk mode, `--verbose` also shows findings for SUS skills, not just BLOCK.
Examples:
```bash
sklab scan ./my-skill # Scan one skill
sklab scan --all # Audit every skill
```
**Output** — Per-skill status: BLOCK on injection/jailbreak/unicode/YAML/evaluator findings; SUS on size or structure anomalies only; ALLOW if no findings.
Exit codes: `0` = ALLOW or SUS only; `1` = one or more skills classified BLOCK.
### sklab info
**Summary:** Skill metadata and token cost estimates (discovery / activation / on-demand).
**Usage:** `sklab info [SKILL_PATH] [OPTIONS]`
Shows the parsed frontmatter fields and three token-cost estimates: discovery (name + description — cost to keep loaded), activation (full SKILL.md body), on-demand (references/assets/scripts). Useful for catalog pages and budget planning — no LLM call, no API key required.
Options:
| Flag | Value | Description |
|---|---|---|
| `--json` | flag | Emit structured JSON (pipe-friendly). |
| `--field`, `-f` | `field` | Extract a single field by dotted path (e.g. `tokens.activation`). |
Examples:
```bash
sklab info ./my-skill # Human-readable
sklab info ./my-skill --json | jq .tokens # Pipe to jq
sklab info ./my-skill -f tokens.activation # Single field
```
**API equivalent:** `GET /v1/repos/{owner}/{repo}/info`.
### sklab prompt
**Summary:** Export one or more skills as a prompt snippet for agent platforms.
**Usage:** `sklab prompt [SKILL_PATHS...] [OPTIONS]`
Renders the named skills as XML, Markdown, or JSON suitable for pasting into an agent platform's system prompt. Accepts multiple paths to render as a single combined prompt.
Options: `--format`, `-f` (`xml|markdown|json`, default `xml`).
Examples:
```bash
sklab prompt ./skill-a # Single skill as XML
sklab prompt ./skill-a ./skill-b # Multiple skills
sklab prompt ./skill-a -f json > skills.json # JSON to a file
```
**API equivalent:** `GET /v1/repos/{owner}/{repo}/export`.
### sklab generate
**Summary:** Auto-generate ~13 trigger test cases for a skill via LLM.
**Usage:** `sklab generate [SKILL_PATH] [OPTIONS]`
Reads the SKILL.md description and produces `.sklab/tests/triggers.yaml` with ~13 test cases across all 4 trigger types (explicit, implicit, contextual, negative). Run `sklab trigger` afterwards to execute the tests against a live runtime.
Options:
| Flag | Value | Description |
|---|---|---|
| `--model`, `-m` | `model` (default: `claude-haiku-4-5-20251001`) | Model ID. Supports Anthropic, OpenAI, Gemini. |
| `--force` | flag | Overwrite an existing `triggers.yaml` file. |
Examples:
```bash
sklab generate ./my-skill # Default model
sklab generate ./my-skill -m claude-sonnet-4-6 # Different model
sklab generate ./my-skill --force # Overwrite existing
```
**Notes:**
- Requires an API key for the selected provider.
- The skill path is a positional argument — it comes before the `--model` flag.
**API equivalent:** `POST /v1/repos/{owner}/{repo}/triggers`.
### sklab trigger
**Summary:** Run trigger tests against a live LLM runtime.
**Usage:** `sklab trigger [SKILL_PATH] [OPTIONS]`
Executes the tests in `.sklab/tests/triggers.yaml` and reports which prompts correctly activated (or correctly failed to activate) the skill. Use `--type` to filter by a single trigger category while debugging.
Options:
| Flag | Value | Description |
|---|---|---|
| `--provider` | `local\|docker` (default: `local`) | Execution provider: `local` (temp-dir isolation) or `docker` (container). |
| `--type`, `-t` | `type` | Only run tests of this trigger type: `explicit`, `implicit`, `contextual`, or `negative`. |
| `--output`, `-o` | `path` | Write the JSON report to a file. |
| `--format`, `-f` | `json\|console` (default: `console`) | Output format. |
Examples:
```bash
sklab trigger ./my-skill # Run all tests
sklab trigger ./my-skill -t negative # Debug false positives
sklab trigger ./my-skill --provider docker # Container isolation
```
**Notes:**
- Requires the Claude CLI: `npm install -g @anthropic-ai/claude-code`.
- Run `sklab generate` first if `.sklab/tests/triggers.yaml` does not yet exist.
### sklab stats
**Summary:** Personal usage history and score trends across the skills you have evaluated.
**Usage:** `sklab stats [count|score|tokens] [--here]`
With no subcommand, prints a usage overview from `~/.sklab/usage.db`. Pass a subcommand to drill into one metric. Invocation data is populated by the PostToolUse hook — run `sklab setup` once to install it.
Subcommands:
| Subcommand | Description |
|---|---|
| `count` | Skill invocation counts for the current month. |
| `score` | Score trend across all evaluated skills. |
| `tokens` | Token usage per skill for the current month. |
Options: `--here` (limit results to skills inside the current git repo).
Examples:
```bash
sklab stats # Overview
sklab stats count # Per-month invocations
sklab stats tokens --here # Scope to this repo
```
**Note:** If you see "No usage data found", run `sklab setup` first.
### sklab list-checks
**Summary:** Browse all 37 checks across 5 dimensions.
**Usage:** `sklab list-checks [OPTIONS]`
Prints a table of every check with its ID, dimension, severity, and whether the Agent Skills spec requires it. Use as a reference when reading `evaluate` output.
Options:
| Flag | Value | Description |
|---|---|---|
| `--dimension`, `-d` | `dimension` | Filter by dimension: `structure`, `naming`, `description`, `content`, or `security`. |
| `--spec-only`, `-s` | flag | Only spec-required checks. |
| `--suggestions-only` | flag | Only quality-suggestion checks (non-spec). |
Examples:
```bash
sklab list-checks # All checks
sklab list-checks -d security # Security only
sklab list-checks --spec-only # Spec-required only
```
**API equivalent:** `GET /v1/checks`.
### sklab optimize
**Summary:** LLM-powered SKILL.md rewrite with diff preview and score delta.
**Usage:** `sklab optimize [SKILL_PATH] [OPTIONS]`
Reads the latest evaluation from `.sklab/evals/` (so run `sklab evaluate` first), feeds the static failures and judge feedback to an LLM, and proposes an improved SKILL.md. Shows a unified diff and a before/after score before applying. Pass `--auto` to skip the confirmation prompt.
Options:
| Flag | Value | Description |
|---|---|---|
| `--model`, `-m` | `model` (default: `claude-haiku-4-5-20251001`) | Model ID. Supports Anthropic, OpenAI, Gemini. |
| `--auto` | flag | Apply the rewrite without the confirmation prompt. |
Examples:
```bash
sklab optimize ./my-skill # Review diff interactively
sklab optimize ./my-skill --auto # Apply without prompt
sklab optimize ./my-skill -m claude-sonnet-4-6 # Stronger model
```
**Notes:**
- Requires a prior `sklab evaluate` run — optimize reads from `.sklab/evals/`.
- Requires `ANTHROPIC_API_KEY`, `OPENAI_API_KEY`, or `GEMINI_API_KEY`.
**API equivalent:** `POST /v1/repos/{owner}/{repo}/optimize`.
### sklab setup
**Summary:** Install the PostToolUse hook that powers `sklab stats`.
**Usage:** `sklab setup`
Writes PostToolUse hooks into `~/.claude/settings.json` (Claude Code) and `~/.cursor/hooks.json` (Cursor) so `sklab` records a row in `~/.sklab/usage.db` every time a skill fires. Safe to re-run — idempotent.
Example:
```bash
sklab setup # One-time install
```
**Notes:**
- Without this hook, `sklab stats` has no data to show.
- The hook only records skill names, token counts, and timestamps — no prompt or file contents.