Run
One-shot lifecycle command that chains init → baseline → spawn → eval → merge in a single invocation.
Install
Quick install
npx skills add https://github.com/alirezarezvani/claude-skills/tree/main/engineering/agenthub/skills/run
npx skills add alirezarezvani/claude-skills --skill run --agent claude-code
npx skills add alirezarezvani/claude-skills --skill run --agent cursor
npx skills add alirezarezvani/claude-skills --skill run --agent codex
npx skills add alirezarezvani/claude-skills --skill run --agent opencode
npx skills add alirezarezvani/claude-skills --skill run --agent github-copilot
npx skills add alirezarezvani/claude-skills --skill run --agent windsurf
More install options
Shorthand (useful for multi-skill repos):
npx skills add alirezarezvani/claude-skills --skill run
Manual (clone the repo and drop the folder into your agent's skills directory):
git clone https://github.com/alirezarezvani/claude-skills.git
cp -r claude-skills/engineering/agenthub/skills/run ~/.claude/skills/
/hub:run - One-Shot Lifecycle
Run the full AgentHub lifecycle in one command: initialize, capture baseline, spawn agents, evaluate results, and merge the winner.
Usage
/hub:run --task "Reduce p50 latency" --agents 3 \
--eval "pytest bench.py --json" --metric p50_ms --direction lower \
--template optimizer
/hub:run --task "Refactor auth module" --agents 2 --template refactorer
/hub:run --task "Cover untested utils" --agents 3 \
--eval "pytest --cov=utils --cov-report=json" --metric coverage_pct --direction higher \
--template test-writer
/hub:run --task "Write 3 email subject lines for spring sale campaign" --agents 3 --judge
Parameters
| Parameter | Required | Description |
|-----------|----------|-------------|
| --task | Yes | Task description for agents |
| --agents | No | Number of parallel agents (default: 3) |
| --eval | No | Eval command to measure results (skip for LLM judge mode) |
| --metric | No | Metric name to extract from eval output (required if --eval given) |
| --direction | No | lower or higher — which direction is better (required if --metric given) |
| --template | No | Agent template: optimizer, refactorer, test-writer, bug-fixer |
| --judge | No | Evaluate with the LLM judge instead of a metric (see the last usage example) |
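The dependency rules in the table (--metric is needed with --eval, --direction is needed with --metric) can be checked before any step runs. The following is a minimal sketch using argparse; `parse_run_args` is a hypothetical helper for illustration, not the parser AgentHub actually uses.

```python
import argparse

TEMPLATES = ["optimizer", "refactorer", "test-writer", "bug-fixer"]


def parse_run_args(argv):
    """Parse /hub:run-style flags and enforce the table's dependency rules.

    Hypothetical helper for illustration; not part of AgentHub.
    """
    parser = argparse.ArgumentParser(prog="hub-run-sketch")
    parser.add_argument("--task", required=True)
    parser.add_argument("--agents", type=int, default=3)
    parser.add_argument("--eval", dest="eval_cmd")
    parser.add_argument("--metric")
    parser.add_argument("--direction", choices=["lower", "higher"])
    parser.add_argument("--template", choices=TEMPLATES)
    args = parser.parse_args(argv)

    # --metric is required once an eval command is given.
    if args.eval_cmd and not args.metric:
        parser.error("--metric is required when --eval is given")
    # --direction is required once a metric is named.
    if args.metric and not args.direction:
        parser.error("--direction is required when --metric is given")
    return args
```

Failing early here means a missing --direction is reported before any agents are spawned, which fits the stop-on-failure rule.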
What It Does
Execute these steps sequentially:
Step 1: Initialize
Run /hub:init with the provided arguments:
python {skill_path}/scripts/hub_init.py \
--task "{task}" --agents {N} \
[--eval "{eval_cmd}"] [--metric {metric}] [--direction {direction}]
Display the session ID to the user.
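The bracketed flags in the command above are included only when the user supplied them. A small sketch of assembling that command as an argument list, assuming a hypothetical `build_init_command` helper and a placeholder skill path:

```python
import shlex


def build_init_command(skill_path, task, agents=3, eval_cmd=None,
                       metric=None, direction=None):
    """Build the hub_init.py argv list; optional flags are added only when set."""
    cmd = ["python", f"{skill_path}/scripts/hub_init.py",
           "--task", task, "--agents", str(agents)]
    if eval_cmd:
        cmd += ["--eval", eval_cmd]
    if metric:
        cmd += ["--metric", metric]
    if direction:
        cmd += ["--direction", direction]
    return cmd


# "/path/to/skill" is a placeholder, not a real install location.
cmd = build_init_command("/path/to/skill", "Refactor auth module", agents=2)
print(shlex.join(cmd))  # shell-quoted form, handy for logging
```

Passing the list directly to `subprocess.run` avoids quoting problems when the task description or eval command contains spaces.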
Step 2: Capture Baseline
If --eval was provided:
- Run the eval command in the current working directory
- Extract the metric value from stdout
- Display: Baseline captured: {metric} = {value}
- Append baseline: {value} to .agenthub/sessions/{session-id}/config.yaml
If no --eval was provided, skip this step.
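The baseline steps can be sketched as follows. This assumes the eval command prints a single JSON object to stdout with the metric as a top-level key; the real output format depends on the eval command, and `extract_metric` and `capture_baseline` are hypothetical names.

```python
import json
import subprocess
from pathlib import Path


def extract_metric(stdout, metric):
    """Read a metric from eval output, assuming a JSON object with a top-level key."""
    return float(json.loads(stdout)[metric])


def capture_baseline(eval_cmd, metric, config_path):
    """Run the eval command, extract the metric, and append it to config.yaml."""
    # check=True raises if the eval command fails, so the run stops here.
    result = subprocess.run(eval_cmd, shell=True, capture_output=True,
                            text=True, check=True)
    value = extract_metric(result.stdout, metric)
    print(f"Baseline captured: {metric} = {value}")
    # Append rather than rewrite, so existing session settings are preserved.
    with Path(config_path).open("a", encoding="utf-8") as fh:
        fh.write(f"baseline: {value}\n")
    return value
```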
Step 3: Spawn Agents
Run /hub:spawn with the session ID.
If --template was provided, use the template dispatch prompt from references/agent-templates.md instead of the default dispatch prompt. Pass the eval command, metric, and baseline to the template variables.
Launch all agents in a single message with multiple Agent tool calls (true parallelism).
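To illustrate how the eval command, metric, and baseline might reach a template, here is a minimal sketch. The template text and placeholder names are hypothetical; the actual prompts are defined in references/agent-templates.md and may use a different format.

```python
# Hypothetical prompt text; the real templates live in references/agent-templates.md.
OPTIMIZER_TEMPLATE = (
    "Task: {task}\n"
    "Measure with: {eval_cmd}\n"
    "Target metric: {metric} (baseline {baseline})\n"
)


def render_dispatch_prompt(template, task, eval_cmd, metric, baseline):
    """Fill the template variables passed from /hub:run into a dispatch prompt."""
    return template.format(task=task, eval_cmd=eval_cmd,
                           metric=metric, baseline=baseline)


prompt = render_dispatch_prompt(OPTIMIZER_TEMPLATE, "Reduce p50 latency",
                                "pytest bench.py --json", "p50_ms", 180.0)
print(prompt)
```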
Step 4: Wait and Monitor
After spawning, inform the user that agents are running. When all agents complete (Agent tool returns results):
- Display a brief summary of each agent's work
- Proceed to evaluation
Step 5: Evaluate
Run /hub:eval with the session ID:
- If --eval was provided: metric-based ranking with result_ranker.py
- If no --eval: LLM judge mode (coordinator reads diffs and ranks)
If baseline was captured, pass --baseline {value} to result_ranker.py so deltas are shown.
Display the ranked results table.
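Metric-based ranking with baseline deltas can be pictured with the sketch below. It is not the implementation of result_ranker.py; `rank_results` is a hypothetical function, and the values for agent-1 and agent-3 are made up for illustration (agent-2 and the baseline follow the example in Step 6).

```python
def rank_results(results, direction, baseline=None):
    """Rank agents by metric; results maps agent name -> metric value."""
    reverse = direction == "higher"  # higher-is-better sorts descending
    ranked = sorted(results.items(), key=lambda item: item[1], reverse=reverse)
    rows = []
    for rank, (agent, value) in enumerate(ranked, start=1):
        # The delta is only available when a baseline was captured.
        delta = None if baseline is None else value - baseline
        rows.append({"rank": rank, "agent": agent, "value": value, "delta": delta})
    return rows


rows = rank_results({"agent-1": 150.0, "agent-2": 128.0, "agent-3": 171.0},
                    direction="lower", baseline=180.0)
for row in rows:
    print(f"{row['rank']}  {row['agent']}  {row['value']:.0f}ms  {row['delta']:+.0f}ms")
```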
Step 6: Confirm and Merge
Present the results to the user and ask for confirmation:
Agent-2 is the winner (128ms, -52ms from baseline).
Merge agent-2's branch? [Y/n]
If confirmed, run /hub:merge. If declined, inform the user they can:
- /hub:merge --agent agent-{N} to pick a different winner
- /hub:eval --judge to re-evaluate with LLM judge
- Inspect branches manually
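The [Y/n] prompt follows the usual convention that the capitalized option is the default. A minimal sketch of interpreting the reply, assuming a hypothetical `confirm_merge` helper:

```python
def confirm_merge(answer):
    """Interpret a reply to a [Y/n] prompt: empty input means yes (the default)."""
    reply = answer.strip().lower()
    # Anything other than an explicit or default yes is treated as a decline.
    return reply in ("", "y", "yes")
```

Treating unrecognized input as a decline keeps the merge from happening on an ambiguous answer.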
Critical Rules
- Sequential execution: each step depends on the previous one
- Stop on failure: if any step fails, report the error and stop
- User confirms merge: never auto-merge without asking
- Template is optional: without --template, agents use the default dispatch prompt from /hub:spawn
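The first two rules amount to a simple sequential runner that halts at the first error. A minimal sketch, assuming each lifecycle step is wrapped in a callable; `run_pipeline` is a hypothetical name:

```python
def run_pipeline(steps):
    """Run (name, callable) steps in order; stop at the first failure.

    Returns the names of completed steps and the name of the failed step
    (or None if every step succeeded).
    """
    completed = []
    for name, step in steps:
        try:
            step()
        except Exception as exc:
            # Report the error and stop; later steps are never started.
            print(f"Step '{name}' failed: {exc}")
            return completed, name
        completed.append(name)
    return completed, None
```

With steps ordered init, baseline, spawn, eval, merge, a failure in eval means merge is never reached.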