NEW Browse AI tools across categories — updated daily. See what's new →
★ Featured Development

Run

One-shot lifecycle command that chains init → baseline → spawn → eval → merge in a single invocation.

Version1.0.0
LicenseMIT
Token count~917
UpdatedJun 4, 2026

Install

Quick install

via npx skills · works with 57+ agents
npx skills add https://github.com/alirezarezvani/claude-skills/tree/main/engineering/agenthub/skills/run
Or pick agent:
npx skills add alirezarezvani/claude-skills --skill run --agent claude-code
npx skills add alirezarezvani/claude-skills --skill run --agent cursor
npx skills add alirezarezvani/claude-skills --skill run --agent codex
npx skills add alirezarezvani/claude-skills --skill run --agent opencode
npx skills add alirezarezvani/claude-skills --skill run --agent github-copilot
npx skills add alirezarezvani/claude-skills --skill run --agent windsurf
More install options

Shorthand — useful for multi-skill repos:

npx skills add alirezarezvani/claude-skills --skill run

Manual — clone the repo and drop the folder into your agent's skills directory:

git clone https://github.com/alirezarezvani/claude-skills.git
cp -r claude-skills/engineering/agenthub/skills/run ~/.claude/skills/
How to use: Once installed, ask your agent to "use the run skill" or describe what you want (e.g. "One-shot lifecycle command that chains init → baseline → spawn → eval → merge in"). Requires Node.js 18+.

/hub:run — One-Shot Lifecycle

Run the full AgentHub lifecycle in one command: initialize, capture baseline, spawn agents, evaluate results, and merge the winner.

Usage

/hub:run --task "Reduce p50 latency" --agents 3 \
  --eval "pytest bench.py --json" --metric p50_ms --direction lower \
  --template optimizer

/hub:run --task "Refactor auth module" --agents 2 --template refactorer

/hub:run --task "Cover untested utils" --agents 3 \
  --eval "pytest --cov=utils --cov-report=json" --metric coverage_pct --direction higher \
  --template test-writer

/hub:run --task "Write 3 email subject lines for spring sale campaign" --agents 3 --judge

Parameters

| Parameter | Required | Description |
|-----------|----------|-------------|
| --task | Yes | Task description for agents |
| --agents | No | Number of parallel agents (default: 3) |
| --eval | No | Eval command to measure results (skip for LLM judge mode) |
| --metric | No | Metric name to extract from eval output (required if --eval given) |
| --direction | No | lower or higher — which direction is better (required if --metric given) |
| --template | No | Agent template: optimizer, refactorer, test-writer, bug-fixer |

What It Does

Execute these steps sequentially:

Step 1: Initialize

Run /hub:init with the provided arguments:

python {skill_path}/scripts/hub_init.py \
  --task "{task}" --agents {N} \
  [--eval "{eval_cmd}"] [--metric {metric}] [--direction {direction}]

Display the session ID to the user.

Step 2: Capture Baseline

If --eval was provided:

  1. Run the eval command in the current working directory
  2. Extract the metric value from stdout
  3. Display: Baseline captured: {metric} = {value}
  4. Append baseline: {value} to .agenthub/sessions/{session-id}/config.yaml

If no --eval was provided, skip this step.

Step 3: Spawn Agents

Run /hub:spawn with the session ID.

If --template was provided, use the template dispatch prompt from references/agent-templates.md instead of the default dispatch prompt. Pass the eval command, metric, and baseline to the template variables.

Launch all agents in a single message with multiple Agent tool calls (true parallelism).

Step 4: Wait and Monitor

After spawning, inform the user that agents are running. When all agents complete (Agent tool returns results):

  1. Display a brief summary of each agent's work
  2. Proceed to evaluation

Step 5: Evaluate

Run /hub:eval with the session ID:

  • If --eval was provided: metric-based ranking with result_ranker.py
  • If no --eval: LLM judge mode (coordinator reads diffs and ranks)

If baseline was captured, pass --baseline {value} to result_ranker.py so deltas are shown.

Display the ranked results table.

Step 6: Confirm and Merge

Present the results to the user and ask for confirmation:

Agent-2 is the winner (128ms, -52ms from baseline).
Merge agent-2's branch? [Y/n]

If confirmed, run /hub:merge. If declined, inform the user they can:


  • /hub:merge --agent agent-{N} to pick a different winner

  • /hub:eval --judge to re-evaluate with LLM judge

  • Inspect branches manually

Critical Rules

  • Sequential execution — each step depends on the previous
  • Stop on failure — if any step fails, report the error and stop
  • User confirms merge — never auto-merge without asking
  • Template is optional — without --template, agents use the default dispatch prompt from /hub:spawn

SKILL.md source

---
name: run
description: One-shot lifecycle command that chains init → baseline → spawn → eval → merge in a single invocation.
---

# /hub:run — One-Shot Lifecycle

Run the full AgentHub lifecycle in one command: initialize, capture baseline, spawn agents, evaluate results, and merge the winner.

## Usage

```
/hub:run --task "Reduce p50 latency" --agents 3 \
  --eval "pytest bench.py --json" --metric p50_ms --direction lower \
  --template optimizer

/hub:run --task "Refactor auth module" --agents 2 --template refactorer

/hub:run --task "Cover untested utils" --agents 3 \
  --eval "pytest --cov=utils --cov-report=json" --metric coverage_pct --direction higher \
  --template test-writer

/hub:run --task "Write 3 email subject lines for spring sale campaign" --agents 3 --judge
```

## Parameters

| Parameter | Required | Description |
|-----------|----------|-------------|
| `--task` | Yes | Task description for agents |
| `--agents` | No | Number of parallel agents (default: 3) |
| `--eval` | No | Eval command to measure results (skip for LLM judge mode) |
| `--metric` | No | Metric name to extract from eval output (required if `--eval` given) |
| `--direction` | No | `lower` or `higher` — which direction is better (required if `--metric` given) |
| `--template` | No | Agent template: `optimizer`, `refactorer`, `test-writer`, `bug-fixer` |

## What It Does

Execute these steps sequentially:

### Step 1: Initialize

Run `/hub:init` with the provided arguments:

```bash
python {skill_path}/scripts/hub_init.py \
  --task "{task}" --agents {N} \
  [--eval "{eval_cmd}"] [--metric {metric}] [--direction {direction}]
```

Display the session ID to the user.

### Step 2: Capture Baseline

If `--eval` was provided:

1. Run the eval command in the current working directory
2. Extract the metric value from stdout
3. Display: `Baseline captured: {metric} = {value}`
4. Append `baseline: {value}` to `.agenthub/sessions/{session-id}/config.yaml`

If no `--eval` was provided, skip this step.

### Step 3: Spawn Agents

Run `/hub:spawn` with the session ID.

If `--template` was provided, use the template dispatch prompt from `references/agent-templates.md` instead of the default dispatch prompt. Pass the eval command, metric, and baseline to the template variables.

Launch all agents in a single message with multiple Agent tool calls (true parallelism).

### Step 4: Wait and Monitor

After spawning, inform the user that agents are running. When all agents complete (Agent tool returns results):

1. Display a brief summary of each agent's work
2. Proceed to evaluation

### Step 5: Evaluate

Run `/hub:eval` with the session ID:

- If `--eval` was provided: metric-based ranking with `result_ranker.py`
- If no `--eval`: LLM judge mode (coordinator reads diffs and ranks)

If baseline was captured, pass `--baseline {value}` to `result_ranker.py` so deltas are shown.

Display the ranked results table.

### Step 6: Confirm and Merge

Present the results to the user and ask for confirmation:

```
Agent-2 is the winner (128ms, -52ms from baseline).
Merge agent-2's branch? [Y/n]
```

If confirmed, run `/hub:merge`. If declined, inform the user they can:
- `/hub:merge --agent agent-{N}` to pick a different winner
- `/hub:eval --judge` to re-evaluate with LLM judge
- Inspect branches manually

## Critical Rules

- **Sequential execution** — each step depends on the previous
- **Stop on failure** — if any step fails, report the error and stop
- **User confirms merge** — never auto-merge without asking
- **Template is optional** — without `--template`, agents use the default dispatch prompt from `/hub:spawn`

Related skills 6

caveman

★ Featured

Ultra-compressed communication mode. Cuts token usage ~75% by speaking like caveman while keeping full technical accuracy. Supports intensity levels: lite, full (default), ultra, wenyan-lite, wenyan-full, wenyan-ultra. Use when user says "caveman mode", "talk like caveman", "use caveman", "less tokens", "be brief", or invokes /caveman. Also auto-triggers when token efficiency is requested.

juliusbrussee 167k
Development

secure-linux-web-hosting

★ Featured

Use when setting up, hardening, or reviewing a cloud server for self-hosting, including DNS, SSH, firewalls, Nginx, static-site hosting, reverse-proxying an app, HTTPS with Let's Encrypt or ACME clients, safe HTTP-to-HTTPS redirects, or optional post-launch network tuning such as BBR.

xixu-me 155k
Development

readme-i18n

★ Featured

Use when the user wants to translate a repository README, make a repo multilingual, localize docs, add a language switcher, internationalize the README, or update localized README variants in a GitHub-style repository.

xixu-me 155k
Development

lark-shared

★ Featured

Use when first setting up lark-cli, running auth login, switching user/bot identity (--as), handling permission denied or scope errors, needing to update lark-cli, or seeing _notice in JSON output.

larksuite 155k
Development

improve-codebase-architecture

★ Featured

Find deepening opportunities in a codebase, informed by the domain language in CONTEXT.md and the decisions in docs/adr/. Use when the user wants to improve architecture, find refactoring opportunities, consolidate tightly-coupled modules, or make a codebase more testable and AI-navigable.

mattpocock 151k
Development

paper-context-resolver

★ Featured

Optional RigorPilot helper for README-first deep learning repo reproduction. Use only when the README and repository files leave a narrow reproduction-critical gap and the task is to resolve a specific paper detail such as dataset split, preprocessing, evaluation protocol, checkpoint mapping, or runtime assumption from primary paper sources while recording conflicts. Do not use for general paper summary, repo scanning, environment setup, command execution, title-only paper lookup, or replacin...

lllllllama 127k
Development