Cross Eval
/cs:cross-eval <memo> — Multi-model consensus on a board memo or strategy brief. Claude + Codex + Gemini cross-review with graceful degradation.
Install
Quick install
npx skills add https://github.com/alirezarezvani/claude-skills/tree/main/c-level-advisor/c-level-agents/skills/cross-evalnpx skills add alirezarezvani/claude-skills --skill cross-eval --agent claude-codenpx skills add alirezarezvani/claude-skills --skill cross-eval --agent cursornpx skills add alirezarezvani/claude-skills --skill cross-eval --agent codexnpx skills add alirezarezvani/claude-skills --skill cross-eval --agent opencodenpx skills add alirezarezvani/claude-skills --skill cross-eval --agent github-copilotnpx skills add alirezarezvani/claude-skills --skill cross-eval --agent windsurfMore install options
Shorthand — useful for multi-skill repos:
npx skills add alirezarezvani/claude-skills --skill cross-evalManual — clone the repo and drop the folder into your agent's skills directory:
git clone https://github.com/alirezarezvani/claude-skills.gitcp -r claude-skills/c-level-advisor/c-level-agents/skills/cross-eval ~/.claude/skills//cs:cross-eval — Multi-Model Consensus
Command: /cs:cross-eval <memo-or-brief>
Runs the same memo through multiple model providers and reconciles divergences. Use for high-stakes, irreversible decisions where single-model bias is too costly: M&A, major fundraises, layoffs, strategic pivots, regulatory commitments.
Adapted from gstack's /codex cross-review pattern, generalized to business memos instead of code PRs.
When to Run
- Before signing a term sheet
- Before announcing a layoff
- Before committing to a regulated market
- Before any decision where reversing costs > 6 months of company time
- When the boardroom vote was split or had a CRITICAL dissent
Models Used (graceful degradation)
The command tries to invoke each available model in order:
- Claude (primary, always available) — the boardroom's native voice
- Codex / OpenAI (if
OPENAI_API_KEYorcodexCLI available) - Gemini (if
GEMINI_API_KEYorgeminiCLI available)
If only Claude is available, the command runs Claude-only with adversarial mode — same model, different prompt seeds — and clearly labels the output as single-model.
Workflow
- Read the memo / brief
- Probe environment for available model CLIs / API keys
- For each available model:
- Send the memo with this prompt prefix:
- Collect three independent reviews
- Reconcile: where do they agree? Where do they diverge?
- Surface the divergences as questions for the founder
Output Format
Saved to ~/.claude/cross-eval/YYYY-MM-DD-<slug>.md:
# Cross-Eval: <memo title>
**Date:** YYYY-MM-DD
**Memo reviewed:** <link>
**Models invoked:** Claude / Codex / Gemini (or noted fallbacks)
## Vote Tally
| Model | Vote | Confidence |
|---|---|---|
| Claude | APPROVE | High |
| Codex | DEFER | Med |
| Gemini | APPROVE | Low |
## Consensus Concerns (≥2 models flagged)
1. <concern> — flagged by Claude + Codex
2. <concern> — flagged by all 3
## Divergent Concerns (1 model flagged)
- <Codex only:> <concern> — worth a second look
- <Gemini only:> <concern> — likely noise, but check
## Consensus Supports (≥2 models endorsed)
1. <support>
2. <support>
## Recommendation
- 🟢 GO if 2+ models APPROVE and no CRITICAL concerns from any model
- 🟡 PAUSE if any model is DEFER or any concern is CRITICAL
- 🔴 STOP if 2+ models REJECT
## Open Questions for Founder
1. <question raised by divergence>
2. <question raised by divergence>
Why This Matters
Single-model recommendations have systematic biases. Claude trends helpful and may under-weight risk. Codex (OpenAI) trends more cautious on emerging-market and regulatory topics. Gemini trends more cautious on technical scale claims. Disagreement is signal, not noise.
This is the safety net before irreversibility — not a replacement for outside counsel or a real board.
Graceful Degradation
If only Claude is available:
**Models available:** Claude only
**Mode:** ADVERSARIAL — running 3 independent Claude passes with different system prompts:
1. Standard reviewer
2. Devil's advocate (must find 3 critical concerns)
3. Steelman (must find 3 strongest reasons to approve)
This is weaker than true multi-model. Treat the result as suggestive, not conclusive.
Routing
/cs:decide— if consensus is GO/cs:freeze— if consensus is PAUSE/cs:boardroom(re-run) — if consensus is STOP
Related
- Skills: [
board-meeting](../../../skills/board-meeting/SKILL.md), [executive-mentor](../../../executive-mentor/) - Inspiration: gstack's
/codexcross-review pattern (adapted to business memos)
---
Version: 1.0.0
SKILL.md source
---
name: cross-eval
description: /cs:cross-eval <memo> — Multi-model consensus on a board memo or strategy brief. Claude + Codex + Gemini cross-review with graceful degradation.
---
# /cs:cross-eval — Multi-Model Consensus
**Command:** `/cs:cross-eval <memo-or-brief>`
Runs the same memo through multiple model providers and reconciles divergences. Use for **high-stakes, irreversible decisions** where single-model bias is too costly: M&A, major fundraises, layoffs, strategic pivots, regulatory commitments.
Adapted from gstack's `/codex` cross-review pattern, generalized to **business memos** instead of code PRs.
## When to Run
- Before signing a term sheet
- Before announcing a layoff
- Before committing to a regulated market
- Before any decision where reversing costs > 6 months of company time
- When the boardroom vote was split or had a CRITICAL dissent
## Models Used (graceful degradation)
The command tries to invoke each available model in order:
1. **Claude** (primary, always available) — the boardroom's native voice
2. **Codex / OpenAI** (if `OPENAI_API_KEY` or `codex` CLI available)
3. **Gemini** (if `GEMINI_API_KEY` or `gemini` CLI available)
If only Claude is available, the command runs **Claude-only with adversarial mode** — same model, different prompt seeds — and clearly labels the output as single-model.
## Workflow
1. Read the memo / brief
2. Probe environment for available model CLIs / API keys
3. For each available model:
- Send the memo with this prompt prefix:
> "You are an independent C-suite reviewer. The following is a board memo from another company's boardroom. Identify the top 3 concerns, the top 3 supports, and your vote (APPROVE / REJECT / DEFER). Do not deferentially agree — assume the memo's reasoning is flawed until proven otherwise."
4. Collect three independent reviews
5. Reconcile: where do they agree? Where do they diverge?
6. Surface the divergences as questions for the founder
## Output Format
Saved to `~/.claude/cross-eval/YYYY-MM-DD-<slug>.md`:
```markdown
# Cross-Eval: <memo title>
**Date:** YYYY-MM-DD
**Memo reviewed:** <link>
**Models invoked:** Claude / Codex / Gemini (or noted fallbacks)
## Vote Tally
| Model | Vote | Confidence |
|---|---|---|
| Claude | APPROVE | High |
| Codex | DEFER | Med |
| Gemini | APPROVE | Low |
## Consensus Concerns (≥2 models flagged)
1. <concern> — flagged by Claude + Codex
2. <concern> — flagged by all 3
## Divergent Concerns (1 model flagged)
- <Codex only:> <concern> — worth a second look
- <Gemini only:> <concern> — likely noise, but check
## Consensus Supports (≥2 models endorsed)
1. <support>
2. <support>
## Recommendation
- 🟢 GO if 2+ models APPROVE and no CRITICAL concerns from any model
- 🟡 PAUSE if any model is DEFER or any concern is CRITICAL
- 🔴 STOP if 2+ models REJECT
## Open Questions for Founder
1. <question raised by divergence>
2. <question raised by divergence>
```
## Why This Matters
Single-model recommendations have systematic biases. Claude trends helpful and may under-weight risk. Codex (OpenAI) trends more cautious on emerging-market and regulatory topics. Gemini trends more cautious on technical scale claims. Disagreement is signal, not noise.
This is the **safety net before irreversibility** — not a replacement for outside counsel or a real board.
## Graceful Degradation
If only Claude is available:
```markdown
**Models available:** Claude only
**Mode:** ADVERSARIAL — running 3 independent Claude passes with different system prompts:
1. Standard reviewer
2. Devil's advocate (must find 3 critical concerns)
3. Steelman (must find 3 strongest reasons to approve)
This is weaker than true multi-model. Treat the result as suggestive, not conclusive.
```
## Routing
- `/cs:decide` — if consensus is GO
- `/cs:freeze` — if consensus is PAUSE
- `/cs:boardroom` (re-run) — if consensus is STOP
## Related
- Skills: [`board-meeting`](../../../skills/board-meeting/SKILL.md), [`executive-mentor`](../../../executive-mentor/)
- Inspiration: gstack's `/codex` cross-review pattern (adapted to business memos)
---
**Version:** 1.0.0
Related skills 6
to-prd
Turn the current conversation context into a PRD and publish it to the project issue tracker. Use when user wants to create a PRD from the current context.
to-issues
Break a plan, spec, or PRD into independently-grabbable issues on the project issue tracker using tracer-bullet vertical slices. Use when user wants to convert a plan into issues, create implementation tickets, or break down work into issues.
C Level Agents
Founder-mode executive team. 8 cs-* C-suite agents (CFO, CMO, CRO, CPO, COO, CHRO, CISO, Chief of Staff) and 17 /cs:* slash commands for forcing-question office hours, multi-role boardroom delibera...
Boardroom
/cs:boardroom <brief> — 6-phase multi-role deliberation across the C-suite with Phase 2 isolation, critic pre-screen, and synthesis. Outputs a board memo.
Brief
/cs:brief <topic> — Generate a one-page strategy brief from an office-hours intake. First step in the strategic sprint pipeline.
Caio Review
/cs:caio-review <plan> — Eval-demanding Chief AI Officer interrogation of any plan that involves AI: model selection, risk classification, cost economics, or AI hiring.