Execute And Judge Loop

Single-task execution with LLM-as-Judge verification in an iterative loop, supporting auto-retry and strict orchestrator role separation

Author: NeoLabHQ
Version: 1.0.0
License: MIT
Token count: ~565
Updated: Jun 5, 2026

Install

Quick install

via npx skills · works with 57+ agents
npx skills add https://github.com/NeoLabHQ/context-engineering-kit/tree/master/plugins/sadd/skills/do-and-judge
Or pick agent:
npx skills add NeoLabHQ/context-engineering-kit --skill "Execute and Judge Loop" --agent claude-code
npx skills add NeoLabHQ/context-engineering-kit --skill "Execute and Judge Loop" --agent cursor
npx skills add NeoLabHQ/context-engineering-kit --skill "Execute and Judge Loop" --agent codex
npx skills add NeoLabHQ/context-engineering-kit --skill "Execute and Judge Loop" --agent opencode
npx skills add NeoLabHQ/context-engineering-kit --skill "Execute and Judge Loop" --agent github-copilot
npx skills add NeoLabHQ/context-engineering-kit --skill "Execute and Judge Loop" --agent windsurf
More install options

Shorthand — useful for multi-skill repos:

npx skills add NeoLabHQ/context-engineering-kit --skill "Execute and Judge Loop"

Manual — clone the repo and drop the folder into your agent's skills directory:

git clone https://github.com/NeoLabHQ/context-engineering-kit.git
cp -r context-engineering-kit/plugins/sadd/skills/do-and-judge ~/.claude/skills/
How to use: Once installed, ask your agent to "use the Execute and Judge Loop skill" or describe what you want (e.g. "Single-task execution with LLM-as-Judge verification in an iterative loop, suppo"). Requires Node.js 18+.

What is it?
Single-task execution with LLM-as-Judge verification in an iterative loop, supporting auto-retry and strict orchestrator role separation. Built for use cases involving execute-verify, llm-as-judge, retry-loop, quality-gate, and orchestration.
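
The loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the skill's actual implementation: the names `execute_and_judge`, `Verdict`, `LoopResult`, and `max_attempts`, as well as the stub executor and judge, are hypothetical. In practice both callables would wrap LLM calls.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Verdict:
    """Judge decision for one attempt (hypothetical shape)."""
    passed: bool
    feedback: str


@dataclass
class LoopResult:
    output: Optional[str]
    accepted: bool
    attempts: int
    history: List[Verdict]


def execute_and_judge(
    task: str,
    execute: Callable[[str, Optional[str]], str],
    judge: Callable[[str, str], Verdict],
    max_attempts: int = 3,
) -> LoopResult:
    """Run one task through an execute -> judge -> retry loop.

    The orchestrator only routes data between the two roles.
    It never writes or edits the output itself.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    history: List[Verdict] = []
    feedback: Optional[str] = None
    output: Optional[str] = None
    for attempt in range(1, max_attempts + 1):
        output = execute(task, feedback)   # executor role
        verdict = judge(task, output)      # judge role, acts as quality gate
        history.append(verdict)
        if verdict.passed:
            return LoopResult(output, True, attempt, history)
        feedback = verdict.feedback        # auto-retry with the judge's notes
    return LoopResult(output, False, max_attempts, history)


# Stub roles standing in for LLM calls (assumptions for illustration only).
def stub_execute(task: str, feedback: Optional[str]) -> str:
    return task.upper() if feedback else task


def stub_judge(task: str, output: str) -> Verdict:
    if output.isupper():
        return Verdict(True, "ok")
    return Verdict(False, "use upper case")


result = execute_and_judge("ship it", stub_execute, stub_judge)
print(result.accepted, result.attempts, result.output)
```

Passing the executor and judge in as separate callables is one way to keep the roles apart: the orchestrator can be tested with stubs, and the retry limit stays a single parameter.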

How to use it?

Install this skill in your Claude environment to enhance execute and judge loop capabilities. Once installed, Claude will automatically apply the skill's guidelines when relevant tasks are detected. You can also explicitly invoke it by referencing its name in your prompts.

The full source and documentation are available on GitHub.

Key Features

  • Single-task execution with LLM-as-Judge verification in an iterative loop, supporting auto-retry and strict orchestrator role separation
  • Seamless integration with Claude's development workflow
  • Comprehensive guidelines and best practices for execute and judge loops

GitHub Stats

Author: NeoLabHQ
License: GPL-3.0
Version: 1.0.0

Categories

AI & ML, Developer Tools

Tags

execute-verify, llm-as-judge, retry-loop, quality-gate, orchestration

Related Skills

More from AI & ML

Agent Evaluation Framework

Comprehensive Claude Code agent evaluation framework with multi-dimensional scoring, LLM-as-Judge mode, and research-backed performance variance analysis

433 stars · NeoLabHQ · AI & ML, Developer Tools

Self-Reflection Framework

Iterative self-improvement system with task complexity grading, strict quality gatekeeper role, confidence thresholds, and verification checklists

433 stars · NeoLabHQ · AI & ML, Developer Tools

Multi-Perspective Critique

Multi-perspective review system using Multi-Agent Debate and LLM-as-Judge patterns with 3 specialized judges, debate rounds, and consensus building

433 stars · NeoLabHQ · AI & ML, Developer Tools

---

Source: https://github.com/NeoLabHQ/context-engineering-kit/tree/master/plugins/sadd/skills/do-and-judge
Author: NeoLabHQ
License: https://www.gnu.org/licenses/gpl-3.0.html
GitHub Stars: 433
Tags: execute-verify, llm-as-judge, retry-loop, quality-gate, orchestration

SKILL.md source

---
name: Execute and Judge Loop
description: Single-task execution with LLM-as-Judge verification in an iterative loop, supporting auto-retry and strict orchestrator role separation
---

# Execute and Judge Loop

Single-task execution with LLM-as-Judge verification in an iterative loop, supporting auto-retry and strict orchestrator role separation

## What is it?
Single-task execution with LLM-as-Judge verification in an iterative loop, supporting auto-retry and strict orchestrator role separation. Built for use cases involving execute-verify, llm-as-judge, retry-loop, quality-gate, and orchestration.

## How to use it?
Install this skill in your Claude environment to enhance execute and judge loop capabilities. Once installed, Claude will automatically apply the skill's guidelines when relevant tasks are detected. You can also explicitly invoke it by referencing its name in your prompts.

The full source and documentation are available on GitHub.

## Key Features

* Single-task execution with LLM-as-Judge verification in an iterative loop, supporting auto-retry and strict orchestrator role separation
* Seamless integration with Claude's development workflow
* Comprehensive guidelines and best practices for execute and judge loops

### GitHub Stats
* Author: NeoLabHQ
* License: GPL-3.0
* Version: 1.0.0

### Categories
AI & ML, Developer Tools

### Tags
execute-verify, llm-as-judge, retry-loop, quality-gate, orchestration

## Related Skills
More from AI & ML

### Agent Evaluation Framework
Comprehensive Claude Code agent evaluation framework with multi-dimensional scoring, LLM-as-Judge mode, and research-backed performance variance analysis

433 stars · NeoLabHQ · AI & ML, Developer Tools

### Self-Reflection Framework
Iterative self-improvement system with task complexity grading, strict quality gatekeeper role, confidence thresholds, and verification checklists

433 stars · NeoLabHQ · AI & ML, Developer Tools

### Multi-Perspective Critique
Multi-perspective review system using Multi-Agent Debate and LLM-as-Judge patterns with 3 specialized judges, debate rounds, and consensus building

433 stars · NeoLabHQ · AI & ML, Developer Tools

---

**Source**: https://github.com/NeoLabHQ/context-engineering-kit/tree/master/plugins/sadd/skills/do-and-judge
**Author**: NeoLabHQ
**License**: https://www.gnu.org/licenses/gpl-3.0.html
**GitHub Stars**: 433
**Tags**: execute-verify, llm-as-judge, retry-loop, quality-gate, orchestration
