★ Featured Development

Feature Flags Architect

Version2.9.0

LicenseMIT

Token count~2,429

UpdatedJun 4, 2026

Use when adding, retiring, or auditing feature flags. Triggers on "add a flag", "ship behind a flag", "rollout plan", "kill switch", "stale flags", "flag debt", "LaunchDarkly", "GrowthBook", "Statsig", "Unleash", "Flipt", or any progressive-delivery question. Ships flag debt scanner, rollout planner, and kill-switch auditor (all stdlib Python), 4 references on flag taxonomy + provider trade-offs + rollout strategies + lifecycle, plus a /flag-cleanup slash command.

Install

Quick install

via npx skills · works with 57+ agents

npx skills add https://github.com/alirezarezvani/claude-skills/tree/main/engineering/skills/feature-flags-architect

Or pick agent:

npx skills add alirezarezvani/claude-skills --skill feature-flags-architect --agent claude-code

npx skills add alirezarezvani/claude-skills --skill feature-flags-architect --agent cursor

npx skills add alirezarezvani/claude-skills --skill feature-flags-architect --agent codex

npx skills add alirezarezvani/claude-skills --skill feature-flags-architect --agent opencode

npx skills add alirezarezvani/claude-skills --skill feature-flags-architect --agent github-copilot

npx skills add alirezarezvani/claude-skills --skill feature-flags-architect --agent windsurf

More install options

Shorthand — useful for multi-skill repos:

npx skills add alirezarezvani/claude-skills --skill feature-flags-architect

Manual — clone the repo and drop the folder into your agent's skills directory:

git clone https://github.com/alirezarezvani/claude-skills.git

cp -r claude-skills/engineering/skills/feature-flags-architect ~/.claude/skills/

How to use: Once installed, ask your agent to "use the feature-flags-architect skill" or describe what you want (e.g. "Use when adding, retiring, or auditing feature flags. Triggers on "add a flag","). Requires Node.js 18+.

Feature Flags Architect

End-to-end discipline for feature flags: classify them, ship them, ramp them, and retire them. Most teams treat flags as throwaway if-statements; this skill treats them as a controlled lifecycle with measurable debt.

When to use

Adding a new flag and need a rollout plan
Auditing a codebase for stale or orphaned flags
Choosing a flag provider (LaunchDarkly vs GrowthBook vs Statsig vs Unleash vs Flipt vs build-your-own)
Designing a kill-switch path for a risky launch
Cleaning up flag debt before a release freeze
Reviewing whether a feature should ship behind a flag at all

Core principle: flags are a lifecycle, not an `if`

request → design → ship → ramp → cleanup → archive

Flags that skip cleanup become debt: dead branches, stale defaults, untested code paths, unbounded blast radius. The three scripts in this skill enforce the lifecycle.

Quick start

# 1. Audit the repo for flag debt
python scripts/flag_debt_scanner.py --repo . --max-age-days 90

# 2. Plan a progressive rollout for a new flag
python scripts/rollout_planner.py --population 100000 --target-percent 100 --duration-days 14 --strategy ring

# 3. Verify every flag has a documented kill switch
python scripts/kill_switch_audit.py --repo . --flag-doc docs/feature-flags.md

The 4 flag types (taxonomy)

Different flag types have different lifespans and ownership. Misclassifying creates debt.

| Type | Purpose | Typical lifespan | Owner | Cleanup trigger |
|---|---|---|---|---|
| Release | Hide unfinished features in production | days–weeks | Eng | 100% rollout reached |
| Experiment | A/B test variants | weeks | Product/Marketing | Test concluded; winner picked |
| Operational | Circuit breakers, perf toggles, kill switches | months–years | Eng/SRE | Replaced by autoscaling/feature retirement |
| Permission | Entitlements per user/account/plan | years (permanent) | Product | Plan/role removed |

Only Release and Experiment flags should be on a debt-scanner watchlist. Operational and Permission flags are by design long-lived. See references/flag_taxonomy.md for decision tree.

The 3 Python tools

All three are stdlib-only. Run with --help.

`flag_debt_scanner.py`

Finds flags older than --max-age-days with low usage, suggesting candidates for cleanup.

python scripts/flag_debt_scanner.py --repo . --max-age-days 90 --format text
python scripts/flag_debt_scanner.py --repo . --max-age-days 60 --format json > debt.json

Detection heuristic:

Walk --repo for code references matching common flag-call patterns:

flag("..."), isFlagEnabled("..."), featureFlag("..."), getFlag("...")

client.variation("...", ...), unleash.isEnabled("..."), growthbook.feature("...")

For each unique flag identifier, find the oldest commit that introduced it (git log --diff-filter=A -S <name>).

Flag as DEBT if introduced > --max-age-days ago AND used in ≤--min-uses places.

Outputs flag name, age in days, file references, suggested action. JSON mode is CI-friendly.

`rollout_planner.py`

Generates a phased rollout schedule from population size, target percent, duration, and strategy.

python scripts/rollout_planner.py --population 100000 --target-percent 100 --duration-days 14 --strategy ring
python scripts/rollout_planner.py --population 50000 --target-percent 25 --duration-days 7 --strategy linear
python scripts/rollout_planner.py --population 1000000 --target-percent 100 --duration-days 30 --strategy log

Strategies:

ring: 1% → 5% → 25% → 50% → 100%, evenly spaced. Default for risky launches.

linear: constant rate per day. Default for medium-risk.

log: rapid early, slow tail. Default for low-risk launches with confidence.

cohort: by named cohort (internal → beta → free → paid → all).

Outputs a markdown table with date, percent, expected user count, abort criteria, and verification step per phase.

`kill_switch_audit.py`

Cross-references code-discovered flags against documentation to verify each has a kill switch path written down.

python scripts/kill_switch_audit.py --repo . --flag-doc docs/feature-flags.md
python scripts/kill_switch_audit.py --repo . --flag-doc runbooks/flags.md --format json

What it checks:

Every code-discovered flag has an entry in --flag-doc

Each entry declares: owner, type, kill-switch trigger, monitoring dashboard

Reports flags missing documentation (FAIL) or missing fields (WARN)

Use as a pre-merge gate before any new flag ships.

Provider chooser (5 + DIY)

| Provider | Best for | Pricing model | Lock-in risk | OSS option |
|---|---|---|---|---|
| LaunchDarkly | Enterprise, complex targeting, audit/compliance | Per-MAU, expensive | High | No |
| GrowthBook | Mid-market, A/B testing focused, OSS-friendly | Per-MAU + OSS | Low | Yes (self-host) |
| Statsig | Growth/product teams, advanced experimentation | Free tier + per-MAU | Medium | No |
| Unleash | OSS-first, self-hosted, dev-friendly | OSS + Enterprise | Low | Yes |
| Flipt | Lightweight, k8s-native, simple needs | OSS-only | None | Yes |
| DIY | <100 flags, no targeting, full control | None | None | N/A |

Decision rules:

<50 flags + no targeting → DIY with config file or env vars

Need analytics + experimentation → Statsig or GrowthBook

Compliance/SOC2 audit logs required → LaunchDarkly

Self-hosting required (data residency / air-gapped) → Unleash or Flipt

See references/provider_comparison.md for detail.

Workflows

Workflow 1: Ship a new feature behind a flag

1. Classify: which of the 4 flag types?
   → Release (most common for engineering work)
2. Run rollout_planner.py to design the ramp
3. Add flag entry to docs/feature-flags.md BEFORE writing code:
   - name, owner, type, kill-switch trigger, dashboard URL
4. Write the code with the flag
5. Run kill_switch_audit.py — must pass before merge
6. Deploy at 0%; verify kill switch works
7. Execute rollout schedule; abort if abort criteria met
8. At 100% for 7+ days: remove flag, delete dead branch, archive doc entry

Workflow 2: Quarterly flag cleanup

1. Run flag_debt_scanner.py --repo . --max-age-days 90 > debt.md
2. For each flagged item:
   a. Confirm it reached 100% (or was killed)
   b. Find the issue/PR that introduced it; verify owner agrees to remove
   c. Delete dead branches; remove flag config
   d. Run kill_switch_audit.py — should now show one fewer flag
3. Update CHANGELOG: "Removed N stale flags"

Workflow 3: Choose a provider

1. Estimate flag count (current + 12-month projection)
2. Required features:
   - Targeting rules (user, account, geo, %)?
   - A/B testing + stats?
   - Audit log / SOC2?
   - Self-hosting / data residency?
3. Pricing budget (MAU * cost-per-MAU)
4. See provider_comparison.md decision tree
5. Build a 30-day proof-of-concept before signing

Workflow 4: Design a kill switch

1. Identify the failure modes:
   - Latency spike (which threshold?)
   - Error rate spike (which threshold?)
   - Business metric regression (which threshold?)
2. Wire each to an abort:
   - Manual: dashboard link + on-call playbook
   - Automated: alert threshold flips flag back to 0%
3. Test the kill switch in staging BEFORE production rollout
4. Document in flag-doc; pass kill_switch_audit.py

References

references/flag_taxonomy.md — 4 types, decision tree, ownership, lifespan
references/provider_comparison.md — LaunchDarkly / GrowthBook / Statsig / Unleash / Flipt / DIY trade-offs
references/rollout_strategies.md — ring / linear / log / cohort / geo, abort criteria, monitoring
references/flag_lifecycle.md — request → design → ship → ramp → cleanup → archive

Slash command

/flag-cleanup — Run the full cleanup workflow on the current repo: scan for debt, generate a removal plan, audit kill switches.

Asset templates

assets/flag_request_template.md — fill-in form for new flag requests (name, owner, type, kill switch, rollout plan)

Anti-patterns

Permanent flag with if (FLAG_FOO) 50 places — should be a Permission flag with a runtime config, not a Release flag
Flag with no owner — when the original engineer leaves, no one cleans it up
No kill switch documented — when the feature breaks, no one knows how to disable it
A/B test that ran 6 months — pick a winner; running indefinitely is debt
Flags as feature toggles for cosmetic changes — ship via deploy, not flag

Verifiable success

A team using this skill should achieve:

100% of new flags pass kill_switch_audit.py at merge time

flag_debt_scanner.py --max-age-days 90 returns ≤5 stale flags repo-wide

Every flag has a documented owner, type, and kill switch

Mean time to retire a Release flag: <60 days from 100% rollout

SKILL.md source

---
name: feature-flags-architect
description: Use when adding, retiring, or auditing feature flags. Triggers on "add a flag", "ship behind a flag", "rollout plan", "kill switch", "stale flags", "flag debt", "LaunchDarkly", "GrowthBook", "Stats...
---

# Feature Flags Architect

End-to-end discipline for feature flags: classify them, ship them, ramp them, and retire them. Most teams treat flags as throwaway `if`-statements; this skill treats them as a controlled lifecycle with measurable debt.

## When to use

- Adding a new flag and need a rollout plan
- Auditing a codebase for stale or orphaned flags
- Choosing a flag provider (LaunchDarkly vs GrowthBook vs Statsig vs Unleash vs Flipt vs build-your-own)
- Designing a kill-switch path for a risky launch
- Cleaning up flag debt before a release freeze
- Reviewing whether a feature should ship behind a flag at all

## Core principle: flags are a lifecycle, not an `if`

```
request → design → ship → ramp → cleanup → archive
```

Flags that skip cleanup become debt: dead branches, stale defaults, untested code paths, unbounded blast radius. The three scripts in this skill enforce the lifecycle.

## Quick start

```bash
# 1. Audit the repo for flag debt
python scripts/flag_debt_scanner.py --repo . --max-age-days 90

# 2. Plan a progressive rollout for a new flag
python scripts/rollout_planner.py --population 100000 --target-percent 100 --duration-days 14 --strategy ring

# 3. Verify every flag has a documented kill switch
python scripts/kill_switch_audit.py --repo . --flag-doc docs/feature-flags.md
```

## The 4 flag types (taxonomy)

Different flag types have different lifespans and ownership. Misclassifying creates debt.

| Type | Purpose | Typical lifespan | Owner | Cleanup trigger |
|---|---|---|---|---|
| **Release** | Hide unfinished features in production | days–weeks | Eng | 100% rollout reached |
| **Experiment** | A/B test variants | weeks | Product/Marketing | Test concluded; winner picked |
| **Operational** | Circuit breakers, perf toggles, kill switches | months–years | Eng/SRE | Replaced by autoscaling/feature retirement |
| **Permission** | Entitlements per user/account/plan | years (permanent) | Product | Plan/role removed |

Only Release and Experiment flags should be on a debt-scanner watchlist. Operational and Permission flags are by design long-lived. See `references/flag_taxonomy.md` for decision tree.

## The 3 Python tools

All three are stdlib-only. Run with `--help`.

### `flag_debt_scanner.py`

Finds flags older than `--max-age-days` with low usage, suggesting candidates for cleanup.

```bash
python scripts/flag_debt_scanner.py --repo . --max-age-days 90 --format text
python scripts/flag_debt_scanner.py --repo . --max-age-days 60 --format json > debt.json
```

**Detection heuristic:**
1. Walk `--repo` for code references matching common flag-call patterns:
   - `flag("...")`, `isFlagEnabled("...")`, `featureFlag("...")`, `getFlag("...")`
   - `client.variation("...", ...)`, `unleash.isEnabled("...")`, `growthbook.feature("...")`
2. For each unique flag identifier, find the oldest commit that introduced it (`git log --diff-filter=A -S <name>`).
3. Flag as DEBT if introduced > `--max-age-days` ago AND used in ≤`--min-uses` places.

Outputs flag name, age in days, file references, suggested action. JSON mode is CI-friendly.

### `rollout_planner.py`

Generates a phased rollout schedule from population size, target percent, duration, and strategy.

```bash
python scripts/rollout_planner.py --population 100000 --target-percent 100 --duration-days 14 --strategy ring
python scripts/rollout_planner.py --population 50000 --target-percent 25 --duration-days 7 --strategy linear
python scripts/rollout_planner.py --population 1000000 --target-percent 100 --duration-days 30 --strategy log
```

**Strategies:**
- `ring`: 1% → 5% → 25% → 50% → 100%, evenly spaced. Default for risky launches.
- `linear`: constant rate per day. Default for medium-risk.
- `log`: rapid early, slow tail. Default for low-risk launches with confidence.
- `cohort`: by named cohort (internal → beta → free → paid → all).

Outputs a markdown table with date, percent, expected user count, abort criteria, and verification step per phase.

### `kill_switch_audit.py`

Cross-references code-discovered flags against documentation to verify each has a kill switch path written down.

```bash
python scripts/kill_switch_audit.py --repo . --flag-doc docs/feature-flags.md
python scripts/kill_switch_audit.py --repo . --flag-doc runbooks/flags.md --format json
```

**What it checks:**
1. Every code-discovered flag has an entry in `--flag-doc`
2. Each entry declares: owner, type, kill-switch trigger, monitoring dashboard
3. Reports flags missing documentation (FAIL) or missing fields (WARN)

Use as a pre-merge gate before any new flag ships.

## Provider chooser (5 + DIY)

| Provider | Best for | Pricing model | Lock-in risk | OSS option |
|---|---|---|---|---|
| **LaunchDarkly** | Enterprise, complex targeting, audit/compliance | Per-MAU, expensive | High | No |
| **GrowthBook** | Mid-market, A/B testing focused, OSS-friendly | Per-MAU + OSS | Low | Yes (self-host) |
| **Statsig** | Growth/product teams, advanced experimentation | Free tier + per-MAU | Medium | No |
| **Unleash** | OSS-first, self-hosted, dev-friendly | OSS + Enterprise | Low | Yes |
| **Flipt** | Lightweight, k8s-native, simple needs | OSS-only | None | Yes |
| **DIY** | <100 flags, no targeting, full control | None | None | N/A |

Decision rules:
- <50 flags + no targeting → DIY with config file or env vars
- Need analytics + experimentation → Statsig or GrowthBook
- Compliance/SOC2 audit logs required → LaunchDarkly
- Self-hosting required (data residency / air-gapped) → Unleash or Flipt
- See `references/provider_comparison.md` for detail.

## Workflows

### Workflow 1: Ship a new feature behind a flag

```
1. Classify: which of the 4 flag types?
   → Release (most common for engineering work)
2. Run rollout_planner.py to design the ramp
3. Add flag entry to docs/feature-flags.md BEFORE writing code:
   - name, owner, type, kill-switch trigger, dashboard URL
4. Write the code with the flag
5. Run kill_switch_audit.py — must pass before merge
6. Deploy at 0%; verify kill switch works
7. Execute rollout schedule; abort if abort criteria met
8. At 100% for 7+ days: remove flag, delete dead branch, archive doc entry
```

### Workflow 2: Quarterly flag cleanup

```
1. Run flag_debt_scanner.py --repo . --max-age-days 90 > debt.md
2. For each flagged item:
   a. Confirm it reached 100% (or was killed)
   b. Find the issue/PR that introduced it; verify owner agrees to remove
   c. Delete dead branches; remove flag config
   d. Run kill_switch_audit.py — should now show one fewer flag
3. Update CHANGELOG: "Removed N stale flags"
```

### Workflow 3: Choose a provider

```
1. Estimate flag count (current + 12-month projection)
2. Required features:
   - Targeting rules (user, account, geo, %)?
   - A/B testing + stats?
   - Audit log / SOC2?
   - Self-hosting / data residency?
3. Pricing budget (MAU * cost-per-MAU)
4. See provider_comparison.md decision tree
5. Build a 30-day proof-of-concept before signing
```

### Workflow 4: Design a kill switch

```
1. Identify the failure modes:
   - Latency spike (which threshold?)
   - Error rate spike (which threshold?)
   - Business metric regression (which threshold?)
2. Wire each to an abort:
   - Manual: dashboard link + on-call playbook
   - Automated: alert threshold flips flag back to 0%
3. Test the kill switch in staging BEFORE production rollout
4. Document in flag-doc; pass kill_switch_audit.py
```

## References

- `references/flag_taxonomy.md` — 4 types, decision tree, ownership, lifespan
- `references/provider_comparison.md` — LaunchDarkly / GrowthBook / Statsig / Unleash / Flipt / DIY trade-offs
- `references/rollout_strategies.md` — ring / linear / log / cohort / geo, abort criteria, monitoring
- `references/flag_lifecycle.md` — request → design → ship → ramp → cleanup → archive

## Slash command

`/flag-cleanup` — Run the full cleanup workflow on the current repo: scan for debt, generate a removal plan, audit kill switches.

## Asset templates

- `assets/flag_request_template.md` — fill-in form for new flag requests (name, owner, type, kill switch, rollout plan)

## Anti-patterns

- **Permanent flag with `if (FLAG_FOO)` 50 places** — should be a Permission flag with a runtime config, not a Release flag
- **Flag with no owner** — when the original engineer leaves, no one cleans it up
- **No kill switch documented** — when the feature breaks, no one knows how to disable it
- **A/B test that ran 6 months** — pick a winner; running indefinitely is debt
- **Flags as feature toggles for cosmetic changes** — ship via deploy, not flag

## Verifiable success

A team using this skill should achieve:
- 100% of new flags pass `kill_switch_audit.py` at merge time
- `flag_debt_scanner.py --max-age-days 90` returns ≤5 stale flags repo-wide
- Every flag has a documented owner, type, and kill switch
- Mean time to retire a Release flag: <60 days from 100% rollout

Development

Feature Flags Architect

Install

Quick install

Feature Flags Architect

When to use

Core principle: flags are a lifecycle, not an if

Quick start

The 4 flag types (taxonomy)

The 3 Python tools

flag_debt_scanner.py

rollout_planner.py

kill_switch_audit.py

Provider chooser (5 + DIY)

Workflows

Workflow 1: Ship a new feature behind a flag

Workflow 2: Quarterly flag cleanup

Workflow 3: Choose a provider

Workflow 4: Design a kill switch

References

Slash command

Asset templates

Anti-patterns

Verifiable success

SKILL.md source

Related skills 6

caveman

secure-linux-web-hosting

readme-i18n

lark-shared

improve-codebase-architecture

paper-context-resolver

Core principle: flags are a lifecycle, not an `if`

`flag_debt_scanner.py`

`rollout_planner.py`

`kill_switch_audit.py`