Store

Version: 1.0.0
License: MIT
Token count: ~1,294
Updated: Jun 5, 2026

Choose how and where to store football data. Use when the user asks about database choices, file formats, cloud storage, data pipelines, or how to organise their football data project. Also covers publishing and sharing outputs (Streamlit, Observable, GitHub Pages).

Install

Quick install

via npx skills · works with 57+ agents
npx skills add https://github.com/withqwerty/nutmeg
Or pick agent:
npx skills add withqwerty/nutmeg --agent claude-code
npx skills add withqwerty/nutmeg --agent cursor
npx skills add withqwerty/nutmeg --agent codex
npx skills add withqwerty/nutmeg --agent opencode
npx skills add withqwerty/nutmeg --agent github-copilot
npx skills add withqwerty/nutmeg --agent windsurf
More install options

Shorthand — useful for multi-skill repos:

npx skills add withqwerty/nutmeg

Manual — clone the repo and drop the folder into your agent's skills directory:

git clone https://github.com/withqwerty/nutmeg.git
cp -r nutmeg ~/.claude/skills/
How to use: Once installed, ask your agent to "use the Store skill" or describe what you want, for example a question about database choices or file formats. Requires Node.js 18+.

---
name: nutmeg-store
description: "Choose how and where to store football data. Use when the user asks about database choices, file formats, cloud storage, data pipelines, or how to organise their football data project. Also covers publishing and sharing outputs (Streamlit, Observable, GitHub Pages)."
argument-hint: "[storage question or 'publish']"
allowed-tools: ["Read", "Write", "Bash", "AskUserQuestion", "mcp__football-docs__search_docs"]
---

Store

Help the user choose storage formats, locations, and publishing methods for their football data.

Accuracy

Read and follow docs/accuracy-guardrail.md before answering any question about provider-specific facts (IDs, endpoints, schemas, coordinates, rate limits). Always use search_docs — never guess from training data.

First: check profile

Read .nutmeg.user.md. If it doesn't exist, tell the user to run /nutmeg first.
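
The profile check above can be sketched in a few lines of Python. This is only an illustration: the helper names `load_profile` and `profile_message` are hypothetical, and the skill itself does not prescribe how the check is implemented.

```python
from pathlib import Path
from typing import Optional

PROFILE_NAME = ".nutmeg.user.md"


def load_profile(project_dir: str = ".") -> Optional[str]:
    """Return the profile text, or None when the profile file is missing."""
    profile_path = Path(project_dir) / PROFILE_NAME
    if not profile_path.is_file():
        return None
    return profile_path.read_text(encoding="utf-8")


def profile_message(project_dir: str = ".") -> str:
    """Build the message to show the user (hypothetical wording)."""
    profile = load_profile(project_dir)
    if profile is None:
        return "No profile found. Run /nutmeg first."
    return f"Profile loaded ({len(profile)} characters)."
```

Returning `None` rather than raising keeps the "missing profile" case an ordinary branch, which matches the instruction to redirect the user instead of failing.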

Storage format decision tree

Small projects (< 100MB, single user)

| Format | Best for | Tools |
|--------|---------|-------|
| JSON | Raw event data, API responses | Any language |
| CSV | Tabular stats, easy to share | Spreadsheets, pandas, R |
| Parquet | Columnar analytics, fast queries | polars, pandas, DuckDB, Arrow |
| SQLite | Relational queries, multiple tables | Any language, DB browser tools |

Recommendation: Start with JSON for raw data, Parquet for processed data.
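
As a rough illustration of moving from raw JSON to a queryable format, the sketch below flattens a JSON array of events into SQLite, one of the options in the table above, because it needs only the Python standard library. Writing Parquet instead would require a third-party library such as pandas or polars. The event fields (`id`, `type`, `minute`, `player`) are hypothetical and do not reflect any provider's schema.

```python
import json
import sqlite3


def load_events_into_sqlite(raw_json: str, conn: sqlite3.Connection) -> int:
    """Flatten a JSON array of event dicts into an `events` table.

    Returns the number of rows written. Field names are illustrative only.
    """
    events = json.loads(raw_json)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events ("
        "id TEXT PRIMARY KEY, type TEXT, minute INTEGER, player TEXT)"
    )
    rows = [
        (e["id"], e.get("type"), e.get("minute"), e.get("player"))
        for e in events
    ]
    # INSERT OR REPLACE keeps repeated loads of the same file idempotent.
    conn.executemany("INSERT OR REPLACE INTO events VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


def shots_per_player(conn: sqlite3.Connection) -> list:
    """Count rows of type 'Shot' per player, most shots first."""
    cursor = conn.execute(
        "SELECT player, COUNT(*) FROM events WHERE type = 'Shot' "
        "GROUP BY player ORDER BY COUNT(*) DESC, player"
    )
    return cursor.fetchall()
```

The raw JSON file itself stays untouched on disk; only the flattened copy lives in the database, so the load can be re-run whenever the cleaning logic changes.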

Medium projects (100MB - 10GB)

| Format | Best for | Notes |
|--------|---------|-------|
| Parquet files | Analytics workloads | 5-10x smaller than JSON, fast columnar reads |
| DuckDB | SQL analytics on local files | Queries Parquet/CSV directly, no server needed |
| SQLite | Relational data with joins | Single file, portable, ACID compliant |

Recommendation: Parquet for storage, DuckDB for querying.

Large projects (> 10GB, multiple users)

| Solution | Best for | Cost |
|----------|---------|------|
| PostgreSQL | Production apps, complex queries | Free (self-hosted) or ~$7/mo (Railway, Supabase) |
| BigQuery | Massive analytical queries | Free tier: 1TB/mo queries |
| Cloudflare R2 | Object storage (raw files) | Free tier: 10GB storage |
| S3 / GCS | Object storage at scale | ~$0.023/GB/mo |
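
The three size tiers above can be read as a small decision function. The sketch below is one way to encode them; the function name `recommend_storage`, the use of decimal byte units, and the choice to treat any multi-user project as "large" are assumptions made for illustration, not rules stated by the skill.

```python
from typing import Tuple

MB = 1000 ** 2  # decimal megabyte (assumption; the tiers do not specify)
GB = 1000 ** 3  # decimal gigabyte


def recommend_storage(size_bytes: int, multi_user: bool = False) -> Tuple[str, str]:
    """Map project size and sharing needs to a (tier, suggestion) pair."""
    if multi_user or size_bytes > 10 * GB:
        return (
            "large",
            "PostgreSQL or BigQuery; object storage (R2, S3, GCS) for raw files",
        )
    if size_bytes >= 100 * MB:
        return ("medium", "Parquet for storage, DuckDB for querying")
    return ("small", "JSON for raw data, Parquet for processed data")
```

For example, `recommend_storage(2 * GB)` falls in the medium tier, while `recommend_storage(2 * GB, multi_user=True)` is pushed to the large tier under the assumption stated above.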

Directory structure

Recommend this structure for football data projects:

project/
  data/
    raw/                  # Untouched API/scrape responses
      statsbomb/
        events/
        matches.json
      fbref/
        2024/
    processed/            # Cleaned, transformed data
      events.parquet
      shots.parquet
      passes.parquet
    derived/              # Computed metrics
      xg_model.parquet
      passing_networks/
  notebooks/              # Analysis notebooks
  scripts/                # Data pipeline scripts
  outputs/                # Charts, reports, exports
  .env                    # API keys (gitignored)
  .nutmeg.user.md         # Nutmeg profile
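
A layout like this can be created with a short script. The sketch below is a minimal example, assuming the provider folders shown in the tree; the function name `scaffold_project` is hypothetical, and adding `.env` to `.gitignore` is one possible way to honour the "gitignored" note.

```python
from pathlib import Path
from typing import List

# Directories taken from the tree above; files such as matches.json
# are produced later by the data pipeline, not by this scaffold.
DIRECTORIES = [
    "data/raw/statsbomb/events",
    "data/raw/fbref/2024",
    "data/processed",
    "data/derived/passing_networks",
    "notebooks",
    "scripts",
    "outputs",
]


def scaffold_project(root: str) -> List[Path]:
    """Create the directory layout under `root` and make sure .env is ignored."""
    root_path = Path(root)
    created = []
    for relative in DIRECTORIES:
        directory = root_path / relative
        directory.mkdir(parents=True, exist_ok=True)  # safe to re-run
        created.append(directory)

    gitignore = root_path / ".gitignore"
    text = gitignore.read_text(encoding="utf-8") if gitignore.exists() else ""
    if ".env" not in text.splitlines():
        # Start on a fresh line if the existing file lacks a trailing newline.
        prefix = "" if text == "" or text.endswith("\n") else "\n"
        with gitignore.open("a", encoding="utf-8") as handle:
            handle.write(prefix + ".env\n")
    return created
```

Calling `scaffold_project("project")` twice is harmless: existing directories are left alone and `.env` is added to `.gitignore` only once.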

Publishing and sharing

Interactive dashboards

| Platform | Language | Cost | Notes |
|----------|---------|------|-------|
| Streamlit | Python | Free (community cloud) | Most popular for football analytics. Deploy from GitHub |
| Observable | JavaScript | Free tier | Great for D3.js visualisations. Notebooks + Framework |
| Shiny | R | Free (shinyapps.io, 25 hrs/mo) | R ecosystem integration |
| Gradio | Python | Free (HuggingFace Spaces) | Quick ML model demos |

Static sites

| Platform | Notes |
|----------|-------|
| GitHub Pages | Free. Good for static charts (D3, matplotlib exports) |
| Cloudflare Pages | Free. Faster, more features than GH Pages |
| Vercel | Free tier. Good for Next.js/Astro sites |

Sharing data

| Method | Best for |
|--------|---------|
| GitHub repo | Small datasets (< 100MB), code + data together |
| GitHub Releases | Larger files (up to 2GB per file) |
| Kaggle Datasets | Community sharing, discoverable, free |
| HuggingFace Datasets | ML-focused, versioned, free |
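
Before sharing, it can help to check which files fit which method. The sketch below sorts the files in a data directory by the size thresholds from the table above; the function name `classify_files`, the bucket names, and the decimal byte units are assumptions made for illustration.

```python
from pathlib import Path
from typing import Dict, List

REPO_LIMIT_BYTES = 100 * 1000 ** 2    # "< 100MB" threshold from the table
RELEASE_LIMIT_BYTES = 2 * 1000 ** 3   # "up to 2GB" threshold from the table


def classify_files(
    data_dir: str,
    repo_limit: int = REPO_LIMIT_BYTES,
    release_limit: int = RELEASE_LIMIT_BYTES,
) -> Dict[str, List[str]]:
    """Group files under `data_dir` by the sharing method their size allows."""
    base = Path(data_dir)
    buckets: Dict[str, List[str]] = {"repo": [], "release": [], "too_large": []}
    for path in sorted(base.rglob("*")):
        if not path.is_file():
            continue
        size = path.stat().st_size
        if size < repo_limit:
            bucket = "repo"
        elif size <= release_limit:
            bucket = "release"
        else:
            bucket = "too_large"
        buckets[bucket].append(path.relative_to(base).as_posix())
    return buckets
```

Files that land in `too_large` need a different route, such as splitting the file or using a dataset host; which one fits depends on the audience described in the table.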

Social media / content

| Output | Tool | Notes |
|--------|------|-------|
| Static charts | matplotlib, ggplot2, D3.js | Export as PNG/SVG |
| Animated charts | matplotlib.animation, D3 transitions | Export as GIF/MP4 |
| Twitter/X threads | Chart images + alt text | Accessibility matters |
| Blog posts | Markdown + embedded charts | GitHub Pages, Medium, Substack |

Cost awareness

Based on the user's .nutmeg.user.md goals, flag costs:

  • Exploration/learning: Everything can be free. StatsBomb open data + Jupyter/Colab + GitHub Pages.
  • Content creation: Streamlit Community Cloud is free. Cloudflare Pages is free.
  • Professional: Budget for API access ($100-1000+/mo for Opta/StatsBomb commercial).
  • Product: Budget for database hosting ($7-50/mo); consider data licensing costs separately.

---

Source: https://github.com/withqwerty/nutmeg
Author: withqwerty
Discovered via: skillsdirectory.com
Genre: data
