
Nextflow Development

Run nf-core bioinformatics pipelines (rnaseq, sarek, atacseq) on sequencing data. Use when analyzing RNA-seq, WGS/WES, or ATAC-seq data—either local FASTQs or…

Author: anthropic
Version: 1.0.0
License: MIT
Token count: ~2,366
Updated: Jun 5, 2026

Install

Quick install

via npx skills · works with 57+ agents
npx skills add https://github.com/anthropics/life-sciences/tree/HEAD/nextflow-development
Or pick agent:
npx skills add anthropics/life-sciences --skill nextflow-development --agent claude-code
npx skills add anthropics/life-sciences --skill nextflow-development --agent cursor
npx skills add anthropics/life-sciences --skill nextflow-development --agent codex
npx skills add anthropics/life-sciences --skill nextflow-development --agent opencode
npx skills add anthropics/life-sciences --skill nextflow-development --agent github-copilot
npx skills add anthropics/life-sciences --skill nextflow-development --agent windsurf
More install options

Shorthand — useful for multi-skill repos:

npx skills add anthropics/life-sciences --skill nextflow-development

Manual — clone the repo and drop the folder into your agent's skills directory:

git clone https://github.com/anthropics/life-sciences.git
cp -r life-sciences/nextflow-development ~/.claude/skills/
How to use: Once installed, ask your agent to "use the nextflow-development skill" or describe what you want (e.g. "Run nf-core bioinformatics pipelines (rnaseq, sarek, atacseq) on sequencing data"). Requires Node.js 18+.

nf-core Pipeline Deployment

Run nf-core bioinformatics pipelines on local or public sequencing data.

Target users: Bench scientists and researchers without specialized bioinformatics training who need to run large-scale omics analyses—differential expression, variant calling, or chromatin accessibility analysis.

Workflow Checklist

```
- [ ] Step 0: Acquire data (if from GEO/SRA)
- [ ] Step 1: Environment check (MUST pass)
- [ ] Step 2: Select pipeline (confirm with user)
- [ ] Step 3: Run test profile (MUST pass)
- [ ] Step 4: Create samplesheet
- [ ] Step 5: Configure & run (confirm genome with user)
- [ ] Step 6: Verify outputs
```

Step 0: Acquire Data (GEO/SRA Only)

Skip this step if user has local FASTQ files.

For public datasets, fetch from GEO/SRA first. See references/geo-sra-acquisition.md for the full workflow.

Quick start:

```
# 1. Get study info
python scripts/sra_geo_fetch.py info GSE110004

# 2. Download (interactive mode)
python scripts/sra_geo_fetch.py download GSE110004 -o ./fastq -i

# 3. Generate samplesheet
python scripts/sra_geo_fetch.py samplesheet GSE110004 --fastq-dir ./fastq -o samplesheet.csv
```

DECISION POINT: After fetching study info, confirm with user:

  • Which sample subset to download (if multiple data types)
  • Suggested genome and pipeline

Then continue to Step 1.

Step 1: Environment Check

Run this check first. The pipeline will fail if the environment check does not pass.

```
python scripts/check_environment.py
```

All critical checks must pass. If any fail, provide fix instructions:

Docker issues

| Problem | Fix |
| --- | --- |
| Not installed | Install from https://docs.docker.com/get-docker/ |
| Permission denied | `sudo usermod -aG docker $USER`, then log in again |
| Daemon not running | `sudo systemctl start docker` |

Nextflow issues

| Problem | Fix |
| --- | --- |
| Not installed | `curl -s https://get.nextflow.io \| bash && mv nextflow ~/bin/` |
| Version < 23.04 | `nextflow self-update` |

Java issues

| Problem | Fix |
| --- | --- |
| Not installed / < 11 | `sudo apt install openjdk-11-jdk` |

Do not proceed until all checks pass. For HPC/Singularity, see references/troubleshooting.md.
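
As a rough illustration of what such a check involves, the sketch below only tests whether the three tools from the tables above are on `PATH`. It is not the code of `scripts/check_environment.py`, which may check versions, the Docker daemon, and more. The names `REQUIRED_TOOLS` and `missing_tools` are hypothetical.

```python
import shutil
from typing import Callable, Optional

# Tools named in the tables above (assumption: the real script may check more).
REQUIRED_TOOLS = ("docker", "nextflow", "java")


def missing_tools(which: Callable[[str], Optional[str]] = shutil.which) -> list[str]:
    """Return the required tools that cannot be found on PATH."""
    return [tool for tool in REQUIRED_TOOLS if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing from PATH:", ", ".join(missing))
    else:
        print("docker, nextflow, and java are all on PATH")
```

Passing the lookup function as a parameter keeps the check testable on machines where the tools are not installed.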

Step 2: Select Pipeline

DECISION POINT: Confirm with user before proceeding.

| Data Type | Pipeline | Version | Goal |
| --- | --- | --- | --- |
| RNA-seq | `rnaseq` | 3.22.2 | Gene expression |
| WGS/WES | `sarek` | 3.7.1 | Variant calling |
| ATAC-seq | `atacseq` | 2.1.2 | Chromatin accessibility |

Auto-detect from data:

```
python scripts/detect_data_type.py /path/to/data
```

For pipeline-specific details:

  • references/pipelines/rnaseq.md
  • references/pipelines/sarek.md
  • references/pipelines/atacseq.md
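
The pipeline table in this step can be held as a small lookup so that the pinned version always travels with the pipeline name. This is a minimal sketch; `PIPELINES` and `pipeline_for` are hypothetical names, and the values are copied from the table.

```python
# Data type -> (pipeline, pinned version), copied from the Step 2 table.
PIPELINES = {
    "RNA-seq": ("rnaseq", "3.22.2"),
    "WGS/WES": ("sarek", "3.7.1"),
    "ATAC-seq": ("atacseq", "2.1.2"),
}


def pipeline_for(data_type: str) -> tuple[str, str]:
    """Return (pipeline, version) for a data type, or raise for unsupported types."""
    try:
        return PIPELINES[data_type]
    except KeyError:
        supported = ", ".join(sorted(PIPELINES))
        raise ValueError(f"Unsupported data type {data_type!r}; expected one of: {supported}")
```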

Step 3: Run Test Profile

Validates the environment with a small test dataset. This MUST pass before running real data.

```
nextflow run nf-core/<pipeline> -r <version> -profile test,docker --outdir test_output
```

| Pipeline | Command |
| --- | --- |
| rnaseq | `nextflow run nf-core/rnaseq -r 3.22.2 -profile test,docker --outdir test_rnaseq` |
| sarek | `nextflow run nf-core/sarek -r 3.7.1 -profile test,docker --outdir test_sarek` |
| atacseq | `nextflow run nf-core/atacseq -r 2.1.2 -profile test,docker --outdir test_atacseq` |

Verify (replace test_output with the --outdir you used):

```
ls test_output/multiqc/multiqc_report.html
grep "Pipeline completed successfully" .nextflow.log
```

If test fails, see references/troubleshooting.md.
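
The two verification commands can also be combined into one programmatic check. This sketch assumes the layout used by the commands in this step (a MultiQC report under `<outdir>/multiqc/` and a `.nextflow.log` in the launch directory). `run_succeeded` is a hypothetical helper, not part of the skill's scripts.

```python
from pathlib import Path

# Marker string taken from the grep command in this step.
SUCCESS_MARKER = "Pipeline completed successfully"


def run_succeeded(outdir: str, log_path: str = ".nextflow.log") -> bool:
    """True when the MultiQC report exists and the log contains the success marker."""
    report = Path(outdir) / "multiqc" / "multiqc_report.html"
    log = Path(log_path)
    if not report.is_file() or not log.is_file():
        return False
    return SUCCESS_MARKER in log.read_text(errors="replace")
```

For example, `run_succeeded("test_rnaseq")` would check the rnaseq test run launched from the current directory.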

Step 4: Create Samplesheet

Generate automatically

```
python scripts/generate_samplesheet.py /path/to/data <pipeline> -o samplesheet.csv
```

The script:

  • Discovers FASTQ/BAM/CRAM files
  • Pairs R1/R2 reads
  • Infers sample metadata
  • Validates before writing

For sarek: Script prompts for tumor/normal status if not auto-detected.
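
To make the read-pairing step concrete, here is a minimal sketch that groups FASTQ file names by sample and emits rnaseq-style rows. It is not the logic of `scripts/generate_samplesheet.py`. The file naming pattern (`<sample>_R1.fastq.gz`, optionally with `_001`) and the function names are assumptions for illustration.

```python
import re
from collections import defaultdict

# Assumed naming: <sample>_R1.fastq.gz or <sample>_R2_001.fq.gz (hypothetical convention).
READ_PATTERN = re.compile(r"^(?P<sample>.+?)_R(?P<read>[12])(?:_001)?\.(?:fastq|fq)\.gz$")


def pair_reads(filenames):
    """Group FASTQ file names by sample into {sample: {"1": r1, "2": r2}}."""
    pairs = defaultdict(dict)
    for name in sorted(filenames):
        match = READ_PATTERN.match(name)
        if match:  # files that do not match the pattern are ignored
            pairs[match["sample"]][match["read"]] = name
    return dict(pairs)


def rnaseq_rows(pairs, fastq_dir):
    """Build rnaseq samplesheet lines; fastq_2 stays empty for single-end samples."""
    rows = ["sample,fastq_1,fastq_2,strandedness"]
    for sample, reads in sorted(pairs.items()):
        if "1" not in reads:
            continue  # an R2 file without its R1 is skipped
        r1 = f"{fastq_dir}/{reads['1']}"
        r2 = f"{fastq_dir}/{reads['2']}" if "2" in reads else ""
        rows.append(f"{sample},{r1},{r2},auto")
    return rows
```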

Validate existing samplesheet

```
python scripts/generate_samplesheet.py --validate samplesheet.csv <pipeline>
```

Samplesheet formats

rnaseq:

```
sample,fastq_1,fastq_2,strandedness
SAMPLE1,/abs/path/R1.fq.gz,/abs/path/R2.fq.gz,auto
```

sarek:

```
patient,sample,lane,fastq_1,fastq_2,status
patient1,tumor,L001,/abs/path/tumor_R1.fq.gz,/abs/path/tumor_R2.fq.gz,1
patient1,normal,L001,/abs/path/normal_R1.fq.gz,/abs/path/normal_R2.fq.gz,0
```

atacseq:

```
sample,fastq_1,fastq_2,replicate
CONTROL,/abs/path/ctrl_R1.fq.gz,/abs/path/ctrl_R2.fq.gz,1
```
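
A header check against the formats above might look like the following sketch. The column lists are copied from the three examples. The absolute-path check follows the `/abs/path/...` convention of those examples and is an assumption rather than a documented pipeline requirement. `samplesheet_problems` is a hypothetical name and is not the skill's `--validate` implementation.

```python
import csv
import io

# Required columns per pipeline, taken from the example samplesheets above.
REQUIRED_COLUMNS = {
    "rnaseq": ["sample", "fastq_1", "fastq_2", "strandedness"],
    "sarek": ["patient", "sample", "lane", "fastq_1", "fastq_2", "status"],
    "atacseq": ["sample", "fastq_1", "fastq_2", "replicate"],
}


def samplesheet_problems(text: str, pipeline: str) -> list[str]:
    """Return human-readable problems found in a samplesheet; empty list if none."""
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    missing = [col for col in REQUIRED_COLUMNS[pipeline] if col not in header]
    if missing:
        return ["missing columns: " + ", ".join(missing)]
    problems = []
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        fastq_1 = row["fastq_1"] or ""
        if not fastq_1.startswith("/"):
            problems.append(f"line {line_no}: fastq_1 is not an absolute path")
    return problems
```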

Step 5: Configure & Run

5a. Check genome availability

```
python scripts/manage_genomes.py check <genome>
# If not installed:
python scripts/manage_genomes.py download <genome>
```

Common genomes: GRCh38 (human), GRCh37 (legacy), GRCm39 (mouse), R64-1-1 (yeast), BDGP6 (fly)

5b. Decision points

DECISION POINT: Confirm with user:

  • Genome: Which reference to use
  • Pipeline-specific options:
      • rnaseq: aligner (star_salmon recommended, hisat2 for low memory)
      • sarek: tools (haplotypecaller for germline, mutect2 for somatic)
      • atacseq: read_length (50, 75, 100, or 150)

5c. Run pipeline

```
nextflow run nf-core/<pipeline> \
  -r <version> \
  -profile docker \
  --input samplesheet.csv \
  --outdir results \
  --genome <genome> \
  -resume
```

Key flags:

  • -r: Pin version
  • -profile docker: Use Docker (or singularity for HPC)
  • --genome: iGenomes key
  • -resume: Continue from checkpoint
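
If the run command is assembled by a script rather than typed, building it as an argument list avoids quoting mistakes. The sketch below uses only the flags shown in 5c; `build_run_command` is a hypothetical helper.

```python
import shlex


def build_run_command(pipeline, version, genome, samplesheet="samplesheet.csv",
                      outdir="results", profile="docker", resume=True):
    """Assemble the Step 5c command as an argv list suitable for subprocess.run."""
    cmd = [
        "nextflow", "run", f"nf-core/{pipeline}",
        "-r", version,            # pin the pipeline version
        "-profile", profile,      # docker, or singularity on HPC
        "--input", samplesheet,
        "--outdir", outdir,
        "--genome", genome,       # iGenomes key
    ]
    if resume:
        cmd.append("-resume")     # continue from cached work if present
    return cmd


# Print a copy-pasteable form of the command for review before running it.
print(shlex.join(build_run_command("rnaseq", "3.22.2", "GRCh38")))
```

Printing the command first gives the user a chance to confirm the genome and profile before anything is launched.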

Resource limits (if needed):

```
--max_cpus 8 --max_memory '32.GB' --max_time '24.h'
```

Step 6: Verify Outputs

Check completion

```
ls results/multiqc/multiqc_report.html
grep "Pipeline completed successfully" .nextflow.log
```

Key outputs by pipeline

rnaseq:

  • results/star_salmon/salmon.merged.gene_counts.tsv - Gene counts
  • results/star_salmon/salmon.merged.gene_tpm.tsv - TPM values

sarek:

  • results/variant_calling/*/ - VCF files
  • results/preprocessing/recalibrated/ - BAM files

atacseq:

  • results/macs2/narrowPeak/ - Peak calls
  • results/bwa/mergedLibrary/bigwig/ - Coverage tracks
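
A quick existence check over the paths listed above can flag an incomplete run before any downstream analysis. This is a sketch only: the paths are copied from the lists in this step, the rnaseq entries assume the star_salmon aligner, and `missing_outputs` is a hypothetical helper.

```python
from pathlib import Path

# Paths relative to --outdir, copied from the lists above.
KEY_OUTPUTS = {
    "rnaseq": [
        "star_salmon/salmon.merged.gene_counts.tsv",
        "star_salmon/salmon.merged.gene_tpm.tsv",
    ],
    "sarek": ["variant_calling", "preprocessing/recalibrated"],
    "atacseq": ["macs2/narrowPeak", "bwa/mergedLibrary/bigwig"],
}


def missing_outputs(results_dir: str, pipeline: str) -> list[str]:
    """Return the expected outputs that do not exist under results_dir."""
    root = Path(results_dir)
    return [rel for rel in KEY_OUTPUTS[pipeline] if not (root / rel).exists()]
```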

Quick Reference

For common exit codes and fixes, see references/troubleshooting.md.

Resume failed run

```
nextflow run nf-core/<pipeline> -resume
```

References

  • references/geo-sra-acquisition.md - Downloading public GEO/SRA data
  • references/troubleshooting.md - Common issues and fixes
  • references/installation.md - Environment setup
  • references/pipelines/rnaseq.md - RNA-seq pipeline details
  • references/pipelines/sarek.md - Variant calling details
  • references/pipelines/atacseq.md - ATAC-seq details

Disclaimer

This skill is provided as a prototype example demonstrating how to integrate nf-core bioinformatics pipelines into Claude Code for automated analysis workflows. The current implementation supports three pipelines (rnaseq, sarek, and atacseq), serving as a foundation that enables the community to expand support to the full set of nf-core pipelines.

It is intended for educational and research purposes and should not be considered production-ready without appropriate validation for your specific use case. Users are responsible for ensuring their computing environment meets pipeline requirements and for verifying analysis results.

Anthropic does not guarantee the accuracy of bioinformatics outputs, and users should follow standard practices for validating computational analyses. This integration is not officially endorsed by or affiliated with the nf-core community.

Attribution

When publishing results, cite the appropriate pipeline. Citations are available in each nf-core repository's CITATIONS.md file (e.g., https://github.com/nf-core/rnaseq/blob/3.22.2/CITATIONS.md).

Licenses

  • nf-core pipelines: MIT License (https://nf-co.re/about)
  • Nextflow: Apache License, Version 2.0 (https://www.nextflow.io/about-us.html)
  • NCBI SRA Toolkit: Public Domain (https://github.com/ncbi/sra-tools/blob/master/LICENSE)


---

Source: https://github.com/anthropics/life-sciences/tree/HEAD/nextflow-development
Author: anthropic
Discovered via: mcpservers.org
