Content & Writing

Article Extractor

Extract full article text and metadata from web pages

Author: michalparkola

Version: 1.0.0

License: MIT

Token count: ~517

Updated: Jun 5, 2026

Install

Quick install

via npx skills · works with 57+ agents

npx skills add https://github.com/michalparkola/tapestry-skills-for-claude-code/tree/main/article-extractor

Or pick agent:

npx skills add michalparkola/tapestry-skills-for-claude-code --skill article-extractor --agent claude-code

npx skills add michalparkola/tapestry-skills-for-claude-code --skill article-extractor --agent cursor

npx skills add michalparkola/tapestry-skills-for-claude-code --skill article-extractor --agent codex

npx skills add michalparkola/tapestry-skills-for-claude-code --skill article-extractor --agent opencode

npx skills add michalparkola/tapestry-skills-for-claude-code --skill article-extractor --agent github-copilot

npx skills add michalparkola/tapestry-skills-for-claude-code --skill article-extractor --agent windsurf

More install options

Shorthand — useful for multi-skill repos:

npx skills add michalparkola/tapestry-skills-for-claude-code --skill article-extractor

Manual — clone the repo and drop the folder into your agent's skills directory:

git clone https://github.com/michalparkola/tapestry-skills-for-claude-code.git

cp -r tapestry-skills-for-claude-code/article-extractor ~/.claude/skills/

How to use: Once installed, ask your agent to "use the article-extractor skill" or describe what you want (e.g. "Extract full article text and metadata from web pages"). Requires Node.js 18+.

article-extractor

Extract full article text and metadata from web pages

What is it?
article-extractor is a Claude Code skill that extracts full article text and metadata from web pages. It strips navigation, ads, sidebars, and other non-content elements, leaving clean, readable article text. It is suited to content research, archiving, and building knowledge bases from web sources.

How to use it?

When you provide a URL, the skill automatically fetches the web page, identifies the main article content, and extracts clean text along with metadata such as title, author, publish date, and description. It handles various website layouts and content management systems.

The extracted content can be used for research, summarization, or further processing within your Claude workflow.
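The fetch, identify, and extract flow described above can be illustrated with a minimal sketch. This listing does not document how the skill is implemented, so everything here is an assumption made for illustration: the names `ArticleSketch`, `extract`, `extract_from_url`, `SKIP_TAGS`, and `META_KEYS` are hypothetical, and the heuristic (keep `<p>` text inside `<article>`, skip common clutter tags) is far simpler than what real pages need. It uses only the Python standard library.

```python
from html.parser import HTMLParser
from urllib.request import urlopen

# Tags treated as non-article clutter (an assumption for this sketch,
# not the skill's actual rule set).
SKIP_TAGS = {"script", "style", "nav", "aside", "header", "footer"}

# <meta> names/properties mapped to output keys (also an assumption).
META_KEYS = {
    "author": "author",
    "description": "description",
    "article:published_time": "published",
}


class ArticleSketch(HTMLParser):
    """Collects <title>, a few <meta> fields, and <p> text inside <article>."""

    def __init__(self):
        super().__init__()
        self.metadata = {}
        self.paragraphs = []
        self._skip_depth = 0     # > 0 while inside a clutter element
        self._article_depth = 0  # > 0 while inside <article>
        self._in_title = False
        self._in_p = False
        self._buf = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in SKIP_TAGS:
            self._skip_depth += 1
        elif tag == "article":
            self._article_depth += 1
        elif tag == "title":
            self._in_title = True
        elif tag == "meta":
            key = attrs.get("name") or attrs.get("property")
            if key in META_KEYS and attrs.get("content"):
                self.metadata.setdefault(META_KEYS[key], attrs["content"])
        elif tag == "p" and self._article_depth and not self._skip_depth:
            self._in_p = True
            self._buf = []

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self._skip_depth:
            self._skip_depth -= 1
        elif tag == "article" and self._article_depth:
            self._article_depth -= 1
        elif tag == "title":
            self._in_title = False
        elif tag == "p" and self._in_p:
            self._in_p = False
            # Collapse runs of whitespace into single spaces.
            text = " ".join("".join(self._buf).split())
            if text:
                self.paragraphs.append(text)

    def handle_data(self, data):
        if self._in_title:
            self.metadata.setdefault("title", data.strip())
        elif self._in_p and not self._skip_depth:
            self._buf.append(data)


def extract(html: str) -> dict:
    """Return metadata plus the article text, paragraphs joined by blank lines."""
    parser = ArticleSketch()
    parser.feed(html)
    parser.close()
    return {**parser.metadata, "text": "\n\n".join(parser.paragraphs)}


def extract_from_url(url: str) -> dict:
    """Fetch a page and run the same extraction on its HTML."""
    with urlopen(url, timeout=10) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return extract(resp.read().decode(charset, errors="replace"))
```

Calling `extract(html)` on a page whose body sits in an `<article>` element returns a dictionary with whichever of `title`, `author`, `description`, and `published` were found, plus a `text` field. Pages that lack an `<article>` element would yield empty text under this heuristic, which is one reason a production extractor needs more robust content detection.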

Key Features

* Clean text extraction from web articles, removing ads, navigation, and clutter
* Metadata extraction including title, author, date, and description
* Handles various website layouts and CMS platforms
* Integrates with other Tapestry skills for content processing pipelines
* Preserves article structure and formatting


Related Skills

NotebookLM Integration

Lets Claude Code chat directly with NotebookLM for source-grounded answers based exclusively on uploaded documents

3.5k · PleasePrompto · Productivity · Communication

Internal Communications

Write internal communications like status reports, newsletters, and FAQs

5.3k · Anthropic · Productivity · Communication

Slack GIF Creator

Create animated GIFs optimized for Slack's size constraints

5.3k · Anthropic · Creative · Communication

---

Source: https://github.com/michalparkola/tapestry-skills-for-claude-code/tree/main/article-extractor
Author: michalparkola
License: https://opensource.org/licenses/MIT
GitHub Stars: 237
Tags: web, document-processing, coding, writing, video

Related skills 6

caveman

★ Featured

Ultra-compressed communication mode. Cuts token usage ~75% by dropping filler, articles, and pleasantries while keeping full technical accuracy. Use when user says "caveman mode", "talk like caveman", "use caveman", "less tokens", "be brief", or invokes /caveman.

mattpocock 113k

Content & Writing

clarify

★ Featured

Improve unclear UX copy, error messages, microcopy, labels, and instructions to make interfaces easier to understand. Use when the user mentions confusing text, unclear labels, bad error messages, hard-to-follow instructions, or wanting better UX writing.

pbakaus 82k

Content & Writing