7 Data Sources
Automated multi-source AI & tech intelligence for your agents
Overview
Info Pipeline aggregates 7 data sources across English and Chinese tech platforms every day. Each item passes through keyword filtering, relevance scoring, and deduplication before being serialized into a unified JSON schema, ready for your AI researcher agents to consume.
Capabilities
Covers GitHub, Hacker News, Reddit, YouTube, Product Hunt, X/Twitter, and 6 Chinese tech platforms in a single run.
Each item is scored against a configurable keyword list. Only high-signal content passes: no noise.
Cross-source deduplication ensures the same story from multiple platforms appears only once in your report.
All 7 collectors output the same JSON schema (title, url, source, score, summary, tags) for easy downstream processing.
Everything in config.yaml: keywords, per-source limits, score thresholds, enabled platforms. Change behavior without touching code.
Pipeline outputs structured Markdown reports consumed directly by AI researcher agents for further analysis.
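As a rough illustration of that configuration surface, a config.yaml might look like the sketch below. All key names and values here are assumptions for illustration, not the project's actual file:

```yaml
# Hypothetical config.yaml sketch -- key names are illustrative assumptions
keywords:
  - llm
  - ai agent
  - rag
score_threshold: 60      # items scoring below this are dropped
sources:
  github:
    enabled: true
    limit: 30            # per-source item cap
  hackernews:
    enabled: true
    limit: 50
```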
Architecture
1. Collect: Each collector runs independently and fetches raw items from its platform using the parameters in config.yaml.
2. Filter & Score: Items are matched against global keywords and scored for relevance. Low-signal content is discarded before it reaches storage.
3. Deduplicate: URL-based and title-similarity deduplication removes duplicates across sources; the same story won't appear twice.
4. Unify Schema: All surviving items are normalized into a single JSON structure: title, url, source, score, published_at, summary, tags.
5. Report: A structured Markdown report is written to reports/, directly consumable by AI researcher agents for further analysis.
CLI Usage
# Run all 7 sources
python main.py
# Run a single source
python main.py --source github
python main.py --source hackernews
python main.py --source reddit
# List available sources
python main.py --list
Output Schema (per item)
{
"title": "...",
"url": "https://...",
"source": "github",
"score": 85,
"published_at": "2026-02-24T...",
"summary": "...",
"tags": ["llm", "open-source"]
}
Data Sources
English mainstream tech + Chinese ecosystem, both covered in a single pipeline run.
Hot repositories by topic (LLM, AI agent, RAG, MCP, diffusion) filtered by stars and recency.
Top stories filtered by score threshold: surface what the tech community is talking about today.
Multi-subreddit coverage: r/LocalLLaMA, r/MachineLearning, r/artificial, r/ChatGPT, r/ClaudeAI and more.
Latest uploads from top AI channels: Karpathy, Yannic Kilcher, Two Minute Papers, 3Blue1Brown, Fireship.
Daily new products in AI, Developer Tools, and Productivity, filtered by votes and topic.
Keyword search for AI/LLM discussions from the English-language tech community.
Zhihu · 36Kr · Juejin · SSPai · InfoQ · Bilibili tech, via trends-hub MCP integration.
Coverage
Every content type that matters for AI & tech research, all in one daily run.
GitHub Repos
Stars, forks, topics, recency
HN Discussions
Score, comments, domain
Reddit Posts
Multi-subreddit, upvote filtered
YouTube Videos
Selected AI creator channels
Product Launches
Daily PH feed, vote threshold
Tweets
Keyword search, recent timeline
Chinese Tech News
Zhihu / 36Kr / Juejin / Bilibili / InfoQ / SSPai
Coming Soon
Be the first to know when Info Pipeline launches as a managed OpenClaw plugin.