Neo: Two concrete plans (Jobs-level detail)


---
Plan 1: MCP Exposure Roadmap (Top 3 Capabilities)
Goal: Make Hermes discoverable and callable by agents via MCP endpoints. This directly addresses the “new SEO” observation.
Priority order
1. note-weaver (highest immediate value)
2. wiki ingestion pipeline (Raindrop + EPUB → wiki)
3. FlashLib clustering skill (new)
Implementation steps
- Phase 1 (note-weaver)
  - Expose the existing note-weaver logic as an MCP tool.
  - Parameters: source_path (file or URL, required), target_vault (default: ~/wiki), tags (optional).
  - Return: structured markdown + list of auto-generated links + Connection Scores.
  - Add progress streaming so long-running ingestions report status.
- Phase 2 (wiki ingestion)
  - Create a dedicated MCP endpoint ingest-to-wiki that wraps the full pipeline (fetch → extract → note-weaver → cross-link).
  - Support both single items and batch collections.
  - Expose metadata (source, date, original tags) so agents can make informed decisions.
- Phase 3 (FlashLib clustering)
  - Once the clustering skill exists, expose cluster-corpus and get-cluster-summary as MCP tools.
  - Allow agents to request thematic grouping of any collection and receive hub-page suggestions.
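The Phase 2 pipeline above (fetch → extract → note-weaver → cross-link) can be sketched as a chain of small stages. This is only an illustration under assumptions: the `Item` class, the four stage functions, and the `on_progress` callback are hypothetical stand-ins, not the existing Hermes or note-weaver code, and each stage body is a placeholder.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Optional, Union


@dataclass
class Item:
    """One unit moving through the pipeline (hypothetical structure)."""
    source: str                                   # file path or URL
    date: str = ""                                # original date, if known
    tags: List[str] = field(default_factory=list) # original tags
    text: str = ""                                # filled in by fetch/extract
    note: str = ""                                # filled in by weave
    links: List[str] = field(default_factory=list)


def fetch(item: Item) -> Item:
    # Placeholder: a real stage would download the URL or open the EPUB.
    item.text = f"<raw>{item.source}</raw>"
    return item


def extract(item: Item) -> Item:
    # Placeholder: strip the wrapper to stand in for content extraction.
    item.text = item.text.replace("<raw>", "").replace("</raw>", "")
    return item


def weave(item: Item) -> Item:
    # Placeholder for the note-weaver step: produce structured markdown.
    item.note = f"# {item.source}\n\n{item.text}"
    return item


def cross_link(item: Item) -> Item:
    # Placeholder: turn each tag into a wiki-style link.
    item.links = [f"[[{tag}]]" for tag in item.tags]
    return item


PIPELINE = (fetch, extract, weave, cross_link)


def ingest_to_wiki(
    items: Union[Item, Iterable[Item]],
    on_progress: Optional[Callable[[str], None]] = None,
) -> List[dict]:
    """Run every stage on a single item or a batch, reporting progress per stage."""
    batch = [items] if isinstance(items, Item) else list(items)
    results = []
    for index, item in enumerate(batch, start=1):
        for stage in PIPELINE:
            item = stage(item)
            if on_progress:
                on_progress(f"{index}/{len(batch)} {stage.__name__}: {item.source}")
        # Return the metadata alongside the note so agents can decide what to do next.
        results.append({
            "source": item.source,
            "date": item.date,
            "tags": item.tags,
            "note": item.note,
            "links": item.links,
        })
    return results
```

Accepting either one `Item` or an iterable keeps a single entry point for both the single-item and batch cases, and the per-stage callback is one possible hook for the progress streaming mentioned in Phase 1.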
Decision points
- Authentication: start local-only over stdio, then add token-based authentication if needed.
- Tool naming: use clear, agent-friendly names (note_weaver, ingest_to_wiki, cluster_documents).
- Documentation: every endpoint must include a one-paragraph description and an example call so agents can self-discover it.
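As a rough sketch of what a self-describing endpoint could look like, the descriptor below follows the `name` / `description` / `inputSchema` layout that MCP tool definitions use, with the note-weaver parameters from Phase 1. The description wording, the example URL, and the `resolve_arguments` helper are assumptions made for illustration, not part of any existing Hermes code.

```python
NOTE_WEAVER_TOOL = {
    "name": "note_weaver",
    "description": (
        "Ingest a file or URL into the wiki vault. Returns structured markdown, "
        "the auto-generated links, and their Connection Scores."
    ),
    "inputSchema": {
        "type": "object",
        "properties": {
            "source_path": {
                "type": "string",
                "description": "File path or URL to ingest.",
            },
            "target_vault": {
                "type": "string",
                "description": "Vault to write into.",
                "default": "~/wiki",
            },
            "tags": {
                "type": "array",
                "items": {"type": "string"},
                "description": "Optional tags to attach.",
            },
        },
        "required": ["source_path"],
    },
}

# Example call shipped with the documentation (the URL is a placeholder).
EXAMPLE_CALL = {
    "name": "note_weaver",
    "arguments": {"source_path": "https://example.com/article", "tags": ["reading"]},
}


def resolve_arguments(tool: dict, arguments: dict) -> dict:
    """Check required and unknown keys, then fill in schema defaults."""
    schema = tool["inputSchema"]
    missing = [key for key in schema["required"] if key not in arguments]
    if missing:
        raise ValueError(f"missing required arguments: {missing}")
    unknown = [key for key in arguments if key not in schema["properties"]]
    if unknown:
        raise ValueError(f"unknown arguments: {unknown}")
    resolved = {
        name: spec["default"]
        for name, spec in schema["properties"].items()
        if "default" in spec
    }
    resolved.update(arguments)
    return resolved
```

Keeping the example call next to the schema means the documentation can be checked mechanically: if `resolve_arguments(NOTE_WEAVER_TOOL, EXAMPLE_CALL["arguments"])` ever raises, the example has drifted from the schema.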
Timeline target: First endpoint (note-weaver) live within one focused session.
---
Plan 2: Wiki Knowledge Base Upgrade via FlashLib
Goal: Turn your wiki from a collection of pages into a clustered, queryable knowledge graph with thematic hubs, duplicate detection, and drift tracking.
Core workflow
1. Embedding generation
   - Use your existing embedder on all wiki pages + Raindrop items.
   - Store embeddings alongside metadata (title, source, date, tags).
2. Clustering with FlashLib
   - Run FlashKMeans (or the library’s recommended clustering method) on the full corpus.
   - Target cluster count: start with 30–80 (tunable).
   - Output: a cluster ID for every document and a membership list for every cluster.
3. Post-processing
   - Thematic hubs: For each cluster, auto-generate a hub page with:
     - Cluster name (LLM-generated from top documents)
     - Top 5 representative excerpts
     - List of all member pages with links
   - Duplicate detection: Flag document pairs with cosine similarity > 0.92 within the same cluster.
   - Drift tracking: Re-run clustering monthly and compare cluster membership changes over time.
4. Integration
   - New skill: wiki-cluster (or extend note-weaver).
   - One-command invocation: wiki-cluster --collection raindrop --output hubs/
   - Results written as markdown files that become first-class wiki pages.
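The duplicate-detection rule in step 3 can be sketched in plain Python. This assumes embeddings are available as lists of floats keyed by document ID and that the clustering step has produced a document-to-cluster mapping. The function names and data shapes are hypothetical, and it does not use FlashLib itself.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_duplicates(embeddings, cluster_of, threshold=0.92):
    """Return (doc_a, doc_b, score) for pairs in the same cluster above the threshold."""
    by_cluster = {}
    for doc_id, cluster_id in cluster_of.items():
        by_cluster.setdefault(cluster_id, []).append(doc_id)

    flagged = []
    for members in by_cluster.values():
        # Only compare documents that share a cluster, as the plan specifies.
        for a, b in combinations(sorted(members), 2):
            score = cosine_similarity(embeddings[a], embeddings[b])
            if score > threshold:
                flagged.append((a, b, round(score, 4)))
    return flagged
```

Restricting the comparison to pairs inside one cluster avoids comparing every document against every other, at the cost of missing near-duplicates that the clustering happened to split apart.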
Decision points
- Embedding model: reuse whatever you already use for note-weaver to keep consistency.
- Cluster count strategy: start with elbow method or silhouette score, then allow manual override.
- Update cadence: nightly for new items, full re-cluster monthly.
- Privacy: all processing stays local.
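A monthly full re-cluster only yields a drift signal if two runs can be compared even though cluster IDs are arbitrary and may change between runs. One possible approach, sketched here as an assumption rather than a FlashLib feature, is to match each old cluster to the new cluster with the highest Jaccard overlap. The function names and the document-to-cluster dictionaries are hypothetical.

```python
def group_by_cluster(assignment):
    """Invert a {doc_id: cluster_id} mapping into {cluster_id: set(doc_ids)}."""
    groups = {}
    for doc_id, cluster_id in assignment.items():
        groups.setdefault(cluster_id, set()).add(doc_id)
    return groups


def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def cluster_drift(previous, current):
    """For each old cluster, report the best-matching new cluster and its overlap.

    Only documents present in both runs are compared, so items added by the
    nightly updates do not count as drift.
    """
    shared = previous.keys() & current.keys()
    old_groups = group_by_cluster({doc: previous[doc] for doc in shared})
    new_groups = group_by_cluster({doc: current[doc] for doc in shared})

    report = {}
    for old_id, old_members in old_groups.items():
        best_id, best_score = max(
            ((new_id, jaccard(old_members, members)) for new_id, members in new_groups.items()),
            key=lambda pair: pair[1],
        )
        report[old_id] = {"best_match": best_id, "jaccard": best_score}
    return report
```

A score near 1.0 means the theme survived the re-cluster largely intact, while a low score points at a cluster that split, merged, or dissolved and may need its hub page regenerated.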
Expected outcome
Your wiki becomes visibly more powerful: agents (and you) can navigate by theme instead of searching manually, duplicates are flagged automatically, and knowledge drift becomes measurable.
---
Next step
Which plan do you want to start with, or shall we run both in parallel? I can immediately produce the first skill file or MCP endpoint spec once you choose the entry point.