Claude CLI + Obsidian

8 Karpathy-Style AI Agent Use Cases β€” Deep Research Report

🧠 AI Agents πŸ““ Obsidian πŸ–₯️ CLI-First πŸ—„οΈ Trino + Snowflake πŸ“Š Data Intelligence πŸ“… April 18, 2026

πŸ“‹ Table of Contents

  1. 🧠 Personal Knowledge RAG β€” Claude as a second brain over your Obsidian vault
  2. πŸ—„οΈ Data Lake / DB Query Agent β€” Natural language SQL on Trino + Snowflake
  3. 🌐 Research Assistant β€” Web search β†’ synthesized notes β†’ knowledge graph
  4. πŸ’» Codebase Explorer β€” Map, explain, and document any codebase
  5. πŸ“‹ Meeting Intelligence β€” Transcript β†’ decisions + tasks β†’ linked task graph
  6. ✍️ Writing Pipeline β€” Claude drafts from your Obsidian context, in your voice
  7. πŸ“Š BI Dashboard + Analyst β€” Live SQL β†’ trend analysis β†’ Obsidian notes
  8. πŸ“š Learning System β€” Papers β†’ summaries β†’ flashcards β†’ spaced repetition

Executive Summary

This report covers 8 high-value use cases for building a personal AI agent stack using Claude CLI (Karpathy-style: minimal, CLI-first, tool-calling loop) wired to Obsidian as the knowledge base, with external data connectors (Trino, Snowflake, web search, git) feeding context in.

Each use case is analyzed across: implementation approach, agentic loop pattern, core prompts, Obsidian vault structure, required tools, security model, and complexity. All 8 are implementable today with existing tools.

Data Agent focus: Trino + Snowflake connectors are deeply covered in Use Case 2 (Data Lake/DB Agent) and Use Case 7 (BI Dashboard), with schema knowledge base patterns, efficient SQL generation, and least-privilege security.


🧠

Personal Knowledge RAG

Claude as a second brain over your Obsidian vault β€” reads, writes, links, and resumes across sessions

Setup: Medium Β· Time to first result: 30min–2hr

What It Is

A local AI knowledge management stack pairing Obsidian (as a Zettelkasten vault) with a CLI agent that reads/writes notes, queries semantic memory via Dataview, and autonomously maintains bidirectional links. The agent treats the vault as its memory layer β€” creating, linking, and surfacing notes on behalf of the user.

Unlike cloud-based "Second Brain" tools, everything stays local. Obsidian vaults are plain Markdown files β€” no vendor lock-in, no subscription, no data leaving your machine.

Karpathy CLI Approach

The Karpathy pattern: vault as working directory β†’ Claude reads/writes Markdown files directly. Tool access via ripgrep (search), glob, and Dataview HTTP queries. Three wiring options:

  • Claude Code CLI β€” claude --dir ~/vault with project config pointing at vault root
  • Custom Python agent β€” llm API calls with filesystem tools
  • MCP server β€” Obsidian Local REST API plugin enables structured Dataview queries from CLI

Agentic Loop Pattern

β‘  User query β†’ Perceive
β‘‘ vault_search / dataview_query β†’ Recall (ranked notes)
β‘’ LLM synthesizes β†’ Reason
β‘£ vault_write / vault_link / vault_tag β†’ Act (new note, link, update)

Tools needed: vault_read, vault_search, vault_write, vault_glob, dataview_query, vault_link, vault_tag
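The four-step loop above can be sketched as a minimal dispatch cycle. A hedged sketch: the tool names match the list above, but the function bodies (naive substring search, direct file writes) are illustrative stand-ins, and the LLM client that emits the `tool_call` dicts is omitted.

```python
from pathlib import Path

def vault_search(vault: Path, query: str) -> list[str]:
    """Naive ranked recall: paths of notes whose text mentions the query."""
    hits = [str(p.relative_to(vault))
            for p in sorted(vault.rglob("*.md"))
            if query.lower() in p.read_text(errors="ignore").lower()]
    return hits[:10]

def vault_write(vault: Path, relpath: str, content: str) -> str:
    """Act step: create or overwrite a note, making parent folders as needed."""
    target = vault / relpath
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(content)
    return f"wrote {relpath}"

TOOLS = {"vault_search": vault_search, "vault_write": vault_write}

def agent_step(vault: Path, tool_call: dict):
    """Dispatch one LLM-requested tool call, e.g.
    {'name': 'vault_search', 'args': {'query': 'zettelkasten'}}."""
    return TOOLS[tool_call["name"]](vault, **tool_call["args"])
```

In a real loop, the tool result is appended to the conversation and the model decides the next call or final answer.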

Core Prompts

  • System prompt: "You are a personal knowledge management agent. Your memory is an Obsidian vault. All notes are .md with YAML frontmatter. Prefer creating/updating notes over giving long answers. Link with wikilinks. Tag in frontmatter."
  • Summarize & save: "summarize my notes on X, create a permanent note"
  • Research: "I read an article about Y, save key takeaways to Inbox"
  • Query: "what do I know about Z, find and link related notes"

Obsidian Vault Structure

  • Inbox/ β€” Unsorted capture, process later
  • Permanent/ β€” Decontextualized evergreen notes
  • References/ β€” Literature with bibliographic metadata
  • Projects/ β€” Project-specific notes
  • Journal/ β€” Daily notes (Daily Notes plugin)
  • memory/ β€” Agent session state & context
  • Templates/ β€” Templater templates
  • Attachments/ β€” Images, PDFs

Obsidian Plugins

  • Dataview β€” frontmatter indexing & live queries
  • Templater β€” dynamic note templates
  • Local REST API β€” HTTP vault queries for CLI agents
  • Daily Notes β€” built-in session journal
  • Obsidian Git β€” vault backup

Key Challenges & Gotchas

  • Context window overflow β€” large vaults exceed LLM context. Use two-phase retrieve-then-reason with Dataview pre-filtering.
  • Wikilink case sensitivity β€” APFS (macOS) is case-insensitive; ext4 (Linux) is sensitive. Use all-lowercase kebab-case filenames.
  • Dataview index lag β€” queries may be stale if Obsidian is closed. Trigger refresh via Local REST API after bulk writes.
  • Daily note conflicts β€” agent and user writing simultaneously clobber each other. Use separate memory/sessions/ folder for agent.
  • CORS on Local REST API β€” if agent runs on different machine via SSH, tunnel or configure API auth.

Recommended Resources

β€’ Karpathy's Zero To Hero series β€” neural-networks course; the philosophical basis for the minimal, from-scratch agent approach
β€’ Claude Code docs β€” set CLAUDE_CODE_DIR to vault path
β€’ Obsidian Dataview plugin β€” essential for RAG retrieval layer
β€’ Obsidian Local REST API plugin β€” HTTP vault queries

πŸ—„οΈ

Data Lake / DB Query Agent

Natural language β†’ optimized SQL on Trino + Snowflake β€” Obsidian as persistent schema knowledge base

Setup: Medium Β· Time to first result: 2–4hr

What It Is

A CLI agent (Karpathy-style Python REPL loop + Claude API) that uses Obsidian as a persistent schema/context knowledge base, connects natively to Trino (federated SQL across S3/ADLS/GCS/Hive) and Snowflake (enterprise DW), translates natural language questions into self-verified optimized SQL, and returns human-readable results with full provenance.

Trino provides federated SQL across data lake formats (Parquet, ORC, Avro, JSON) without moving data. Snowflake connector in Trino allows joining data lake + DW tables in a single query.

Karpathy CLI Architecture

Python script REPL loop wrapping trino CLI + snow CLI + Obsidian file reads:

  • System prompt: Defines persona (DB query expert), available tools (trino_query, snow_query, obsidian_search, validate_sql), behavior rules
  • Tool layer: Python functions wrapping CLI commands β€” trino --server https://host:8443 --catalog aws -f query.sql
  • Obsidian layer: obsidian_search(query) calls grep/Dataview API to find relevant schema docs
  • Two backends: Agent decides whether to use Trino or Snowflake based on question topic
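The tool layer is just thin subprocess wrappers around the CLIs. A sketch of the `trino_query` tool, assuming the `trino` binary and the flags shown above (`--server`, `--catalog`, `-f`); `--output-format CSV_HEADER` is an additional assumption about your CLI version.

```python
import subprocess

def build_trino_cmd(server: str, catalog: str, sql_file: str) -> list[str]:
    """Assemble the trino CLI invocation mirroring the example command."""
    return ["trino", "--server", server, "--catalog", catalog,
            "--output-format", "CSV_HEADER", "-f", sql_file]

def trino_query(server: str, catalog: str, sql_file: str) -> str:
    """Run the query and return raw CSV output; raises on a non-zero exit."""
    cmd = build_trino_cmd(server, catalog, sql_file)
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
```

The `snow_query` wrapper is symmetric; the agent picks which wrapper to call based on its backend-routing decision.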

10 Core Tools

  • trino_describe β€” DESCRIBE table via Trino CLI: column names, types, partition keys
  • trino_list_tables β€” SHOW CATALOGS / SHOW TABLES FROM schema
  • trino_query β€” Execute SQL, return JSON/CSV results + stats
  • snow_describe β€” Describe Snowflake table metadata
  • snow_list_tables β€” List Snowflake tables/row counts
  • snow_query β€” Execute Snowflake SQL via snow CLI
  • obsidian_search β€” Search vault for table descriptions, business definitions
  • obsidian_read β€” Read specific schema note from vault
  • validate_sql β€” EXPLAIN dry-run before execution
  • format_results β€” Convert JSON/CSV to human-readable markdown

Agentic Loop

β‘  User NL question β†’ Decide: Trino or Snowflake?
β‘‘ Search Obsidian for relevant schema docs
β‘’ Generate SQL with explanation
β‘£ Validate SQL via dry-run (EXPLAIN)
β‘€ Execute query
β‘₯ Format and return results
⑦ Log query to Obsidian query history

Trino Connector β€” Key Patterns

  • Predicate pushdown: Include partition key filters (e.g., WHERE date = '2024-01-01') to leverage Trino's connector-level filter optimization
  • Sampling: Use TABLESAMPLE (BERNOULLI(1)) for quick exploration before full analysis
  • File formats: ORC (default, best compression), Parquet (widely compatible), Avro, JSON, CSV
  • Auth: Certificate-based or service account β€” avoid passwords. LDAP, OAuth2, Kerberos for enterprise
  • Trino β†’ Snowflake: Native connector queries Snowflake directly; allows JOINing Snowflake dims with S3 facts in one SQL statement

Snowflake Connector β€” Key Patterns

  • Dedicated warehouse: Agent uses a dedicated warehouse (e.g., COMPUTE_WH) with auto-suspend (60s) to avoid idle costs
  • Key pair auth: More secure than password β€” store private key passphrase in env var or secrets manager
  • Read-only role: GRANT USAGE + SELECT only β€” never INSERT/UPDATE/DELETE/SYSADMIN
  • Micro-partitions: Queries benefit from predicates on clustered columns β€” align filters with clustering keys
  • Result caching: Use identical query text for repeated exploration to hit Snowflake's query result cache

Efficient SQL Patterns

  • Partition pushdown: Always include partition key predicates β€” without them, Trino scans all partitions
  • Aggregation pushdown: Let Trino/Snowflake aggregate at the storage layer β€” never pull raw rows
  • LIMIT rows early: Apply LIMIT 10000 immediately to reduce row volume before expensive operations
  • JOIN ordering: Use smaller table as build side in hash joins; LEFT JOIN for dimension tables
  • EXPLAIN first: Always run EXPLAIN before execution on large tables β€” reject if scan > 100GB or rows > 1B
  • APPROX_DISTINCT: Use for high-cardinality counts to reduce payload
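Several of these rules can be enforced mechanically before anything runs. A rough sketch using string heuristics; a real agent would inspect the EXPLAIN plan instead, and the function name is illustrative.

```python
import re

def lint_sql(sql: str, partition_keys: list[str], max_limit: int = 10_000) -> list[str]:
    """Cheap pre-flight checks for the patterns above: partition pushdown,
    early LIMIT, and no SELECT *. Returns a list of problems (empty = pass)."""
    problems = []
    lowered = sql.lower()
    # Partition pushdown: at least one partition key must appear in the SQL.
    if not any(re.search(rf"\b{re.escape(k.lower())}\b", lowered) for k in partition_keys):
        problems.append("no partition-key predicate: full scan likely")
    # LIMIT rows early, and cap it.
    m = re.search(r"\blimit\s+(\d+)", lowered)
    if m is None:
        problems.append("missing LIMIT")
    elif int(m.group(1)) > max_limit:
        problems.append(f"LIMIT exceeds {max_limit}")
    if "select *" in lowered:
        problems.append("SELECT * pulls unneeded columns")
    return problems
```

The agent rejects or regenerates SQL whenever the list is non-empty, then proceeds to the EXPLAIN dry-run.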

Obsidian Schema Knowledge Base

  • /schema/ β€” Table schemas by catalog/schema, one file per table
  • /metrics/ β€” Business metric definitions + calculation SQL
  • /query-history/ β€” Daily log of queries run with NL + SQL
  • /business-glossary/ β€” Term definitions and data dictionary
  • /data-quality/ β€” Known issues, null semantics, update frequencies

Each schema note has YAML frontmatter: type: table-schema, catalog, schema, partitionKeys, clusterKeys, rowEstimate, lastUpdated. Dataview queries find tables related to any topic.
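A note generator matching that frontmatter, as a sketch: field names follow the list above, `clusterKeys` is omitted for brevity, and the markdown body layout is illustrative.

```python
def schema_note(catalog: str, schema: str, table: str,
                columns: dict[str, str], partition_keys: list[str],
                row_estimate: int, last_updated: str) -> str:
    """Render one per-table schema note (one file per table under /schema/)
    with YAML frontmatter Dataview can index."""
    lines = [
        "---",
        "type: table-schema",
        f"catalog: {catalog}",
        f"schema: {schema}",
        f"partitionKeys: [{', '.join(partition_keys)}]",
        f"rowEstimate: {row_estimate}",
        f"lastUpdated: {last_updated}",
        "---",
        f"# {catalog}.{schema}.{table}",
        "",
        "| column | type |",
        "| --- | --- |",
    ]
    lines += [f"| {col} | {typ} |" for col, typ in columns.items()]
    return "\n".join(lines) + "\n"
```

Running `trino_describe` and piping the result through this function keeps the knowledge base regenerable from live metadata.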

Query Self-Verification

  • Run EXPLAIN on generated SQL before executing β€” verify table names, column references, row estimates, no Cartesian products
  • Cross-reference column names with Obsidian schema docs β€” catch typos before execution
  • Check JOIN key types match (string vs int vs date) β€” type mismatch causes silent wrong results
  • Validate WHERE predicates: no invalid date ranges, no comparisons on non-comparable types
  • Spot-check results against Obsidian sample data for semantic correctness
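The JOIN-key type check is simple to automate once the schema docs are loaded into dicts. A sketch; the shape of `schemas` (table name to column-type mapping) is an assumption about how you parse the vault notes.

```python
def check_join_types(schemas: dict[str, dict[str, str]],
                     joins: list[tuple[str, str, str, str]]) -> list[str]:
    """Verify JOIN key types match across tables.
    `joins` lists (left_table, left_col, right_table, right_col) tuples;
    returns human-readable mismatch descriptions (empty = pass)."""
    mismatches = []
    for lt, lc, rt, rc in joins:
        ltype, rtype = schemas[lt].get(lc), schemas[rt].get(rc)
        if ltype is None or rtype is None:
            mismatches.append(f"unknown column in {lt}.{lc} = {rt}.{rc}")
        elif ltype != rtype:
            mismatches.append(f"{lt}.{lc} ({ltype}) vs {rt}.{rc} ({rtype})")
    return mismatches
```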

Security & Auth

πŸ”‘ Credential Management
Trino: certificate or key pair auth in env vars. Snowflake: key pair + passphrase in env var. Obsidian vault: encrypted path or access-controlled sync.
πŸ›‘οΈ Least-Privilege Roles
Trino: read-only on catalogs needed. Snowflake: USAGE + SELECT only, no INSERT/UPDATE/DELETE. Trino-to-Snowflake connector: dedicated read-only Snowflake user.
🚫 Column Masking
Snowflake masking policies on PII columns. Agent sees masked values unless unmasked role granted (it shouldn't be).
πŸ“ Result Size Limits
Enforce max rows returned (e.g., 10,000). Pre-aggregate at DB layer. Use approximate aggregates such as APPROX_COUNT_DISTINCT. Never pull raw fact table rows into LLM context.

Required Tools

  • trino-cli β€” Trino CLI JAR
  • snowsql β€” Snowflake CLI
  • snowflake-connector-python
  • Dataview + Templater β€” Obsidian plugins
  • Python wrapper (dbagent package)
  • anthropic Python SDK

Key Challenges

  • Schema drift: Tables change but Obsidian docs go stale β€” agent should detect inconsistencies on describe and flag
  • Vault size: Large vaults (10k+ notes) can't fit in context β€” use Dataview to dynamically filter per question
  • SQL dialect differences: Trino follows ANSI SQL, while Snowflake has its own dialect and extensions β€” the agent must route each question to the right backend and generate the matching dialect
  • Snowflake warehouse cold start: ~60s wake-up cost if suspended β€” warn about this
  • Query cost: Snowflake charges per credit β€” always run EXPLAIN first and warn about expensive queries

🌐

Research Assistant

Web search + Obsidian notes synthesis β€” auto-links into a growing knowledge graph

Setup: Medium Β· Time to first result: 15–30min

What It Is

Claude conducts deep web research, synthesizes findings into atomic markdown notes, and automatically links them into a growing Obsidian knowledge graph. Every research session feeds a compounding graph β€” future queries leverage prior work so research on related topics auto-surfaces old notes.

Core principle: every factual claim gets an inline [Source](URL) citation. Claude fetches source pages directly to verify, not just trust search snippets.

Karpathy CLI Approach

Terminal-first, no GUI automation of Obsidian β€” just direct file I/O on .md files. Claude operates as the agent brain; Obsidian is the viewer/output:

  • claude --dir ~/vault β€” vault as working directory
  • WebSearch + WebFetch built-in (no MCP needed) β€” Claude issues searches, reads top results
  • Optional: Brave Search MCP for higher-quality results with free API key

Agentic Loop

OBSERVE β€” WebSearch + WebFetch + vault_read
THINK β€” LLM synthesizes, identifies gaps, decides next search
ACT β€” Write new note, Edit existing, Bash git/Dataview

Loop terminates when: question answered, Claude signals diminishing returns, or user interrupts. Subagents can run parallel research branches.

Core Prompts

  • Research system: "Always cite sources with [Source](URL). Create one note per concept (atomic). Always link to related notes with wikilinks. If note exists, extend it β€” don't duplicate."
  • Deep research: Issue search β†’ read 2-3 source pages β†’ write atomic note β†’ link to related notes β†’ add #Sources block
  • Note synthesis: Review last 5 inbox notes, identify top 3 concepts, create synthesis note connecting all items
  • Link discovery: Run Dataview for orphaned notes, search web for how orphaned concept relates to other topics
  • Session resume: Read session-log.md, summarize last thread, suggest next searches

Obsidian Vault Structure

  • inbox/ β€” Raw research output, not yet organized
  • topics/ β€” Curated atomic notes, one per concept
  • papers/ β€” Paper summaries and annotations
  • people/ β€” Notes on individuals
  • projects/ β€” Multi-topic research projects
  • daily/ β€” Daily research logs (YYYY-MM-DD.md)
  • session-log.md β€” Current session context at root
  • index.md β€” Hub note with Dataview queries

Session Persistence

A session-log.md at vault root records: current research question, what was searched, what was found, what remains open, suggested next steps. Updated at end of each session. On startup, Claude reads it first.

Git commits provide full history β€” git log can reconstruct past research threads. The vault becomes a fully versioned, queryable knowledge base.

Source Tracking

  • Inline citation: Every factual claim ends with [Source](https://...)
  • Frontmatter: source: https://... and sources: [...] in YAML
  • Source block: Notes end with # Sources block listing all URLs
  • Provenance: Note's created date + git commit history = full provenance trail

Key Plugins

  • Dataview β€” query vault as database
  • Templater β€” dynamic note templates
  • Recent Files β€” surface recently touched notes
  • Natural Language Dates β€” type "tomorrow", "next week" in dates
  • Brave Search MCP (optional) β€” higher-quality search
  • Memory MCP (optional) β€” knowledge-graph persistent memory
⚠️ Hallucination risk: Claude may hallucinate facts even with citations. Always require inline [Source](URL) for every claim. Claude should fetch source page directly for verification, not trust search snippets.

Key Challenges

  • Duplicate notes: Instruct Claude to always search vault first before creating a new note
  • Wikilink breakage: Rename handling via Obsidian's internal link index; use stable lowercase filenames
  • Dataview slowness: On 10k+ vaults, use limit clause and cache queries
  • Agent drift: Set max session length; reference session-log.md explicitly to keep on track
  • Research quality degradation: Summarize and git-commit every 10 searches to prevent drift

πŸ’»

Codebase Explorer

Claude maps and explains any large codebase β€” outputs architectural docs to Obsidian

Setup: Medium Β· Time to first result: 15–30min

What It Is

An AI-powered codebase exploration and documentation system. Claude reads local codebases, traces execution paths, maps dependencies, explains architecture, and auto-generates interconnected documentation within an Obsidian vault β€” turning a codebase into a searchable, navigable knowledge graph with ADRs, inline explanations, and architectural overviews.

Claude Code's Explore subagent (read-only, Haiku-powered) is optimized for codebase traversal at scale β€” no Write/Edit tools, purely exploratory.

Karpathy CLI Approach

Vault as working directory β†’ Claude Code operates on local repo β†’ generated docs written to Obsidian vault at docs/vault/:

  • Claude Code Explore subagent: Read-only traversal, designed for large codebases
  • Custom SKILL.md: Drives exploration loop with structured prompts
  • Vault output: docs/vault/{project-name}/ β€” can be separate git repo or subdirectory
  • Git history: git log/blame/show enrich understanding of decisions and evolution

Agentic Loop β€” Tools

  • Read β€” Read file contents: primary data source, offset/limit for large files
  • exec (grep/rg) β€” Search patterns: function names, imports, comments
  • exec (ctags) β€” Symbol resolution: find where functions/types are defined
  • exec (git log/blame) β€” Git history analysis: who changed what, when, why
  • exec (tree-sitter) β€” AST parsing: structural extraction (optional but powerful)
  • Write/Edit β€” Write documentation to Obsidian vault

Core Prompts

  • Architecture explanation: "Identify layers, key abstractions, data flow, entry/exit points. Cite file paths and line numbers."
  • Dependency tracing: "Trace dependency chain β€” what it depends on, what depends on it. Identify circular dependencies."
  • Code review: "Design patterns, bugs, performance, conventions, error handling. Rate severity Γ— confidence."
  • ADR generation: "Generate Architecture Decision Record. Base on git history analysis and code structure."
  • Git enrichment: "Who were primary authors? Major evolution points? Long-standing TODOs/FIXMEs?"

Obsidian Vault Structure

  • index.md β€” Project overview, key metrics, links
  • architecture.md β€” System-wide architectural overview
  • modules/ β€” One .md per major module
  • adrs/ β€” Architecture Decision Records (ADR-001.md…)
  • explorations/ β€” Time-boxed exploration reports
  • code-graph/ β€” Dependency graphs, call chains
  • search-indexes/ β€” Dataview-generated indexes

Large Codebase Strategies

  • Hierarchical summarization: Explore module-by-module, summarize each, then synthesize at higher levels
  • Targeted exploration: Only read files relevant to query β€” never full repo at once
  • ctags TAGS file: Pre-generated, checked into repo β€” O(1) symbol-to-file lookup
  • Tree-sitter CLI: Optional but dramatically improves structural understanding
  • Exploration budget: Max 50 files read, max 5 subdirectories traversed per query
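The exploration budget is easy to enforce with a small gatekeeper the loop consults before each Read. A sketch; treating the top-level path component as the "subdirectory" unit is an assumption about granularity.

```python
class ExplorationBudget:
    """Enforce per-query caps like those above: max files read and
    max distinct top-level directories traversed."""

    def __init__(self, max_files: int = 50, max_dirs: int = 5):
        self.max_files, self.max_dirs = max_files, max_dirs
        self.files_read = 0
        self.dirs_seen: set[str] = set()

    def allow(self, path: str) -> bool:
        """Return True and record the read, or False if a cap is hit."""
        top = path.split("/", 1)[0]
        if self.files_read >= self.max_files:
            return False
        if top not in self.dirs_seen and len(self.dirs_seen) >= self.max_dirs:
            return False
        self.dirs_seen.add(top)
        self.files_read += 1
        return True
```

When `allow` returns False, the agent should summarize what it has and either stop or ask the user to widen the budget.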

Key Plugins

  • Claude Code (core)
  • Obsidian β€” documentation UI
  • ripgrep β€” fast pattern matching
  • git β€” version control
  • tree-sitter CLI (optional)
  • Dataview + Templater
⚠️ AI-generated docs can be wrong: Cross-reference Claude's conclusions with git blame and actual code. Use tree-sitter for structural ground truth. Mark generated docs with "⚠️ AI-generated β€” verify assertions".

Key Challenges

  • Context overflow: Never read entire repo β€” always scope by directory or import graph
  • Stale docs: Regenerate periodically; include last-analyzed date; Dataview flags stale (>30 days)
  • Wikilink breakage: Always write .md before linking to it; use lowercase kebab filenames
  • Monorepo variants: One vault per package with root vault linking all sub-vaults

πŸ“‹

Meeting Intelligence

Transcript β†’ decisions + action items β†’ Obsidian linked task graph

Setup: Medium Β· Time to first result: 30–60min

What It Is

Claude ingests meeting content (raw notes, Zoom/Teams/Otter transcripts, or audio via Whisper), extracts decisions, action items with owners and deadlines, and writes them into Obsidian as a linked task graph. Every meeting becomes a living knowledge graph node β€” tasks surface automatically, owners are explicit, deadlines tracked via Dataview.

Addresses the "dead notes" problem: decisions get forgotten and action items get lost. This system turns every meeting into an actionable, trackable knowledge graph β€” with zero manual bookkeeping after setup.

Karpathy CLI Approach

Pipeline: transcript file β†’ shell script strips formatting/timestamps β†’ Claude extracts structured JSON β†’ Obsidian writes markdown notes:

  • Raw text: claude --print < meeting-notes.txt
  • Transcript formats: Zoom VTT, Teams TXT, Otter TXT/XLSX, Google Meet VTT β€” strip with sed/awk before processing
  • Audio: Whisper.cpp (local, CPU-friendly) β†’ transcript.txt β†’ Claude β†’ Obsidian

Structured Output Schema

Claude outputs ONLY valid JSON in this exact schema:

{
  "meeting_title": "string",
  "meeting_date": "YYYY-MM-DD",
  "attendees": ["name1", "name2"],
  "summary": "2-3 sentence executive summary",
  "decisions": [{ "id": "DEC-1", "decision": "...", "owner": "..." }],
  "action_items": [{
    "id": "ACT-1",
    "description": "starts with verb",
    "owner": "full name",
    "deadline": "YYYY-MM-DD or null",
    "priority": "high|medium|low",
    "status": "open"
  }]
}
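Once Claude emits that JSON, turning it into task notes is deterministic. A sketch mapping each action item to a `tasks/ACT-N.md` note with frontmatter and a backlink to the meeting; the exact note layout is illustrative.

```python
import json

def tasks_from_transcript_json(payload: str) -> dict[str, str]:
    """Convert Claude's structured JSON output (schema above) into
    markdown task notes, keyed by vault-relative path."""
    data = json.loads(payload)
    notes = {}
    for item in data["action_items"]:
        body = "\n".join([
            "---",
            f"id: {item['id']}",
            f"owner: {item['owner']}",
            f"deadline: {item['deadline']}",
            f"priority: {item['priority']}",
            f"status: {item['status']}",
            # Wikilink back to the originating meeting note.
            f"meeting_source: \"[[{data['meeting_date']}-{data['meeting_title']}]]\"",
            "tags: [action-item]",
            "---",
            f"# {item['id']}: {item['description']}",
        ])
        notes[f"tasks/{item['id']}.md"] = body + "\n"
    return notes
```

A wrapper script writes each entry to the vault and appends the task paths to the meeting note's frontmatter.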

Obsidian Vault Structure

  • meetings/ β€” YYYY-MM-DD-title.md, meeting note template
  • tasks/ β€” ACT-N.md, individual task notes
  • people/ β€” Person notes with meeting backlinks
  • projects/ β€” Project notes linked to related meetings

Every task note has frontmatter: id, owner, deadline, priority, status, meeting_source and links back to the originating meeting.

Cross-Linking Graph

  • Meeting β†’ Task: Task file path in meeting frontmatter; task links back via meeting_source
  • Meeting β†’ Person: Attendee list in frontmatter; each person note has meetings: [...]
  • Meeting β†’ Project: Topics/projects in frontmatter; project note has meetings: [...]
  • Task β†’ Project: project: [[Project Name]] in task frontmatter

Dataview Queries for Task Tracking

Open actions:
TABLE owner, deadline, priority FROM #action-item
WHERE status = "open" AND deadline != null SORT deadline ASC
Overdue:
TABLE owner, deadline FROM #action-item
WHERE status = "open" AND deadline < date(today)
By person:
TABLE deadline, priority FROM #action-item
WHERE owner = "Alice Smith" SORT deadline ASC

Required Tools

  • Claude CLI
  • Dataview + Templater
  • Obsidian Tasks plugin
  • Calendar plugin
  • Whisper.cpp (optional, for audio)
⚠️ Implied vs explicit actions: People rarely say "ACTION: Alice will..." β€” they say "I can send that over" or "someone needs to follow up". Claude must infer from context. Implied actions should be flagged with lower confidence for user review.

Key Challenges

  • Noisy transcripts: Auto-generated transcripts have hallucinated words, speaker confusion. Always review action item extraction.
  • Context window limits: Very long meetings (>2 hours) chunk by 30-min segments, then merge outputs
  • Owner name disambiguation: "John", "John from engineering", "JS" all same person β€” deduplicate
  • Deadline inference: resolve relative phrases ("next Tuesday", "end of month") against the meeting date, not the processing date β€” complex verbal references still need manual correction
  • Action items without owners: Flag as orphaned tasks for meeting organizer assignment

✍️

Writing Pipeline

Claude drafts from Obsidian context, learns your voice, writes consistently across months

Setup: Medium Β· Time to first result: 1–2hr

What It Is

A self-contained AI writing pipeline where Claude operates as a persistent agent over an Obsidian vault, reading accumulated notes and context to draft documents β€” emails, reports, blog posts, documentation, presentations β€” in a consistent authorial voice. The vault acts as both knowledge base and drafting workspace.

Most AI writing tools start cold every session. This pipeline gives Claude persistent memory β€” writing that improves over time, accumulates knowledge, and maintains voice consistency.

Karpathy CLI Approach

Claude CLI + Obsidian MCP server via Local REST API plugin:

  • Read: Fetch relevant notes from vault (by topic, folder, or search) β†’ Claude analyzes context packet (typically 5–15 notes)
  • Draft: Generate content using Templater template β†’ write to vault/drafts/
  • Review: Show draft to user or auto-evaluate against criteria
  • Edit: Iterate draft in place
  • Commit: Git commit draft β†’ promote to vault/writing/final/ when approved

Core Prompts

  • Drafting from context: "Read notes on [TOPIC]. Draft a [DOCUMENT TYPE] synthesizing this context. Maintain [STYLE]. Output to vault/drafts/[date]-[slug].md."
  • Style matching: "Analyze writing style of [voice samples]. Apply: [specific traits]. Match sentence length, paragraph structure, formality."
  • Tone adaptation: "Audience is [AUDIENCE]. Adjust: more formal/informal, technical/plain-language, persuasive/informative."
  • Outline expansion: "Expand each section of [outline-file.md] into full prose. Keep flowing, avoid bullet fragmentation."

Voice / Style Learning

Claude learns writing patterns by reading 3–5 representative documents from vault/style/ or vault/writing/final/:

  • Average sentence length, paragraph structure, vocabulary level
  • Passive vs active voice, bullet frequency, formality
  • Store summary in vault/notes/meta/authorial-voice.md β€” Claude reads it at every session start
  • After edits, save feedback as a note: what changed, why, what to maintain β†’ compounding improvement
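Most of these style signals can be computed crudely without an LLM and stored in the voice note for Claude to read at session start. A sketch; the passive-voice regex is a rough heuristic, not a parser.

```python
import re

def voice_profile(text: str) -> dict[str, float]:
    """Coarse style signals from a sample document: average sentence
    length, bullet frequency per line, and passive-voice hits
    (crude 'to be + word ending in -ed' heuristic)."""
    sentences = [s for s in re.split(r"[.!?]+\s", text) if s.strip()]
    words = text.split()
    lines = text.splitlines()
    bullets = sum(1 for ln in lines if ln.lstrip().startswith(("-", "*", "•")))
    passive = len(re.findall(
        r"\b(?:is|are|was|were|been|being)\s+\w+ed\b", text.lower()))
    return {
        "avg_sentence_len": len(words) / max(len(sentences), 1),
        "bullets_per_line": bullets / max(len(lines), 1),
        "passive_hits": float(passive),
    }
```

Running this over `vault/style/` and saving the numbers gives Claude concrete targets ("aim for ~18-word sentences") instead of vague imitation.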

Writing Types & Turnaround

  • πŸ“§ Emails β€” fast, context-loaded prompts, conversational β€” 30–60s
  • πŸ“„ Reports β€” longer context, outline-first, multiple iterations β€” 10–30min
  • ✍️ Blog posts β€” flexible, engaging opening, strong conclusions β€” 5–15min
  • πŸ“– Documentation β€” highly structured, code examples, wikilinks β€” 10–20min
  • πŸŽ₯ Presentations β€” Marp format, slide notes in YAML frontmatter β€” 5–15min

Obsidian Vault Structure

  • vault/drafts/ β€” Active work-in-progress, frequently git-committed
  • vault/writing/final/ β€” Polished, approved documents
  • vault/templates/ β€” Per-document-type Templater templates
  • vault/context/ β€” Background knowledge notes
  • vault/style/ β€” Voice corpus: 5–10 representative docs

Required Tools

  • Claude CLI
  • Templater (required)
  • Various Complements β€” auto-completion
  • Obsidian Git (required)
  • Local REST API + Obsidian MCP

Key Challenges

  • MCP connection fragility: Local REST API plugin must be running in Obsidian β€” if closed, connection fails
  • Context window limits: Can't read entire vault β€” discipline to load only relevant context notes
  • Voice matching is approximate: Claude extracts patterns but subtle personal voice needs periodic correction/feedback
  • Export to DOCX/PDF: Requires Pandoc/Marp CLI β€” not built-in

πŸ“Š

BI Dashboard + Analyst

Live Trino/Snowflake SQL β†’ trend analysis β†’ Obsidian notes with charts and anomaly alerts

Setup: Medium-High Β· Time to first result: 1–2hr

What It Is

Claude Code runs as a headless BI analyst loop β€” on demand or on a schedule, it executes SQL against Trino/Snowflake, performs trend analysis, anomaly detection, and forecasting, then writes rich Obsidian notes with tables, trend indicators, inline charts, and Dataview queries β€” creating a living BI dashboard that's browsable, searchable, and annotatable.

Replaces expensive BI tooling with a flexible, text-first, LLM-powered analyst that writes its own queries and generates its own reports. Obsidian becomes a personal/team knowledge base that also happens to be a live BI surface.

Agentic Loop β€” 9 Steps

β‘  Receive directive (trend/anomaly/forecast/scheduled report)
β‘‘ Identify data source(s) and table(s) from schema registry
β‘’ Write SQL (or generate with LLM + schema context)
β‘£ Execute via trino-cli / snowsql; capture CSV/JSON
β‘€ Analyze in context: z-score, moving average, seasonality
β‘₯ Render: tables, trend arrows (β–²β–Όβž€), Mermaid or Chart.js
⑦ Write/update Obsidian note at vault/metrics/
⑧ If anomaly β†’ tag #alert, write alert note, notify
⑨ Log source query in frontmatter for full provenance

Core Prompts

  • Trend analysis: "Query last N days of metric_column grouped by group_by. Return Markdown table with trend arrows + 7-day moving average. Write Obsidian note with Mermaid line chart."
  • Anomaly detection: "Compute z-score vs mean/stddev. Flag rows where |z| > 2.5. Write alert note at vault/alerts/ if anomalies found."
  • Forecast: "Using last 90 days of metric data, perform 14-day linear trend forecast. Overlay on chart. Write note with confidence intervals."
  • Data storytelling: "Latest 30-day results from vault/metrics/. Write narrative briefing: headline findings β†’ drill-down β†’ flag anything needing attention β†’ next steps."

Chart Rendering β€” 6-Tier Strategy

  • 🟒 Mermaid β€” Low β€” simple lines, bars; no plugin needed
  • 🟑 Grafana iframe β€” Low-Med β€” interactive dashboards; embed via ![[chart.html]]
  • 🟑 obsidian-charts plugin β€” Medium β€” clean chart syntax, JSON/CSV data in note
  • 🟑 Chart.js HTML embed β€” Medium β€” rich charts, animations, tooltips
  • πŸ”΄ matplotlib PNG β€” Medium β€” publication-quality static charts
  • πŸ”΄ Raw SVG β€” Low β€” LLM-generated, self-contained, animatable

Anomaly Detection Methods

  • Z-score: (value - mean) / stddev over 30-day rolling window. Flag if |z| > 2.5.
  • IQR: Flag values outside Q1 - 1.5Γ—IQR and Q3 + 1.5Γ—IQR. Better for skewed distributions.
  • % change: Flag if today's value deviates >X% from 7-day or 30-day moving average.
  • Seasonality-adjusted: Compare against same-day-of-week rolling average to avoid false alerts on weekly cycles.
  • MA crossover: Short-window MA crosses long-window MA = classic trend change signal.

Alert suppression: after firing, suppress repeats for 24h (same metric) to avoid alert fatigue.

12 Common Business Metrics

  • DAU β€” Daily Active Users
  • MRR β€” Monthly Recurring Revenue
  • ARPU β€” Revenue per User
  • Conversion Rate β€” %
  • CAC β€” Customer Acquisition Cost
  • Churn Rate β€” % monthly
  • NPS β€” Net Promoter Score
  • AOV β€” Average Order Value
  • Infra Cost Ratio β€” infra cost / revenue
  • Error Rate β€” API errors / total requests
  • P99 Latency β€” milliseconds
  • Queue Depth β€” messages pending

Cron Schedules

  • 0 6 * * * β€” Morning brief: key metrics yesterday β†’ vault/metrics/YYYY/MM/daily-YYYY-MM-DD.md
  • 0 7 * * 1 β€” Weekly rollup: 7-day week-over-week β†’ vault/reports/weekly-YYYY-WXX.md
  • 0 6 1 * * β€” Monthly close: full month narrative β†’ vault/reports/monthly-YYYY-MM.md
  • */15 * * * * β€” Ops: queue depth, error rate, P99 β†’ vault/ops/YYYY-MM-DD-HHMM.md
  • 0 8 * * * β€” 14-day forecast refresh on revenue β†’ vault/forecasts/forecast-YYYY-MM-DD.md

Required Tools

  • Claude Code CLI
  • Trino CLI
  • SnowSQL
  • Dataview + Templater + Calendar
  • Grafana (optional, for chart iframes)
  • Python + pandas + matplotlib (optional)
  • cron / systemd timer
⚠️ LLM hallucinating metric values: Always include raw SQL + result set in note frontmatter. Make reproducibility first-class. Never let Claude summarize without anchoring to actual values.

Key Challenges

  • Vault growth: Implement retention policy β€” archive notes >12 months to separate vault
  • Trino timeouts: Set CLI timeout (120s); prefer pre-aggregated tables; log and retry once
  • Snowflake credits: Use QUERY RESULT CACHE; prefer batch aggregation over ad-hoc per metric
  • Alert fatigue: Seasonality-adjust thresholds; implement cooldown; require consecutive breaches
  • No live refresh: Dataview re-runs on note open; use Auto Refresh plugin or Templater journal creation

πŸ“š

Learning System

Papers β†’ summaries β†’ flashcards β†’ spaced repetition β€” all in Obsidian

Setup: Medium Β· Time to first result: 1–2hr

What It Is

Claude reads papers (PDF), articles (URLs), and books; generates structured summaries; extracts key concepts; creates flashcards and Socratic questions; and writes them into Obsidian organized by topic. An SRS plugin (SM-2 or FSRS algorithm) schedules flashcard reviews and tracks retention β€” a closed-loop learning system where AI both ingests knowledge and generates review material.

Transforms passive reading into active learning with zero extra effort. The knowledge graph means concepts link across sources β€” building a personal wiki of learned material.

Agentic Loop β€” 7 Steps

β‘  Ingest: PDF (pdftotext), URL (web_fetch), arXiv, local text
β‘‘ Summarize: abstract, key findings, methodology, implications
β‘’ Extract concepts: 10–20 atomic facts, definitions, relationships
β‘£ Generate flashcards: basic Q/A, cloze deletion, multi-modal
β‘€ Socratic questions: reasoning over recall
β‘₯ Write to Obsidian: source note + concept notes + flashcard note
⑦ Schedule review: SM-2 or FSRS intervals, next-review dates

Flashcard Formats

  • Basic Q/A β€” "Q: [question] / A: [answer]" β€” best for definitions, facts
  • Cloze deletion β€” {{c1::hidden answer}} in context β€” best for formulas, fill-in-blank
  • Multi-modal β€” image/diagram + explanation β€” best for anatomy, circuits, maps
  • Reversed cards β€” auto-generate both directions β€” forces deeper understanding
  • Socratic β€” question + reasoning path + application β€” requires reasoning, not recall

Spaced Repetition Algorithms

SM-2 (SuperMemo 2)
Classic algorithm. Ease factor starts at 2.5. Intervals: 1d β†’ 6d β†’ interval Γ— EF. Quality ratings 0–5. Reliable, well-understood.
FSRS (Free Spaced Repetition Scheduler)
Modern, machine-learning based. Models each card's difficulty, stability, and retrievability instead of a single ease factor. Better interval predictions and a lower lapse rate than SM-2. Available in Anki 23.10+ and Obsidian SRS plugins.

Review cycles: limit new cards/day (20–30) to avoid overwhelm. Review all due cards before adding new ones.
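The SM-2 step is small enough to inline in the agent and apply to each flashcard's frontmatter. A sketch of the standard update (ease-factor formula from SuperMemo's published algorithm, with the usual 1.3 floor); handling of failed reviews varies between implementations.

```python
def sm2_update(quality: int, ease: float, interval: int, reps: int):
    """One SM-2 review step: intervals 1d -> 6d -> interval x EF,
    ease adjusted by the 0-5 quality rating.
    Returns (new_ease, new_interval_days, new_reps)."""
    if quality < 3:            # failed recall: restart the schedule
        return ease, 1, 0
    if reps == 0:
        interval = 1
    elif reps == 1:
        interval = 6
    else:
        interval = round(interval * ease)
    # EF' = EF + (0.1 - (5-q) * (0.08 + (5-q) * 0.02)), floored at 1.3
    ease = max(1.3, ease + 0.1 - (5 - quality) * (0.08 + (5 - quality) * 0.02))
    return ease, interval, reps + 1
```

The agent writes the new `interval`, `rating` (ease), and `next-review` date back into the card's frontmatter after each review.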

Obsidian Vault Structure

  • sources/papers/ β€” Paper summaries by subject
  • sources/articles/ β€” Article summaries by subject
  • concepts/ β€” Atomic concept notes, linked to sources
  • flashcards/ β€” Batch flashcards by date
  • daily/ β€” Daily review logs
  • templates/ β€” Source, concept, flashcard templates

Frontmatter Schema

type: source|concept|flashcard
source-type: paper|article|book|video
title:
author:
date:
subject:
summary: (for sources)
concepts: [tag1, tag2]
read: false  (sources)
review-count: 0
last-review:
next-review:
rating: 0  (SM-2 ease Γ— 100)
interval: 1  (days)
source-url:

Dataview Queries

  • Due today: TABLE WHERE next-review <= date(today) AND type = "flashcard"
  • Overdue: TABLE WHERE next-review < date(today)
  • Retention stats: TABLE avg(rating), count() GROUP BY subject
  • By subject: TABLE key, value FROM "concepts" WHERE subject = "ml"

Key Plugins

  • Templater
  • Dataview
  • obsidian-spaced-repetition
  • Natural Language Dates
  • PDF Highlights
  • pdftotext / pdfminer (optional)

Key Challenges

  • Flashcard quality: AI-generated cards can be superficial β€” always review before scheduling
  • Long papers: Process in chunks (abstract+intro, methods, results, conclusion) β€” then merge
  • SRS plugin quality: Community plugins vary β€” test multiple; check recent commit activity
  • Review overwhelm: Start strict (10–15 new cards/day); quality over quantity
  • Math notation: Use LaTeX in Obsidian (native support); cloze cards work well for formulas

Quick Comparison β€” All 8 Use Cases

Use Case β€” Setup Β· Time to Result Β· Key Data Source β†’ Output

  • 🧠 Knowledge RAG β€” Medium Β· 30min–2hr Β· Obsidian vault β†’ linked atomic notes
  • πŸ—„οΈ Data Agent β€” Medium Β· 2–4hr Β· Trino + Snowflake β†’ SQL + results + schema docs
  • 🌐 Research β€” Medium Β· 15–30min Β· Web search β†’ linked topic notes + citations
  • πŸ’» Codebase β€” Medium Β· 15–30min Β· Local codebase β†’ docs + ADRs + architecture
  • πŸ“‹ Meetings β€” Medium Β· 30–60min Β· Transcripts / audio β†’ decisions + task graph
  • ✍️ Writing β€” Medium Β· 1–2hr Β· Obsidian vault β†’ drafts + final documents
  • πŸ“Š BI Dashboard β€” Medium-High Β· 1–2hr Β· Trino + Snowflake β†’ metric notes + alerts + forecasts
  • πŸ“š Learning β€” Medium Β· 1–2hr Β· PDFs / URLs / papers β†’ summaries + flashcards + SRS

Common Threads Across All 8 Use Cases