Turn Any Markdown Vault into a Knowledge Graph
Markdown is the universal note-taking format. Obsidian, Foam, Logseq, Hugo, Jekyll, Quartz, plain folders synced via Syncthing — all of them ultimately store .md files in directories on disk. KnodeGraph walks any such directory tree, parses YAML frontmatter as structured node attributes, resolves wiki-style `[[links]]` and standard Markdown links into typed graph edges, recognises daily-notes patterns, and runs Claude-powered extraction across the prose. Bring whichever flavour of Markdown PKM you actually use; the graph comes out the same.
Why connect Markdown Vault to KnodeGraph
- John Gruber created Markdown in 2004, with input from Aaron Swartz. Twenty years on, it's a dominant plain-text format for technical writing, PKM, and static-site authoring.
- Vault structure varies: Obsidian uses flat or nested folders with `.obsidian/` config; Foam wraps VS Code with `.foam/` metadata; Logseq uses a `pages/` + `journals/` split; Hugo and Jekyll use `content/` and `_posts/` conventions. KnodeGraph's directory-walking ingest handles all of them.
- YAML frontmatter (the `---`-delimited block at the top of many Markdown notes) parses into node attributes — `tags:`, `aliases:`, `created:`, `status:`, custom fields all become queryable.
- Wiki-link parsing covers the `[[Page Name]]` syntax (Obsidian, Foam) and the `[[Page Name|Display Text]]` alias variant; Logseq's `[[]]` and `((block-id))` references both resolve.
- Backlink resolution: KnodeGraph builds the bidirectional link graph during ingest, the same way Obsidian and Foam do at runtime — so every note that links to 'Bayesian Inference' becomes a backlink edge in the graph.
- Daily-notes patterns (filenames like `2024-09-15.md` or `2024/09/15.md`) are detected automatically and flattened into per-date nodes, regardless of which tool created the convention.
- Tool-specific quirks handled: Obsidian's `embed` syntax (`![[Page]]`), Foam's templates, Logseq's bullet-block hierarchy with block IDs, Quartz's transclusion.
How it works end-to-end
1. Point KnodeGraph at any Markdown directory
Zip the vault folder and upload. KnodeGraph walks the directory tree recursively, picks up every .md file, and builds an index of file paths → page names → content. Hidden directories (`.obsidian/`, `.foam/`, `.git/`, `.logseq/`) are skipped automatically — KnodeGraph reads the content, not the tool's internal config.
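As an illustration only, the walk described above might look like the following Python sketch. The function name `build_index` and the shape of the returned dictionary are assumptions made for this example, not KnodeGraph's actual internals.

```python
import os
from pathlib import Path

MARKDOWN_SUFFIXES = {".md", ".markdown"}

def build_index(root):
    """Map each Markdown file's relative path to its page name and content.

    Hypothetical helper: the name and return shape are illustrative.
    """
    root = Path(root)
    index = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune hidden directories (.obsidian/, .foam/, .git/, ...) in place
        # so os.walk never descends into them.
        dirnames[:] = sorted(d for d in dirnames if not d.startswith("."))
        for name in sorted(filenames):
            path = Path(dirpath) / name
            if path.suffix.lower() not in MARKDOWN_SUFFIXES:
                continue
            rel = path.relative_to(root).as_posix()
            index[rel] = {
                "page_name": path.stem,
                "content": path.read_text(encoding="utf-8", errors="replace"),
            }
    return index
```

Calling `build_index("my-vault")` on an unzipped vault would return one entry per note, keyed by its path relative to the vault root.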
2. Parse frontmatter and links
YAML frontmatter at the top of each note (`---` delimited) parses into a structured attribute set on the resulting node. Wiki-links `[[…]]` and standard Markdown links `[text](path)` resolve to typed edges. Aliases (Obsidian's `aliases:` frontmatter field) feed the link resolver so `[[NYC]]` and `[[New York City]]` find the same target node.
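A rough sketch of this step, using only the Python standard library. The frontmatter parser here is deliberately simplified (flat `key: value` pairs and inline `[a, b]` lists only); a real ingest would use a full YAML parser. The function names and the link dictionary shape are assumptions for illustration.

```python
import re

FRONTMATTER_RE = re.compile(r"\A---[ \t]*\n(.*?)\n---[ \t]*(?:\n|\Z)", re.DOTALL)
# Optional leading "!" marks an embed; "#heading" suffixes are dropped; "|text" is the display alias.
WIKI_LINK_RE = re.compile(r"(!?)\[\[([^\[\]|#]+)(?:#[^\[\]|]*)?(?:\|([^\[\]]*))?\]\]")
# Standard Markdown links, excluding image syntax that starts with "!".
MD_LINK_RE = re.compile(r"(?<!!)\[([^\]]*)\]\(([^)\s]+)\)")

def split_frontmatter(text):
    """Return (attributes, body). Simplified: not a full YAML parser."""
    m = FRONTMATTER_RE.match(text)
    if not m:
        return {}, text
    attrs = {}
    for line in m.group(1).splitlines():
        key, sep, value = line.partition(":")
        # Skip nested, list-item, and comment lines that this sketch does not handle.
        if not sep or not key.strip() or line.startswith((" ", "-", "#")):
            continue
        value = value.strip()
        if value.startswith("[") and value.endswith("]"):
            attrs[key.strip()] = [v.strip().strip("'\"") for v in value[1:-1].split(",") if v.strip()]
        else:
            attrs[key.strip()] = value.strip("'\"")
    return attrs, text[m.end():]

def extract_links(body):
    """Collect wiki-links, embeds, and standard Markdown links from a note body."""
    links = []
    for m in WIKI_LINK_RE.finditer(body):
        embed, target, display = m.groups()
        links.append({"kind": "embed" if embed else "wiki",
                      "target": target.strip(), "display": display})
    for m in MD_LINK_RE.finditer(body):
        links.append({"kind": "markdown", "target": m.group(2), "display": m.group(1)})
    return links
```

The attributes returned by `split_frontmatter` would become node attributes, and each entry from `extract_links` would become a candidate edge once its target is resolved.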
3. Detect the variant
KnodeGraph auto-detects whether you're using Obsidian (`.obsidian/`), Foam (`.foam/`), Logseq (`logseq/` and bullet-block syntax), Hugo (`content/` + Hugo frontmatter), or a plain vault. The detection biases the ingest defaults — daily-notes folder location, link syntax preference, embed handling — but you can override at upload time.
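A minimal sketch of marker-based detection, assuming the check is a simple probe for the directories named above. The function name, the precedence order, and the `override` parameter are hypothetical, and the Hugo check is simplified to the presence of `content/` (it does not inspect frontmatter).

```python
from pathlib import Path

# Checked in order; the first marker directory found wins (assumed precedence).
MARKERS = [
    ("obsidian", ".obsidian"),
    ("foam", ".foam"),
    ("logseq", "logseq"),
    ("hugo", "content"),
]

def detect_variant(root, override=None):
    """Guess the vault flavour from marker directories, unless the caller overrides it."""
    if override is not None:
        return override
    root = Path(root)
    for variant, marker in MARKERS:
        if (root / marker).is_dir():
            return variant
    return "plain"
```

The returned label would then select ingest defaults such as the daily-notes folder or embed handling.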
4. Pick a PKM template
'Personal Knowledge' for general PKM and zettelkasten patterns. 'Research Notes' for academic/literature vaults. 'Daily Journal' for vaults dominated by daily notes. 'Engineering Wiki' for technical-team Markdown wikis. Templates bias Claude's extraction toward the entity types that actually appear in each style.
5. Walk the structured graph
Wiki-links become 'links_to' edges automatically. Frontmatter properties become attributes. Claude-extracted prose entities (people, concepts, sources, theories, decisions) layer typed relationships on top — so the graph captures both the explicit links you authored and the implicit relationships in the text.
6. Refresh on whatever cadence fits
Re-zip and re-upload weekly, monthly, or on-demand. KnodeGraph dedupes by file path and note title; a file watcher integration is on the roadmap for live sync. For most PKM use cases, weekly re-ingest is plenty.
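To make the dedupe behaviour concrete, here is one way a re-ingest merge could work. It assumes the dedupe key is the pair (file path, note title); whether KnodeGraph combines the two this way is not stated above, and `merge_reingest` is a hypothetical name.

```python
def merge_reingest(existing, incoming):
    """Merge a fresh upload into the existing note set.

    Both arguments map (file_path, title) -> note dict (assumed key shape).
    Returns the merged mapping plus simple counters.
    """
    merged = dict(existing)
    stats = {"added": 0, "updated": 0, "unchanged": 0}
    for key, note in incoming.items():
        if key not in merged:
            stats["added"] += 1
        elif merged[key] != note:
            stats["updated"] += 1
        else:
            stats["unchanged"] += 1
        merged[key] = note
    return merged, stats
```

Under this scheme, re-uploading an unchanged vault adds no duplicate nodes; only new or edited notes change the graph.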
Why KnodeGraph is a good fit
- Tool-agnostic ingest: works with whichever Markdown PKM you actually use, without locking you in.
- Wiki-link resolution plus prose extraction means both your explicit links and your implicit topical connections show up in the graph.
- Frontmatter as queryable attributes: Dataview-style queries you'd write in Obsidian map cleanly to graph traversals here.
- Daily-notes pattern recognition handles the temporal dimension: find every note from a specific week, or every concept that first appeared in March 2024.
- Self-hosted plan honours the local-first ethos most Markdown users care about.
- 100+ language support, useful for polyglot vaults (mixed English/Arabic/Mandarin notes are common in international academic and journalism work).
Supported formats
- CommonMark / GitHub Flavored Markdown (.md, .markdown)
- Obsidian vaults (with [[wiki-links]], embeds `![[…]]`, and aliases)
- Foam workspaces (VS Code, with `.foam/` metadata — auto-skipped during walk)
- Logseq graphs (with bullet-block hierarchy, block IDs `((…))`, and `pages/` + `journals/` split)
- Quartz, Hugo, and Jekyll static-site `content/` directories
- Plain hand-rolled Markdown folders synced via Syncthing, Dropbox, iCloud, etc.
- YAML frontmatter (parsed into node attributes — tags, aliases, status, created, custom fields)
- Daily notes (filenames like `2024-09-15.md`, `2024/09/15.md`, or `journals/2024_09_15.md`)
Limitations to know up front
- No live file-watcher today — re-ingest is via re-zip-and-upload. A directory-watching daemon for the self-hosted plan is on the roadmap.
- Tool-specific plugin output (Obsidian's Dataview query results, Logseq's query blocks, Foam's templates) renders only at runtime in the source tool — KnodeGraph sees the raw query source, not the rendered table.
- Logseq's outline/bullet-block model is partially flattened: block IDs are preserved as anchors but the deep nesting tree is collapsed into linear text per page. For Logseq users who rely heavily on block references, this is a known limitation.
- Canvas files (Obsidian's `.canvas`, Foam's whiteboards) and Excalidraw drawings aren't ingested — Markdown only.
- No two-way write-back: KnodeGraph never edits files in your vault. Edits in the graph stay in KnodeGraph's database.
- Image embeds (standard Markdown image syntax) capture only the image reference; the image content is not analysed unless OCR is explicitly enabled at ingest time.
Frequently Asked Questions
Why a separate 'Markdown Vault' integration when you already have one for Obsidian?
Because not everyone uses Obsidian. Foam, Logseq, Quartz, Hugo, Jekyll, and plain hand-rolled Markdown folders all share the same underlying format — .md files in directories — but each has slightly different conventions around frontmatter, wiki-link syntax, daily notes, and tool-specific config folders. The 'Markdown Vault' integration is the tool-agnostic version that handles whichever variant you happen to use, including ones we haven't named explicitly. The Obsidian integration is a tighter, opinionated path for Obsidian-specific conventions like canvas placeholders and embed syntax.
How does Logseq differ from Obsidian for ingest?
Logseq's outline/bullet-block model is the main difference. Where Obsidian treats each .md file as a long-form document, Logseq treats each file as a tree of bullet blocks where every block can be referenced individually via a `((block-id))` syntax. KnodeGraph preserves the block IDs as anchors but collapses the nested-bullet structure into linear text per page. For most extraction workflows that's fine — the entity-relationship value is in the prose, not the bullet hierarchy. Heavy block-reference users may want to denormalise their Logseq graph to flatter pages before ingest.
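The flattening described above can be sketched as follows. This assumes Logseq's Markdown convention of storing a block's ID as an `id:: <uuid>` property line directly under the block; the function name and the anchor representation (block ID mapped to an index in the flattened lines) are illustrative choices, not KnodeGraph's actual output format.

```python
import re

BULLET_RE = re.compile(r"^\s*-\s+(.*)$")
BLOCK_ID_RE = re.compile(r"^\s*id::\s*(\S+)\s*$")

def flatten_logseq_page(text):
    """Collapse nested bullets into linear text, keeping block IDs as anchors.

    Returns (linear_text, anchors) where anchors maps a block ID to the
    index of its line in the flattened output.
    """
    lines = []
    anchors = {}
    for raw in text.splitlines():
        id_match = BLOCK_ID_RE.match(raw)
        if id_match:
            # The id:: property belongs to the most recent block.
            if lines:
                anchors[id_match.group(1)] = len(lines) - 1
            continue
        bullet = BULLET_RE.match(raw)
        content = bullet.group(1) if bullet else raw.strip()
        if content:
            lines.append(content)
    return "\n".join(lines), anchors
```

The nesting depth is discarded, which is exactly the trade-off noted above: `((block-id))` references can still be pointed at a line, but parent/child relationships between bullets are lost.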
Can it ingest a Hugo or Jekyll content folder for SEO research?
Yes, and it's a useful pattern. Point KnodeGraph at your `content/` (Hugo) or `_posts/` (Jekyll) directory and treat your published-content corpus as a graph. Hugo's frontmatter (`title`, `tags`, `categories`, `date`, custom params) parses into node attributes; the body Markdown extracts entities and relationships normally. Useful for content-cluster analysis, internal-linking audits, and finding gaps in topical coverage across hundreds of blog posts.
What about daily-notes patterns specifically?
KnodeGraph auto-detects the most common daily-notes filename patterns: `2024-09-15.md`, `2024/09/15.md`, `journals/2024_09_15.md` (Logseq style), and a few variants. Detected daily notes get a `note_type: daily` attribute plus a parsed `date` attribute, which makes temporal queries possible — 'show every concept that first appeared in March', 'every person mentioned in the last 60 days of daily notes'. If your daily-notes pattern is non-standard, you can declare it explicitly at upload time via a regex.
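As a sketch of how this detection could work, the patterns named above translate into three regular expressions. The function name `daily_note_attrs` and the `extra_pattern` parameter are hypothetical; the custom regex is assumed to expose year, month, and day as its first three capture groups.

```python
import re
from datetime import date

DAILY_PATTERNS = [
    re.compile(r"(?:^|/)(\d{4})-(\d{2})-(\d{2})\.md$"),          # 2024-09-15.md
    re.compile(r"(?:^|/)(\d{4})/(\d{2})/(\d{2})\.md$"),          # 2024/09/15.md
    re.compile(r"(?:^|/)journals/(\d{4})_(\d{2})_(\d{2})\.md$"),  # Logseq style
]

def daily_note_attrs(rel_path, extra_pattern=None):
    """Return daily-note attributes for a vault-relative path, or None."""
    patterns = list(DAILY_PATTERNS)
    if extra_pattern:
        # User-declared pattern; must capture year, month, day in that order.
        patterns.append(re.compile(extra_pattern))
    for pattern in patterns:
        m = pattern.search(rel_path)
        if not m:
            continue
        try:
            parsed = date(int(m.group(1)), int(m.group(2)), int(m.group(3)))
        except ValueError:
            continue  # Looks like a date but is not a valid one.
        return {"note_type": "daily", "date": parsed.isoformat()}
    return None
```

Storing the date in ISO format keeps temporal queries simple, since string comparison and date comparison agree.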
How do you handle conflicts between wiki-link aliases and actual page names?
KnodeGraph runs a two-pass link resolver. First pass: build a name → file map from filenames and any `aliases:` frontmatter. Second pass: resolve every `[[link]]` to the target file using that map; unresolved links become 'pending' nodes (so a link to a not-yet-written note still creates a placeholder, the same way Obsidian does). If a link is genuinely ambiguous (two notes with the same alias), KnodeGraph picks the closer file by directory proximity and flags the conflict in the staging UI for manual resolution. In practice this rarely happens in well-curated vaults.
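The two passes can be sketched roughly as below. Several details are assumptions made for the example: names are matched case-insensitively, "directory proximity" is measured as the number of shared leading directory components, and pending targets are represented with a `pending:` prefix. Only the `links_to` edge label comes from the description above.

```python
import posixpath

def build_name_map(notes):
    """Pass 1: notes maps rel_path -> attrs; returns lowercase name -> list of paths."""
    name_map = {}
    for path, attrs in notes.items():
        stem = posixpath.splitext(posixpath.basename(path))[0]
        for name in [stem, *attrs.get("aliases", [])]:
            paths = name_map.setdefault(name.lower(), [])
            if path not in paths:
                paths.append(path)
    return name_map

def _distance(source, candidate):
    """Sort key: more shared leading directories is closer (assumed metric)."""
    src = [p for p in posixpath.dirname(source).split("/") if p]
    cand = [p for p in posixpath.dirname(candidate).split("/") if p]
    shared = 0
    for a, b in zip(src, cand):
        if a != b:
            break
        shared += 1
    return (-shared, len(cand) - shared, candidate)

def resolve_links(notes, links):
    """Pass 2: links is a list of (source_path, target_name) pairs."""
    name_map = build_name_map(notes)
    edges, pending, conflicts = [], set(), []
    for source, target in links:
        candidates = name_map.get(target.lower(), [])
        if not candidates:
            # Unresolved link: keep a placeholder node for the unwritten note.
            pending.add(target)
            edges.append((source, "links_to", f"pending:{target}"))
            continue
        if len(candidates) > 1:
            # Ambiguous: record the conflict for manual review.
            conflicts.append((source, target, sorted(candidates)))
        best = min(candidates, key=lambda c: _distance(source, c))
        edges.append((source, "links_to", best))
    return edges, pending, conflicts
```

Returning the conflicts separately mirrors the staging-UI flag: the resolver still picks a target so the graph stays connected, but the ambiguous choice remains visible for correction.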
Is there an offline mode for fully air-gapped vaults?
The hosted SaaS requires a Claude API call-out for extraction. The self-hosted plan can run with a self-hosted LLM (Ollama, vLLM, or any OpenAI-compatible inference endpoint) for fully air-gapped operation. Extraction quality is somewhat lower than with Claude, because open-source models lag on entity-relationship extraction, but for users whose threat model rules out external network calls, this is the right path. Most Markdown-vault power users we've worked with land on either the hosted SaaS or the self-hosted plan with a personal Anthropic key; the latter keeps the vault and graph on their own infrastructure while still using Claude for extraction.
Connect Markdown Vault to KnodeGraph
Start free with 3 graphs and 100 nodes. Upgrade to Pro for AI extraction, unlimited graphs, and 50K nodes.