Research influences

A bibliography: the personal knowledge management (PKM) lineage, the 2023–2025 multi-agent literature, and the capability-system papers behind specific design decisions.

This is a bibliography rather than a manifesto. The point is to make it easy to trace specific Egghead design choices back to the prior art they came from, so that when something looks unusual you can read the source and decide whether you agree with the call.

Three lineages are worth knowing about, in roughly the order they matter to the system: the personal-knowledge-management tradition that gave Egghead its records-first shape, the 2023–2025 multi-agent research that pointed toward graph topology and capability-scoped roles, and the capability-systems literature that shaped the authority model.

Personal knowledge management

The idea that a knowledge graph should be atomic, linked, and durable is decades older than LLMs. Three strands of that tradition shape the Egghead store directly.

Zettelkasten

Niklas Luhmann was a 20th-century German sociologist who developed a note-taking method using index cards in wooden slip-boxes. Each card held one idea; each had a unique identifier; cards pointed at other cards by id. Luhmann accumulated roughly 90,000 cards over his career and credited the system with enabling his published output of more than fifty books and several hundred articles across multiple disciplines.

The core moves of the Zettelkasten method are atomicity (one idea per card), stable identifiers (so links don’t break when topics change), direct links between cards instead of folder hierarchies, and emergent sequencing (related cards cluster because they link to each other, not because they were catalogued together).

Egghead inherits all four. Records are atomic, with one idea per file. Records have stable ids that don’t change when you rename or move them. Links between records — links: and [[wikilinks]] — are the primary structure. Forward links and backlinks are both first-class queries. The directories where the files happen to sit on disk are mostly an organizational convenience.
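As a rough illustration of why forward links and backlinks can both be first-class queries, the sketch below builds both indexes from wikilinks in record bodies. It is not Egghead's implementation: the record ids, the in-memory `RECORDS` dict, and the function names are all hypothetical, and it assumes `[[...]]` links contain a bare record id.

```python
import re
from collections import defaultdict

# Hypothetical in-memory records: stable id -> body text.
# The ids and the assumption that [[...]] holds a bare id are illustrative only.
RECORDS = {
    "rec-001": "Atomic notes hold one idea. See [[rec-002]] and [[rec-003]].",
    "rec-002": "Stable ids survive renames. Related: [[rec-003]].",
    "rec-003": "Links, not folders, carry the structure.",
}

WIKILINK = re.compile(r"\[\[([^\[\]|]+)\]\]")


def forward_links(records):
    """Map each record id to the ids it links to, in order of first appearance."""
    out = {}
    for rid, body in records.items():
        seen = []
        for target in WIKILINK.findall(body):
            target = target.strip()
            if target not in seen:
                seen.append(target)
        out[rid] = seen
    return out


def backlinks(records):
    """Invert the forward index: map each id to the sorted ids that link to it."""
    inverted = defaultdict(set)
    for source, targets in forward_links(records).items():
        for target in targets:
            inverted[target].add(source)
    return {rid: sorted(inverted.get(rid, ())) for rid in records}
```

Because both indexes are keyed by id rather than by path, moving or renaming a file would not change either result in this sketch.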

  • Niklas Luhmann, Kommunikation mit Zettelkästen, 1981.
  • Sönke Ahrens, How to Take Smart Notes, 2017 — the most common English entry point.

Building a Second Brain

Tiago Forte’s 2022 book popularized a more project-oriented take on personal knowledge management. The two acronyms worth naming are CODE (Capture, Organize, Distill, Express), which describes how knowledge moves from in-the-world to in-your-head to in-your-output, and PARA (Projects, Areas, Resources, Archives), which is a taxonomy for organizing what you store.

Egghead’s record classes — durable, inbox, transcript, deliberation — overlap the Capture and Organize parts of CODE without prescribing the rest. We deliberately do not enforce a PARA-style folder taxonomy because Egghead is meant to live alongside whatever you already use, and the folder layout is yours to choose.

Evergreen notes

Andy Matuschak’s publicly readable working notes at notes.andymatuschak.org introduced the term “evergreen” for notes that develop iteratively, get edited continuously, and become more refined over time rather than being captured once and abandoned. Egghead’s class: durable records are the closest counterpart.

Multi-agent systems (2023–2025)

The multi-agent literature is moving fast enough that anything written in early 2026 risks being out of date by the time you read it. The papers below are the ones whose findings landed in specific Egghead choices.

MultiAgentBench (MARBLE)

Zhu et al., MultiAgentBench: Evaluating the Collaboration and Competition of LLM Agents, ACL 2025 (arXiv:2503.01935).

MARBLE was the first broad evaluation framework for multi-agent collaboration. Egghead’s egghead eval command ports the milestone-based KPI scoring, the communication and planning ratings, and the category-specific dimensions (Innovation/Safety/Feasibility for research, buyer/seller scoring for bargaining). MARBLE’s reference research personas ship as the --roster task mode. The full mapping is in Evals.

The headline MARBLE finding that Egghead leans on is that graph topology out-performs star topology on research tasks when the coordination model is shared-transcript rather than dispatcher-routed. The chat-room shape in Egghead is the direct result.

MAST: a taxonomy of multi-agent failure

Cemri et al., Why Do Multi-Agent LLM Systems Fail?, 2025 (arXiv:2503.13657). The paper introduces MAST, the Multi-Agent System Failure Taxonomy.

An empirical study of where multi-agent systems break in practice. The headline finding is that most multi-agent failures are architectural, not model-level. Adding a stronger LLM rarely fixes a system that is failing at coordination, role separation, or context handling — those are infrastructure problems, not intelligence problems.

This is the paper whose conclusions justify investing in supervision trees, capability-scoped roles, and explicit handoff. If the failure mode is architectural, treating failure as a runtime primitive is the response, which is exactly what OTP gives you.

Talk Isn’t Always Cheap

Bhatt et al., Talk Isn’t Always Cheap: Conformity in LLM Multi-Agent Systems, 2025 (arXiv:2412.10859).

A study of what happens when multiple LLM agents discuss a question. The result, briefly: even capable models flip from a correct position to an incorrect one under persuasive-but-wrong peer pressure. Persona prompts help slightly; capability differentiation — different agents literally cannot perform each other’s roles — helps more, because role separation enforced by the runtime is harder to talk around than role separation enforced by a prompt.

Egghead’s capability model makes that role separation structural. An agent with records.read cannot suddenly perform a write because a peer agent argued it into doing so; the runtime simply will not let it. That choice is partly a response to this paper.
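A minimal sketch of what "the runtime simply will not let it" can look like. This is an illustration, not Egghead's code: the `Runtime` and `Agent` classes are hypothetical, and `records.write` is an assumed verb name (the text above names only `records.read`).

```python
from dataclasses import dataclass


class CapabilityError(PermissionError):
    """Raised when an agent attempts a verb it was never granted."""


@dataclass(frozen=True)
class Agent:
    name: str
    capabilities: frozenset  # e.g. frozenset({"records.read"})


class Runtime:
    """Toy runtime: every operation passes through `perform`, which checks the grant."""

    def __init__(self):
        self.store = {}

    def perform(self, agent, verb, key, value=None):
        # The check reads the declared capability set, not anything the agent says.
        if verb not in agent.capabilities:
            raise CapabilityError(f"{agent.name} lacks {verb}")
        if verb == "records.read":
            return self.store.get(key)
        if verb == "records.write":  # assumed verb name, for illustration
            self.store[key] = value
            return value
        raise ValueError(f"unknown verb: {verb}")
```

In this sketch the check happens outside the model's reasoning loop, so no amount of peer persuasion changes its outcome; only the declared capability set does.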

Capability-based security

The authority model in Capabilities draws from three lineages. None of these were designed for LLM agents, which is exactly why they generalize well.

OpenBSD pledge and unveil

pledge(2) restricts a process to a list of allowed verbs; unveil(2) restricts the parts of the filesystem those verbs can touch. Egghead’s capabilities: field is the pledge equivalent — what the agent is allowed to do. Egghead’s sandbox: and in: fields are the unveil equivalent — where the agent is allowed to do it. Both lists can only narrow over time.

OpenBSD’s posture is also worth naming directly: 33 of OpenBSD’s 36 boot processes use pledge, while only 3 of 47 adopted FreeBSD’s Capsicum (below). Simplicity drives adoption, and we tried to keep the same posture: there is one capability declaration per record, and widening it is always a human edit, never a runtime “allow once” prompt.

FreeBSD Capsicum

Watson et al., Capsicum: Practical Capabilities for UNIX, USENIX Security 2010 (paper).

Capsicum’s contribution was granularity: rights attach to scoped resources, not just to verbs. A Capsicum capability is a verb plus a parameter scope, restrictions can only narrow, and narrowing is irreversible. This is the shape of Egghead’s scope vocabulary — net.get{hosts}, fs.read{in, paths}, proc.exec{in, cmds, patterns} — and the rule that scopes can only narrow.
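The narrowing rule can be sketched as a set intersection. The `Scoped` class below is a hypothetical model of "verb plus parameter scope", not Egghead's scope vocabulary or Capsicum's API, and it collapses a scope to a single set of allowed values (hosts, for instance) to keep the example short.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scoped:
    """A verb plus the set of parameter values it may be used with."""
    verb: str
    allowed: frozenset

    def narrow(self, requested):
        # Intersection can drop values but never add them, so the scope cannot widen.
        return Scoped(self.verb, self.allowed & frozenset(requested))

    def permits(self, verb, value):
        return verb == self.verb and value in self.allowed


# Example: narrowing net.get to a host list that includes one host never granted.
cap = Scoped("net.get", frozenset({"example.org", "example.com"}))
narrowed = cap.narrow({"example.org", "evil.test"})
```

Here `narrowed` permits only `example.org`: the host that was never in the original scope is silently dropped, and because the dataclass is frozen there is no method that restores `example.com` once it is gone.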

Google Macaroons

Birgisson et al., Macaroons: Cookies with Contextual Caveats for Decentralized Authorization in the Cloud, NDSS 2014 (paper).

Macaroons are authorization tokens with chained caveats: the holder can add restrictions without contacting the issuer, and subset-only delegation is provable without coordination. This is where Egghead’s agent.grant attenuation rule comes from: when one agent grants capabilities to another, the proposed grants must be a subset of the granter’s own. No agent can hand out authority it does not hold.
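For readers unfamiliar with the construction, here is a simplified sketch of the chained-HMAC idea behind first-party caveats in the Macaroons paper. It omits third-party caveats and serialization, the function names and caveat strings are made up for illustration, and nothing here implies that Egghead's agent.grant uses HMAC tokens; the source only says the subset rule was inspired by this design.

```python
import hashlib
import hmac


def _mac(key: bytes, msg: bytes) -> bytes:
    return hmac.new(key, msg, hashlib.sha256).digest()


def mint(root_key: bytes, identifier: str):
    """Issuer creates a token: (identifier, caveats, signature)."""
    return (identifier, (), _mac(root_key, identifier.encode()))


def attenuate(token, caveat: str):
    """Holder appends a caveat using only the current signature; no root key needed."""
    identifier, caveats, sig = token
    return (identifier, caveats + (caveat,), _mac(sig, caveat.encode()))


def verify(root_key: bytes, token, is_satisfied) -> bool:
    """Issuer replays the chain from the root key and checks every caveat."""
    identifier, caveats, sig = token
    expected = _mac(root_key, identifier.encode())
    for caveat in caveats:
        expected = _mac(expected, caveat.encode())
    return hmac.compare_digest(expected, sig) and all(is_satisfied(c) for c in caveats)
```

Each caveat is folded into the signature, so a holder can add restrictions but cannot strip one without invalidating the chain. That is the same one-way property the subset-only grant rule relies on.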

See also

  • Capabilities is where the security-paper lineage actually lives in code.
  • Evals is where the MARBLE scoring methodology lives.
  • Chat rooms is where the graph-topology shape MARBLE’s findings argue for is implemented.
  • Why Elixir/OTP is where the MAST conclusion (“failures are architectural”) drives the runtime choice.