Every young technology field eventually discovers that clever models are not enough. First come the demos. Then come the tools. Then comes the long, less glamorous phase in which everyone invents formats, protocols, connectors, metadata conventions, manifest files, registries, and compatibility layers. This is where artificial intelligence is now.
The current generation of large language models is powerful, but oddly helpless when isolated. A model can reason about a database schema, but it does not know your database schema. It can explain an incident runbook, but it cannot guess which runbook your operations team actually uses. It can write code against an API, but only if the API’s semantics, constraints, authentication model, and institutional folklore are available in the right form at the right moment.
That problem is usually summarized as “context”. The word is overused, but in this case it is accurate. Modern AI systems do not merely need data. They need the surrounding knowledge that makes data usable: definitions, ownership, caveats, joins, examples, current status, deprecation warnings, business meaning, policy constraints, and operational procedures.
This explains why the industry is suddenly producing new AI infrastructure standards. Anthropic’s Model Context Protocol, usually shortened to MCP, is one of the most visible examples. Google’s Open Knowledge Format , or OKF, is a newer move in the same broad direction, although it solves a different part of the problem.
MCP is best understood as a protocol for interaction. It defines a way for AI applications to connect to external systems through servers that expose resources, prompts, and tools. In practical terms, MCP tries to replace a world of bespoke integrations with a common interface. Instead of every AI client needing a custom GitHub connector, a custom database connector, a custom file connector, and a custom issue-tracker connector, MCP offers a shared architectural pattern: hosts, clients, servers, capabilities, and JSON-RPC messages.
This is useful because agents are not just search boxes. They increasingly perform sequences of actions. They read files, inspect issues, run commands, query databases, edit documents, and call APIs. Without a protocol boundary, every integration becomes a miniature software project with its own security model, permissions, transport choices, error behavior, and maintenance burden.
But MCP does not by itself solve the knowledge representation problem. It gives an AI system a way to reach into tools and systems. It does not define how an organization should package the meaning of its internal world so that humans, agents, catalogs, and search indexes can all consume it.
That is where OKF enters.
Google’s Open Knowledge Format is deliberately less ambitious at runtime and more ambitious as a knowledge artifact. OKF represents knowledge as a directory of Markdown files with YAML frontmatter. Each file describes a concept: a dataset, table, metric, API, runbook, playbook, or another unit of organizational knowledge. The file path becomes part of the concept’s identity. Markdown links turn the directory into a graph. YAML frontmatter carries structured fields such as type, title, description, resource, tags, and timestamp.
At first glance, that sounds almost disappointingly simple. This is precisely the point.
The interesting decision in OKF is not the syntax. Markdown plus YAML frontmatter is old, boring, and everywhere. It is readable in a terminal, editable in any text editor, renderable on GitHub, usable in static site generators, searchable with ordinary tools, and versionable in Git. That boringness is a design feature. The format does not require a new runtime, an SDK, a database service, or a vendor account. A bundle of OKF documents is just files.
This matters because organizational knowledge is usually trapped in incompatible places. Data catalogs contain schemas and ownership metadata. Wikis contain prose and tribal memory. Code comments contain implementation assumptions. Notebooks contain experimental reasoning. Incident systems contain operational scars. Senior engineers contain everything else until they leave.
AI agents make this fragmentation more visible. A human can compensate for missing structure by asking around, remembering past incidents, or knowing which wiki page is out of date. An AI system tends to either miss the context or retrieve fragments without understanding their authority. A neat demo becomes fragile when deployed into the mess of a real organization.
OKF’s wager is that much of this context can be treated like source code: written, reviewed, linked, diffed, versioned, generated, enriched, and consumed by many tools. A table description can live beside example queries. A metric can link to the tables it depends on. A runbook can link to the services and dashboards it mentions. An API document can point to its owning team, lifecycle status, and canonical resource.
This is not a knowledge graph in the heavyweight semantic-web sense. It is closer to a disciplined wiki for machines and humans. That distinction is important. Many attempts at formal knowledge representation fail because they demand too much ontology up front. OKF appears to make the opposite trade-off: require very little, allow local extension, and rely on links, files, and conventions to create enough structure for agents to navigate.
The comparison with MCP is therefore illuminating. MCP is a live connector protocol. OKF is a portable knowledge format. MCP answers: “How can an AI application safely talk to external tools and data sources?” OKF answers: “How can we package curated organizational context so that different agents and tools can read it without translation?”
In practice, the two ideas could complement each other. An MCP server might expose an OKF bundle as a resource. An agent might use MCP to retrieve documents and use OKF conventions to understand their structure. A data catalog might export OKF. A coding agent might read OKF from a repository before touching analytics code. A governance tool might inspect OKF metadata without needing access to the original production systems.
The risk, of course, is standard inflation. AI already has too many partially overlapping abstractions: tool schemas, function calling formats, agent manifests, memory stores, vector indexes, prompt files, skill bundles, connector registries, evaluation formats, and now knowledge formats. Some will survive. Many will not. The industry is still in the Cambrian phase, and every vendor has an incentive to label its preferred convention “open”.
That does not make OKF irrelevant. Quite the opposite. The most plausible standards are often not the most elegant ones, but the ones that fit existing habits. Markdown files in Git already sit close to how many technical teams work. YAML frontmatter is familiar from static-site tools and documentation systems. Cross-links are understandable to humans. A directory can be copied, reviewed, archived, indexed, or handed to another system without ceremony.
The security implications are also different from MCP. MCP exposes tools and potentially actions. That immediately raises questions about authorization, prompt injection, tool trust, user consent, and unintended execution. OKF is primarily representational. It can still leak sensitive information, encode stale assumptions, or smuggle malicious instructions into agent-readable prose, but its default posture is closer to documentation than remote control. The governance problem is less “may the agent execute this?” and more “is this knowledge accurate, current, authorized, and safe to use?”
For technically interested observers, OKF is therefore worth watching less as a Google product announcement and more as a signal. The next competitive layer in AI may not be model intelligence alone. It may be the quality of the context substrate around the model: the formats, conventions, repositories, catalogs, and review processes that make institutional knowledge legible to machines.
Models are becoming more capable. That makes bad context more expensive, not less. An incompetent model fails obviously. A capable model with wrong context fails persuasively.
The unglamorous future of AI may be directories full of Markdown, schemas with careful metadata, boring review workflows, and protocols that let agents ask for the right things without being allowed to do the wrong ones. That may sound less dramatic than artificial general intelligence. It is probably more useful.
No comments yet