Understanding TOON: A Guide to Token-Oriented Object Notation
How AI-Native Data Formatting Reduces Costs and Improves LLM Performance

In the ever-evolving landscape of data serialization formats, a new approach has emerged that's specifically designed for the age of AI and large language models: Token-Oriented Object Notation, or TOON. While JSON, XML, and YAML have dominated data interchange for decades, TOON represents a paradigm shift optimized for token-based processing and LLM efficiency.
What Is Token-Oriented Object Notation?
TOON is a data serialization format engineered to minimize token count when processed by language models. Unlike traditional formats that prioritize human readability or parsing simplicity, it focuses on efficiency—crucial in an era where API costs are calculated per token and context windows have limits.
The core philosophy: every character counts. Redundant syntax is eliminated, delimiters are compact, and structure aligns with how tokenizers process text.
Why TOON Matters
Token Economy: When you pay for API calls by the token, a 30% reduction in token count translates directly into cost savings and faster processing.
Context Window Optimization: More efficient encoding means fitting more meaningful data in each request.
LLM-Native Design: Built specifically for how transformers process text, not just for machines to parse or humans to read.
Reduced Latency: Fewer tokens enable faster transmission and processing for real-time applications.
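To make the token-economy point concrete, here is a minimal sketch of how a 30% token reduction maps to cost. The price and volume below are hypothetical placeholders for illustration, not real API rates:

```python
# Hypothetical numbers for illustration only -- not real API pricing.
PRICE_PER_1K_TOKENS = 0.01             # assumed price in dollars per 1,000 input tokens
json_tokens = 1_000_000                # assumed tokens sent per day using JSON
toon_tokens = int(json_tokens * 0.70)  # the same data at a 30% token reduction

json_cost = json_tokens / 1000 * PRICE_PER_1K_TOKENS
toon_cost = toon_tokens / 1000 * PRICE_PER_1K_TOKENS

print(f"JSON: ${json_cost:.2f}/day, TOON: ${toon_cost:.2f}/day, "
      f"saving ${json_cost - toon_cost:.2f}/day")
```

The saving scales linearly: whatever your real token volume and price, a 30% token reduction is a 30% reduction in that line item.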
How TOON Differs from JSON
Let's compare a simple data structure:
JSON:
{
  "user": {
    "name": "Alice",
    "age": 30,
    "active": true
  }
}
TOON:
user:
  name: Alice
  age: 30
  active: true
The TOON version eliminates the braces, quoted keys, and comma delimiters; indentation alone conveys the structure.
JSON vs TOON – Learn With Examples
Now let's look at common JSON structures and their TOON equivalents:
1. Simple Object
JSON:
{ "name": "Alice", "age": 30, "city": "Bengaluru" }
TOON:
name: Alice
age: 30
city: Bengaluru
2. Array of Values
JSON:
{ "colors": ["red", "green", "blue"] }
TOON:
colors[3]: red,green,blue
3. Array of Objects
JSON:
{
  "users": [
    { "id": 1, "name": "Alice", "role": "admin" },
    { "id": 2, "name": "Bob", "role": "user" }
  ]
}
TOON:
users[2]{id,name,role}:
  1,Alice,admin
  2,Bob,user
Here, users[2]{id,name,role} declares an array of two objects with the fields id, name, and role. The lines below are simply the data rows—no quotes, braces, or repeated keys needed.
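To see why the tabular form is so compact, here is an illustrative encoder for this uniform-array case. This is a sketch of the idea only, not the official @toon-format/toon implementation, and it assumes every row shares the same keys:

```python
def encode_uniform_array(key, rows):
    """Encode a list of dicts that all share the same keys into
    TOON-style tabular form: key[N]{f1,f2}: followed by data rows."""
    fields = list(rows[0].keys())
    header = f"{key}[{len(rows)}]{{{','.join(fields)}}}:"
    lines = [header]
    for row in rows:
        # Each row becomes one comma-separated line; keys are never repeated.
        lines.append("  " + ",".join(str(row[f]) for f in fields))
    return "\n".join(lines)

users = [
    {"id": 1, "name": "Alice", "role": "admin"},
    {"id": 2, "name": "Bob", "role": "user"},
]
print(encode_uniform_array("users", users))
```

Because the field names appear once in the header instead of once per row, the savings grow with the number of rows.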
4. Nested Objects
JSON:
{
  "user": {
    "id": 1,
    "name": "Alice",
    "profile": { "age": 30, "city": "Bengaluru" }
  }
}
TOON:
user:
  id: 1
  name: Alice
  profile:
    age: 30
    city: Bengaluru
Indentation represents nesting, similar to YAML but more structured.
5. Array of Objects With Nested Fields
JSON:
{
  "teams": [
    {
      "name": "Team Alpha",
      "members": [
        { "id": 1, "name": "Alice" },
        { "id": 2, "name": "Bob" }
      ]
    }
  ]
}
TOON:
teams[1]:
  - name: Team Alpha
    members[2]{id,name}:
      1,Alice
      2,Bob
This remains easy to read while reducing token usage, often by 30-50% depending on the data shape.
Key Design Principles
Token-Aware Syntax: Uses delimiters that typically tokenize as single units rather than multiple fragments.
Implicit Structure: Relies on positional and contextual cues that LLMs naturally understand, avoiding explicit container markers.
Abbreviated Keys: Supports short-form keys with standardized abbreviations interpretable through context.
Inline Schemas: Compact definitions (like users[2]{id,name,role}) convey structure without verbose type declarations.
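As an illustration of how little machinery an inline schema needs on the consuming side, here is a sketch of parsing a header line like users[2]{id,name,role}: with a regular expression. The pattern is illustrative only; the authoritative grammar is whatever the TOON specification defines:

```python
import re

# Matches: key [ count ] { field1,field2,... } :
HEADER = re.compile(r"^(\w+)\[(\d+)\]\{([^}]*)\}:$")

def parse_header(line):
    """Split a TOON-style tabular header into (key, row count, field names)."""
    m = HEADER.match(line.strip())
    if not m:
        raise ValueError(f"not a tabular header: {line!r}")
    key, count, fields = m.group(1), int(m.group(2)), m.group(3).split(",")
    return key, count, fields

print(parse_header("users[2]{id,name,role}:"))
```

The declared count also doubles as a lightweight integrity check: a reader can verify that exactly that many data rows follow.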
How to Use TOON With JavaScript/TypeScript
TOON isn't meant to be handwritten in most cases. Instead, you'll encode existing JSON data into TOON format or decode TOON back to JSON.
First, install the official NPM package:
npm install @toon-format/toon
Converting JSON to TOON
import { encode } from "@toon-format/toon";

const data = {
  users: [
    { id: 1, name: "Alice", role: "admin" },
    { id: 2, name: "Bob", role: "user" },
  ],
};

const toonString = encode(data);
console.log(toonString);
Output:
users[2]{id,name,role}:
  1,Alice,admin
  2,Bob,user
Converting TOON to JSON
import { decode } from "@toon-format/toon";

const toonString = `
users[2]{id,name,role}:
  1,Alice,admin
  2,Bob,user
`;

const jsonObject = decode(toonString);
console.log(jsonObject);
Output:
{
  "users": [
    { "id": 1, "name": "Alice", "role": "admin" },
    { "id": 2, "name": "Bob", "role": "user" }
  ]
}
How to Use TOON With Python
Using TOON in Python is equally straightforward. First, install the package:
pip install python-toon
If you're using a virtual environment:
python -m venv venv
source venv/bin/activate
pip install python-toon
Encoding JSON to TOON
from toon import encode
# A channel object
channel = {"name": "tapaScript", "age": 2, "type": "education"}
toon_output = encode(channel)
print(toon_output)
Output:
name: tapaScript
age: 2
type: education
Decoding TOON to JSON
from toon import decode
toon_string = """
name: tapaScript
age: 2
type: education
"""
python_struct = decode(toon_string)
print(python_struct)
Output:
{'name': 'tapaScript', 'age': 2, 'type': 'education'}
Use Cases
LLM API Payloads: When sending data to or receiving data from language models, especially in multi-turn conversations where every token compounds.
Prompt Engineering: Embedding data in prompts more efficiently, leaving more room for instructions and examples.
RAG Systems: Retrieval-augmented generation systems that need to pass retrieved data efficiently to the model.
Agent Communication: AI agents communicating with each other or with APIs where token efficiency matters.
Streaming Applications: Real-time applications where lower latency from reduced token counts improves user experience.
Training Data: Less token overhead for structured training data when fine-tuning LLMs.
When JSON Still Makes More Sense
TOON is NOT a universal replacement for JSON. You should still prefer JSON when:
Your data is deeply nested or irregular (varying object shapes)
Your application needs strict schema validations or type enforcement
You're working with non-AI use cases where JSON is the established standard
Human readability and easy manual editing are priorities
A hybrid approach often works best: keep JSON for your application's data exchange with APIs, but convert to TOON when sending data to LLMs.
Challenges and Considerations
Human Readability: TOON sacrifices some readability for efficiency. It's best suited for machine-to-machine communication rather than configuration files humans frequently edit.
Ecosystem Maturity: As a newer format, tooling and library support are still developing compared with established formats.
Edge Cases: Complex nested structures and certain data types may require careful design to maintain both efficiency and clarity.
Standardization: The format is still evolving, and community consensus on best practices is forming.
Getting Started Today
You can start applying TOON principles immediately:
Minimize whitespace in data sent to LLMs
Use abbreviations consistently in your data schemas
Eliminate redundant syntax where structure is clear from context
Test tokenization of your formats using tools like OpenAI's tokenizer
Measure impact on costs and latency in your applications
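Before wiring in a real tokenizer, a rough character-count comparison already shows the direction of the savings. Characters are only a proxy for tokens, so treat this as a first estimate and use an actual tokenizer (for example, OpenAI's tiktoken) for real numbers:

```python
import json

users = [
    {"id": 1, "name": "Alice", "role": "admin"},
    {"id": 2, "name": "Bob", "role": "user"},
]

json_text = json.dumps({"users": users})  # compact JSON rendering
# The equivalent TOON tabular rendering, written out by hand for comparison.
toon_text = "users[2]{id,name,role}:\n  1,Alice,admin\n  2,Bob,user"

saving = 1 - len(toon_text) / len(json_text)
print(len(json_text), "chars (JSON) vs", len(toon_text), "chars (TOON)")
print(f"roughly {saving:.0%} fewer characters")
```

Character counts and token counts diverge (tokenizers merge common substrings), so always confirm with the tokenizer of the model you actually call.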
The Future of TOON and Data Serialization
TOON represents a broader trend: as AI becomes increasingly central to software architecture, our tools and formats must adapt. Just as JSON emerged to meet the needs of web APIs and YAML addressed configuration complexity, token-oriented formats address the specific requirements of LLM-centric applications.
With its early traction in the developer community, TOON is already being explored for:
Compact data exchange in Agent frameworks
Faster serialization between MCP (Model Context Protocol) servers and AI workflow engines
Serverless AI APIs where cost and speed are critical
Cost-effective data handling in production AI systems
We're likely to see more innovation in this space:
Binary formats optimized for specific tokenizers
Hybrid approaches that balance human and LLM readability
Compression techniques aware of transformer attention mechanisms
Standardized vocabularies that tokenize predictably
Conclusion
Token-Oriented Object Notation isn't just about saving a few characters—it's about aligning our data formats with the fundamental architecture of modern AI systems. As language models become more powerful and ubiquitous, the formats we use to communicate with them will continue to evolve.
Just as JSON became the standard for web data exchange, TOON may eventually be standardized for AI data interchange. So the next time you craft a prompt or pass structured data to a model, try it in TOON format and measure the result: you may see lower costs and latency.
Whether TOON becomes a widespread standard or inspires other token-efficient formats, one thing is clear: in the age of AI, every token matters. The future of data serialization is being written in the language that machines speak most efficiently—and that language is token-oriented.



