
Understanding TOON: A Guide to Token-Oriented Object Notation

How AI-Native Data Formatting Reduces Costs and Improves LLM Performance


In the ever-evolving landscape of data serialization formats, a new approach has emerged that's specifically designed for the age of AI and large language models: Token-Oriented Object Notation, or TOON. While JSON, XML, and YAML have dominated data interchange for decades, TOON represents a paradigm shift optimized for token-based processing and LLM efficiency.

What Is Token-Oriented Object Notation?

TOON is a data serialization format engineered to minimize token count when processed by language models. Unlike traditional formats that prioritize human readability or parsing simplicity, it focuses on efficiency—crucial in an era where API costs are calculated per token and context windows have limits.

The core philosophy: every character counts. Redundant syntax is eliminated, delimiters are compact, and structure aligns with how tokenizers process text.

Why TOON Matters

Token Economy: A 30% reduction in tokens translates directly to cost savings and faster processing when paying for API calls by the token.

Context Window Optimization: More efficient encoding means fitting more meaningful data in each request.

LLM-Native Design: Built specifically for how transformers process text, not just for machines to parse or humans to read.

Reduced Latency: Fewer tokens enable faster transmission and processing for real-time applications.
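
To get a feel for the savings, here is a rough Python sketch that compares the size of the same payload in both formats. Character counts are only a crude stand-in for tokens (real measurements should use your model's tokenizer), so treat the percentage as illustrative rather than a benchmark:

```python
import json

# Sample payload: a uniform array of objects, the case TOON compresses best.
data = {"users": [
    {"id": 1, "name": "Alice", "role": "admin"},
    {"id": 2, "name": "Bob", "role": "user"},
]}

as_json = json.dumps(data, indent=2)

# Hand-written TOON equivalent of the same payload (tabular form, shown later in this article).
as_toon = "users[2]{id,name,role}:\n  1,Alice,admin\n  2,Bob,user"

savings = 1 - len(as_toon) / len(as_json)
print(f"JSON: {len(as_json)} chars, TOON: {len(as_toon)} chars, ~{savings:.0%} smaller")
```

The exact ratio depends heavily on the data shape: deeply nested, irregular structures shrink far less than flat, uniform tables.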

How TOON Differs from JSON

Let's compare a simple data structure:

JSON:

{
  "user": {
    "name": "Alice",
    "age": 30,
    "active": true
  }
}

TOON:

user:
  name: Alice
  age: 30
  active: true

The TOON version eliminates the unnecessary whitespace, braces, quoted strings, and redundant delimiters; indentation and line breaks carry the structure instead.

JSON vs TOON – Learn With Examples

Now let's look at common JSON structures and their TOON equivalents:

1. Simple Object

JSON:

{ "name": "Alice", "age": 30, "city": "Bengaluru" }

TOON:

name: Alice
age: 30
city: Bengaluru

2. Array of Values

JSON:

{ "colors": ["red", "green", "blue"] }

TOON:

colors[3]: red,green,blue
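
Producing this length-prefixed inline form is a one-liner. A minimal sketch (toon_list is an illustrative helper, not part of any official library):

```python
def toon_list(key, values):
    """Render a list of scalars in TOON's inline array form: key[N]: v1,v2,..."""
    return f"{key}[{len(values)}]: " + ",".join(str(v) for v in values)

print(toon_list("colors", ["red", "green", "blue"]))
# colors[3]: red,green,blue
```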

3. Array of Objects

JSON:

{
  "users": [
    { "id": 1, "name": "Alice", "role": "admin" },
    { "id": 2, "name": "Bob", "role": "user" }
  ]
}

TOON:

users[2]{id,name,role}:
  1,Alice,admin
  2,Bob,user

Here, users[2]{id,name,role} declares an array of two objects with the fields id, name, and role. The lines below are simply the data rows—no quotes, braces, or repeated keys needed.
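
The tabular form is also easy to generate yourself. Here is a minimal sketch (toon_table is an illustrative helper that assumes every object has the same fields in the same order, which is exactly the case this form is designed for):

```python
def toon_table(key, rows):
    """Render a uniform list of dicts in TOON's tabular form.

    Assumes all dicts share the same keys in the same order."""
    fields = list(rows[0])
    header = f"{key}[{len(rows)}]{{{','.join(fields)}}}:"
    lines = ["  " + ",".join(str(row[f]) for f in fields) for row in rows]
    return "\n".join([header] + lines)

users = [
    {"id": 1, "name": "Alice", "role": "admin"},
    {"id": 2, "name": "Bob", "role": "user"},
]
print(toon_table("users", users))
# users[2]{id,name,role}:
#   1,Alice,admin
#   2,Bob,user
```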

4. Nested Objects

JSON:

{
  "user": {
    "id": 1,
    "name": "Alice",
    "profile": { "age": 30, "city": "Bengaluru" }
  }
}

TOON:

user:
  id: 1
  name: Alice
  profile:
    age: 30
    city: Bengaluru

Indentation represents nesting, similar to YAML but more structured.
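
Because nesting is purely indentation-based, a recursive encoder for this case fits in a few lines. A sketch handling only dicts of scalars and nested dicts (toon_encode is illustrative, not the official encoder):

```python
def toon_encode(obj, indent=0):
    """Recursively render nested dicts as indented 'key: value' lines."""
    pad = "  " * indent
    lines = []
    for key, value in obj.items():
        if isinstance(value, dict):
            lines.append(f"{pad}{key}:")
            lines.append(toon_encode(value, indent + 1))
        else:
            lines.append(f"{pad}{key}: {value}")
    return "\n".join(lines)

user = {"id": 1, "name": "Alice", "profile": {"age": 30, "city": "Bengaluru"}}
print(toon_encode({"user": user}))
```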

5. Array of Objects With Nested Fields

JSON:

{
  "teams": [
    {
      "name": "Team Alpha",
      "members": [
        { "id": 1, "name": "Alice" },
        { "id": 2, "name": "Bob" }
      ]
    }
  ]
}

TOON:

teams[1]:
  - name: Team Alpha
    members[2]{id,name}:
      1,Alice
      2,Bob

The structure remains perfectly understandable, yet token usage drops by roughly 30-50% depending on the shape of the data.

Key Design Principles

Token-Aware Syntax: Uses delimiters that typically tokenize as single units rather than multiple fragments.

Implicit Structure: Relies on positional and contextual cues that LLMs naturally understand, avoiding explicit container markers.

Abbreviated Keys: Supports short-form keys with standardized abbreviations interpretable through context.

Inline Schemas: Compact definitions (like users[2]{id,name,role}) convey structure without verbose type declarations.
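
An inline schema header like users[2]{id,name,role}: is also trivial for a consumer to parse. A sketch using a small regular expression (parse_header is an illustrative helper, and the pattern covers only simple word-character keys):

```python
import re

# Matches headers of the form: key[count]{field1,field2,...}:
HEADER = re.compile(r"^(\w+)\[(\d+)\]\{([\w,]+)\}:$")

def parse_header(line):
    """Split a TOON tabular header into (key, row count, field names)."""
    m = HEADER.match(line.strip())
    if not m:
        raise ValueError(f"not a tabular header: {line!r}")
    key, count, fields = m.groups()
    return key, int(count), fields.split(",")

print(parse_header("users[2]{id,name,role}:"))
# ('users', 2, ['id', 'name', 'role'])
```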

How to Use TOON With JavaScript/TypeScript

TOON isn't meant to be handwritten in most cases. Instead, you'll encode existing JSON data into TOON format or decode TOON back to JSON.

First, install the official NPM package:

npm install @toon-format/toon

Converting JSON to TOON

import { encode } from "@toon-format/toon";

const data = {
  users: [
    { id: 1, name: "Alice", role: "admin" },
    { id: 2, name: "Bob", role: "user" },
  ],
};

const toonString = encode(data);
console.log(toonString);

Output:

users[2]{id,name,role}:
  1,Alice,admin
  2,Bob,user

Converting TOON to JSON

import { decode } from "@toon-format/toon";

const toonString = `
users[2]{id,name,role}:
  1,Alice,admin
  2,Bob,user
`;

const jsonObject = decode(toonString);
console.log(jsonObject);

Output:

{
  "users": [
    { "id": 1, "name": "Alice", "role": "admin" },
    { "id": 2, "name": "Bob", "role": "user" }
  ]
}

How to Use TOON With Python

Using TOON in Python is equally straightforward. First, install the package:

pip install python-toon

If you're using a virtual environment:

python -m venv venv
source venv/bin/activate
pip install python-toon

Encoding JSON to TOON

from toon import encode

# A channel object
channel = {"name": "tapaScript", "age": 2, "type": "education"}
toon_output = encode(channel)
print(toon_output)

Output:

name: tapaScript
age: 2
type: education

Decoding TOON to JSON

from toon import decode

toon_string = """
name: tapaScript
age: 2
type: education
"""

python_struct = decode(toon_string)
print(python_struct)

Output:

{"name": "tapaScript", "age": 2, "type": "education"}

Use Cases

LLM API Payloads: When sending data to or receiving data from language models, especially in multi-turn conversations where every token compounds.

Prompt Engineering: Embedding data in prompts more efficiently, leaving more room for instructions and examples.

RAG Systems: Retrieval-augmented generation systems that need to pass retrieved data efficiently to the model.

Agent Communication: AI agents communicating with each other or with APIs where token efficiency matters.

Streaming Applications: Real-time applications where lower latency from reduced token counts improves user experience.

Training Data: Less token overhead for structured training data when fine-tuning LLMs.

When JSON Still Makes More Sense

TOON is NOT a universal replacement for JSON. You should still prefer JSON when:

  • Your data is deeply nested or irregular (varying object shapes)

  • Your application needs strict schema validations or type enforcement

  • You're working with non-AI use cases where JSON is the established standard

  • Human readability and easy manual editing are priorities

A hybrid approach often works best: keep JSON for your application's data exchange with APIs, but convert to TOON when sending data to LLMs.

Challenges and Considerations

Human Readability: TOON sacrifices some readability for efficiency. It's best suited for machine-to-machine communication rather than configuration files humans frequently edit.

Ecosystem Maturity: TOON is a newer format, so tooling and library support are still maturing compared to established formats.

Edge Cases: Complex nested structures and certain data types may require careful design to maintain both efficiency and clarity.

Standardization: The format is still evolving, and community consensus on best practices is forming.

Getting Started Today

You can start applying TOON principles immediately:

  1. Minimize whitespace in data sent to LLMs

  2. Use abbreviations consistently in your data schemas

  3. Eliminate redundant syntax where structure is clear from context

  4. Test tokenization of your formats using tools like OpenAI's tokenizer

  5. Measure impact on costs and latency in your applications
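
For step 4, the authoritative numbers come from your model's actual tokenizer (for OpenAI models, the tiktoken library or the web tokenizer tool). For a quick stdlib-only estimate, a crude split on words and punctuation already shows the direction of the savings; this heuristic is not a real tokenizer and will not match billing counts:

```python
import json
import re

def rough_tokens(text):
    """Crude token estimate: one token per word run or punctuation character.

    A heuristic only; use the model's real tokenizer for accurate counts."""
    return len(re.findall(r"\w+|[^\w\s]", text))

data = {"colors": ["red", "green", "blue"]}
as_json = json.dumps(data)
as_toon = "colors[3]: red,green,blue"

print(rough_tokens(as_json), rough_tokens(as_toon))
```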

The Future of TOON and Data Serialization

TOON represents a broader trend: as AI becomes increasingly central to software architecture, our tools and formats must adapt. Just as JSON emerged to meet the needs of web APIs and YAML addressed configuration complexity, token-oriented formats address the specific requirements of LLM-centric applications.

With its early traction in the developer community, TOON is already being explored for:

  • Compact data exchange in agent frameworks

  • Faster serialization between MCP servers and AI workflow engines

  • Serverless AI APIs where cost and speed are critical

  • Cost-effective data handling in production AI systems

We're likely to see more innovation in this space:

  • Binary formats optimized for specific tokenizers

  • Hybrid approaches that balance human and LLM readability

  • Compression techniques aware of transformer attention mechanisms

  • Standardized vocabularies that tokenize predictably

Conclusion

Token-Oriented Object Notation isn't just about saving a few characters—it's about aligning our data formats with the fundamental architecture of modern AI systems. As language models become more powerful and ubiquitous, the formats we use to communicate with them will continue to evolve.

Just as JSON became the standard for web data exchange, TOON may yet become a standard for AI data interchange. So the next time you craft a prompt or pass structured data to an AI model, try it in TOON format. You may find the calls get faster and cheaper.

Whether TOON becomes a widespread standard or inspires other token-efficient formats, one thing is clear: in the age of AI, every token matters. The future of data serialization is being written in the language that machines speak most efficiently—and that language is token-oriented.
