# Introduction
JSON is great for APIs, storage, and application logic. But inside large language model (LLM) pipelines, it often carries a lot of token overhead that adds little value for the model: braces, quotes, commas, and field names repeated in every object. TOON, short for Token-Oriented Object Notation, is a new format designed to keep the same JSON data model while using fewer tokens and giving the model clearer structural hints. The official TOON documentation describes it as a compact, lossless representation of JSON for LLM input that is particularly strong on uniform arrays of objects.
In this article, you will learn what TOON is, when it is appropriate to use it, and how to start using it step by step in your own LLM workflow. We’ll also keep the tradeoffs honest, because TOON is useful in some cases, not all.
# Why JSON Wastes Tokens in LLM Pipelines
JSON becomes expensive in prompts because it repeats the same structure over and over again. LLMs don’t care that JSON is a standard. They only see tokens.
If you send 100 support tickets, product rows, or user records to a model, the same field names appear in every object. TOON reduces that repetition by declaring the fields once and then listing the row values in a concise tabular form. Here is a simple example.
JSON:
{
  "users": [
    { "id": 1, "name": "Alice", "role": "admin" },
    { "id": 2, "name": "Bob", "role": "user" },
    { "id": 3, "name": "Charlie", "role": "user" }
  ]
}
TOON:
users[3]{id,name,role}:
  1,Alice,admin
  2,Bob,user
  3,Charlie,user
Same data, less clutter.
The structure is still clear, but the frequently repeated keys are gone. This is where TOON gets most of its value.
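To make the "declare fields once, then list rows" idea concrete, here is a minimal Python sketch of a tabular encoder. It is an illustration only, not the official TOON encoder: the function names `to_toon_table` and `format_value` are hypothetical, and the sketch assumes flat records whose values contain no commas, quotes, or newlines, so it applies no quoting or escaping.

```python
def format_value(value):
    """Render one scalar using JSON-style literals for booleans and null."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # check bool before numbers: bool is an int subclass
        return "true" if value else "false"
    return str(value)


def to_toon_table(key, rows):
    """Encode a non-empty list of flat, identically keyed dicts as a table.

    Simplified illustration: no quoting or escaping is applied, so values
    must not contain commas, quotes, or newlines.
    """
    if not rows:
        raise ValueError("this sketch only handles non-empty lists")
    fields = list(rows[0].keys())
    for row in rows:
        # Every row must expose the same fields in the same order.
        if list(row.keys()) != fields:
            raise ValueError("rows are not uniform; tabular form does not apply")
    # Header: key, row count, then the field names declared once.
    header = key + "[" + str(len(rows)) + "]{" + ",".join(fields) + "}:"
    lines = [header]
    for row in rows:
        lines.append("  " + ",".join(format_value(row[f]) for f in fields))
    return "\n".join(lines)


users = [
    {"id": 1, "name": "Alice", "role": "admin"},
    {"id": 2, "name": "Bob", "role": "user"},
    {"id": 3, "name": "Charlie", "role": "user"},
]
print(to_toon_table("users", users))
```

For real workloads, the official TOON tooling is the safer choice, since it handles the quoting and nesting cases this sketch skips.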
# What Exactly Is TOON and When Is It Worth Using
TOON is a serialization format for the JSON data model. This means it can represent objects, arrays, strings, numbers, booleans, and null values, but in a form that is more compact for model input. The TOON project presents it as lossless relative to JSON, meaning you can convert JSON to TOON and back without losing information. The important thing to understand is this:
You don’t need to change the JSON in your app.
A better approach is to keep JSON in your backend, API, and storage, then convert it to TOON only when you’re about to send structured data to an LLM.
TOON is most useful when your prompt contains many records that share the same fields. Good examples include retrieved support tickets, catalog rows, analytics records, tool outputs, CRM entries, or memory snapshots for an agent system. However, if your structure is deeply nested, highly irregular, completely flat, or very small, the benefits may shrink or disappear.
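One way to act on this guidance is to check the shape of a payload before deciding how to serialize it. The sketch below is a rough heuristic, not part of any TOON tooling: the function name `looks_tabular` is hypothetical, and the `min_rows` threshold is an arbitrary assumption you would tune against your own data.

```python
def looks_tabular(records, min_rows=2):
    """Heuristic: True if records is a list of flat dicts sharing one key set.

    min_rows is an arbitrary, assumed cutoff for "too small to matter".
    """
    if not isinstance(records, list) or len(records) < min_rows:
        return False
    if not all(isinstance(r, dict) for r in records):
        return False
    first_keys = set(records[0])
    for record in records:
        # Irregular records (different fields) do not fit a shared header.
        if set(record) != first_keys:
            return False
        # Nested containers break the simple one-row-per-record layout.
        if any(isinstance(v, (dict, list)) for v in record.values()):
            return False
    return True


tickets = [
    {"id": 101, "status": "open", "priority": "high"},
    {"id": 102, "status": "closed", "priority": "low"},
]
irregular = [{"id": 1, "name": "Alice"}, {"id": 2, "email": "bob@example.com"}]

print(looks_tabular(tickets))    # uniform, flat records
print(looks_tabular(irregular))  # fields differ between records
```

A check like this only tells you whether the data fits the tabular pattern. Whether the conversion actually pays off is still something to measure.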
# Getting Started with TOON
// Step 1: Installing the TOON Command-Line Interface
The easiest way to try TOON is through the TOON project’s official command-line interface (CLI). The TOON site links directly to the CLI, and the main repository presents the format as part of a broader ecosystem of SDKs and tooling.
Install the package:
npm install -g @toon-format/cli
// Step 2: Converting a JSON File to TOON
Let’s create a folder first:
mkdir toon-test
cd toon-test
Now, create a file named users.json in this folder and paste the following into it:
[
  { "id": 1, "name": "Alice", "role": "admin" },
  { "id": 2, "name": "Bob", "role": "user" },
  { "id": 3, "name": "Charlie", "role": "user" }
]
Now convert this:
npx @toon-format/cli users.json -o users.toon
You should get output similar to this:
[3]{id,name,role}:
  1,Alice,admin
  2,Bob,user
  3,Charlie,user
This is the core TOON pattern: declare the shape once, then list the values row by row. This is consistent with the official design goal of tabular arrays for uniformly structured objects.
// Step 3: Using TOON as Model Input
The best place to use TOON is on the input side of your pipeline. Instead of pasting a big JSON blob into the prompt, pass along the TOON version and keep the instructions simple.
For example:
The following data is in TOON format.
users[3]{id,name,role}:
  1,Alice,admin
  2,Bob,user
  3,Charlie,user
Summarize the user roles and point out anything unusual.
This works well because TOON is designed to help the model read repeated structure with less overhead. The official project frames its benchmarks the same way: as comprehension tests across different structured input formats.
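In code, assembling a prompt like this is plain string handling. The following sketch assumes you already have the TOON text as a string (for example, read from the file the CLI wrote). The helper name `build_prompt` is hypothetical, and the wording of the format note is simply the one used in this article, not a required phrase.

```python
def build_prompt(toon_block, task):
    """Combine a short format note, the TOON data, and the task instruction."""
    return (
        "The following data is in TOON format.\n\n"
        + toon_block.strip()
        + "\n\n"
        + task.strip()
    )


toon_block = (
    "users[3]{id,name,role}:\n"
    "  1,Alice,admin\n"
    "  2,Bob,user\n"
    "  3,Charlie,user"
)
prompt = build_prompt(
    toon_block,
    "Summarize the user roles and point out anything unusual.",
)
print(prompt)
```

Keeping the data block separate from the instruction makes it easy to swap the same records between JSON and TOON when you compare the two formats.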
// Step 4: Keeping JSON for Output
This is one of the most important practical decisions. TOON is very useful for input, but JSON is still the better choice for output when another system needs to parse the model response. This is because JSON has very strong tooling support, and modern LLM APIs can enforce structured JSON output against a schema.
In practice, the safest pattern is:
- JSON in your app.
- TOON for large structured prompt input.
- JSON again for machine-parsable model responses.
This gives you efficiency on the input side and reliability on the output side.
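On the output side, the pattern can be sketched as a small validation step using only Python's standard `json` module. The expected keys (`summary` and `unusual`) and the function name `parse_model_reply` are hypothetical and chosen purely for illustration. In practice you would match them to whatever schema you ask the model to follow.

```python
import json

# Hypothetical response shape, assumed for this example only.
REQUIRED_KEYS = {"summary", "unusual"}


def parse_model_reply(reply_text):
    """Parse a model reply as JSON and check that the expected keys exist."""
    try:
        data = json.loads(reply_text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"model reply is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    missing = REQUIRED_KEYS - set(data)
    if missing:
        raise ValueError(f"model reply is missing keys: {sorted(missing)}")
    return data


reply = '{"summary": "One admin and two regular users.", "unusual": []}'
parsed = parse_model_reply(reply)
print(parsed["summary"])
```

Failing loudly on malformed replies keeps parsing problems out of downstream systems, which is the main reason to prefer JSON here.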
// Step 5: Benchmarking Your Own Pipeline
Don’t switch formats based on hype alone.
Run a small benchmark in your own workflow:
- Count input tokens for JSON.
- Count input tokens for TOON.
- Compare latency.
- Compare answer quality.
- Compare the total cost.
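The first two items in the checklist above can be sketched in a few lines of Python. Note the big assumption: `rough_token_count` is a crude, hypothetical proxy that counts word-like runs and punctuation marks. It is not a real tokenizer, so swap in the tokenizer for your actual model before trusting any numbers. Latency, answer quality, and cost still need to be measured against your real model calls.

```python
import json
import re


def rough_token_count(text):
    """Crude proxy: word-like runs plus individual punctuation marks.

    This is NOT a real tokenizer; replace it with your model's tokenizer.
    """
    return len(re.findall(r"\w+|[^\w\s]", text))


users = [
    {"id": 1, "name": "Alice", "role": "admin"},
    {"id": 2, "name": "Bob", "role": "user"},
    {"id": 3, "name": "Charlie", "role": "user"},
]

# The same data in both formats.
json_text = json.dumps({"users": users})
toon_text = (
    "users[3]{id,name,role}:\n"
    "  1,Alice,admin\n"
    "  2,Bob,user\n"
    "  3,Charlie,user"
)

json_tokens = rough_token_count(json_text)
toon_tokens = rough_token_count(toon_text)
saving = 1 - toon_tokens / json_tokens

print(f"JSON: {json_tokens} rough tokens")
print(f"TOON: {toon_tokens} rough tokens")
print(f"Approximate reduction: {saving:.0%}")
```

Run the same comparison on a representative sample of your own payloads, since a three-row toy example says little about production data.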
The official TOON project cites token savings as one of the main benefits, and third-party coverage reiterates those claims, but community discussion also suggests that the results depend heavily on the size of the data. That’s why the best question isn’t “Is TOON better than JSON?”
The better question is: “Is TOON better for this specific LLM stage?”
# Final Thoughts
TOON is not something you need to use everywhere.
It is a targeted optimization for a specific problem: tokens wasted on repeated JSON structure inside LLM prompts. If your pipeline passes a lot of uniformly structured records into a model, TOON is worth testing. If your payloads are small, irregular, or heavily nested, JSON may still be the better choice.
The smartest way to approach this is simple: keep JSON where JSON already works well, use TOON where you’re packing large structured inputs into prompts, and benchmark the results on your own workloads before committing to it.
Kanwal Mehreen is a machine learning engineer and a technical writer with a deep passion for the intersection of AI with data science and medicine. She co-authored the eBook “Maximizing Productivity with ChatGPT”. As a Google Generation Scholar 2022 for APAC, she is an advocate for diversity and academic excellence. She has also been recognized as a Teradata Diversity in Tech Scholar, a Mitacs Globalink Research Scholar, and a Harvard WeCode Scholar. Kanwal is a strong advocate for change, having founded FEMCodes to empower women in STEM fields.