Token
Simple Definition
A token is the basic unit of text that AI language models process. When you send a message to an AI, it doesn’t read word by word — it breaks your text into tokens first.
In English, a token averages about 4 characters, or roughly three-quarters of a word. Common words are often a single token; rarer words may be split into several tokens.
How Tokenization Works
The text “ChatGPT is great” might be split into tokens like:
- “Chat” + “G” + “PT” + “ is” + “ great”
Or possibly: “ChatGPT” + “ is” + “ great”
Tokenization depends on the model’s vocabulary and the specific tokenizer used.
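To make the role of the vocabulary concrete, here is a minimal sketch of a toy tokenizer that repeatedly takes the longest piece found in its vocabulary. It is not how production tokenizers work (those typically use learned schemes such as byte-pair encoding); the function name `tokenize` and both vocabularies are invented for this illustration.

```python
def tokenize(text, vocab):
    """Greedy longest-match split of text into pieces found in vocab.

    Toy illustration only. Any character not covered by a vocabulary
    entry becomes its own single-character token.
    """
    max_len = max(len(piece) for piece in vocab)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking until one matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocab:
                tokens.append(piece)
                i += size
                break
    return tokens


# Two hypothetical vocabularies, made up for this example.
small_vocab = {"Chat", "G", "PT", " is", " great"}
large_vocab = small_vocab | {"ChatGPT"}

print(tokenize("ChatGPT is great", small_vocab))
# ['Chat', 'G', 'PT', ' is', ' great']
print(tokenize("ChatGPT is great", large_vocab))
# ['ChatGPT', ' is', ' great']
```

The same text yields five tokens with one vocabulary and three with the other, which is why token counts differ between models.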
Why Tokens Matter for Users
1. Context window limits — AI models can only process a certain number of tokens at once. Long conversations or documents need to fit within this limit.
2. API pricing — AI APIs charge by the token (both input and output tokens). Understanding token counts helps you estimate costs.
3. Response length — when you ask an AI to “keep it under 200 words,” you’re really controlling token count indirectly.
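As a rough sketch of what fitting a long conversation into a context window can look like, the function below keeps only the newest messages that fit once some room is reserved for the reply. The name `trim_history`, its parameters, and the numbers in the example are assumptions made for illustration; real applications usually get token counts from the model's own tokenizer.

```python
def trim_history(messages, context_window, reserved_for_output):
    """Keep the newest messages that fit in the input budget.

    messages: list of (text, token_count) pairs, oldest first.
    context_window: total tokens the model can handle at once.
    reserved_for_output: tokens set aside for the model's reply.
    """
    budget = context_window - reserved_for_output
    kept = []
    used = 0
    for text, token_count in reversed(messages):  # walk newest to oldest
        if used + token_count > budget:
            break  # this message and everything older is dropped
        kept.append((text, token_count))
        used += token_count
    kept.reverse()  # restore chronological order
    return kept


# Hypothetical conversation with made-up token counts.
history = [("first question", 400), ("long answer", 300), ("follow-up", 200)]
print(trim_history(history, context_window=1000, reserved_for_output=400))
# [('long answer', 300), ('follow-up', 200)]
```

Dropping the oldest messages first is only one possible policy; summarizing earlier turns is another common choice.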
Rough Token Estimates
| Text | Approximate Tokens |
|---|---|
| 1 word | ~1.3 tokens |
| 1 sentence | ~15–20 tokens |
| 1 paragraph | ~60–80 tokens |
| 1 page | ~500–600 tokens |
| 1 book (300 pages) | ~150,000–180,000 tokens |
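These ratios can be turned into a quick estimator. The sketch below uses the ~1.3 tokens per word figure from the table and, as a second heuristic, assumes about 4 characters per token (a common rule of thumb for English). Both function names are hypothetical, and the results are approximations, not what any particular tokenizer would return.

```python
TOKENS_PER_WORD = 1.3   # rough ratio for English text, from the table
CHARS_PER_TOKEN = 4     # assumed average; varies by tokenizer and language


def estimate_tokens_by_words(text):
    """Estimate tokens from the whitespace-separated word count."""
    return round(len(text.split()) * TOKENS_PER_WORD)


def estimate_tokens_by_chars(text):
    """Estimate tokens from the character count."""
    return round(len(text) / CHARS_PER_TOKEN)


sample = " ".join(["word"] * 1000)  # a synthetic 1,000-word text
print(estimate_tokens_by_words(sample))  # 1300
```

For exact counts, use the tokenizer that belongs to the model you are calling.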
Input vs. Output Tokens
- Input tokens — everything you send to the model (your prompt, conversation history, documents)
- Output tokens — the response the model generates
APIs typically charge for both, with output tokens often costing more.
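Because the two token types are usually priced separately, a cost estimate needs both counts. The following is a minimal sketch; the function name `estimate_cost` and the prices in the example are placeholders, not any provider's actual rates.

```python
def estimate_cost(input_tokens, output_tokens,
                  input_price_per_million, output_price_per_million):
    """Estimate the cost of one request, given per-million-token prices."""
    total = (input_tokens * input_price_per_million
             + output_tokens * output_price_per_million)
    return total / 1_000_000


# Hypothetical prices: 2 per million input tokens, 8 per million output tokens.
cost = estimate_cost(input_tokens=10_000, output_tokens=2_000,
                     input_price_per_million=2, output_price_per_million=8)
print(f"{cost:.3f}")  # 0.036
```

In this example the 2,000 output tokens account for almost half the total even though they are a fifth of the input volume, which is the effect of a higher output price.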
Related Terms
- Context Window — the maximum number of tokens a model can handle at once
- LLM — language models that process text as tokens
- Inference — the process where tokens are generated
- Temperature — the setting that affects token selection during generation
Frequently Asked Questions
How many tokens is 1000 words?
Roughly 1,300–1,500 tokens. A common rule of thumb is that 1 token ≈ 0.75 words in English, so 1,000 words ≈ 1,333 tokens.