
Simplify LLM Cost Tracking with Tokencost

As Large Language Models (LLMs) grow in popularity, so does the need to track and estimate the cost of using their APIs. This is where Tokencost comes in: it is a Python library that calculates the USD cost of prompts and completions for major LLM providers.

This blog is by Sourabh, our CTO who spends most of his time building an AI agent and gives the best restaurant recommendations. If you like this post, try KushoAI today, and start shipping bug-free code faster!

Key Features

  1. LLM Price Tracking: Tokencost keeps up with pricing changes from major LLM providers, so your cost estimates reflect current prices.
  2. Token Counting: Accurately count the number of tokens in your prompts before sending requests to OpenAI, helping you optimize your usage and costs.
  3. Easy Integration: With just a single function call, you can get the cost of a prompt or completion, making it simple to incorporate Tokencost into your existing projects.


Install Tokencost via PyPI:

pip install tokencost

Cost Estimation

Tokencost makes it easy to calculate the cost of prompts and completions from OpenAI requests. Here's an example:

from openai import OpenAI
from tokencost import calculate_prompt_cost, calculate_completion_cost

client = OpenAI()  # reads the API key from the OPENAI_API_KEY environment variable
model = "gpt-3.5-turbo"
prompt = [{"role": "user", "content": "Say this is a test"}]

chat_completion = client.chat.completions.create(
    messages=prompt, model=model
)

# Extract the generated text, then price the prompt and the completion separately
completion = chat_completion.choices[0].message.content
prompt_cost = calculate_prompt_cost(prompt, model)
completion_cost = calculate_completion_cost(completion, model)
print(f"{prompt_cost} + {completion_cost} = {prompt_cost + completion_cost}")
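If you make many requests, you will probably want a running total rather than a per-call printout. Here is a minimal sketch of one way to do that. `CostTracker` is a hypothetical helper written for this post, not part of Tokencost, and the amounts passed in are placeholders standing in for the values returned by `calculate_prompt_cost` and `calculate_completion_cost`.

```python
from decimal import Decimal


class CostTracker:
    """Accumulates per-request costs (hypothetical helper, not part of tokencost)."""

    def __init__(self) -> None:
        self.total = Decimal("0")
        self.calls = 0

    def add(self, prompt_cost: Decimal, completion_cost: Decimal) -> Decimal:
        """Record one request and return its combined cost."""
        request_cost = Decimal(prompt_cost) + Decimal(completion_cost)
        self.total += request_cost
        self.calls += 1
        return request_cost


tracker = CostTracker()
# Placeholder amounts for illustration; in practice, pass the values
# returned by calculate_prompt_cost and calculate_completion_cost.
tracker.add(Decimal("0.0000135"), Decimal("0.000014"))
tracker.add(Decimal("0.0000200"), Decimal("0.000030"))
print(f"{tracker.calls} calls, total ${tracker.total}")
```

Using `Decimal` instead of `float` avoids rounding drift when summing many very small amounts.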

You can also calculate costs using string prompts instead of messages:

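from tokencost import calculate_prompt_cost, calculate_completion_cost

prompt_string = "Hello world"
response = "How may I assist you today?"
model = "gpt-3.5-turbo"

# The prompt and the completion are both plain strings here
prompt_cost = calculate_prompt_cost(prompt_string, model)
completion_cost = calculate_completion_cost(response, model)
print(f"Prompt cost: ${prompt_cost}")
print(f"Completion cost: ${completion_cost}")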
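For intuition, a cost estimate like the ones above comes down to a token count multiplied by a per-token price. The sketch below shows only that arithmetic. The price is a made-up placeholder and `estimate_cost` is a hypothetical function, so treat it as an illustration rather than a description of how Tokencost works internally.

```python
from decimal import Decimal

# Hypothetical price in USD per token, for illustration only.
# Real prices vary by model and change over time.
EXAMPLE_PRICE_PER_TOKEN = Decimal("0.0000015")


def estimate_cost(num_tokens: int, price_per_token: Decimal) -> Decimal:
    """Return the estimated USD cost for a given number of tokens."""
    return num_tokens * price_per_token


cost = estimate_cost(1000, EXAMPLE_PRICE_PER_TOKEN)
print(f"Estimated cost for 1000 tokens: ${cost}")
```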

Token Counting

In addition to cost estimation, Tokencost provides functions for counting tokens in both message lists and string prompts:

from tokencost import count_message_tokens, count_string_tokens

message_prompt = [{ "role": "user", "content": "Hello world"}]
print(count_message_tokens(message_prompt, model="gpt-3.5-turbo"))

print(count_string_tokens(prompt="Hello world", model="gpt-3.5-turbo"))

This allows you to optimize your prompts and stay within the token limits of your chosen LLM.
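One way to act on a token count is to check it against a limit before sending the request. Here is a rough sketch, assuming you supply the limit yourself. `fits_token_limit` and `whitespace_counter` are hypothetical names, and the whitespace counter is only a stand-in so the example runs without extra dependencies. It is not a real tokenizer.

```python
from typing import Callable


def fits_token_limit(
    prompt: str,
    model: str,
    limit: int,
    count_tokens: Callable[..., int],
) -> bool:
    """Return True if the prompt's token count is within the given limit."""
    return count_tokens(prompt=prompt, model=model) <= limit


def whitespace_counter(prompt: str, model: str) -> int:
    """Stand-in counter that splits on whitespace; ignores the model."""
    return len(prompt.split())


# 4096 is an example limit; check your model's documentation for the real one.
print(fits_token_limit("Hello world", "gpt-3.5-turbo", 4096, whitespace_counter))
```

In a real project you would pass `count_string_tokens` from Tokencost as the `count_tokens` argument, since it takes the same `prompt` and `model` keyword arguments shown earlier.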


As LLMs continue to advance and find new applications, managing the costs associated with these powerful APIs becomes increasingly important. Tokencost simplifies this process, providing accurate cost estimates and token counting for major LLM providers.

By integrating Tokencost into your projects, you can ensure that you're using LLMs efficiently and cost-effectively, helping you build better AI-powered applications without breaking the bank.

At KushoAI, we're building an AI agent that tests your APIs for you. Bring in API information in any format and watch KushoAI turn it into fully functional and exhaustive test suites in minutes.