Model Usage and Cost Management

This feature is EXPERIMENTAL and under active development. It may change significantly, be discontinued, or have breaking changes without notice. Documentation may be incomplete or outdated. The feature is NOT recommended for production use. Use at your own risk. Please refer to our Upgrade and Release Process for more information.

Overview

The Model Usage and Cost Management feature gives organisations visibility into how their users interact with AI models by tracking token consumption and estimated spend per user, assistant, and application in real time.

Once enabled, the feature:

  • Records every LLM call (input tokens + completion tokens + estimated cost in USD)

  • Shows an Admin dashboard with usage reporting and model pricing

  • Shows a per-user spend badge in the Chat interface

  • Supports CSV export for offline analysis


Enabling the Feature

Model Usage tracking is controlled by two feature flags. Both are disabled by default and must be activated by Unique.

| Feature Flag | What it enables | Feature Status |
| --- | --- | --- |
| FEATURE_FLAG_SAVE_MODEL_USAGE_UN_12832 | Starts recording usage data to the database. Must be on for anything to work. | Experimental |
| FEATURE_FLAG_SAVE_MODEL_USAGE_DASHBOARD_UN_18889 | Shows the Cost Management section in the Admin UI and the spend badge in the Chat user menu. | Experimental |

To enable these flags for a client, please contact your Unique Representative or raise a request via the Enterprise mailbox.
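The relationship between the two flags can be pictured with a small sketch. Only the two flag names come from this page; the function name, the dict-based flag lookup, and the returned keys are assumptions made for illustration, and how Unique evaluates flags internally is not described here.

```python
SAVE_FLAG = "FEATURE_FLAG_SAVE_MODEL_USAGE_UN_12832"
DASHBOARD_FLAG = "FEATURE_FLAG_SAVE_MODEL_USAGE_DASHBOARD_UN_18889"


def feature_state(flags: dict[str, bool]) -> dict[str, bool]:
    """Summarise what a deployment would expose for a given flag configuration.

    Hypothetical helper: both flags are treated as off unless explicitly set,
    mirroring the "disabled by default" statement above.
    """
    recording = flags.get(SAVE_FLAG, False)
    dashboard = flags.get(DASHBOARD_FLAG, False)
    return {
        "records_usage": recording,
        # The dashboard flag only shows the UI; it relies on the save flag for data.
        "dashboard_usable": recording and dashboard,
    }


state = feature_state({SAVE_FLAG: True, DASHBOARD_FLAG: False})
```

With only the save flag on, usage is recorded but no dashboard or spend badge is shown, which is a reasonable first step before exposing the UI.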


Admin UI — Cost Management

The Cost Management section appears under Settings Page → Cost Management (left sidebar) once the dashboard flag (FEATURE_FLAG_SAVE_MODEL_USAGE_DASHBOARD_UN_18889) is enabled.

The user must have the role CHAT_FEEDBACK_READ or CHAT_DATA_ADMIN to see it.

Model Pricing

A read-only table showing the active pricing for every supported model:

| Column | Description |
| --- | --- |
| Model | Internal model identifier (e.g. AZURE_GPT_4o_2024_1120, litellm:anthropic-claude-sonnet-4-5) |
| Input cost / 1M tokens | Cost in USD per million prompt tokens |
| Completion cost / 1M tokens | Cost in USD per million completion (output) tokens |
| Currency | USD by default |

Usage Reporting

An aggregated view of token consumption and estimated spend across the organisation.

Filters available:

  • Date range — This month, last month, this week, or custom range (with previous/next period navigation)

  • View by — Model, User, or Assistant

  • Drill-down — Click any row to see the breakdown within that dimension

  • Text filter — Search by model name, user, or assistant

  • Pagination — Configurable page size
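The "View by" grouping can be sketched as a simple aggregation over per-call usage records. The record shape (keys `model`, `user`, `assistant`, `input_tokens`, `completion_tokens`, `spent`), the function name, and the sample records are all assumptions for illustration; the data model behind the dashboard is not described on this page.

```python
from collections import defaultdict
from decimal import Decimal

VIEW_BY_DIMENSIONS = ("model", "user", "assistant")


def aggregate_usage(records, view_by):
    """Sum tokens and estimated spend per value of the chosen dimension."""
    if view_by not in VIEW_BY_DIMENSIONS:
        raise ValueError(f"view_by must be one of {VIEW_BY_DIMENSIONS}")
    totals = defaultdict(
        lambda: {"input_tokens": 0, "completion_tokens": 0, "spent": Decimal("0")}
    )
    for record in records:
        row = totals[record[view_by]]
        row["input_tokens"] += record["input_tokens"]
        row["completion_tokens"] += record["completion_tokens"]
        row["spent"] += record["spent"]
    return dict(totals)


# Made-up records, for illustration only.
records = [
    {"model": "AZURE_GPT_4o_2024_1120", "user": "user_a", "assistant": "assistant_1",
     "input_tokens": 12_000, "completion_tokens": 3_000, "spent": Decimal("0.06")},
    {"model": "AZURE_GPT_4o_MINI_2024_0718", "user": "user_a", "assistant": "assistant_2",
     "input_tokens": 8_000, "completion_tokens": 2_000, "spent": Decimal("0.0024")},
    {"model": "AZURE_GPT_4o_2024_1120", "user": "user_b", "assistant": "assistant_1",
     "input_tokens": 10_000, "completion_tokens": 0, "spent": Decimal("0.025")},
]

by_model = aggregate_usage(records, "model")
by_user = aggregate_usage(records, "user")
```

A drill-down corresponds to filtering the records to one row's value (for example one model) and aggregating again by a different dimension.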


How Costs Are Calculated

Cost is computed per LLM call using the formula:

spent = (inputTokens × inputCostPer1M + completionTokens × completionCostPer1M) / 1,000,000

All costs are expressed in USD.
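As a worked example of the formula, the sketch below computes the cost of a single call. The function and parameter names are illustrative assumptions rather than the product's actual code, and `Decimal` is used here only to avoid floating-point rounding noise in the example.

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_spend(input_tokens, completion_tokens,
                   input_cost_per_1m, completion_cost_per_1m):
    """Apply the per-call formula: token counts times per-1M prices, divided by 1M."""
    return (
        Decimal(input_tokens) * Decimal(input_cost_per_1m)
        + Decimal(completion_tokens) * Decimal(completion_cost_per_1m)
    ) / TOKENS_PER_MILLION


# 12,000 prompt tokens and 3,000 completion tokens
# at $2.50 (input) and $10.00 (completion) per 1M tokens.
spent = estimate_spend(12_000, 3_000, "2.50", "10.00")
print(spent)
```

Here (12,000 × 2.50 + 3,000 × 10.00) / 1,000,000 = 60,000 / 1,000,000, so the call is estimated at 0.06 USD.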

Default Pricing Sources

Default prices are maintained by the Unique engineering team and sourced from:

Sample Default Prices (April 2026)

| Model | Input (USD / 1M tokens) | Completion (USD / 1M tokens) |
| --- | --- | --- |
| AZURE_GPT_4o_2024_1120 | $2.50 | $10.00 |
| AZURE_GPT_4o_MINI_2024_0718 | $0.15 | $0.60 |
| AZURE_GPT_41_2025_0414 | $2.00 | $8.00 |
| AZURE_o3_2025_0416 | $2.00 | $8.00 |
| AZURE_o4_MINI_2025_0416 | $1.10 | $4.40 |
| litellm:anthropic-claude-sonnet-4-5 | $3.00 | $15.00 |
| litellm:anthropic-claude-opus-4-5 | $5.00 | $25.00 |
| litellm:gemini-2-5-pro | $1.25 | $10.00 |
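To see how these sample prices feed into an estimate, the sketch below keeps a few of them in a lookup table and prices a call by model identifier. The dictionary layout and function name are assumptions for illustration; the prices are copied from the sample table and may differ from what a given deployment uses.

```python
from decimal import Decimal

# (input, completion) prices in USD per 1M tokens, copied from the sample table.
SAMPLE_PRICES = {
    "AZURE_GPT_4o_2024_1120": (Decimal("2.50"), Decimal("10.00")),
    "AZURE_GPT_4o_MINI_2024_0718": (Decimal("0.15"), Decimal("0.60")),
    "litellm:anthropic-claude-sonnet-4-5": (Decimal("3.00"), Decimal("15.00")),
    "litellm:gemini-2-5-pro": (Decimal("1.25"), Decimal("10.00")),
}


def estimate_for_model(model, input_tokens, completion_tokens, prices=SAMPLE_PRICES):
    """Look up the model's prices and estimate the cost of one call in USD."""
    if model not in prices:
        raise KeyError(f"no price configured for model {model!r}")
    input_price, completion_price = prices[model]
    return (input_tokens * input_price + completion_tokens * completion_price) / Decimal(1_000_000)


# The same workload (50,000 prompt tokens, 5,000 completion tokens) on two models.
cost_4o = estimate_for_model("AZURE_GPT_4o_2024_1120", 50_000, 5_000)
cost_mini = estimate_for_model("AZURE_GPT_4o_MINI_2024_0718", 50_000, 5_000)
```

For this workload the estimates are 0.175 USD on AZURE_GPT_4o_2024_1120 and 0.0105 USD on AZURE_GPT_4o_MINI_2024_0718.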

Per-Client Price Override

Prices can be customised per client. If a client's Azure contract or LiteLLM agreement carries different rates, the Unique engineering team can supply a custom pricing configuration for that deployment. Contact your Customer Success (CS) representative to arrange this.
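Conceptually, an override replaces the default price for specific models and leaves the rest untouched. The sketch below illustrates that idea only: the configuration format is an assumption (the real one is supplied by the Unique engineering team and is not documented here), and the override rates are hypothetical values invented for the example.

```python
from decimal import Decimal

# (input, completion) prices in USD per 1M tokens, from the sample defaults on this page.
DEFAULT_PRICES = {
    "AZURE_GPT_4o_2024_1120": (Decimal("2.50"), Decimal("10.00")),
    "AZURE_GPT_4o_MINI_2024_0718": (Decimal("0.15"), Decimal("0.60")),
}

# HYPOTHETICAL client-specific rates, invented for this example only.
CLIENT_OVERRIDES = {
    "AZURE_GPT_4o_2024_1120": (Decimal("2.00"), Decimal("8.00")),
}


def effective_prices(defaults, overrides):
    """Return a new mapping where client overrides replace matching defaults."""
    return {**defaults, **overrides}


prices = effective_prices(DEFAULT_PRICES, CLIENT_OVERRIDES)
```

Models without an override keep their default price, so a client only needs entries for the rates that actually differ.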


CSV Export

Usage data can be exported as a CSV file via the analytics export pipeline. User privacy settings (pseudonymisation or anonymisation) are respected per organisation configuration.

CSV columns:

| Column | Description |
| --- | --- |
| S/N | Row sequence number |
| User ID | User identifier (may be pseudonymised/anonymised per org settings) |
| Assistant ID | The assistant used (N/A if not applicable) |
| Chat ID | The conversation session |
| App ID | The Unique application |
| Language Model | Model identifier |
| Spent | Estimated cost in USD |
| Input Tokens | Number of prompt tokens |
| Completion Tokens | Number of output tokens |
| Timestamp | UTC timestamp of the LLM call |
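For offline analysis, an export with these columns can be summarised with a few lines of standard-library Python. This is a minimal sketch that assumes the file is comma-separated, has a header row using the column names listed above, and stores Spent as a plain decimal number; the sample rows and the timestamp format are made up for illustration.

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal

# Made-up sample rows; the header names follow the column list above.
SAMPLE_CSV = """S/N,User ID,Assistant ID,Chat ID,App ID,Language Model,Spent,Input Tokens,Completion Tokens,Timestamp
1,user_a,assistant_1,chat_1,app_1,AZURE_GPT_4o_2024_1120,0.06,12000,3000,2026-04-01T09:00:00Z
2,user_b,N/A,chat_2,app_1,AZURE_GPT_4o_MINI_2024_0718,0.0024,8000,2000,2026-04-01T09:05:00Z
3,user_a,assistant_1,chat_1,app_1,AZURE_GPT_4o_2024_1120,0.025,10000,0,2026-04-01T09:10:00Z
"""


def spend_per_model(csv_file):
    """Sum the Spent column per Language Model from an open CSV file object."""
    totals = defaultdict(Decimal)  # Decimal() starts each total at 0
    for row in csv.DictReader(csv_file):
        totals[row["Language Model"]] += Decimal(row["Spent"])
    return dict(totals)


totals = spend_per_model(io.StringIO(SAMPLE_CSV))
```

To run it against a real export, pass a file opened with `open(path, newline="")` instead of the `io.StringIO` sample.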


Required User Roles

| Role | What it grants |
| --- | --- |
| CHAT_FEEDBACK_READ | View usage reporting and model pricing in Admin |
| CHAT_DATA_ADMIN | Full access to model usage data in Admin |
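The role requirement amounts to an "either role is enough" check. The sketch below shows that logic only; the function name and the idea of passing roles as a list of strings are assumptions, and the platform's real authorisation code is not shown on this page.

```python
# Roles that grant access to the Cost Management section, as listed above.
COST_MANAGEMENT_ROLES = frozenset({"CHAT_FEEDBACK_READ", "CHAT_DATA_ADMIN"})


def can_view_cost_management(user_roles):
    """Return True if the user holds at least one of the required roles."""
    return not COST_MANAGEMENT_ROLES.isdisjoint(user_roles)


allowed = can_view_cost_management(["CHAT_DATA_ADMIN", "SOME_OTHER_ROLE"])
denied = can_view_cost_management(["SOME_OTHER_ROLE"])
```

Note that the section is also gated by the dashboard feature flag, so holding a role alone does not make it visible.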


Related Documentation