Availability of Pay-As-You-Go Models in Unique

Availability of Pay-As-You-Go Models in Unique


Overview

This page provides an overview of the pay-as-you-go models currently available within Unique. It highlights which models are accessible globally (all regions), specifically for Standard SWE, or for Standard CH.

The information is based on this Azure reference and augmented with our internal deployment knowledge.


Notes

  • Global Models: These models perform inference (data processing) across all regions provided by Microsoft, including the US. This is the default option for US tenants and can also be deployed in QA and OleOle for testing.

  • Standard SWE Models: Available exclusively in Sweden Central. This option applies when customers opt for data processing solely within Europe.

  • Data zone Standard Models: Data is processed somewhere in Europe (and not anymore only in Sweden). This is another option that applies when customers opt for data processing solely within Europe.

  • Standard CH Models: Designed specifically for Switzerland. This option applies when customers choose to have data processing only within Switzerland. This is the default option for all our Europe Single Tenants.

  • Request-Based Models: Certain models require a request and approval from Microsoft. (Refer to the Request Table below for details.)

  • PTU (Prepaid Through Unit): Some customers have purchased PTUs and maintain direct communication with Microsoft. However, they likely also follow a pay-as-you-go strategy, making this page relevant to them as well.

  • Escalations to Microsoft: If a model is not available, customers should escalate the request directly to Microsoft via the email contact they have from MS. The customer should submit a request specifying the model needed.


Model Availability Matrix

Model Name
(can be copy pasted into the configs)

Available in code with release, still needs provisioning

Supported Base Models for UniqueAI Agent
❌: Not suited

🟡: functionally tested works
✅ Extensive Quality tests done

Available Globally,

Only applicable for us-tenant
Quota
[tokens/min]

 

Data Zone Standard
EU

Standard SWE
Quota [tokens/min]

Standard CH
Quota
[tokens/min]

Needs Request for Standard Deployment?
(Only ST and CMT)

URL to Request

Available on MT?
Standard deployment only

Retirement
(link)

Additional information

Model Name
(can be copy pasted into the configs)

Available in code with release, still needs provisioning

Supported Base Models for UniqueAI Agent
❌: Not suited

🟡: functionally tested works
✅ Extensive Quality tests done

Available Globally,

Only applicable for us-tenant
Quota
[tokens/min]

 

Data Zone Standard
EU

Standard SWE
Quota [tokens/min]

Standard CH
Quota
[tokens/min]

Needs Request for Standard Deployment?
(Only ST and CMT)

URL to Request

Available on MT?
Standard deployment only

Retirement
(link)

Additional information

AZURE_GPT_35_TURBO_0125

before
2025.12

400K

 

No earlier than July 16, 2025

 

AZURE_GPT_4_0613

before
2025.12

90K

June 6, 2025

 

AZURE_GPT_4_32K_0613

before
2025.12

170K

June 6, 2025

 

AZURE_GPT_4_TURBO_2024_0409

before
2025.12

50k

June 6, 2025

 

AZURE_GPT_4o_2024_0513

before
2025.12

🟡 (use AZURE_GPT_4o_2024_1120 as replacement)

500k

 

 

AZURE_GPT_4o_2024_0806

before
2025.12

🟡 (use AZURE_GPT_4o_2024_1120 as replacement)

200k

 

 

AZURE_GPT_4o_2024_1120

2025.18

 

1M

 

 

AZURE_GPT_4o_MINI_2024_0718

before
2025.12

700k

 

 

AZURE_o1_2024_1217

2025.14

2.5M

 

The system prompt is not allowed. The temperature and topP is always set to 1, —these values are hardcoded in the backend

See: https://github.com/Unique-AG/monorepo/pull/11597/files .

AZURE_o1_MINI_2024_0912

before
2025.12

60K

 

image-20250311-120431.png

 

 

The system prompt is not allowed. The temperature and topP is always set to 1, —these values are hardcoded in the backend

See: https://github.com/Unique-AG/monorepo/pull/11597/files .

AZURE_o3_MINI_2025_0131

2025.14

❌ (no image processing)

2.5M

 

image-20250311-120439.png

 

 

The system prompt is not allowed. The temperature and topP is always set to 1, —these values are hardcoded in the backend

See: https://github.com/Unique-AG/monorepo/pull/11597/files .

AZURE_GPT_45_preview

2025.12

 

https://customervoice.microsoft.com/Pages/ResponsePage.aspx?id=v4j5cvGGr0GRqy180BHbR7en2Ais5pxKtso_Pz4b1_xUQTVJMVdCODFMUEhISlpGUVI2M1JEWEtTTSQlQCN0PWcu

 

 

AZURE_GPT_41_2025_0414

2025.14

🟡

5M

2M

 

 

 

AZURE_GPT_41_MINI_2025_0414

2025.24

 

5M

 

 

 

AZURE_GPT_41_NANO_2025_0414

2025.24

 

5M

2M

 

 

 

AZURE_o3_2025_0416

2025.20

5M

 

 

 

AZURE_o4_MINI_2025_0416

2025.20

🟡

5M

 

 

 

litellm:openai-o1

2025.20

 

 

 

litellm:openai-o3

2025.20

 

 

 

litellm:openai-o4-mini

2025.20

🟡

 

 

 

litellm:openai-gpt-4-1-mini

2025.20

 

 

 

 

litellm:openai-gpt-4-1-nano

2025.20

 

 

 

 

litellm:openai-o3-pro

2025.26

❌: Chat Completion API is not supported for o3-pro. Only response API.

 

 

 

litellm:anthropic-claude-3-7-sonnet-thinking

2025.20

🟡

 

 

 

litellm:anthropic-claude-3-7-sonnet

2025.20

🟡

 

 

 

litellm:gemini-2-5-flash-preview-04-17

2025.20

🟡

 

 

 

litellm:llama-3-3-70b-instruct-turbo

2025.20

🟡

  • no forced tool calls possible

  • web search creates a string instead of a tool call

 

 

 

litellm:deepseek-r1

2025.20

 

 

 


Request Process for Restricted Models

If a model requires a request, follow these steps:

  1. Submit a Request: Open a ticket in [Jira/ServiceNow] with details on why the model is needed.

  2. Approval Workflow: The request will be reviewed by the Data Science Team, who will decide whether the client should make the request themselves or if Unique should do it

  3. On Approval: A request is made on behalf of the client for our managed single tenants customer.

  4. Access Provisioning: Upon approval, provision for the client using our terraform and tests on next, qa, prod and OleOle.


Author

@Pascal Hauri

 

© 2025 Unique AG. All rights reserved. Privacy PolicyTerms of Service