Unique lets customers use multiple or custom embedding models on a single Unique instance. For example, Company A can use the default embedding model provided by Unique, while Company B uses a custom embedding model that it provides itself.

Setup

To set up additional embedding models, contact the maintainer of the instance, because this requires setting an environment variable on the node-ingestion application. The environment variable (ADDITIONAL_EMBEDDING_MODELS_JSON) holds a JSON array in which each entry describes one model with these properties:

ADDITIONAL_EMBEDDING_MODELS_JSON='[{
  "endpoint": "https://myInstance.openai.azure.com/",
  "apiKey": "mySecretApiKey",
  "apiVersion": "2024-02-01",
  "deploymentName": "text-embedding-ada-002"
}]'
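
Because the value is a JSON string inside an environment variable, a typo is easy to miss until the application restarts. The following is a minimal Python sketch for checking the value beforehand. It assumes that all four properties shown above are required on every entry, which the example suggests but does not state; the function name parse_additional_embedding_models is hypothetical and not part of Unique.

```python
import json

# Assumption: every entry needs exactly the properties shown in the example above.
REQUIRED_KEYS = {"endpoint", "apiKey", "apiVersion", "deploymentName"}


def parse_additional_embedding_models(raw):
    """Parse the value of ADDITIONAL_EMBEDDING_MODELS_JSON and check its shape."""
    models = json.loads(raw)  # raises json.JSONDecodeError on malformed JSON
    if not isinstance(models, list):
        raise ValueError("expected a JSON array of model objects")
    for index, model in enumerate(models):
        if not isinstance(model, dict):
            raise ValueError(f"entry {index} is not a JSON object")
        missing = REQUIRED_KEYS - model.keys()
        if missing:
            raise ValueError(f"entry {index} is missing: {sorted(missing)}")
    return models


example = """[{
  "endpoint": "https://myInstance.openai.azure.com/",
  "apiKey": "mySecretApiKey",
  "apiVersion": "2024-02-01",
  "deploymentName": "text-embedding-ada-002"
}]"""

models = parse_additional_embedding_models(example)
print([model["deploymentName"] for model in models])
```

In practice the string would come from the environment (for example via os.environ) rather than from a literal, and the check would run before the node-ingestion application is restarted.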

Configuration

Once the environment variable is set and the application has been restarted, a company of Unique FinanceGPT can be configured to use the new additional embedding model.
To configure a company to use a different embedding model, update the company's entry in the CompanyMeta table so that it specifies the additional embedding model. This can be done via an API request:

curl --location 'https://gateway.<tenantName>.unique.app/ingestion/graphql' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer <token>' \
--data '{"query":"mutation CompanyMetaUpdate($embeddingModel: String) {\n  companyMetaUpdate(embeddingModel: $embeddingModel) {\n    companyId\n  }\n}","variables":{"embeddingModel":"text-embedding-ada-002"}}'

The request changes the entry of the company that the token's user belongs to. After this change, both the generation of the search embedding vector during a semantic search and the ingestion of new data use the specified embedding model.
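
The same request can also be issued from a script. The Python sketch below rebuilds the request from the curl command using only the standard library. The helper name build_company_meta_update_request and the placeholder values myTenant and myToken are hypothetical. It also assumes that the embeddingModel value should match the deploymentName configured in ADDITIONAL_EMBEDDING_MODELS_JSON, as the examples on this page suggest.

```python
import json
import urllib.request

# GraphQL mutation copied from the curl command above.
MUTATION = (
    "mutation CompanyMetaUpdate($embeddingModel: String) {\n"
    "  companyMetaUpdate(embeddingModel: $embeddingModel) {\n"
    "    companyId\n"
    "  }\n"
    "}"
)


def build_company_meta_update_request(tenant_name, token, embedding_model):
    """Build (but do not send) the POST request that sets the embedding model."""
    url = f"https://gateway.{tenant_name}.unique.app/ingestion/graphql"
    body = json.dumps(
        {"query": MUTATION, "variables": {"embeddingModel": embedding_model}}
    ).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


# Placeholder tenant and token; replace them with real values before sending.
request = build_company_meta_update_request(
    "myTenant", "myToken", "text-embedding-ada-002"
)
print(request.get_method(), request.full_url)

# Sending is left commented out so the sketch has no side effects:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

Keeping the construction of the request separate from sending it makes it possible to inspect the URL and payload first, which is useful because the mutation affects the company of whichever user the token belongs to.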

Re-Embed existing content

If the company has already ingested data, this data must be re-embedded to align with the new embedding model. Otherwise you end up with inconsistent embedding vectors, because existing and newly created vectors may come from different models. Re-embedding can be done with the content admin endpoint that re-embeds all contents. This is documented in this section: https://unique-ch.atlassian.net/wiki/x/RYHHIQ
