Using or changing a custom embedding model
Unique lets customers use multiple or custom embedding models on a single Unique instance. For example, Company A can use the default embedding model provided by Unique while Company B uses a custom embedding model.
Setup
To set up additional embedding models, contact the maintainer of the instance, because this requires setting an environment variable on the node-ingestion application. The environment variable (ADDITIONAL_EMBEDDING_MODELS_JSON) has the following properties:
ADDITIONAL_EMBEDDING_MODELS_JSON='[{
"endpoint": "https://myInstance.openai.azure.com/",
"apiKey": "mySecretApiKey",
"apiVersion": "2024-02-01",
"deploymentName": "text-embedding-ada-002"
}]'
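A hand-written JSON value is easy to get wrong, so it may help to check it before passing it to the instance maintainer. The following is a minimal sketch, assuming that every entry needs the four properties shown above as non-empty strings. The function name `validate_embedding_models` and the strictness of the checks are illustrative assumptions, not the validation that node-ingestion actually performs.

```python
import json

# Property names taken from the example above; treating all four as
# required non-empty strings is an assumption of this sketch.
REQUIRED_KEYS = ("endpoint", "apiKey", "apiVersion", "deploymentName")


def validate_embedding_models(raw: str) -> list[dict]:
    """Parse an ADDITIONAL_EMBEDDING_MODELS_JSON value and check its shape."""
    models = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on invalid JSON
    if not isinstance(models, list):
        raise ValueError("expected a JSON array of model entries")
    for index, entry in enumerate(models):
        if not isinstance(entry, dict):
            raise ValueError(f"entry {index} is not a JSON object")
        for key in REQUIRED_KEYS:
            value = entry.get(key)
            if not isinstance(value, str) or not value:
                raise ValueError(f"entry {index}: missing or empty '{key}'")
    return models


if __name__ == "__main__":
    example = """[{
      "endpoint": "https://myInstance.openai.azure.com/",
      "apiKey": "mySecretApiKey",
      "apiVersion": "2024-02-01",
      "deploymentName": "text-embedding-ada-002"
    }]"""
    for model in validate_embedding_models(example):
        print(model["deploymentName"])
```

The value is an array, so several additional models can be listed side by side; the sketch checks each entry independently.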
Configuration
Once the environment variable is set and the application has been restarted, you can configure a company in Unique FinanceGPT to use the additional embedding model.
To configure a company to use a different embedding model, update the company's entry in the CompanyMeta table so that it specifies the additional embedding model. This can be done via an API request:
curl --location 'https://gateway.<tenantName>.unique.app/ingestion/graphql' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer <token>' \
--data '{"query":"mutation CompanyMetaUpdate($embeddingModel: String) {\n companyMetaUpdate(embeddingModel: $embeddingModel) {\n companyId\n }\n}","variables":{"embeddingModel":"text-embedding-ada-002"}}'
The request changes the entry of the company that the token's user belongs to. After this change, both the generation of the search embedding vector during a semantic search and the ingestion of new data use the specified embedding model.
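The same request can be prepared from a script. The sketch below mirrors the curl call using only the Python standard library. `build_company_meta_update` is a hypothetical helper name, and the tenant name and token are placeholders you must supply. In the examples on this page the `embeddingModel` value equals the `deploymentName` from the environment variable; that the two must match is an assumption here, not something the page states.

```python
import json
import urllib.request

# Mutation text copied from the curl example above.
MUTATION = (
    "mutation CompanyMetaUpdate($embeddingModel: String) {\n"
    " companyMetaUpdate(embeddingModel: $embeddingModel) {\n"
    " companyId\n"
    " }\n"
    "}"
)


def build_company_meta_update(
    tenant_name: str, token: str, embedding_model: str
) -> urllib.request.Request:
    """Build (but do not send) the GraphQL request that sets the embedding model."""
    payload = {"query": MUTATION, "variables": {"embeddingModel": embedding_model}}
    return urllib.request.Request(
        url=f"https://gateway.{tenant_name}.unique.app/ingestion/graphql",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder values; replace with a real tenant name and token before sending.
    request = build_company_meta_update("my-tenant", "my-token", "text-embedding-ada-002")
    print(request.full_url)
    # To send: urllib.request.urlopen(request), then inspect the JSON response.
```

Building the request separately from sending it makes it possible to review the URL and payload before anything is changed on the instance.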
Re-Embed existing content
If the company has already ingested data, this data must be re-embedded to align with the new embedding model. Otherwise, the stored embedding vectors and the vectors generated at search time may come from different models and will not match. The re-embedding can be done with the content admin endpoint that re-embeds all contents.
This is documented in this section: https://unique-ch.atlassian.net/wiki/x/RYHHIQ
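To see why mixing models is a problem, consider how a semantic search compares vectors. The toy sketch below uses made-up three- and four-dimensional vectors (real embedding vectors are far larger) and a hypothetical `cosine_similarity` helper. It illustrates the general issue only and does not describe how Unique stores or compares vectors.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two vectors; refuses vectors of different length."""
    if len(a) != len(b):
        raise ValueError(f"dimension mismatch: {len(a)} vs {len(b)}")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    stored_old_model = [0.1, 0.7, 0.2]      # made-up vector "from the old model"
    query_new_model = [0.3, 0.1, 0.5, 0.4]  # made-up vector "from the new model"
    try:
        cosine_similarity(stored_old_model, query_new_model)
    except ValueError as error:
        print(f"cannot compare: {error}")
```

Even when two models produce vectors of the same length, their vector spaces are unrelated, so a similarity score can be computed but carries no meaning. Re-embedding the existing content avoids both cases.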
Author: @Adrian Gugger
© 2024 Unique AG. All rights reserved.