Additional ingestion configuration options
This page describes the additional options available for configuring ingestion.
Ingestion configuration
The complete ingestionConfig is shown below. Where a field lists several values, the first one is the default; single values are the defaults. The values can be adjusted as described in the following sections and can be set on different levels.
{
  ingestionConfig: {
    chunkMaxTokens: 600,
    chunkMaxTokensOnePager: 1000,
    chunkMinTokens: 3,
    chunkStrategy: 'UNIQUE_DEFAULT_CHUNKING' | 'CUSTOM_CHUNKING_API',
    pdfReadMode: 'PDFTODOCX_ONLY' | 'DOC_INTELLIGENCE_DEFAULT' | 'DOC_INTELLIGENCE_ON_TABLE' | 'DOC_INTELLIGENCE_FALLBACK' | 'DOC_INTELLIGENCE_DISABLED' | 'CUSTOM_SINGLE_PAGE_API',
    jpgReadMode: 'NO_INGESTION' | 'DOC_INTELLIGENCE_DEFAULT',
    wordReadMode: 'MAMMOTH_ONLY' | 'DOC_INTELLIGENCE_DEFAULT',
    uniqueIngestionMode: 'INGESTION' | 'SKIP_INGESTION' | 'EXTERNAL_INGESTION',
    documentMinTokens: 25,
    customApiOptions: [] | Array<{
      customisationType: 'CUSTOM_SINGLE_PAGE_API' | 'CUSTOM_CHUNKING_API',
      apiIdentifier: 'YOUR IDENTIFIER',
      apiPayload?: '{"xxx": "yyyy"}'
    }>
  }
}
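For readers who want to work with this structure in code, the listing can be read as a type plus a set of defaults. The following is a minimal sketch, not an official Unique type definition: the type and constant names (IngestionConfig, CustomApiOption, defaultIngestionConfig) are hypothetical, and treating apiPayload as a JSON-encoded string is an assumption based on the quoted example value. Field names, literal values and defaults are taken from the listing.

```typescript
// Hypothetical type names; field names and literal values come from the listing above.
type ChunkStrategy = 'UNIQUE_DEFAULT_CHUNKING' | 'CUSTOM_CHUNKING_API';
type PdfReadMode =
  | 'PDFTODOCX_ONLY'
  | 'DOC_INTELLIGENCE_DEFAULT'
  | 'DOC_INTELLIGENCE_ON_TABLE'
  | 'DOC_INTELLIGENCE_FALLBACK'
  | 'DOC_INTELLIGENCE_DISABLED'
  | 'CUSTOM_SINGLE_PAGE_API';
type JpgReadMode = 'NO_INGESTION' | 'DOC_INTELLIGENCE_DEFAULT';
type WordReadMode = 'MAMMOTH_ONLY' | 'DOC_INTELLIGENCE_DEFAULT';
type UniqueIngestionMode = 'INGESTION' | 'SKIP_INGESTION' | 'EXTERNAL_INGESTION';

interface CustomApiOption {
  customisationType: 'CUSTOM_SINGLE_PAGE_API' | 'CUSTOM_CHUNKING_API';
  apiIdentifier: string;
  // ASSUMPTION: the payload is JSON encoded as a string, as the quoted example suggests.
  apiPayload?: string;
}

interface IngestionConfig {
  chunkMaxTokens: number;
  chunkMaxTokensOnePager: number;
  chunkMinTokens: number;
  chunkStrategy: ChunkStrategy;
  pdfReadMode: PdfReadMode;
  jpgReadMode: JpgReadMode;
  wordReadMode: WordReadMode;
  uniqueIngestionMode: UniqueIngestionMode;
  documentMinTokens: number;
  customApiOptions: CustomApiOption[];
}

// Defaults: the first value of each field in the listing.
const defaultIngestionConfig: IngestionConfig = {
  chunkMaxTokens: 600,
  chunkMaxTokensOnePager: 1000,
  chunkMinTokens: 3,
  chunkStrategy: 'UNIQUE_DEFAULT_CHUNKING',
  pdfReadMode: 'PDFTODOCX_ONLY',
  jpgReadMode: 'NO_INGESTION',
  wordReadMode: 'MAMMOTH_ONLY',
  uniqueIngestionMode: 'INGESTION',
  documentMinTokens: 25,
  customApiOptions: [],
};
```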
The ingestion configuration can be set on different levels:
On space level, for uploads in chat: see Ingestion Configuration: MS Document Intelligence (Layout) Ingestion | Enable for Upload in Chat
On scope/folder level, via API: see Ingestion Configuration: MS Document Intelligence GA Version (deprecated) | Enable for Scope In Knowledge Base
On the content object, directly on file upload: see https://unique-ch.atlassian.net/wiki/x/OIBdIw
On instance level, for all companies on a tenant: contact Unique Customer Success.
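This page does not state how values set on different levels interact. Purely as an illustration, the sketch below assumes that a more specific level (for example the content object) overrides a less specific one (for example the instance). That precedence, the LevelConfig type and the resolveLevels function are all assumptions and should be checked with Unique before relying on them.

```typescript
// Subset of the ingestionConfig fields, for brevity.
interface LevelConfig {
  chunkMaxTokens?: number;
  documentMinTokens?: number;
  uniqueIngestionMode?: 'INGESTION' | 'SKIP_INGESTION' | 'EXTERNAL_INGESTION';
}

// ASSUMPTION: levels are passed from least to most specific, and later levels win.
// This precedence is not confirmed by the documentation.
function resolveLevels(...levels: LevelConfig[]): LevelConfig {
  // Object spread copies the fields present on each level over the accumulated result.
  return levels.reduce<LevelConfig>((acc, level) => ({ ...acc, ...level }), {});
}

// Example: instance-level values, then a scope override, then a content override.
const effectiveConfig = resolveLevels(
  { chunkMaxTokens: 600, documentMinTokens: 25 },
  { chunkMaxTokens: 800 },
  { uniqueIngestionMode: 'SKIP_INGESTION' },
);
```

Under the stated assumption, effectiveConfig keeps documentMinTokens from the first level, takes chunkMaxTokens from the second and uniqueIngestionMode from the third.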
Setting the general Unique FinanceGPT ingestion mode
This mode defines the overall behaviour of the ingestion. There are three possible options:
INGESTION (default)
SKIP_INGESTION
EXTERNAL_INGESTION
Mode INGESTION
This is the default mode. Unique FinanceGPT queues the content for ingestion by Unique. Some customisations are still possible during this ingestion flow, but Unique generally leads the ingestion process.
Mode SKIP_INGESTION
This mode directly sets the status of uploaded content to FINISHED, meaning Unique FinanceGPT expects that the content needs no ingestion. A use case is uploaded images or charts that are to be referenced in a chat message.
Mode EXTERNAL_INGESTION
The external ingestion mode tells Unique FinanceGPT that the whole ingestion process for this content will be handled by an SDK integration. Unique therefore also skips its own ingestion process but sets the status of the content to QUEUED, on the assumption that the SDK will then pick up this message, start ingesting, add chunks, and update the status of the content when finished.
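The three modes can be summarised as a small lookup from mode to the resulting content status and the party responsible for ingestion. This is a sketch of the behaviour described above, not Unique's implementation: the UploadHandling type, the handlingFor function and the ingestedBy labels are hypothetical, and the status name for the default mode is an assumption (the text only says the content is queued).

```typescript
type UniqueIngestionMode = 'INGESTION' | 'SKIP_INGESTION' | 'EXTERNAL_INGESTION';

interface UploadHandling {
  initialStatus: 'QUEUED' | 'FINISHED';
  ingestedBy: 'UNIQUE' | 'SDK' | 'NOBODY';
}

// Maps a mode to the handling described in the text; INGESTION is the default.
function handlingFor(mode: UniqueIngestionMode = 'INGESTION'): UploadHandling {
  switch (mode) {
    case 'INGESTION':
      // ASSUMPTION: the text says the content is "queued"; the status name is assumed.
      return { initialStatus: 'QUEUED', ingestedBy: 'UNIQUE' };
    case 'SKIP_INGESTION':
      // Content is marked FINISHED immediately; no ingestion takes place.
      return { initialStatus: 'FINISHED', ingestedBy: 'NOBODY' };
    case 'EXTERNAL_INGESTION':
      // Unique skips its own ingestion; the SDK integration is expected to take over.
      return { initialStatus: 'QUEUED', ingestedBy: 'SDK' };
  }
}
```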
Additional Parameters
The following ingestion parameters can also be set on the ingestion config of a space, scope or content.
chunkMaxTokens (number)
Defines the maximum number of tokens a normal chunk is allowed to have.
chunkMaxTokensOnePager (number)
Defines the maximum number of tokens for a chunk of a content with only one page in total. With the normal limit, such a content might otherwise be split into two chunks, at least one of them too small to carry contextual meaning.
chunkMinTokens (number)
Defines the minimum number of tokens a chunk needs to have.
documentMinTokens (number)
Defines the minimum number of tokens a content needs to have. If a document does not reach this number, this can indicate to its owner that either the document is not meaningful or the ingestion process was not able to parse the content correctly (for example, scanned documents that contain only images and no text).
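To make the interplay of these thresholds concrete, the sketch below applies them to plain token counts. It is an illustration under assumptions, not Unique's chunking logic: the function names are hypothetical, the bounds are assumed to be inclusive, and how Unique counts tokens or what it does with chunks outside the limits is not described on this page. The default values are the ones from the ingestionConfig listing.

```typescript
interface TokenLimits {
  chunkMaxTokens: number;
  chunkMaxTokensOnePager: number;
  chunkMinTokens: number;
  documentMinTokens: number;
}

// Default values as given in the ingestionConfig listing.
const defaultLimits: TokenLimits = {
  chunkMaxTokens: 600,
  chunkMaxTokensOnePager: 1000,
  chunkMinTokens: 3,
  documentMinTokens: 25,
};

// Upper bound for a chunk: content with exactly one page uses the one-pager limit.
function maxChunkTokens(pageCount: number, limits: TokenLimits = defaultLimits): number {
  return pageCount === 1 ? limits.chunkMaxTokensOnePager : limits.chunkMaxTokens;
}

// ASSUMPTION: both bounds are inclusive.
function isChunkSizeValid(
  tokenCount: number,
  pageCount: number,
  limits: TokenLimits = defaultLimits,
): boolean {
  return tokenCount >= limits.chunkMinTokens && tokenCount <= maxChunkTokens(pageCount, limits);
}

// True when the whole document falls below documentMinTokens, which may point to
// a meaningless document or a failed parse (for example an image-only scan).
function isDocumentBelowMinimum(totalTokens: number, limits: TokenLimits = defaultLimits): boolean {
  return totalTokens < limits.documentMinTokens;
}
```

With the defaults, an 800-token chunk passes the check for a one-page content but not for a multi-page one, which is the case chunkMaxTokensOnePager is meant to cover.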
Author: @Adrian Gugger
© 2025 Unique AG. All rights reserved. Privacy Policy – Terms of Service