Ingestion API
Ingestion (Text)
GraphQL Endpoint:
curl --location 'http://gateway.<baseUrl>/ingestion/graphql' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer <your-access-token>' \
--data '{"query":"mutation ContentUpsert($input: ContentCreateInput!, $scopeId: String, $sourceOwnerType: String, $sourceName: String, $storeInternally: Boolean, $text: String, $sourceKind: String)\r\n{\r\n contentUpsert(\r\n input: $input\r\n scopeId: $scopeId\r\n sourceOwnerType: $sourceOwnerType\r\n sourceName: $sourceName\r\n storeInternally: $storeInternally\r\n text: $text\r\n sourceKind: $sourceKind\r\n ) {\r\n id\r\n key\r\n byteSize\r\n mimeType\r\n ownerType\r\n ownerId\r\n writeUrl\r\n readUrl\r\n createdAt\r\n internallyStoredAt\r\n }\r\n}\r\n","variables":{"input":{"title":"Lorem Ipsum","key":"<unique identifier of the content>","ownerType":"<COMPANY | SCOPE | USER | CHAT>","mimeType":"text/html"},"scopeId":"<SCOPE_ID | PATH | COMPANY>","sourceOwnerType":"<USER | COMPANY | SCOPE>","sourceKind":"<UNIQUE_BLOB_STORAGE | FILE_DOWNLOAD | ATLASSIAN_CONFLUENCE_CLOUD | ATLASSIAN_CONFLUENCE_ONPREM | MICROSOFT_365_SHAREPOINT | INTRANET>","text":"Lorem Ipsum"}}'
Parameters:
input
[json]
key
required [string]
A unique external identifier of this content. Since the ingestion is an upsert, ingesting with the same key and scopeId will update the content. Therefore, to update content, post the same payload with the updated text, title, etc.
mimeType
required [string]
The MIME type of the content, see Common MIME types - HTTP | MDN
ownerType
required [enum]
One of USER, COMPANY, CHAT, SCOPE.
title
optional [string]
The title of this content. This will be shown when a reference is displayed in the answer from the AI.
url
optional [string]
The url to the source of the content. This will be shown when a reference is displayed in the answer from the AI.
metadata
optional [json]
The document metadata.
text
optional [string]
The main text content that will be ingested. This can be a string of raw text, HTML, or Markdown. When using HTML or Markdown, it is strongly advised to remove unnecessary and irrelevant data (such as headers and footers).
scopeId
optional [string]
This will be provided by Unique during the setup phase (depending on the scope).
storeInternally
optional [boolean]
Whether the documents are persisted in Unique’s storage.
sourceKind
optional [enum]
One of UNIQUE_BLOB_STORAGE (default), FILE_DOWNLOAD, ATLASSIAN_CONFLUENCE_CLOUD, ATLASSIAN_CONFLUENCE_ONPREM, MICROSOFT_365_SHAREPOINT, INTRANET.
sourceName
optional [string]
The name of the source, for example “Intranet connector”.
sourceOwnerType
optional [enum]
One of USER, COMPANY, CHAT, SCOPE.
Example
Example JSON with the structure that our Confluence connector uses:
{
  "input": {
    "title": "Example Title",
    "key": "123",
    "ownerType": "COMPANY",
    "mimeType": "text/html"
  },
  "scopeId": "COMPANY",
  "sourceOwnerType": "COMPANY",
  "sourceKind": "INTRANET",
  "text": "Lorem Ipsum"
}
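The variables for a text ingestion can also be assembled programmatically. The following is a minimal Python sketch, not part of the official API: the helper name build_text_variables and its validation rules are assumptions made for illustration, while the field names and enum values are taken from the parameter list above.

```python
import json

# Enum values taken from the parameter list above.
OWNER_TYPES = {"USER", "COMPANY", "CHAT", "SCOPE"}


def build_text_variables(key, mime_type, owner_type, text, scope_id=None,
                         title=None, source_owner_type=None, source_kind=None):
    """Build the GraphQL variables for a text ingestion (hypothetical helper)."""
    if not key or not mime_type:
        raise ValueError("key and mimeType are required")
    if owner_type not in OWNER_TYPES:
        raise ValueError(f"unknown ownerType: {owner_type!r}")

    content_input = {"key": key, "mimeType": mime_type, "ownerType": owner_type}
    if title is not None:
        content_input["title"] = title

    variables = {"input": content_input, "text": text}
    optional = {
        "scopeId": scope_id,
        "sourceOwnerType": source_owner_type,
        "sourceKind": source_kind,
    }
    # Only include optional variables that were actually provided.
    variables.update({k: v for k, v in optional.items() if v is not None})
    return variables


variables = build_text_variables(
    key="123", mime_type="text/html", owner_type="COMPANY", text="Lorem Ipsum",
    scope_id="COMPANY", title="Example Title",
    source_owner_type="COMPANY", source_kind="INTRANET",
)
print(json.dumps(variables, indent=2))
```

Because the ingestion is an upsert, calling the helper again with the same key and scopeId but a different text would produce the payload for an update rather than a new document.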
Ingestion (File)
General Flow
Currently, the ingestion service does not accept arrays of files, so separate calls must be made for each document: one to save the metadata and one to mark the document content as ready for ingestion.
Three successive calls must be made to Unique to trigger the ingestion:
1. The initial call to the ingestion service stores the document metadata and returns the writeUrl for file storage.
2. A call to the blob storage stores the document and makes it available for ingestion.
3. The second call to the ingestion service triggers the document content ingestion.
Credentials
We recommend creating a service user and using its credentials to call the ingestion API. To create a service user and obtain a valid JWT access token for it, refer to this documentation: https://unique-ch.atlassian.net/wiki/x/KYAUIw.
The received access token must be passed to the ingestion calls in the headers as a bearer token:
"Authorization: Bearer your_access_token"
Writes to the blob storage do not need credentials because they are made through a signed URL.
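As an illustration of the header format, here is a small Python sketch. The helper name build_headers is hypothetical, and the application/json content type is an assumption based on the JSON request bodies used in the curl examples.

```python
def build_headers(access_token):
    """Return HTTP headers for an ingestion call (hypothetical helper)."""
    if not access_token:
        raise ValueError("an access token is required")
    return {
        # Assumption: the GraphQL request body is sent as JSON.
        "Content-Type": "application/json",
        # The access token is passed as a bearer token.
        "Authorization": f"Bearer {access_token}",
    }


headers = build_headers("your_access_token")
print(headers["Authorization"])
```

These headers apply only to the ingestion service calls; the upload to the signed URL does not use the access token.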
Ingestion Calls
Find below curl examples with placeholders. All calls to the ingestion service are POST calls.
Initial call to ingestion service
curl --location 'http://gateway.<baseUrl>/ingestion/graphql' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer <your-access-token>' \
--data '{"query":"mutation ContentUpsert($input: ContentCreateInput!, $scopeId: String, $sourceOwnerType: String, $sourceName: String, $storeInternally: Boolean) {\n contentUpsert(\n input: $input\n scopeId: $scopeId\n sourceOwnerType: $sourceOwnerType\n sourceName: $sourceName\n storeInternally: $storeInternally\n ) {\n id\n key\n byteSize\n mimeType\n ownerType\n ownerId\n writeUrl\n readUrl\n createdAt\n internallyStoredAt\n }\n}","variables":{"input":{"key":"<arbitrary id. Can be an external id, file name, etc>","mimeType":"<DOCUMENT_MIME_TYPE>","ownerType":"<COMPANY | SCOPE>","byteSize":<DOCUMENT_BYTE_SIZE>},"scopeId":"<SCOPE_ID | PATH | COMPANY>","sourceOwnerType":"<USER | COMPANY | SCOPE>","storeInternally":<boolean>,"sourceName": "<Custom string to identify the source>"}}'
Upload to blob storage
The document is sent to the blob storage using the signed URL received in the response body of the initial call to the ingestion service, in data.contentUpsert.writeUrl.
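Reading the signed URL out of the initial response could look like the following Python sketch. The function name extract_write_url and the sample response are illustrative assumptions; only the path data.contentUpsert.writeUrl comes from this page, and the errors key is the standard GraphQL error field.

```python
def extract_write_url(response_body):
    """Return data.contentUpsert.writeUrl from a parsed GraphQL response."""
    # GraphQL reports failures in a top-level "errors" list.
    if response_body.get("errors"):
        raise RuntimeError(f"ingestion call failed: {response_body['errors']}")
    try:
        return response_body["data"]["contentUpsert"]["writeUrl"]
    except (KeyError, TypeError) as exc:
        raise RuntimeError("response does not contain a writeUrl") from exc


# Made-up response for illustration; real responses carry more fields.
sample_response = {
    "data": {
        "contentUpsert": {
            "id": "example-id",
            "writeUrl": "https://example.invalid/signed-write-url",
        }
    }
}
write_url = extract_write_url(sample_response)
print(write_url)
```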
Second call to ingestion service
The second call, which marks the file for ingestion, must be the exact same call with the same parameters, plus the extra field fileUrl.
Passing fileUrl to the second call tells the service that the file is available for reads, and thus for ingestion.
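One way to keep the two ingestion calls identical apart from fileUrl is to derive the second set of variables from the first. The Python sketch below is an illustration under stated assumptions: the helper name with_file_url is hypothetical, and placing fileUrl at the top level of the variables (next to scopeId) is a guess, since this page does not show the exact position or the value to pass. The mutation would presumably also need to declare the corresponding variable.

```python
import copy


def with_file_url(initial_variables, file_url):
    """Return a copy of the initial variables with fileUrl added."""
    # Deep copy so the variables of the initial call stay untouched.
    second_call_variables = copy.deepcopy(initial_variables)
    # Assumption: fileUrl sits at the top level, next to scopeId.
    second_call_variables["fileUrl"] = file_url
    return second_call_variables


initial_variables = {
    "input": {
        "key": "report.pdf",
        "mimeType": "application/pdf",
        "ownerType": "COMPANY",
        "byteSize": 1024,
    },
    "scopeId": "COMPANY",
    "sourceOwnerType": "COMPANY",
    "storeInternally": True,
}
# Placeholder URL; the actual value must come from your own integration.
second_variables = with_file_url(initial_variables, "https://example.invalid/file")
print(sorted(second_variables))
```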
For testing purposes, sourceOwnerType, scopeId, and ownerType can all be set to COMPANY, giving automatic access to all chat users. Access scoping can be discussed later based on your needs.
Scope management (retrieval of existing scopes, creation of new scopes, etc) through the API is documented here.
Call to Ingestion service: parameters
ownerType and scopeId must be set in accordance with each other. Possible values are:
- ownerType: SCOPE or COMPANY
- scopeId: COMPANY, PATH, or the scope ID applicable for the document
Possible sets of values and behavior:
- If both variables are set to COMPANY, the files will be available to the whole company.
- If scopeId is set to PATH, ownerType must be set to SCOPE. Use this when you want to match or create scopes based on the folder structure of the document source. Each folder generates its own permission scope, which applies to the files directly inside it but not to its subfolders; subfolders get their own scope because the permission set flattens the folder structure.
- If scopeId is set to a specific scope ID, ownerType must be set to SCOPE.
- Any other set of values will fail the ingestion calls.
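The combinations above can be checked on the client before a call is made. The following Python sketch encodes the rules as listed here; the function name is_valid_scope_combination is hypothetical, and the service's own validation may differ in detail.

```python
def is_valid_scope_combination(owner_type, scope_id):
    """Check ownerType/scopeId against the combinations listed above."""
    if not scope_id:
        return False
    if scope_id == "COMPANY":
        # Company-wide access requires both values to be COMPANY.
        return owner_type == "COMPANY"
    # PATH or a specific scope ID: ownerType must be SCOPE.
    return owner_type == "SCOPE"


for owner_type, scope_id in [
    ("COMPANY", "COMPANY"),
    ("SCOPE", "PATH"),
    ("SCOPE", "my-scope-id"),
    ("COMPANY", "PATH"),
    ("SCOPE", "COMPANY"),
]:
    print(owner_type, scope_id, is_valid_scope_combination(owner_type, scope_id))
```

The value my-scope-id is a placeholder for a real scope ID.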
storeInternally: boolean value that defines whether the synced documents should also be stored permanently in Unique’s blob storage. This gives users access to the file (e.g. when clicking references in the chat) without needing access to the document source system from the chat context. It is mandatory if you want to use the PDF preview / highlighting feature. The default value is false.
Delete Content:
URL:
GraphQL Query:
GraphQL Variables:
Author: @Jeremy Isnard
© 2024 Unique AG. All rights reserved. Privacy Policy – Terms of Service