General Flow

Currently, the ingestion service does not accept arrays of files, so each document must be ingested separately, with one call to save the metadata and one to mark the document content as ready for ingestion.


3 successive calls must be made to Unique to trigger the ingestion:

  1. Initial call to the ingestion service to store the document metadata and get back the writeUrl for file storage

  2. Call to the blob storage to store the document and make it available for ingestion

  3. Second call to ingestion service to trigger the document content ingestion
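The three steps above can be sketched as a single function. This is a minimal outline, not the service's client: `upsert` and `upload` are hypothetical helpers supplied by the caller (one posts the ContentUpsert mutation and returns `data.contentUpsert`, the other PUTs bytes to the signed URL). It also assumes that the `fileUrl` of the last step is the `readUrl` returned by the first call.

```python
from typing import Callable, Dict


def ingest_document(
    upsert: Callable[[Dict], Dict],
    upload: Callable[[str, bytes], None],
    variables: Dict,
    content: bytes,
) -> Dict:
    """Run the three-step flow for one document.

    `upsert` and `upload` are hypothetical helpers injected by the caller.
    """
    # 1. Store the metadata; the response carries writeUrl and readUrl.
    created = upsert(variables)
    # 2. Store the document in blob storage through the signed URL.
    upload(created["writeUrl"], content)
    # 3. Repeat the same call with fileUrl added to trigger ingestion.
    #    Assumption: fileUrl is the readUrl returned by step 1.
    return upsert({**variables, "fileUrl": created["readUrl"]})
```

Because the ingestion service handles one document per call, a batch would loop over this function once per file.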

Credentials

It’s recommended to create a service user and use its credentials to make the calls to the ingestion API. To create a service user and obtain a valid JWT access token for it, refer to this documentation: https://unique-ch.atlassian.net/wiki/x/KYAUIw.

The received access token must be passed to the ingestion calls in the headers as a bearer token:

"Authorization: Bearer your_access_token"

Writes to the blob storage do not need credentials as they are made through a signed url.

Ingestion Calls

Find below curl examples with placeholders. All calls to ingestion service are POST calls.

Initial call to ingestion service

curl --location 'http://gateway.<baseUrl>/ingestion/graphql' \
  --header 'Content-Type: <MIME_TYPE>' \
  --header 'Authorization: Bearer <your-access-token>' \
  --data '{"query":"mutation ContentUpsert($input: ContentCreateInput!, $scopeId: String, $sourceOwnerType: String, $sourceName: String, $storeInternally: Boolean) {\n  contentUpsert(\n    input: $input\n    scopeId: $scopeId\n    sourceOwnerType: $sourceOwnerType\n    sourceName: $sourceName\n     storeInternally: $storeInternally\n  ) {\n    id\n    key\n    byteSize\n    mimeType\n    ownerType\n    ownerId\n    writeUrl\n    readUrl\n    createdAt\n    internallyStoredAt\n  }\n}","variables":{"input":{"key":"<arbitrary id. Can be an external id, file name, etc>","mimeType":"<DOCUMENT_MIME_TYPE>","ownerType":"<COMPANY | SCOPE>","byteSize":<DOCUMENT_BYTE_SIZE>},"scopeId":"<SCOPE_ID | PATH | COMPANY>","sourceOwnerType":"<USER | COMPANY | SCOPE>","storeInternally":<boolean>,"sourceName": "<Custom string to identify the source>"}}'
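For callers who prefer a script over curl, the same request can be assembled with Python's standard library. This is a sketch under assumptions: the selection set is shortened to the fields needed later, the function name `build_initial_request` is illustrative, and the `Content-Type` is assumed to be `application/json` because the body is JSON (the curl example leaves it as a placeholder).

```python
import json
import urllib.request

# Mutation from the curl example above, with a shortened selection set.
CONTENT_UPSERT = """mutation ContentUpsert($input: ContentCreateInput!, $scopeId: String, $sourceOwnerType: String, $sourceName: String, $storeInternally: Boolean) {
  contentUpsert(
    input: $input
    scopeId: $scopeId
    sourceOwnerType: $sourceOwnerType
    sourceName: $sourceName
    storeInternally: $storeInternally
  ) {
    id
    writeUrl
    readUrl
  }
}"""


def build_initial_request(base_url: str, token: str, variables: dict) -> urllib.request.Request:
    """Build (but do not send) the first ContentUpsert call."""
    body = json.dumps({"query": CONTENT_UPSERT, "variables": variables}).encode("utf-8")
    return urllib.request.Request(
        f"http://gateway.{base_url}/ingestion/graphql",
        data=body,
        method="POST",
        headers={
            # Assumption: the GraphQL endpoint expects a JSON body.
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` sends the call. Building the body with `json.dumps` avoids the manual escaping that the inline curl payload requires.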

Upload to blob storage

The document is sent to the blob storage signed URL, which is returned in the response body of the initial call to the ingestion service, in data.contentUpsert.writeUrl.

curl --location --request PUT '<writeUrl>' \
--header 'Content-Type: application/pdf' \
--header 'x-ms-blob-type: BlockBlob' \
--data-binary '<binary-file-data>'
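The upload step can be sketched in Python as well. The helper names `extract_write_url` and `build_upload_request` are illustrative. The response shape assumed here is only the `data.contentUpsert.writeUrl` path described above. No Authorization header is set, since the signed URL carries the access rights.

```python
import json
import urllib.request


def extract_write_url(response_body: str) -> str:
    """Read data.contentUpsert.writeUrl from the first call's JSON response."""
    return json.loads(response_body)["data"]["contentUpsert"]["writeUrl"]


def build_upload_request(write_url: str, content: bytes, mime_type: str) -> urllib.request.Request:
    """Build (but do not send) the PUT that stores the document bytes."""
    return urllib.request.Request(
        write_url,
        data=content,
        method="PUT",
        headers={
            "Content-Type": mime_type,
            "x-ms-blob-type": "BlockBlob",  # header taken from the curl example
        },
    )
```

The raw bytes are passed through unchanged, which is the behaviour a binary document such as a PDF needs.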

Second call to ingestion service

The second call to mark the file for ingestion must be the exact same call with the same parameters, with the extra field fileUrl passed as such:

curl --location 'http://gateway.<baseUrl>/ingestion/graphql' \
  --header 'Content-Type: <MIME_TYPE>' \
  --header 'Authorization: Bearer <your-access-token>' \
  --data '{"query":"mutation ContentUpsert($input: ContentCreateInput!, $fileUrl: String, $scopeId: String, $sourceOwnerType: String, $sourceName: String,  $storeInternally: Boolean) {\n  contentUpsert(\n    input: $input\n    fileUrl: $fileUrl\n    scopeId: $scopeId\n    sourceOwnerType: $sourceOwnerType\n    sourceName: $sourceName\n     storeInternally: $storeInternally\n  ) {\n    id\n    key\n    byteSize\n    mimeType\n    ownerType\n    ownerId\n    createdAt\n    internallyStoredAt\n  }\n}","variables":{"input":{"key":"<arbitrary id. Can be an external id, file name, etc>","mimeType":"<DOCUMENT_MIME_TYPE>","ownerType":"<COMPANY | SCOPE>","byteSize":<DOCUMENT_BYTE_SIZE>},"scopeId":"<SCOPE_ID | PATH | COMPANY>","sourceOwnerType":"<USER | COMPANY | SCOPE>","fileUrl":"<BLOB_READ_URL>","storeInternally":<boolean>,"sourceName": "<Custom string to identify the source>"}}'

Passing fileUrl to the second call tells the service that the file is available for reads, and thus for ingestion. 
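Since the two ingestion calls differ only in `fileUrl`, one way to keep them identical otherwise is to generate both bodies from the same variables. The following is a rough sketch: `content_upsert_body` is a hypothetical helper, and the argument list and selected fields are copied from the two curl examples.

```python
import json
from typing import Optional

# (name, GraphQL type) pairs taken from the mutation signature above.
BASE_ARGS = [
    ("input", "ContentCreateInput!"),
    ("scopeId", "String"),
    ("sourceOwnerType", "String"),
    ("sourceName", "String"),
    ("storeInternally", "Boolean"),
]


def content_upsert_body(variables: dict, file_url: Optional[str] = None) -> str:
    """Return the JSON body for the first call, or the second when file_url is set."""
    args = list(BASE_ARGS)
    fields = ["id", "key", "byteSize", "mimeType", "ownerType", "ownerId",
              "createdAt", "internallyStoredAt"]
    variables = dict(variables)  # copy so the caller's dict is not mutated
    if file_url is None:
        # Only the first call asks for the storage URLs.
        fields += ["writeUrl", "readUrl"]
    else:
        args.insert(1, ("fileUrl", "String"))
        variables["fileUrl"] = file_url
    declared = ", ".join(f"${name}: {gql_type}" for name, gql_type in args)
    passed = "\n".join(f"    {name}: ${name}" for name, _ in args)
    selected = "\n".join(f"    {field}" for field in fields)
    query = (
        f"mutation ContentUpsert({declared}) {{\n"
        f"  contentUpsert(\n{passed}\n  ) {{\n{selected}\n  }}\n}}"
    )
    return json.dumps({"query": query, "variables": variables})
```

Calling the helper once without `file_url` and once with it yields the two request bodies, so the shared parameters cannot drift apart between the calls.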

For testing purposes, sourceOwnerType, scopeId and ownerType can all be set to COMPANY, which automatically gives all chat users access. Access scoping can be discussed later based on your needs.

Scope management (retrieval of existing scopes, creation of new scopes, etc) through the API is documented here.

Call to Ingestion service: parameters

Any other set of values will fail the ingestion calls.