Working with Content Admin Endpoints

 

Overusing these endpoints can overload your system.

Limit the number of documents you process at once and proceed in small, steady steps.

Ingestion from other sources may be delayed if these endpoints are overused.

These are maintenance endpoints for occasional use.
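
One way to take "small, steady steps" is to split the work into small batches and pause between them. The following is only a minimal sketch under stated assumptions: `in_batches`, `run_in_steps`, the batch size, and the pause length are hypothetical names and values, and `action` stands in for whatever call you make per batch (for example, one mark-and-execute round for a narrow key filter).

```python
import time
from typing import Callable, Iterator, Sequence


def in_batches(items: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of `items` with at most `size` elements each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def run_in_steps(
    items: Sequence[str],
    size: int,
    pause_seconds: float,
    action: Callable[[Sequence[str]], None],
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Run `action` once per batch, pausing between batches; return the batch count."""
    batches = 0
    for batch in in_batches(items, size):
        if batches > 0:
            # Give regular ingestion room to proceed before the next batch.
            sleep(pause_seconds)
        action(batch)
        batches += 1
    return batches
```

The `sleep` parameter is injectable only so the pacing can be exercised without actually waiting.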


Some content might not be reflected correctly in our application. The endpoints described here let you run corrective operations on such content.


Each of these operations follows a two-phase protocol:

  1. Mark the content.

  2. Execute the operation on the marked content.

You will need admin access rights for the following queries.


Execute the queries and mutations on this page against the Ingestion GraphQL endpoint.

You need an access token for this.

Get an Auth Token using this guideline: How to get a Token for our APIs
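
As an illustration of how a query might be sent once you have a token, here is a minimal sketch using only the Python standard library. It rests on assumptions: the URL is a placeholder (this page does not state the endpoint address), the token is assumed to be passed as a standard Bearer token, and your deployment may require additional headers. The helper names are hypothetical.

```python
import json
import urllib.request

# Placeholder only; substitute the Ingestion GraphQL URL of your deployment.
INGESTION_GRAPHQL_URL = "https://example.invalid/ingestion/graphql"


def build_graphql_request(
    query: str, token: str, url: str = INGESTION_GRAPHQL_URL
) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying a GraphQL document."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Content-Type": "application/json",
            # Assumption: the access token is sent as a Bearer token.
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def run_graphql(query: str, token: str, url: str = INGESTION_GRAPHQL_URL) -> dict:
    """Send the request and return the decoded JSON response."""
    request = build_graphql_request(query, token, url)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping request construction separate from sending makes it possible to inspect the payload before anything reaches the endpoint.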

Checking the content

query Content { content(where: { key: { contains: "MediGroup" } }) { metadata key id } }

Re-Indexing vectors

Synchronizes the vectors stored in Postgres with the VectorDB.

This can only be done for files in the state FINISHED.

mutation MarkAllForReindexing { markAllForReindexing(where: { key: { contains: "MediGroup" } }) }
mutation ReIndexVectorDB { reIndexVectorDB(waitAfterRounds: 40, waitInMs: 250) }
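
The two mutations above correspond to the two phases: mark first, then execute. A minimal sketch of running them in order could look like the following. The `execute` callable is a hypothetical stand-in for whatever sends a GraphQL document to the Ingestion endpoint and returns the decoded JSON response; the error handling relies only on the `errors` key that GraphQL responses conventionally carry.

```python
from typing import Callable

# Phase 1: mark the content (mutation text taken from this page).
MARK_FOR_REINDEXING = """
mutation MarkAllForReindexing {
  markAllForReindexing(where: { key: { contains: "MediGroup" } })
}
"""

# Phase 2: execute the re-indexing on the marked content.
REINDEX_VECTOR_DB = """
mutation ReIndexVectorDB {
  reIndexVectorDB(waitAfterRounds: 40, waitInMs: 250)
}
"""


def reindex_in_two_phases(execute: Callable[[str], dict]) -> list[dict]:
    """Run the mark phase, then the execute phase, and return both responses."""
    results = []
    for mutation in (MARK_FOR_REINDEXING, REINDEX_VECTOR_DB):
        response = execute(mutation)
        if response.get("errors"):
            # Abort on the first error so phase 2 never runs after a failed mark.
            raise RuntimeError(f"GraphQL error: {response['errors']}")
        results.append(response)
    return results
```

Adjust the `where` filter in the mark mutation to the content you actually want to re-index before running it.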

Re-Embed the text of chunks

In some cases, for example after changing the embedding model, new embeddings must be created from the text of the content's chunks. This is done with the following two API requests:


  1. Mark the content as RE_EMBEDDING.

  2. Start the re-embedding.

Rebuilding Metadata

This re-applies the default metadata to the vectors while keeping the current metadata fields intact. It is needed as soon as you want to use metadata filtering, for example for search. Eventually, all data needs to be migrated this way.

This can only be done for files in the state FINISHED.

Check Integrity

This compares the vectors of the content in the Postgres database with those in the VectorDB. If they are out of sync, it triggers re-indexing to bring the vectors back into sync, effectively repairing the state.
This can only be done for files in the state FINISHED.
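
Conceptually, the comparison can be thought of as a set difference between the vector identifiers on both sides. The sketch below only illustrates that idea; it is not the endpoint's actual implementation, and the function names and the use of plain string IDs are assumptions.

```python
def find_vector_drift(
    postgres_ids: set[str], vectordb_ids: set[str]
) -> dict[str, set[str]]:
    """Return the vector IDs that exist on only one side."""
    return {
        # Present in Postgres but not yet in the VectorDB.
        "missing_in_vectordb": postgres_ids - vectordb_ids,
        # Present in the VectorDB without a Postgres counterpart.
        "orphaned_in_vectordb": vectordb_ids - postgres_ids,
    }


def needs_reindexing(drift: dict[str, set[str]]) -> bool:
    """True if either side holds vectors the other side lacks."""
    return any(drift.values())
```

In this picture, re-indexing would be triggered only when `needs_reindexing` reports a difference.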

Re-Ingest

Sometimes the current ingestion result has an issue and the whole file needs to be ingested again, e.g. when chunks are too long because tables were not split up properly.
In that case, re-ingest the file.

Limitations
You can only re-ingest files that are stored on our blobs; this is handled automatically, though.
This can only be done for files in the state FINISHED or FAILED.
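
The state restrictions stated on this page can be summarized in a small lookup, which may help when scripting a pre-check before calling an endpoint. This is a sketch: the operation labels used as dictionary keys are hypothetical names chosen for illustration, and re-embedding is omitted because no state restriction is stated for it here.

```python
# States in which each maintenance operation may run, as listed on this page.
# The operation labels (dictionary keys) are hypothetical names for this sketch.
ALLOWED_STATES = {
    "reindex": {"FINISHED"},
    "rebuild_metadata": {"FINISHED"},
    "check_integrity": {"FINISHED"},
    "reingest": {"FINISHED", "FAILED"},
}


def can_run(operation: str, content_state: str) -> bool:
    """Return True if `operation` is allowed for content in `content_state`."""
    if operation not in ALLOWED_STATES:
        raise ValueError(f"Unknown operation: {operation}")
    return content_state in ALLOWED_STATES[operation]
```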

This uses the same producer-worker principle as normal ingestion.
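
For readers unfamiliar with the producer-worker pattern, the following generic sketch shows the idea: a producer places jobs on a queue and a pool of workers pulls them off and processes them. It is not the ingestion service's actual code; `run_producer_worker`, the `process` callable, and the worker count are hypothetical.

```python
import queue
import threading
from typing import Callable, Iterable


def run_producer_worker(
    file_keys: Iterable[str],
    process: Callable[[str], object],
    worker_count: int = 2,
) -> list[tuple[str, object]]:
    """Queue every key (producer) and process them with a pool of worker threads."""
    jobs: queue.Queue = queue.Queue()
    results: list[tuple[str, object]] = []
    results_lock = threading.Lock()

    def worker() -> None:
        while True:
            key = jobs.get()
            if key is None:
                # Sentinel: no more work for this worker.
                return
            try:
                outcome = process(key)
            except Exception as error:
                # Record the failure instead of killing the worker thread.
                outcome = error
            with results_lock:
                results.append((key, outcome))

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for key in file_keys:
        jobs.put(key)  # Producer side: enqueue one job per file.
    for _ in threads:
        jobs.put(None)  # One sentinel per worker so every thread terminates.
    for thread in threads:
        thread.join()
    return results
```

Results arrive in completion order, not submission order, so sort them if you need a stable listing.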

Force a certain state for content

Content may end up in the wrong state for various reasons. In that case, it can be forced into a specific state.

 


Author

@Andreas Hauri

 

© 2024 Unique AG. All rights reserved.