
Overview

The SharePoint Connector connects Microsoft SharePoint Online to Unique FinanceGPT via Microsoft Power Automate.

The Unique SharePoint connector consists of a series of Power Automate flows bundled in a solution that Unique provides as a ZIP file.

Clients need to import and set up the solution in their own environment’s Power Automate.

As the solution uses environment variables, only one solution can be deployed per environment.

Files and site pages to sync are identified by an ad hoc boolean column (FinanceGPT in this documentation) whose name can be customised.

The solution runs globally and can sync all SharePoint sites and libraries at once.

Requirements

  • A Power Automate (PA) environment with a Dataverse database

  • A PA user that has the following permissions on SharePoint:

    • Read list and library names, as well as the names of the columns for all synchronised sites

    • Read files and metadata for all synchronised sites and libraries

    • Read and update the Configuration List items

  • Power Automate required connectors:

  • An Azure key vault with PA user read permission

  • Ability to call Unique’s endpoints from the PA environment

Current Export of the Solution

v. 1.0.0.8 (includes Delete Flows)

Architecture

The SharePoint connector uses a scheduled sync approach, which makes the SharePoint integration more scalable and more stable than an event-based approach, given that SP events are currently unreliable and inconsistent (as of Aug 2024).

The benefits of following the scheduled approach are:

  • Single run to synchronise multiple libraries without the need to duplicate the PA flows.

  • Easy and quick customisation by the customer’s SP admins to fit tailored needs.

  • Direct overview of the flow runs and errors for the SP admins, which makes debugging easier for the same team that runs and maintains SP.

The solution is a collection of flows that scan the files to ingest, detect changes, and trigger ingestion, folder changes, or deletion where needed.

The synchronisation flows are called sequentially, and to ensure consistency the synchronisation process as a whole cannot run in parallel with itself.

This means that no matter how high the synchronisation frequency is set, no new synchronisation will be triggered until the ongoing synchronisation is done sending all files to be ingested to Unique.
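The behaviour described above can be illustrated with a small sketch. It is not how the Power Automate flows are implemented (that mechanism is not described here); the class name SyncRunner and its trigger method are hypothetical and only model the rule that a trigger arriving during an ongoing synchronisation does not start a second one.

```python
import threading


class SyncRunner:
    """Runs one synchronisation at a time; overlapping triggers are skipped."""

    def __init__(self, sync_fn):
        self._sync_fn = sync_fn          # hypothetical callable doing the sync
        self._lock = threading.Lock()    # held for the duration of a run

    def trigger(self) -> bool:
        """Return True if a sync ran, False if one was already in progress."""
        if not self._lock.acquire(blocking=False):
            return False                 # a run is ongoing: skip this trigger
        try:
            self._sync_fn()
            return True
        finally:
            self._lock.release()
```

With this model, raising the schedule frequency only increases the number of skipped triggers; it never produces two overlapping runs.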

In the following sections, the logic of the SharePoint connector is explained in an overview diagram and the individual flows are explained in their dedicated section.

Overview Diagram

Scheduler flow

The “Scheduler” flow in the solution is responsible for triggering the sync of the files from SharePoint to Unique’s knowledge base.

The flow triggers the synchronisation process by calling the Sharepoint Files Scan flow.

Sharepoint Files Scan flow

The flow holds the main logic and loops through the sites and their libraries. In the process it calls the child flows IDP Get Access Token, to get the Unique token, and Ingest Content.

A list of SharePoint Sites that contain content to synchronise in some or all of their libraries must be provided as a configuration list. The main logic in the flow uses this input and executes a series of nested loops.

Within the listed sites, the flow identifies the libraries that have the FinanceGPT column set, and fetches the set of files that have been marked for ingestion.

We filter for published documents and Site Pages using the OData__ModerationStatus property on the file. Field values are as follows:

0 = Approved
1 = Rejected
2 = Pending
3 = Draft
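As a minimal sketch, the status values above can be modelled as an enumeration and used to filter items. The function name published_only and the dictionary shape of the items are assumptions made for illustration, as is the reading that "published" corresponds to Approved (0).

```python
from enum import IntEnum


class ModerationStatus(IntEnum):
    """Values of the OData__ModerationStatus field as listed above."""
    APPROVED = 0
    REJECTED = 1
    PENDING = 2
    DRAFT = 3


def published_only(items):
    """Keep items whose moderation status is Approved.

    Assumption: each item is a dict carrying the raw integer field value,
    and "published" means Approved.
    """
    return [
        item for item in items
        if item.get("OData__ModerationStatus") == ModerationStatus.APPROVED
    ]
```

Items without the field are dropped by this sketch; whether the flows treat a missing status differently is not stated in this documentation.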

A diff is calculated by calling Unique’s file-diff endpoint on the ingestion service:

  • Deleted files → if content exists in the knowledge base, the content gets deleted from the knowledge base by Unique’s backend.

  • Moved files → path will be adjusted by Unique’s backend to maintain an accurate scope.

  • New files / updated content → the list of created and updated files is sent back to the flow, and is passed to the Ingest Content flow.
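The three outcomes above can be illustrated with a local sketch. The real diff is computed by Unique's file-diff endpoint and its request and response formats are not documented here, so the FileState type, the file_diff function, and the result keys are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileState:
    file_id: str
    path: str
    modified: str  # last-modified marker, compared as an opaque string


def file_diff(known, scanned):
    """Classify files by comparing the knowledge base state with a new scan."""
    known_by_id = {f.file_id: f for f in known}
    scanned_by_id = {f.file_id: f for f in scanned}

    # Known but no longer in the scan: to be deleted from the knowledge base.
    deleted = sorted(set(known_by_id) - set(scanned_by_id))

    moved, to_ingest = [], []
    for file_id, current in scanned_by_id.items():
        previous = known_by_id.get(file_id)
        if previous is None or previous.modified != current.modified:
            to_ingest.append(file_id)   # new file or updated content
        elif previous.path != current.path:
            moved.append(file_id)       # same content, new location
    return {
        "deleted": deleted,
        "moved": sorted(moved),
        "to_ingest": sorted(to_ingest),
    }
```

In this sketch a file that was both moved and modified is treated as updated content. Passing an empty scan marks every known file as deleted.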

Ingest Content flow

The flow is responsible for handling the calls to the Unique ingestion API and uploading the file content to the blob storage for ingestion. The IDP Get Access Token flow is also called from this flow to get a token for the API calls.

Calls to the flow are library-based: the flow loops through the provided list of file ids for the new and modified files within a single library and sends them for ingestion.

IDP Get Access Token flow

The flow is responsible for getting an access token that can be used to call the Unique APIs. Unique sets up a service user for each client that has the necessary permissions to call the ingestion APIs.

The service user’s client credentials will be provided to the clients and are used to get a token in this flow.
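For orientation, a generic OAuth 2.0 client-credentials token request can be sketched as below. This only builds the headers and form body and sends nothing. Whether the flow passes the credentials via HTTP Basic authentication or in the request body, and which scope value it requests, is not stated in this documentation, so the function build_token_request and its parameters are assumptions.

```python
import base64
from urllib.parse import urlencode


def build_token_request(client_id: str, client_secret: str, scope: str = ""):
    """Build headers and form body for a client-credentials token request.

    Assumption: credentials are sent with HTTP Basic authentication, one of
    the options allowed by the OAuth 2.0 specification.
    """
    credentials = f"{client_id}:{client_secret}".encode("utf-8")
    headers = {
        "Authorization": "Basic " + base64.b64encode(credentials).decode("ascii"),
        "Content-Type": "application/x-www-form-urlencoded",
    }
    form = {"grant_type": "client_credentials"}
    if scope:
        form["scope"] = scope  # value to be supplied by the caller
    return headers, urlencode(form)
```

The resulting headers and body would be posted to the token issuer endpoint configured for the solution.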

username + password flow (deprecated)

If you are not using a key vault you can use the IDP Get Access Token - username + password flow. You will need to configure the related environment variables.

You will need to modify the child flow called in the Get Access Token actions in both Sharepoint Files Scan and Ingest Content flows to select the proper child flow to get the token.

Delete Flows

Two flows are responsible for deleting all files belonging to a site and to its libraries.

Delete Library

Deletes the library by sending an empty array to file-diff. Ingested files are deleted in Unique by Unique’s backend.

Delete Site

Loops through all site libraries and calls Delete Library for each one. The flow is called when a site is marked for deletion in the configuration list.

Setup

Add custom column in SharePoint

A custom boolean column needs to be added to a SharePoint library to control the ingestion of the files.

To set the custom column:

  1. In SharePoint, add a custom column by clicking the “+ Add column” button. Select the column type “Yes/No”.

  2. Name the custom column. This will be the name you need to use when setting up the Power Automate Flow variables. Also set a default value for new files.

  3. After saving and creating the column, you can optionally format the column to make the selected values more obvious to the users in SharePoint. Select the column’s dropdown > Column settings > Format this column. There select “Format yes and no” (you can also choose the colors by editing the styles).

Setting up the SharePoint Connector’s Power Automate flows

The setup process for the Unique SharePoint Connector consists of the following steps:

  1. Import the Power Automate Solution provided by Unique

  2. Configure the environment variables during or after the import of the solution

The steps will be performed in Power Automate, which you can reach by navigating to https://make.powerautomate.com.

Import the Power Automate Solution

Unique provides the Unique SharePoint connector to Customers as an exported Power Automate solution, which is a ZIP file. Along with the ZIP file, the Customers receive client credentials and all necessary values for configuring the environment variables.

In Power Automate navigate to the Solutions tab on the left side. You should see an overview of all existing solutions. On the top, click the Import solution button and you will be prompted to provide a file. Upload and import the ZIP file that Unique provided containing the Power Automate solution for the Unique SharePoint connector.

Set Up the Configuration list

The SharePoint sites to synchronise are stored in a SharePoint list that needs to be accessible from the Power Automate flow.

The site hosting the list does not have to be synchronised but it must be accessible by the user running the flow.

The list must contain the following columns:

  • Title: the title of the site. This value is not evaluated by the flows; it is only used to make the sites easier to manage.

  • Url: the full url of the site including the protocol and the SP domain. Ex: https://uniqueapp.sharepoint.com/sites/SiteToSync

  • Scope & Owner Type: these two variables must be set up in accordance with each other. Possible values are:

    • Owner Type : SCOPE or COMPANY

    • Scope: COMPANY or PATH or the scope Id applicable to all synced files

    • Possible set of values and behaviour:

      • If both variables are set to COMPANY, the files will be available to the whole company.

      • If Scope is set to PATH, Owner Type must be set to SCOPE: each folder generates its own permission scope, which applies to the files directly in that folder but not to its subfolders. Subfolders get their own scope, as the permission set flattens the folder structure.

      • If Scope is set to a specific scope id applicable to all synchronised files, Owner Type must be set to SCOPE

⚠️ Any other set of scope values will fail the ingestion calls ⚠️

  • Delete: marks the site for deletion.
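The allowed combinations of Owner Type and Scope listed above can be expressed as a small validation sketch, useful for checking configuration list rows before a sync. The function name validate_scope_config and its return labels are hypothetical; the sketch also assumes that a scope id is any non-empty value other than the reserved words COMPANY and PATH.

```python
def validate_scope_config(owner_type: str, scope: str) -> str:
    """Return 'COMPANY', 'PATH' or 'SCOPE_ID' for a valid combination.

    Raises ValueError for any other combination, mirroring the rule that
    other sets of values fail the ingestion calls.
    """
    owner_type, scope = owner_type.strip(), scope.strip()
    if owner_type == "COMPANY" and scope == "COMPANY":
        return "COMPANY"    # files available to the whole company
    if owner_type == "SCOPE" and scope == "PATH":
        return "PATH"       # one scope per folder
    if owner_type == "SCOPE" and scope and scope != "COMPANY":
        return "SCOPE_ID"   # single scope id applied to all synced files
    raise ValueError(
        f"invalid combination: Owner Type={owner_type!r}, Scope={scope!r}"
    )
```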

Configure environment variables

There are two logical sets of environment variables. Like the rest of the objects within the solution, they use the global prefix ufgpt:

  • sp_xxx related to SharePoint setup

  • uq_xxx related to Unique setup

The SharePoint variables must be configured as such:

  • sp_domain : the root SharePoint domain

  • Configuration list variables: access to the configuration list is managed by 2 environment variables:

    • sp_list_hosting_site : the site where the configuration list is hosted

    • sp_sites_list_name : the name of the list

      • the urls of the sites listed MUST be stored in a column named exactly Url

      • you can pass the list display name in sp_sites_list_name, but it is recommended to pass the list id to prevent disruption should the list name be inadvertently modified. Go to the list then Settings > List settings. The list id can be found in the url as List={the-list-id}.

  • sp_sync_column_name : the name of the custom column (FinanceGPT in this documentation) that controls file synchronisation
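Reading the list id out of the List settings url can be done by hand, or with a short helper such as the sketch below. The function name list_id_from_settings_url is hypothetical, and the sketch assumes that the id appears in a List query parameter, possibly URL-encoded, and that surrounding braces are not part of the id.

```python
from urllib.parse import parse_qs, urlparse


def list_id_from_settings_url(url: str) -> str:
    """Extract the value of the List query parameter from a settings url."""
    query = parse_qs(urlparse(url).query)   # also decodes %7B / %7D
    values = query.get("List")
    if not values:
        raise ValueError("no List parameter found in url")
    # Assumption: braces around the id are delimiters, not part of the id.
    return values[0].strip("{}")
```

If your environment expects the id with braces, keep them instead of stripping them.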

The Unique variables must be configured as such:

  • uq_file_sync_endpoint : the url of the file-diff endpoint on ingestion service

  • uq_idp_project_id: the Zitadel project id

  • uq_idp_token_endpoint : the Unique token issuer endpoint

  • uq_ingestion_graphql : /graphql endpoint on ingestion service

  • uq_store_internally: boolean value that defines whether or not the synced documents should also be stored internally at Unique. This allows users access to the file (e.g.: when clicking references in the chat) without the need to have access to the SharePoint library. Also needed to use the PDF preview / highlighting feature. Default value is false.

  • If you are using the username + password credential flow (deprecated), set up the following 2 variables:

    • uq_idp_client_id : Zitadel client id

    • uq_idp_client_secret : Zitadel client secret

Power Automate does not handle unset environment variables well, so if these variables are not used, they should be left at the default empty string value "".

Scopes Considerations

Scope types

Three types of scopes are available:

  • PATH : file access scopes are attributed by SP folder. The folder hierarchy is flat, meaning that access to a folder does not grant access to its subfolders or to its parent folders. Individual scopes must be attributed manually in the backend.

  • COMPANY : all ingested files are available at the company level

  • Scope Id: all ingested files are attributed to a single specific scope identified by its scope id (can be different from the company scope id).

SP Service User Scope Access

The connector is in charge of creating the needed scopes when Scope is set to PATH. The service user is automatically granted READ/WRITE permission on all the scopes it creates, and only on those.
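The flat folder hierarchy of PATH scopes can be pictured with the following sketch, which groups file paths by their immediate parent folder. The function name group_by_folder_scope is hypothetical and the actual scope names created by the connector are not documented here; the folder path is used as a stand-in for the scope.

```python
from collections import defaultdict
from posixpath import dirname


def group_by_folder_scope(file_paths):
    """Map each folder to the files directly inside it.

    Files in a subfolder belong to the subfolder's own entry, never to the
    parent folder's entry, which mirrors the flat PATH scope behaviour.
    """
    scopes = defaultdict(list)
    for path in file_paths:
        scopes[dirname(path)].append(path)
    return dict(scopes)
```

For example, Docs/a.pdf and Docs/Sub/b.pdf end up under two separate entries, Docs and Docs/Sub, so access to one does not imply access to the other.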

Known Limitations & Issues

Ingestion of linked content in SharePoint Pages

What can be ingested from SharePoint pages has limitations.

What works:

  • All text content

What does not work:

  • Linked document libraries

  • Linked content in general

The limitation stems from the fact that we fetch the content of the SharePoint page from SharePoint’s API via a Power Automate flow, and what we receive there is the actual content on the page. Linked content, like an embedded Document Library widget, cannot be ingested because it is just a link / embedding / iFrame that shows content on that page but is loaded from elsewhere (it is not present in the content we fetch from the API).

Other Known Limitations

Note that most known limitations can be overcome by adapting the flows to your needs.

The Sharepoint Inline - Power Automate connector is designed to serve as a general-purpose connector for the Unique chat ingestion.

It mostly serves as a basis for further configurations that match your specific needs, and we will be happy to assist you with those.

Splitting the flows

In its current form, failing to sync a file will show the whole scheduled synchronisation as failed. Debugging can be cumbersome as you have to go through each iteration to eventually find the culprit. This also means that the whole sync might be tried again until cancellation or resolution.

One way to mitigate this side effect would be to split the ingestion flow so that the calls to Unique are decoupled from the SharePoint calls, with the actual file ingestion flow triggered at file level. This would create an unwanted side effect though: as Zitadel is not able to return the currently valid token, it creates a new token for each call. This means that the file ingestion flow would rapidly hit the token issuance limit.

This could be mitigated by scheduling a token issuance and storing the token in the key vault, and having the single file ingestion flow fetch the token from the key vault rather than from the token endpoint.

For this, the service principal connecting to the key vault must have write access on the key vault.
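The idea of reusing one token instead of requesting a new one per file can be sketched as follows. This in-memory cache only stands in for the key vault described above; the class CachedToken, the fetch_token callable, and the safety margin are assumptions made for illustration.

```python
import time


class CachedToken:
    """Reuses a token until shortly before it expires."""

    def __init__(self, fetch_token, clock=time.monotonic, safety_margin=60.0):
        self._fetch = fetch_token      # hypothetical: returns (token, lifetime_seconds)
        self._clock = clock
        self._margin = safety_margin   # refresh this many seconds before expiry
        self._token = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._margin:
            token, lifetime = self._fetch()   # only call the issuer when needed
            self._token = token
            self._expires_at = now + lifetime
        return self._token
```

Every file-level ingestion call would then read the token through get, so the issuer is contacted once per token lifetime rather than once per file.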

Scopes + Deletion at library level

The connector is designed to work at the site level, as defined in the configuration list.

In its basic state it is unable to pass scopes at library or file level, and unable to automatically delete individual libraries.

Nonetheless, the flow Delete Library can be called manually if needed.

Also, the configuration list and the Scan Files to Sync flow can be adapted to work at the library level.
In the same manner, custom metadata can be added at the file level and the flows can be adjusted to pass scopes at file level.
