Frequently Asked Questions (FaQ)

Prompts

Can one configure custom prompts?	Yes, all Prompts can be completely customised
Is there a platform available for conducting automatic tests on prompts, including comparing results etc.?	Yes, we have a benchmarking that automatically tests hundreds of prompts against the data and models that are in the system.
Is it possible to implement version control for prompts, such as maintaining a development version, publishing a beta version, and continuing to use a previous version?	Yes, it is possible to apply version control to all prompts within the system, allowing for the independent experimentation of new prompts without affecting those that are already operational. This facilitates the development of new prompts and Assistants.
Can prompts be shared by user-groups?	Yes, prompts can be defined by user-groups.
Can one get feedbacks for the prompts?	Yes, there is a feedback mechanism for each answer so the users can give feedback on the quality of the prompts.
Can one call configured prompts from an API?	Yes, this is possible.
Is it possible to configure unique disclaimers for each prompt?	Yes, disclaimers for each prompt can be configured by the system's Admin, who has the ability to set disclaimers per User-Group.
How are updates carried out?	Currently, updates are done via API, but a User Interface is expected to be launched in April.
Is chat history, encompassing questions and answers, stored somewhere, or are only the details of the current chat session retained? If stored, where is this information kept?	The chat history is stored in two places: Audit Logs In the history of the user, saved in a database accessible only by the client and not Unique Prompts will not be stored on Microsoft Azure as we opted out for abuse monitoring, preventing Microsoft from saving the prompts.

Large Language Models (LLMs)

What are the LLMs that can be used?	Any provisioned models can be connected to the system via config. So far all, OpenAI Models have been tested: GPT 3.5 - GPT 3.5-Turbo GPT 4-Turbo - GPT 4-Turbo GPT 4-Vision (in various minor versions) Other LLM’s: Mistral AI - Zephyr AI
Can models be customised / bring our own models?	Yes, we even have customers that are doing this.
Is there a platform available for conducting automated tests and comparing results for custom models?	Yes, through benchmarking, it is possible to perform comparisons. There is a documentation for this process: Benchmarking
How can one train, test, and deploy a model for use with Unique’s solution?	So far, we have not directly trained a model ourselves; instead, our customers have undertaken this task. However, our Data-Science team has provided support and guidance to them throughout the process.
Does one need to use Azure AI Studio?	There's no need to restrict yourself to Azure AI Studio exclusively. As long as the model can be provisioned, we are capable of integrating it.
Is it possible to implement version control for models, such as maintaining a development version, publishing a beta version, and continuing to use a previous version?	Yes, within the system, each prompt allows for the selection of the model and its version at will. We practice this on a regular basis, especially with the release of new minor or preview versions from Azure OpenAI.
Can models be shared by user-groups?	Yes, it can be scoped by user-groups.
Can models be restricted by user-groups?	Yes, it can be restricted by user-groups.
Can tokens consumption be followed by model or by user-groups?	This feature is currently under development and not available yet. However, an Analytics Framework with downloadable CSV-Reports is already in place and covers these points: User Engagement Assistant Usage Most referenced files Read more about this here: Analytics A report incl. consumption by assistant/model is planned for Q2 2024.
Are there several types of prices depending on the models used ?	Our pricing model remains fixed, however, the costs of the underlying models set by Microsoft are subject to change and are transparently communicated back to you. Prices may fluctuate. We offer guidance on which prompts require specific models.
How is visibility kept on the costs related to the API usage?	We report the costs generated on the Subscription on a monthly basis. In the early days of the project, we can negotiate a faster rhythm.
Is it possible to set token limits for each model or user group, including actions like sending alerts or shutting down the API?	This feature is not available yet and currently under development, planned for Q3 2024.
Is it possible to grant standard access to ChatGPT-3.5, replacing the direct access currently provided to certain staff members?	Yes, this is even included in the base configuration of Unique. You can even give access to ChatGPT-4.
Is your solution offered on the MS Marketplace?	Unique is currently not offered in the MS Marketplace.
What test have been done to select the appropriate LLM models?	We conducted benchmarks using our documents, and our clients performed similar tests. This process helps us select the most suitable models for each prompt or use case. While we have evaluated other models, we found that they do not yet match the performance of GPT-4, especially in situations requiring RAG.

Services

How do guardrails work?	The language model operates within a set structure, using only the data provided by the organisation to ensure its responses comply with specific standards and do not include external information not given by the company. Furthermore, by including citations in each reply, the origin of the information used in the responses can be traced. Additionally, extra safeguards can be implemented into the chat flow as needed, particularly if the user input encompasses forbidden or harmful material.
What tooling is used for pseudonymisation?	A local model is employed, executed directly within the cluster and independent of OpenAI, to recognise names and entities. These identified elements are subsequently substituted with anonymised tokens, which are later restored to their original form.
How is Document ingestion maintained?	We maintain multiple default ingestion pipelines for the different types of files. See the documentation here: Ingestion Customers can build their own in the context of our Co-Development Agreement if needed. We are improving continuously to get the best possible results in for the RAG.
Are the sources always shared with the users?	Yes, Unique adds references to each answer to indicate to the user where the information is coming from. This happens through the RAG process.
Can automated workflows be executed?	Yes, we already have customers that use our API to execute workflows autonomously without the intervention of a user.
How is continuous feedback loop orchestrated?	As an admin, you can export the user feedback as CSV on demand. There will be monthly meetings with the project lead to analyse the feedback and to derive improvements options.
Can your system integrate with various Identity Providers (IDPs), and does it support seamless user provisioning and login with credentials from external systems?	The IDP can be integrated in our system. Your logins can be used, and users are automatically provisioned. We support the following list: https://zitadel.com/docs/guides/integrate/identity-providers
What gets anonymised and how does it work?	The anonymisation service processes the prompt intended for the OpenAI Endpoint by performing Named-Entity Recognition. It replaces identified entities with placeholders before sending them to the model. Once the model responds, the anonymised placeholders are replaced with the original identifying data. The user will not receive the anonymised entities in the response. Additionally, the data is stored in subscription databases, which are exclusively accessible by the client.
How flexible can new services be developed and tested?	This can be done independently developed, and tested. Each developer can run an independent version of FinanceGPT on their local machine to develop without interfering with others.
How would customised workflows be prepared and released?	If you develop your own assistants that are not coming as part of the default, these assistants need to be deployed. The deployment can be orchestrated by you or us. Below you find a drawing explaining the process.

Architecture, RAG, Vectors and more

What technologies are used in the RAG pattern?	For Vectorisation, the embedding model ADA from Azure OpenAI’s is used. To learn more about our Architecture, see here: Architecture We use Qdrant to save the vector and the metadata (self-hosted). For saving text, we use Postgres (Azure service).
Why is vector DB Qdrant being used?	Qdrant performs very well on metadata filtering and similarity search compared to others. This is also needed for ACL
Do you duplicate data and store a local copy of indexed documents?	Yes, we store the data locally.
What are the existing connectors?	Sharepoint (online/on prem) Confluence (online/on prem) Website-Crawlers
Can a local (on premise) vector database be used?	If Unique is deployed on prem, yes. But in phase 1, it’s a workload we deploy on Azure fully encrypted.
Do connectors support images, video and sound indexation?	Currently not, though we are exploring options with GPT-4 Vision.
Can hook be added in the dataflow to check data before indexing?	Planned for Q3 2024 in product roadmap
Can hook be added to enrich metadata during indexing?	Planned for Q3 2024 in product roadmap
How is the lifecycle of indexed documents managed?	Same documents are replaced with the new version. Content owners are responsible for deduplication.
What are the supported languages?	All the languages that OpenAIs models support.
Can multiple context sources, like vector databases + custom databases be used?	Yes, this is possible.
Is there an initial limit on the documents provided?	No, but it’s useful to only index what is truly needed, this makes the quality control easier. Documents are taken in and transformed by our ingestion-workers into markdown. Markdown is then broken apart into chunks preserving titles with paragraph connections. And tables with headings so that the ideal context is given to the models at retrieval time.
Are the sources of information selected automatically?	Yes, this is fully automatic.
What limitations on documents are there?	Images on documents are not yet included in the ingestion process.
Can defined users or applications be viewed in the tenant?	Yes, this is possible.
How can the chatbot in web application be integrated?	This can be achieved by utilizing Unique's APIs or by employing Iframe-like functionalities for front-end display.
Can the solution be integrated with Microsoft Dynamic CRM on premise?	Yes, this is possible.
Can Semantic Kernel with indexation, prompt, models be used?	Yes, this is possible.
Can LangChain be used to interact with indexation, prompt, models?	Yes, any python can be used with our APIs/SDK. See details about the APIs/SDK here: Software Development Kit (SDK)
How long does a typical RAG request work?	Time to streaming the answer is around 3-5 seconds depending on the use case.

Security & Compliance

How do you adhere to the data security measures implemented on the data source when querying data in the vector database?	There are access controls applied to adhere to this.
Can we restrict access with MFA or IP filtering?	Yes, both options are possible.
Can we have access to audit logs on resources security configuration?	Yes, audit logs be available upon request.
How can the conversation history be extracted?	You can extract your chat history via API.
Can we view defined users or applications in the tenant ?	Yes this is possible.
Where is client data hosted?	We work together with Microsoft Switzerland and our data is stored in the Azure Cloud in Switzerland.
Is there a process maintained to remove personal data based on the right to be forgotten if applicable to the services provided?	Yes, there is a process in place.
Are there any other locations outside Switzerland where data is stored?	Only if recorded through the app or uploaded manually on the Unique Portal the recording is temporarily (1 hour) stored in Frankfurt, Germany for transcription. Otherwise no.
Where are the videos saved that you record?	On Microsoft Azure cloud hosted in Switzerland protected by enterprise security standards of Microsoft.
Does Microsoft Switzerland share data with Microsoft US (based on the so called CLOUD Act)?	No. Data is never shared between Microsoft CH and Microsoft US.
Is full-disk encryption enabled for all systems that store or process Scoped Data?	Yes, it is.
Is recording of client conversations legally allowed?	Yes, in the European area as long as the caller asks for consent before recording (it is a GDPR requirement)
Does the US government have access to the data on Azure CH (based on the CLOUD Act)?	Not directly. The US government can request access to any data regardless of where it is stored based on the CLOUD Act if a judge approves the request These requests are blocked by Microsofts legal team in most cases before they would even be forwarded to us. Which is a clear benefit of the Azure cloud compared to on-premises storage, where all requests would be sent directly to the data controller

Author	Daylan Araz