Frequently Asked Questions (FAQ)

Prompts

Can one configure custom prompts?	Yes, all prompts can be completely customised.
Is there a platform available for conducting automatic tests on prompts, including comparison of results?	Yes, our benchmarking automatically tests hundreds of prompts against the data and models in the system.
Is it possible to implement version control for prompts, such as maintaining a development version, publishing a beta version, and continuing to use a previous version?	Yes, it is possible to apply version control to all prompts within the system, allowing for the independent experimentation of new prompts without affecting those that are already operational. This facilitates the development of new prompts and Assistants.
Can prompts be shared by user-groups?	Yes, prompts can be defined by user-groups.
Can one get feedback on the prompts?	Yes, there is a feedback mechanism for each answer, so users can give feedback on the quality of the prompts.
Can one call configured prompts from an API?	Yes, this is possible.
Is it possible to configure unique disclaimers for each prompt?	Yes, disclaimers for each prompt can be configured by the system's Admin, who has the ability to set disclaimers per User-Group.
How are updates carried out?	Currently, updates are done via API, but a User Interface is expected to be launched in April.
Is chat history, encompassing questions and answers, stored somewhere, or are only the details of the current chat session retained? If stored, where is this information kept?	The chat history is stored in two places: (1) the audit logs, and (2) the history of the user, saved in a database accessible only by the client and not by Unique. Prompts will not be stored on Microsoft Azure, as we opted out of abuse monitoring, which prevents Microsoft from saving the prompts.
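
The prompt version control described in this section (a development version, a beta version, and a previous version still in use) can be pictured with a small sketch. This is a hypothetical in-memory model, not Unique's data model or API: the class `PromptRegistry`, its methods, and the channel names `dev` and `stable` are assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class PromptRegistry:
    """Hypothetical in-memory registry; not Unique's actual data model."""

    # prompt name -> list of prompt texts; list index + 1 is the version number
    versions: dict[str, list[str]] = field(default_factory=dict)
    # (prompt name, channel) -> version number currently served on that channel
    channels: dict[tuple[str, str], int] = field(default_factory=dict)

    def add_version(self, name: str, text: str) -> int:
        """Store a new version and return its number."""
        self.versions.setdefault(name, []).append(text)
        return len(self.versions[name])

    def publish(self, name: str, channel: str, version: int) -> None:
        """Point a channel (for example 'dev', 'beta', 'stable') at a version."""
        if not 1 <= version <= len(self.versions.get(name, [])):
            raise ValueError(f"unknown version {version} for prompt {name!r}")
        self.channels[(name, channel)] = version

    def resolve(self, name: str, channel: str) -> str:
        """Return the prompt text currently served on a channel."""
        return self.versions[name][self.channels[(name, channel)] - 1]


registry = PromptRegistry()
v1 = registry.add_version("summary", "Summarise the document.")
registry.publish("summary", "stable", v1)

# A new version is tried out on 'dev' without touching 'stable'.
v2 = registry.add_version("summary", "Summarise the document in five bullet points.")
registry.publish("summary", "dev", v2)
```

Because each channel only stores a pointer to a version, experimenting on one channel leaves the operational channel unchanged, and rolling back is a matter of publishing an earlier version number again.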

Large Language Models (LLMs)

What are the LLMs that can be used?	Any provisioned model can be connected to the system via config. So far, all OpenAI models have been tested: GPT 3.5, GPT 3.5-Turbo, GPT 4-Turbo and GPT 4-Vision (in various minor versions). Other LLMs: Mistral AI, Zephyr AI.
Can models be customised / bring our own models?	Yes, we even have customers that are doing this.
Is there a platform available for conducting automated tests and comparing results for custom models?	Yes, comparisons can be performed through benchmarking. This process is documented here: Benchmarking
How can one train, test, and deploy a model for use with Unique’s solution?	So far, we have not directly trained a model ourselves; instead, our customers have undertaken this task. However, our Data-Science team has provided support and guidance to them throughout the process.
Does one need to use Azure AI Studio?	There's no need to restrict yourself to Azure AI Studio exclusively. As long as the model can be provisioned, we are capable of integrating it.
Is it possible to implement version control for models, such as maintaining a development version, publishing a beta version, and continuing to use a previous version?	Yes, within the system, each prompt allows for the selection of the model and its version at will. We practice this on a regular basis, especially with the release of new minor or preview versions from Azure OpenAI.
Can models be shared by user-groups?	Yes, they can be scoped by user-groups.
Can models be restricted by user-groups?	Yes, they can be restricted by user-groups.
Can token consumption be tracked by model or by user-group?	This feature is currently under development and not yet available. However, an Analytics Framework with downloadable CSV reports is already in place and covers these points: user engagement, assistant usage, and most referenced files. Read more about this here: Analytics. A report including consumption by assistant/model is planned for Q2 2024.
Are there different prices depending on the models used?	Our pricing model remains fixed. However, the costs of the underlying models are set by Microsoft, are subject to change, and are transparently communicated back to you. We offer guidance on which prompts require specific models.
How is visibility kept on the costs related to the API usage?	We report the costs generated on the Subscription on a monthly basis. In the early days of the project, we can negotiate a faster rhythm.
Is it possible to set token limits for each model or user group, including actions like sending alerts or shutting down the API?	This feature is not available yet. It is currently under development and planned for Q3 2024.
Is it possible to grant standard access to ChatGPT-3.5, replacing the direct access currently provided to certain staff members?	Yes, this is even included in the base configuration of Unique. You can even give access to ChatGPT-4.
Is your solution offered on the MS Marketplace?	Unique is currently not offered in the MS Marketplace.
What tests have been done to select the appropriate LLMs?	We conducted benchmarks using our documents, and our clients performed similar tests. This process helps us select the most suitable models for each prompt or use case. While we have evaluated other models, we found that they do not yet match the performance of GPT-4, especially in situations requiring RAG.

Services

How do guardrails work?	The language model operates within a set structure, using only the data provided by the organisation to ensure its responses comply with specific standards and do not include external information not given by the company. Furthermore, by including citations in each reply, the origin of the information used in the responses can be traced. Additionally, extra safeguards can be implemented into the chat flow as needed, particularly if the user input encompasses forbidden or harmful material.
What tooling is used for pseudonymisation?	A local model is employed, executed directly within the cluster and independent of OpenAI, to recognise names and entities. These identified elements are subsequently substituted with anonymised tokens, which are later restored to their original form.
How is document ingestion maintained?	We maintain multiple default ingestion pipelines for the different types of files. See the documentation here: Ingestion. Customers can build their own pipelines in the context of our Co-Development Agreement if needed. We are continuously improving ingestion to get the best possible results for the RAG.
Are the sources always shared with the users?	Yes, Unique adds references to each answer to indicate to the user where the information is coming from. This happens through the RAG process.
Can automated workflows be executed?	Yes, we already have customers that use our API to execute workflows autonomously without the intervention of a user.
How is the continuous feedback loop orchestrated?	As an admin, you can export the user feedback as CSV on demand. There will be monthly meetings with the project lead to analyse the feedback and to derive improvement options.
Can your system integrate with various Identity Providers (IDPs), and does it support seamless user provisioning and login with credentials from external systems?	Your IDP can be integrated into our system. Your logins can be used, and users are automatically provisioned. The supported identity providers are listed here: https://zitadel.com/docs/guides/integrate/identity-providers
What gets anonymised and how does it work?	The anonymisation service processes the prompt intended for the OpenAI Endpoint by performing Named-Entity Recognition. It replaces identified entities with placeholders before sending them to the model. Once the model responds, the anonymised placeholders are replaced with the original identifying data. The user will not receive the anonymised entities in the response. Additionally, the data is stored in subscription databases, which are exclusively accessible by the client.
How flexibly can new services be developed and tested?	New services can be developed and tested independently. Each developer can run an independent version of FinanceGPT on their local machine to develop without interfering with others.
How would customised workflows be prepared and released?	If you develop your own assistants that are not coming as part of the default, these assistants need to be deployed. The deployment can be orchestrated by you or us. Below you find a drawing explaining the process.
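
Separately from the deployment drawing, the anonymisation round trip described in this section (recognise entities, replace them with placeholders before the model call, restore them in the response) can be sketched as follows. This is a minimal sketch under stated assumptions: the real service uses a local Named-Entity Recognition model, which is replaced here by a hand-supplied entity list, and the function names and the `<ENTITY_n>` placeholder format are hypothetical.

```python
import re


def pseudonymise(text: str, entities: list[str]) -> tuple[str, dict[str, str]]:
    """Replace each listed entity with a placeholder; return text and mapping."""
    mapping: dict[str, str] = {}
    # Deduplicate in input order, then handle longer entities first so that
    # "Anna Muster" is replaced before a shorter overlapping entity like "Anna".
    unique = sorted(dict.fromkeys(entities), key=len, reverse=True)
    for index, entity in enumerate(unique, start=1):
        placeholder = f"<ENTITY_{index}>"
        pattern = re.compile(rf"\b{re.escape(entity)}\b")
        if pattern.search(text):
            text = pattern.sub(placeholder, text)
            mapping[placeholder] = entity
    return text, mapping


def restore(text: str, mapping: dict[str, str]) -> str:
    """Put the original entities back into the model's response."""
    for placeholder, entity in mapping.items():
        text = text.replace(placeholder, entity)
    return text


prompt = "Draft a reply to Anna Muster at Example Bank about her mortgage."
# In the real flow a local NER model would produce this list (assumption).
masked, mapping = pseudonymise(prompt, ["Anna Muster", "Example Bank"])

# Only `masked` would be sent to the model; the reply is simulated here.
model_reply = f"Echo: {masked}"
restored = restore(model_reply, mapping)
```

The mapping never leaves the caller, so the model only sees placeholders while the user still receives a response containing the original names.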

Architecture, RAG, Vectors and more

What technologies are used in the RAG pattern?	For vectorisation, the ADA embedding model from Azure OpenAI is used. We use Qdrant (self-hosted) to save the vectors and the metadata. For saving text, we use Postgres (Azure service). To learn more about our architecture, see here: Architecture
Why is vector DB Qdrant being used?	Qdrant performs very well on metadata filtering and similarity search compared to others. This is also needed for access control lists (ACL).
Do you duplicate data and store a local copy of indexed documents?	Yes, we store the data locally.
What are the existing connectors?	SharePoint (online/on prem), Confluence (online/on prem), website crawlers.
Can a local (on premise) vector database be used?	If Unique is deployed on prem, yes. But in phase 1, it’s a workload we deploy on Azure fully encrypted.
Do connectors support images, video and sound indexation?	Currently not, though we are exploring options with GPT-4 Vision.
Can a hook be added in the dataflow to check data before indexing?	This is planned for Q3 2024 on the product roadmap.
Can a hook be added to enrich metadata during indexing?	This is planned for Q3 2024 on the product roadmap.
How is the lifecycle of indexed documents managed?	When the same document is ingested again, it is replaced with the new version. Content owners are responsible for deduplication.
What are the supported languages?	All the languages that OpenAI's models support.
Can multiple context sources, such as vector databases plus custom databases, be used?	Yes, this is possible.
Is there an initial limit on the documents provided?	No, but it is useful to index only what is truly needed, as this makes quality control easier. Documents are taken in and transformed into markdown by our ingestion workers. The markdown is then broken into chunks that preserve titles with their paragraph connections, and tables with their headings, so that the ideal context is given to the models at retrieval time.
Are the sources of information selected automatically?	Yes, this is fully automatic.
What limitations on documents are there?	Images in documents are not yet included in the ingestion process.
Can defined users or applications be viewed in the tenant?	Yes, this is possible.
How can the chatbot be integrated into a web application?	This can be achieved by utilizing Unique's APIs or by employing iframe-like functionalities for front-end display.
Can the solution be integrated with Microsoft Dynamics CRM on premise?	Yes, this is possible.
Can Semantic Kernel with indexation, prompt, models be used?	Yes, this is possible.
Can LangChain be used to interact with indexation, prompt, models?	Yes, any Python library can be used with our APIs/SDK. See details about the APIs/SDK here: Software Development Kit (SDK)
How long does a typical RAG request take?	The time until the answer starts streaming is around 3-5 seconds, depending on the use case.
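
The chunking step described in the answer on document limits above (markdown broken into chunks that keep their titles, with tables kept together with their headings) can be illustrated with a short sketch. It is not Unique's ingestion code: the function `chunk_markdown` and the chunk layout are hypothetical, and the sketch assumes well-formed markdown in which headings, paragraphs and tables are separated by blank lines.

```python
def chunk_markdown(markdown: str) -> list[dict[str, str]]:
    """Split markdown into block-level chunks that carry their heading path."""
    chunks: list[dict[str, str]] = []
    headings: list[str] = []  # current heading path, one entry per level
    for block in markdown.split("\n\n"):
        block = block.strip()
        if not block:
            continue
        marker, _, title = block.partition(" ")
        if marker and set(marker) == {"#"}:
            level = len(marker)
            # Drop headings at the same or a deeper level, then add this one.
            headings = headings[: level - 1] + [title.strip()]
            continue
        # Paragraphs and whole tables (header row included) stay in one chunk.
        chunks.append({"title": " > ".join(headings), "text": block})
    return chunks


doc = """# Fees

Account fees are charged monthly.

## Overview

| Product | Fee |
| --- | --- |
| Basic | 5 CHF |
"""
chunks = chunk_markdown(doc)
for chunk in chunks:
    print(chunk["title"], "->", chunk["text"].splitlines()[0])
```

Storing the heading path next to each chunk means a retrieved table or paragraph still tells the model which part of the document it came from.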

Data Protection, Data Retention and Data Storage

Where is client data hosted?	We work together with Microsoft Switzerland and our data is stored in the Azure Cloud in Switzerland.
How is my data segregated from other customers' data?	If you choose the Platform as a Service deployment option, your data is logically separated from other customers. If you have stronger requirements regarding tenant separation, the single tenant deployment option physically separates your data from other customers in your own Azure landing zone.
Will my data be used to train any models or fine-tune models?	No. No client data will be used without explicit written consent from the client.
Does the Azure OpenAI model learn from my data?	No. Azure OpenAI models never learn from data and Unique has an opt-out available from output checking with Microsoft.
Will my data be sent to “unsafe, third countries”?	No. All data remains in Switzerland for data hosting and processing. If you choose the single tenant or customer tenant deployment option, then no client data will leave your dedicated single tenant.
Do you have a data processing agreement in place?	Yes, we do have a DPA: https://www.unique.ch/data-processing-addendum
Do you have Terms of Use?	Yes, we do have Terms of Use for end users.
Do you have an AI Governance in place?	Yes, we do have an AI Governance framework. More details can be found here: AI Governance
Does Microsoft Switzerland share data with Microsoft US (based on the so called CLOUD Act)?	No. Data is never shared between Microsoft CH and Microsoft US.
Are prompts attributable to specific users or organizations (when no identifying information is included in the prompt)? If no, can you provide evidence of the controls?	Prompts are associated with a specific user (audit logs) via login credentials. If you choose the single tenant or customer tenant deployment option, this data will only be stored in the client-specific tenant.
How long will our inputs/prompts be retained if submitted via the user interface?	Prompts are not stored. All relevant data, including prompts and output, is processed in memory in the model and never stored. Neither Unique nor Microsoft uses prompts or any customer data to train the AI model.
How long will our inputs/prompts be retained if submitted via the API?	Prompts are not stored. All relevant data, including prompts and output, is processed in memory in the model and never stored. Neither Unique nor Microsoft uses prompts or any customer data to train the AI model.
Are there different data retention policies for the user interface versus the API?	No.
Do you have controls in place to ensure the foundational model was not trained with prohibited or biased content?	We rely on Microsoft's public statements that they will cover costs for IP infringements if needed (Customer Copyright Commitment Required Mitigations | Microsoft Learn).
Is the model data de-identified, aggregated, and anonymized?	No. We will integrate your DLP to run on audit logs after user interaction.
Have you performed any independent audits or validation of AI model outputs?	We perform regular internal tests and compare different models (see Benchmarking). This has not been part of an external validation report so far.
Are you a data controller or data processor?	We are acting as a data processor of your data only.

Security & Compliance

How do you adhere to the data security measures implemented on the data source when querying data in the vector database?	We have dedicated access controls applied to adhere to this.
Can we restrict access with MFA or IP filtering?	Yes, both options are possible.
Can we have access to audit logs on resources security configuration?	Yes, audit logs are available upon request.
How can the conversation history be extracted?	You can extract your chat history via API.
Can we view defined users or applications in the tenant?	Yes, this is possible.
Is there a process maintained to remove personal data based on the right to be forgotten if applicable to the services provided?	Yes, there is a process in place.
Are there any other locations outside Switzerland where data is stored?	Only in one case: if a recording is made through the app or uploaded manually on the Unique Portal, it is temporarily (1 hour) stored in Frankfurt, Germany, for transcription. Otherwise, no.
Is full-disk encryption enabled for all systems that store or process customer data?	Yes, it is.
Where are the videos saved that you record?	On Microsoft Azure cloud hosted in Switzerland protected by enterprise security standards of Microsoft.
Is recording of client conversations legally allowed?	Yes, in the European area, as long as the caller asks for consent before recording (this is a GDPR requirement).
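
To illustrate the first answer in this section, on respecting source access controls when querying the vector database, the sketch below combines a metadata filter with a similarity search. It is a pure-Python illustration of the idea only, not Qdrant's API and not Unique's implementation. The field name `allowed_groups`, the function names and the sample vectors are assumptions.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query: list[float], user_groups: set[str],
           points: list[dict], top_k: int = 3) -> list[str]:
    """Rank only the points whose ACL shares at least one group with the user."""
    allowed = [p for p in points if p["allowed_groups"] & user_groups]
    ranked = sorted(allowed, key=lambda p: cosine(query, p["vector"]), reverse=True)
    return [p["id"] for p in ranked[:top_k]]


# Hypothetical indexed chunks with the groups copied from the source system.
points = [
    {"id": "hr-policy", "vector": [0.9, 0.1], "allowed_groups": {"hr"}},
    {"id": "public-faq", "vector": [0.8, 0.2], "allowed_groups": {"hr", "sales"}},
    {"id": "sales-deck", "vector": [0.1, 0.9], "allowed_groups": {"sales"}},
]

result = search([1.0, 0.0], {"sales"}, points)
```

The filter runs before ranking, so a chunk the user may not read is never returned, even when it is the closest match to the query.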

Cloud Computing & Development

What type of cloud service is offered (SaaS, PaaS or IaaS)?	We offer a SaaS model.
Is virtualization used in the services?	Yes, we use virtualization in providing services, and KPIs/SLAs are tracked for reporting. However, we do not have hypervisor vulnerability management in place.
Is capacity planning conducted to prevent any redirection of contracted capacity to other tenants?	Yes, capacity planning is conducted on an ad-hoc basis when the utilization reaches a threshold limit to prevent any redirection of contracted capacity to other tenants without approval.
Is a client allowed to conduct penetration testing of the cloud infrastructure that is hosting its data?	Yes, you are allowed to test the cloud infrastructure that hosts your data after informing us at least 5 business days beforehand.

Author	Daylan Araz, Michael Dreher, Sina Wulfmeyer, Tom Hobbs