Frequently Asked Questions (FAQ)
Prompts
Can one configure custom prompts? | Yes, all Prompts can be completely customized. |
---|---|
Is there a platform available for conducting automatic tests on prompts, including comparing results, etc.? | Yes, we have a benchmarking platform that automatically tests hundreds of prompts against the data and models in the system. |
Is it possible to implement version control for prompts, such as maintaining a development version, publishing a beta version, and continuing to use a previous version? | Yes, it is possible to apply version control to all prompts within the system, allowing for the independent experimentation of new prompts without affecting those that are already operational. This facilitates the development of new prompts and Assistants. |
Can prompts be shared by user-groups? | Yes, prompts can be defined by user-groups. |
Can one get feedback for the prompts? | Yes, there is a feedback mechanism for each answer so the users can give feedback on the quality of the prompts. |
Can one call configured prompts from an API? | Yes, this is possible. |
Is it possible to configure unique disclaimers for each prompt? | Yes, disclaimers for each prompt can be configured by the system's Admin, who has the ability to set disclaimers per user-group. |
How are updates carried out? | Currently, updates are done via API, but a User Interface is expected to be launched in April. |
Is chat history, encompassing questions and answers, stored somewhere, or are only the details of the current chat session retained? If stored, where is this information kept? | The chat history is stored in two places:
Prompts will not be stored on Microsoft Azure as we opted out of abuse monitoring, preventing Microsoft from saving the prompts. |
How does Unique FinanceGPT prevent CID information in user prompts? | In principle, Unique's users are strongly warned not to paste any CID or personal data when using FinanceGPT. In addition, technically:
|
Large Language Models (LLMs)
What are the LLMs that can be used? | Any provisioned model can be connected to the system via config. So far, all OpenAI models have been tested:
Other LLMs:
|
Is the availability of Azure OpenAI models restricted to certain regions? | Yes, the availability of Azure OpenAI models is dependent on the deployment region. The specific models available in each region can be checked on the Azure website: Azure OpenAI Service models - Azure OpenAI This page is regularly updated, allowing users to see when new models become available in different regions. |
Will Azure retire OpenAI models over time? | Yes, Microsoft does retire OpenAI models over time. Information about model retirements is provided on the Azure page under “Model Retirements.” Azure OpenAI Service model retirements - Azure OpenAI This section includes a table listing the current models, their versions, and the retirement dates. Clients will be informed in advance about any model retirements, as well as new models becoming available in their region. |
What is the difference between model retirement and deprecation communicated by Azure for their provided OpenAI models? | The difference between retirement and deprecation of a model is important to understand:
|
Can models be customized/bring our own models? | Yes, several of our customers are already doing this. |
Is there a platform available for conducting automated tests and comparing results for custom models? | Yes, through benchmarking it is possible to perform comparisons. Documentation for this process is available here: https://unique-ch.atlassian.net/wiki/x/AQDJIw |
How can one train, test, and deploy a model for use with Unique’s solution? | So far, we have not directly trained a model ourselves; instead, our customers have undertaken this task. However, our Data-Science team has provided support and guidance to them throughout the process. |
Does one need to use Azure AI Studio? | There's no need to restrict yourself to Azure AI Studio exclusively. As long as the model can be provisioned, we are capable of integrating it. |
Is it possible to implement version control for models, such as maintaining a development version, publishing a beta version, and continuing to use a previous version? | Yes, within the system, each prompt allows for the selection of the model and its version at will. We practice this on a regular basis, especially with the release of new minor or preview versions from Azure OpenAI. |
Can models be shared by user-groups? | Yes, models can be scoped by user-groups. |
Can models be restricted by user-groups? | Yes, models can be restricted by user-groups. |
Can token consumption be followed by model or by user-groups? | This feature is currently under development and not available yet. However, an Analytics Framework with downloadable CSV-Reports is already in place and covers these points:
Read more about this here: Analytics A report incl. consumption by assistant/model is planned for Q2 2024. |
Are there several types of prices depending on the models used? | Our pricing model remains fixed; however, the costs of the underlying models set by Microsoft are subject to change and are transparently passed on to you, so prices may fluctuate. We offer guidance on which prompts require specific models. |
How is visibility kept on the costs related to API usage? | We report the costs generated on the subscription on a monthly basis. In the early phase of a project, a faster cadence can be negotiated. |
Is it possible to set token limits for each model or user group, including actions like sending alerts or shutting down the API? | This feature is not available yet and is currently under development, planned for Q3 2024. |
Is it possible to grant standard access to ChatGPT-3.5, replacing the direct access currently provided to certain staff members? | Yes, this is even included in the base configuration of Unique. You can even give access to ChatGPT-4. |
Is your solution offered on the MS Marketplace? | Unique is currently not offered in the MS Marketplace. |
What tests have been done to select the appropriate LLM models? | We conducted benchmarks using our documents, and our clients performed similar tests. This process helps us select the most suitable models for each prompt or use case. While we have evaluated other models, we found that they do not yet match the performance of GPT-4, especially in situations requiring RAG. |
Services
How do guardrails work? | The language model operates within a set structure, using only the data provided by the organization to ensure its responses comply with specific standards and do not include external information not given by the company. Furthermore, by including citations in each reply, the origin of the information used in the responses can be traced. |
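The grounding described above can be illustrated with a minimal prompt-construction sketch. The function name, wording, and chunk format are hypothetical illustrations, not Unique's actual template; the sketch assumes retrieved chunks carry a source identifier so the model can cite them:

```python
def build_grounded_prompt(question: str, chunks: list[dict]) -> str:
    """Constrain the model to the supplied context and require citations."""
    # Number each chunk so the model can cite it as [1], [2], ...
    context = "\n".join(
        f"[{i}] ({c['source']}) {c['text']}" for i, c in enumerate(chunks, start=1)
    )
    return (
        "Answer ONLY from the context below. If the answer is not in the "
        "context, say you do not know. Cite sources as [n].\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )

prompt = build_grounded_prompt(
    "What is the notice period?",
    [{"source": "hr_policy.pdf", "text": "The notice period is 3 months."}],
)
```

Because every chunk is numbered and tagged with its source, the citations in the model's reply can be traced back to the original documents.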
What tooling is used for pseudonymisation? | A local model is employed, executed directly within the cluster and independent of OpenAI, to recognize names and entities. These identified elements are subsequently substituted with anonymized tokens, which are later restored to their original form. |
How is document ingestion maintained? | We maintain multiple default ingestion pipelines for the different file types. See the documentation here: Ingestion. Customers can build their own in the context of our Co-Development Agreement if needed. We are continuously improving to get the best possible results for RAG. |
How long is the retention period for uploaded files? | Clients can configure the retention period for uploaded files as they wish. Most of our clients have set it between 2 and 7 days. |
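A configurable retention period like the one above could be enforced by a periodic cleanup job along these lines (a minimal sketch with a hypothetical function name, not Unique's actual implementation):

```python
from datetime import datetime, timedelta, timezone

def expired_files(files: dict, retention_days: int, now=None) -> list[str]:
    """Return the ids of uploads older than the configured retention period."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=retention_days)
    return [fid for fid, uploaded in files.items() if uploaded < cutoff]

now = datetime(2024, 6, 10, tzinfo=timezone.utc)
files = {
    "a.pdf": datetime(2024, 6, 1, tzinfo=timezone.utc),  # 9 days old
    "b.pdf": datetime(2024, 6, 9, tzinfo=timezone.utc),  # 1 day old
}
stale = expired_files(files, retention_days=7, now=now)
# stale == ["a.pdf"]
```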
Are the sources always shared with the users? | Yes, Unique adds references to each answer to indicate to the user where the information is coming from. This happens through the RAG process. |
Can automated workflows be executed? | Yes, we already have customers that use our API to execute workflows autonomously without the intervention of a user. |
How is a continuous feedback loop orchestrated? | As an admin, you can export the user feedback as CSV on demand. There will be monthly meetings with the project lead to analyze the feedback and derive improvement options. |
Can your system integrate with various Identity Providers (IDPs), and does it support seamless user provisioning and login with credentials from external systems? | The IDP can be integrated into our system. Your logins can be used, and users are automatically provisioned. We support the following list: ZITADEL Docs |
What gets anonymized and how does it work? | The anonymization service processes the prompt intended for the OpenAI Endpoint by performing Named-Entity Recognition. It replaces identified entities with placeholders before sending them to the model. Once the model responds, the anonymized placeholders are replaced with the original identifying data. The user will not receive the anonymized entities in the response. Additionally, the data is stored in subscription databases, which are exclusively accessible by the client. |
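The placeholder round-trip described above can be sketched as follows. In the real service the entity list comes from the local NER model; here it is hard-coded for illustration, and the function and placeholder names are assumptions:

```python
def anonymize(text: str, entities: list[str]) -> tuple[str, dict]:
    """Replace recognized entities with placeholders before the LLM call."""
    mapping = {}
    for i, ent in enumerate(entities):
        placeholder = f"<ENT_{i}>"
        mapping[placeholder] = ent
        text = text.replace(ent, placeholder)
    return text, mapping

def deanonymize(text: str, mapping: dict) -> str:
    """Restore the original entities in the model's response."""
    for placeholder, ent in mapping.items():
        text = text.replace(placeholder, ent)
    return text

masked, mapping = anonymize("Contact Alice Meier about the mandate.", ["Alice Meier"])
# masked == "Contact <ENT_0> about the mandate."
restored = deanonymize(masked, mapping)
# restored == "Contact Alice Meier about the mandate."
```

Only the masked text leaves the cluster; the mapping needed for restoration stays with the client-side service.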
What happens with client names in the recordings, are they anonymized? | Clients show up as “Participant X” in the recording transcripts until you explicitly assign a name to them. After that, they are recognized by name on other recordings in the same deal. |
How flexibly can new services be developed and tested? | New services can be developed and tested independently. Each developer can run an independent version of FinanceGPT on their local machine to develop without interfering with others. |
How would customized workflows be prepared and released? | If you develop your own assistants that are not coming as part of the default, these assistants need to be deployed. The deployment can be orchestrated by you or us. Below you find a drawing explaining the process. |
Can we view defined users or applications in the tenant? | Yes, this is possible. |
Is there monitoring and alerting for the network? | Yes. |
Is encryption and integrity protection in place for all external (public) network traffic that potentially carries sensitive information? | Yes. |
Do you use an automated source code analysis tool to detect security defects in code prior to production? | Yes, GH Advanced security and trivy. |
What service hosting models and deployment models are provided as part of Unique services? |
|
Is a website supported, hosted, or maintained that has access to customer systems and data? | Yes. |
Architecture, RAG, Vectors, and more
What technologies are used in the RAG pattern? | For vectorisation, Azure OpenAI's ADA embedding model is used. To learn more about our architecture, see here: Architecture. We use Qdrant (self-hosted) to store the vectors and metadata. For storing text, we use Postgres (an Azure service). |
Why is vector DB Qdrant being used? | Qdrant performs very well on metadata filtering and similarity search compared to alternatives. This is also needed for ACL (access-control-list) filtering. |
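The combination of metadata filtering and similarity search mentioned above can be illustrated conceptually in pure Python (this deliberately does not use Qdrant's API; the data shapes and function names are assumptions):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def filtered_search(query_vec, points, allowed_groups, top_k=2):
    """Apply the ACL metadata filter first, then rank by similarity."""
    visible = [p for p in points if p["group"] in allowed_groups]
    return sorted(visible, key=lambda p: cosine(query_vec, p["vec"]), reverse=True)[:top_k]

points = [
    {"id": "doc-a", "vec": [1.0, 0.0], "group": "legal"},
    {"id": "doc-b", "vec": [0.9, 0.1], "group": "sales"},
    {"id": "doc-c", "vec": [0.0, 1.0], "group": "legal"},
]
hits = filtered_search([1.0, 0.0], points, allowed_groups={"legal"})
# "doc-b" is excluded by the ACL filter despite its high similarity
```

Filtering before ranking is what makes the pattern safe for access control: a document a user may not see never enters the candidate set, no matter how similar it is to the query.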
Do you duplicate data and store a local copy of indexed documents? | Yes, we store the data locally. |
What are the existing connectors? |
|
Can a local (on-premise) vector database be used? | If Unique is deployed on-prem, yes. But in phase 1, it is a workload we deploy on Azure, fully encrypted. |
Do connectors support images, video, and sound indexation? | Currently not, though we are exploring options with GPT-4 Vision. |
Can a hook be added in the dataflow to check data before indexing? | This is planned for Q3 2024 on the product roadmap. |
Can a hook be added to enrich metadata during indexing? | This is planned for Q3 2024 on the product roadmap. |
How is the lifecycle of indexed documents managed? | The same documents are replaced with the new version. Content owners are responsible for deduplication. |
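The replace-in-place lifecycle described above amounts to an upsert keyed on a stable document id (a minimal sketch; the function name and record shape are assumptions):

```python
def upsert_document(index: dict, doc_id: str, content: str, version: int) -> None:
    """Replace an existing document with its new version; never duplicate."""
    index[doc_id] = {"content": content, "version": version}

index: dict = {}
upsert_document(index, "policy-001", "Old wording.", version=1)
upsert_document(index, "policy-001", "New wording.", version=2)  # replaces in place
```

Because the id is the key, re-ingesting a document can never create a second copy; deduplication across *different* ids remains the content owner's responsibility, as noted above.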
What are the supported languages? | Unique supports the languages that are listed and offered on Azure: |
Can multiple context sources, like vector databases + custom databases be used? | Yes, this is possible. |
Is there an initial limit on the documents provided? | No, but it is useful to index only what is truly needed, as this makes quality control easier. Documents are ingested and transformed by our ingestion workers into markdown. The markdown is then broken into chunks, preserving titles with their paragraph connections and tables with their headings, so that the ideal context is given to the models at retrieval time. |
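The heading-preserving chunking described above can be sketched as follows (a simplified illustration, not Unique's actual ingestion worker; the function name and size cap are assumptions):

```python
def chunk_markdown(md: str, max_chars: int = 200) -> list[str]:
    """Split markdown into chunks, prefixing each with its nearest heading."""
    heading = ""
    chunks = []
    for block in md.split("\n\n"):
        block = block.strip()
        if not block:
            continue
        if block.startswith("#"):
            heading = block  # remember the section title for following blocks
            continue
        # Keep the heading attached so the chunk carries its context
        text = f"{heading}\n{block}" if heading else block
        # Naive size cap: split overly long paragraphs
        for i in range(0, len(text), max_chars):
            chunks.append(text[i : i + max_chars])
    return chunks

doc = "# Fees\n\nCustody fees are 0.2% p.a.\n\n# Notice\n\nNotice period is 3 months."
chunks = chunk_markdown(doc)
# Every chunk starts with its section heading, e.g. "# Fees\nCustody fees ..."
```

Carrying the title into each chunk is what lets a retrieved fragment remain interpretable to the model even when the rest of its section is not in the context window.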
Are the sources of information selected automatically? | Yes, this is fully automatic. |
What limitations on documents are there? | Images on documents are not yet included in the ingestion process. |
Can defined users or applications be viewed in the tenant? | Yes, this is possible. |
How can the chatbot in web applications be integrated? | This can be achieved by utilizing Unique's APIs or by employing Iframe-like functionalities for front-end display. |
Can the solution be integrated with Microsoft Dynamic CRM on-premise? | Yes, this is possible. |
Can Semantic Kernel with indexation, prompt, and models be used? | Yes, this is possible. |
Can LangChain be used to interact with indexation, prompts, and models? | Yes, any Python code can be used with our APIs/SDK. See details about the APIs/SDK here: Software Development Kit (SDK) |
How long does a typical RAG request take? | Time to streaming the answer is around 3-5 seconds depending on the use case. |
Cloud Computing & Development
What features does the Cloud service offer? | Measured service: resource use is controlled and optimized by leveraging a metering capability. |
Upon contract termination, what happens to the data? | Data is securely erased/destroyed, or returned to the client within a defined time frame when requested. |
Are technical measures applied for defense-in-depth techniques (e.g., deep packet analysis, traffic throttling, and black-holing)? | No, we do not have technical measures that provide defense-in-depth for detection and timely response to network-based attacks. |