Agentic Workflows / Autonomous Agents

All our prompts and functionalities work through Agents. In this document I will describe how to think about agents in the context of Unique and introduce a measure to express the Agency of a workflow.

Definitions:

  • Function (deprecated)

Agentic: what is it?

In the context of Large Language Models (LLMs), "agentic" refers to iterative workflows where the LLM is prompted multiple times to refine and improve its output incrementally, instead of generating the final output in a single pass. This approach aims to enhance the quality and accuracy of the results by mimicking human-like processes of revision and refinement.
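
As an illustration only, such an iterative generate-reflect-revise loop could look like the following minimal sketch, where call_llm is a hypothetical placeholder for any LLM call:

```python
def iterative_answer(question: str, rounds: int = 3) -> str:
    # Single-pass baseline: produce an initial draft.
    draft = call_llm(f"Answer the question: {question}")

    # Agentic refinement: critique and revise the draft a few times.
    for _ in range(rounds):
        critique = call_llm(f"List factual or logical problems in this answer:\n{draft}")
        if "no problems" in critique.lower():
            break  # the model considers the draft good enough
        draft = call_llm(f"Revise the answer using this critique:\n{critique}\n\nAnswer:\n{draft}")

    return draft
```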

How “agentic” an agent is can be expressed along 4 dimensions, according to Andrew Ng, who explained this nicely in a talk:

https://www.youtube.com/watch?v=sal78ACtGTc

These 4 dimensions are:

  • Reflection:
    The ability to look at the produced result and check whether it is correct given all the available information, just as a human would reread a written text and revise it to make the statements correct.

  • Tool use:
    The ability to choose and use different functionalities that are external to the LLM, for example making use of a search, calling an API, or sending an email, just as a human would decide which tool to use, for example using Outlook to send an email.

  • Planning:
    The ability to create subgoals to reach a given goal, just as a human would make a plan on how to reach a goal. For example, when planning a vacation you have the subgoals of booking a flight, booking the hotel, and so on.

  • Multi-agent collaboration:
    The ability to use multiple LLMs to achieve a certain task, just as humans in a company work together: the CEO, the developers, and the designers all work in tandem to achieve a certain goal and contribute to it.

For our workflows we rate each dimension on a scale from 1-5 to express how advanced the Agents are, that is, how autonomously the agent can act in the given dimension.


One Simple but Powerful Example: Our Agentic RAG flow

When a user uses a space with our Document Search (RAG flow) in it and makes a prompt like “What is our code of conduct about?”, the following happens (a minimal code sketch of the flow is shown after the list):

  1. Orchestrator: The Agent decides which of the given tools should be chosen, in this case the Document Search.

  2. Prompt Rephrasor: The Agent asks another agent (Prompt Rephrasor) to rephrase the question to better incorporate the chat history and to make sure the text is in the language of the underlying documents, which produces better matches. The API is then used to retrieve documents using different search techniques, resulting in the following retrieval flow:

    1. VectorSearch

    2. Full-text search (optional)

    3. Combine the two (optional)

    4. Rerank the results (optional)

  3. Relevancy Checker: This Agent checks the found document parts for relevancy using LLMs: is this chunk relevant for answering the given question? If not, the chunk is removed so that it does not confuse the model while answering. This step is highly parallelized to get the results fast. (optional)

  4. Document Search: The Agent uses the Document Search alongside the user's information and the prompt to filter the documents down to the correct ones, and surfaces the information to another Agent (Librarian) that assembles the information into an answer with references.

  5. Hallucination check: The given information is taken by yet another Agent (Hallucination Checker) to check whether the answer is hallucinated, by comparing the information content of the answer against the relevant sources.

  6. Answer revision: As an optional next step, the agent (Answer Revisor) can decide, based on the outcome of the hallucination check, to make another attempt at answering the question. (not yet implemented)
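
To make the steps above more concrete, here is the minimal sketch referenced before the list. It is illustrative only: call_llm, vector_search, fulltext_search, rerank, and the prompts are hypothetical placeholders, not the actual Unique APIs or agent prompts.

```python
def agentic_rag(user_prompt: str, history: list[str]) -> str:
    # 1. Orchestrator: decide which of the given tools to use.
    tool = call_llm(f"Pick a tool for: {user_prompt}")
    if tool != "DocumentSearch":
        return call_llm(user_prompt)  # other tools are out of scope for this sketch

    # 2. Prompt Rephrasor: rewrite the question using the history,
    #    in the language of the underlying documents.
    query = call_llm(f"Rephrase '{user_prompt}' given the history {history}")

    # Retrieval sub-flow: vector search, optional full-text search,
    # combination of the two, and optional reranking.
    chunks = vector_search(query)
    chunks = chunks + fulltext_search(query)   # optional
    chunks = rerank(query, chunks)             # optional

    # 3. Relevancy Checker: drop chunks that do not help answer the question
    #    (in practice this check runs highly parallelized).
    chunks = [c for c in chunks
              if call_llm(f"Is this chunk relevant to '{query}'? {c}") == "yes"]

    # 4. Librarian: assemble an answer with references from the remaining chunks.
    answer = call_llm(f"Answer '{query}' with references, using only: {chunks}")

    # 5. Hallucination check: verify the answer against the cited sources.
    grounded = call_llm(f"Is this answer supported by these sources? {answer} {chunks}")

    # 6. Answer revision (not yet implemented): a retry based on 'grounded' would go here.
    return answer
```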

Here is a diagram that illustrates the process:


Here is how we would rate this agent on the multi-dimensional scale defined above:
Reflection: 3/5
Tool Use: 2/5
Planning: 1/5
Multi-Agent: 3/5

Why don't we give Agents full autonomy yet?

Agents built with LLMs are not yet at the level of humans in decision making and need guidance to produce usable results.
For example, if you leave the Agents to fully autonomously plan and reflect on their plan, they often get stuck in a loop that they cannot exit on their own.

Another example: if you give the LLM too many tools to choose from, it is not good at making the right choice.

Agentic implementation

This workflow represents a structured and modular system designed to manage user inputs. The system operates in two main phases: module selection and function calling within a module, both aimed at ensuring that the user’s query or task is efficiently addressed using different internal resources and functions.

  1. Module Selection:

    • The Space Agent takes user input and selects one of several available modules. These modules represent different use cases (e.g., Internal Knowledge, Investment Assistant, Questionnaire, etc.).

    • Once a module is selected, the system transitions to the function-calling stage within a module.

  2. Function Calling within a module:

    • Inside a module (e.g., Internal Knowledge), specific functions are triggered based on user input or requirements.

    • Functions can include Knowledge Base Search, Table Search, Plot Generator, or Comparator.

    • The Module Agent determines which functions (if any) to call, based on its system state and the user's needs.

    • If no function is called, the result is streamed out and the process ends.

    • If one or more functions are called, their results (e.g., search results, plots, comparisons) are saved in the state/memory and appended to the history, allowing subsequent iterations to build upon previous outputs.

The module agent ensures that the appropriate functions are called within the loop until the required information is retrieved. However, the module selection phase does not allow looping to avoid unpredictable behavior.
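
Below is a minimal sketch of this two-phase structure. It assumes hypothetical helpers (select_module, choose_functions, run_function, answer_from_state) and a plain dict as state/memory; it is not the actual implementation.

```python
def handle_user_input(user_input: str, max_iterations: int = 5) -> str:
    # Phase 1: module selection (no looping here, by design).
    module = select_module(user_input)  # e.g. "Internal Knowledge"

    state = {"history": [user_input], "results": []}

    # Phase 2: function calling within the selected module, in a bounded loop.
    for _ in range(max_iterations):
        # The Module Agent decides which functions (if any) to call next,
        # based on its system state and the user's needs.
        functions = choose_functions(module, state)  # e.g. ["Knowledge Base Search"]

        if not functions:
            # No function called: the result is streamed out and the process ends.
            return answer_from_state(module, state)

        # Call the chosen functions (Knowledge Base Search, Table Search,
        # Plot Generator, Comparator, ...) and save their results in the
        # state/memory so that subsequent iterations can build on them.
        for fn in functions:
            state["results"].append(run_function(fn, state))
            state["history"].append(fn)

    # Safety net: the loop never runs indefinitely.
    return answer_from_state(module, state)
```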

Important points to consider:

  • The module agent system message is updated dynamically, depending on previous function calls and results, e.g. the information about the referencing style is only appended if sources were found in previous function calls (a minimal sketch of this follows the list below).

  • Modules need to be redesigned such that they include a loop of function calling. Each functionality of the module needs to be abstracted into a function that is usable across different modules.

  • Like this, we build up a collection of functions that can be activated/deactivated for new modules
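
A minimal sketch of the dynamic system message mentioned above, assuming the hypothetical state dict from the previous sketch and an assumed result shape; the actual message content is only indicative:

```python
def build_module_system_message(state: dict) -> str:
    # Base instructions are always present.
    parts = ["You are the Internal Knowledge module agent."]

    # The referencing-style instructions are only appended if previous function
    # calls actually found sources (assumed result shape: dicts with a 'type' key).
    sources = [r for r in state["results"] if isinstance(r, dict) and r.get("type") == "source"]
    if sources:
        parts.append("Cite every statement with a reference to one of the found sources.")

    return "\n".join(parts)
```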

 

 

Archive:

Two distinct styles of interaction with function calling stand out: loop interaction and sequential function execution.

1. Loop Interaction (Dynamic, Iterative Flow)

In the loop interaction style, the agent operates iteratively, continuously evaluating the user input in conjunction with previous results to determine the next necessary steps. This dynamic loop involves several key components:

  • Agent Evaluation: The agent receives the user's input and evaluates whether one or multiple functions are needed to process the request. It checks whether function calls are required, based on the specific query and the current context.

  • Function Execution: If functions are needed, the agent initiates them.

  • State Object: A central state object maintains the context throughout the interaction, storing the history of all prior function calls, user inputs, and their respective results. This state serves as a memory, ensuring the agent has full context for each iteration.

  • Iterative Process: Once function outputs are received, the agent reassesses whether more functions are required based on the current context and results. The process continues iteratively, looping back to the agent evaluation stage until:

    • The query is fully satisfied (i.e., the agent reaches a "finished" status).

    • A maximum number of iterations is reached, ensuring that the loop doesn’t run indefinitely.

Example Scenario:
A user asks, "What is the stock of the week and what is its price?"

  • First, the agent determines that it needs to call a function to identify the "stock of the week", e.g. AstraZeneca

  • Once this function is executed and the stock is identified, the agent re-evaluates and sees that it now needs to retrieve the price for this specific stock (AstraZeneca) in the database.

  • The process loops until both pieces of information are gathered, and then the final response is presented to the user.

This style is ideal for situations where information is interdependent or requires multiple layers of decision-making and real-time evaluation.
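
A minimal sketch of this loop interaction, using the stock-of-the-week example above; decide_next_functions, run_function, and compose_answer are hypothetical helpers, and a plain dict plays the role of the state object:

```python
def loop_interaction(user_input: str, max_iterations: int = 5) -> str:
    # Central state object: stores the user input, all prior function calls and results.
    state = {"input": user_input, "calls": [], "results": []}

    for _ in range(max_iterations):  # hard cap so the loop cannot run indefinitely
        # Agent evaluation: given the full context, which functions are needed now?
        decision = decide_next_functions(state)

        if decision == "finished":
            # The query is fully satisfied; produce the final response.
            return compose_answer(state)

        # Function execution: e.g. first "stock_of_the_week" -> "AstraZeneca",
        # then, in the next iteration, "get_price" for that specific stock.
        for fn in decision:
            state["calls"].append(fn)
            state["results"].append(run_function(fn, state))

    return compose_answer(state)  # fall back once the iteration limit is reached
```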

2. Sequential Function Execution (One-Time Evaluation, Linear Flow)

ChatGPT is currently using this flow → https://unique-ch.atlassian.net/wiki/spaces/Q/pages/646119595

In contrast, the sequential function execution style is simpler and involves only one round of agent evaluation. The key aspects of this approach are:

  • Single Agent Evaluation: The agent is called once to assess the user’s input. In this single step, it determines all necessary tasks and corresponding functions as well as the order in which they should be executed. This is a more straightforward process where no further agent involvement is needed after the initial evaluation.

  • Predefined Function Sequence: After the agent identifies which functions are needed, they are executed sequentially, one after the other, in the defined order. Each function’s output is either returned directly to the user or passed as input to the next function in the chain. The agent doesn't loop back or reassess the situation once the sequence has started.

  • No Further Re-evaluation: After all the functions are executed as initially determined by the agent, the flow ends. There is no iterative re-checking or reassessment of whether additional function calls are needed.

Example Scenario:
A user asks, "What is the weather in Zurich, and create an image about it."

  • The agent first calls a function to retrieve the weather in Zurich.

  • Once the weather information is obtained, it then calls another function to create an image based on the weather data.

  • Both functions are executed sequentially, and the agent doesn’t re-evaluate after each step. The flow finishes once all functions are complete.

This style is more efficient for cases where the tasks are independent, and the function calls do not require dynamic re-evaluation after each step. It provides a more linear, predictable process flow.
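
For comparison, a minimal sketch of the sequential style, using the weather-and-image example above; plan_functions and run_function are hypothetical helpers:

```python
def sequential_execution(user_input: str) -> str:
    # Single agent evaluation: determine all functions and their order up front,
    # e.g. ["get_weather_zurich", "create_weather_image"].
    plan = plan_functions(user_input)

    result = None
    for fn in plan:
        # Each function's output is passed to the next one in the chain;
        # there is no re-evaluation between steps.
        result = run_function(fn, previous=result)

    # The flow ends once all planned functions have executed.
    return result
```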


Key Differences Between the Two Styles:

Aspect | Loop Interaction | Sequential Function Execution
Agent Involvement | Continuous re-evaluation after each function call. | Single evaluation, no reassessment after execution.
Function Execution | Functions are called dynamically, based on ongoing evaluation. | Functions are called in a predefined sequence, determined in the initial step.
Ideal for | Complex queries where outputs depend on previous results. | Simple, independent queries where outputs can be processed in a fixed order.

 
