What are Spring AI Advisors?
At their core, Spring AI Advisors are components that intercept and potentially modify the flow of chat-completion requests and responses in your AI applications. The key player in this system is the AroundAdvisor, which allows developers to dynamically transform or utilize information within these interactions. The main benefits of using Advisors include:
- Encapsulation of Recurring Tasks: Package common GenAI patterns into reusable units.
- Transformation: Augment data sent to Language Models (LLMs) and format responses sent back to clients.
- Portability: Create reusable transformation components that work across various models and use cases.
How Advisors Work
The Advisor system operates as a chain, with each Advisor in the sequence having the opportunity to process both the incoming request and the outgoing response. Here’s a simplified flow:
- An AdvisedRequest is created from the user’s prompt, along with an empty advisor-context.
- Each Advisor in the chain processes the request, potentially modifying it, and then forwards the execution to the next advisor in the chain. Alternatively, it can choose to block the request by not making the call to invoke the next entity.
- The final Advisor sends the request to the Chat Model.
- The Chat Model’s response is passed back through the advisor chain as an AdvisedResponse, a combination of the original ChatResponse and the advise context from the input path of the chain.
- Each Advisor can process or modify the response.
- The augmented ChatResponse from the final AdvisedResponse is returned to the client.
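The flow above can be sketched in plain Java. This is a simplified model, not the Spring AI API: Request, Response, Advisor, Chain, and the fake model function are hypothetical stand-ins for AdvisedRequest, AdvisedResponse, the around-advisor interfaces, and the Chat Model.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-ins for AdvisedRequest / AdvisedResponse (not the Spring AI types).
record Request(String userText, Map<String, Object> context) {}
record Response(String text, Map<String, Object> context) {}

// An advisor sees the request, may change it, and decides whether to call the next link.
interface Advisor {
    Response aroundCall(Request request, Chain chain);
}

// Walks the advisor list; once the advisors are exhausted it calls the model.
final class Chain {
    private final List<Advisor> advisors;
    private final Function<String, String> model;
    private final int index;

    Chain(List<Advisor> advisors, Function<String, String> model) {
        this(advisors, model, 0);
    }

    private Chain(List<Advisor> advisors, Function<String, String> model, int index) {
        this.advisors = advisors;
        this.model = model;
        this.index = index;
    }

    Response next(Request request) {
        if (index == advisors.size()) {
            // End of the chain: send the (possibly modified) request to the model.
            return new Response(model.apply(request.userText()), request.context());
        }
        return advisors.get(index).aroundCall(request, new Chain(advisors, model, index + 1));
    }
}

final class ChainFlowDemo {
    static String run(String prompt) {
        // Modifies the request on the way in.
        Advisor trim = (req, chain) -> chain.next(new Request(req.userText().trim(), req.context()));
        // Modifies the response on the way out.
        Advisor shout = (req, chain) -> {
            Response inner = chain.next(req);
            return new Response(inner.text().toUpperCase(), inner.context());
        };
        Function<String, String> fakeModel = text -> "echo: " + text;
        Chain chain = new Chain(List.of(trim, shout), fakeModel);
        return chain.next(new Request(prompt, new HashMap<>())).text();
    }

    public static void main(String[] args) {
        System.out.println(run("  hello  "));
    }
}
```

Each advisor wraps the rest of the chain, which is why a single method can act on both the request and the response.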
Using Advisors
Spring AI comes with several pre-built Advisors to handle common scenarios and Gen AI patterns:
- MessageChatMemoryAdvisor, PromptChatMemoryAdvisor, and VectorStoreChatMemoryAdvisor: These manage conversation history in various ways.
- QuestionAnswerAdvisor: Implements the RAG (Retrieval-Augmented Generation) pattern for improved question-answering capabilities.
- SafeGuardAdvisor: A very basic advisor, based on a list of sensitive words, that helps prevent the model from generating harmful or inappropriate content. It demonstrates how to block a request by not invoking the next advisor in the chain. In this case, the advisor itself is responsible for filling out the response or throwing an error.
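Blocking a request can be illustrated with a small plain-Java sketch. It is not the SafeGuardAdvisor source: the SensitiveWordGuard class, the word list, and the refusal text are hypothetical, and the rest of the chain is modelled as a simple Function.

```java
import java.util.List;
import java.util.Locale;
import java.util.function.Function;

// Hypothetical sensitive-word guard; 'next' stands in for the rest of the advisor chain.
final class SensitiveWordGuard {
    private final List<String> sensitiveWords;
    private final String refusal;

    SensitiveWordGuard(List<String> sensitiveWords, String refusal) {
        this.sensitiveWords = List.copyOf(sensitiveWords);
        this.refusal = refusal;
    }

    String aroundCall(String userText, Function<String, String> next) {
        String lower = userText.toLowerCase(Locale.ROOT);
        boolean blocked = sensitiveWords.stream()
                .anyMatch(word -> lower.contains(word.toLowerCase(Locale.ROOT)));
        if (blocked) {
            // Not calling next.apply(...) blocks the request, so this advisor
            // has to supply the response itself (or throw).
            return refusal;
        }
        return next.apply(userText);
    }
}

final class GuardDemo {
    public static void main(String[] args) {
        SensitiveWordGuard guard =
                new SensitiveWordGuard(List.of("password"), "I can't help with that.");
        Function<String, String> fakeModel = text -> "model answer for: " + text;
        System.out.println(guard.aroundCall("What is Spring AI?", fakeModel));
        System.out.println(guard.aroundCall("Tell me the admin PASSWORD", fakeModel));
    }
}
```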
Implementing Your Own Advisor
The Advisor API consists of CallAroundAdvisor and CallAroundAdvisorChain for non-streaming, and StreamAroundAdvisor and StreamAroundAdvisorChain for streaming scenarios. It also includes AdvisedRequest to represent the unsealed Prompt request data, and AdvisedResponse for the chat completion data. The AdvisedRequest and the AdvisedResponse have an advise-context field, used to share state across the advisor chain.
Simple Logging Advisor
Creating a custom Advisor is straightforward. Let’s implement a simple logging Advisor to demonstrate the process.
The aggregateAdvisedResponse(...) utility combines the AdvisedResponse chunks of a stream into a single AdvisedResponse, returning the original stream and accepting a Consumer callback for the completed result. It preserves the original content and context.
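The logging idea can be sketched in plain Java. This is a minimal illustration under stated assumptions, not an implementation of CallAroundAdvisor: LoggingAdvisorSketch is a hypothetical class, requests and responses are plain strings, and the rest of the chain is modelled as a Function.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical logging advisor; 'next' stands in for the rest of the chain.
final class LoggingAdvisorSketch {
    private final List<String> log = new ArrayList<>();

    String aroundCall(String request, Function<String, String> next) {
        log.add("BEFORE: " + request);          // observe the request on the way in
        String response = next.apply(request);  // hand over to the next link
        log.add("AFTER: " + response);          // observe the response on the way out
        return response;                        // pass the response through unchanged
    }

    // Read-only snapshot of what was logged so far.
    List<String> entries() {
        return List.copyOf(log);
    }
}

final class LoggingDemo {
    public static void main(String[] args) {
        LoggingAdvisorSketch logger = new LoggingAdvisorSketch();
        logger.aroundCall("Hi", text -> "Hello!");
        logger.entries().forEach(System.out::println);
    }
}
```

Because the advisor only observes and never alters the data, it can sit at any position in a chain without changing the outcome.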
Re-Reading (Re2) Advisor
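Re2 works by repeating the user’s question inside the prompt, so that the model reads it twice. A minimal sketch of that prompt transformation in plain Java follows; the Re2Prompt class and its method are hypothetical names, not part of Spring AI, and the template uses the "Read the question again" phrasing from the Re2 paper.

```java
// Hypothetical helper: builds a Re2-style prompt by repeating the question.
final class Re2Prompt {
    private static final String TEMPLATE = "%s\nRead the question again: %s";

    static String apply(String question) {
        // The question is inserted twice; the model call itself is out of scope here.
        return String.format(TEMPLATE, question, question);
    }

    public static void main(String[] args) {
        System.out.println(apply("What is 17 * 3?"));
    }
}
```

An advisor built on this idea would apply the transformation to the user text before forwarding the request to the next link in the chain.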
Let’s implement a more advanced Advisor based on the Re-Reading (Re2) technique, inspired by this paper, which can improve the reasoning capabilities of large language models.
Advanced Topics
Spring AI’s advanced topics encompass important aspects of advisor management, including order control, state sharing, and streaming capabilities. Advisor execution order is determined by the getOrder() method. State sharing between advisors is enabled through a shared advise-context object, facilitating complex multi-advisor scenarios. The system supports both streaming and non-streaming advisors, allowing for processing of complete requests and responses or handling continuous data streams using reactive programming concepts.
Controlling Advisor Order
The order of Advisors in the chain is crucial and is determined by the getOrder() method. Advisors with lower order values are executed first.
Because the advisor chain behaves like a stack, the first advisor in the chain is the first to process the request and the last to process the response.
To make an advisor run last, set its order close to Ordered.LOWEST_PRECEDENCE; to make it run first, set its order close to Ordered.HIGHEST_PRECEDENCE.
If you have multiple advisors with the same order value, the order of execution is not guaranteed.
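The effect of order values on the request and response paths can be traced with a small plain-Java sketch. OrderedAdvisor and OrderDemo are hypothetical names used only for illustration; the sketch sorts by order value and then nests the calls, which is an assumption about how such a chain can be modelled rather than a copy of the framework code.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical advisor with a name and an order value (lower value = earlier in the chain).
record OrderedAdvisor(String name, int order) {}

final class OrderDemo {
    // Returns the sequence of events produced by one pass through the chain.
    static List<String> trace(List<OrderedAdvisor> advisors) {
        List<OrderedAdvisor> sorted = new ArrayList<>(advisors);
        sorted.sort(Comparator.comparingInt(OrderedAdvisor::order));
        List<String> events = new ArrayList<>();
        invoke(sorted, 0, events);
        return events;
    }

    private static void invoke(List<OrderedAdvisor> sorted, int index, List<String> events) {
        if (index == sorted.size()) {
            events.add("model"); // end of the chain: the model is called
            return;
        }
        String name = sorted.get(index).name();
        events.add(name + ":request");   // request path, before delegating
        invoke(sorted, index + 1, events);
        events.add(name + ":response");  // response path, after the inner call returns
    }

    public static void main(String[] args) {
        List<OrderedAdvisor> advisors = List.of(
                new OrderedAdvisor("memory", 100),
                new OrderedAdvisor("logger", 0));
        System.out.println(trace(advisors));
    }
}
```

With these values the logger sees the request before the memory advisor and sees the response after it, so an advisor that should observe the final response needs a low order value.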
Using AdvisorContext for State Sharing
Both the AdvisedRequest and the AdvisedResponse share an advise-context object.
You can use the advise-context to share state between the advisors in the chain, and build more complex processing scenarios that involve multiple advisors.
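As a rough illustration of state sharing, the following plain-Java sketch passes a map along the chain. The Next interface, the tagger and reporter methods, and the "wordCount" key are all hypothetical; the map only plays the role of the advise-context and is not the Spring AI type.

```java
import java.util.HashMap;
import java.util.Map;

final class ContextSharingDemo {
    // 'Next' stands in for the rest of the chain; ctx plays the role of the advise-context.
    interface Next {
        String call(String text, Map<String, Object> ctx);
    }

    // First advisor: stores a value in the context on the request path.
    static String tagger(String text, Map<String, Object> ctx, Next next) {
        Map<String, Object> updated = new HashMap<>(ctx); // copy instead of mutating the input
        updated.put("wordCount", text.split("\\s+").length);
        return next.call(text, updated);
    }

    // Second advisor: reads that value on the response path.
    static String reporter(String text, Map<String, Object> ctx, Next next) {
        String answer = next.call(text, ctx);
        return answer + " [words in question: " + ctx.get("wordCount") + "]";
    }

    static String run(String userText) {
        Next model = (text, ctx) -> "answer";
        Next second = (text, ctx) -> reporter(text, ctx, model);
        return tagger(userText, Map.of(), second);
    }

    public static void main(String[] args) {
        System.out.println(run("How do advisors work"));
    }
}
```

The two advisors never reference each other directly; the context is their only coupling, which keeps each one reusable on its own.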
Streaming vs. Non-Streaming
Spring AI supports both streaming and non-streaming Advisors. Non-streaming Advisors work with complete requests and responses, while streaming Advisors handle continuous streams using reactive programming concepts (e.g., Flux for responses). For streaming advisors, it’s crucial to note that a single AdvisedResponse instance represents only a chunk (i.e., part) of the entire Flux<AdvisedResponse> response. In contrast, for non-streaming advisors, the AdvisedResponse encompasses the complete response.
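The difference between a chunk and the complete response can be shown without a reactive library. The sketch below is an assumption-laden simplification: it uses a plain Iterator<String> in place of Flux<AdvisedResponse>, and ChunkAggregator is a hypothetical class. It passes chunks through unchanged and hands the concatenated text to a Consumer once the stream is exhausted.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

final class ChunkAggregator {
    // Wraps a chunk stream: chunks flow through untouched, the full text is reported at the end.
    static Iterator<String> aggregate(Iterator<String> chunks, Consumer<String> onComplete) {
        StringBuilder full = new StringBuilder();
        return new Iterator<>() {
            private boolean reported = false;

            @Override
            public boolean hasNext() {
                boolean more = chunks.hasNext();
                if (!more && !reported) {
                    reported = true; // report the completed result only once
                    onComplete.accept(full.toString());
                }
                return more;
            }

            @Override
            public String next() {
                String chunk = chunks.next();
                full.append(chunk); // accumulate while passing the chunk on
                return chunk;
            }
        };
    }

    // Collects what a streaming consumer sees, followed by the aggregated result.
    static List<String> demo() {
        List<String> seen = new ArrayList<>();
        Iterator<String> stream = aggregate(
                List.of("Spring ", "AI ", "Advisors").iterator(),
                complete -> seen.add("complete=" + complete));
        while (stream.hasNext()) {
            seen.add("chunk=" + stream.next());
        }
        return seen;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

A streaming advisor that needs the whole answer, for example to log it, has to aggregate in this way, while a non-streaming advisor receives the complete response directly.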
Best Practices
- Keep Advisors focused on specific tasks for better modularity.
- Use the advise-context to share state between Advisors when necessary.
- Implement both streaming and non-streaming versions of your Advisor for maximum flexibility.
- Carefully consider the order of Advisors in your chain to ensure proper data flow.