Hey, hey, I've been away for a bit, focusing on training and racing the London Marathon, yet I haven't stopped writing. In fact, I launched a new blog about my running journey. You can check it out
Now, back to our shared passion for software engineering. Recently, both professionally and personally, I've been exploring AI Agents. Let me explain what these are.
AI Agents are software programs equipped with data access, algorithms, reasoning, and conversational abilities. They dynamically interact with their environment and its inputs to autonomously perform tasks and achieve specific goals.
AI Agents can be classified based on their functionality, the complexity of tasks they handle, and the environment in which they operate. Here’s a list of common types of AI Agents:
Reactive Agents
Description: Reactive agents operate based on the current status, ignoring any previous history. They directly link their current state to actions and are designed to respond to specific situations.
Example: A simple home thermostat that adjusts heating based on temperature changes.
Model-Based Agents
Description: These agents have an internal model of the world and use it to make decisions based on how the world changes in response to their actions. They maintain a state over time to track the environment.
Example: A navigation system in a car that updates its route based on traffic conditions.
Goal-Based Agents
Description: Goal-based agents act to achieve specific goals and can consider various actions to choose the most appropriate path. They use a form of predictive modeling to foresee the outcome of actions.
Example: An investment AI that manages a portfolio with the goal of maximizing returns based on market predictions.
Utility-Based Agents
Description: These agents not only strive to achieve a goal but also do so by maximizing a utility function, which is a measure of their happiness or satisfaction. They evaluate the potential outcomes based on a preference hierarchy.
Example: A smart home system that adjusts lighting, heating, and security settings not just to satisfy predefined conditions but to optimize comfort and energy efficiency.
Learning Agents
Description: Learning agents improve their performance and adapt to changing environments over time. They have the capability to learn from past actions and improve their strategies dynamically.
Example: A recommendation system for streaming services like Netflix or Spotify that adapts to user preferences over time to improve recommendation accuracy.
Hybrid Agents
Description: These agents combine characteristics of different types, often integrating reactive and deliberative capabilities, to perform complex functions that require a variety of skills.
Example: Advanced robotics used in manufacturing that can perform pre-programmed tasks while adapting to new efficiencies or unexpected obstacles.
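To make the difference between two of these types concrete, here is a minimal sketch. It contrasts a reactive agent (the thermostat) with a utility-based one (the smart home). Every name and number in it is a made-up assumption for illustration, including the 0.5 degree dead band and the utility weights.

```python
from typing import Callable, Dict


def reactive_thermostat(temperature_c: float, target_c: float = 20.0) -> str:
    """Reactive agent: maps the current reading straight to an action, no memory."""
    if temperature_c < target_c - 0.5:
        return "heat_on"
    if temperature_c > target_c + 0.5:
        return "heat_off"
    return "hold"


def utility_based_choice(
    outcomes: Dict[str, Dict[str, float]],
    utility: Callable[[Dict[str, float]], float],
) -> str:
    """Utility-based agent: scores each action's predicted outcome, picks the best."""
    return max(outcomes, key=lambda action: utility(outcomes[action]))


# Hypothetical predicted outcomes for three smart-home actions.
predicted = {
    "heat_high": {"comfort": 0.9, "energy_kwh": 3.0},
    "heat_low": {"comfort": 0.7, "energy_kwh": 1.0},
    "heat_off": {"comfort": 0.2, "energy_kwh": 0.0},
}


def comfort_minus_energy(outcome: Dict[str, float]) -> float:
    """Assumed utility: comfort is rewarded, energy use is penalised."""
    return outcome["comfort"] - 0.2 * outcome["energy_kwh"]


print(reactive_thermostat(18.0))
print(utility_based_choice(predicted, comfort_minus_energy))
```

The reactive function only looks at the current reading, while the utility-based one compares trade-offs across all available actions before choosing.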
Now that we have a better understanding of AI Agents, their types, and capabilities, let’s go into a real example that I implemented the first weekend after the London Marathon (yes, I’m taking a moment to brag about that :) ).
Leveraging my e-commerce experience, I developed a simple agent called DescriboBot.
Intelligent Product Description Enhancement Agent
DescriboBot (sorry for the name: GPT suggested it and, as you know, as a SWE I could lose weeks naming a variable or a project :) ) is an AI Agent designed to streamline and enhance the quality of product descriptions within an e-commerce platform. It sits between the user, typically an e-commerce manager, and multiple data sources, and it uses a feedback loop to refine the generated content.
Looking at this diagram, let me try to explain the flow of the information:
The ecommerce manager interacts with DescriboBot (1), asking the agent to review the product descriptions that exist in the ecommerce engine. The agent then communicates with an e-commerce API (2), fetching product information and descriptions which are stored in the ecommerce PIM (Product Information Management) system.
DescriboBot leverages a Large Language Model (LLM) (3) to analyse each product description. For every description it generates specific feedback, a score, and an improved version of the description based on that feedback.
Each piece of feedback, score, and proposed description is then stored in a data store (4).
Once the AI Agent has crafted an enhanced product description, it interfaces with the Feedback Loop API (5). This API drives the ongoing enhancement of descriptions through an iterative process. It enables e-commerce managers to review the generated feedback, assess the suggested content, and provide their own human feedback. If that feedback indicates the need for improvement, it triggers the generation of a new product description.
Finally, the updated product description is pushed back to the e-commerce platform (6), completing the cycle. This process ensures that product descriptions are not only crafted to a high standard but also evolve to meet user preferences and behaviours, contributing to a more engaging shopping experience.
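The flow above can be sketched in a few lines of Python. This is not the repository code: it is a simplified sketch in which the PIM and the LLM are replaced by in-memory stand-ins, and all function names, the word-count scoring heuristic, and the `min_score` threshold are assumptions made for illustration. The human feedback step (5) is left out here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class Review:
    original: str
    feedback: str
    score: int
    proposed: str


def fetch_products() -> Dict[str, str]:
    """Stand-in for step (2): a real agent would call the e-commerce API here."""
    return {
        "PRD_1": "Nice shoes.",
        "PRD_2": "Lightweight trail running shoes with a grippy outsole.",
    }


def review_description(description: str) -> Tuple[str, int, str]:
    """Stand-in for step (3): a toy word-count heuristic, not a real LLM call."""
    score = min(10, len(description.split()))
    if score >= 5:
        return "Reads well.", score, description
    proposed = description + " Now with details on materials, fit and use."
    return "Too short, add detail.", score, proposed


def run_cycle(
    fetch: Callable[[], Dict[str, str]],
    review: Callable[[str], Tuple[str, int, str]],
    store: Dict[str, Review],
) -> None:
    """Steps (2) to (4): fetch, review, and store one Review per product."""
    for product_id, description in fetch().items():
        feedback, score, proposed = review(description)
        store[product_id] = Review(description, feedback, score, proposed)


def descriptions_to_push(store: Dict[str, Review], min_score: int = 5) -> Dict[str, str]:
    """Step (6): select proposals for products whose original scored low."""
    return {pid: r.proposed for pid, r in store.items() if r.score < min_score}


store: Dict[str, Review] = {}
run_cycle(fetch_products, review_description, store)
updates = descriptions_to_push(store)
print(updates)
```

Passing the fetch and review steps in as functions keeps the loop itself independent of any particular API or model, which makes it easy to swap the stand-ins for real calls.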
Inside the Code: A File-by-File Breakdown
I uploaded the code to the DescriboBot repository; let's analyse it file by file.
app.py: This is the heart of DescriboBot, where the Flask web server is defined. It serves as the interface for user interactions, managing both the input and output of product descriptions. Through this script, DescriboBot receives user feedback, communicates with the feedback storage, and invokes the LangChain-based regeneration of product descriptions.
description_eval.py: Key to DescriboBot's intelligent operation, this script is where LangChain shines. LangChain is a library that bridges the gap between language models like OpenAI's GPT and application-specific logic. In DescriboBot, it is responsible for evaluating and regenerating product descriptions, applying natural language understanding and generation to refine content.
By integrating with LangChain, DescriboBot can use large language models in a plug-and-play fashion, which simplifies bringing complex AI functionality into everyday applications. LangChain lets developers build sequences of operations (chains) that can include transformations, extractions, and reasoning based on language inputs. This makes DescriboBot not only reactive to feedback but also able to improve its output from one iteration to the next.
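The idea of a chain can be illustrated without the library itself. The sketch below is plain Python and deliberately does not use LangChain's API. It composes three steps (build a prompt, call a model, parse the reply), and `fake_llm` is a hypothetical stub that always returns the same canned JSON. The prompt wording and the `score` and `feedback` keys are assumptions too.

```python
import json
from typing import Any, Callable, Dict

Step = Callable[[Any], Any]


def chain(*steps: Step) -> Step:
    """Compose steps left to right, feeding each output into the next step."""
    def run(value: Any) -> Any:
        for step in steps:
            value = step(value)
        return value
    return run


def build_prompt(product: Dict[str, str]) -> str:
    """Turn product data into an evaluation prompt (wording is an assumption)."""
    return (
        "Rate this product description from 1 to 10 and suggest an improvement.\n"
        "Reply as JSON with keys 'score' and 'feedback'.\n"
        f"Product: {product['name']}\n"
        f"Description: {product['description']}"
    )


def fake_llm(prompt: str) -> str:
    """Stub standing in for a real model call; ignores the prompt content."""
    return '{"score": 4, "feedback": "Mention materials and sizing."}'


def parse_review(raw: str) -> Dict[str, Any]:
    """Parse the model's JSON reply into typed fields."""
    data = json.loads(raw)
    return {"score": int(data["score"]), "feedback": str(data["feedback"])}


evaluate_description = chain(build_prompt, fake_llm, parse_review)
result = evaluate_description({"name": "Trail shoe", "description": "Nice shoes."})
print(result)
```

Swapping `fake_llm` for a function that calls a real model would leave the other two steps untouched, which is the main appeal of structuring the evaluation as a chain.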
Test
When I run python3 app.py:
The agent fetches the products from the PIM, prints the Original description, provides feedback as Suggested Improvement, and then generates new content for that specific product description.
All of this data is stored (in this example, in in-memory storage). In a hypothetical user journey, the e-commerce manager can review each piece of feedback and each proposal and respond to it, for example with a negative rating submitted through a REST API:
curl -X POST http://localhost:5000/submit_feedback \
-H "Content-Type: application/json" \
-d '{"product_id": "PRD_1", "rating": 2, "user_feedback": "Poor description. Needs improvement."}'
{"message":"Feedback received and processing","status":"success"}
Once this request is received by the server, you can see Regenerated Description for Product PRD_1 in the output, completing the feedback loop and leaving the product description ready to be pushed back to the PIM.
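For readers who want a feel for what happens behind /submit_feedback, here is a minimal sketch of the handler logic as a plain function, so it runs without Flask. It is not the code from the repository. The rating threshold, the error responses, and the `regenerate` stub are assumptions, and only the success body mirrors the response shown above.

```python
from typing import Any, Dict, List, Tuple

# In-memory stand-ins for the description store and the feedback log.
descriptions: Dict[str, str] = {"PRD_1": "Nice shoes."}
feedback_log: List[Dict[str, Any]] = []

REGENERATE_BELOW = 3  # assumed threshold; the real cut-off is not stated


def regenerate(description: str, user_feedback: str) -> str:
    """Stub for the LLM-backed regeneration step."""
    return f"{description} [revised after feedback: {user_feedback}]"


def handle_feedback(payload: Dict[str, Any]) -> Tuple[Dict[str, str], int]:
    """Validate a feedback payload and return (response body, HTTP status)."""
    required = ("product_id", "rating", "user_feedback")
    missing = [key for key in required if key not in payload]
    if missing:
        return {"status": "error", "message": "Missing: " + ", ".join(missing)}, 400

    product_id = payload["product_id"]
    if product_id not in descriptions:
        return {"status": "error", "message": "Unknown product"}, 404

    feedback_log.append(payload)
    # A low rating triggers a new description; a high one is only logged.
    if payload["rating"] < REGENERATE_BELOW:
        descriptions[product_id] = regenerate(
            descriptions[product_id], payload["user_feedback"]
        )
    return {"message": "Feedback received and processing", "status": "success"}, 200


body, status = handle_feedback(
    {"product_id": "PRD_1", "rating": 2, "user_feedback": "Poor description. Needs improvement."}
)
print(status, body)
```

In a Flask app, a route function would typically read the JSON body from the request, pass it to a function like this one, and return the resulting body and status code.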
AI Agent Type: Utility and Learning Combined
DescriboBot is best classified as a Utility-Based Agent with Learning capabilities. As a Utility-Based Agent, it is designed to optimize the quality of product descriptions, aiming to maximize a specific utility function: here, the effectiveness of e-commerce content for engaging customers and potentially improving SEO. Its goal is not merely to perform a task but to perform it in the best way possible, as evidenced by the generation of detailed, informative product descriptions.
Additionally, the agent exhibits learning capabilities. It iteratively improves its performance by incorporating human feedback into subsequent iterations of content generation. This feedback loop enables the agent to refine its understanding of what constitutes a 'better' description, demonstrating a capacity for adaptation and growth over time—key traits of a Learning Agent.
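One simple way to picture "incorporating human feedback into subsequent iterations" is to carry recent reviewer comments into the next generation prompt. The sketch below assumes exactly that. The function name, the prompt wording, and the `max_items` cut-off are illustrative assumptions, not a description of how DescriboBot does it.

```python
from typing import List, Tuple


def build_regeneration_prompt(
    description: str,
    history: List[Tuple[int, str]],
    max_items: int = 3,
) -> str:
    """Build a rewrite prompt that includes the most recent human feedback.

    `history` holds (rating, comment) pairs, oldest first.
    """
    recent = history[-max_items:] if max_items > 0 else []
    lines = [f"- rating {rating}: {comment}" for rating, comment in recent]
    notes = "\n".join(lines) if lines else "- (no feedback yet)"
    return (
        "Rewrite the product description below.\n"
        f"Current description: {description}\n"
        "Reviewer feedback, oldest first:\n"
        f"{notes}"
    )


history = [
    (1, "Too vague."),
    (2, "Poor description. Needs improvement."),
    (3, "Better, but mention sizing."),
    (4, "Almost there, shorten the intro."),
]
prompt = build_regeneration_prompt("Nice shoes.", history)
print(prompt)
```

Capping the history keeps the prompt short while still letting each new iteration see what reviewers disliked about the previous ones.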
In summary, AI Agents like the one I've examined represent a powerful way to build software that can perform complex tasks without constant human supervision. By leveraging these agents, we can automate complex tasks with a level of precision that was once out of reach. DescriboBot stands as an example of practical AI application in software engineering: a tool that not only performs its given task but learns and evolves to do it better over time.
As we continue to develop and refine these agents, the potential for improved efficiency and effectiveness in our software is substantial. For developers, the implications are clear: integrating AI Agents into our solutions can significantly elevate the quality of our work and the satisfaction of our users.