The Next Wave in Generative AI: Harnessing the Power of Agents

Authored by: Suresh Bansal

The journey of Artificial Intelligence (AI) and Machine Learning (ML) has been transformative. It began when we shifted from manual coding to training computers with data. In the early days, AI systems could handle only the specific tasks for which they were explicitly trained, such as classification and object identification.

But everything changed in November 2022 with OpenAI's launch of ChatGPT. This tool could generate content and perform a wide range of tasks, and it quickly captured the attention of millions worldwide. As noted in Gartner's 2023 Hype Cycle for AI, Generative AI has reached the "Peak of Inflated Expectations" and is expected to hit the "Plateau of Productivity" within the next 5 to 10 years.

Overcoming Challenges and Limitations

Reaching the Plateau of Productivity, according to Gartner, means that AI is widely adopted, its benefits are well defined, and clear guidelines exist for implementation. To get there, we must first address the current limitations of AI technology and explore how agents can help overcome these challenges.

While today’s large language models (LLMs) excel at tasks like generating emails, writing essays, and conducting sentiment analysis, they still struggle with complex tasks, such as intricate math calculations or multi-step problem-solving. Additionally, LLMs have other notable limitations:

  • Hallucinations or misleading outputs
  • Technical constraints like limited context length and memory
  • Bias in outputs
  • Potential for toxic or harmful speech
  • Limited knowledge (e.g., the knowledge cutoff of GPT-3.5, the model behind the original ChatGPT, is September 2021)

Interestingly, these challenges are not so different from those we humans face. We, too, are prone to mistakes, bias, limited memory, and occasionally harmful responses. To manage these shortcomings, we typically:

  • Seek information online and use tools like Excel and Word.
  • Revise our work multiple times to correct errors and improve quality.
  • Seek feedback from peers and mentors and incorporate their insights.
  • Collaborate in teams to achieve better results.

By applying similar strategies, we can improve the outputs from LLMs, leading us to the concept of Generative AI Agents.

What are Generative AI Agents?

Generative AI Agents are designed to overcome many of the limitations of current LLMs by executing complex tasks that standalone models cannot handle. For example, if you want to identify the top three companies by revenue from a dataset, an agent would:

  1. Retrieve revenue data for all companies.
  2. Sort the companies by revenue.
  3. Return the top three companies.
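The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not a full agent: the dataset, company names, revenue figures, and function names are all hypothetical, and each function stands in for one step that an agent would carry out with a tool.

```python
from typing import Dict, List, Tuple

# Hypothetical dataset: company name -> annual revenue (illustrative numbers only).
REVENUE_DATA: Dict[str, float] = {
    "Acme": 120.0,
    "Globex": 340.5,
    "Initech": 95.2,
    "Umbrella": 410.0,
    "Hooli": 280.3,
}


def retrieve_revenue(data: Dict[str, float]) -> List[Tuple[str, float]]:
    """Step 1: retrieve revenue data for all companies."""
    return list(data.items())


def sort_by_revenue(rows: List[Tuple[str, float]]) -> List[Tuple[str, float]]:
    """Step 2: sort the companies by revenue, highest first."""
    return sorted(rows, key=lambda row: row[1], reverse=True)


def top_n(rows: List[Tuple[str, float]], n: int = 3) -> List[str]:
    """Step 3: return the names of the top n companies."""
    return [name for name, _ in rows[:n]]


def top_companies(data: Dict[str, float], n: int = 3) -> List[str]:
    """Chain the three steps in order."""
    return top_n(sort_by_revenue(retrieve_revenue(data)), n)


print(top_companies(REVENUE_DATA))  # ['Umbrella', 'Globex', 'Hooli']
```

In a real agent, the LLM would decide which of these steps to run and in what order, rather than having the chain hard-coded.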

To accomplish this, agents combine LLMs with key components such as planning, memory, and tools:

  • Planning: The agent outlines and executes a plan using an LLM.
  • Memory: The agent retains information while performing multiple steps, allowing it to process complex tasks.
  • Tools: Agents use various tools to perform specific tasks, which are discussed in more detail below.
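One way to picture how planning, memory, and tools fit together is the sketch below. It assumes a stubbed planner in place of a real LLM call, and the `Agent` class, `stub_planner`, and the three toy tools are hypothetical names chosen for illustration only.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


def stub_planner(goal: str) -> List[str]:
    """Stand-in for an LLM call: returns tool names in execution order."""
    return ["load", "double", "total"]


@dataclass
class Agent:
    tools: Dict[str, Callable[[Any], Any]]
    planner: Callable[[str], List[str]] = stub_planner
    memory: List[Dict[str, Any]] = field(default_factory=list)

    def run(self, goal: str) -> Any:
        result: Any = None
        for step in self.planner(goal):        # Planning: ask the planner for the steps
            result = self.tools[step](result)  # Tools: each step calls one tool
            # Memory: keep every intermediate result for later steps or inspection
            self.memory.append({"step": step, "result": result})
        return result


agent = Agent(tools={
    "load": lambda _: [1, 2, 3],
    "double": lambda xs: [2 * x for x in xs],
    "total": lambda xs: sum(xs),
})
answer = agent.run("double the numbers and add them up")
print(answer)                                # 12
print([m["step"] for m in agent.memory])     # ['load', 'double', 'total']
```

Each tool receives the previous step's result, so the memory list doubles as a trace of how the final answer was produced.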

[Figure: Generative AI agents (Xoriant)]

Key Features of Generative AI Agents

Generative AI agents are designed to:

  • Plan and execute tasks
  • Reflect on outcomes
  • Use tools to achieve specified goals
  • Operate with minimal human intervention

Examples of such agents include website builders, data analysts that provide insights from Excel sheets, and travel agents that plan trips based on user inputs.

The Role of Tools in Generative AI Agents

Tools are critical for agents, enabling them to perform their tasks effectively. In the realm of generative AI, tools allow an LLM agent to interact with external environments and applications, such as internet searches, code interpreters, and math engines. These tools can access databases, knowledge bases, and external models.

For instance, a travel agent would need tools to search and book flights, as well as search the internet. Other tools could include:

  • Entity Extraction: Extracts specific information from unstructured documents.
  • Chat DB: Retrieves information from a database without needing SQL knowledge.
  • Knowledge Bot: Uses Retrieval-Augmented Generation (RAG) to answer questions based on a custom knowledge repository.
  • Internet Search: Fetches content from search engines based on user queries.
  • Summarization: Provides summaries of large documents tailored to specific personas.
  • Program Execution: Executes Python code to solve specific problems.
  • Wikipedia Search: Retrieves content from Wikipedia based on user queries.
  • Comparison: Answers comparative questions, like performance metrics or product recommendations.
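As a rough sketch of how an agent might expose tools like these, the example below registers two toy tools by name and dispatches a structured request to the right one. The registry layout, the request format, and the function names are assumptions made for illustration. The math engine handles only basic arithmetic, and the entity extractor pulls out only email addresses.

```python
import ast
import operator
import re
from typing import Any, Callable, Dict, List

# Arithmetic operators the toy math engine is willing to evaluate.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def math_engine(expression: str) -> float:
    """Evaluate +, -, *, / on numeric literals without calling eval()."""
    def _eval(node: ast.AST) -> float:
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        raise ValueError("unsupported expression")
    return _eval(ast.parse(expression, mode="eval").body)


def extract_emails(text: str) -> List[str]:
    """Toy entity extraction: find email addresses in unstructured text."""
    return re.findall(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+", text)


# Hypothetical registry: the agent picks a tool by name.
TOOLS: Dict[str, Callable[[str], Any]] = {
    "math_engine": math_engine,
    "entity_extraction": extract_emails,
}


def call_tool(request: Dict[str, str]) -> Any:
    """Dispatch a request of the form {"tool": name, "input": text}."""
    return TOOLS[request["tool"]](request["input"])


print(call_tool({"tool": "math_engine", "input": "2 * (3 + 4)"}))  # 14
print(call_tool({"tool": "entity_extraction",
                 "input": "Contact ana@example.com or bob@example.org."}))
```

In practice the LLM would produce the request dictionary itself, typically from a description of each tool, and the agent would feed the tool's result back into the conversation.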

[Figure: Tools for generative AI agents (Xoriant)]

Agentic Design Patterns

To perform complex tasks, agents must orchestrate these tools effectively. Based on lectures by Andrew Ng, several agentic design patterns have emerged:

  • Reflection: The LLM evaluates its own work to improve it.
  • Tool Use: The LLM utilizes tools like web searches or code execution to gather information and process data.
  • Planning: The LLM devises a multi-step plan to achieve a goal and then executes it.
  • Multi-Agent Collaboration: Multiple AI agents collaborate, dividing tasks and debating ideas to find better solutions.

While the first two patterns yield predictable outcomes, the latter two are still in the experimental phase.
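The Reflection pattern from the list above can be illustrated with a small generate, critique, and revise loop. This is a sketch under stated assumptions: `draft`, `critique`, and `revise` are hypothetical stand-ins for three LLM calls, and the "critique" here is a pair of simple rule checks rather than a model's judgment.

```python
from typing import List


def draft(task: str) -> str:
    """Stand-in for the LLM's first attempt at the task."""
    return "agents plan and execute tasks"


def critique(text: str) -> List[str]:
    """Stand-in for the LLM reviewing its own output; returns a list of issues."""
    issues = []
    if not text[:1].isupper():
        issues.append("capitalize")
    if not text.endswith("."):
        issues.append("add_period")
    return issues


def revise(text: str, issues: List[str]) -> str:
    """Stand-in for the LLM rewriting the draft to address the issues."""
    if "capitalize" in issues:
        text = text[:1].upper() + text[1:]
    if "add_period" in issues:
        text += "."
    return text


def reflect(task: str, max_rounds: int = 3) -> str:
    """Draft once, then critique and revise until no issues remain."""
    text = draft(task)
    for _ in range(max_rounds):
        issues = critique(text)
        if not issues:
            break
        text = revise(text, issues)
    return text


print(reflect("Describe what agents do"))  # Agents plan and execute tasks.
```

The `max_rounds` cap matters in practice: a real critic may never be fully satisfied, so the loop needs a stopping rule other than "no issues found".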

The LLM Agent Framework

Combining agents, tools, and design patterns leads to a framework that is a variation of the planning pattern. The framework defines a task or goal and then iteratively plans and executes the next action, using feedback from each step to decide what to do next.
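The iterative loop described here might look like the sketch below. It is only one possible shape: `plan_next_action` is a hypothetical stand-in for the LLM, the three actions are toy functions, and the search tool is made to fail once on purpose so that the feedback step has something to react to.

```python
from typing import Callable, Dict, List, Optional, Tuple

Action = Callable[[str], Tuple[bool, str]]  # takes context, returns (ok, result)


def plan_next_action(goal: str, history: List[dict]) -> Optional[str]:
    """Stand-in for the LLM: pick the first action that has not yet succeeded."""
    done = {h["action"] for h in history if h["ok"]}
    for action in ("search", "summarize", "answer"):
        if action not in done:
            return action
    return None  # nothing left to do: the goal is reached


class FlakySearch:
    """Toy search tool that fails on its first call, then succeeds."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, context: str) -> Tuple[bool, str]:
        self.calls += 1
        if self.calls == 1:
            return False, "timeout"
        return True, "agents combine LLMs with planning, memory, and tools"


def run_agent(goal: str, actions: Dict[str, Action], max_steps: int = 10):
    history: List[dict] = []
    context = goal
    for _ in range(max_steps):
        action = plan_next_action(goal, history)  # plan the next action
        if action is None:
            break
        ok, result = actions[action](context)     # execute it
        # Feedback: record the outcome so the next planning step can see it.
        history.append({"action": action, "ok": ok, "result": result})
        if ok:
            context = result
    return context, history


actions: Dict[str, Action] = {
    "search": FlakySearch(),
    "summarize": lambda ctx: (True, ctx.split(",")[0]),
    "answer": lambda ctx: (True, f"Answer: {ctx}."),
}
final, history = run_agent("What do agents combine?", actions)
print(final)
print([h["action"] for h in history])  # ['search', 'search', 'summarize', 'answer']
```

Because the planner looks at the recorded history each time, the failed search is simply retried. A fixed up-front plan would have moved on to summarizing with nothing to summarize.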

An LLM agent consists of core components:

  • Brain/LLM: Acts as the coordinator.
  • Memory (Vector DB): Stores intermediate steps and results.
    • Short-term memory: Holds context information within the context window.
    • Long-term memory: An external vector store providing relevant contextual information.
  • Tools/Internet: Enable the agent to perform tasks like web searches or program execution.
  • Policy: Ensures trust by design, preventing the processing of toxic inputs.
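The memory and policy components can be sketched as follows. All names here are hypothetical. The "embedding" is a plain bag-of-words count used only to keep the example self-contained, whereas a real long-term memory would use a learned embedding model and a vector database. The policy is a simple blocklist check standing in for a real content filter.

```python
import math
from collections import Counter, deque
from typing import Deque, List, Tuple


def embed(text: str) -> Counter:
    """Toy embedding: word counts (assumption; real systems use learned vectors)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class LongTermMemory:
    """External store searched by similarity, like a small vector DB."""

    def __init__(self) -> None:
        self.items: List[Tuple[Counter, str]] = []

    def add(self, text: str) -> None:
        self.items.append((embed(text), text))

    def search(self, query: str, k: int = 1) -> List[str]:
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[0]), reverse=True)
        return [text for _, text in ranked[:k]]


class ShortTermMemory:
    """Bounded buffer: old turns fall out, like a limited context window."""

    def __init__(self, max_turns: int = 2) -> None:
        self.turns: Deque[str] = deque(maxlen=max_turns)

    def add(self, text: str) -> None:
        self.turns.append(text)


BLOCKED_TERMS = {"badword"}  # hypothetical blocklist


def policy_allows(text: str) -> bool:
    """Policy check: reject input that contains any blocked term."""
    return not (set(text.lower().split()) & BLOCKED_TERMS)


ltm = LongTermMemory()
ltm.add("flights to Tokyo depart on Monday")
ltm.add("the hotel is near the station")
print(ltm.search("when do flights depart"))  # ['flights to Tokyo depart on Monday']

stm = ShortTermMemory(max_turns=2)
for turn in ["hello", "book a flight", "to Tokyo"]:
    stm.add(turn)
print(list(stm.turns))                       # ['book a flight', 'to Tokyo']
```

An agent built this way would run `policy_allows` on each input first, then place the most relevant long-term memories alongside the short-term turns in the prompt it sends to the LLM.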

[Figure: Flow narrative for generative AI agents (Xoriant)]

A Future with Intelligent Agents

The future of generative AI lies in the collaboration between intelligent agents and humans. Imagine a world where doctors, designers, and customer service representatives are supported by agents that enhance their capabilities. The possibilities are endless, from scientific discoveries to artistic creations.

For businesses, integrating generative AI agents into their operations offers a strategic advantage, unlocking new levels of efficiency, personalization, and problem-solving. These agents won't replace human ingenuity; they'll empower it, shaping a future rich with innovation and progress.

About Author:

Suresh Bansal is a Technical Manager at Xoriant with expertise in Generative AI and technologies such as Vector DB, LLM, Hugging Face, LlamaIndex, LangChain, Azure, and AWS. With experience in pre-sales and sales, he has excelled at creating compelling technical proposals and ensuring client success. Suresh has worked with clients from the US, UK, Japan, and Singapore, achieved advanced-level partnerships with AWS, and presented research recommendations to C-level leadership.

© Copyright nasscom. All Rights Reserved.