Future Workflows Powered by AI Agents

Introduction

In the fast-evolving world of AI, "agents" have become a buzzword, promising autonomous systems that go beyond simple chatbots. Drawing on Andrew Ng's seminal talk at Sequoia Capital's AI Ascent in March 2024, this post explores what AI agents really are, how they work, and why they're set to transform industries. Ng, a pioneer in deep learning, argues that agentic workflows (iterative processes that mimic how humans think and work) could deliver GPT-5-level performance using today's models like GPT-4. Let's dive in.

What Makes an AI Agent?

Traditional AI interactions are "non-agentic": you prompt a model like ChatGPT, and it generates a response in one go. It's impressive but limited, like asking someone to write an essay from start to finish without ever revising.

Enter agentic workflows. These are looped processes where the AI plans, acts, observes, and refines. For example, to write an essay, an agent might outline, research via web tools, draft, self-critique, and revise. The result? Dramatically better outputs.
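
To make the loop concrete, here is a minimal sketch of the essay workflow described above. Everything in it is an assumption for illustration: `LLM` and `Search` are hypothetical interfaces (any function that maps text to text), the prompts are made up, and `make_stub_llm` is a scripted stand-in so the example runs offline. A real agent would swap in an actual model client and search tool.

```python
from typing import Callable

# Hypothetical interfaces: an LLM maps a prompt to text, a search tool maps a query to notes.
LLM = Callable[[str], str]
Search = Callable[[str], str]


def write_essay(topic: str, llm: LLM, search: Search, max_revisions: int = 3) -> str:
    """Outline, research, and draft once, then critique and revise in a loop."""
    outline = llm(f"Outline an essay on: {topic}")
    notes = search(topic)
    draft = llm(f"Write a draft.\nOutline: {outline}\nNotes: {notes}")

    for _ in range(max_revisions):
        critique = llm(f"Critique this draft. Reply DONE if no changes are needed.\n{draft}")
        if critique.strip() == "DONE":
            break  # the model judged its own draft good enough
        draft = llm(f"Revise the draft using this critique.\nCritique: {critique}\nDraft: {draft}")
    return draft


def make_stub_llm() -> LLM:
    """Scripted stand-in for a real model: asks for one revision, then approves."""
    state = {"critiques": 0}

    def llm(prompt: str) -> str:
        if prompt.startswith("Outline"):
            return "1. intro 2. body 3. conclusion"
        if prompt.startswith("Write"):
            return "draft v1"
        if prompt.startswith("Critique"):
            state["critiques"] += 1
            return "Add an example." if state["critiques"] == 1 else "DONE"
        if prompt.startswith("Revise"):
            return "draft v2"
        raise ValueError("unexpected prompt")

    return llm


if __name__ == "__main__":
    print(write_essay("AI agents", make_stub_llm(), lambda query: "stub research notes"))
```

The `max_revisions` cap matters in practice: without it, a model that never says DONE would loop forever.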

Ng illustrates this with benchmarks:

HumanEval (Coding): Zero-shot GPT-4 solves 67% of problems. But wrap GPT-3.5 in an agentic loop, and it reaches up to 95.1% in Ng's chart, surpassing the more advanced model. (HumanEval is the benchmark introduced in Evaluating Large Language Models Trained on Code, Chen et al., 2021.)
GSM8K (Math Problems): Similar gains, showing agents amplify existing models.

Today's agents aren't fully autonomous yet, but these workflows are a crucial step toward AGI.

The Four Pillars of Agentic Design

Andrew Ng outlines four patterns:

Reflection: The AI reviews its own work. Prompt it to "check your code for bugs and efficiency," and it iterates. Papers like Self-Refine show models such as GPT-4 improving their own outputs through iterative self-feedback, with no additional training.
Tool Use: Agents call external resources. Gorilla fine-tunes models for API calls, outperforming GPT-4; MM-REACT integrates vision tools for multimodal tasks.
Planning: Break tasks into steps with Chain-of-Thought prompting. On reasoning benchmarks, it turns performance curves that stay flat under standard prompting into steep gains.
Multi-Agent Collaboration: Teams of agents specialize and debate. ChatDev simulates a company (CEO, coder, tester) to build software; multi-agent debates (e.g., GPT vs. Gemini) boost accuracy.
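
Of the four patterns, tool use is the easiest to show in a few lines. The sketch below assumes a made-up convention in which the model emits a JSON object naming a tool and its arguments; the `TOOLS` registry, the tool names, and the JSON shape are all hypothetical and not the format used by Gorilla, MM-REACT, or any particular API.

```python
import json
from typing import Any, Callable, Dict

# Hypothetical tool registry: names the model may call, mapped to plain Python functions.
TOOLS: Dict[str, Callable[..., Any]] = {
    "add": lambda a, b: a + b,
    "word_count": lambda text: len(text.split()),
}


def run_tool_call(raw: str) -> Any:
    """Parse a model-emitted JSON tool call and execute the matching tool."""
    call = json.loads(raw)
    name = call["tool"]
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    # Pass the model-supplied arguments through as keyword arguments.
    return TOOLS[name](**call.get("args", {}))


if __name__ == "__main__":
    # In a real agent this string would come from the model's response.
    print(run_tool_call('{"tool": "add", "args": {"a": 2, "b": 3}}'))
```

The result of the call would normally be fed back into the next prompt, which is what turns a single tool call into part of an agentic loop.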

These patterns are messy but potent. Reflection is the most reliable, while multi-agent collaboration can be "mind-blowing" but inconsistent.

Building and Applying Agents Today

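You don't need to build everything from scratch. Low-code tools like n8n and frameworks like CrewAI, AutoGen, and LangChain let you prototype workflows quickly. Ng advises favoring fast models, because faster token generation means more iterations on the same task. Speed makes 100x more loops practical, so a fast model can potentially outpace a slower, smarter one.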
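
The speed argument is simple arithmetic, sketched below. The throughput figures, tokens per loop, and time budget are hypothetical numbers chosen for illustration, not measurements of any real model.

```python
def loops_within_budget(tokens_per_second: float, tokens_per_loop: int, budget_seconds: float) -> int:
    """Number of complete agent iterations that fit in a fixed time budget."""
    if tokens_per_second <= 0 or tokens_per_loop <= 0:
        raise ValueError("throughput and loop size must be positive")
    return int(tokens_per_second * budget_seconds // tokens_per_loop)


if __name__ == "__main__":
    # Hypothetical models: each loop generates 1,000 tokens, and we allow 60 seconds in total.
    fast = loops_within_budget(tokens_per_second=500, tokens_per_loop=1000, budget_seconds=60)
    slow = loops_within_budget(tokens_per_second=50, tokens_per_loop=1000, budget_seconds=60)
    print(f"fast model: {fast} loops, slow model: {slow} loops")
```

Whether the extra loops beat a smarter model's single pass depends on the task, which is why the claim is "potentially" rather than "always".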

Applications? Automate coding, research, or even game development. For businesses, this means productivity leaps; for developers, it's a new paradigm.
