Full Transcript

AI Agents, Clearly Explained

10:10 | English | By Jeff Su | Transcribed May 11, 2026

0:03

AI. AI. AI. AI. AI.

0:07

AI. You know, more agentic. Agentic

0:10

capabilities. An AI agent. Agents.

0:12

Agentic workflows. Agents. Agents.

0:15

Agent. Agent. Agent. Agent. Agentic.

0:19

All right. Most explanations of AI

0:20

agents are either too technical or too

0:23

basic. This video is meant for people

0:26

like myself. You have zero technical

0:28

background, but you use AI tools

0:30

regularly and you want to learn just

0:33

enough about AI agents to see how it

0:36

affects you. In this video, we'll follow

0:38

a simple one, two, three learning path

0:41

by building on concepts you already

0:43

understand, like ChatGPT, and then moving

0:46

on to AI workflows and then finally AI

0:49

agents. All the while using examples you

0:52

will actually encounter in real life.

0:55

And believe me when I tell you those

0:56

intimidating terms you see everywhere

0:58

like RAG or ReAct, they're a lot

1:02

simpler than you think. Let's get

1:04

started. Kicking things off at level

1:05

one, large language models. Popular AI

1:08

chatbots like ChatGPT, Google Gemini, and

1:10

Claude are applications built on top of

1:14

large language models, LLMs, and they're

1:17

fantastic at generating and editing

1:19

text. Here's a simple visualization.

1:21

You, the human, provide an input and

1:24

the LLM produces an output based on its

1:27

training data. For example, if I were to

1:29

ask ChatGPT to draft an email

1:31

requesting a coffee chat, my prompt is

1:33

the input and the resulting email that's

1:36

way more polite than I would ever be in

1:37

real life is the output. So far so good,

1:40

right? Simple stuff. But what if I asked

1:43

ChatGPT when my next coffee chat is?

1:47

Even without seeing the response, both

1:49

you and I know ChatGPT is gonna fail

1:52

because it doesn't know that

1:53

information. It doesn't have access to

1:56

my calendar. This highlights two key

1:58

traits of large language models. First,

2:00

despite being trained on vast amounts of

2:02

data, they have limited knowledge of

2:04

proprietary information like our

2:07

personal information or internal company

2:09

data. Second, LLMs are passive. They

2:12

wait for our prompt and then respond.

2:14

Right? Keep these two traits in mind

2:17

moving forward. Moving to level two, AI

2:19

workflows. Let's build on our example.

2:21

What if I, a human, told the LLM, "Every

2:25

time I ask about a personal event,

2:26

perform a search query and fetch data

2:29

from my Google calendar before providing

2:31

a response." With this logic

2:33

implemented, the next time I ask, "When

2:35

is my coffee chat with Elon Husky?" I'll

2:38

get the correct answer because the LLM

2:40

will now first go into my Google

2:42

calendar to find that information. But

2:45

here's where it gets tricky. What if my

2:48

next follow-up question is, "What will

2:50

the weather be like that day?" The LLM

2:53

will now fail at answering the query

2:55

because the path we told the LLM to

2:57

follow is to always search my Google

3:00

calendar, which does not have

3:02

information about the weather. This is a

3:04

fundamental trait of AI workflows. They

3:07

can only follow predefined paths set by

3:10

humans. And if you want to get

3:12

technical, this path is also called the

3:15

control logic. Pushing my example

3:17

further, what if I added more steps into

3:20

the workflow by allowing the LLM to

3:22

access the weather via an API and then

3:24

just for fun, use a text-to-audio model

3:26

to speak the answer. The weather

3:28

forecast for seeing Elon Husky is sunny

3:31

with a chance of being a good boy.
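
The fixed path described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any real chatbot is wired: `fetch_calendar`, `fetch_weather`, and both workflow functions are hypothetical names, the date and forecast are made-up sample data, and no real calendar or weather API is called. The point it tries to show is that the human-written control logic decides which lookups run, no matter what the question is.

```python
from typing import Optional

# Made-up sample data standing in for external tools (no real APIs are called).
CALENDAR = {"coffee chat with Elon Husky": "2025-06-03"}
WEATHER = {"2025-06-03": "sunny"}


def fetch_calendar(event: str) -> Optional[str]:
    """Hypothetical stub for a calendar search: returns the event date, if any."""
    return CALENDAR.get(event)


def fetch_weather(date: str) -> Optional[str]:
    """Hypothetical stub for a weather lookup: returns the forecast, if any."""
    return WEATHER.get(date)


def calendar_only_workflow(event: str, question: str) -> str:
    """Path written by a human: always search the calendar, and nothing else."""
    date = fetch_calendar(event)
    if date is not None and question.lower().startswith("when"):
        return f"Your {event} is on {date}."
    # The weather was never part of the predefined path, so this question fails.
    return "Sorry, I can't answer that."


def calendar_plus_weather_workflow(event: str, question: str) -> str:
    """One more human-added step. It is longer, but still a predefined path."""
    date = fetch_calendar(event)
    if date is None:
        return "Sorry, I can't answer that."
    if "weather" in question.lower():
        return f"The forecast for your {event} is {fetch_weather(date)}."
    return f"Your {event} is on {date}."
```

Calling `calendar_only_workflow` with a weather question returns the apology, while `calendar_plus_weather_workflow` answers it, but only because a human added that branch in advance.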

3:33

Here's the thing. No matter how many

3:35

steps we add, this is still just an AI

3:39

workflow. Even if there were hundreds or

3:41

thousands of steps, if a human is the

3:44

decision maker, there is no AI agent

3:47

involvement. Pro tip: retrieval

3:49

augmented generation, or RAG, is a fancy

3:52

term that's thrown around a lot. In

3:54

simple terms, RAG is a process that

3:56

helps AI models look things up before

3:58

they answer, like accessing my calendar

4:00

or the weather service. Essentially, RAG

4:03

is just a type of AI workflow. By the

4:06

way, I have a free AI toolkit that cuts

4:07

through the noise and helps you master

4:09

essential AI tools and workflows. I'll

4:10

leave a link to that down below. Here's

4:12

a real world example. Following Helena

4:14

Liu's amazing tutorial, I created a

4:17

simple AI workflow using make.com. Here

4:19

you can see that first I'm using Google

4:21

Sheets to do something. Specifically,

4:23

I'm compiling links to news articles in

4:25

a Google sheet. And this is that Google

4:28

sheet. Second, I'm using Perplexity to

4:31

summarize those news articles. Then

4:34

using Claude and using a prompt that I

4:36

wrote, I'm asking Claude to draft a

4:38

LinkedIn and Instagram post. Finally, I

4:42

can schedule this to run automatically

4:44

every day at 8 a.m. As you can see, this

4:46

is an AI workflow because it follows a

4:49

predefined path set by me. Step one, you

4:52

do this. Step two, you do this. Step

4:55

three, you do this. And finally,

4:57

remember to run daily at 8 a.m. One last

4:59

thing, if I test this workflow and I

5:02

don't like the final output of the

5:05

LinkedIn post, for example, as you can

5:08

see right here, uh, it's not funny

5:10

enough and I'm naturally hilarious,

5:11

right? I'd have to manually go back and

5:16

rewrite the prompt for Claude. Okay? And

5:20

this trial and error iteration is

5:23

currently being done by me, a human. So

5:25

keep that in mind moving forward. All

5:27

right, level three, AI agents.

5:29

Continuing the make.com example, let's

5:31

break down what I've been doing so far

5:33

as the human decision maker. With the

5:36

goal of creating social media posts

5:37

based off of news articles, I need to do

5:39

two things. First, reason or think about

5:43

the best approach. I need to first

5:44

compile the news articles, then

5:46

summarize them, then write the final

5:48

posts. Second, take action using tools.

5:51

I need to find and link to those news

5:53

articles in Google Sheets. Use

5:55

Perplexity for real-time summarization

5:58

and then Claude for copywriting. So,

6:00

and this is the most important sentence

6:01

in this entire video. The one massive

6:04

change that has to happen in order for

6:06

this AI workflow to become an AI agent

6:09

is for me, the human decision maker, to

6:13

be replaced by an LLM. In other words,

6:16

the AI agent must reason. What's the

6:19

most efficient way to compile these news

6:20

articles? Should I copy and paste each

6:22

article into a word document? No, it's

6:24

probably easier to compile links to

6:26

those articles and then use another tool

6:28

to fetch the data. Yes, that makes more

6:30

sense. The AI agent must act, aka do

6:34

things via tools. Should I use Microsoft

6:37

Word to compile links? No. Inserting

6:39

links directly into rows is way more

6:41

efficient. What about Excel? Hmm. So the

6:44

user has already connected their Google

6:45

account with make.com. So Google Sheets

6:47

is a better option. Pro tip. Because of

6:49

this, the most common configuration for

6:51

AI agents is the ReAct framework. All AI

6:55

agents must reason and act. So

6:59

ReAct. Sounds simple once we break it

7:01

down, right? A third key trait of AI

7:03

agents is their ability to iterate.

7:06

Remember when I had to manually rewrite

7:08

the prompt to make the LinkedIn post

7:10

funnier? I, the human, probably need to

7:13

repeat this iterative process a few

7:15

times to get something I'm happy with,

7:17

right? An AI agent will be able to do

7:19

the same thing autonomously. In our

7:22

example, the AI agent would autonomously

7:25

add in another LLM to critique its own

7:28

output. Okay, I've drafted V1 of a

7:30

LinkedIn post. How do I make sure it's

7:32

good? Oh, I know. I'll add another step

7:34

where an LLM will critique the post based

7:36

on LinkedIn best practices. And let's

7:38

repeat this until the best practices

7:40

criteria are all met. And after a few

7:42

cycles of that, we have the final

7:45

output. That was a hypothetical example.
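
The critique-and-repeat loop just described can be sketched in Python. This is a toy under stated assumptions rather than a real agent: `critique`, `revise`, and `run_agent` are hypothetical names, the two checklist items are invented stand-ins for "LinkedIn best practices", and plain functions take the place of the LLM calls a real agent would make. What it tries to show is the shape of the loop: observe the interim result, decide whether another iteration is needed, act, and stop once the criteria are met.

```python
from typing import List, Tuple


def critique(post: str) -> List[str]:
    """Stub critic (a real agent would ask a second LLM): list unmet criteria."""
    issues = []
    if "?" not in post:
        issues.append("no question to invite comments")
    if "#" not in post:
        issues.append("no hashtag")
    return issues


def revise(post: str, issues: List[str]) -> str:
    """Stub writer (a real agent would re-prompt an LLM): fix the first issue."""
    if issues[0] == "no question to invite comments":
        return post + " What do you think?"
    return post + " #AI"


def run_agent(draft: str, max_rounds: int = 5) -> Tuple[str, int]:
    """Iterate until the critic has no complaints or the round limit is hit."""
    post = draft
    for round_number in range(max_rounds):
        issues = critique(post)       # observe the interim result
        if not issues:                # decide: criteria met, stop iterating
            return post, round_number
        post = revise(post, issues)   # act: produce a new version
    return post, max_rounds
```

For example, `run_agent("AI agents reason, act, and iterate.")` needs two revision rounds before the stub critic is satisfied. The `max_rounds` cap is a design choice in this sketch so the loop always terminates even if the criteria are never met.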

7:47

So let's move on to a real world AI

7:50

agent example. Andrew Ng is a preeminent

7:53

figure in AI and he created this demo

7:55

website that illustrates how an AI agent

7:58

works. I'll link the full video down

8:00

below, but when I search for a keyword

8:02

like "skier" and press Enter, the AI vision agent in

8:07

the background is first reasoning what a

8:10

skier looks like. A person on skis going

8:12

really fast in snow, for example, right?

8:14

I'm not sure. And then it's acting by

8:18

looking at clips in video footage,

8:22

trying to identify what it thinks a

8:24

skier is, indexing that clip, and then

8:29

returning that clip to us. Although this

8:32

might not feel impressive, remember that

8:34

an AI agent did all that instead of a

8:36

human reviewing the footage beforehand,

8:39

manually identifying the skier, and

8:42

adding tags like skier, mountain, ski,

8:45

snow. The programming is obviously a lot

8:47

more technical and complicated than what

8:49

we see in the front end, but that's the

8:51

point of this demo, right? The average

8:53

user like myself wants a simple app that

8:56

just works without me having to

8:58

understand what's going on in the back

9:00

end. Speaking of examples, I'm also

9:02

building my very own basic AI agent

9:05

using n8n. So, let me know in the

9:07

comments what type of AI agent you'd

9:08

like me to make a tutorial on next. To

9:11

wrap up, here's a simplified

9:12

visualization of the three levels we

9:14

covered today. Level one, we provide an

9:17

input and the LLM responds with an

9:19

output. Easy. Level two, for AI

9:22

workflows, we provide an input and tell

9:24

the LLM to follow a predefined path that

9:27

may involve retrieving information

9:29

from external tools. The key trait here

9:31

is that the human programs a path for the LLM

9:34

to follow. Level three, the AI agent

9:37

receives a goal and the LLM performs

9:39

reasoning to determine how best to

9:41

achieve the goal, takes action using

9:44

tools to produce an interim result,

9:46

observes that interim result, and

9:48

decides whether iterations are required,

9:51

and produces a final output that

9:53

achieves the initial goal. The key trait

9:56

here is that the LLM is a decision maker

9:58

in the workflow. If you found this

10:00

helpful, you might want to learn how to

10:02

build a prompts database in Notion. See

10:04

you on the next video. In the

10:05

meantime, have a great one.
