Full Transcript

Stanford CS153 Frontier Systems | Scale, AGI, and the Future of Everything

41:07 · English · Transcribed Jun 15, 2026

0:09

Please join me in welcoming Sam Altman.

0:13

[applause]

0:18

This class was designed as an

0:19

inspiration from a you know from a set

0:21

of different experiences uh while I was

0:23

a student here. One of them was Terry

0:24

Winograd's uh intro seminar CS47N

0:29

computers and the open society. Uh but a

0:31

second one that was a pretty formative

0:33

experience uh for me and a lot of my

0:35

friends and peers on campus at the time

0:37

in 2014 was uh CS183

0:40

How to Start a Startup by Sam. Um and so

0:43

it's really cool to have you back. Uh

0:45

what's it like? How how's it feeling for

0:47

you to be back? I was thinking as I was

0:49

walking in, if I had just a little more

0:50

time, I would do uh an update to that

0:52

class because I think everything about

0:55

starting a startup has changed so much

0:57

and I have not seen anyone do a good

1:00

version of how you're supposed to make a

1:01

startup now. Uh so I had that like just

1:04

walking in here I had that like ah it'd

1:05

be fun to do it again.

1:06

>> So uh timeline wise yeah you you taught

1:09

that in '14. I think OpenAI was founded in

1:12

2015 is that right?

1:13

>> 16 basically 16. Okay. So, so then you

1:16

went, you know, it was like you were it

1:18

it it felt to me from the from an

1:20

observer perspective that you had like

1:22

come up with your working theory for how

1:23

to do it right and then you went and

1:25

tried to implement it. Is that is that a

1:27

fair assessment or is that not the

1:28

case? OpenAI was like the strangest startup

1:32

of the last maybe couple of decades in

1:35

the Silicon Valley because it started as

1:37

a research lab. It was it was really not

1:38

a company at all,

1:39

>> right? Um, and that

1:44

the kind of normal course of of startups

1:46

is that you start a product company and

1:48

then it like grows for a while and then

1:49

growth slows down and then you start a

1:51

research lab and you like bolt that on

1:52

and you try to figure out the next thing

1:53

to do. And we were the opposite of that.

1:56

We were a research lab first that later

1:58

had to bolt on a startup,

1:59

>> right?

2:00

>> And uh I don't really recommend that.

2:02

It's kind of an unusual thing, but that

2:04

that's not quite what I meant. What I

2:07

meant is like we still followed the

2:09

pre-AI rules of a startup because we were

2:11

trying to make AI. We didn't have it

2:12

yet.

2:13

>> But now like watching what the best

2:15

startups do is so different than how

2:18

startups worked even a couple of years

2:19

ago. Um that I think someone I'm

2:23

probably not going to do it. Someone

2:23

should do that class again. And what

2:25

would be the biggest updates you you'd

2:27

make based on your data? Um,

2:34

you with like an affordable amount of

2:37

spend on tokens, you can do what a 100

2:40

person incredibly

2:43

great engineering team would do as a

2:45

startup and that was just totally

2:46

impossible. That was like not in the set

2:48

of options for a startup and now it is.

2:52

So, so I think what you can take on, uh,

2:54

the level of ambition you can have, the

2:55

speed at which you can move, the amount

2:57

of stuff you can do at once, uh, is just

2:59

totally different. And, um, does that

3:02

change the shape of the problems you

3:04

feel like you'd assign at the end of the

3:06

class for people to attack, you know, at

3:08

the end of that quarter if you were

3:10

teaching it again? I don't think

3:12

assigning problems to attack ever works

3:14

because if you like if I can think of a

3:17

problem, if I can think of like a really

3:18

great startup idea, uh if it's like

3:20

obvious enough to me, uh then it's

3:23

probably obvious to a lot of people.

3:25

When we started OpenAI, we were we were

3:26

like the uh you know, one of maybe

3:29

generously speaking four AGI efforts in

3:32

the world, right? And you want to find

3:33

something like that. And I'm sure that

3:35

there exists something today that just

3:37

wasn't possible at all pre like

3:40

automated coding era uh that is totally

3:43

unobvious that will be you know a multi-

3:46

trillion dollar market soon uh and that

3:48

only four companies are working on right

3:49

now. But I don't know what that is. It's

3:50

much more likely you all know what that

3:52

is than I know what that is. I just you

3:54

know my brain is like taken over by

3:55

OpenAI. Um but you know the kind of idea

3:59

someone can assign you to work on is

4:01

probably not what you want.

4:02

>> Yep. Um, okay. So, that that's fair. Um,

4:06

but I think it would be helpful since

4:07

this is a systems class to maybe uh

4:10

reason about a particular problem that

4:12

you have to reason through so that they

4:14

can then apply the shape of the

4:16

techniques used to break down from a

4:18

systems perspective that problem into

4:19

solutions to their own problem.

4:21

>> Yeah. Um and a and a concept that uh you

4:24

had started to tease in the class you

4:25

know back in 2014 and then uh clearly

4:28

you've talked about publicly over the

4:29

years is um scale right scale is its own

4:33

beast it's it's you know quantity is its

4:35

own quality. Scale as a concept has

4:39

been something it seems like you've um

4:42

empirically investigated in all kinds of

4:44

ways over the last 10 years.

4:46

Um could you help help us first unpack

4:48

like what you mean by scale now 10 years

4:50

later how would you deconstruct that as

4:52

a systems design uh attribute to apply

4:55

whether it's a as as a tool um can can

4:58

we start there yes uh so I don't know

5:03

why the following observation is true I

5:06

offer no theory

5:09

that I find satisfying to explain it and

5:11

that makes me a little bit nervous to

5:14

suggest that you follow it, but I'm

5:16

going to anyway because empirically it

5:18

does seem to be true, which is all of

5:20

the most interesting things I have

5:22

observed in my career in watching other

5:25

uh things happen. All of the most

5:27

interesting ones uh have had something

5:31

to do with emergent properties that

5:34

scale or scale continuing to provide

5:36

returns far beyond what the consensus

5:39

thinks will work. And this obviously

5:41

happens with like scaling laws for AI

5:43

models. Um but this happens with uh you

5:48

know getting more smart people together

5:50

to think about one problem

5:52

in a research setting. Um this happens

5:54

with uh companies and the sort of

5:57

economies of scale you can get in

5:58

all these different ways. I really

6:00

learned this at Y Combinator when uh it

6:03

became clear to me that everybody was

6:06

saying, "Oh, Y Combinator's gotten too

6:07

big. It should shrink. We should fund

6:08

less companies per batch." you know, the

6:10

best times of Y Combinator when it was

6:12

like 10 companies per batch. And a lot

6:14

of like very smart people were saying

6:16

this and

6:18

and it was like tempting because it

6:20

would have been like much less work. And

6:21

the theory was that, you know, the best

6:23

companies are always kind of obvious and

6:25

then you fund the rest and it's not as

6:26

helpful. Um, but a huge part of the

6:28

magic of what made YC work were uh was

6:32

the sort of the network effects inside

6:34

of the batch and that was an emergent

6:36

property at scale that just hadn't been

6:38

discovered before. No one had tried to

6:39

fund startups

6:41

at scale in the same way and and thus no

6:43

one had ever happened upon this

6:46

observation of when you do that um

6:48

there's

6:50

there's something important that happens

6:52

that just didn't exist at all at the

6:54

1/10th to 1/100th of the scale.

6:57

There's a bunch of other examples like

6:58

this. Uh I

7:03

and I'll skip them in the interest of

7:05

time, but I I would say again I offer no

7:08

explanation for why, but empirically

7:10

speaking, when you find a thing that you

7:12

can push on,

7:15

you can push something to a scale people

7:16

have not tried before and it's already

7:18

working in some interesting way at the

7:20

smaller scale. More often than not, that

7:22

seems to be a good idea. And it also

7:24

seems to be something that

7:28

most people don't do enough.

7:30

>> And I don't offer an explanation for

7:32

this either, but like in, you know, when

7:34

we were like, we're really going to

7:35

scale AI models. Um, all of the like

7:38

geniuses in the field, most of them

7:40

were, oh, this isn't really working. You

7:41

know, that's that's barely a scientific

7:43

result. It's not interesting that it

7:44

gets better at scale. You've already

7:45

shown that. Why keep scaling it? I

7:47

mentioned the YC example. Um, I've seen

7:50

a lot of

7:52

startup founders where they're like,

7:53

well, you know, there might be something

7:55

interesting that would happen if I

7:56

scaled this up, but I'm a little worried

7:59

about it for non-specific reasons. And

8:01

again, looking back at like a huge data

8:04

set of people that have scaled their

8:07

companies in all these different ways.

8:08

There's almost always interesting stuff

8:09

there. So, I think directionally that's

8:12

like an interesting thing to push on and

8:16

severely underexplored.

8:18

Um on the systems design part of that uh

8:24

I think one reason people don't do it as

8:26

much is stuff breaks uh at an

8:31

accelerating rate and in an

8:32

unpredictable way as you scale it and if

8:35

you are going to really scale something

8:37

8:39

it's always like a little bit broken.

8:41

there are always like very smart people

8:43

who say why you shouldn't do this you

8:45

know don't get too ambitious don't get

8:46

too big let's try this smaller and so

8:49

breaking that down as a systems problem

8:51

I use the thing of when we were like

8:53

scaling up AI models there was

8:55

technically can we do this at all this

8:56

seems crazy like no one had ever thought

8:58

about trying to do a run across 10,000

9:00

or 100,000 GPUs and that was going to

9:01

require stacks of engineering talent um

9:04

there was the capital requirements and

9:06

what was going to take to do this and

9:08

like how is there ever going to be a

9:09

business how can you think about taking

9:10

this risk

9:12

uh there was this sort of like cultural

9:13

stuff of researchers saying well if

9:14

we're going to get all this compute for

9:18

something why not have to divide it up

9:19

among all these all these projects and

9:21

this also happens in kind of every area

9:23

I've looked at almost every area for

9:25

scale and breaking it down into the sort

9:29

of each difficult area or each reason

9:32

not to do it and trying to address them

9:34

one at a time that's been really

9:35

important. Um,

9:38

I'm going to push on that a little bit

9:40

because

9:41

there's very few people who've been able

9:42

to sort of repeatedly scale new products

9:45

and systems the way uh the OpenAI team

9:48

has over the years. But it seems like

9:50

one of the issues is there are all these

9:53

prior conditioning sort of mental models

9:56

and expectations humans have. And you

9:58

said things break. And one of the things

10:00

it seems often breaks that's hard the

10:03

hardest to refactor is is human the

10:06

human side of the the systems design,

10:09

right? Wherever there's human

10:10

implementers or there's uh human

10:12

participants in that. And so what have

10:13

you learned about humans at scale like

10:15

organizing humans at scale to

10:17

participate in a system that may not be

10:19

uh like just a redo of some past system

10:22

that they they get naively, a

10:25

priori, on first blush. Um,

10:29

I think like clear a clear goal, a clear

10:33

plan to get there. Uh, and like a clear

10:40

answer to the way that you're going to

10:42

get there and kind of how you're going

10:43

to make decisions along the way. That's

10:44

that's very important. So, um, you know,

10:48

if we go back to the example of when we

10:49

decided to scale up models, there were a

10:51

lot of people who were like, ah, this

10:52

isn't really going to work. It's going

10:54

to have these problems. It's also not,

10:55

you know, we need a more diversified

10:57

portfolio. But once we say no, we're

10:58

going to make a bet on scaling deep

11:00

learning, like that's our thing. If

11:01

we're wrong, we'll fail, but we're going

11:03

to do that. Here's why we're going to do

11:04

that. Here's what we believe about what

11:06

the state of the world could be like if

11:07

we get there. Uh, that's very powerful.

11:11

And then

11:13

for whatever reason, um, we did not

11:16

evolve to be good at thinking about

11:19

exponentials. People have a hard time

11:22

imagining that scaling laws are going to

11:25

continue exponentially, that revenue

11:26

will grow exponentially, that an

11:28

organization can take on exponential

11:30

complexity. And in my experience, it

11:34

takes a lot of time to really reason

11:36

through first principles with people

11:37

about why why that can happen. Can we

11:40

take two examples uh to walk through

11:42

that? The first being ChatGPT and the

11:44

second being Codex. You know, both of

11:46

these have transformed. Can can everyone

11:48

hear? I'm going to try to project it.

11:50

Yeah. Okay. Um so let let me put in a

11:54

frame and you can challenge both the

11:55

assumption and then we can hopefully

11:56

reason to example what happened. In the

11:59

case of ChatGPT, you know, for a long time

12:00

in scaling of models a big mental block

12:03

that seemed to be prevalent in the space

12:06

is what are these things going to be

12:07

useful for this is you know it's a

12:09

research uh sort of solution solution

12:12

chasing a problem research first

12:14

approach. It's not a product. Um and

12:16

then you know ChatGPT came out and it

12:18

proved to the world that you know that

12:21

chat experience was a killer app for

12:23

general models um at scale for consumers

12:27

and then a couple of years later you

12:29

know it's clear that coding has been the

12:31

killer enterprise app. So what how would

12:34

you compare and contrast the systems you

12:35

guys used to discover those use cases

12:37

ship them scale them monetize them any

12:40

any salient learnings from those two

12:42

systems? Yes. Um,

12:46

so we had made GPT-3 and

12:50

we needed to make money cuz we wanted to

12:52

go scale up to, you know, a billion and

12:53

multi-billion dollar computers and we

12:54

had GPT-3 and it was kind of interesting.

12:56

It was a cool demo but we couldn't

12:58

figure out a product to build around it

13:00

and we had been thinking thinking we

13:02

just couldn't do it. We had tried a few

13:03

things. They they hadn't worked. Um, and

13:05

so we knew the models were gonna get

13:07

better, but we also wanted to like start

13:09

a revenue engine sooner. And we said,

13:12

well, since we can't figure out what

13:13

product to build, we're just going to

13:14

put this into an API and we're going to

13:17

hope that somebody else can figure out

13:18

what product to build. And so we

13:20

launched in like, I don't know,

13:21

something in the summer of 2020 the GPT-3

13:24

API. And initially, it kind of got no

13:29

traction at all. And then about a month

13:32

later, randomly, as far as we can tell,

13:35

it went viral on Twitter on the same

13:37

day, uh, a few different developers kind

13:39

of found got it to do something cool,

13:41

posted it, other people started trying,

13:43

and and then like a lot of people

13:46

started trying the API. Um, but it was

13:49

shockingly bad. If you go back and use

13:51

GPT-3 or 3.5 um you will be astonished at

13:56

how bad the models were then uh relative

13:58

to the amount of excitement they

14:00

generated at the time. Uh so people

14:02

tried all these things and really the

14:04

only business that people got to work in

14:07

a significant way with GPT-3 was

14:09

copywriting. Um and that was like not

14:12

that great and not that exciting and we

14:13

were kind of like, you know, eh, it's just

14:15

going to have to wait for a better

14:16

model. But although, you know, that was the only

14:20

business that was working, developers

14:22

had figured out how to like put in a

14:23

prompt and get and be able to chat with

14:25

it. And we saw this a lot like more

14:29

people were using they couldn't get the

14:32

API to work for their business, but they

14:34

were using their API key to just chat.

14:35

And we said, well, we can build a good

14:37

chatbot. People clearly want that. And

14:40

we had a new model. We actually had GPT-4

14:42

done, but we had a new model we were

14:43

ready to release in between called 3.5.

14:45

And we had figured out a new kind of

14:47

post training where we could get the

14:49

models to do like a good job with

14:50

instruction following so it can make it

14:52

easier to chat with. And we said, well,

14:54

you know, the API is not working great.

14:58

Maybe it was like a 10 or a $20 million

15:00

run rate kind of business, but there is

15:02

this thing that people love. Uh, and

15:05

under the YC principle of see what your

15:07

users love and do that, we said we'll

15:08

we'll build a chatbot around it. And we

15:10

put that out and we still didn't think

15:12

it was going to do that well. Uh there

15:14

was it was really meant as like a

15:15

research demo uh to convince other

15:18

people that they should build chat-like

15:19

products and pay us for the API,

15:22

but that went like crazy viral. And

15:24

another thing I had learned from YC is

15:26

when something really starts growing and

15:28

it's not very good, you have like a

15:30

guaranteed hit on your hands. And so we

15:32

had like five days where the traffic

15:35

would shoot up, fall off, and everybody

15:37

be like, "Well, that was just a hype

15:38

cycle." But then the next day it would

15:39

get to a higher peak, fall off again

15:41

later in the day. People would say

15:42

that's a hype hype cycle. By the fourth

15:44

or fifth day, I was like, I know how

15:46

this works. I know what's going to

15:47

happen. Like, we have the potential here

15:50

>> at a killer product. Um, and we knew we

15:54

could make it much better. We knew we

15:55

could we knew we had GPT-4. We knew we

15:57

could keep scaling. Um, but by that

16:02

fifth day, we got everybody together and

16:04

said, "This is an emergency. This is a

16:06

good kind of emergency, but we have to

16:08

build a company and a product all at

16:10

once."

16:11

Uh we then had like two months of crazy

16:14

scaling. Uh and then we said, you know,

16:17

we have to figure out a business model

16:18

later. For now, we're just going to

16:20

charge people so that we don't like run

16:21

out our compute bills. But that's

16:23

obviously not the long-term answer. That

16:25

also turned out just to work. Um and

16:28

that was the story of ChatGPT. And then

16:29

there was so much utility that people

16:31

just had not gotten over the activation

16:34

energy to find that that has worked

16:36

really well. Um and then Codex.

16:40

Actually the plan before ChatGPT was that

16:42

we were going to go all in on code.

16:44

>> Um we knew these models could write

16:45

code. Uh we knew that they could be

16:48

really and we knew that that would be

16:50

like a valuable area. But then we had

16:51

this incredibly exciting thing happen.

16:53

Um but our kind of internal belief at

16:55

the time was that coding was how these

16:59

models would control things on computers

17:01

and robots were how these models would

17:04

control things in the physical world.

17:05

And if you made a smart enough model

17:07

that had sort of the actuators of

17:08

writing code and driving a

17:11

robot, you could then kind of actually

17:14

get this intelligence to do stuff for

17:15

you in the world.

17:16

>> So, uh, then it took us a while to get

17:18

there. And then I think Codex got

17:21

really good by early this year, but with

17:24

5.5 is when we saw this real inflection

17:26

point where people are now like doing

17:29

just incredible things with it. And um

17:32

you know that we earlier in the class

17:34

we've talked about how the capabilities

17:36

pipeline uh is starting to look is

17:39

starting to become somewhat more legibly

17:41

standard across different research

17:42

groups. You got you know pre-training

17:45

mid-training post training. Then you got

17:46

the RL and supervised feedback loop. Is

17:48

do you think that's roughly like the

17:50

shape of the pipeline that allowed

17:52

Codex to you know go through a

17:54

capability jump and that will basically

17:55

stay stable now and consistent or are we

17:57

going to go through a major rewrite of

17:58

that pipeline? I think that is

18:00

definitely the current pipeline. I

18:02

expect we will go through a major

18:03

rewrite. I don't know when it'll happen

18:04

or exactly how. Um, but

18:08

it is a little odd to me that it's so

18:11

happens as a pipeline and doesn't quite

18:14

feel like the optimal solution. Um, what

18:18

would be an optimal solution in your

18:20

head?

18:20

>> I think that's a research problem for

18:21

the AIs to figure out. Um, I think we're

18:24

at a point where and we've set this goal

18:26

that by September of this year, we will

18:28

use 500,000 A100 equivalent GPUs, like a

18:31

lot of computing power, as an AI

18:33

research intern, and by March of 2028

18:35

that we will have a full end-to-end very

18:38

talented researcher like figuring out

18:40

complete new architectures. Um, so I

18:43

think we are going to get like with the

18:45

current pipeline, the current

18:46

architectures, I think we're going to

18:48

get over the line of when AIs can do

18:50

incredible incredible work. Um, you

18:53

know, one of the things that you you

18:56

just described there

18:59

is you you we we've been talking a lot

19:01

in the class about systems frameworks

19:03

and analogies to make concepts from one

19:05

domain legible to other people who may

19:06

not have all the context in another and

19:09

that sometimes because of the

19:11

translation problem, you know, reasoning

19:13

by analogy is not helpful because then

19:15

errors compound. Yeah. Um right there

19:17

you said you know our goal is to try to

19:19

use it as an AI intern which obviously

19:21

is a very useful metaphor within the

19:23

context of you know Silicon Valley a

19:25

class that understands how these

19:26

pipelines work and so on and then as as

19:28

you scale actually that metaphor

19:30

globally people who might not have all

19:32

that context go start analogizing these

19:34

models in ways that they shouldn't be

19:35

like how should we think about the

19:37

limits of of that of of what are the

19:40

limits to scale of um what are the

19:43

product analogies the research analogies

19:45

you find most useful

19:46

within the valley and which one of the

19:48

what have you found about the limits of

19:51

those analogies scaling and now how do

19:53

you navigate between those two problems?

19:56

I I've been very interested in studying

19:59

how like

20:02

I think what is happening is we are we

20:04

are in the process of creating a new

20:05

utility. This doesn't happen very often.

20:07

you know, electricity is utility,

20:08

internet's a utility, there's water, I

20:10

guess there's not a lot of these. Uh,

20:12

and so there are not a lot of examples

20:14

that we can study for good metaphors or

20:17

learnings about how to explain this to

20:19

the world. Um, but I was recently

20:22

looking at what happened when

20:24

electricity became a utility. And it's a

20:28

good analogy for many reasons. It's

20:29

imperfect, of course, too. But the

20:31

electricity companies, at least the ones

20:33

I could find information about, they

20:35

didn't talk about selling electricity

20:36

cuz no one knew what that was or why

20:38

they wanted it. It sounds like very scary.

20:39

It's this thing that's like going to

20:40

come into your house and it could kill

20:42

you in this like gruesome way and you

20:44

you know it feels sort of like very

20:46

different than the world before. Uh and

20:49

maybe they tried to sell electricity or

20:51

market electricity at first. I don't

20:52

know. But in any case, that didn't work.

20:54

And then what they started

20:56

marketing selling to people was light at

20:58

night. You know, we are going to what

21:01

you are getting from us is not

21:02

electricity. It's light at night. By the

21:04

way, you can use the same thing that

21:06

lets you get light for all these other

21:07

things. But people are like, well, why

21:09

would I want that? And they're like,

21:10

well, you know, it'll wash your clothes

21:11

for you someday. And no, no, it won't. I

21:13

can't. That's too far of a jump for me,

21:15

>> right?

21:15

>> Um, so I don't know what our analogy

21:20

for this should be. Um, but I suspect

21:23

that even if even if we're totally right

21:26

and intelligence is going to become this

21:28

new utility that every company, every

21:30

customer, every government just needs

21:34

access to and is going to use in all

21:35

sorts of incredible ways and you will

21:37

have like a OpenAI token subscription

21:39

that you will plug into everything and

21:41

use to access everything and you have

21:42

running for you all the time and doing

21:44

this amazing stuff. I kind of don't

21:46

think at least right now the right way

21:48

for us to analogize that is we're

21:50

selling intelligence because people are

21:51

just like somehow not resonating. I

21:54

don't know what our equivalent of we're

21:57

selling you light at night is going to

21:58

be. But I think if we're going to become

22:00

a new utility, we need to find a way to

22:03

explain to the world what it means to

22:05

have this like intelligence pipe that

22:07

you can just do whatever you'd like with

22:09

>> it. So um one

22:13

question that has emerged an emerging

22:15

property of this class of of having a

22:17

diversity of different speakers is that

22:19

the utility analogy has come up several

22:20

times but in reference to different

22:22

things. So Jensen likened like compute

22:26

to a utility um and why there should be

22:29

access and so on and talked about how

22:30

Stanford should pool budget and so on

22:32

and and and procure that as a utility

22:34

for everybody on campus whereas you just

22:36

likened the intelligence part to a utility.

22:38

are both of these things true is one of

22:40

them true one is one more likely to be

22:41

true how should people reason about

22:42

compute as a utility versus tokens as a

22:45

utility and and by compute I mean here

22:47

chips versus tokens does that make sense

22:50

>> I think as a consumer as like a business

22:53

or an individual um you will think in

22:56

something closer to tokens or probably

22:58

even one level up from tokens. I don't

23:01

think you'll care very much about you

23:03

know where the hardware is, what

23:05

particular chip it is, what's powering

23:07

it. I think that stuff will be

23:08

abstracted out and what you will care

23:10

about is when you're interacting with

23:12

the system. Um

23:15

can you use it a lot? Is it cheap? Is it

23:17

doing a good job? Um so right now it's

23:20

like tokens. It may get as we move into

23:23

a world where we all just have like this

23:25

constant agent running for us, being

23:27

useful to us all of the time. Um, you

23:29

may think about it as even one level up.

23:31

But yeah, my my guess is is you when you

23:34

like pay for your cell phone bill,

23:36

you're like, "All right, I'm buying

23:38

access to airtime and some number of

23:40

gigabytes and, you know, it's going to

23:42

do all these things and I'll use all

23:43

these apps and whatever else." But like

23:45

what you think about paying for the kind

23:47

of internet utility in this case is just

23:49

like access to the whole system and the

23:53

particular hardware at the base station

23:55

and how it connects to the internet. You

23:57

don't think about that as much.

23:58

>> Um I know I could nerd out about utility

24:01

infrastructure for a long time but I

24:02

want to make sure we switch a little bit

24:03

to being relevant for the students.

24:05

Usually we have uh questions where we're

24:07

not hearing those today unless you're

24:09

comfortable. Oh, okay. Great. How about

24:11

that? Improv. Okay. Uh so one final

24:15

question to start getting the creative

24:16

juices flowing is um the final project

24:18

for this class or fiber 183 is the

24:21

one-person frontier lab. So everybody

24:23

here is working on projects where

24:25

they're simulating being an individual

24:28

uh as a lab with access to all the right

24:30

tools. They've got hundreds of thousands

24:31

of dollars of credits from Cloudflare. I

24:33

think we've got some OpenAI tokens

24:35

maybe. But there's a bunch of compute at

24:36

their disposal. Um, what would you, if

24:39

you were in the class, what would you be

24:41

working on for your one-person Frontier

24:42

Lab project? First of all, I think

24:44

that's an awesome project. Um,

24:48

I think this is top of mind because uh

24:52

you we we were just like talking about

24:54

utility frame frameworks. I think

24:56

there's a lot of very smart people

24:58

working on uh great training ideas and

25:02

we're going to have incredible models.

25:04

No matter what you all do, we're going

25:05

to have incredible models. I promise

25:07

here uh like pretty quickly but

25:11

I I think we have not invested enough in

25:13

being able to deliver at scale huge

25:16

amounts of cheap intelligence. So maybe

25:17

I would go work on like the inference

25:19

part of the stack

25:20

>> and how are we going to get this

25:22

incredible intelligence to be cheap and

25:24

abundant? Uh I think that's

25:26

underinvested in and and I think all of

25:28

the frontier labs are going to have to

25:30

become inference companies to a

25:31

significant degree. Um, okay.

25:36

It might be too late to pivot your

25:37

projects, but better late than never.

25:39

>> Work on whatever you want to work on.

25:40

[laughter]

25:42

>> Uh, okay. Let's start taking questions

25:43

and I'm going to moderate and try to be

25:45

not, you know, please try to be

25:47

productive and not spicy, etc. Remember,

25:49

it's a CS class, but up to you Sam is

25:52

fine.

25:53

>> Oh, we've got questions. Oh, perfect.

25:54

All right. First one, the questions

25:56

about your views on Yann LeCun's view that

25:59

LLMs are a dead end. Um,

26:03

first of all, in terms of achieving

26:05

human level intelligence, these models

26:07

have already far surpassed human

26:09

intelligence in some ways and then

26:11

they're wildly worse than others. Like

26:13

for example, they seem much worse than

26:15

people are at very long horizon

26:20

kind of high judgment signal and tasks.

26:24

Um on the other hand yesterday we had

26:28

one of our models uh discover or

26:31

disprove a conjecture, one of the Erdős

26:33

problems that smart people had

26:35

worked on for a long time and a lot of

26:37

people a lot of smart scientists I don't

26:39

know if LeCun was one of them or not, had

26:41

even quite recently said something like

26:43

that was not going to happen. Uh and

26:45

then like the model just did it and you

26:47

know now you have all these

26:48

mathematicians saying like is math over?

26:50

What does this mean for our field? So

26:52

clearly LLMs are capable of figuring out

26:56

new knowledge and clearly they are

26:58

capable of doing some things that some

27:00

intelligence tasks that humans just

27:02

can't do. Um they are going to scale

27:04

much further. So how much better and

27:06

what distribution of the tasks they can

27:08

do better than humans. We'll find out

27:09

but I suspect it's a lot. And the you

27:12

know in terms of this like lack of a

27:15

belief in the exponential we were

27:16

talking about earlier. Um, I think the

27:18

field was honestly held back by a

27:21

generation of scientists who just were

27:23

way too certain on what wouldn't what

27:25

what scaling was not going to produce

27:27

and then some people just looked at the

27:29

graphs and said, "Well, it looks like

27:30

it's continuing beautifully. Let's keep

27:31

going." Um,

27:34

I think world models are clearly

27:37

important and to

27:39

we'll need that for things like

27:41

robotics. Uh but betting

27:44

against LLM scaling at this point

27:48

uh feels quite misguided to me.

27:53

>> Does it get annoying to be the I told

27:54

you so guy?

27:55

>> No. I mean

27:58

there are these like Twitter trolls that

28:00

you know for years have just been like

28:02

it's not going to work. It's not going

28:03

to work. This is so dumb. Like you know

28:04

this is a fraud. This company's going to

28:06

fail. This research approach is going to

28:07

fail. And I used to get more bothered by

28:09

them. But I don't even like feel the I

28:11

told you so at this point. It's like you

28:12

were like she was nervous.

28:14

>> You're still going on about it. Like the

28:16

data is

28:19

>> quite strong on our side and I don't

28:22

think it'd be that fun to say I told you

28:24

so. And also the fact that you're like

28:25

still saying we're wrong doesn't really

28:26

bother me.

28:27

>> I think there's that kind of move on.

28:29

>> There's that saying that like insanity

28:30

is doing the same thing over and over

28:32

again when presented with data that is

28:34

not working and they keep repeating

28:35

that. And in a sense it's it's it's a

28:37

form of insanity. I think

28:39

>> I I think there's something that happens

28:40

which is if you make your identity about

28:43

a particular

28:45

thing is going to work or not work

28:48

and you associate yourself with that

28:50

belief and then the science or the

28:52

empirical results disprove you and

28:55

you're like too hung up on your

28:57

identity, you can't let it go. You can't

28:58

see the truth.

28:59

>> Yeah.

29:00

>> And I think this is like an important

29:01

reminder in both directions.

29:03

>> Yeah.

29:04

>> How do you see education?

29:07

Um, it clearly has to super adapt and I

29:11

am worried. I I thought by now it would

29:13

have. Um, the the I think if we continue

29:17

to teach and evaluate students

29:20

as if we were in a pre-AGI world, um,

29:23

it's not going to work and it is going

29:25

to lead to like atrophy of learning how

29:27

to think or whatever. And I thought that

29:29

was going to be obvious enough that I

29:31

wasn't that worried. You know, when

29:32

ChatGPT launched, I was like, "Yeah,

29:33

we're going to have one year of like

29:35

students like cheating and not learning

29:37

that much. And then the educational

29:39

system is just going to redesign

29:40

itself." And there's and we're going to

29:41

teach people so much better. You know,

29:43

people are going to really

29:45

get projects where they have to they

29:48

have to use AI to be able to do it, but

29:49

they still have to like stretch their

29:50

brain more and think more and figure out

29:52

new things to do. And honestly, I

29:56

struggle to point to any significant

29:58

systemic change that I've seen in the

30:00

education system at large in the three

30:03

and a half years since ChatGPT launched.

30:04

And I that was a prediction error for

30:06

me. I thought I thought that would have

30:07

happened. So I have no doubt that we can

30:12

uh like we have done with every other

30:13

technological leap before redesign how

30:16

education works so that you still have

30:18

to learn how to think. And there will be

30:20

some things like I I I am a person who

30:23

thinks by writing and I write a lot of

30:26

stuff that I never show anyone else but

30:28

it's still important to me to figure

30:29

something out and so I'm grateful that I

30:30

I learned to write. People say the same

30:32

thing about programming. Um so there

30:35

will be some things that we teach people

30:36

to do that machines can do better just

30:38

because it's helpful to teach them the

30:42

meta skill of thinking and learning and

30:43

that makes sense. But there are a lot of

30:45

other things where we should just

30:46

totally teach totally change how we

30:49

teach or how we learn or how we

30:50

evaluate. And

30:53

if we don't do that, I think there will

30:54

be like significant atrophy in people's

30:57

critical thinking skills. Uh question is

30:59

what was your favorite class and what

31:00

what do you wish you had taken while

31:02

when you were at Stanford?

31:03

>> Does Stanford still do intro sems? I did

31:06

like all the I did like three intros a

31:08

quarter my freshman year like and I

31:10

loved all of them. Uh they were all

31:12

super different. Uh I but looking back

31:17

the fact that I

31:19

was able to get such a broad exposure to

31:21

stuff and have like a very shallow

31:24

understanding of lots of different

31:25

fields was an incredible thing. If it

31:28

had not been for that I just would have

31:29

taken like CS and physics classes which

31:31

still would have been great. But um I I

31:35

think more about the stuff the classes I

31:37

took that were like totally random and

31:40

unrelated to what I do now but in some

31:42

important way

31:44

gave me a perspective than I do I think

31:46

I would have like learned to program no

31:49

matter what. Uh so I and I didn't think

31:53

that at the time I was like kind of like

31:54

yeah you know this is this stuff is all

31:57

cool but it's mostly going to be about

31:58

like learning CS. Um, I only did two

32:01

years of school. Uh, so there was a lot

32:03

of stuff I wanted to take that I didn't

32:04

get to. Um, but that's kind of the

32:06

surprising thing.

32:08

My question is, what is your spiciest

32:11

take of all?

32:17

I I think with more time to think uh I

32:20

could come up with a much

32:22

spicier one, but um

32:26

I think AI is just going to keep going.

32:29

And

32:31

I think this is considered

32:34

I don't I don't think this is like

32:35

widely believed yet. And I think if this

32:37

were widely believed, there would be

32:39

like significantly more reverberations

32:41

that are happening through society right

32:43

now. And maybe I don't have the spicier

32:45

take. Actually, maybe this is the high

32:46

order bit that if AI progress continues

32:49

on the exponential that it's on for

32:52

another,

32:54

it's been three and a half years since

32:55

ChatGPT. If even if we're another three

32:56

and a half years on that same

32:57

trajectory, the world

33:00

the potential the way that society

33:03

what society is capable of, are just

33:05

completely different. Well, let me try

33:07

to prompt more thinking tokens on that

33:09

one. um you you have if we treated you

33:13

as a model like as a frontier model and

33:16

you have some inherent capabilities and

33:17

we're going we're going to try to elicit

33:19

capabilities that people don't know

33:21

about for the next few minutes. Um one

33:23

of them is that you've been post-trained

33:24

now on you you've been continuously RL

33:27

on OpenAI as well as the external

33:29

feedback loop of the world on what

33:30

doesn't work and works and doesn't work.

33:33

So now if we're going to treat you as a

33:34

prediction engine for a sec, the prompt

33:36

is what are the three most likely forks

33:39

of the universe you see over the next 10

33:41

years and what is your what is your

33:43

probability assessment on each of those?

33:45

Does that make sense?

33:47

One that feels very important is uh like

33:52

how much is this technology going to be

33:54

very widely democratized versus how much

33:56

is it going to sit in a few companies. I

34:00

I think a world there are all of these

34:02

reasons why you could imagine the

34:03

default is that this gets concentrated

34:05

to a few companies and they become like

34:08

you know a significant fraction of the

34:10

wealth on earth that would obviously be

34:12

terrible and we work super hard to push

34:14

against that but I think that's going to

34:16

require like the will of the world to to

34:18

really avoid um because there is a sort

34:21

of attractor state there and I think

34:22

part of the reason that we need to push

34:24

to this kind of utility model of the

34:25

world is that a it's quite unstable and

34:29

quite bad will feel quite unfair if a

34:31

few companies have all of this. But B, I

34:33

think there's a real alignment failure

34:34

and a very fragile world. Uh, and the

34:37

best way to get to a world we want that

34:39

represents like everybody winning and

34:41

everybody's values being represented,

34:43

everybody having agency is to just put

34:44

push this technology out into the world.

34:48

Um, but there will be a very strong

34:49

argument against that around sort of

34:51

safety and stability. And I think that

34:54

will be a big fork. And it's very

34:55

important and I encourage all of you in

34:57

your careers to push hard that this is a

34:59

technology.

35:01

It can bring us an incredible sci-fi

35:03

future. Life can be unbelievably much

35:05

better. We are going to incur some risk

35:07

to get there. But the risk of keeping

35:09

this concentrated in a handful of

35:11

companies even though we would be one of

35:12

these companies is not something we

35:14

should tolerate. So I think that would

35:15

be a big fork. uh in terms of

35:17

probability I think it's

35:21

the world should have such an interest

35:23

in it happening this way that I think

35:25

it's like 80% we end up on the

35:27

democratic path but there will be a very

35:30

strong safety message and you know there

35:32

will be a lot of power seeking people

35:34

who who want to concentrate the power

35:36

and

35:38

one of the problems with forecasting

35:41

this or that you have and we all have as

35:43

humans is once you make that forecast

35:45

then you have agency to affect the

35:48

forecasts, right? And the forecast for

35:49

>> Well, I mean, we're clear on what we're

35:51

going to use our agency for. Like, this

35:52

is what we believe in. We think that uh,

35:55

you know, we're going to do everything

35:56

we can to push it in this direction. We

35:58

just we see the forces in the other

36:00

direction. Maybe a related fork. Uh,

36:04

there's a lot of talk about like future

36:05

economic models and are we going to do

36:07

universal basic income? Are we going to

36:09

have everybody gets to like own a slice

36:11

of every company? Like, are we going to

36:13

is it capitalism with no change? Is it

36:16

like full-on communism? There's like a

36:18

lot of talk about this. One thing that I

36:20

think is not talked about much is how

36:23

specifically how we distribute compute.

36:26

>> So maybe a lot of the economy can work

36:29

in a way that it's going to work. And

36:31

I've actually I've become much less of a

36:32

even short-term jobs doomer. I've always

36:34

been optimistic we find new things to

36:36

do. But this may not be as disrupted

36:38

as I originally thought in the short

36:40

term. Um but we are seeing compute

36:43

shortages now. I can imagine them

36:46

getting much worse and I can imagine

36:48

compute being like the most important

36:50

utility that people need. Uh so if the

36:53

price of compute from a supply and

36:54

demand perspective gets way out of whack

36:56

then I think there will be a very

36:58

interesting fork about what it means to

37:00

equitably distribute compute. So you did

37:03

two very interesting things there which

37:04

you said on the economic side we might

37:07

need universal basic income.

37:09

Everybody owns a piece of shares. You

37:11

know, one of the speakers in this class

37:12

is um Nicolai Tangen who runs the

37:16

Norwegian sovereign wealth fund. He's

37:17

awesome. He's awesome. You know, the

37:19

Norwegian Sovereign Wealth Fund owns

37:21

1.5% of all publicly traded companies on

37:23

the planet. They also have effectively

37:25

universal basic income. You could argue

37:26

there's flavors of this already today

37:28

because, you know, the largest employer

37:30

now in the United States is the

37:31

government and you could argue like

37:32

large sections of that are a way for the

37:34

government to redistribute income from

37:36

taxpayers. So are these solutions that

37:39

actually need to be novel or just

37:41

reimplemented for this era? How do you

37:43

think about the novelty of those

37:44

solutions where we often you know in

37:46

Silicon Valley have this tendency

37:48

to be like reinvent you know old things

37:51

from first principles and so should we

37:53

just look to existing systems and tweak

37:54

them. Um yeah, I don't think that these

37:55

things require

37:57

deeply new ideas. Although I will say um

38:02

I am much more excited about people

38:04

having some sort of ownership stake than

38:07

a fixed monthly cash dividend,

38:09

>> right?

38:09

>> Um and I I funded like a big universal

38:14

basic income study a while ago. I've

38:17

also watched what happens when people

38:18

like invest in startups and I know which

38:21

model I think like hits human psychology

38:23

better. So what I would love to see is

38:26

as leverage in the world shifts from

38:30

labor to capital which I think is going

38:31

to keep happening

38:33

that we find a way to have something

38:36

like a citizens wealth fund in the

38:38

country or in the world eventually where

38:41

you like you basically own a slice of

38:43

capitalism right a slice of these

38:44

companies. And then on the second fork

38:46

there on compute bottlenecks, you said

38:49

uh at some point when compute prices get

38:51

out of whack between January and this

38:53

year, my my current understanding is

38:55

based on data we've seen that H100

38:57

prices and Blackwell prices the spreads

39:00

between long-term reservations and spot

39:02

is like 5x.

39:04

>> I don't know if it's that high anymore.

39:05

I think it got a little better. But

39:07

yeah, tell me.

39:08

>> Or if you can even find H100s cuz

39:10

they're pretty much all gone for this

39:11

year. Does that sound right?

39:13

>> No argument. There's a gigantic compute

39:15

shortage. Yeah. So that that's a good

39:18

example of an of a systems problem right

39:20

now that's live. At least to some folks,

39:23

it feels like COVID, you know, for the

39:25

compute era, like all the toilet paper's

39:27

gone.

39:27

>> Yeah.

39:29

>> Why are people not freaking out about

39:30

this?

39:31

>> Well, I think people assume we will make

39:33

big inference gains on the hardware we

39:35

have. Uh I also think there is a tsunami

39:37

of hardware coming

39:38

>> but maybe the demand tsunami is even

39:41

bigger and people I think people should

39:42

be freaking out somewhat

39:43

>> and and would you say it's fair like how

39:46

long are we going to exist in a compute

39:47

shortage

39:48

at least you know based on current data

39:51

you have

39:53

>> I think like other you you can't talk

39:56

really about like worldwide demand for

39:58

electricity without talking about the

40:00

price like it's there's an extremely

40:03

different demand about how much energy

40:05

people want to use in the world if the

40:06

price comes down by a factor of 10 or

40:07

goes up by a factor of 10 and I think AI

40:12

is like that too.

40:14

>> Uh the

40:16

if we can make models

40:21

sufficiently smart and at a sufficiently

40:23

low cost. I think demand is like kind of

40:26

uncapped and so in some sense as long as

40:28

we can continue to make progress on this

40:30

there will be a shortage forever and

40:32

things will be bid above what the

40:35

price we think we think the price should

40:37

be even though people are getting better

40:39

smarter more whatever intelligence just

40:42

because you can use like

40:45

if we make really great personal agents

40:47

and you can have 10 of them running and

40:48

working for you all the time or 100 and

40:50

you know you want the hundred I think

40:53

>> it's a lot of inference

40:54

lot of compute.

40:55

>> Awesome. With that, I'm going to give

40:56

you the swag for the class, which is

41:00

[applause]

41:02

Thank you for coming. Thank you. Thank

41:04

you all.
