0:09
Please join me in welcoming Sam Altman.
0:18
This class was designed as an
0:19
inspiration from a you know from a set
0:21
of different experiences uh while I was
0:23
a student here. One of them was Terry
0:24
Winograd's uh intro seminar CS47N
0:29
computers and the open society. Uh but a
0:31
second one that was a pretty formative
0:33
experience uh for me and a lot of my
0:35
friends and peers on campus at the time
0:40
How to Start a Startup by Sam. Um and so
0:43
it's really cool to have you back. Uh
0:45
what's it like? How how's it feeling for
0:47
you to be back? I was thinking as I was
0:49
walking in, if I had just a little more
0:50
time, I would do uh an update to that
0:52
class because I think everything about
0:55
starting a startup has changed so much
0:57
and I have not seen anyone do a good
1:00
version of how you're supposed to make a
1:01
startup now. Uh so I had that like just
1:04
walking in here I had that like ah it'd
1:05
be fun to do it again.
1:06
>> So uh timeline wise yeah you you taught
1:09
that in '14. I think OpenAI was founded in
1:13
>> 16 basically 16. Okay. So, so then you
1:16
went, you know, it was like you were it
1:18
it it felt to me from the from an
1:20
observer perspective that you had like
1:22
come up with your working theory for how
1:23
to do it right and then you went and
1:25
tried to implement it. Is that is that a
1:27
fair assessment or is that not the
1:28
case? OpenAI was like the strangest startup
1:32
of the last maybe couple of decades in
1:35
the Silicon Valley because it started as
1:37
a research lab. It was it was really not
1:39
>> right? Um, and that
1:44
the kind of normal course of of startups
1:46
is that you start a product company and
1:48
then it like grows for a while and then
1:49
growth slows down and then you start a
1:51
research lab and you like bolt that on
1:52
and you try to figure out the next thing
1:53
to do. And we were the opposite of that.
1:56
We were a research lab first that later
1:58
had to bolt on a startup,
2:00
>> And uh I don't really recommend that.
2:02
It's kind of an unusual thing, but that
2:04
that's not quite what I meant. What I
2:07
meant is like we still followed the
2:09
pre-AI rules of a startup because we were
2:11
trying to make AI. We didn't have it
2:13
>> But now like watching what the best
2:15
startups do is so different than how
2:18
startups worked even a couple of years
2:19
ago. Um that I think someone I'm
2:23
probably not going to do it. Someone
2:23
should do that class again. And what
2:25
would be the biggest updates you you'd
2:27
make based on your data? Um,
2:34
you with like an affordable amount of
2:37
spend on tokens, you can do what a 100
2:43
great engineering team would do as a
2:45
startup and that was just totally
2:46
impossible. That was like not in the set
2:48
of options for a startup and now it is.
2:52
So, so I think what you can take on, uh,
2:54
the level of ambition you can have, the
2:55
speed at which you can move, the amount
2:57
of stuff you can do at once, uh, is just
2:59
totally different. And, um, does that
3:02
change the shape of the problems you
3:04
feel like you'd assign at the end of the
3:06
class for people to attack, you know, at
3:08
the end of that quarter if you were
3:10
teaching it again? I don't think
3:12
assigning problems to attack ever works
3:14
because if you like if I can think of a
3:17
problem, if I can think of like a really
3:18
great startup idea, uh if it's like
3:20
obvious enough to me, uh then it's
3:23
probably obvious to a lot of people.
3:25
When we started OpenAI, we were we were
3:26
like the uh you know, one of maybe
3:29
generously speaking four AGI efforts in
3:32
the world, right? And you want to find
3:33
something like that. And I'm sure that
3:35
there exists something today that just
3:37
wasn't possible at all pre like
3:40
automated coding era uh that is totally
3:43
unobvious that will be you know a multi-
3:46
trillion dollar market soon uh and that
3:48
only four companies are working on right
3:49
now. But I don't know what that is. It's
3:50
much more likely you all know what that
3:52
is than I know what that is. I just you
3:54
know my brain is like taken over by
3:55
OpenAI. Um but you know the kind of idea
3:59
someone can assign you to work on is
4:01
probably not what you want.
4:02
>> Yep. Um, okay. So, that that's fair. Um,
4:06
but I think it would be helpful since
4:07
this is a systems class to maybe uh
4:10
reason about a particular problem that
4:12
you have to reason through so that they
4:14
can then apply the shape of the
4:16
techniques used to break down from a
4:18
systems perspective that problem into
4:19
solutions to their own problem.
4:21
>> Yeah. Um and a and a concept that uh you
4:24
had started to tease in the class you
4:25
know back in 2014 and then uh clearly
4:28
you've talked about publicly over the
4:29
years is um scale right scale is its own
4:33
beast it's it's you know quantity is its
4:35
own quality what scale as a concept has
4:39
been something it seems like you've um
4:42
empirically investigated in all kinds of
4:44
ways over the last 10 years.
4:46
Um could you help help us first unpack
4:48
like what you mean by scale now 10 years
4:50
later how would you deconstruct that as
4:52
a systems design uh attribute to apply
4:55
whether it's a as as a tool um can can
4:58
we start there yes uh so I don't know
5:03
why the following observation is true I
5:09
that I find satisfying to explain it and
5:11
that makes me a little bit nervous to
5:14
suggest you follow it, but I'm
5:16
going to anyway because empirically it
5:18
does seem to be true, which is all of
5:20
the most interesting things I have
5:22
observed in my career in watching other
5:25
uh things happen. All of the most
5:27
interesting ones uh have had something
5:31
to do with emergent properties that
5:34
scale or scale continuing to provide
5:36
returns far beyond what the consensus
5:39
thinks will work. And this obviously
5:41
happens with like scaling laws for AI
5:43
models. Um but this happens with uh you
5:48
know getting more smart people together
5:50
to think about one problem
5:52
in a research setting. Um this happens
5:54
with uh companies and the sort of
5:57
economy of scale you can get, in
5:58
all these different ways. I really
6:00
learned this at Y Combinator when uh it
6:03
became clear to me that everybody was
6:06
saying, "Oh, Y Combinator's gotten too
6:07
big. It should shrink. We should fund
6:08
less companies per batch." you know, the
6:10
best times of Y Combinator when it was
6:12
like 10 companies per batch. And a lot
6:14
of like very smart people were saying
6:18
and it was like tempting because it
6:20
would have been like much less work. And
6:21
the theory was that, you know, the best
6:23
companies are always kind of obvious and
6:25
then you fund the rest and it's not as
6:26
helpful. Um, but a huge part of the
6:28
magic of what made YC work were uh was
6:32
the sort of the network effects inside
6:34
of the batch and that was an emergent
6:36
property at scale that just hadn't been
6:38
discovered before. No one had tried to
6:41
at scale in the same way and and thus no
6:43
one had ever happened upon this
6:46
observation of when you do that um
6:50
there's something important that happens
6:52
that just didn't exist at all at the
6:54
1/10th to 1/100th of the scale.
6:57
There's a bunch of other examples like
7:03
and I'll skip them in the interest of
7:05
time, but I I would say again I offer no
7:08
explanation for why, but empirically
7:10
speaking, when you find a time that you
7:15
you can push something to a scale people
7:16
have not tried before and it's already
7:18
working in some interesting way at the
7:20
smaller scale. More often than not, that
7:22
seems to be a good idea. And it also
7:24
seems to be something that
7:28
most people don't do enough.
7:30
>> And I don't offer an explanation for
7:32
this either, but like in, you know, when
7:34
we were like, we're really going to
7:35
scale AI models. Um, all of the like
7:38
geniuses in the field, most of them
7:40
were, oh, this isn't really working. You
7:41
know, that's that's barely a scientific
7:43
result. It's not interesting that it
7:44
gets better at scale. You've already
7:45
shown that. Why keep scaling it? I
7:47
mentioned the YC example. Um, I've seen
7:52
startup founders where they're like,
7:53
well, you know, there might be something
7:55
interesting that would happen if I
7:56
scaled this up, but I'm a little worried
7:59
about it for non-specific reasons. And
8:01
again, looking back at like a huge data
8:04
set of people that have scaled their
8:07
companies in all these different ways.
8:08
There's almost always interesting stuff
8:09
there. So, I think directionally that's
8:12
like an interesting thing to push on and
8:16
severely underexplored.
8:18
Um on the systems design part of that uh
8:24
I think one reason people don't do it as
8:26
much is stuff breaks uh at an
8:31
accelerating rate and in an
8:32
unpredictable way as you scale it and if
8:35
you are going to really scale something
8:39
it's always like a little bit broken.
8:41
there are always like very smart people
8:43
who say why you shouldn't do this you
8:45
know don't get too ambitious don't get
8:46
too big let's try this smaller and so
8:49
breaking that down as a systems problem
8:51
I use the thing of when we were like
8:53
scaling up AI models there was
8:55
technically can we do this at all this
8:56
seems crazy like no one had ever thought
8:58
about trying to do a run across 10,000
9:00
or 100,000 GPUs and that was going to
9:01
require stacks of engineering talent um
9:04
there was the capital requirements and
9:06
what was going to take to do this and
9:08
like how is there ever going to be a
9:09
business how can you think about taking
9:12
uh there was this sort of like cultural
9:13
stuff of researchers saying well if
9:14
we're going to get all this compute
9:18
something why not have to divide it up
9:19
among all these all these projects and
9:21
this also happens in kind of every area
9:23
I've looked at almost every area for
9:25
scale and breaking it down into the sort
9:29
of each difficult area or each reason
9:32
not to do it and trying to address them
9:34
one at a time that's been really
9:38
I'm going to push on that a little bit
9:41
there's very few people who've been able
9:42
to sort of repeatedly scale new products
9:45
and systems the way uh the OpenAI team
9:48
has over the years. But it seems like
9:50
one of the issues is there are all these
9:53
prior conditioning sort of mental models
9:56
and expectations humans have. And you
9:58
said things break. And one of the things
10:00
it seems often breaks that's hard the
10:03
hardest to refactor is is human the
10:06
human side of the the systems design,
10:09
right? Wherever there's human
10:10
implementers or there's uh human
10:12
participants in that. And so what have
10:13
you learned about humans at scale like
10:15
organizing humans at scale to
10:17
participate in a system that may not be
10:19
uh like just a redo of some past system
10:22
that they get naively, a
10:25
priori, on first blush. Um,
10:29
I think like clear a clear goal, a clear
10:33
plan to get there. Uh, and like a clear
10:40
answer to the way that you're going to
10:42
get there and kind of how you're going
10:43
to make decisions along the way. That's
10:44
that's very important. So, um, you know,
10:48
if we go back to the example of when we
10:49
decided to scale up models, there were a
10:51
lot of people who were like, ah, this
10:52
isn't really going to work. It's going
10:54
to have these problems. It's also not,
10:55
you know, we need a more diversified
10:57
portfolio. But once we say no, we're
10:58
going to make a bet on scaling deep
11:00
learning, like that's our thing. If
11:01
we're wrong, we'll fail, but we're going
11:03
to do that. Here's why we're going to do
11:04
that. Here's what we believe about what
11:06
the state of the world could be like if
11:07
we get there. Uh, that's very powerful.
11:13
for whatever reason, um, we did not
11:16
evolve to be good at thinking about
11:19
exponentials. People have a hard time
11:22
imagining that scaling laws are going to
11:25
continue exponentially, that revenue
11:26
will grow exponentially, that an
11:28
organization can take on exponential
11:30
complexity. And in my experience, it
11:34
takes a lot of time to really reason
11:36
through first principles with people
11:37
about why why that can happen. Can we
11:40
take two examples uh to walk through
11:42
that? The first being ChatGPT and the
11:44
second being Codex. You know, both of
11:46
these have transformed. Can can everyone
11:48
hear? I'm going to try to project it.
11:50
Yeah. Okay. Um so let let me put in a
11:54
frame and you can challenge both the
11:55
assumption and then we can hopefully
11:56
reason through what happened. In the
11:59
case of ChatGPT, you know, for a long time
12:00
in scaling of models a big mental block
12:03
that seemed to be prevalent in the space
12:06
is what are these things going to be
12:07
useful for this is you know it's a
12:09
research uh sort of solution
12:12
chasing a problem research first
12:14
approach. It's not a product. Um and
12:16
then you know ChatGPT came out and it
12:18
proved to the world that you know that
12:21
chat experience was a killer app for
12:23
general models um at scale for consumers
12:27
and then a couple of years later you
12:29
know it's clear that coding has been the
12:31
killer enterprise app. So what how would
12:34
you compare and contrast the systems you
12:35
guys used to discover those use cases
12:37
ship them scale them monetize them any
12:40
any salient learnings from those two
12:46
so we had made GPT3 and
12:50
we needed to make money cuz we wanted to
12:52
go scale up to, you know, a billion and
12:53
multi-billion dollar computers and we
12:54
had GPT3 and it was kind of interesting.
12:56
It was a cool demo but we couldn't
12:58
figure out a product to build around it
13:00
and we had been thinking thinking we
13:02
just couldn't do it. We had tried a few
13:03
things. They they hadn't worked. Um, and
13:05
so we knew the models were gonna get
13:07
better, but we also wanted to like start
13:09
a revenue engine sooner. And we said,
13:12
well, since we can't figure out what
13:13
product to build, we're just going to
13:14
put this into an API and we're going to
13:17
hope that somebody else can figure out
13:18
what product to build. And so we
13:20
launched in like, I don't know,
13:21
something in the summer of 2020 the GPT3
13:24
API. And initially, it kind of got no
13:29
traction at all. And then about a month
13:32
later, randomly, as far as we can tell,
13:35
it went viral on Twitter on the same
13:37
day, uh, a few different developers kind
13:39
of found got it to do something cool,
13:41
posted it, other people started trying,
13:43
and and then like a lot of people
13:46
started trying the API. Um, but it was
13:49
shockingly bad. If you go back and use
13:51
GPT3 or 3.5 um you will be astonished at
13:56
how bad the models were then uh relative
13:58
to the amount of excitement they
14:00
generated at the time. Uh so people
14:02
tried all these things and really the
14:04
only business that people got to work in
14:07
a significant way with GPT3 was
14:09
copywriting. Um and that was like not
14:12
that great and not that exciting and we
14:13
were kind of like you know, eh, it's just
14:15
going to have to wait for a better
14:16
model. But although that was the only
14:20
business that was working, developers
14:22
had figured out how to like put in a
14:23
prompt and get and be able to chat with
14:25
it. And we saw this a lot like more
14:29
people were using they couldn't get the
14:32
API to work for their business, but they
14:34
were using their API key to just chat.
14:35
And we said, well, we can build a good
14:37
chatbot. People clearly want that. And
14:40
we had a new model. We actually had GPT4
14:42
done, but we had a new model we were
14:43
ready to release in between called 3.5.
14:45
And we had figured out a new kind of
14:47
post training where we could get the
14:49
models to do like a good job with
14:50
instruction following so it can make it
14:52
easier to chat with. And we said, well,
14:54
you know, the API is not working great.
14:58
Maybe it was like a 10 or a $20 million
15:00
run rate kind of business, but there is
15:02
this thing that people love. Uh, and
15:05
under the YC principle of see what your
15:07
users love and do that, we said we'll
15:08
we'll build a chatbot around it. And we
15:10
put that out and we still didn't think
15:12
it was going to do that well. Uh there
15:14
was it was really meant as like a
15:15
research demo uh to convince other
15:18
people that they should build chat-like
15:19
products and pay us for the API,
15:22
but that went like crazy viral. And
15:24
another thing I had learned from YC is
15:26
when something really starts growing and
15:28
it's not very good, you have like a
15:30
guaranteed hit on your hands. And so we
15:32
had like five days where the traffic
15:35
would shoot up, fall off, and everybody
15:37
be like, "Well, that was just a hype
15:38
cycle." But then the next day it would
15:39
get to a higher peak, fall off again
15:41
later in the day. People would say
15:42
that's a hype cycle. By the fourth
15:44
or fifth day, I was like, I know how
15:46
this works. I know what's going to
15:47
happen. Like, we have the potential here
15:50
>> at a killer product. Um, and we knew we
15:54
could make it much better. We knew we
15:55
could we knew we had GPT4. We knew we
15:57
could keep scaling. Um, but by that
16:02
fifth day, we got everybody together and
16:04
said, "This is an emergency. This is a
16:06
good kind of emergency, but we have to
16:08
build a company and a product all at
16:11
Uh we then had like two months of crazy
16:14
scaling. Uh and then we said, you know,
16:17
we have to figure out a business model
16:18
later. For now, we're just going to
16:20
charge people so that we don't like run
16:21
out our compute bills. But that's
16:23
obviously not the long-term answer. That
16:25
also turned out just to work. Um and
16:28
that was the story of ChatGPT. And then
16:29
there was so much utility that people
16:31
just had not gotten over the activation
16:34
energy to find that that has worked
16:36
really well. Um and then Codex.
16:40
Actually the plan before ChatGPT was that
16:42
we were going to go all in on code.
16:44
>> Um we knew these models could write
16:45
code. Uh we knew that they could be
16:48
really and we knew that that would be
16:50
like a valuable area. But then we had
16:51
this incredibly exciting thing happen.
16:53
Um but our kind of internal belief at
16:55
the time was that coding was how these
16:59
models would control things on computers
17:01
and robots were how these models would
17:04
control things in the physical world.
17:05
And if you made a smart enough model
17:07
that had sort of the actuators of
17:08
writing code and driving a
17:11
robot, you could then kind of actually
17:14
get this intelligence to do stuff for
17:16
>> So, uh, then it took us a while to get
17:18
there. And then I think Codex got
17:21
really good by early this year, but with
17:24
5.5 is when we saw this real inflection
17:26
point where people are now like doing
17:29
just incredible things with it. And um
17:32
you know that we earlier in the class
17:34
we've talked about how the capabilities
17:36
pipeline uh is starting to look is
17:39
starting to become somewhat more legibly
17:41
standard across different research
17:42
groups. You got you know pre-training
17:45
mid-training post training. Then you got
17:46
the RL and supervised feedback loop. Is
17:48
do you think that's roughly like the
17:50
shape of the pipeline that allowed
17:52
Codex to you know go through a
17:54
capability jump and that will basically
17:55
stay stable now and consistent or are we
17:57
going to go through a major rewrite of
17:58
that pipeline? I think that is
18:00
definitely the current pipeline. I
18:02
expect we will go through a major
18:03
rewrite. I don't know when it'll happen
18:04
or exactly how. Um, but
18:08
it is a little odd to me that it's so
18:11
happens as a pipeline and doesn't quite
18:14
feel like the optimal solution. Um, what
18:18
would be an optimal solution in your
18:20
>> I think that's a research problem for
18:21
the AIs to figure out. Um, I think we're
18:24
at a point where and we've set this goal
18:26
that by September of this year, we will
18:28
use 500,000 A100 equivalent GPUs, like a
18:31
lot of computing power, as an AI
18:33
research intern, and by March of 2028
18:35
that we will have a full end-to-end very
18:38
talented researcher like figuring out
18:40
complete new architectures. Um, so I
18:43
think we are going to get like with the
18:45
current pipeline, the current
18:46
architectures, I think we're going to
18:48
get over the line of when AIs can do
18:50
incredible incredible work. Um, you
18:53
know, one of the things that you you
18:56
just described there
18:59
is you you we we've been talking a lot
19:01
in the class about systems frameworks
19:03
and analogies to make concepts from one
19:05
domain legible to other people who may
19:06
not have all the context in another and
19:09
that sometimes because of the
19:11
translation problem, you know, reasoning
19:13
by analogy is not helpful because then
19:15
errors compound. Yeah. Um right there
19:17
you said you know our goal is to try to
19:19
use it as an AI intern which obviously
19:21
is a very useful metaphor within the
19:23
context of you know Silicon Valley a
19:25
class that understands how these
19:26
pipelines work and so on and then as as
19:28
you scale actually that metaphor
19:30
globally people who might not have all
19:32
that context go start analogizing these
19:34
models in ways that they shouldn't be
19:35
like how should we think about the
19:37
limits of of that of of what are the
19:40
limits to scale of um what are the
19:43
product analogies the research analogies
19:45
you find most useful
19:46
within the valley and which one of the
19:48
what have you found about the limits of
19:51
those analogies scaling and now how do
19:53
you navigate between those two problems?
19:56
I I've been very interested in studying
20:02
I think what is happening is we are we
20:04
are in the process of creating a new
20:05
utility. This doesn't happen very often.
20:07
you know, electricity is a utility,
20:08
internet's a utility, there's water, I
20:10
guess there's not a lot of these. Uh,
20:12
and so there are not a lot of examples
20:14
that we can study for good metaphors or
20:17
learnings about how to explain this to
20:19
the world. Um, but I was recently
20:22
looking at what happened when
20:24
electricity became a utility. And it's a
20:28
good analogy for many reasons. It's
20:29
imperfect, of course, too. But the
20:31
electricity companies, at least the ones
20:33
I could find information about, they
20:35
didn't talk about selling electricity
20:36
cuz no one knew what that was or why
20:38
they wanted it. It sounds like very scary.
20:39
It's this thing that's like going to
20:40
come into your house and it could kill
20:42
you in this like gruesome way and you
20:44
you know it feels sort of like very
20:46
different than the world before. Uh and
20:49
maybe they tried to sell electricity or
20:51
market electricity at first. I don't
20:52
know. But in any case, that didn't work.
20:54
And then what they started
20:56
marketing selling to people was light at
20:58
night. You know, we are going to what
21:01
you are getting from us is not
21:02
electricity. It's light at night. By the
21:04
way, you can use the same thing that
21:06
lets you get light for all these other
21:07
things. But people are like, well, why
21:09
would I want that? And they're like,
21:10
well, you know, it'll wash your clothes
21:11
for you someday. And no, no, it won't. I
21:13
can't. That's too far of a jump for me,
21:15
>> Um, so I don't know what our analogy
21:20
for this should be. Um, but I suspect
21:23
that even if even if we're totally right
21:26
and intelligence is going to become this
21:28
new utility that every company, every
21:30
customer, every government just needs
21:34
access to and is going to use in all
21:35
sorts of incredible ways and you will
21:37
have like an OpenAI token subscription
21:39
that you will plug into everything and
21:41
use to access everything and you have
21:42
running for you all the time and doing
21:44
this amazing stuff. I kind of don't
21:46
think at least right now the right way
21:48
for us to analogize that is we're
21:50
selling intelligence because people are
21:51
just like somehow not resonating. I
21:54
don't know what our equivalent of we're
21:57
selling you light at night is going to
21:58
be. But I think if we're going to become
22:00
a new utility, we need to find a way to
22:03
explain to the world what it means to
22:05
have this like intelligence pipe that
22:07
you can just do whatever you'd like with
22:13
question that has emerged, an emergent
22:15
property of this class of of having a
22:17
diversity of different speakers is that
22:19
the utility analogy has come up several
22:20
times but in reference to different
22:22
things. So Jensen likened like compute
22:26
to a utility um and why there should be
22:29
access and so on and talked about how
22:30
Stanford should pool budget and so on
22:32
and and and procure that as a utility
22:34
for everybody on campus whereas you just
22:36
likened the intelligence part to a utility.
22:38
are both of these things true is one of
22:40
them true one is one more likely to be
22:41
true how should people reason about
22:42
compute as a utility versus tokens as a
22:45
utility and and by compute I mean here
22:47
chips versus tokens does that make sense
22:50
>> I think as a consumer as like a business
22:53
or an individual um you will think in
22:56
something closer to tokens or probably
22:58
even one level up from tokens. I don't
23:01
think you'll care very much about you
23:03
know where the hardware is, what
23:05
particular chip it is, what's powering
23:07
it. I think that stuff will be
23:08
abstracted out and what you will care
23:10
about is when you're interacting with
23:15
can you use it a lot? Is it cheap? Is it
23:17
doing a good job? Um so right now it's
23:20
like tokens. It may get as we move into
23:23
a world where we all just have like this
23:25
constant agent running for us, being
23:27
useful to us all of the time. Um, you
23:29
may think about it as even one level up.
23:31
But yeah, my my guess is is you when you
23:34
like pay for your cell phone bill,
23:36
you're like, "All right, I'm buying
23:38
access to airtime and some number of
23:40
gigabytes and, you know, it's going to
23:42
do all these things and I'll use all
23:43
these apps and whatever else." But like
23:45
what you think about paying for the kind
23:47
of internet utility in this case is just
23:49
like access to the whole system and the
23:53
particular hardware at the base station
23:55
and how it connects to the internet. You
23:57
don't think about that as much.
23:58
>> Um I know I could nerd out about utility
24:01
infrastructure for a long time but I
24:02
want to make sure we switch a little bit
24:03
to being relevant for the students.
24:05
Usually we have uh questions where we're
24:07
not hearing those today unless you're
24:09
comfortable. Oh, okay. Great. How about
24:11
that? Improv. Okay. Uh so one final
24:15
question to start getting the creative
24:16
juices flowing is um the final project
24:18
for this class or fiber 183 is the
24:21
one-person frontier lab. So everybody
24:23
here is working on projects where
24:25
they're simulating being an individual
24:28
uh as a lab with access to all the right
24:30
tools. They've got hundreds of thousands
24:31
of dollars of credits from Cloudflare. I
24:33
think we've got some OpenAI tokens
24:35
maybe. But there's a bunch of compute at
24:36
their disposal. Um, what would you, if
24:39
you were in the class, what would you be
24:41
working on for your one-person Frontier
24:42
Lab project? First of all, I think
24:44
that's an awesome project. Um,
24:48
I think this is top of mind because uh
24:52
you we we were just like talking about
24:54
utility frameworks. I think
24:56
there's a lot of very smart people
24:58
working on uh great training ideas and
25:02
we're going to have incredible models.
25:04
No matter what you all do, we're going
25:05
to have incredible models. I promise
25:07
here uh like pretty quickly but
25:11
I I think we have not invested enough in
25:13
being able to deliver at scale huge
25:16
amounts of cheap intelligence. So maybe
25:17
I would go work on like the inference
25:20
>> and how are we going to get this
25:22
incredible intelligence to be cheap and
25:24
abundant? Uh I think that's
25:26
underinvested in and and I think all of
25:28
the frontier labs are going to have to
25:30
become inference companies to a
25:31
significant degree. Um, okay.
25:36
It might be too late to pivot your
25:37
projects, but better late than never.
25:39
>> Work on whatever you want to work on.
25:42
>> Uh, okay. Let's start taking questions
25:43
and I'm going to moderate and try to be
25:45
not, you know, please try to be
25:47
productive and not spicy, etc. Remember,
25:49
it's a CS class, but up to you Sam is
25:53
>> Oh, we've got questions. Oh, perfect.
25:54
All right. First one, the questions
25:56
about your views on Yann LeCun's view that
25:59
LLMs are a dead end. Um,
26:03
first of all, in terms of achieving
26:05
human level intelligence, these models
26:07
have already far surpassed human
26:09
intelligence in some ways and then
26:11
they're wildly worse than others. Like
26:13
for example, they seem much worse than
26:15
people are at very long horizon
26:20
kind of high judgment signal and tasks.
26:24
Um on the other hand yesterday we had
26:28
one of our models uh discover or
26:31
disprove a conjecture, one of the Erdős
26:33
problems that smart people had
26:35
worked on for a long time and a lot of
26:37
people a lot of smart scientists I don't
26:39
know if LeCun was one of them or not had
26:41
even quite recently said something like
26:43
that was not going to happen. Uh and
26:45
then like the model just did it and you
26:47
know now you have all these
26:48
mathematicians saying like is math over?
26:50
What does this mean for our field? So
26:52
clearly LLMs are capable of figuring out
26:56
new knowledge and clearly they are
26:58
capable of doing some things that some
27:00
intelligence tasks that humans just
27:02
can't do. Um they are going to scale
27:04
much further. So how much better and
27:06
what distribution of the tasks they can
27:08
do better than humans. We'll find out
27:09
but I suspect it's a lot. And the you
27:12
know in terms of this like lack of a
27:15
belief in the exponential we were
27:16
talking about earlier. Um, I think the
27:18
field was honestly held back by a
27:21
generation of scientists who just were
27:23
way too certain on what wouldn't what
27:25
what scaling was not going to produce
27:27
and then some people just looked at the
27:29
graphs and said, "Well, it looks like
27:30
it's continuing beautifully. Let's keep
27:34
I think world models are clearly
27:39
we'll need that for things like
27:41
robotics. Uh but betting
27:44
against LLM scaling at this point
27:48
uh feels quite misguided to me.
27:53
>> Does it get annoying to be the I told
27:58
there are these like Twitter trolls that
28:00
you know for years have just been like
28:02
it's not going to work. It's not going
28:03
to work. This is so dumb. Like you know
28:04
this is a fraud. This company's going to
28:06
fail. This research approach is going to
28:07
fail. And I used to get more bothered by
28:09
them. But I don't even like feel the I
28:11
told you so at this point. It's like you
28:12
were like she was nervous.
28:14
>> You're still going on about it. Like the
28:19
>> quite strong on our side and I don't
28:22
think it'd be that fun to say I told you
28:24
so. And also the fact that you're like
28:25
still saying we're wrong doesn't really
28:27
>> I think there's that kind of move on.
28:29
>> There's that saying that like insanity
28:30
is doing the same thing over and over
28:32
again when presented with data that is
28:34
not working and they keep repeating
28:35
that. And in a sense it's it's it's a
28:37
form of insanity. I think
28:39
>> I I think there's something that happens
28:40
which is if you make your identity about
28:45
thing is going to work or not work
28:48
and you associate yourself with that
28:50
belief and then the science or the
28:52
empirical results disprove you and
28:55
you're like too hung up on your
28:57
identity, you can't let it go. You can't
29:00
>> And I think this is like an important
29:01
reminder in both directions.
29:04
>> How do you see education?
29:07
Um, it clearly has to super adapt and I
29:11
am worried. I I thought by now it would
29:13
have. Um, the the I think if we continue
29:17
to teach and evaluate students
29:20
as if we were in a pre-AGI world, um,
29:23
it's not going to work and it is going
29:25
to lead to like atrophy of learning how
29:27
to think or whatever. And I thought that
29:29
was going to be obvious enough that I
29:31
wasn't that worried. You know, when
29:32
ChatGPT launched, I was like, "Yeah,
29:33
we're going to have one year of like
29:35
students like cheating and not learning
29:37
that much. And then the educational
29:39
system is just going to redesign
29:40
itself." And there's and we're going to
29:41
teach people so much better. You know,
29:43
people are going to really
29:45
get projects where they have to they
29:48
have to use AI to be able to do it, but
29:49
they still have to like stretch their
29:50
brain more and think more and figure out
29:52
new things to do. And honestly, I
29:56
struggle to point to any significant
29:58
systemic change that I've seen in the
30:00
education system at large in the three
30:03
and a half years since ChatGPT launched.
30:04
And I that was a prediction error for
30:06
me. I thought I thought that would have
30:07
happened. So I have no doubt that we can
30:12
uh like we have done with every other
30:13
technological leap before redesign how
30:16
education works so that you still have
30:18
to learn how to think. And there will be
30:20
some things like I I I am a person who
30:23
thinks by writing and I write a lot of
30:26
stuff that I never show anyone else but
30:28
it's still important to me to figure
30:29
something out and so I'm grateful that I
30:30
I learned to write. People say the same
30:32
thing about programming. Um so there
30:35
will be some things that we teach people
30:36
to do that machines can do better just
30:38
because it's helpful to teach them the
30:42
meta skill of thinking and learning and
30:43
that makes sense. But there are a lot of
30:45
other things where we should just
30:46
totally teach totally change how we
30:49
teach or how we learn or how we
30:53
if we don't do that, I think there will
30:54
be like significant atrophy in people's
30:57
critical thinking skills. Uh question is
30:59
what was your favorite class and what
31:00
what do you wish you had taken while
31:02
when you were at Stanford?
31:03
>> Does Stanford still do IntroSems? I did
31:06
like all the I did like three IntroSems a
31:08
quarter my freshman year like and I
31:10
loved all of them. Uh they were all
31:12
super different. Uh I but looking back
31:19
was able to get such a broad exposure to
31:21
stuff and have like a very shallow
31:24
understanding of lots of different
31:25
fields was an incredible thing. If it
31:28
had not been for that I just would have
31:29
taken like CS and physics classes which
31:31
still would have been great. But um I I
31:35
think more about the stuff the classes I
31:37
took that were like totally random and
31:40
unrelated to what I do now but in some
31:44
gave me a perspective than I do I think
31:46
I would have like learned to program no
31:49
matter what. Uh so I and I didn't think
31:53
that at the time I was like kind of like
31:54
yeah you know this is this stuff is all
31:57
cool but it's mostly going to be about
31:58
like learning CS. Um, I only did two
32:01
years of school. Uh, so there was a lot
32:03
of stuff I wanted to take that I didn't
32:04
get to. Um, but that's kind of the
32:08
My question is, what is your spiciest
32:17
I I think with more time to think uh I
32:20
could come up with a much
32:26
I think AI is just going to keep going.
32:31
I think this is considered
32:34
I don't I don't think this is like
32:35
widely believed yet. And I think if this
32:37
were widely believed, there would be
32:39
like significantly more reverberations
32:41
that are happening through society right
32:43
now. And maybe I don't have the spicier
32:45
take. Actually, maybe this is the high
32:46
order bit that if AI progress continues
32:49
on the exponential that it's on for
32:54
it's been three and a half years since
32:55
ChatGPT. Even if we're another three
32:56
and a half years on that same
32:57
trajectory, the world
33:00
the potential the way that society
33:03
what society is capable of are just
33:05
completely different. Well, let me try
33:07
to prompt more thinking tokens on that
33:09
one. um you you have if we treated you
33:13
as a model like as a frontier model and
33:16
you have some inherent capabilities and
33:17
we're going we're going to try to elicit
33:19
capabilities that people don't know
33:21
about for the next few minutes. Um one
33:23
of them is that you've been post-trained
33:24
now on you you've been continuously RL'd
33:27
on OpenAI as well as the external
33:29
feedback loop of the world on what
33:30
doesn't work and works and doesn't work.
33:33
So now if we're going to treat you as a
33:34
prediction engine for a sec, the prompt
33:36
is what are the three most likely forks
33:39
of the universe you see over the next 10
33:41
years and what is your what is your
33:43
probability assessment on each of those?
33:45
Does that make sense?
33:47
One that feels very important is uh like
33:52
how much is this technology going to be
33:54
very widely democratized versus how much
33:56
is it going to sit in a few companies. I
34:00
I think a world there are all of these
34:02
reasons why you could imagine the
34:03
default is that this gets concentrated
34:05
to a few companies and they become like
34:08
you know a significant fraction of the
34:10
wealth on earth that would obviously be
34:12
terrible and we work super hard to push
34:14
against that but I think that's going to
34:16
require like the will of the world to to
34:18
really avoid um because there is a sort
34:21
of attractor state there and I think
34:22
part of the reason that we need to push
34:24
to this kind of utility model of the
34:25
world is that a it's quite unstable and
34:29
quite bad, will feel quite unfair, if a
34:31
few companies have all of this. But B, I
34:33
think there's a real alignment failure
34:34
and a very fragile world. Uh, and the
34:37
best way to get to a world we want that
34:39
represents like everybody winning and
34:41
everybody's values being represented,
34:43
everybody having agency is to just put
34:44
push this technology out into the world.
34:48
Um, but there will be a very strong
34:49
argument against that around sort of
34:51
safety and stability. And I think that
34:54
will be a big fork. And it's very
34:55
important and I encourage all of you in
34:57
your careers to push hard that this is a
35:01
It can bring us an incredible sci-fi
35:03
future. Life can be unbelievably much
35:05
better. We are going to incur some risk
35:07
to get there. But the risk of keeping
35:09
this concentrated in a handful of
35:11
companies even though we would be one of
35:12
these companies is not something we
35:14
should tolerate. So I think that would
35:15
be a big fork. Uh, in terms of
35:17
probability I think it's
35:21
the world should have such an interest
35:23
in it happening this way that I think
35:25
it's like 80% we end up on the
35:27
democratic path but there will be a very
35:30
strong safety message and you know there
35:32
will be a lot of power seeking people
35:34
who who want to concentrate the power
35:38
one of the problems with forecasting
35:41
this or that you have and we all have as
35:43
humans is once you make that forecast
35:45
then you have agency to affect the
35:48
forecasts, right? And the forecast for
35:49
>> Well, I mean, we're clear on what we're
35:51
going to use our agency for. Like, this
35:52
is what we believe in. We think that uh,
35:55
you know, we're going to do everything
35:56
we can to push it in this direction. We
35:58
just we see the forces in the other
36:00
direction. Maybe a related fork. Uh,
36:04
there's a lot of talk about like future
36:05
economic models and are we going to do
36:07
universal basic income? Are we going to
36:09
have everybody gets to like own a slice
36:11
of every company? Like, are we going to
36:13
is it capitalism with no change? Is it
36:16
like full-on communism? There's like a
36:18
lot of talk about this. One thing that I
36:20
think is not talked about much is how
36:23
specifically how we distribute compute.
36:26
>> So maybe a lot of the economy can work
36:29
in a way that it's going to work. And
36:31
I've actually I've become much less of a
36:32
even short-term jobs doomer. I've always
36:34
been optimistic we find new things to
36:36
do. But this may not be as disrupted
36:38
as I originally thought in the short
36:40
term. Um but we are seeing compute
36:43
shortages now. I can imagine them
36:46
getting much worse and I can imagine
36:48
compute being like the most important
36:50
utility that people need. Uh so if the
36:53
price of compute from a supply and
36:54
demand perspective gets way out of whack
36:56
then I think there will be a very
36:58
interesting fork about what it means to
37:00
equitably distribute compute. So you did
37:03
two very interesting things there which
37:04
you said on the economic side we might
37:07
need universal basic income.
37:09
Everybody owns a piece of shares. You
37:11
know, one of the speakers in this class
37:12
is um Nicolai Tangen who runs the
37:16
Norwegian sovereign wealth fund. He's
37:17
awesome. He's awesome. You know, the
37:19
Norwegian Sovereign Wealth Fund owns
37:21
1.5% of all publicly traded companies on
37:23
the planet. They also have effectively
37:25
universal basic income. You could argue
37:26
there's flavors of this already today
37:28
because, you know, the largest employer
37:30
now in the United States is the
37:31
government and you could argue like
37:32
large sections of that are a way for the
37:34
government to redistribute income from
37:36
taxpayers. So are these solutions that
37:39
actually need to be novel or just
37:41
reimplemented for this era? How do you
37:43
think about the novelty of those
37:44
solutions where we often you know in
37:46
Silicon Valley have this tendency
37:48
to be like reinvent you know old things
37:51
from first principles and so should we
37:53
just look to existing systems and tweak
37:54
them. Um yeah, I don't think that these need
37:57
deeply new ideas. Although I will say um
38:02
I am much more excited about people
38:04
having some sort of ownership stake than
38:07
a fixed monthly cash dividend,
38:09
>> Um and I I funded like a big universal
38:14
basic income study a while ago. I've
38:17
also watched what happens when people
38:18
like invest in startups and I know which
38:21
model I think like hits human psychology
38:23
better. So what I would love to see is
38:26
as leverage in the world shifts from
38:30
labor to capital which I think is going
38:33
that we find a way to have something
38:36
like a citizens wealth fund in the
38:38
country or in the world eventually where
38:41
you like you basically own a slice of
38:43
capitalism right a slice of these
38:44
companies. And then on the second fork
38:46
there on compute bottlenecks, you said
38:49
uh at some point when compute prices get
38:51
out of whack between January and this
38:53
year, my my current understanding is
38:55
based on data we've seen that H100
38:57
prices and Blackwell prices the spreads
39:00
between long-term reservations and spot
39:04
>> I don't know if it's that high anymore.
39:05
I think it got a little better. But
39:08
>> Or if you can even find H100s cuz
39:10
they're pretty much all gone for this
39:11
year. Does that sound right?
39:13
>> No argument. There's a gigantic compute
39:15
shortage. Yeah. So that that's a good
39:18
example of an of a systems problem right
39:20
now that's live. At least to some folks,
39:23
it feels like COVID, you know, for the
39:25
compute era, like all the toilet paper's
39:29
>> Why are people not freaking out about
39:31
>> Well, I think people assume we will make
39:33
big inference gains on the hardware we
39:35
have. Uh I also think there is a tsunami
39:38
>> but maybe the demand tsunami is even
39:41
bigger and people I think people should
39:42
be freaking out somewhat
39:43
>> and and would you say it's fair like how
39:46
long are we going to exist in a compute
39:48
at least you know based on current data
39:53
>> I think like other you you can't talk
39:56
really about like worldwide demand for
39:58
electricity without talking about the
40:00
price like it's there's an extremely
40:03
different demand about how much energy
40:05
people want to use in the world if the
40:06
price comes down by a factor of 10 or
40:07
goes up by a factor of 10 and I think AI
40:16
if we can make models
40:21
sufficiently smart and a sufficiently
40:23
low cost. I think demand is like kind of
40:26
uncapped and so in some sense as long as
40:28
we can continue to make progress on this
40:30
there will be a shortage forever and
40:32
things will be bid up above what the
40:35
price we think we think the price should
40:37
be even though people are getting better
40:39
smarter more whatever intelligence just
40:42
because you can use like
40:45
if we make really great personal agents
40:47
and you can have 10 of them running and
40:48
working for you all the time or 100 and
40:50
you know you want the hundred I think
40:53
>> it's a lot of inference
40:55
>> Awesome. With that, I'm going to give
40:56
you the swag for the class, which is
41:02
Thank you for coming. Thank you. Thank