Artifacts: Versioned storage that speaks Git
Hello everybody and welcome to this
today in agents week. Uh I am here to
talk about something that's very
exciting that has been uh redacted in
the past. We've been doing a lot of
stuff about this, but uh with me before
we even get started, I want to talk to
one of the blog authors of what we are
looking at today, a thing called
artifacts. But but uh first a person who
doesn't need any introduction at all,
but I'm going to make him do it anyway.
Matt Silverlock, can you introduce
yourself?
>> Of course. Uh thanks Greg. Um super
excited obviously to talk through what's
shipping today um and artifacts. Um I'm
Matt Silverlock. I lead product for
storage and databases and a bunch of
other stuff at Cloudflare. Um been here
many years. Yeah, super excited to kind
of talk about and whether you've read
the blog or not to talk about uh
artifacts and what it actually means and
what it is.
>> Yeah, totally. We've been we've been
pushing this for a while. We've been
saying redacted for a while. I kind of
like that the social media buzz that's
happening. It's nice to finally be able
to talk about this without it being all
all blocked out. Uh um I was thinking so
we have the blog post. Great blog post.
Read the blog post. Uh you're not the
only author. Let's give a little shout
out at the start. Who who are the other
authors on this?
>> Um it takes a village. So the authors
are even only a slight reflection of
everyone that's put the work in. But I
spent a lot of time with Matt Carey and
uh Dylan on our team really working on
the blog and working on a lot of what
Artifacts is today, but there's been a
lot of people behind the scenes. But um
yeah, huge shout out to Matt and
Dylan, who've been kind of grinding away
helping us get this over the line.
>> Awesome. And I wanted to share, if this
is okay, I'm going to share the new
product page that uh just launched
today. So this is super awesome and I
was thinking that we could kind of use
this to walk through it a little bit.
Man, I feel like I I am I'm just now
breathing in. It's been redacted to me.
So, I'm breathing this in. So, I've got
some questions about what's going on
here. So, let's just go from the top.
Why are why are we building this? What
What is this gorgeous page talking
about?
>> Um the page is one thing. So, you know,
I think it's probably no surprise to
anyone as we have all of our coding
agents, code review agents, sandboxes,
right? Um all of our harnesses, right?
um a lot of them rely on git repos for
you know actually managing and and
obviously committing the code and
sharing it with others right um even
just managing state before that sandbox
shuts down. Um, and I think the challenge
is that everything built today for
version control hasn't really scaled for
agents. It was all built for humans, right?
Every kind of social code network, GitHub
or anybody else, right, is kind of seeing
this unprecedented scale. I think the uh
the CO of GitHub um posted about a week
ago actually that they saw a 14 times
year-over-year increase in uh traffic
volume, um Git operations I think it was,
on the network, right? And that's not
14x from the base of a small startup
that's been around a week, where you
just scale up a few more VMs, right? This
is 14x from a company already operating
at internet scale. So we kind of wondered:
yeah, I think there's something we can go
and solve here. If we do it differently
for agents, it might actually work.
>> Um, why is it a hard thing
for agents with Git? If we're just using
standard Git, what's the problem
there?
>> So, I mean, the good thing, and it's
actually sort of the lesser problem, is
that agents are really good at Git.
Um, but we're running tens of agents,
maybe in the background, right? We've got
a bunch of OpenCode or Codex or Claude
Code sessions, right? You've got a bunch
of subagents, right?
>> right
>> Maybe you actually want to commit more
so you can always roll back, so the
agent's not, like, blowing away a
bunch of work. We've all seen the horror
story of someone like burning, you know,
a ton of tokens for an hour and
forgetting to commit, right? It's great
if you can get away from that. Um, but
also kind of sucks if you go to commit
and push and like, you know, uh, the
upstream is down and not available.
We've all been seeing that as well. And
so,
>> again, we're just not ready for the
volume um, that we have, but also maybe
the volume we actually want is we want
to commit more often. We want to push
more often. um how do you give every
agent you have like an isolated repo
that it can act on without crushing the
other one that's a shared infrastructure
um
>> right
>> at this kind of scale
>> avoiding avoiding merge conflicts and
all that right
>> in many cases you may not even want that
that social element right you may want
to actually have the
>> agent pull down an independent copy of
that repo say for code review go and
review that right isolate it act on that
right in its own right and then maybe it
post some comments up to a centralized
platform, right? Maybe your internal CI
system, things like that, right? But
like if you can isolate its actions
while it's actually working as much as
possible, um it ends up being really
great, right? Um if you don't have to
kind of have every agent clone your repo
from GitHub or get blocked if things are
down, that obviously makes your team
more effective as well.
>> That's awesome. So it's it's changing a
little bit, but thinking about how it
works. What does it Oh, I guess I was
just going to ask what does it look like
in code here? So, so should we should we
walk this a little bit about what it
feels like?
>> Yeah. So, um we thought you know again
you take away the human element. We're
not building
you know a social network for code.
We're not thinking about pull requests.
Right.
>> Right.
>> But agents again are really good at git.
Um
so there's sort of three ways we think
about this. Right. There's the
programmatic control plane. How do I
spin up one, ten, millions of repos at
scale, one for every agent on the fly as
they need them, or clone something from
GitHub on the fly, right? Um so what we
call Artifacts, a few lines of code,
like millions of repos, right? You can
create that through Workers. Um you can
issue read tokens, write tokens, and you
say, "Hey, straight to my Git client,
straight to my agent harness, here's a
regular Git remote." It just clones it. It
works on it. It pushes to it, pulls
from it. The agent doesn't have to know
anything other than git, which is huge
because agents really do know git. It's
in the training sets, right? you're not
teaching it a bunch of skills or hoping
that it like understands this magic new
API, right? It just talks git and your
orchestration can take care of that. If
you kind of look at the next example,
well, yeah, there's still a bunch of
stuff that you might want to pull down
and seed your agents, but then have them
work independently, right? Like again,
for code review or for snapshotting
something or maybe a template, right?
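
The control-plane idea described here (create a repo per agent, issue read or write tokens, hand the agent a plain Git remote) can be sketched roughly as follows. This is a toy in-memory model, not the Artifacts API: the `Namespace` class, its method names, the token format, and the `git.example.invalid` URL are all assumptions made up for illustration.

```python
import secrets
from dataclasses import dataclass, field


@dataclass
class Repo:
    name: str
    # token -> scope ("read" or "write"); hypothetical, held in memory only
    tokens: dict = field(default_factory=dict)


class Namespace:
    """Toy stand-in for a repo control plane (NOT the real Artifacts API)."""

    def __init__(self, base_url: str):
        self.base_url = base_url.rstrip("/")
        self.repos = {}

    def create_repo(self, name: str) -> Repo:
        if name in self.repos:
            raise ValueError(f"repo {name!r} already exists")
        self.repos[name] = Repo(name)
        return self.repos[name]

    def issue_token(self, name: str, scope: str) -> str:
        if scope not in ("read", "write"):
            raise ValueError("scope must be 'read' or 'write'")
        token = secrets.token_urlsafe(16)
        self.repos[name].tokens[token] = scope
        return token

    def remote_url(self, name: str) -> str:
        # The agent only ever sees an ordinary Git remote URL.
        return f"{self.base_url}/{name}.git"

    def can_push(self, name: str, token: str) -> bool:
        # Only write-scoped tokens may push.
        return self.repos[name].tokens.get(token) == "write"


# One repo per agent, created on the fly by the orchestrator.
ns = Namespace("https://git.example.invalid/my-namespace")
for agent_id in range(3):
    ns.create_repo(f"agent-{agent_id}")

read_token = ns.issue_token("agent-0", "read")
write_token = ns.issue_token("agent-0", "write")
print(ns.remote_url("agent-0"))
```

The point of the split is that the orchestrator owns repo creation and tokens, while the agent itself only needs `git clone`, `git pull`, and `git push` against the URL it is given.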
>> Yeah, we can still pull something down
and kind of clone from say GitHub. Say
you want to go and work on, like, the
workers-sdk, which is our Wrangler
toolchain. Maybe work on some stuff there.
Isolate and have your agent do a bunch
of work, then commit it and persist it,
right? It's in a sandbox, so you don't
want it on a local disk where it can get
lost. You don't want to push it up to a
pull request yet because you're not
ready to do that, right? You want to
work in isolation. Um Artifacts lets you
go and fork from something upstream as
well, or even clone itself and have, you
know, 5,000 copies of the same repo
inside Artifacts. Um, and then with the
last example, I was like, well,
>> it's kind of like a more more
disposable, right? It feels like you get
>> Exactly. Yeah.
>> Yeah.
>> Yeah.
>> Yeah. And like maybe it's disposable,
maybe it's not, but you kind of get to
make decision as you go.
>> Yeah. Yeah. Did it Did it get it? Oh, it
did. This isn't it. This isn't it. This
isn't it. This one's it. Yeah. And
you're Yeah. Yeah. Cool. Correct me if
I'm wrong. We don't need to just run
this in workers.
>> Yes. Exactly. Like, great if you do, but
you might run your control plane
somewhere else, right? In
another cloud, right? You might want to
still orchestrate creating these repos
using Artifacts before you've even used
Workers, right? Um we expose this over a
regular HTTP API. Um we have
language-specific SDKs, you know,
TypeScript, Go uh and Python. Um you
obviously can code-generate, these days
particularly in the world of agents,
right, a language-specific SDK if you
have something in Rust or Elixir um or
any other sort of language that your
team might be using, right? Um and use
that to manage Artifacts. Like, we want
this to be a case where the control
plane side can be operated from anywhere,
and then the part that I'll sort of talk
through now is also the actual like git
side right like git is a really really
powerful protocol
>> for managing like versioning like again
why teach agents anything else if git is
right there it's very very good they're
very good at it
>> Um but it's also kind of nice if your
environment, say maybe you actually are
in a Worker, or maybe you're in Node.js
somewhere you can host Node.js, or a
Python application. Um what if you don't
want to run a full Git or spawn out to
Git um in that process, right? It's kind
of heavyweight. It doesn't really work
in a lot of serverless environments in
the way that you might expect. Well, with
Artifacts, we also let you
interoperate with it um via the language
SDKs and via those APIs too. So you
can commit files, you can
clone those repos down inside that
language and have that as, you know,
say a JavaScript object for those files
as well. Um there's a lot of those sort
of sandbox or lightweight use cases, say
like with Dynamic Workers, which we just
shipped recently, right, to have
sort of lightweight Workers that
you spawn on the fly. They
still might want some Git-like concepts
where they're acting on files and
committing files, just without the
full Git client embedded. And so we've
sort of taken two approaches there.
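
To make "Git-like concepts without a Git client" concrete, here is a minimal sketch of committing and checking out files as plain in-process objects. It is an illustration of the general technique (content-addressed blobs plus commits that point at a parent), not how Artifacts or Git store data: the `VersionedFiles` class is hypothetical, and it skips trees, packfiles, and Git's real object encoding.

```python
import hashlib
import json


class VersionedFiles:
    """Toy content-addressed store with Git-like commits (illustrative only)."""

    def __init__(self):
        self.objects = {}  # sha -> raw bytes
        self.head = None   # sha of the latest commit, or None if empty

    def _put(self, data: bytes) -> str:
        sha = hashlib.sha256(data).hexdigest()
        self.objects[sha] = data
        return sha

    def commit(self, files: dict, message: str) -> str:
        # Store each file as a blob, then store a commit object that
        # records the full snapshot and a pointer to the previous commit.
        tree = {path: self._put(text.encode()) for path, text in files.items()}
        commit = {"tree": tree, "parent": self.head, "message": message}
        self.head = self._put(json.dumps(commit, sort_keys=True).encode())
        return self.head

    def checkout(self, sha: str) -> dict:
        # Rebuild the snapshot as a plain dict of path -> text.
        commit = json.loads(self.objects[sha])
        return {p: self.objects[b].decode() for p, b in commit["tree"].items()}

    def log(self):
        # Walk parent pointers from newest to oldest.
        sha = self.head
        while sha is not None:
            commit = json.loads(self.objects[sha])
            yield sha, commit["message"]
            sha = commit["parent"]


repo = VersionedFiles()
first = repo.commit({"notes.md": "draft"}, "initial notes")
second = repo.commit({"notes.md": "final", "out.json": "{}"}, "finish task")
print(repo.checkout(first))
```

Because every commit is a full snapshot keyed by its hash, checking out an older commit never disturbs newer ones, which is the property the lightweight sandbox use case relies on.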
>> Super cool. So walk walk me through that
real quick. Let's let walk me through
the we've got a lot of lot of new
concepts coming in. So so walk me
through a dynamic worker that has this
git or is making use of this git. walk
me through that really quick through
this artifacts.
>> Yeah. Um, great question. So, you know,
I've got uh an artifacts namespace,
right? I can create as many repositories
inside that as I want. Could be
thousands to millions or more.
>> Um, say I'm spinning up a dynamic worker
that I want to go and execute some code
or tool as part of an agent harness.
>> Um, but I want to give it a place to
maybe persist its output. Um, persist
some files, right? Um maybe I want it
to take some files that it's already
worked on in the past in that Dynamic
Worker, and that Artifacts repository
could potentially be tied to the same
customer or agent, right, that have the
same ID. I can address that
however I want. When I spawn that
Dynamic Worker, I can pass in a reference,
we call it a binding, um to that repository,
and that Dynamic Worker can then act on
just that repository on the fly. If it
wasn't created before, I can create it
at the same time I create the Dynamic
Worker. If I just need to persist those
files and then pull them out outside of
the user code, then I have access to
that in my control plane.
And I've got, again, not just Git, but
I've got this, like, versioned file
system.
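
A rough sketch of that flow, under stated assumptions: the repo is addressed by a name derived from the customer and agent IDs, it is created on first use, and the spawned code receives a handle to that one repo only. The helper names (`repo_name_for`, `get_or_create`, `spawn_worker`) and the naming scheme are hypothetical, and a plain dict stands in for the namespace.

```python
import re


def repo_name_for(customer_id: str, agent_id: str) -> str:
    """Derive a stable, URL-safe repo name (hypothetical naming scheme)."""
    raw = f"{customer_id}--{agent_id}".lower()
    return re.sub(r"[^a-z0-9-]+", "-", raw).strip("-")


def get_or_create(repos: dict, name: str) -> dict:
    # setdefault creates the repo only the first time the name is seen.
    return repos.setdefault(name, {"files": {}})


def spawn_worker(repos: dict, customer_id: str, agent_id: str):
    """Stand-in for spawning sandboxed code with a binding to ONE repo."""
    binding = get_or_create(repos, repo_name_for(customer_id, agent_id))

    def run(path: str, content: str) -> None:
        # The sandboxed code can only touch the repo it was handed.
        binding["files"][path] = content

    return run


repos = {}  # control-plane view of the whole namespace
run = spawn_worker(repos, "Acme Corp", "agent_7")
run("out.txt", "hi")

# A later run for the same customer and agent lands in the same repo.
run_again = spawn_worker(repos, "Acme Corp", "agent_7")
run_again("report.md", "done")
print(repos)
```

The control plane keeps the full `repos` mapping, so it can read the persisted files back outside of the user code.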
>> I was talking to a friend who's a
founder about this and he's like,
"The Git stuff's really cool, but I
actually have this use case where
my customers build workflows and I have
these config files and these workflow
definitions, and they want to be able to
roll back to them in different cases,
right, as part of their product
surface." And it's like, a versioned file
system that I can go and do that with is
really kind of powerful, right? And Git
semantics just make it kind of easy to
understand.
>> Yeah. Yeah. Yeah. But and for the humans
as well, I think it's also it's readable
for us humans and also for the agents. I
had I hadn't thought about that. I think
that's neat. So like a Dynamic Worker
could be versioned. You know, I've been
playing with building some stuff and
I've been wanting to find the
right way to show code and then let them
go and edit that. And so
that's how to do that, I feel like.
Yeah, iterate inside of a
Dynamic Worker on building code. That's
awesome. Super cool. Now that totally
landed for me. Thank you for that. Okay, so
um
uh here's some use cases that we've got. So
agent workspaces, we've been talking
about that one. Let's talk about config
versioning here a little bit. Walk me
through one of those.
>> Yeah, so I sort of partially touched on
that, but I think we can go a little bit
deeper. Like, you know, if I'm talking
about, you know, um again a startup that
wants to version, like, customer data.
But again, there's a lot of cases
like that. If you think about Git, what's
Git really good at? Git is really good at
storing lots of objects and versioning
them and letting you version control
those versions, but also, like, figure them
out chronologically. You can roll back,
you can revert, right? You can go and
check out a previous commit. And so it kind
of turns out to be particularly powerful
if you have, again, as part of your
product surface, right, um maybe
user-generated configuration, right, for
their products or their product surface.
You could spin up an Artifacts repo for
every user on your platform. They never
have to know about the Git. You're not
exposing Git to them, but you're using it
to version the artifacts or the files or
the objects that they're operating
on. You can diff them like code if you
want to diff them like code. You can roll
back. You can expose that rollback
capability to users. And now you have
this way for users to maybe even flip
between versions of config and snapshot,
and make your platform a little bit more
programmable as a result. We've even
been talking about ways we want to use
this inside Cloudflare, where there's kind
of Git semantics and the way Artifacts
works, which is distributed,
um for some of our internal
services, where it'd be great if I have
this, like, Git-like place where I can
roll back to a commit that's yesterday's
for known-working um config. I'm
not even thinking about Git. It's just,
again, a really good way to express this
kind of versioning system.
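
As a small illustration of the config-versioning idea (save versions, diff them like code, roll back to a known-working one), here is a sketch using only Python's standard `difflib`. The `ConfigHistory` class is hypothetical and keeps versions in a list; a real setup would keep them as commits in a per-user repo.

```python
import difflib


class ConfigHistory:
    """Toy per-user config history (illustrative, not an Artifacts API)."""

    def __init__(self):
        self.versions = []  # list of (message, text), oldest first

    def save(self, text: str, message: str) -> int:
        self.versions.append((message, text))
        return len(self.versions) - 1  # version number

    def current(self) -> str:
        return self.versions[-1][1]

    def diff(self, old: int, new: int) -> str:
        a = self.versions[old][1].splitlines(keepends=True)
        b = self.versions[new][1].splitlines(keepends=True)
        return "".join(difflib.unified_diff(a, b, f"v{old}", f"v{new}"))

    def rollback(self, version: int) -> int:
        # History is preserved: the old content is recorded as a new
        # version instead of deleting the versions that came after it.
        return self.save(self.versions[version][1], f"rollback to v{version}")


history = ConfigHistory()
v0 = history.save("retries: 3\ntimeout: 30\n", "initial config")
v1 = history.save("retries: 5\ntimeout: 30\n", "bump retries")
patch = history.diff(v0, v1)
print(patch)

v2 = history.rollback(v0)  # back to the known-working config
```

Recording a rollback as a new version keeps the timeline append-only, so a user can still flip forward to the version they rolled back from.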
>> Yeah, I think like I we've all built
those, right? We've all gone and built
like a change log sort of thing, right?
And I guess it would totally make sense
if it was in Git. That's awesome.
You got me going. My
brain's going like crazy right now on
what we could do. And then of course
platform-managed repos is the other
one here. Uh
>> Oh yeah, that's true. Right. So like so
you're going to go you're going to go do
your Terraform stuff. That's what that's
what we're talking about here, right?
>> Yeah. Exactly.
>> Yeah. Again, you've got could be like it
could be notebooks, could be
infrastructure as code use cases, right?
Um content that's generated like web
assets or JavaScript CSS assets, right?
As like part of maybe like a web
platform you're building, right? All of
those things
>> really work when they're in, like, version
control. Again, Git is just a really
expressive way to sort of solve that.
But it's really nice. You don't have to
run the Git infrastructure. You don't have to
secure it. You don't have to scale it.
You don't have to think about how you
spawn them at runtime and clean them up,
right? Or make them durable. I think it
obviously if um you know a particular
user or uh agent is dependent on a
particular repo well like you want that
to be like distributed you want that to
be highly available um if you're then
having to replicate it and deal with all
the storage constraints and do that like
you know that is a lot of toil and work
on your teams like our job is to just
make that work all the time. If you want
to go and check that repo out it should
be available no matter where that
agent's running in the world.
>> Awesome. I I I love it. What are some of
the unlocks that this gave you that you
didn't see coming? Like uh cuz I'm
feeling it. You're telling me these
unlocks and and you're you're I didn't
think about that.
>> Yeah. So, one thing we've been starting
to use this for uh internally and
actually talked to a bunch of customers
about is in our sandbox use case, right?
And so, um you know, we've got this
internal sort of, what we call
background agents inside Cloudflare
that can be driven from chat, and you can
say, "Hey, go work on this ticket,
right? Go and solve this problem. Go
grab these logs from, like, our
Elasticsearch stack and then go figure out,
give me a first pass at, like, a diagnosis
on this problem that I'm trying to solve."
Really, really powerful, right? Kind of
these SRE-type pieces, sort of, you
know, ad hoc engineering tasks you can go
solve. There's a lot of cases where I
want to go and share a link to somebody
else of like here's a session I was just
running. I kind of got stuck or like
here's my thinking on this like what is
your thought?
>> And you know I'll get to this point in a
moment you're like okay cool great like
I've seen that before. How cool would it
be if I could just from that link go
fork and get an isolated clone of that
whole session and all the working files,
right? All the files that
>> that it was working on, all of the
potentially the config or the notes that
it's take taken or any other files from
other upstream repos at that exact point
in time and go, hey Ben, here's a here's
a version of this. Can you go like work
on this instead of this like four call
this multiplayer type environment? Turns
out artifacts is really really powerful
there because artifacts can just clone
its own repos. You can just call clone
as many times as you want on any
independent repo. You get a fully
isolated copy. And now I or say Ben can
go and work on the thing that I was
working on without touching or messing
with my workspace, right? With his own
set of privileges. He can go work on
that and go, "Hey, actually by the way,
I was working on this overnight or I was
in a different time zone and I think
I've like kind of cracked this problem
and like show me back the results,
right?" That's really really powerful.
So we can kind of separate this stuff
out. And so it's, as I said,
not just about the Git part. That's just
a really really powerful primitive, and I
think, you know, being able to speak Git
is sort of the language of agents in
many ways. But you end up being able to,
again, like, if you can just arbitrarily
copy these on the fly without having to
pull in all the, like, bigger repo
constraints or things like that, it can
just fork, fork, fork, fork. Um yeah, now
I've got this kind of multiplayer-like
concept, right? And I think that
comes back to how we built this out on
Durable Objects, which has always sort of
been inherently this sort of multiplayer
concept of coordination.
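
The isolation property being described (fork a session's repo as many times as you like, and work on a fork never touches the original) can be shown with a small sketch. The `Session` class is hypothetical, and the deep copy is only the simplest way to get isolation in memory; how Artifacts implements cloning internally is not stated here.

```python
import copy


class Session:
    """Toy workspace: a set of files plus an owner (illustrative only)."""

    def __init__(self, files=None, owner="me"):
        self.files = dict(files or {})
        self.owner = owner

    def fork(self, new_owner: str) -> "Session":
        # Deep copy so the fork shares nothing with the original workspace.
        return Session(copy.deepcopy(self.files), new_owner)


mine = Session({"notes.md": "stuck on step 3", "logs.txt": "..."}, owner="matt")

# Hand a colleague an isolated copy at this exact point in time.
bens = mine.fork("ben")
bens.files["notes.md"] = "solved: it was a stale cache"

# Or fan out the same starting point to several agents at once.
forks = [mine.fork(f"agent-{i}") for i in range(5)]

print(mine.files["notes.md"])  # the original workspace is untouched
```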
>> right and so I was that that was my
question. It's all durable objects all
the way down. This is a durable object.
>> It always is. Yeah.
>> Yeah, it always is. Can you tell me a
little bit more about the implementation
details? So, how do how do we make it
work with Git? Let's get a little nerdy
here.
>> Yeah. So, this is kind of actually how
we kind of brought this together really
really fast. You know, we've got Durable
Objects, our sort of, you know, best way
to describe it is sort of like stateful
Workers or stateful serverless
functions. I think we found that's the
best way for it to resonate. It's not
quite comparable to anything else, but
they have an embedded SQLite database.
They have state. They can spin up in any
place in the world. It's in the
name: they are durable, right? Like
the storage is
replicated under the hood. Great. Today
they speak, you know, Workers. They
speak HTTP. You can speak RPC to them.
Like, Git's another protocol. Git works
over HTTP. A Git server is
non-trivial, but we can also run Wasm
inside Durable Objects. And so the way
this came together is that um one of the
folks on the team, Matt, actually had a Zig
implementation of a Git server, and we
said, what if we ran that inside Durable
Objects so that you get
as much Git API coverage as possible. So
again, any Git client just works.
You're not thinking about, like, does this
particular Git feature work, does this
other one not work. Like, we have almost
complete coverage, as much as possible,
right? We use real Git
clients to test our API surface, which is
awesome too, so that it's fully
end-to-end tested. But it's fundamentally
a Git server running inside a Durable
Object, um and every repo is represented
by a Durable Object. The cool thing is
Git's hyper-efficient in terms of storage,
so even some of the larger repos,
like say our own workerd or
workers-sdk, are sort of 200 to 500 MB.
Something like Next.js is about two and a
half gigs. Um, so they fit really well
within sort of the 10-gig
boundary of a Durable Object. Again, the
cloning semantics is powerful if you
want to sort of clone them on the fly.
That works really well with durable
objects. And we've already kind of
proven that this can scale to, you know,
tens of millions to hundreds of millions
of objects under the hood. And so we
didn't have to go and rethink this whole
architecture in this world of agents. We
had this primitive that really, really
works and that's really efficient. Um,
there's a lot of optimization we need to
do still on the WASM layer and make that
kind of work, right? We're launching in
beta today, but we spent a lot of time
kind of testing this and banging
against it. It turns out this kind of
primitive where how can I create this
thing on the fly almost instantly and
then address it from any part in the
world. That could be a durable object, a
multiplayer server, right? It could be a
git repository that any agent in my
fleet can talk to and pull as it needs
or my 10,000 agents can go and have
10,000 repos on the fly. Whichever way I
want to sort of split that and so it's
really really powerful for us of not
having to worry about these
infrastructure concerns. I think my last
thought there is, really, all the
primitives we've been building as
part of the dev platform at
Cloudflare and part of Workers, right, let
us go and do
these things really really quickly, which
is great. We're not spending two years
building the foundations on the
infrastructure to go and ship this. We
say, "Hey, actually, I think there's a
problem we can go solve for customers."
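
As a quick back-of-the-envelope check on the sizes quoted above, the arithmetic can be written out as below. The figures are the rough numbers from the conversation, with the assumption that the "200 to 500" range is in megabytes; the constant and function names are made up for illustration.

```python
LIMIT_GB = 10.0  # per-object storage boundary quoted in the conversation

# Rough repo sizes from the conversation (assumption: 500 means 500 MB).
REPO_SIZES_GB = {
    "workers-sdk (upper estimate)": 0.5,
    "Next.js": 2.5,
}


def headroom_gb(size_gb: float, limit_gb: float = LIMIT_GB) -> float:
    """Storage left under the limit for a repo of the given size."""
    return limit_gb - size_gb


for name, size in REPO_SIZES_GB.items():
    print(f"{name}: {size} GB used, {headroom_gb(size)} GB of headroom")
```

Even the largest example uses a quarter of the quoted boundary, which is why one repo per object is a comfortable fit for these cases.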
>> Nice. So awesome. And I think people are
going to build on top of it as people
do, right? When we give them the
primitive, they start building on top of
it. They figure out what the use case is
even before we do: "That's a great use
case for it." So super excited.
>> Yeah. I'd love to see someone build, like,
you know, something in the social code
sharing space or more on the multiplayer
side, right? I think there's going to be
tons of ideas that come out of this,
and hopefully we've seeded a few of them
in the docs and in the blog posts and in
some of the examples we're working on.
We'll show some more of this stuff over
the coming weeks as well. But I'm really
really excited to see what people build.
>> Cool. Awesome. Same. Same. Thanks so
much for jumping on here and thanks
everybody for watching from home. Any
last things to drop here, Matt,
anything you want them to do, the
people watching this?
>> Yeah, go read the blog if you haven't
read the blog. That's probably the most
important blog. Today it's one of the
feature blogs. You can go to
developers.cloudflare.com/artifacts
for the docs and just go get started.
Um if you're wondering, how do I get
started? Is there a waitlist? No
waitlist. It's public beta. You can go start
today. You just need a Workers Paid
plan. That's the only restriction right
now. And eventually it'll come to
free as we work through the beta as
well.
>> Let's talk about workers paid plan
really quick. What does that mean?
>> Workers Paid. Uh great question. $5 a
month gives you access to Workers, D1,
Durable Objects, KV, pretty much
everything on the Workers platform. Tons
of included usage up front before you
even start paying above that $5. Um
and that does include Artifacts as well.
Um and so that's really the only barrier to
get started, and again, we're not going to
put you on a waitlist and make you wait.
We just want you to get started and start
building. So,
>> I know that people have been asking for
this and trying to solve problems that
this is probably going to solve for
them. So, I'm I'm super excited and I'm
so glad that it's out of redacted time
and we can talk about it. We can start
getting the conversation happening
around this. Uh, thanks everybody for
hanging out uh during agents week again.
More stuff about artifacts and all sorts
of stuff that we announced today coming
out real soon. So, uh thanks for hanging
out. Thank you Matt for being here with
us and uh we'll see you next time.