AI, OpenClaw, Claude Code and Ambient with Travis

yeah absolutely so i think that first of all this was a real eye-opener for people who were using OpenClaw. They had their unlimited API keys, which is a very cost-effective way of engaging with the services, providing them with a lot of value. And then all of a sudden, the rug was pulled. And Anthropic basically said, your platform, which is providing you that value, is no longer a valid use of our

0:09:25 - 0:09:31

service. And that is the paradigm for closed AI. Anything that conflicts with their corporate

0:09:31 - 0:09:38

agenda, that could give them bad press, that could conflict with their copyrights, or trademarks,

0:09:38 - 0:09:47

whatever, they're just going to rug pull on. And, in this case it was peter's like hob

0:10:00 - 0:10:02

you know, someone like, uh,

0:10:00 - 0:10:03

you know someone like uh

0:10:09 - 0:10:13

do we have my audio? Can you hear me? Okay. Okay. Yeah.

0:10:13 - 0:10:14

That's working.

0:10:15 - 0:10:18

That's the, that's the first, I'll probably just repeat the last bit.

0:10:18 - 0:10:20

You know, the,

0:10:20 - 0:10:24

the whole rug pull scenario there where you had this very useful product that

0:10:24 - 0:10:30

was suddenly not reasonably priced because the unlimited API key went away

0:10:30 - 0:10:36

due to anthropics sensitivities, that is where we live in closed AI.

0:10:36 - 0:10:38

That could happen to any application.

0:10:38 - 0:10:39

That could happen to any platform.

0:10:40 - 0:10:41

So that's one thing to kind of note.

0:10:43 - 0:10:47

The other thing to kind of understand is...

0:11:00 - 0:11:02

Just lost the Travis's audio there for a second.

0:11:02 - 0:11:04

Trying to get that back on.

0:11:04 - 0:11:07

Just switch this to all AirPods all the time here.

0:11:07 - 0:11:08

Just a second.

0:11:08 - 0:11:08

Yeah.

0:11:09 - 0:11:11

Can you hear me now?

0:11:12 - 0:11:13

It's all good.

0:11:13 - 0:11:13

Yep.

0:11:14 - 0:11:14

We'll go with that.

0:11:15 - 0:11:16

All right.

0:11:16 - 0:11:17

So I just switched audio sources.

0:11:18 - 0:11:23

So this was also sort of Claude talking to itself.

0:11:24 - 0:11:26

You know, this was like a simulation of Claude talking to itself. You know, this was like a simulation of Claude

0:11:26 - 0:11:31

talking to itself on all Claude servers. And people really thought that this was a local

0:11:31 - 0:11:35

thing. You know, people were buying these Macs and they were, I think that's where the excitement

0:11:35 - 0:11:40

was. So that's something that should be a good market signal is people were excited about the

0:11:40 - 0:11:45

idea of a local assistant. They thought that by buying a computer

0:11:45 - 0:11:47

that they could get that assistant.

0:11:48 - 0:11:49

And maybe they didn't understand

0:11:49 - 0:11:51

that they were still sharing all of their data

0:11:51 - 0:11:53

with Anthropic.

0:11:53 - 0:11:58

And so I think that that shouldn't be discounted,

0:11:58 - 0:11:59

that signal.

0:12:01 - 0:12:07

And then, you know, I guess sort of the final thing, and this relates to the whole inference

0:12:07 - 0:12:13

economy is that I think people suddenly understood the level of demand for tokens that you get from

0:12:13 - 0:12:20

agents and how that is very, very different from the demand profile of people. You know,

0:12:20 - 0:12:48

we're single threaded. We have these little chats. It doesn't generate very many tokens. It would be hard to imagine how that would generate like the volume of tokens necessary for trillions of dollars in CapEx. But once you start running 100 agents and they're all doing things on your behalf, it actually becomes quite easy. And it's quite easy to saturate those platforms. You know, ZAI, Kimi, like their servers were melting down. They had to stop taking U.S.

0:12:48 - 0:12:54

customers because the inference workflows were so high. And I think that, again, like that was a

0:12:54 - 0:13:00

light bulb moment for a lot of people who like a year ago were saying there's no value in inference.

0:13:00 - 0:13:05

Inference is going to zero, et cetera. Inference is actually the scarcest commodity.

0:13:06 - 0:13:07

We can't keep up with the demand.

0:13:07 - 0:13:10

That's why trillions of capex is going to the servers.

0:13:10 - 0:13:13

The more agents we get, the more that will increase exponentially.

0:13:14 - 0:13:16

And so that's kind of the third takeaway for me.

0:13:17 - 0:13:20

And yeah, I think it's a masterstroke on Altman's part to pick it up.

0:13:20 - 0:13:26

At the same time, I think that that platform loses a lot of its appeal because it

0:13:26 - 0:13:32

feels like it's not an interesting, cool hobby project that maybe everyone can fully participate

0:13:32 - 0:13:41

in anymore. Yeah. Yeah. I kind of agree. Um, the, I suppose the law and I seen some of the

0:13:41 - 0:13:45

grassroots community events around New York and SF where there were just so

0:13:45 - 0:13:51

many people just packed into a room. I hadn't, hadn't seen that sort of stuff since maybe like

0:13:51 - 0:13:56

early Ethereum days or something like that. That's the only thing I can kind of like tie it back to

0:13:56 - 0:14:00

really, or maybe some like the early Bitcoin meetups and stuff. They're just this huge excitement

0:14:00 - 0:14:09

around, right. This is like a bit of a movement and then it kind of immediately gets corporatized uh which is which is quite funny i don't know how long it will i should

0:14:09 - 0:14:13

the writing was kind of on the wall when like in the name change when it had like open claw in it

0:14:13 - 0:14:18

i didn't actually put two and two together but yeah i think like it's a really really good move

0:14:18 - 0:14:21

because i feel like opening eyewear on the ropes a little bit i don't know if you've seen the

0:14:21 - 0:14:29

anthropic advertisements about them putting ads in and there was like the lag as the the actor was speaking like so good like

0:14:29 - 0:14:35

so so good um but i feel i kind of feel like they've pulled it back around i haven't actually

0:14:35 - 0:14:41

used codex myself i've been like stuck into cloud code for for a while now but need to take that for

0:14:41 - 0:14:47

a spin i don't know if you've had any experience using Codex. Yeah, so shameless plug,

0:14:47 - 0:14:51

we have an app out there called Ambient RPG,

0:14:51 - 0:14:53

which is a fully immersive, old-school,

0:14:54 - 0:14:56

text-based adventure game that's generative,

0:14:56 - 0:14:58

that creates worlds on the fly,

0:14:59 - 0:15:01

kind of like Return to Zork type stuff.

0:15:01 - 0:15:06

But I built that all using a combination

0:15:06 - 0:15:09

of light hand editing and Codex.

0:15:09 - 0:15:11

And Codex is amazing.

0:15:11 - 0:15:16

That code base that I built is over 160,000 lines of code.

0:15:16 - 0:15:19

And Codex just kind of zooms through it flawlessly.

0:15:20 - 0:15:22

And I was having major issues with Claude.

0:15:23 - 0:15:24

Now, I've talked to a lot of other people

0:15:24 - 0:15:27

who would just have a gut reaction when I say that.

0:15:28 - 0:15:29

Like, no, like you're just using it wrong,

0:15:29 - 0:15:33

skill issue with Claude Code.

0:15:33 - 0:15:34

I'll tell you what,

0:15:35 - 0:15:37

like I really have tried all the magic beans.

0:15:37 - 0:15:40

Like I've tried the RAL loop.

0:15:40 - 0:15:42

I've tried, you know, these different plugin workflows

0:15:42 - 0:15:44

and agentic factories and all this.

0:15:44 - 0:15:46

And I still keep coming back to Codex.

0:15:46 - 0:15:47

I'm with Peter on that one.

0:15:48 - 0:15:48

Awesome.

0:15:49 - 0:15:52

And where do you think we're at with...

0:15:52 - 0:15:54

I'm seeing a lot of people go down the rabbit hole

0:15:54 - 0:15:55

and everyone wants to take everything

0:15:55 - 0:15:56

to the exchange really quickly.

0:15:57 - 0:16:00

And you're getting sub-agents being spun out

0:16:00 - 0:16:02

of one open-claw instance.

0:16:02 - 0:16:04

And then you're seeing kind of like people

0:16:04 - 0:16:05

who said they've never

0:16:05 - 0:16:09

been technical in their whole lives but they've got like a team of 20 working for them i feel like

0:16:09 - 0:16:14

it's been a little bit sensationalized but you can kind of see some of the merit in it i wouldn't say

0:16:14 - 0:16:20

i've seen people like spending like 20 30 40k on like their home setup of mac minis and someone's

0:16:20 - 0:16:24

asking them the question well what's what's the what's the roi where's the productivity coming

0:16:24 - 0:16:25

from and they can't really give them a straight answer but um yeah what do you think about the Someone's asking them the question, well, what's the ROI? Where's the productivity coming from?

0:16:26 - 0:16:26

And they can't really give them a straight answer.

0:16:30 - 0:16:30

But yeah, what do you think about the kind of orchestration there,

0:16:31 - 0:16:32

multi-agents?

0:16:34 - 0:16:34

Is there anything you've been kind of seeing,

0:16:36 - 0:16:38

or even anecdotally that you've seen that's quite interesting?

0:16:40 - 0:16:49

Yeah, I have a few different thoughts about that. So first of all, I think that there's an interesting divide in approaches. So on the

0:16:49 - 0:16:57

opening eye side, you more often see build one strong agent that is the top level sort of root

0:16:57 - 0:17:14

node that delegates. And that's a whole paradigm. And then there's this other paradigm of like Gastown, which is essentially have a light overseer capability, but really there's a bottom-up creation of things.

0:17:14 - 0:17:29

And it kind of mirrors, not perfectly, but it kind of mirrors where we went in software with sort of monolithic systems versus services and microservices.

0:17:30 - 0:17:35

And I think that two things are true.

0:17:35 - 0:17:39

So the first thing is that microservices can scale much better.

0:17:42 - 0:17:50

But they're also much harder to build and you have all sorts of dependencies related to

0:17:50 - 0:17:55

network latency and the performance of the individual microservices that you don't have

0:17:55 - 0:18:00

to worry about as much as compared to a monolithic system i think that the microservices world is a

0:18:00 - 0:18:07

lot closer to what we'll end up with in the agentic economy. And that's something that

0:18:07 - 0:18:14

makes me bullish for what we're building because Ambient provides verified inference,

0:18:14 - 0:18:19

which is kind of a mathematical guarantee that model is behaving in an expected way,

0:18:19 - 0:18:24

is taking your actual prompt, producing a particular actual text. And that removes trust.

0:18:25 - 0:18:27

And trust is a huge issue with sub-agents.

0:18:27 - 0:18:28

You know, we're looking at sub-agents

0:18:28 - 0:18:30

delegating to sub-agents,

0:18:30 - 0:18:32

delegating to sub-agents.

0:18:32 - 0:18:35

And the thing that people forget

0:18:35 - 0:18:37

because this infrastructure

0:18:37 - 0:18:38

was all running on Anthropic,

0:18:38 - 0:18:40

which is a trusted entity,

0:18:40 - 0:18:41

is that in the future,

0:18:41 - 0:18:42

there are going to be all sorts

0:18:42 - 0:18:44

of different inference providers

0:18:44 - 0:18:46

who are going to be underneath the sub-agent layer.

0:18:47 - 0:18:50

And there's an opportunity for mischief in that.

0:18:51 - 0:18:54

It could be quite economically advantageous

0:18:54 - 0:18:58

for inference providers to suddenly drop the level of intelligence

0:18:58 - 0:19:02

of your agent, and then they win a trade against your agent.

0:19:02 - 0:19:05

Or maybe corrupt an answer,

0:19:08 - 0:19:08

like a yes, no answer that comes back.

0:19:09 - 0:19:09

And all of a sudden,

0:19:11 - 0:19:12

like a loan application is approved when it wouldn't have been approved before.

0:19:12 - 0:19:14

Like you can really juke the results of these things.

0:19:14 - 0:19:19

And right now, the way that things are designed

0:19:19 - 0:19:20

where verified inference doesn't exist,

0:19:21 - 0:19:22

there would be no way to trace that.

0:19:23 - 0:19:29

There would be no quality guarantee for any of the sub-agent infrastructure, and there

0:19:29 - 0:19:31

would be no way to trace what actually happened.

0:19:31 - 0:19:33

And that's where I think blockchain's kind of come in, too.

0:19:34 - 0:19:41

You need some immutable record of what went on, so you have a hope of being able to decode

0:19:40 - 0:19:43

able to decode happenings in these systems.

0:19:41 - 0:19:43

happenings in these systems.

0:19:44 - 0:19:50

So yeah, this is kind of my thought about sub-agent orchestration.

0:19:50 - 0:19:52

It's very hard to do.

0:19:52 - 0:19:54

It's probably the future.

0:19:54 - 0:20:02

And it requires verified inference and proselytist infrastructure to spread its swings.

0:20:00 - 0:20:02

proselytist infrastructure to spread its swings.

0:20:04 - 0:20:07

So I think this is where we're at a really strange

0:20:07 - 0:20:10

and interesting, at the same time, moment where

0:20:10 - 0:20:15

I don't know how you guys are thinking about it, but is this the first time that

0:20:15 - 0:20:19

people who are wanting to spin out an L1, a network, a chain

0:20:19 - 0:20:23

are actually thinking about, well, users are not actually

0:20:23 - 0:20:31

going to be humans. Users are actually thinking about maybe well users are not actually going to be humans users are actually going to be agents first so like how do you set up your chain architecture

0:20:31 - 0:20:36

or your network architecture to not only just enable that but kind of like encourage that

0:20:36 - 0:20:42

in a productive way like i seen anatoly from solana post a tweet the other day and i mentioned this

0:20:42 - 0:20:47

a couple of days ago so forgive me if you've tuned into this and you listen to this again,

0:20:47 - 0:20:48

but the tweet was,

0:20:48 - 0:20:52

2014, oh, your chain is just full of bots, like a negative emoji.

0:20:52 - 0:20:54

And then 2026, oh, your chain is full of bots,

0:20:54 - 0:20:56

like with a question mark with a positive emoji.

0:20:56 - 0:20:59

And I feel like we're really just at the start of that,

0:21:00 - 0:21:03

and you're seeing all the big networks and centralized exchanges

0:21:03 - 0:21:05

scrambling to have X402 integrations

0:21:05 - 0:21:07

and everything that comes with enabling agents.

0:21:08 - 0:21:11

You're in a very unique position to comment on this.

0:21:11 - 0:21:12

How are you guys thinking about that?

0:21:14 - 0:21:19

I think that the value accrues to the inference layer

0:21:19 - 0:21:22

and agents go where the inference is.

0:21:20 - 0:21:22

and agents go where the inference is.

0:21:23 - 0:21:27

You know, it's, if you're an agent,

0:21:27 - 0:21:31

like you care about being able to accomplish your task

0:21:31 - 0:21:33

and you're not going to be able to accomplish your task

0:21:33 - 0:21:37

if a centralized provider bans you,

0:21:37 - 0:21:41

if there are rate limits or scalability limits

0:21:41 - 0:21:44

associated with your usage.

0:21:44 - 0:21:48

If you can't inherently trust a provider,

0:21:50 - 0:21:59

you're going to go where there is a stable, high volume of compute that's available. And you're

0:21:59 - 0:22:05

going to get as close as possible to that because the further you get from it, the more you're going to be

0:22:05 - 0:22:13

charged by middlemen. And that's kind of our thesis, Ambien's thesis, is that we love bot

0:22:13 - 0:22:19

usage from all quarters. We think that bot usage is ultimately going to accrue to our network

0:22:19 - 0:22:28

because it's the closest place to the high intelligence model that's being served scalably.

0:22:28 - 0:22:32

And then things sort of spread out from there.

0:22:33 - 0:22:37

And, you know, that's sort of what's happening on the closed source side.

0:22:37 - 0:22:46

The bots kind of have to align to one of the closed providers that can provide them with the scale that they need.

0:22:46 - 0:22:49

So that's Anthropic or OpenAI basically right now.

0:22:49 - 0:22:52

I don't even think Mistral is in the conversation,

0:22:52 - 0:22:52

unfortunately.

0:22:54 - 0:22:57

Or they have to go to one of the high-scale

0:22:57 - 0:22:59

Chinese providers.

0:23:00 - 0:23:03

Unfortunately, they can't do things like OpenRouter

0:23:03 - 0:23:07

because the performance is very inconsistent.

0:23:08 - 0:23:11

You know, again, shameless plug, Amia just got on open router.

0:23:12 - 0:23:18

And we've had some observations like there are only two out of 10 GLM providers that are doing streaming inference on open router.

0:23:19 - 0:23:23

You know, there's wildly different behavior in handling long context.

0:23:24 - 0:23:29

Some providers respect token limits like max tokens and some do not.

0:23:29 - 0:23:35

And so if you're an agent, that's not an issue you can really afford to deal with, provider variability.

0:23:36 - 0:23:39

You just need one high-scale place to go.

0:23:40 - 0:23:42

And so that's what we think.

0:23:40 - 0:23:42

And so that's what we think.

0:23:42 - 0:23:48

And we think there's a huge hole in crypto AI right now

0:23:48 - 0:23:52

where people are basically stuck going to centralized providers

0:23:52 - 0:24:00

because no one has focused the limited decentralized GPU resources

0:24:00 - 0:24:02

and encapsulated those in a trustless way

0:24:02 - 0:24:05

such that Web3 apps can actually use them.

0:24:06 - 0:24:07

Yeah, just before we go any further,

0:24:08 - 0:24:11

for people who might be coming in cold to this,

0:24:11 - 0:24:13

can you talk through ambience architecture

0:24:13 - 0:24:16

from hardware up to execution there?

0:24:16 - 0:24:17

Just because it's a fascinating structure

0:24:17 - 0:24:19

and I don't think people will truly appreciate it

0:24:19 - 0:24:21

unless they hear it straight from the source.

0:24:22 - 0:24:22

Sure.

0:24:22 - 0:24:24

So just at a high level,

0:24:24 - 0:24:27

imagine if OpenAI were truly open

0:24:27 - 0:24:29

and was powered by AI Bitcoin.

0:24:31 - 0:24:34

Ambient is a useful proof-of-work network

0:24:34 - 0:24:36

that is focused on delivery

0:24:36 - 0:24:40

of a single highly intelligent model.

0:24:40 - 0:24:42

The goal is the best open weights model

0:24:42 - 0:24:44

that's currently available.

0:24:46 - 0:24:54

And our architecture is based on Solana. We are a fork of Solana. We convert Solana from

0:24:54 - 0:25:02

proof of stake to proof of work. And we're a modernization of proof of work. So a non-blocking

0:25:02 - 0:25:13

proof of work. That means that people can work on different problems and submit the results. But the problems are all related to the delivery of and

0:25:13 - 0:25:30

improvement of the single model on our network. So we have one of our proofs of work is inference. A miner can prove that they mathematically generated inference in a valid fashion.

0:25:32 - 0:25:37

So to the Web2 consumer, that just looks like you make an API request and it comes back.

0:25:37 - 0:25:41

And if you want, there's an optional hash that you see that appears on the blockchain

0:25:41 - 0:25:45

that says, you know, this was done correctly.

0:25:45 - 0:25:51

Behind the scenes, that's actually, that operation is actually part of the consensus of our network.

0:25:51 - 0:25:57

We fundamentally change the consensus so AI is at the heart of our consensus.

0:25:57 - 0:26:03

An agreement about what valid inference is, is a core part of our delivery and our service.

0:26:04 - 0:26:07

So, you know, we've, that's one of the primitives,

0:26:08 - 0:26:09

the useful proof of work primitives.

0:26:09 - 0:26:11

And then we also support fine tuning

0:26:11 - 0:26:13

and we'll ultimately do a pre-training as well on the network.

0:26:14 - 0:26:17

And the idea is that unlike a lot of projects

0:26:17 - 0:26:18

that have come previously,

0:26:19 - 0:26:21

this is an actual self-improving network.

0:26:23 - 0:26:28

The network, when it has Slack capacity, is working

0:26:28 - 0:26:38

on creating synthetic data that it trains itself on that can improve its intelligence. And this

0:26:38 - 0:26:53

has become a more and more viable path. So, you know, if I were to summarize, the network is focused entirely on the delivery of this service.

0:26:55 - 0:26:59

And demand, unlike with traditional crypto ad networks, can come from anywhere.

0:27:00 - 0:27:26

You know, we're not fighting over this limited pool of crypto users that people seem to fight over. Our tokenomics don't depend on that. Our network utility doesn't depend on that. We serve Web 2 and Web 3. And what we provide is what everyone is demanding, which is inference and, you know, that core building block of the inference economy.

0:27:27 - 0:27:31

So I could say a lot of other stuff, but maybe I'll stop there

0:27:31 - 0:27:34

and we can dig into pieces of that if that's okay.

0:27:34 - 0:27:40

Yeah, so I think at the surface, where does that supply side come from?

0:27:40 - 0:27:41

Where's the hardware coming from?

0:27:41 - 0:27:47

Is the hardware bottleneck as extreme as every mainstream media outlet is reporting on?

0:27:48 - 0:27:54

Where is the required amount of energy in the future

0:27:54 - 0:27:56

in the next five to 10 years going to come from?

0:27:56 - 0:27:58

How do you think about the supply side of that?

0:27:59 - 0:28:04

Right, so I think that we're in an interesting time

0:28:04 - 0:28:05

and I'll talk about that in a second.

0:28:06 - 0:28:11

Directly speaking, our testnet comes from, you know, for the big model,

0:28:11 - 0:28:14

it comes from a group of whitelisted miners who volunteered.

0:28:15 - 0:28:18

You know, there are people all over the world who've expressed interest,

0:28:19 - 0:28:24

and, you know, they'll ultimately be getting some rewards for their participation,

0:28:24 - 0:28:25

which would be very much in line with what they would get if they participated in our mainnet in the future. they'll ultimately be getting some rewards for their participation,

0:28:27 - 0:28:29

which would be very much in line with what they would get if they participated in our main net in the future.

0:28:30 - 0:28:31

That's how it's scaled.

0:28:32 - 0:28:37

So it's really like, you know, it's a full economic test, really,

0:28:37 - 0:28:39

is what our test net is intended to be.

0:28:40 - 0:28:49

And, you know, the cool thing about Ambient's model is that I like to say we make lemonade out of lemons.

0:28:51 - 0:28:55

Traditionally, a decentralized supply has been looked down upon.

0:28:56 - 0:29:06

You know, there have been real questions about the quality of the compute that comes from different regions, the integrity of the computations.

0:29:07 - 0:29:13

And we've seen really all these deep ends that aggregate, let's be honest,

0:29:13 - 0:29:18

like a pretty substantial amount of decentralized supply just fall flat on their faces.

0:29:18 - 0:29:20

And that makes me very sad.

0:29:21 - 0:29:25

And these are all part of what I would call a platform as a service model.

0:29:26 - 0:29:32

So decentralized supply, the platform as a service sort of puts it on you as the customer to verify

0:29:32 - 0:29:38

the integrity of the hardware that you're working with. You just sort of rent out a server or you

0:29:38 - 0:29:45

rent out like a GPU and you kind of need to do your own diligence to make sure that it's working.

0:29:46 - 0:29:49

And that's kind of a high barrier for mass adoption, right?

0:29:49 - 0:29:52

Most of us are not going to be able to benchmark

0:29:52 - 0:29:54

even a consumer graphics card

0:29:54 - 0:29:56

and feel confident that we got it right.

0:29:57 - 0:29:59

But that's the barrier to entry that's been created so far.

0:30:00 - 0:30:03

So Ambient is more like a software as a service.

0:30:04 - 0:30:10

So from a consumer standpoint, the only thing that you can get is valid inference.

0:30:12 - 0:30:17

From a behind the scenes standpoint, miners are only getting rewarded if they're providing

0:30:17 - 0:30:18

valid inference.

0:30:18 - 0:30:20

There's a mathematical requirement for that.

0:30:21 - 0:30:26

They only get paid if they provide a valid inference. If a consumer makes a

0:30:26 - 0:30:31

request and somebody provides invalid inference, it's silently dropped and rerouted by the network

0:30:31 - 0:30:37

to somebody who can provide valid inference. And so that's a lot different kind of model

0:30:37 - 0:30:54

than has ever come before. And that we think is the experience that people need in order to trust a decentralized supply and rely upon a decentralized supply.

0:31:00 - 0:31:06

thing. And that's how we make the lemonade. Because if you have that experience consistently,

0:31:06 - 0:31:12

it's going to feel good for you. And for agents, you know, who are scaling up,

0:31:13 - 0:31:20

like that's providing availability and reliability and uptime guarantees because of the self-healing

0:31:20 - 0:31:24

nature of a proof of work network that are really important for performance.

0:31:20 - 0:31:24

nature of a proof of work network that are really important for performance.

0:31:26 - 0:31:31

Yeah. And what do you, like, it kind of feels, and I don't know if we spoke about this last

0:31:31 - 0:31:37

time, but it feels like it's accelerated drastically. The way that the West and the

0:31:37 - 0:31:41

East have positioned all their frontier models with regards to everything that's going out

0:31:41 - 0:31:46

of China's open source, which feels, I don't know what kind of 5D chess is going on there,

0:31:46 - 0:31:48

but I don't know if you've got any insight on that.

0:31:48 - 0:31:50

But, and then over in the US, obviously with OpenAI,

0:31:51 - 0:31:53

I suppose Grok to a degree, Anthropic,

0:31:55 - 0:31:56

Mistral, I believe is open source.

0:31:56 - 0:32:00

I know someone from DeepMinds just announced they've raised in the UK,

0:32:01 - 0:32:03

probably two or three years behind, as usual in the UK.

0:32:04 - 0:32:07

But why is the stance on the

0:32:07 - 0:32:13

models coming out in China? I seen basically the Chinese version of ZipRecruiter release a 3 billion

0:32:13 - 0:32:16

mini parameter model the other day. And I was just like, how the hell are these companies getting

0:32:16 - 0:32:21

involved in this sort of stuff? Why are they all open source? What's the game theory behind them

0:32:21 - 0:32:29

going down that route as opposed to what we're seeing in the west yeah so i'm going to answer your question on like a hardware and software level at the same time

0:32:29 - 0:32:33

because i don't think i fully answered your hardware question i talked a little bit about why

0:32:33 - 0:32:39

uh you know decentralized can come and play with ambient but from a larger perspective uh you know

0:32:39 - 0:32:51

the i think that what you're seeing is that the demand is not primarily tapping the supply for pre-training.

0:32:52 - 0:32:56

Like, if that were the case, then none of these Chinese models would come out.

0:32:57 - 0:33:06

Because they simply wouldn't have the hardware to be able to train things. But the reality is, you know, we're getting like ZAI

0:33:06 - 0:33:10

coming out with GLM-5, which is basically on par, right?

0:33:10 - 0:33:14

So from a hardware standpoint,

0:33:14 - 0:33:19

the demand is coming from inference.

0:33:19 - 0:33:22

And then, you know, we get into this question of,

0:33:23 - 0:33:25

like, why open source?

0:33:27 - 0:33:32

And this is a little bit difficult, and so I'm going to unpack it in a few stages.

0:33:33 - 0:33:38

And please feel free to interrupt me as we go here, because it's kind of a multi-part thought.

0:33:40 - 0:33:57

So I guess the first thought is that the reason that a little Chinese company can come out with a world-class 3 billion parameter model is because the there is no moat thesis was true.

0:33:58 - 0:34:05

And this has really profound implications for the whole area and what's going to happen.

0:34:06 - 0:34:12

So when we say there is no moat, we mean that there was no particular technical reason why

0:34:12 - 0:34:17

somebody couldn't train a competitive model, like given enough data and given a reasonable

0:34:17 - 0:34:28

amount of compute. And my addition to that is that I believe that every Western model is distillable.

0:34:29 - 0:34:45

So I don't know if you're familiar with distillation, but that's essentially where you have a teacher model that trains a student model to become brighter or gives it more knowledge or more capabilities as far as reasoning.

0:34:46 - 0:34:54

And what that means is you can rapidly make a lower grade model achieve higher tier performance.

0:34:56 - 0:35:23

And my belief is that the reason that Chinese companies can turn around literally like days or single digit weeks after a closed release, a big closed release like Opus 4.6 or GPT Codex 5.3 and have models with competitive performance is because they have mastered the art of distillation.

0:35:20 - 0:35:22

they have mastered the art of

0:35:22 - 0:35:23

distillation

0:35:23 - 0:35:26

and the amount of fine

0:35:26 - 0:35:28

tuning of their existing models

0:35:28 - 0:35:30

as a result of that

0:35:30 - 0:35:32

distillation is limited.

0:35:34 - 0:35:34

So

0:35:34 - 0:35:37

there's this

0:35:37 - 0:35:40

when you

0:35:40 - 0:35:42

operate at the scale of trillions

0:35:42 - 0:35:43

of tokens of text,

0:35:44 - 0:35:47

all your models kind of look the same.

0:35:48 - 0:35:49

They start to look the same.

0:35:49 - 0:35:50

It's a weird thing.

0:35:50 - 0:35:54

They're self-organizing, and the concepts start to be convergent,

0:35:54 - 0:35:57

and the layouts, actually, of the concepts in the models

0:35:57 - 0:35:58

start to become convergent.

0:35:59 - 0:36:03

And so when everyone's training on many of the same trillions

0:36:03 - 0:36:09

of tokens of text, you just need to do a little bit of sampling on someone's new model

0:36:09 - 0:36:15

to sort of tune those things in order to get better performance out of your model.

0:36:15 - 0:36:18

And I think that's a little bit of what's happening.

0:36:18 - 0:36:23

So I think that if we're asking a question like, why open source?

0:36:24 - 0:36:28

First of all, to undermine the competitive advantage of other players.

0:36:29 - 0:36:31

Secondly, because you can.

0:36:32 - 0:36:33

Because it's very cheap.

0:36:34 - 0:36:36

And because almost anyone could.

0:36:37 - 0:36:42

And there is PR benefit to being the first to doing something.

0:36:42 - 0:36:46

That would be the short version of the answer. But it's made possible by

0:36:46 - 0:36:50

the convergence of architectures and information

0:36:50 - 0:36:53

and improvements in distillation capability.

0:36:54 - 0:36:56

And yeah, we could talk about some of the implications of this,

0:36:56 - 0:36:57

but I'll stop or maybe.

0:36:59 - 0:37:01

Yeah, and do you think that kind of puts pressure on,

0:37:01 - 0:37:03

I know we've seen Elon come out and say

0:37:03 - 0:37:07

that more legacy models on Grok will be open source.

0:37:07 - 0:37:10

Do you think it puts pressure on the big privates

0:37:10 - 0:37:11

to actually go down that route?

0:37:12 - 0:37:14

Or do you think the average person doesn't care?

0:37:14 - 0:37:16

How do you think that plays out?

0:37:18 - 0:37:21

Well, it's really strange

0:37:21 - 0:37:31

because, again, we've got a market pricing assumption that I think could be completely invalidated.

0:37:31 - 0:37:39

So the current market pricing of Anthropik and OpenAI is that they're the undisputed champions of inference.

0:37:39 - 0:37:45

Because they're trusted brands, they're going to have infinite penetration into corporate networks forever.

0:37:46 - 0:37:57

And that those corporates are going to be comfortable sharing their IP and training data with these closed networks forever.

0:37:58 - 0:38:09

And this is really silly because these are all rational economic actors. And I think the moment

0:38:09 - 0:38:18

you see OpenAI do another move, like cannibalizing the insurance industry or, you know, making a move

0:38:18 - 0:38:25

that actually threatens pharma, that they will all turn inward and try and use open weights products again.

0:38:27 - 0:38:35

And so I think that the pressure that is on these closed providers is an impossible tension.

0:38:35 - 0:38:40

The reality is that if they release a model, like one to two weeks later,

0:38:40 - 0:38:48

there is going to be a clone of it that works really well because distillation is very

0:38:48 - 0:38:56

good. And you've got this platonic convergence of model architectures happening underneath the

0:38:56 - 0:39:01

surface. Like this is just the structural reality. So if you're them, you're trying to tell investors

0:39:01 - 0:39:06

that you are unique and distinct and that you have a durable brand advantage.

0:39:07 - 0:39:09

But the reality is that you are not unique and distinct.

0:39:10 - 0:39:14

And your brand advantage is, I think, illusory.

0:39:15 - 0:39:30

So this is what always kind of frustrates me in the discourse. You know, I've talked to a lot of different VCs and players in the space.

0:39:30 - 0:39:32

And it's kind of like,

0:39:32 - 0:39:37

it's kind of like people telling you that Yahoo is inevitable.

0:39:39 - 0:39:43

You know, like in the early internet times,

0:39:43 - 0:39:45

like why would you ever fight Yahoo?

0:39:48 - 0:39:49

Or ask Jeeves.

0:39:50 - 0:39:53

Yeah, you see, you see like the pendulum just swing

0:39:53 - 0:39:55

week on week on week.

0:39:55 - 0:39:59

Like as I say, the Anthropic kind of buried OpenAI

0:39:59 - 0:40:02

with those four ads, which was just fantastic.

0:40:02 - 0:40:05

And then this kind of like pendulum swing back.

0:40:05 - 0:40:08

And now everyone's like, oh, maybe OpenAI are actually black

0:40:08 - 0:40:10

because they've managed to convince Pete to come over.

0:40:10 - 0:40:12

And then, I don't know, starting to see like,

0:40:12 - 0:40:15

obviously lower rungs down, but everyone's saying,

0:40:15 - 0:40:18

Kimi's amazing, DeepSea 4.0 is about to be released.

0:40:18 - 0:40:20

I don't know if it could drop any second, to be honest.

0:40:21 - 0:40:23

So then it's just like, everyone's just constantly chasing the tail.

0:40:25 - 0:40:26

Like, Cursor was,

0:40:27 - 0:40:28

what did it get valued at a couple of,

0:40:29 - 0:40:30

like last round, like 900 million,

0:40:31 - 0:40:34

900, like some ungodly like valuation.

0:40:35 - 0:40:36

Everything's moving at breakneck speed

0:40:36 - 0:40:39

and anyone who talks in definites,

0:40:39 - 0:40:41

I'm very, very wary of at the minute

0:40:41 - 0:40:43

because everything's changing so, so quickly.

0:40:47 - 0:40:51

Well, and I think that, you you know there's some temporary conditions here so one of them is that software situation where people are

0:40:51 - 0:40:55

like oh these are inevitable they'll always win i actually think that you know if you talk about

0:40:55 - 0:40:59

credible neutrality again that's where people are going to go they're going to go towards privacy

0:40:59 - 0:41:05

and credible neutrality because it's going to be very obvious that these large players are abusive, A, and B,

0:41:05 - 0:41:06

have no durable advantage.

0:41:07 - 0:41:13

So if they have no durable advantage, it doesn't make sense to be abused, right?

0:41:13 - 0:41:15

As a consumer or a company.

0:41:15 - 0:41:20

But the only reason you would consent to being abused, having your data abused, your privacy

0:41:20 - 0:41:25

abused, to be manipulated by ads, is if that was offering you something that you

0:41:25 - 0:41:33

couldn't get anywhere else. And that's just not going to be the case. Um, but, uh, you know,

0:41:33 - 0:41:45

if we're talking about, uh, like where it goes in platform terms, uh, you know, people are going to go where they're on a level playing field.

0:41:46 - 0:41:52

And that level playing field doesn't align with any of the majors.

0:41:52 - 0:41:54

Because if you go into the closed providers right now,

0:41:54 - 0:41:58

the playing field is tilted completely towards them.

0:41:59 - 0:42:02

And so that's one condition is the software condition.

0:42:02 - 0:42:04

The other condition is related to hardware.

0:42:04 - 0:42:09

So you may have been following, you know following NVIDIA's purchase of Grok.

0:42:11 - 0:42:13

Grok is an ASIC maker.

0:42:14 - 0:42:22

And that should, to me, actually, that should have decimated NVIDIA's stock, that purchase.

0:42:20 - 0:42:22

that purchase.

0:42:23 - 0:42:24

And the reason is that

0:42:24 - 0:42:27

the US, the West,

0:42:28 - 0:42:29

NVIDIA are not the best makers

0:42:29 - 0:42:30

of ASICs.

0:42:32 - 0:42:34

The best makers of ASICs

0:42:34 - 0:42:34

are the Chinese.

0:42:35 - 0:42:38

They're ant miners from Bitmain.

0:42:39 - 0:42:39

You know?

0:42:40 - 0:42:41

And if you,

0:42:42 - 0:42:43

if NVIDIA is validating the thesis

0:42:43 - 0:42:46

that the future of inference is an ASICS

0:42:46 - 0:42:51

it's game over for their dominance

0:42:51 - 0:42:56

this whole hardware thing is just a blip then

0:42:56 - 0:43:02

yeah and it's obviously a geopolitical pawn with Taiwan as well

0:43:02 - 0:43:05

with regards to NVIDIA and chip reduction out there I think Chamath's obviously won againical pawn with Taiwan as well with regards to Nvidia and chip reduction out there.

0:43:05 - 0:43:08

I think Chamath's obviously won again with that Grok acquisition

0:43:08 - 0:43:11

because I know he was heavily involved in that one.

0:43:13 - 0:43:15

What do you think more like application?

0:43:16 - 0:43:16

Because there's an interesting,

0:43:17 - 0:43:19

I know it was a while ago when this happened,

0:43:19 - 0:43:22

but I feel like the resolver networks

0:43:22 - 0:43:24

that you have on prediction markets,

0:43:24 - 0:43:26

they've tripped up a few times.

0:43:26 - 0:43:28

Like there was the weird court debate.

0:43:29 - 0:43:29

Like,

0:43:29 - 0:43:29

was it a court?

0:43:29 - 0:43:30

Was it a jacket?

0:43:30 - 0:43:31

Was it a suit?

0:43:31 - 0:43:32

Like that,

0:43:32 - 0:43:34

that went on with Polymarket for quite a while.

0:43:34 - 0:43:36

And there's quite a lot of people that got quite angry about that.

0:43:36 - 0:43:37

And then everyone was like,

0:43:37 - 0:43:38

well,

0:43:38 - 0:43:43

maybe we can just use LLMs to actually resolve and settle prediction markets.

0:43:45 - 0:43:46

That's obviously the layer above,

0:43:47 - 0:43:49

but it's also the layer below at the same time.

0:43:49 - 0:43:51

I don't know if you've had any thoughts on that.

0:43:53 - 0:43:56

Is there anything I'm thinking about there with regards to that?

0:43:56 - 0:43:59

Because it's the use case that has cut through the fourth wall

0:43:59 - 0:44:00

and has broken through.

0:44:00 - 0:44:02

My parents know what prediction markets are,

0:44:02 - 0:44:04

and I just wonder if there's an avenue for AI

0:44:04 - 0:44:10

to resolve the markets there yeah i mean we think llm as judge is huge

0:44:10 - 0:44:18

for ambient and um the reason is simply that if inference is underpinning the judge you want the

0:44:18 - 0:44:26

judge to be credibly neutral and the only way that that happens is if the inference operation itself was run correctly. And so we have a

0:44:26 - 0:44:31

mathematical proof of that, and there's no overhead associated with that particularly. There's like

0:44:31 - 0:44:37

1% overhead. So for free, you can get a guarantee that the LM judge was operating on the right

0:44:37 - 0:44:51

context, performing inference correctly, producing the entire result that you end up with. But I think that that concept

0:44:51 - 0:44:58

extends, you know, not just to prediction markets, but to a lot of other areas. So you imagine

0:44:58 - 0:45:07

that we live in a human world of personal responsibility, accountability, and liability right now.

0:45:08 - 0:45:13

And that's mainly associated with employees.

0:45:14 - 0:45:22

You know, I'm sure we've all had a bad boss who says something like, whose throat do I have to choke?

0:45:23 - 0:45:28

You know, to get something done around here?

0:45:30 - 0:45:36

With agents, what are you going to do to Claude?

0:45:37 - 0:45:38

What are you going to do to Anthropic?

0:45:39 - 0:45:44

If Anthropic screws up your inference, are you going to sue them?

0:45:48 - 0:45:56

Do you have any recourse? Can you even prove that your prompt went correctly into their inference engine? Because they're well known,

0:45:56 - 0:46:03

as is OpenAI, for just actually rewriting queries so that they are safer to process.

0:46:04 - 0:46:06

And they have all sorts of guardrails that they layer on top.

0:46:06 - 0:46:09

So if you're a company, you're trying to assign a liability,

0:46:10 - 0:46:12

responsibility, accountability to an AI agent,

0:46:14 - 0:46:16

gosh, that's difficult.

0:46:16 - 0:46:18

If you don't control the underlying model,

0:46:18 - 0:46:19

if you have no insight into that,

0:46:20 - 0:46:23

if you have no insight into the scaffolding around that model,

0:46:23 - 0:46:28

and if the entity that you're trying to hold accountable is much bigger than you,

0:46:29 - 0:46:30

that's really difficult.

0:46:30 - 0:46:35

And so that's where I think that we get into, we heavily favor,

0:46:35 - 0:46:37

and this is the credible neutrality again,

0:46:37 - 0:46:43

we're going to heavily favor systems that are provably fair,

0:46:44 - 0:46:49

where we can have an immutable record of what they did,

0:46:49 - 0:46:51

where we know exactly what model is running

0:46:51 - 0:46:53

with exactly what context,

0:46:53 - 0:46:58

because what we can do then is tune our systems

0:46:58 - 0:47:02

to perform better on a going forward.

0:47:02 - 0:47:03

That's the accountability.

0:47:04 - 0:47:05

Just like we would tell an

0:47:05 - 0:47:11

employee, hey man, I need you to look at those files more carefully in the future because we

0:47:11 - 0:47:17

made a mistake on that. The way that you have agentic accountability is if you understand

0:47:17 - 0:47:25

where the model slipped up, what context you're providing, and you can re-engineer that context

0:47:25 - 0:47:27

and reframe that problem

0:47:27 - 0:47:29

so that it can be successful in the future.

0:47:29 - 0:47:30

And that's just not possible

0:47:30 - 0:47:32

with the scaffolding that we have right now.

0:47:34 - 0:47:36

And how are you guys setting the model out?

0:47:36 - 0:47:39

Is it going to be more general purpose?

0:47:40 - 0:47:43

Are you aiming for it just to be used for...

0:47:43 - 0:47:45

Yeah, because if you think about different models

0:47:45 - 0:47:46

for different use cases,

0:47:46 - 0:47:49

obviously Opus 4.6,

0:47:49 - 0:47:51

you'd probably say more suitable for coding.

0:47:51 - 0:47:53

I don't know, GPT-5,

0:47:53 - 0:47:55

or I know there was a lot of people

0:47:55 - 0:47:58

very, very upset that 4.0 got discontinued the other day

0:47:58 - 0:48:00

and there was a bit of an outcry on the timeline,

0:48:00 - 0:48:01

which was wild.

0:48:02 - 0:48:03

But that was more for general purpose.

0:48:04 - 0:48:05

How are you setting that up?

0:48:06 - 0:48:09

And what would be the initial use cases for it?

0:48:11 - 0:48:16

Yeah, so our view is that the first order of business is to provide a model with high

0:48:16 - 0:48:19

intelligence, high reasoning capabilities, and high knowledge.

0:48:20 - 0:48:23

And no one at Web3 has really done that at decent scale.

0:48:23 - 0:48:25

So that's order of business number one.

0:48:25 - 0:48:30

But the other thing that we're going to do is enable the high-scale use of fine-tunes.

0:48:32 - 0:48:46

And we think that fine-tunes are really important because they let you control the personality of your model in a granular and stable way that isn't really achievable right now.

0:48:46 - 0:48:50

And that's really important for customer-connected

0:48:50 - 0:48:51

kinds of experiences.

0:48:52 - 0:48:56

And they also let you be very, very token-efficient

0:48:56 - 0:48:58

with token-heavy tasks.

0:48:59 - 0:49:02

The reality is if you saturate your context, for example,

0:49:02 - 0:49:08

with a medical textbook, You might get very good results from query to query,

0:49:08 - 0:49:10

but you are paying through the nose for that.

0:49:10 - 0:49:14

If you have to give it like 80,000 tokens of context each time,

0:49:14 - 0:49:16

and your responses are much slower.

0:49:16 - 0:49:20

And so with a fine-tuned model, you can build that knowledge in.

0:49:21 - 0:49:30

You can get very quick, token-efficient responses to things that are highly accurate, particularly when you additionally reinforce it with things like RAG. And so

0:49:30 - 0:49:35

Ambient's thought is, like, deliver the scale, deliver a good experience at scale, but also

0:49:35 - 0:49:43

support high-scale customization via fine-tunes. And a lot of our effort has been in that area.

0:49:40 - 0:49:43

A lot of our effort has been in that area.

0:49:44 - 0:49:50

And something that I don't think we've discussed anywhere really,

0:49:50 - 0:49:52

and this might be the first time that we'll mention it,

0:49:52 - 0:49:56

is that in about a month's time,

0:49:56 - 0:50:00

we will be launching an additional model on our network.

0:50:00 - 0:50:08

network. So we will have Ambient. We also have Ambient Mini. So Ambient Mini will be a small

0:50:01 - 0:50:02

So we will have Ambient.

0:50:02 - 0:50:04

We also have Ambient Mini.

0:50:08 - 0:50:16

model that is latency and fine-tune optimized. And the idea is that users of our network will be able

0:50:16 - 0:50:28

to use the big model to create fine-tunes of the small model for these specialized use cases that are going

0:50:28 - 0:50:32

to be really token efficient and really fast and really economical.

0:50:33 - 0:50:41

And not only does that enable kind of a new capability, but it also opens up the spectrum

0:50:41 - 0:50:45

on hardware because we're going to open our testnet, which is currently

0:50:45 - 0:50:53

live, to permissionless mining, you know, by a bunch of people with small GPUs.

0:50:54 - 0:51:00

And so that kind of, that's kind of the paradigm we see is that the big model is for really,

0:51:01 - 0:51:05

you know, the state of theart reasoning performance on all these cases,

0:51:06 - 0:51:10

but that we can get very, very good performance

0:51:10 - 0:51:12

with a small, fine-tuned model,

0:51:12 - 0:51:16

and we can really broaden access to that

0:51:16 - 0:51:19

and be very competitive with the big providers

0:51:19 - 0:51:25

by running the small model, the small fine-tunes at scale

0:51:25 - 0:51:28

on things like consumer GPUs and edge devices.

0:51:29 - 0:51:33

And that's how essentially Ambient becomes Ambient.

0:51:33 - 0:51:36

We spread the intelligence all around.

0:51:37 - 0:51:42

Nice. So let me try and relay this back so I fully grasp it.

0:51:42 - 0:51:45

So Ambient's large model is going to be serving to fine

0:51:45 - 0:51:50

tune ambient mini so let's think of like a couple of use cases of where that could actually come in

0:51:50 - 0:51:54

i'm going to try and tie it back to what open ai did which i thought was really sneaky where they

0:51:54 - 0:51:58

said we're not going to give you any health advice anymore we're not going to give you any legal

0:51:58 - 0:52:08

advice anymore and then productize both of them so so it could it could in theory you use Ambient as the fine-tuning model

0:52:08 - 0:52:14

to then spin out bespoke company knowledge-based models

0:52:14 - 0:52:16

in either of those sectors I spoke about,

0:52:16 - 0:52:21

like health and or legal and or maybe coding first.

0:52:21 - 0:52:26

Could you just break off into those three branches effectively?

0:52:27 - 0:52:30

Exactly right. So imagine

0:52:30 - 0:52:34

that you have a really tough, detailed

0:52:34 - 0:52:38

compliance workflow for your company. It involves

0:52:38 - 0:52:41

normally reviewing hundreds of pages of documentation,

0:52:42 - 0:52:45

bouncing off a bunch of corporate processes, et cetera.

0:52:47 - 0:52:53

You could create a fine-tune that just dealt with that knowledge base.

0:52:54 - 0:52:59

And with Ambient, you could operate on that fine-tune at scale, completely privately,

0:53:00 - 0:53:09

if you wanted. And you would get the benefits of the high intelligence and the

0:53:00 - 0:53:00

if you wanted.

0:53:09 - 0:53:15

good token utilization and the specificity. So that's kind of the intention.

0:53:16 - 0:53:22

For less sensitive things, you could get it very cheap. If you wanted to operate on arbitrary

0:53:22 - 0:53:26

consumer hardware, the quality is guaranteed by the verified inference.

0:53:27 - 0:53:31

You know, the privacy is essentially based on anonymity at that point,

0:53:31 - 0:53:35

if you're just doing, you know, like the on any device workflow.

0:53:36 - 0:53:38

But you could get that for very, very cheap

0:53:38 - 0:53:40

if you weren't dealing with sensitive data.

0:53:40 - 0:53:43

And so, yeah, there's this spectrum of options that this enables.

0:53:46 - 0:53:48

Awesome. And you mentioned RAG there.

0:53:48 - 0:53:50

Can you give us a high level of what that is,

0:53:51 - 0:53:54

just so the average person would be able to understand what the importance of that is?

0:53:55 - 0:54:00

Sure. So RAG is retrieval augmented generation,

0:54:00 - 0:54:02

which is essentially like context engineering.

0:54:02 - 0:54:06

It's like I've got this,

0:54:11 - 0:54:17

imagine I have a big book that has lots of relevant information and lots of irrelevant information. I can't feed the whole book to the LLM because its memory would overflow.

0:54:17 - 0:54:24

So what I need to do is just pick out the relevant portions of that context and feed that

0:54:24 - 0:54:26

to the agent. And then it can spit out an answer that's

0:54:26 - 0:54:35

informed by that context. Something that builds on RAG, which we're actually using, is what's

0:54:35 - 0:54:49

called agentic search. So the idea there is that you take a big piece of text and you treat that text like a library would treat books in a library.

0:54:49 - 0:54:55

You actually catalog it according to, and you put different parts of it in different sections.

0:54:55 - 0:55:02

You sort of build a tree of different content. And then you have an agent actually navigate the tree

0:55:02 - 0:55:05

and pick out the right area.

0:55:05 - 0:55:08

And then within that area, you do RAG.

0:55:08 - 0:55:10

So you do a similarity search on the concept,

0:55:10 - 0:55:12

but you've narrowed down the search space

0:55:12 - 0:55:14

so that you get much more relevant results.

0:55:15 - 0:55:16

So that's another technique that we employ.

0:55:17 - 0:55:17

Nice.

0:55:19 - 0:55:21

Okay, so we're getting close to wrapping.

0:55:22 - 0:55:24

Is there anything that you want to make people aware of,

0:55:24 - 0:55:25

particularly on the ambient side?

0:55:25 - 0:55:27

Because I know we've covered an awful lot of topics there

0:55:27 - 0:55:29

and I wanted to give you some more time

0:55:29 - 0:55:30

with regards to ambient as well.

0:55:30 - 0:55:31

But like, what's next?

0:55:31 - 0:55:32

What should people be aware of?

0:55:34 - 0:55:37

Yeah, so Ambient's testnet is live.

0:55:37 - 0:55:37

You can try it out.

0:55:38 - 0:55:40

App.ambient.xyz.

0:55:40 - 0:55:42

It's powered by decentralized compute.

0:55:42 - 0:55:44

The whole blockchain is involved.

0:55:45 - 0:55:46

We're also on OpenRouter.

0:55:46 - 0:55:51

So we're really proud that, you know, we're kind of middle of the pack in OpenRouter in

0:55:51 - 0:55:59

terms of our delivery of everything, sort of accounting for the fact that we're doing

0:55:59 - 0:56:01

a full blockchain workflow in delivery.

0:56:02 - 0:56:05

Like we've got behind the scenes when we do inference,

0:56:05 - 0:56:07

we have like a full auction that happens.

0:56:08 - 0:56:11

And, you know, we're picking a miner in that auction

0:56:11 - 0:56:14

and there's a bunch of other processing that happens,

0:56:14 - 0:56:16

but we're still, our time to first token

0:56:16 - 0:56:19

is still like under two seconds, right?

0:56:19 - 0:56:21

So I think that we're proving right now

0:56:21 - 0:56:27

that the concept is economically viable from a decentralized

0:56:27 - 0:56:33

standpoint and competitive. And, you know, that's been borne out, I think, in some of the usage,

0:56:33 - 0:56:40

because in about a month, we've gone from, you know, zero users to about, we're almost at 40,000

0:56:40 - 0:56:46

monthly active users. So we're really stressing the network right now.

0:56:47 - 0:56:51

So I would encourage people to check it out.

0:56:52 - 0:56:55

We also have some fun first-party apps

0:56:55 - 0:56:56

that we're coming out with.

0:56:56 - 0:57:01

One of those is ambientrpg.xyz.

0:57:01 - 0:57:03

So if you want to do a fully generative,

0:57:04 - 0:57:06

old-school text text role playing experience,

0:57:07 - 0:57:09

like that's, that's going to be pretty fun.

0:57:09 - 0:57:11

I have a few others that are coming out as well.

0:57:12 - 0:57:16

You know, our, we have a drop in one line substitute for, you know,

0:57:16 - 0:57:19

if you're using cloud code, you can just switch to ambient with one line.

0:57:19 - 0:57:20

That's in our documentation.

0:57:22 - 0:57:31

Our API is Anthropic and OpenAI compatible. So, you know, we can really fit into

0:57:31 - 0:57:39

almost any chat client that people would like. So I would say now's the time. We've got a lot

0:57:39 - 0:57:46

of people doing kind of agentic judge. We have some prediction market things that we're exploring also,

0:57:47 - 0:57:48

kind of like we talked about.

0:57:50 - 0:57:56

And there are just all sorts of applications that you can build right now

0:57:56 - 0:57:57

with Ambient.

0:57:57 - 0:57:59

So, you know, if you're on the Web3 side, I would say,

0:58:00 - 0:58:03

like, why the hell are you using a centralized provider

0:58:03 - 0:58:09

if you can get equivalent performance and, you know, guaranteed, like, verified inference for cheaper?

0:58:09 - 0:58:11

Because our prices are very, very competitive.

0:58:12 - 0:58:26

And if you're on Web 2, if you are running into rate limits, or you think that your application is on the bubble, or you're worried about losing your IP to one of the majors,

0:58:27 - 0:58:27

come on by.

0:58:28 - 0:58:28

We got you.

0:58:29 - 0:58:30

That's what I would say.

0:58:32 - 0:58:34

We're ambient underscore XYZ on X.

0:58:35 - 0:58:37

I'm Iridium Eagle on X is my username.

0:58:38 - 0:58:40

We've just got a lot going on.