Boys and girls, women, children, me and you
The dice are loaded Boys and girls, women, children, me and you.
Even a hobo would tell you this.
Welcome to hard times and feeling low. Do you like sinning? No. Where you will be before you go. We got lots of gambling. Oh, and we're telling lies
You're certainly welcome to hard times Take a look in my eyes
Besides the bright blinking lights
Stretched out in front of me.
I wonder if you'll notice, would you even care?
If I told you my life just isn't fair.
Well you will be before you go.
Oh, and we're telling lies.
You're certainly welcome to hard times.
Hope you're feeling welcome to Hard Times.
No idea where we got that song from.
Am I in for Hard Times, Grant?
We did a more markets focused outlook episode about an hour ago, two hours hours ago so we we had okay that makes sense more
now yeah yeah kind of a theme going on there but uh travis i think it was probably over a year ago
now but how the how the hell have you been i've been great man it's uh it's been uh fun seeing
some of the ai stuff uh materializing kind of the way I was expecting.
Maybe not so fun watching the markets materialize in unexpected ways, but.
Actually, let's use this because I've got a very good friend of mine who's at Figma.
So the front end design UI UX type software.
So we have a back and forth constantly going on about how,
obviously SaaS has been having a huge sell-off.
I think the idea from, we've built a lot of our CRM
from the ground up internally now.
There's a lot of bloat on Salesforce side and HubSpot side
that we just didn't need a lot of the bells and whistles
and it cost an awful lot of money.
We can now use things like CloudCode
and actually build a CRM outreach tool,
kind of Twitter scraper to see the most interesting projects
coming online and introduce ourselves in that way.
Do you think we're just in this romanticized time now
where that is possible or do you think that's sustainable?
And what do you think happens's sustainable and what do you
think happens with sas markets with regards to how everything's coming online with regards to
cloud code and things like that it's a great question so let's unpack it a little bit uh if
we look at uh the different products that are out there know, one that's taken like a real kicking has been
Atlassian, you know, Jira. We've all suffered through Jira. It started as a really good
product and it's just, I think it's gotten worse and worse over the years. And now I think people
are just cobbling together their own alternatives and it isn't that hard. You know, you just,
the thing about Jira that was special for enterprises
was that it had infinite customizability.
But that came at the cost of a lack of usability.
So if people just make their own specific workflows
from a Vibe coding standpoint, then that becomes a real threat
to their business. And I actually do believe that we're going to see an incredible commodification
of SaaS software. And that is the exact opposite of the story that we were hearing like a year ago when we were talking.
What we're hearing a year ago is that inference is going to zero.
There's no value in inference.
You better just build value-added apps.
That's where all the value will accrue.
If you are lucky, you can build in a niche that the big players won't get into.
And if you're unlucky, then they're just going to gobble you up because they'll copy all your stuff.
That was the message we heard.
And I think that's where all the VC investments went as well.
People were just heavily investing in LLM wrappers.
investing in LLM wrappers. And, you know, it's funny to see the market do like a complete 180
on that and then almost pretend like it never happened. So, you know, what I think is that
inference is always the driver.
We're in an inference economy, and the inference platform is the most important thing.
And from our standpoint, it's really important
that that platform be neutral and friendly
to businesses and consumers.
And we don't think that the current platforms meet that definition, really. So,
you know, to answer your question, I think SaaS is going to be kind of wiped out going forward.
I think we will see all custom software. But I think that to the extent that platforms can be created for credibly neutral high-scale inference that
they're going to benefit yeah so you you mentioned something there about kind of like credibly neutral
platforms in the age of i feel like this is the most perfect time to speak about this
it couldn't have actually landed any better is we've obviously seen OpenClaw,
like a series of iterative names
because it couldn't actually align all the domains
because people were sniping them.
So, but we've landed on OpenClaw now,
open source platform and protocol built by Peter.
And basically now you've had Anthropic kind of throw their toys out the pram a little
bit from a legal side and chase them down, cause a name change.
I'd argue that they were one of the largest contributors to a lot of Anthropic API keys
getting pinged constantly.
And they were probably making a load of money off people adopting OpenClaw.
But I think like they created this hugely hostile environment
where Sam Altman from OpenAI has smelt a little bit of blood in the water,
put an arm around Peter and said,
hey, why don't you come over here?
We're going to just put Open Claw into a foundation.
He's a nice little paycheck.
I feel like Anthropic have shot themselves in the foot a little bit when they had the momentum over the past few weeks yeah like from the position of
what you guys are building out like what how do you see that through your lens because
there's quite a lot to unpack there but i feel like i'll just give you the floor
yeah absolutely so i think that first of all this was a real eye-opener for people who were using OpenClaw. They had their unlimited API keys, which is a very cost-effective way of engaging with the services, providing them with a lot of value. And then all of a sudden, the rug was pulled. And Anthropic basically said, your platform, which is providing you that value, is no longer a valid use of our
service. And that is the paradigm for closed AI. Anything that conflicts with their corporate
agenda, that could give them bad press, that could conflict with their copyrights, or trademarks,
whatever, they're just going to rug pull on. And, in this case it was peter's like hob
you know, someone like, uh,
do we have my audio? Can you hear me? Okay. Okay. Yeah.
That's the, that's the first, I'll probably just repeat the last bit.
the whole rug pull scenario there where you had this very useful product that
was suddenly not reasonably priced because the unlimited API key went away
due to anthropics sensitivities, that is where we live in closed AI.
That could happen to any application.
That could happen to any platform.
So that's one thing to kind of note.
The other thing to kind of understand is...
Just lost the Travis's audio there for a second.
Trying to get that back on.
Just switch this to all AirPods all the time here.
So I just switched audio sources.
So this was also sort of Claude talking to itself.
You know, this was like a simulation of Claude talking to itself. You know, this was like a simulation of Claude
talking to itself on all Claude servers. And people really thought that this was a local
thing. You know, people were buying these Macs and they were, I think that's where the excitement
was. So that's something that should be a good market signal is people were excited about the
idea of a local assistant. They thought that by buying a computer
that they could get that assistant.
And maybe they didn't understand
that they were still sharing all of their data
And so I think that that shouldn't be discounted,
And then, you know, I guess sort of the final thing, and this relates to the whole inference
economy is that I think people suddenly understood the level of demand for tokens that you get from
agents and how that is very, very different from the demand profile of people. You know,
we're single threaded. We have these little chats. It doesn't generate very many tokens. It would be hard to imagine how that would generate like the volume of tokens necessary for trillions of dollars in CapEx. But once you start running 100 agents and they're all doing things on your behalf, it actually becomes quite easy. And it's quite easy to saturate those platforms. You know, ZAI, Kimi, like their servers were melting down. They had to stop taking U.S.
customers because the inference workflows were so high. And I think that, again, like that was a
light bulb moment for a lot of people who like a year ago were saying there's no value in inference.
Inference is going to zero, et cetera. Inference is actually the scarcest commodity.
We can't keep up with the demand.
That's why trillions of capex is going to the servers.
The more agents we get, the more that will increase exponentially.
And so that's kind of the third takeaway for me.
And yeah, I think it's a masterstroke on Altman's part to pick it up.
At the same time, I think that that platform loses a lot of its appeal because it
feels like it's not an interesting, cool hobby project that maybe everyone can fully participate
in anymore. Yeah. Yeah. I kind of agree. Um, the, I suppose the law and I seen some of the
grassroots community events around New York and SF where there were just so
many people just packed into a room. I hadn't, hadn't seen that sort of stuff since maybe like
early Ethereum days or something like that. That's the only thing I can kind of like tie it back to
really, or maybe some like the early Bitcoin meetups and stuff. They're just this huge excitement
around, right. This is like a bit of a movement and then it kind of immediately gets corporatized uh which is which is quite funny i don't know how long it will i should
the writing was kind of on the wall when like in the name change when it had like open claw in it
i didn't actually put two and two together but yeah i think like it's a really really good move
because i feel like opening eyewear on the ropes a little bit i don't know if you've seen the
anthropic advertisements about them putting ads in and there was like the lag as the the actor was speaking like so good like
so so good um but i feel i kind of feel like they've pulled it back around i haven't actually
used codex myself i've been like stuck into cloud code for for a while now but need to take that for
a spin i don't know if you've had any experience using Codex. Yeah, so shameless plug,
we have an app out there called Ambient RPG,
which is a fully immersive, old-school,
text-based adventure game that's generative,
that creates worlds on the fly,
kind of like Return to Zork type stuff.
But I built that all using a combination
of light hand editing and Codex.
That code base that I built is over 160,000 lines of code.
And Codex just kind of zooms through it flawlessly.
And I was having major issues with Claude.
Now, I've talked to a lot of other people
who would just have a gut reaction when I say that.
Like, no, like you're just using it wrong,
skill issue with Claude Code.
like I really have tried all the magic beans.
Like I've tried the RAL loop.
I've tried, you know, these different plugin workflows
and agentic factories and all this.
And I still keep coming back to Codex.
I'm with Peter on that one.
And where do you think we're at with...
I'm seeing a lot of people go down the rabbit hole
and everyone wants to take everything
to the exchange really quickly.
And you're getting sub-agents being spun out
of one open-claw instance.
And then you're seeing kind of like people
been technical in their whole lives but they've got like a team of 20 working for them i feel like
it's been a little bit sensationalized but you can kind of see some of the merit in it i wouldn't say
i've seen people like spending like 20 30 40k on like their home setup of mac minis and someone's
asking them the question well what's what's the what's the roi where's the productivity coming
from and they can't really give them a straight answer but um yeah what do you think about the Someone's asking them the question, well, what's the ROI? Where's the productivity coming from?
And they can't really give them a straight answer.
But yeah, what do you think about the kind of orchestration there,
Is there anything you've been kind of seeing,
or even anecdotally that you've seen that's quite interesting?
Yeah, I have a few different thoughts about that. So first of all, I think that there's an interesting divide in approaches. So on the
opening eye side, you more often see build one strong agent that is the top level sort of root
node that delegates. And that's a whole paradigm. And then there's this other paradigm of like Gastown, which is essentially have a light overseer capability, but really there's a bottom-up creation of things.
And it kind of mirrors, not perfectly, but it kind of mirrors where we went in software with sort of monolithic systems versus services and microservices.
And I think that two things are true.
So the first thing is that microservices can scale much better.
But they're also much harder to build and you have all sorts of dependencies related to
network latency and the performance of the individual microservices that you don't have
to worry about as much as compared to a monolithic system i think that the microservices world is a
lot closer to what we'll end up with in the agentic economy. And that's something that
makes me bullish for what we're building because Ambient provides verified inference,
which is kind of a mathematical guarantee that model is behaving in an expected way,
is taking your actual prompt, producing a particular actual text. And that removes trust.
And trust is a huge issue with sub-agents.
You know, we're looking at sub-agents
delegating to sub-agents,
delegating to sub-agents.
And the thing that people forget
because this infrastructure
was all running on Anthropic,
which is a trusted entity,
there are going to be all sorts
of different inference providers
who are going to be underneath the sub-agent layer.
And there's an opportunity for mischief in that.
It could be quite economically advantageous
for inference providers to suddenly drop the level of intelligence
of your agent, and then they win a trade against your agent.
Or maybe corrupt an answer,
like a yes, no answer that comes back.
like a loan application is approved when it wouldn't have been approved before.
Like you can really juke the results of these things.
And right now, the way that things are designed
where verified inference doesn't exist,
there would be no way to trace that.
There would be no quality guarantee for any of the sub-agent infrastructure, and there
would be no way to trace what actually happened.
And that's where I think blockchain's kind of come in, too.
You need some immutable record of what went on, so you have a hope of being able to decode
able to decode happenings in these systems.
happenings in these systems.
So yeah, this is kind of my thought about sub-agent orchestration.
It's probably the future.
And it requires verified inference and proselytist infrastructure to spread its swings.
proselytist infrastructure to spread its swings.
So I think this is where we're at a really strange
and interesting, at the same time, moment where
I don't know how you guys are thinking about it, but is this the first time that
people who are wanting to spin out an L1, a network, a chain
are actually thinking about, well, users are not actually
going to be humans. Users are actually thinking about maybe well users are not actually going to be humans users are actually going to be agents first so like how do you set up your chain architecture
or your network architecture to not only just enable that but kind of like encourage that
in a productive way like i seen anatoly from solana post a tweet the other day and i mentioned this
a couple of days ago so forgive me if you've tuned into this and you listen to this again,
2014, oh, your chain is just full of bots, like a negative emoji.
And then 2026, oh, your chain is full of bots,
like with a question mark with a positive emoji.
And I feel like we're really just at the start of that,
and you're seeing all the big networks and centralized exchanges
scrambling to have X402 integrations
and everything that comes with enabling agents.
You're in a very unique position to comment on this.
How are you guys thinking about that?
I think that the value accrues to the inference layer
and agents go where the inference is.
and agents go where the inference is.
You know, it's, if you're an agent,
like you care about being able to accomplish your task
and you're not going to be able to accomplish your task
if a centralized provider bans you,
if there are rate limits or scalability limits
associated with your usage.
If you can't inherently trust a provider,
you're going to go where there is a stable, high volume of compute that's available. And you're
going to get as close as possible to that because the further you get from it, the more you're going to be
charged by middlemen. And that's kind of our thesis, Ambien's thesis, is that we love bot
usage from all quarters. We think that bot usage is ultimately going to accrue to our network
because it's the closest place to the high intelligence model that's being served scalably.
And then things sort of spread out from there.
And, you know, that's sort of what's happening on the closed source side.
The bots kind of have to align to one of the closed providers that can provide them with the scale that they need.
So that's Anthropic or OpenAI basically right now.
I don't even think Mistral is in the conversation,
Or they have to go to one of the high-scale
Unfortunately, they can't do things like OpenRouter
because the performance is very inconsistent.
You know, again, shameless plug, Amia just got on open router.
And we've had some observations like there are only two out of 10 GLM providers that are doing streaming inference on open router.
You know, there's wildly different behavior in handling long context.
Some providers respect token limits like max tokens and some do not.
And so if you're an agent, that's not an issue you can really afford to deal with, provider variability.
You just need one high-scale place to go.
And so that's what we think.
And so that's what we think.
And we think there's a huge hole in crypto AI right now
where people are basically stuck going to centralized providers
because no one has focused the limited decentralized GPU resources
and encapsulated those in a trustless way
such that Web3 apps can actually use them.
Yeah, just before we go any further,
for people who might be coming in cold to this,
can you talk through ambience architecture
from hardware up to execution there?
Just because it's a fascinating structure
and I don't think people will truly appreciate it
unless they hear it straight from the source.
imagine if OpenAI were truly open
and was powered by AI Bitcoin.
Ambient is a useful proof-of-work network
that is focused on delivery
of a single highly intelligent model.
The goal is the best open weights model
that's currently available.
And our architecture is based on Solana. We are a fork of Solana. We convert Solana from
proof of stake to proof of work. And we're a modernization of proof of work. So a non-blocking
proof of work. That means that people can work on different problems and submit the results. But the problems are all related to the delivery of and
improvement of the single model on our network. So we have one of our proofs of work is inference. A miner can prove that they mathematically generated inference in a valid fashion.
So to the Web2 consumer, that just looks like you make an API request and it comes back.
And if you want, there's an optional hash that you see that appears on the blockchain
that says, you know, this was done correctly.
Behind the scenes, that's actually, that operation is actually part of the consensus of our network.
We fundamentally change the consensus so AI is at the heart of our consensus.
An agreement about what valid inference is, is a core part of our delivery and our service.
So, you know, we've, that's one of the primitives,
the useful proof of work primitives.
And then we also support fine tuning
and we'll ultimately do a pre-training as well on the network.
And the idea is that unlike a lot of projects
that have come previously,
this is an actual self-improving network.
The network, when it has Slack capacity, is working
on creating synthetic data that it trains itself on that can improve its intelligence. And this
has become a more and more viable path. So, you know, if I were to summarize, the network is focused entirely on the delivery of this service.
And demand, unlike with traditional crypto ad networks, can come from anywhere.
You know, we're not fighting over this limited pool of crypto users that people seem to fight over. Our tokenomics don't depend on that. Our network utility doesn't depend on that. We serve Web 2 and Web 3. And what we provide is what everyone is demanding, which is inference and, you know, that core building block of the inference economy.
So I could say a lot of other stuff, but maybe I'll stop there
and we can dig into pieces of that if that's okay.
Yeah, so I think at the surface, where does that supply side come from?
Where's the hardware coming from?
Is the hardware bottleneck as extreme as every mainstream media outlet is reporting on?
Where is the required amount of energy in the future
in the next five to 10 years going to come from?
How do you think about the supply side of that?
Right, so I think that we're in an interesting time
and I'll talk about that in a second.
Directly speaking, our testnet comes from, you know, for the big model,
it comes from a group of whitelisted miners who volunteered.
You know, there are people all over the world who've expressed interest,
and, you know, they'll ultimately be getting some rewards for their participation,
which would be very much in line with what they would get if they participated in our mainnet in the future. they'll ultimately be getting some rewards for their participation,
which would be very much in line with what they would get if they participated in our main net in the future.
So it's really like, you know, it's a full economic test, really,
is what our test net is intended to be.
And, you know, the cool thing about Ambient's model is that I like to say we make lemonade out of lemons.
Traditionally, a decentralized supply has been looked down upon.
You know, there have been real questions about the quality of the compute that comes from different regions, the integrity of the computations.
And we've seen really all these deep ends that aggregate, let's be honest,
like a pretty substantial amount of decentralized supply just fall flat on their faces.
And that makes me very sad.
And these are all part of what I would call a platform as a service model.
So decentralized supply, the platform as a service sort of puts it on you as the customer to verify
the integrity of the hardware that you're working with. You just sort of rent out a server or you
rent out like a GPU and you kind of need to do your own diligence to make sure that it's working.
And that's kind of a high barrier for mass adoption, right?
Most of us are not going to be able to benchmark
even a consumer graphics card
and feel confident that we got it right.
But that's the barrier to entry that's been created so far.
So Ambient is more like a software as a service.
So from a consumer standpoint, the only thing that you can get is valid inference.
From a behind the scenes standpoint, miners are only getting rewarded if they're providing
There's a mathematical requirement for that.
They only get paid if they provide a valid inference. If a consumer makes a
request and somebody provides invalid inference, it's silently dropped and rerouted by the network
to somebody who can provide valid inference. And so that's a lot different kind of model
than has ever come before. And that we think is the experience that people need in order to trust a decentralized supply and rely upon a decentralized supply.
thing. And that's how we make the lemonade. Because if you have that experience consistently,
it's going to feel good for you. And for agents, you know, who are scaling up,
like that's providing availability and reliability and uptime guarantees because of the self-healing
nature of a proof of work network that are really important for performance.
nature of a proof of work network that are really important for performance.
Yeah. And what do you, like, it kind of feels, and I don't know if we spoke about this last
time, but it feels like it's accelerated drastically. The way that the West and the
East have positioned all their frontier models with regards to everything that's going out
of China's open source, which feels, I don't know what kind of 5D chess is going on there,
but I don't know if you've got any insight on that.
But, and then over in the US, obviously with OpenAI,
I suppose Grok to a degree, Anthropic,
Mistral, I believe is open source.
I know someone from DeepMinds just announced they've raised in the UK,
probably two or three years behind, as usual in the UK.
But why is the stance on the
models coming out in China? I seen basically the Chinese version of ZipRecruiter release a 3 billion
mini parameter model the other day. And I was just like, how the hell are these companies getting
involved in this sort of stuff? Why are they all open source? What's the game theory behind them
going down that route as opposed to what we're seeing in the west yeah so i'm going to answer your question on like a hardware and software level at the same time
because i don't think i fully answered your hardware question i talked a little bit about why
uh you know decentralized can come and play with ambient but from a larger perspective uh you know
the i think that what you're seeing is that the demand is not primarily tapping the supply for pre-training.
Like, if that were the case, then none of these Chinese models would come out.
Because they simply wouldn't have the hardware to be able to train things. But the reality is, you know, we're getting like ZAI
coming out with GLM-5, which is basically on par, right?
So from a hardware standpoint,
the demand is coming from inference.
And then, you know, we get into this question of,
And this is a little bit difficult, and so I'm going to unpack it in a few stages.
And please feel free to interrupt me as we go here, because it's kind of a multi-part thought.
So I guess the first thought is that the reason that a little Chinese company can come out with a world-class 3 billion parameter model is because the there is no moat thesis was true.
And this has really profound implications for the whole area and what's going to happen.
So when we say there is no moat, we mean that there was no particular technical reason why
somebody couldn't train a competitive model, like given enough data and given a reasonable
amount of compute. And my addition to that is that I believe that every Western model is distillable.
So I don't know if you're familiar with distillation, but that's essentially where you have a teacher model that trains a student model to become brighter or gives it more knowledge or more capabilities as far as reasoning.
And what that means is you can rapidly make a lower grade model achieve higher tier performance.
And my belief is that the reason that Chinese companies can turn around literally like days or single digit weeks after a closed release, a big closed release like Opus 4.6 or GPT Codex 5.3 and have models with competitive performance is because they have mastered the art of distillation.
they have mastered the art of
tuning of their existing models
operate at the scale of trillions
all your models kind of look the same.
They start to look the same.
They're self-organizing, and the concepts start to be convergent,
and the layouts, actually, of the concepts in the models
start to become convergent.
And so when everyone's training on many of the same trillions
of tokens of text, you just need to do a little bit of sampling on someone's new model
to sort of tune those things in order to get better performance out of your model.
And I think that's a little bit of what's happening.
So I think that if we're asking a question like, why open source?
First of all, to undermine the competitive advantage of other players.
Secondly, because you can.
And because almost anyone could.
And there is PR benefit to being the first to doing something.
That would be the short version of the answer. But it's made possible by
the convergence of architectures and information
and improvements in distillation capability.
And yeah, we could talk about some of the implications of this,
Yeah, and do you think that kind of puts pressure on,
I know we've seen Elon come out and say
that more legacy models on Grok will be open source.
Do you think it puts pressure on the big privates
to actually go down that route?
Or do you think the average person doesn't care?
How do you think that plays out?
Well, it's really strange
because, again, we've got a market pricing assumption that I think could be completely invalidated.
So the current market pricing of Anthropik and OpenAI is that they're the undisputed champions of inference.
Because they're trusted brands, they're going to have infinite penetration into corporate networks forever.
And that those corporates are going to be comfortable sharing their IP and training data with these closed networks forever.
And this is really silly because these are all rational economic actors. And I think the moment
you see OpenAI do another move, like cannibalizing the insurance industry or, you know, making a move
that actually threatens pharma, that they will all turn inward and try and use open weights products again.
And so I think that the pressure that is on these closed providers is an impossible tension.
The reality is that if they release a model, like one to two weeks later,
there is going to be a clone of it that works really well because distillation is very
good. And you've got this platonic convergence of model architectures happening underneath the
surface. Like this is just the structural reality. So if you're them, you're trying to tell investors
that you are unique and distinct and that you have a durable brand advantage.
But the reality is that you are not unique and distinct.
And your brand advantage is, I think, illusory.
So this is what always kind of frustrates me in the discourse. You know, I've talked to a lot of different VCs and players in the space.
it's kind of like people telling you that Yahoo is inevitable.
You know, like in the early internet times,
like why would you ever fight Yahoo?
Yeah, you see, you see like the pendulum just swing
Like as I say, the Anthropic kind of buried OpenAI
with those four ads, which was just fantastic.
And then this kind of like pendulum swing back.
And now everyone's like, oh, maybe OpenAI are actually black
because they've managed to convince Pete to come over.
And then, I don't know, starting to see like,
obviously lower rungs down, but everyone's saying,
Kimi's amazing, DeepSea 4.0 is about to be released.
I don't know if it could drop any second, to be honest.
So then it's just like, everyone's just constantly chasing the tail.
what did it get valued at a couple of,
like last round, like 900 million,
900, like some ungodly like valuation.
Everything's moving at breakneck speed
and anyone who talks in definites,
I'm very, very wary of at the minute
because everything's changing so, so quickly.
Well, and I think that, you you know there's some temporary conditions here so one of them is that software situation where people are
like oh these are inevitable they'll always win i actually think that you know if you talk about
credible neutrality again that's where people are going to go they're going to go towards privacy
and credible neutrality because it's going to be very obvious that these large players are abusive, A, and B,
have no durable advantage.
So if they have no durable advantage, it doesn't make sense to be abused, right?
As a consumer or a company.
But the only reason you would consent to being abused, having your data abused, your privacy
abused, to be manipulated by ads, is if that was offering you something that you
couldn't get anywhere else. And that's just not going to be the case. Um, but, uh, you know,
if we're talking about, uh, like where it goes in platform terms, uh, you know, people are going to go where they're on a level playing field.
And that level playing field doesn't align with any of the majors.
Because if you go into the closed providers right now,
the playing field is tilted completely towards them.
And so that's one condition is the software condition.
The other condition is related to hardware.
So you may have been following, you know following NVIDIA's purchase of Grok.
And that should, to me, actually, that should have decimated NVIDIA's stock, that purchase.
NVIDIA are not the best makers
They're ant miners from Bitmain.
if NVIDIA is validating the thesis
that the future of inference is an ASICS
it's game over for their dominance
this whole hardware thing is just a blip then
yeah and it's obviously a geopolitical pawn with Taiwan as well
with regards to NVIDIA and chip reduction out there I think Chamath's obviously won againical pawn with Taiwan as well with regards to Nvidia and chip reduction out there.
I think Chamath's obviously won again with that Grok acquisition
because I know he was heavily involved in that one.
What do you think more like application?
Because there's an interesting,
I know it was a while ago when this happened,
but I feel like the resolver networks
that you have on prediction markets,
they've tripped up a few times.
Like there was the weird court debate.
that went on with Polymarket for quite a while.
And there's quite a lot of people that got quite angry about that.
And then everyone was like,
maybe we can just use LLMs to actually resolve and settle prediction markets.
That's obviously the layer above,
but it's also the layer below at the same time.
I don't know if you've had any thoughts on that.
Is there anything I'm thinking about there with regards to that?
Because it's the use case that has cut through the fourth wall
My parents know what prediction markets are,
and I just wonder if there's an avenue for AI
to resolve the markets there yeah i mean we think llm as judge is huge
for ambient and um the reason is simply that if inference is underpinning the judge you want the
judge to be credibly neutral and the only way that that happens is if the inference operation itself was run correctly. And so we have a
mathematical proof of that, and there's no overhead associated with that particularly. There's like
1% overhead. So for free, you can get a guarantee that the LM judge was operating on the right
context, performing inference correctly, producing the entire result that you end up with. But I think that that concept
extends, you know, not just to prediction markets, but to a lot of other areas. So you imagine
that we live in a human world of personal responsibility, accountability, and liability right now.
And that's mainly associated with employees.
You know, I'm sure we've all had a bad boss who says something like, whose throat do I have to choke?
You know, to get something done around here?
With agents, what are you going to do to Claude?
What are you going to do to Anthropic?
If Anthropic screws up your inference, are you going to sue them?
Do you have any recourse? Can you even prove that your prompt went correctly into their inference engine? Because they're well known,
as is OpenAI, for just actually rewriting queries so that they are safer to process.
And they have all sorts of guardrails that they layer on top.
So if you're a company, you're trying to assign a liability,
responsibility, accountability to an AI agent,
If you don't control the underlying model,
if you have no insight into that,
if you have no insight into the scaffolding around that model,
and if the entity that you're trying to hold accountable is much bigger than you,
And so that's where I think that we get into, we heavily favor,
and this is the credible neutrality again,
we're going to heavily favor systems that are provably fair,
where we can have an immutable record of what they did,
where we know exactly what model is running
with exactly what context,
because what we can do then is tune our systems
to perform better on a going forward.
That's the accountability.
Just like we would tell an
employee, hey man, I need you to look at those files more carefully in the future because we
made a mistake on that. The way that you have agentic accountability is if you understand
where the model slipped up, what context you're providing, and you can re-engineer that context
so that it can be successful in the future.
And that's just not possible
with the scaffolding that we have right now.
And how are you guys setting the model out?
Is it going to be more general purpose?
Are you aiming for it just to be used for...
Yeah, because if you think about different models
you'd probably say more suitable for coding.
or I know there was a lot of people
very, very upset that 4.0 got discontinued the other day
and there was a bit of an outcry on the timeline,
But that was more for general purpose.
How are you setting that up?
And what would be the initial use cases for it?
Yeah, so our view is that the first order of business is to provide a model with high
intelligence, high reasoning capabilities, and high knowledge.
And no one at Web3 has really done that at decent scale.
So that's order of business number one.
But the other thing that we're going to do is enable the high-scale use of fine-tunes.
And we think that fine-tunes are really important because they let you control the personality of your model in a granular and stable way that isn't really achievable right now.
And that's really important for customer-connected
And they also let you be very, very token-efficient
The reality is if you saturate your context, for example,
with a medical textbook, You might get very good results from query to query,
but you are paying through the nose for that.
If you have to give it like 80,000 tokens of context each time,
and your responses are much slower.
And so with a fine-tuned model, you can build that knowledge in.
You can get very quick, token-efficient responses to things that are highly accurate, particularly when you additionally reinforce it with things like RAG. And so
Ambient's thought is, like, deliver the scale, deliver a good experience at scale, but also
support high-scale customization via fine-tunes. And a lot of our effort has been in that area.
A lot of our effort has been in that area.
And something that I don't think we've discussed anywhere really,
and this might be the first time that we'll mention it,
is that in about a month's time,
we will be launching an additional model on our network.
network. So we will have Ambient. We also have Ambient Mini. So Ambient Mini will be a small
We also have Ambient Mini.
model that is latency and fine-tune optimized. And the idea is that users of our network will be able
to use the big model to create fine-tunes of the small model for these specialized use cases that are going
to be really token efficient and really fast and really economical.
And not only does that enable kind of a new capability, but it also opens up the spectrum
on hardware because we're going to open our testnet, which is currently
live, to permissionless mining, you know, by a bunch of people with small GPUs.
And so that kind of, that's kind of the paradigm we see is that the big model is for really,
you know, the state of theart reasoning performance on all these cases,
but that we can get very, very good performance
with a small, fine-tuned model,
and we can really broaden access to that
and be very competitive with the big providers
by running the small model, the small fine-tunes at scale
on things like consumer GPUs and edge devices.
And that's how essentially Ambient becomes Ambient.
We spread the intelligence all around.
Nice. So let me try and relay this back so I fully grasp it.
So Ambient's large model is going to be serving to fine
tune ambient mini so let's think of like a couple of use cases of where that could actually come in
i'm going to try and tie it back to what open ai did which i thought was really sneaky where they
said we're not going to give you any health advice anymore we're not going to give you any legal
advice anymore and then productize both of them so so it could it could in theory you use Ambient as the fine-tuning model
to then spin out bespoke company knowledge-based models
in either of those sectors I spoke about,
like health and or legal and or maybe coding first.
Could you just break off into those three branches effectively?
Exactly right. So imagine
that you have a really tough, detailed
compliance workflow for your company. It involves
normally reviewing hundreds of pages of documentation,
bouncing off a bunch of corporate processes, et cetera.
You could create a fine-tune that just dealt with that knowledge base.
And with Ambient, you could operate on that fine-tune at scale, completely privately,
if you wanted. And you would get the benefits of the high intelligence and the
good token utilization and the specificity. So that's kind of the intention.
For less sensitive things, you could get it very cheap. If you wanted to operate on arbitrary
consumer hardware, the quality is guaranteed by the verified inference.
You know, the privacy is essentially based on anonymity at that point,
if you're just doing, you know, like the on any device workflow.
But you could get that for very, very cheap
if you weren't dealing with sensitive data.
And so, yeah, there's this spectrum of options that this enables.
Awesome. And you mentioned RAG there.
Can you give us a high level of what that is,
just so the average person would be able to understand what the importance of that is?
Sure. So RAG is retrieval augmented generation,
which is essentially like context engineering.
imagine I have a big book that has lots of relevant information and lots of irrelevant information. I can't feed the whole book to the LLM because its memory would overflow.
So what I need to do is just pick out the relevant portions of that context and feed that
to the agent. And then it can spit out an answer that's
informed by that context. Something that builds on RAG, which we're actually using, is what's
called agentic search. So the idea there is that you take a big piece of text and you treat that text like a library would treat books in a library.
You actually catalog it according to, and you put different parts of it in different sections.
You sort of build a tree of different content. And then you have an agent actually navigate the tree
and pick out the right area.
And then within that area, you do RAG.
So you do a similarity search on the concept,
but you've narrowed down the search space
so that you get much more relevant results.
So that's another technique that we employ.
Okay, so we're getting close to wrapping.
Is there anything that you want to make people aware of,
particularly on the ambient side?
Because I know we've covered an awful lot of topics there
and I wanted to give you some more time
with regards to ambient as well.
What should people be aware of?
Yeah, so Ambient's testnet is live.
It's powered by decentralized compute.
The whole blockchain is involved.
We're also on OpenRouter.
So we're really proud that, you know, we're kind of middle of the pack in OpenRouter in
terms of our delivery of everything, sort of accounting for the fact that we're doing
a full blockchain workflow in delivery.
Like we've got behind the scenes when we do inference,
we have like a full auction that happens.
And, you know, we're picking a miner in that auction
and there's a bunch of other processing that happens,
but we're still, our time to first token
is still like under two seconds, right?
So I think that we're proving right now
that the concept is economically viable from a decentralized
standpoint and competitive. And, you know, that's been borne out, I think, in some of the usage,
because in about a month, we've gone from, you know, zero users to about, we're almost at 40,000
monthly active users. So we're really stressing the network right now.
So I would encourage people to check it out.
We also have some fun first-party apps
that we're coming out with.
One of those is ambientrpg.xyz.
So if you want to do a fully generative,
old-school text text role playing experience,
like that's, that's going to be pretty fun.
I have a few others that are coming out as well.
You know, our, we have a drop in one line substitute for, you know,
if you're using cloud code, you can just switch to ambient with one line.
That's in our documentation.
Our API is Anthropic and OpenAI compatible. So, you know, we can really fit into
almost any chat client that people would like. So I would say now's the time. We've got a lot
of people doing kind of agentic judge. We have some prediction market things that we're exploring also,
kind of like we talked about.
And there are just all sorts of applications that you can build right now
So, you know, if you're on the Web3 side, I would say,
like, why the hell are you using a centralized provider
if you can get equivalent performance and, you know, guaranteed, like, verified inference for cheaper?
Because our prices are very, very competitive.
And if you're on Web 2, if you are running into rate limits, or you think that your application is on the bubble, or you're worried about losing your IP to one of the majors,
We're ambient underscore XYZ on X.
I'm Iridium Eagle on X is my username.
We've just got a lot going on.
And for folks who are interested in mining,
as I mentioned, in about a month,
we're going to be doing a fully permissionless mining
on the network, which is going to really blow the doors open
as far as these fine tunes and smaller model use cases.
So I think there's a lot happening.
Well, Travis, thanks so much for your time.
I can only imagine how busy you are,
but I think we'll do it again. It definitely won't how busy you are, but I think we'll do it again.
It definitely won't be a year this time.
I think we should do every six months or so
because I love your insight into why the industry's heading.
You've got a very unique perspective.
I really appreciate it, Jim Brown.
Oh yeah, thank you so much, Grant.
I really appreciate your time too.
Josh, Henry, take it away.