Baseten lands $300 million to power the next wave of AI inference
Baseten, a fast-rising infrastructure startup focused on running and scaling AI inference, has raised a $300 million funding round led by growth investors IVP and CapitalG. The new capital makes the company one of the best-financed challengers to Together AI and the other players racing to become the default platform for deploying and serving large-scale AI models in production.
The deal underscores how investors are shifting attention from headline-grabbing foundation model builders to the less glamorous but mission‑critical layer of infrastructure that actually runs those models reliably for enterprises.
From model hype to inference infrastructure
Over the past two years, funding in generative AI has largely focused on companies training frontier large language models (LLMs). But as more organizations move beyond experiments and into real‑world deployments, the bottleneck is increasingly at the inference layer: how to serve models with low latency, predictable cost and enterprise‑grade reliability.
Baseten is squarely targeting that pain point. Rather than building its own foundation models, the company provides a managed platform that lets teams deploy, scale and monitor LLMs and other machine learning models across cloud infrastructure.
What Baseten actually does
At its core, Baseten offers a full stack for production AI:
- Optimized GPU and accelerator orchestration to run inference workloads efficiently.
- Automatic scaling to handle spiky traffic without over‑provisioning.
- Support for multiple open‑source models and proprietary models, giving enterprises flexibility.
- Built‑in observability, logging and performance monitoring for AI endpoints.
- Security and access‑control features tailored to large organizations.
For customers, the pitch is simple: plug into Baseten instead of building and maintaining complex, costly inference infrastructure in‑house.
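To make that pitch concrete, deployments on platforms of this kind are typically described declaratively. The fragment below is modeled on the config format of Truss, Baseten's open-source model packaging library; the field names follow its published schema, but the specific values (model name, accelerator type, resource sizes) are illustrative assumptions, not a verified deployment:

```yaml
# Illustrative Truss-style config for packaging a model for deployment.
# Values are assumptions for the sake of example.
model_name: example-llm-endpoint
python_version: py311
requirements:
  - torch
  - transformers
resources:
  use_gpu: true
  accelerator: A10G   # hypothetical choice; depends on model size and latency targets
  cpu: "2"
  memory: 8Gi
```

The point of a declarative config like this is that the platform, not the customer's team, owns the orchestration details: provisioning the GPU, installing dependencies, and wiring up the serving endpoint.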
A direct challenge to Together AI and other inference players
The size of the round and the profile of backers signal that Baseten is being positioned as a serious rival to Together AI, which has rapidly gained attention by offering cloud‑hosted LLMs, high‑performance inference, and APIs tailored to developers.
While Together AI has leaned heavily into providing hosted models and training infrastructure, Baseten is sharpening its focus on the operational realities of running inference at scale inside enterprises. That includes tighter integration with existing DevOps, MLOps and security workflows, as well as support for customers that need to mix open‑source, third‑party and proprietary models.
The competition is not limited to a single rival. Baseten is entering a crowded field that includes:
- Cloud hyperscalers offering native AI inference services.
- Specialized providers focused on GPU leasing and bare‑metal infrastructure.
- Developer‑first platforms providing hosted LLM APIs and fine‑tuning tools.
Yet the new funding suggests investors believe there is room for a dedicated, vendor‑neutral inference layer that sits above raw compute but below application‑level tools.
Why IVP and CapitalG are betting big on inference
The participation of IVP and CapitalG — two of the most prominent late‑stage investors in technology — is a strong signal about where they see durable value emerging in the AI stack.
IVP has a history of backing category‑defining infrastructure companies, while CapitalG, Alphabet’s independent growth fund, brings deep experience with large‑scale data and cloud computing. Their involvement gives Baseten not only capital but also strategic support in navigating partnerships, go‑to‑market and large‑enterprise adoption.
For investors, the thesis is clear: as more organizations integrate AI assistants, recommendation systems and automation tools into core workflows, spending on inference — the ongoing cost of running models — will dwarf the one‑time cost of training them. Owning a key piece of that infrastructure could be enormously valuable.
Enterprise‑grade AI as a service
A growing number of large companies are wary of sending sensitive data to public LLM APIs or being locked into a single vendor’s ecosystem. Baseten is leaning into this concern by emphasizing:
- Flexible deployment models, including virtual private cloud and region‑specific setups.
- Fine‑grained access controls, audit logs and compliance‑ready architecture.
- Support for data governance and integration with existing security tooling.
This enterprise‑first stance could differentiate Baseten from more developer‑centric platforms and make it attractive to regulated industries such as financial services, healthcare and the public sector.
How the new funding will be used
With $300 million in fresh capital, Baseten is expected to accelerate across several fronts:
- R&D on inference optimization: Investing in better model quantization, caching and routing algorithms to reduce latency and cost.
- Global infrastructure expansion: Building out multi‑region capacity to meet data residency and performance needs.
- Enterprise sales and support: Scaling customer success, solution engineering and compliance teams.
- Platform ecosystem: Deepening integrations with popular MLOps, data platforms and application frameworks.
By focusing on performance and reliability, Baseten aims to become the default choice for teams that need to run production‑grade AI services around the clock.
The broader race for AI infrastructure dominance
The funding round highlights a broader shift in the AI market. As the number of capable open‑source models grows and enterprises demand more control, the real differentiation is moving toward:
- How efficiently workloads can be run on scarce GPU resources.
- How predictably costs can be managed at scale.
- How seamlessly AI can be embedded into existing products and processes.
In this context, companies like Baseten and Together AI are not just competing with each other; they are also shaping what the standard stack for production AI will look like over the next decade.
If Baseten can convert its new war chest into technical and commercial momentum, it may emerge as one of the pivotal infrastructure providers underpinning the everyday AI experiences that users and enterprises increasingly take for granted.

