
The end of free compute is the beginning of better engineering

I guess we all knew this day was coming. Here are two stories to illustrate what I mean. Tell me if you spot the pattern.

Let’s start with Uber:

Uber Technologies, Inc is learning the hard way that scaling AI isn’t just about speed—it’s about cost. Despite spending $3.4 billion on research and development, the company has already exhausted its planned AI budget just months into 2026. According to The Information, Chief Technology Officer Praveen Neppalli Naga said Uber is now “back to the drawing board” after a surge in the use of AI coding tools, particularly Anthropic’s Claude Code, has blown past internal expectations.

Uber’s Anthropic AI Push Hits A Wall, Yahoo Finance

Then there’s Microsoft:

Last year, Microsoft CEO Satya Nadella revealed that the company writes up to 30% of its code using generative AI. As it now happens, Microsoft is reportedly planning to reduce the use of Anthropic’s Claude Code—a move designed to push its employees toward GitHub Copilot CLI. According to Warren’s [Tom Warren of The Verge] sources, Microsoft’s Experiences + Devices division, which includes teams working on Windows, Microsoft 365, Outlook, Teams, and Surface, is supposed to stop using Claude Code by the end of June. These teams are expected to transition their workflows to GitHub Copilot CLI over the next few weeks.

The report reveals that the decision isn’t centered on Microsoft pushing its staffers towards its own offering — there are some financial implications at play, too. Microsoft’s financial year is expected to end on June 30, which means canceling Claude Code licenses for its employees could cut its operational costs as it transitions into a new financial year.

Microsoft is ditching Claude Code for Copilot CLI—but its own devs aren’t happy, Yahoo Tech

If you’re an AI sceptic, these “AI is unaffordable” headlines will feel like vindication.
I’m sure you have theories about how the math and economics of token scaling will never work, and perhaps you’re right, but this situation is a bit more nuanced. And a big part of that story is incentives.

Take Uber. I’m sure they’re being truthful that they hit their AI budgets early, but what’s underdiscussed is how they hit those budgets. The same story in which Uber CTO Praveen Neppalli Naga admits that the company is “back to the drawing board” also notes that Uber’s roughly 5,000-strong engineering team even had internal leaderboards based on usage. Imagine you’re an engineer at Uber and you’re told to spend as much as possible to land on a leaderboard. You’d obviously burn through a serious chunk of the budget, and then management is left wondering how the money disappeared.

Microsoft, on the other hand, has different incentives. In an internal note announcing the rollback of Claude Code, Rajesh Jha, executive vice president of Microsoft’s experiences and devices group, wrote, “When we began offering both Copilot CLI and Claude Code, our goal was to learn quickly, benchmark the tools in real engineering workflows, and understand what best supported our teams”.

I guess that makes sense? Microsoft had just launched the GitHub Copilot CLI coding tool, essentially their answer to Claude Code, but that wasn’t going anywhere, so they gave their teams access to Claude Code so they could bridge the gap. This is still pretty confusing because nearly everything in Microsoft is either called 365 or Copilot (including Office, which is now Microsoft 365 Copilot).

Speaking of GitHub Copilot, here’s another set of incentives that’s changing from next week.

The all-you-can-eat AI buffet is coming to an end. Microsoft is closing the AI buffet offered to GitHub Copilot customers, acknowledging that it can’t sell AI like Red Lobster’s Endless Shrimp.
The US seafood restaurant’s all-you-can-eat shrimp promotion led the company to bankruptcy in 2024 and while Microsoft is nowhere near so financially overextended, the software giant’s code hosting biz has decided it no longer wants Copilot to operate at a loss. GitHub is therefore shifting Copilot from request-based billing to usage-based billing on June 1, 2026.

Under request-based billing, GitHub Copilot subscribers will be allowed to submit a set number of premium requests, with certain models priced at a higher request rate but without any consideration for the complexity of the request. So complex prompts that require a lot of “thinking” often cost GitHub more than the company earned in subscription fees.

Microsoft’s GitHub shifts to metered AI billing, The Register

By the way, Microsoft isn’t the only company that is moving away from the “all-you-can-eat” AI token model. Anthropic announced a similar shift in mid-April for its enterprise customers. And then a couple of weeks later, Uber’s CTO conveniently gave a quote about escalating AI costs.

So what’s Uber doing about it? Scaling back? Of course not. The story casually mentions they’re now in talks to test OpenAI’s Codex to further expand their AI stack. Methinks this whole thing was a public negotiating tactic between Uber and Anthropic.

So what do we know from all this? Anthropic and GitHub Copilot changed their pricing models to reduce subsidies for AI usage. As a result, companies are now vocally complaining that token costs are getting out of hand. Nobody is talking (at least publicly) about scaling back AI usage, which suggests either that these costs aren’t that high, or that companies believe that a Rubicon has been decisively crossed and there’s no way they’ll be able to get back to the “old” world. Thus, companies like Microsoft and Uber are switching to other similar options, like Codex or GitHub Copilot.
There’s also an additional complication: Anthropic and OpenAI are gearing up to go public later this year. I’m guessing this means they’ll try to reduce their burn, which means AI subsidies aren’t going back up anytime soon. AI is going to get a bit more expensive before it starts getting cheaper.

If this is going to be the near future, I believe it’s excellent news for AI. Until now, AI has felt practically infinite, with near-zero (or highly subsidised) cost to building anything. I think some sense of constraint will lead to much better outcomes, especially for AI. If it gets a bit more expensive to self-publish a 50,000-word book filled with slop, it stops being worth the effort from an ROI standpoint alone.

And nowhere is this news truer (or better) than in engineering and for developers. I think most people forget that engineering, as a function, is all about maximising outcomes by optimising around constraints. Go to any startup and ask them what’s stopping them from shipping faster, and the answer will invariably be, “We need more engineers”. This may be true, but it’s also a misinformed perception. Just because a company can hire more engineers doesn’t mean it should. There’s a point at which constraints become the source of creativity for how you ship products at better quality and speed.

Engineers understand this well. They’ve spent their entire careers optimising for speed, latency, memory, and physical constraints. By the way, engineers have also figured out how to optimise for bandwidth. Most engineering organisations make careful, deliberate decisions about how to size and scope projects, including story points, and much more.

My prediction is that as tokens become more expensive, engineers will start to include tokens as one of the many constraints that they’ll add to their planning and roadmaps. I think they’ll figure out how to make decisions on when to deploy agents, when to code by hand, what the tradeoffs are, and how it all comes together.
A new era of engineering is about to arrive. And it’s all possible because, finally, the free token era is going to be over. I can’t wait for it.

This week on the Zero Shot podcast

Hi! This is Vidhatri, the producer of Zero Shot. Hope everyone is having a good weekend! I certainly am, enjoying the lovely Bengaluru weather right now while also prepping for two wonderful episodes that I cannot wait to share with all of you. More on those later.

But before that, I have to talk about Claude and Anthropic again (no surprises there, we run an AI podcast!). Hear me out. This time, it’s about the company’s push into legal workflows. On 12 May, Anthropic unveiled 20+ new “MCP connectors that link Claude to the software the legal industry runs on”. Alongside this, it announced 12 “new plugins tailored to specific legal work”.

So, how does Claude impact a lawyer’s workflow? Are the legal software companies at risk now that Claude seems to be doing everything? We posed that question to Shashank Bijapur, the founder of Spotdraft, a legal AI company that works on contracts.

He, unsurprisingly, gave us an example involving contracts. Let’s say someone sends an NDA to The Ken and The New York Times. Both are media organisations, but they operate in different contexts and realities. Therefore, the redlines that come back cannot and will not be the same. Now, where does that reality come from? Shashank argues it comes from the system-of-record layer, the definitive and detailed source of information about how an organisation is structured, which companies like Spotdraft have understood over the years. So, the competition here is not with Claude. It’s about building on top of it and standing out.

Here is Shashank summarising Spotdraft’s moat: “A lot of our applications are built on top of Claude, but we supply that intelligence and brain that goes into how Claude should behave and what that response should be.”

But that’s just one layer of the conversation.
Tune in to our latest episode to understand how legal AI is growing and what really makes for a defensible business when Anthropic is doubling down on industry after industry. Find the episode on Apple Podcasts, Spotify, YouTube or The Ken app.

Source: The Ken
