Why on Earth is OpenAI buying Windsurf?

(theahura.substack.com)

98 points | by theahura 2 hours ago

23 comments

  • _jab 1 hour ago
    A few thoughts:

    1) I agree that the moat for these companies is thin. AFAICT, auto-complete, as opposed to agentic flows, is Cursor's primary feature that attracts users. This is probably harder than the author gives it credit for; figuring out what context to provide the model is a non-obvious problem: how do you trade off latency and model quality? (A rough sketch of that tradeoff is at the end of this comment.) Nonetheless, it's been implemented enough times that it mostly comes down to how good the underlying model is.

    2) Speaking of models, I'm not sure it's been independently benchmarked yet, but GPT 4.1 on the surface looks like a reasonable contestant to back auto-complete functionality. Varun from Windsurf was even on the GPT 4.1 announcement livestream a few days ago, so it's clear Windsurf does intend to use them.

    3) This is probably a stock deal, not a cash deal. Not sure why the author is so convinced this has to be $3B in cash paid for Windsurf. AFAIK that hasn't been reported anywhere.

    4) If agentic flows do take off, data becomes a more meaningful moat. Having a platform like Cursor or Windsurf enables these companies to collect telemetry about _how_ users are coding that isn't possible just from looking at the repo, the finished product. It opens up interesting opportunities for RLHF and other methods to refine agentic flows. That could be part of the appeal here.
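    To make the context/latency tradeoff in (1) concrete, here is a minimal, hypothetical sketch (not Cursor's actual implementation; `Snippet`, `relevance`, and the chars-to-tokens heuristic are all assumptions) of packing candidate context into a fixed token budget before calling a completion model:

      from dataclasses import dataclass

      @dataclass
      class Snippet:
          text: str
          relevance: float  # e.g. embedding similarity to the code around the cursor

      def build_context(cursor_window: str, candidates: list[Snippet],
                        token_budget: int = 2048) -> str:
          # The budget is the knob: a bigger budget usually means better completions
          # but higher latency, so this is where quality is traded against speed.
          chosen: list[str] = []
          used = len(cursor_window) // 4  # rough chars-to-tokens estimate
          for snip in sorted(candidates, key=lambda s: s.relevance, reverse=True):
              cost = len(snip.text) // 4
              if used + cost > token_budget:
                  continue
              chosen.append(snip.text)
              used += cost
          return "\n\n".join(chosen + [cursor_window])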

    • theahura 1 hour ago
      I mention in the footnotes that this is likely a stock deal!

      I didn't think about telemetry for RL, that's very interesting

    • imtringued 1 hour ago
      If it's a stock deal that's even worse, since OpenAI is saying that their stock is definitely worth less than $3 billion.
  • kace91 2 hours ago
    >Some are better at the auto complete (Copilot), others at the agent flow (Claude Code). Some aim to be the best for non-technical people (Bolt or Replit), others for large enterprises (again, Copilot). Still, all of this "differentiation" ends up making a 1-2% difference in product. In fact, I can't stress enough how much the UX and core functionality of these tools is essentially identical.

    Is this exclusively referring to the UX, or to the full functionality?

    Because I can tell you straight away that Cursor (with Claude) vs Copilot is not a 1% difference. Most people in my company pay for their own Cursor license even though we have Copilot available for free.

    • Jcampuzano2 2 hours ago
      Agreed, although we're strictly prohibited from using Cursor at work in our enterprise; they have been in discussions for an enterprise license.

      I use Cursor for personal work though, and it's night and day, even with the recent Copilot agent mode additions. When my CTO asked whether we should look into Cursor, I told him straight up that in comparison Copilot is basically useless.

      • nativeit 1 hour ago
        What are the most dramatic differences?
    • tyleo 2 hours ago
      Agreed, I've used both at work and on personal projects. Copilot autocomplete is great but it isn't groundbreaking. Cursor has built near-entire features for me.

      I think copilot could get there TBH. I love most Microsoft dev tools and IDEs. But it really isn’t there yet in my opinion.

    • theahura 2 hours ago
      Can you say more?

      I was referring to UX, as that is the main product. Cursor isn't providing their own models, or at least most people that I'm aware of are bringing their own keys.

      I haven't used copilot extensively but my understanding is that they now have feature parity at the IDE level, but the underlying models aren't as good.

      • kace91 48 minutes ago
        >Can you say more?

        My experience is that Copilot is basically a better autocomplete, but anything beyond a three-liner will deviate from the current context, making the answer useless: not following the codebase's conventions, using packages that aren't present, not seeing the big picture, and so on.

        In contrast, cursor is eerily aware of its surroundings, being able to point out that your choice of naming conflicts with somewhere else, that your test is failing because of a weird config in a completely different place leaking to your suite, and so on.

        I use cursor without bringing my own keys, so it defaults to claude-3.5-sonnet. I always use it in composer mode. Though I can't tell you with full certainty the reasons for its better performance, I strongly suspect it's related to how it searches the codebase for context to provide the model with.

        It gets to the point that I'm frequently starting tasks by dropping a Jira description with some extra info directly into it and watching it work. It won't do the job by itself in one shot, but it will surface entry points, issues and small details in such a way that it's more useful to start there than from a blank slate, which is already a big plus.

        It can also be used as a rubber-duck colleague: asking it whether a design is good, about potential refactorings, bottlenecks, boy scouting, and so on.

      • rfoo 1 hour ago
        > Cursor isn't providing their own models

        For use cases demanding the most intelligent model, yes they aren't.

        However, there are cases where you just can't use the best models due to latency. For example, next-edit prediction, and applying diffs [0] generated by the super-intelligent model you decided to use. AFAIK, Cursor does use their own model for these, which is why you can't use Cursor without paying them $20/mo even if you bring your own Anthropic API key. Applying what Claude generated in Copilot is so painfully slow that I just don't want to use it.

        If you tried Cursor early on, I recommend you update your prior now. Cursor was redesigned about a year ago, and it is a completely different product compared to what they first released 2 years ago.

        [0] We may not need a model to apply diffs soon; as the Aider leaderboard shows, recent models have started to be able to generate perfect diffs that actually apply.
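        As a rough illustration of what "applying" means here (a hypothetical search/replace edit format, not Cursor's or Aider's actual code): if the model emits an edit whose "search" text matches the file exactly, applying it is a trivial string operation; you only need a fast, fuzzy "apply" model (or a retry) when the edit doesn't line up with the real file.

          def apply_edit(source: str, search: str, replace: str) -> str:
              # Apply one search/replace edit block to a file's contents.
              if search not in source:
                  # This is the case a dedicated apply model papers over: the
                  # generated edit doesn't match the file verbatim.
                  raise ValueError("edit does not apply cleanly; needs fuzzy matching")
              return source.replace(search, replace, 1)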

        • theahura 58 minutes ago
          (I most recently used Cursor in October before switching to Avante, so I suspect I've experienced the version of the tool you're talking about. I mostly didn't use the autocomplete; I mainly used the chat Q&A sidebar.)
          • rfoo 52 minutes ago
            And I pay Cursor only for autocomplete - this explains the difference I guess.

            I do sometimes use Composer (or Agent in recent versions), but it's becoming increasingly less useful in my case. Not sure why :(

          • nicoritschel 51 minutes ago
            The redesign was ~5 months ago. If you switched in October, you 100% have not used the current Cursor experience.
  • WA 1 hour ago
    Ironically, 3 billion is proof that these tools do not work as expected and won’t replace coders in the near future.

    Otherwise, why spend 3 billion if you could have it cooked up by an AI coding agent for (almost) free?

    • JambalayaJimbo 1 hour ago
      Existing customer base?
      • arczyx 30 minutes ago
        OpenAI has way more users and brand recognition than Windsurf. If they decide to make their own code editor and market it, I'm pretty sure its customer base will surpass Windsurf's relatively quickly.
  • mrcwinn 38 minutes ago
    This is poor analysis. It claims OpenAI is spending “3 of its 40” billion in raised capital on Windsurf. Who said this was an all cash deal?

    And so if you’re purchasing with equity in whole or in part, the critical question is, do you believe this product could be worth more than $3b in the future? That’s not at all a stretch.

    Cursor is awfully cozy with Anthropic as well, so if I'm OpenAI, I don't mind having a competitive product inserted into this space. This space, by the way, is at the forefront of demonstrating real value creation atop your platform.

    • theahura 33 minutes ago
      (I mention in the footnotes that this is likely a stock deal!)
  • elicksaur 1 hour ago
    >OpenAI has also announced a social media project

    I haven’t heard about this before this post, but if they’re starting a “Social Media but with AI” site in 2025, can’t help but feel like they’re cooked.

  • g8oz 2 hours ago
    Reminds me of Snowflake purchasing Streamlit. A sign of a big wallet and slowing internal execution on the part of the purchaser rather than an indication of the compelling nature of the acquisition.
    • ramraj07 1 hour ago
      800 million for Streamlit is still the most mind-blowing acquisition story I've heard. Codeium going for a few billion sounds reasonable by comparison.
    • mosdl 1 hour ago
      The snowflake marketplace was/is such a mess. I always wondered what caused them to choose streamlit.
  • consumer451 42 minutes ago
    For anyone interested, here is an interview with Windsurf CEO and co-founder, Varun Mohan. It was released today. Not sure if it covers the potential acquisition, though I imagine not.

    https://www.youtube.com/watch?v=5Z0RCxDZdrE

    • theahura 34 minutes ago
      Thanks for sharing, super interesting!
  • ramoz 51 minutes ago
    It's a vehicle that can reach the enterprise and a broad user base, bring in training data, and gain coverage of a competitor's market (I'm sure the primary LLM in Windsurf is Claude, just like it is in Cursor).

    Beyond that, these IDEs have a potential path to “vibe coding for everyone” and could possibly represent the next generation of general office tooling. Might as well start with a dedicated product team vs spinning up a new one.

  • jchonphoenix 27 minutes ago
    My guess is the telemetry data.

    OAI spends gobs of money on Mercor, and Windsurf telemetry gets them similar data. My guess is they saw their Mercor spend hitting close to $1B a year in the next 5 years if they did nothing to curb it.

  • croes 1 hour ago
    Can’t be for the software because they claim AI can do it too, so $3B should be more than enough to write it from scratch.
  • firesteelrain 2 hours ago
    Windsurf/Codeium has an enterprise version that corporations can use to provide AI-assisted coding environments on their own hardware stack (non-cloud). This is beneficial for privacy and proprietary reasons, especially if your data cannot be exfiltrated off premises. The hardware recommended to run Codeium is a lot cheaper than having 700 developers generate tokens against an API. This model has the chance to generate many paying customers. Whether that supports a $40b market cap is unclear.
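    As a purely back-of-the-envelope sketch of that comparison (every number below is an assumption for illustration, not a Codeium figure): the API bill scales with headcount and usage, while the on-prem hardware is roughly a fixed cost.

      # All figures are hypothetical, purely to show the shape of the comparison.
      DEVS = 700
      TOKENS_PER_DEV_PER_DAY = 2_000_000   # assumed heavy completion/agent usage
      WORKDAYS = 250
      USD_PER_MILLION_TOKENS = 5.0         # assumed blended API price

      annual_api_spend = DEVS * TOKENS_PER_DEV_PER_DAY * WORKDAYS / 1e6 * USD_PER_MILLION_TOKENS
      # = 700 * 2M * 250 / 1M * $5 = $1,750,000 per year at these assumptions

      SERVERS = 4                          # assumed on-prem GPU inference boxes
      USD_PER_SERVER = 250_000             # assumed, amortized over 3 years
      annual_on_prem = SERVERS * USD_PER_SERVER / 3
      # = ~$333,000 per year, ignoring power, ops, and model-quality differences

      print(f"API: ~${annual_api_spend:,.0f}/yr vs on-prem: ~${annual_on_prem:,.0f}/yr")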
    • mbreese 2 hours ago
      I don’t think the utility of Windsurf was the question. There is clearly a benefit for a tool/service like this.

      The questions raised by the article (as I saw it) were price and timing. $3B is a lot. Is that overpaying for something with a known value but limited reach? Not to mention competitors with deep pockets. And the other question is: why now? What was to be gained by OpenAI by buying Windsurf now?

      • firesteelrain 1 hour ago
        It's a Copilot competitor and it's used by Zillow, Dell, and Anduril (a newish defense company). Cursor can't work in airgapped environments right now. I don't know what Codeium charges to run an on-prem licensed version, but they boast over 1000 enterprise customers. Codeium is on a rapid growth trajectory, from $1.25b to $2.85b in a short period.

        Codeium can be fine-tuned. Though it's trained on similar open source code, they provide assurances that they do not inadvertently train on wrongly licensed software code.

        https://windsurf.com/blog/copilot-trains-on-gpl-codeium-does...

    • theahura 2 hours ago
      Thanks, I had a feeling it may be something like this since it seemed like they were investing more in enterprise. That said, do they do better than copilot on this? Surely msft has more experience and ability to execute in that market?
      • rfoo 1 hour ago
        Codeium's completion model is better than whatever GitHub Copilot has. For me it's Cursor > Codeium >>> Copilot. Yes, Copilot is that bad.

        And yes, Codeium/Windsurf focuses more on enterprise customers. As GP said, they have an on-prem [0] and a hybrid SaaS offering, plus enterprise features that just make sense (e.g. pooled credits). Their support team is more responsive (compared to Anysphere). Windsurf also "feels" more finished than Cursor.

        [0] but ultimately if you want to "vibe code" you have to call the Claude API

        • theahura 1 hour ago
          Ok thanks, that was my follow-up -- I assumed that airgapped implementations are significantly worse because they can't call back into Claude or Gemini

  • xnx 43 minutes ago
    The $3B number is largely a marketing move to show what a big/real/important company OpenAI is. I hope Windsurf got some real money out of the deal too. If ChatGPT disappeared tomorrow, people would just move to the next model.
  • phillipcarter 2 hours ago
    To me it's fairly straightforward.

    OpenAI is predominantly a consumer AI company. Anthropic has also won over developer hearts and minds since Claude 3.5. Developers are also, proportionally, the largest users of AI in an enterprise setting. OpenAI does not want to be pigeonholed into being the "ChatGPT company". And money spent now is a lot cheaper than money spent later.

    But this is all just speculation anyways.

  • apples_oranges 2 hours ago
    Isn’t the usual reason the people that work there?
  • dstroot 2 hours ago
    Currently using Claude Code and Cursor, but VSCode is copying Cursor rapidly. Not sure if the VSCode forks will survive. Ideally we'd have VSCode with a robust agent capability and a fully open "bring your own LLM" feature.
  • captn3m0 2 hours ago
    > The worst case scenario for Apple is they decide to use user data late.

    Given how heavily Apple has leaned into E2E over the years, I don't see this happening at all, beyond local on-device stuff.

  • rvz 24 minutes ago
    Because Cursor got too greedy.

    Before approaching Windsurf, OpenAI first wanted to buy Cursor (which is what I predicted too [0]), and the talks failed twice! [1]

    The fact that they approached Cursor more than once tells you they REALLY wanted to buy out Cursor. But Cursor wanted more and was raising at over $10B.

    Instead OpenAI went to Windsurf. The team at Windsurf should think carefully and they should sell because of the extreme competition, overvaluation and the current AI hype cycle.

    Both Windsurf and Cursor’s revenue can evaporate very quickly. Don’t get greedy like Cursor.

    [0] https://news.ycombinator.com/item?id=43708867

    [1] https://techcrunch.com/2025/04/17/openai-pursued-cursor-make...

  • AndrewKemendo 2 hours ago
    I was really liking Windsurf but need to look for another option now, unfortunately.

    It's a shame we can't have anything nice without it getting consumed, but such is the world.

  • whippymopp 1 hour ago
    if you look closely at the communications coming out of windsurf, I think it’s pretty obvious that the deal is not happening.
    • dagorenouf 1 hour ago
      where did you see this?
      • whippymopp 24 minutes ago
        check the windsurf subreddit. the official reps have repeatedly said it’s pure speculation
        • disgruntledphd2 15 minutes ago
          They have to say that, even if the deal is real. They might not even have been told.
        • rvz 15 minutes ago
          >pure speculation

          Or call it plausible deniability. They will always deny these reports.

          At the end of the day, Windsurf has a private price tag which they know they will sell at.

          If they're smart, they should consider selling while the hype lasts.

  • seaourfreed 1 hour ago
    I think the defensible business models in AI are up the stack. Windsurf's category is one example. There are more.

    AI will lead to far bigger work being accomplished than one prompt or chat at a time. Bigger workflows built on humans upgrading and interacting with AI will be a big, critical category for that.

  • soared 2 hours ago
    I’ve switched off of chatGPT for general use from a kind of moral/ethical standpoint. All the competitors are effectively the same for easy research questions, so I might as well use a vendor who’s not potentially a scumbag.
    • AndrewKemendo 2 hours ago
      Which one did you move to?

      I haven't found as good a turnkey chat/search/gen interface as CGPT yet, unfortunately.

      Even self-hosted DeepSeek on an Ada machine doesn't get there because the open-source interfaces are still bad

      • soared 1 hour ago
        Gemini primarily - but I'm using it for help with house projects, landscaping, shopping, etc., and not for coding. Not that the owner isn't a scumbag, but it feels better than OpenAI.
    • dangus 2 hours ago
      Which vendor isn't run by a scumbag or owned by a scumbag?
      • trollbridge 2 hours ago
        I’ve been using Grok (for free), so in theory I’m getting a vendor to spend money on me.
        • asadotzler 1 hour ago
          >Which vendor isn't run by a scumbag or owned by a scumbag? >>I've been using Grok

          The biggest scumbag of them all, but hey "I use it for free."

        • dangus 1 hour ago
          But they can count you as a user and that positively impacts their valuation.
          • nativeit 1 hour ago
            One of the many inverted incentives in this space, considering every user Grok counts is actively burning through their cash.
  • dangus 2 hours ago
    > I've always been a staunch defender of capitalism and free markets, even though that's historically been an unpopular opinion in my particular social circle. Watching the LLM market, I can't help but feel extremely vindicated. Over the last 5 years, the cost per token has been driven down relentlessly even as model quality has skyrocketed. The brutal and bruising competition between the tech giants has left nothing but riches for the average consumer.

    There's a rich irony in saying this right after explaining how Google is dominating the market and how they're involved in an antitrust lawsuit for alleged illegal monopolistic practices.

    And of course this willfully ignores the phase of capitalism we are in with the AI market right now. We all know how the story will end. Over time, AI companies will inevitably merge and the products will eventually enshittify. As companies like OpenAI look to exit they will go public or be acquired and need to greatly trim the fat in order to become profitable long-term.

    We'll start seeing AI products incorporate things like advertising, raise their prices, and hit every other negative end state we've seen with every other new technology landscape. E.g., when I get a ride from Uber, they literally display ads to me while I'm waiting for my vehicle. They didn't do that when they were okay with losing money.

    And of course, "free market" capitalism isn't really free market at all in an enviornment where there are random tariffs being applied and removed on a whim to random countries.

    I really don't understand why people feel like they need to defend capitalism like this. Capitalism doesn't need a defender; if anything, it constantly needs people restraining it.

    • probably_wrong 1 hour ago
      I had a similar thought when I reached the part about Apple. A system that punishes the player who respects their users' privacy while rewarding those who take everything that isn't nailed down is not a good system.

      The author frames Apple's choice as an own goal, but I'd rather see it as putting the failings of capitalism on display.

  • candiddevmike 2 hours ago
    I predict we will hit peak vibe coding by this summer. The tooling can't be sold at a loss forever/costs will go up for all sorts of reasons, and I think the tech debt generated by the tooling will eventually be recognized by management as velocity/error quotas start to invert. I don't think self-driving developers will happen in time, and another AI winter will settle in with the upcoming recession.
    • MostlyStable 1 hour ago
      Ok, I'm going to make a slightly controversial statement: Vibe coding is both A) potentially hugely important and transformative and B) massively good.

      Most of the criticisms of vibe coding are coming from SWEs who work on large, complicated codebases whose products are used by lots of people and for whom security, edge cases, maintainability, etc. are extremely important considerations. In this context, vibe coding is obviously a horrible idea, and we are pretty far away from AI being able to do more than slightly assist around the edges.

      But small, bespoke scripts that will be used by exactly one person and whose outputs are easily verified are actually _hugely_ important. Millions of things are probably done every single day where, if the person doing them had the skill to write up a small script, they would be massively sped up. But most people don't have that skill, and it's too expensive/there is too much friction to hire an actual programmer to solve it. AI can do these things.

      Each specific instance isn't a big deal, and won't make much productivity difference, but in aggregate the potential gains are massive, and AI is already far more than good enough to completely create these kinds of scripts. It is just going to take people a while to shift their perspective and start asking which small tasks they do every day could be scripted.

      This is the true potential of "vibe coding". Someone who can't program, but knows what they need (and how to verify that it works), making something for their personal use.
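      As a throwaway example of the kind of single-user script meant above (the file and column names are made up), something a non-programmer could ask an AI for and then verify against a quick manual spot-check:

        import csv

        # Sum the "amount" column of an exported spreadsheet (hypothetical file).
        total = 0.0
        with open("expenses.csv", newline="") as f:
            for row in csv.DictReader(f):
                total += float(row["amount"])

        print(f"Total: {total:.2f}")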

      • kjellsbells 1 hour ago
        > This is the true potential of "vibe coding". Someone who can't program, but knows what they need

        I would argue that the real money, and the gap right now, is in vibe tasking, not vibe coding.

        There are millions of knowledge workers for whom the ability to synthesize and manipulate office artifacts (excel sheets, salesforce objects, emails, tableau reports, etc) is critical. There are also lots of employees who recognise that a lot of these tasks are "bullshit jobs", and a lot of employers that would like nothing more than to automate them away. Companies like Appian try to convince CEOs that digital process automation can solve this problem, but the difficult reality is that these tasks also require a bit of flexible thinking ("what do I put in my report if the TPS data from Gary doesn't show up in time?"). This is a far bigger and more lucrative market than the one made up of people who need quick and dirty apps or scripts.

        It's also one that has had several attempts over the years to solve it. Somewhere between "keyboard automation" (macro recording, AutoHotKey type stuff) and "citizen programming" (VB type tools, power automate) and "application oriented LLM" (copilot for excel, etc) there is a killer product and a vast market waiting to escape.

        Amusingly, in my own experience, the major corps in the IT domain (msft, salesforce, etc.) all seem to be determined to silo the experience, so that the conversational LLM interface only works inside their universe. Which perhaps is the reason why vibe tasking hasn't succeeded yet. Perhaps MCP or an MCP marketplace will force a degree of openness, but it's too early to say.

      • abxyz 1 hour ago
        I am very much in favor of anything that makes software engineering more accessible. I have no principled objection to vibe coding. The problem I have with vibe coding is practical: it's producing more low quality code rather than allowing people to achieve more than they previously could.

        Almost everything I've seen achieved with vibe coding so far has been long since achievable with low / no code platforms. There is a great deal of value in the freedom that vibe coding gives (and for that reason, I am in favor of it) but the missing piece of this criticism of the criticism is that vibe coding is not the only way to write these simple scripts and it is the least reliable way.

        Vibe coding as the future is an uninspired vision of the future. The future is less code, not more.

        • gerad 1 hour ago
          After watching a salesperson and a PM vibe code, I'll say that existing developers are not the initial audience for vibe coding. Vibe coding absolutely allows non-devs to achieve more than they previously could.
          • abxyz 1 hour ago
            The issue is one of education, not possibility. There is so much hype around vibe coding that it has penetrated non-technical circles and given non-technical people the confidence to try and make things. The same people could use Zapier or Airtable or Tally or Retool or Bubble or n8n to achieve their goals but they didn't have the confidence to do so, or knowledge of the tooling.
      • theahura 1 hour ago
        Strong plus one. This is more or less what my company is working on -- more and more ostensibly nontechnical people are able to contribute to codebases with seasoned engineers.
        • pandemic_region 1 hour ago
          > ostensibly nontechnical people are able to contribute to codebases with seasoned engineers.

          Who is the contributor then? The AI or the prompt writer?

          I mean I'd be more at ease if they would just contribute their prompt instead. And then, what value does that actually have? So many mixed feelings here.

          At work I had a React dev merging Java code into a rather complex project. It was clearly heavily prompt-assisted, and looked like code a junior Java developer would have written. The difference is that a junior Java developer probably would have sweated a couple of days over that code, so she would know it inside out and could maintain it. The React dev would just write more prompts or ask the AI to do it.

          If we're confident that prompting creates good code and solid projects, well then we don't need expensive developers anymore do we?

      • croes 1 hour ago
        Here is my controversial statement:

        Vibe coding is coding like a customer hiring a programmer is coding.

        If all the code is written by AI it isn’t coding at all, it’s ordering.

      • candiddevmike 1 hour ago
        I don't think that's a $3B market.
        • consumer451 1 hour ago
          I dislike the term, but eventually "vibe coding" should replace many existing no-code/low-code platforms, right? I see that as nearly guaranteed, for many use cases.

          > Low-code and no-code development platforms allow professional as well as citizen developers to quickly and efficiently create applications in a visual software development environment. The fact that little to no coding experience is required to build applications that may be used to resolve business issues underlines the value of the technology for organizations worldwide. Unsurprisingly, the global low-code platform market is forecast to amount to approximately 65 billion U.S. dollars by 2027. [0]

          We could argue about the exact no-code TAM, but if you have a decent chance to create the market leader for the no-code replacement, $3B seems fair, doesn't it?

          [0] https://www.statista.com/topics/8461/low-code-and-no-code-pl...

          • abxyz 1 hour ago
            I disagree, it's the opposite. Low code / no code is valuable because you're deferring responsibility to a system that is developed and maintained by experts. A task running once a day on Zapier is orders of magnitude better for a business than the same task being built by someone on the marketing team with vibe coding. Low code / no code platforms have a very bright future, because they can leverage LLMs to help people create tasks with ease that are also reliable.

            LLM-enabled Zapier or Make or n8n is the future, not everyone churning out Claude-written NextJS app after NextJS app.

            • consumer451 1 hour ago
              Yeah, I don't disagree with you at all. I almost wrote a longer and more nuanced comment to begin with.

              There are many use cases for low-code. The two major ones I've dealt with are MVPs where tools like Bubble are used, and the other is creating corporate internal tools, where MS Power Platform is common.

              Corporate IT departments are allergic to custom web apps, and have a much easier time getting a Power Platform project approved due to its easily understood security implications. That low-code use case is certainly going to be the last thing a tool like Windsurf conquers.

              However, even without that use case, in an AI-heavy investment environment, $3B doesn't seem all that bad to me. That said, I have zero experience with M&A.

        • MostlyStable 1 hour ago
          I actually, literally think that these small scripts, if widely applied, are far, far bigger than that. Even solely in the US, let alone globally. I'm also relatively certain that, if they weren't sinking money into research etc., subscriptions for inference are probably already profitable. These companies are burning money because they are investing in research, not because $10/month doesn't cover the average inference costs. Although I'd love to find a better source than the speculative ones I've seen about it.
      • orbital-decay 1 hour ago
        Malleable software. This all reminds me of personal computing in the '80s, with BASIC in every machine, and environments like Emacs that are built for that.

        I think LLMs have a much better chance at this kind of software than Emacs or BASIC, but I also doubt it has any future: once AI is capable enough, you can just hide the programmatic layer entirely and tell the computer what to do.

      • luckylion 1 hour ago
        > But small, bespoke scripts that will be used by exactly one person and whose outputs are easily verified are actually _hugely_ important.

        Are they easily verified though?

        I have a bunch of people who are "vibe coding" in non-dev departments. It's amazing that it allows them to do things they otherwise couldn't, but I don't think it's accurate to say it's easily verified, unless we're talking about the most trivial tasks ("count the words in this text").

        As soon as it gets a bit more complex (but far from "complex"), it's no longer verifiable for them except "the output looks kinda like what I expected". Might still be useful for things, but how much weight do you want to put on your sales-analysis if you've verified its accuracy by "looks intuitively correct"?

    • ToValueFunfetti 1 hour ago
      I've seen a lot of AI hype, but "AI will make management recognize that tech debt is important" takes the cake. Maybe in 2040
      • Magma7404 1 hour ago
        Management realizing and saying publicly that they made a mistake? Maybe in 3025.
      • tyre 1 hour ago
        I hope you’re able to find good managers. I prioritize paying down tech debt over feature development regularly, because it makes business sense.

        Like even in a cold capitalist analysis, the benefits to developer velocity, ease of new feature development, incident response, stability, customer trust, etc. usually make it worth it.

        It doesn’t always; there are certainly areas of tech debt that bother me personally but I know aren’t worth the ROI to clean up. These become weekend projects if I want a fun win in my life, but nothing terrible happens if there’s a little friction.

        • huntertwo 1 hour ago
          How? I find it hard to get my team to reduce tech debt as an OKR since other feature work is 1) sexier for engineers to work on and 2) easier to put a concrete value on. Everybody agrees in principle that tech debt is bad.
          • tyre 38 minutes ago
            Great question. It depends on why you want to kill it.

            Sometimes it’s because there are regular bugs and on-call becomes a drag on velocity.

            Sometimes making code changes is difficult and there’s only one person who knows what going on, so you either have a bus factor risk or it limits flexibility on assigning projects / code review.

            Sometimes the system's performance is causing incidents, or will start to in the short-to-medium term.

            Sometimes incident recovery takes a long time. We had a pipeline that would take six–ten hours to run and couldn’t be restarted midway if it failed. Recovering from downtime was crazy!

            Sometimes there’s a host of features whose development timelines would be sped up by more than it would take to burn down the tech debt to unlock them.

            Sometimes a refactor would improve system performance enough to meaningfully affect the customer or reduce infra costs.

            And then…

            Sometimes you have career-driven managers and engineers who don't want to or can't make difficult long-term trade-offs, which is sometimes the way it is, and then you should consider switching teams or companies.

            So I guess my question to you is: why should you burn this down?

      • croes 1 hour ago
        They still have to figure that out for cloud software
    • mountainriver 1 hour ago
      There hasn't been an AI winter since 2008, and there sure isn't going to be one now, in spite of everyone saying it every couple of months since then.

      Also what tech debt? If you have good engineers doing the vibe coding they are just way faster. And also faster at squashing bugs.

      I was one-shotting whole features into our Rust code base with 2.5 last week. Absolutely perfect code, better than I could have written it in places.

      Then later that week o3 solved a hard bug 2 different MLEs failed to solve as well as myself.

      I have no idea why people think this stuff is bad, it’s utterly baffling to me

    • apples_oranges 1 hour ago
      IMHO: we will vibe-code with free local/cheaply hosted open-source models and IDEs. The hardware to facilitate this is coming to consumers fast. But if Microsoft can sell Office to companies for decades, then OpenAI can surely do the same for coding tools.
      • sebzim4500 1 hour ago
        Unless there is a massive change in architecture, it will always be much more cost effective to have a single cluster of GPUs running inference for many users than to have each user own hardware capable of running SOTA models but only use it for the 1% of the time when they've asked the model to do something.
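        Toy math behind that (the utilization and price below are assumptions): if a user only keeps a model busy ~1% of the time, one shared GPU's worth of capacity can serve on the order of a hundred users, so dedicated local hardware costs roughly 100x more per hour of actual use.

          UTILIZATION_PER_USER = 0.01      # assumed fraction of time a user is inferring
          USERS_PER_SHARED_GPU = 1 / UTILIZATION_PER_USER   # ~100, ignoring peak overlap

          GPU_COST_PER_HOUR = 3.0          # assumed cost of the hardware per hour
          dedicated_cost_per_busy_hour = GPU_COST_PER_HOUR / UTILIZATION_PER_USER  # $300
          shared_cost_per_busy_hour = GPU_COST_PER_HOUR                            # $3

          print(USERS_PER_SHARED_GPU, dedicated_cost_per_busy_hour, shared_cost_per_busy_hour)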
      • exitb 1 hour ago
        There are multiple orders of magnitude between the sizes of models people use for "vibe coding" and models most people can comfortably run. It will take many years to bridge that gap.
      • nativeit 1 hour ago
        > But if Microsoft can sell Office to companies for decades then open ai can surely do the same for coding tools

        This seems like a bold statement.

    • senko 52 minutes ago
      I agree on the underlying premise - the current crop of LLMs isn't good enough at coding to completely autonomously achieve a minimum quality level for actually reliable products.

      I don't see how peak vibe coding in a few months follows from that. Check the revenue and growth figures for products like Lovable ($10m+ ARR) or Bolt.new ($30m+ ARR). This doesn't show costs (they might in fact be deep in the red), but with a story like that I don't see it crashing in 3-4 months.

      On the user experience/expectation side, I can see how the overhyped claims of "build complete apps" hit a peak, but that will still leave the tools strongly positioned for "quick prototyping and experimentation". IMHO, that alone is enough to prevent a cliff drop.

      Even allowing for the peak in tool usage for coding specifically, I don't see how that causes "AI winter", since LLMs are now used in a wide variety of cases and that use is strongly growing (and uncorrelated to the whole "AI coding" market).

      Finally, "costs will go up for all sorts of reasons" claim is dubious, since the costs per token are dropping even while the models are getting better (for a quick example, cost of GPT-4.1 is roughly 50% of GPT-4o while being an improvement).

      For these reasons, if I could bet against your prediction, I'd immediately take that bet.

    • tyre 1 hour ago
      I don’t know, I’ve been using Gemini 2.5 for a bit. The daily quota caps at effectively $55/day. It’s not a ton of development but it’s definitely worth it compared to a human for projects that Claude 3.7 can’t yet wrap its mind around.

      We’ll see if Gemini 2.5 Flash is good enough, but it definitely doesn’t feel like Google is selling for a huge loss post-training.

      Yes the training is a huge investment but are they really not going to do it? Doesn’t seem optional

    • NitpickLawyer 1 hour ago
      > and another AI winter will settle in with the upcoming recession

      Oh, please. Even if every cent of VC funding dries up tomorrow we'd still have years of discovering how to use LLMs and "generative models" in general to do cool, useful stuff. And by "we" I mean everyone, at every level. The proverbial bearded dude in his mom's basement, the young college grad, phd researcher, big tech researcher, and everyone in the middle. The cat is out of the bag, and this tech is here to stay.

      The various AI winters came about for many reasons, none of which are present today. Today's tech is cool! It's also immediately useful (oAI, anthropic, goog are already selling billions of $ worth of tokens!). And it's highly transformative. The amount of innovation in the past 2 years is bonkers. And, for the first time, it's also accessible to "home users". Alpaca was to llama what the home computer was to computers. It showed that anyone can take any of the open models and train them on their downstream tasks for cheap. And guess what, everyone is doing it. From horny teens to business analysts, they're all using this, today.

      Also, as opposed to the last time (which also coincided with the .com bubble), this time the tech is supported and mainly financed by the top tech firms. VCs are not alone in this one. Between MS, goog, AMZ, Meta and even AAPL, they're all pouring billions into this. They'll want to earn their money back, so like it or not, this thing is here to stay. (hell, even IBM! is doing gen ai =)) )

      So no, AI winter is not coming.

    • behnamoh 1 hour ago
      > ... velocity/error quotas start to inverse ...

      Could you please elaborate? Is this how management (at least in your company) looks at code—as a ratio of how fast it's done over how many tests it passes?

    • yojo 1 hour ago
      The chatter around vibe coding to me feels a lot like the late 90s early 2000s FUD around outsourcing. Who would pay a high-cost American engineer when you could get 10 in South Asia for the same price? Media was forecasting an irreversible IT offshoring mega trend. Obviously, some software development did move to cheaper regions. But the US tech sector also exploded.

      For some projects (e.g. your internal-facing CRUD app), cheap code is acceptable. For a high scale consumer product, the cost of premium engineering resources is a rounding error on your profits, and even small marginal improvements can generate high value in absolute dollar terms.

      I’m sure vibe coding will eat the lowest end of software development. It will also allow the creation of software that wouldn’t have been economically viable before. But I don’t see it notably denting the high end without something close to AGI.

    • bobxmax 1 hour ago
      Airbnb just did a 1.5-year engineering migration in 6 weeks thanks to AI.

      "Vibe" coding is here to stay, and it's only devs who don't know how to adapt who are wishfully hoping otherwise.

      • firefoxd 1 hour ago
        Hold on, we aren't even good at estimating but now we know how much time we saved by vibe coding? I can't wait to read the source of this info when you share it.
      • gammarator 1 hour ago
        Whatever the term “vibe coding” is taken to mean, it assuredly doesn’t apply to a large scale migration undertaken by a professional software organization.
      • mech422 1 hour ago
        "Airbnb recently completed our first large-scale, LLM-driven code migration, updating nearly 3.5K React component test files from Enzyme to use React Testing Library (RTL) instead. We’d originally estimated this would take 1.5 years of engineering time to do by hand, but — using a combination of frontier models and robust automation — we finished the entire migration in just 6 weeks."

        Color me unimpressed - it converted some test files. It didn't design any architecture, create any databases, handle any security concerns or any of the other things programmers have to do/worry about on a daily basis. It basically did source to source translation, which has been around for 30+ years.

        • shermantanktop 1 hour ago
          If you told me five years ago that such a conversion had been done in six weeks, I would not have believed it. Even though some level of source-to-source existed. And I would definitely expect that such a conversion would have resulted in hideous, non-idiomatic code in the target language.
      • croes 1 hour ago
        Let's wait for the first wave of security bugs caused by vibe-coded software.

        I doubt that error-free code outnumbers code with errors in the training data.

      • exitb 1 hour ago
        It wasn't vibe coding; they translated their tests from one framework to another.