Companies are doing lots of shiny new information technology projects based on this year’s hottest new tech, Artificial Intelligence! And 80% of these fail — double the failure rate of other …
And there’s hardly any way to start small and improve upon it.
With regular code, I can write a small solution and improve it piece by piece. But with AI, it’s more or less a gamble whether the results will ever get better at all. Maybe you only need to rephrase the prompt slightly, or maybe the task is completely impossible. But you don’t know that. You can only try.
Just fyi, that’s not entirely true. If we’re just focusing on LLMs, structured and guided generation exists. Combine that with an eval set (= unit tests), and you can at least track how well you’re doing. For sure, prompt engineering lacks the feeling of being in control. You’ll also never be able to claim 100% coverage (although even with unit tests that’s not something you can claim, as there are always blind spots). What you gain over traditional coding, however, is that you can tackle problems that might otherwise take an infinite number of years to express in code. For example, how would you define the rules for detecting whether an image shows a bird?
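To make the eval-set idea a bit more concrete, here is a rough sketch of what I mean, not a real pipeline. Everything in it is made up for illustration: the three-item `EVAL_SET`, the label names, and `stub_model`, which stands in for an actual LLM call so the snippet runs offline. The idea is only that you force the output into a fixed structure, reject anything that doesn't parse, and track the pass rate over time like you would a test suite.

```python
import json
from typing import Callable, Optional

# Hypothetical eval set: (input text, expected label). Purely illustrative.
EVAL_SET = [
    ("Invoice overdue by 30 days", "billing"),
    ("Cannot log in after password reset", "account"),
    ("App crashes when I open settings", "bug"),
]

ALLOWED_LABELS = {"billing", "account", "bug"}


def parse_structured(raw: str) -> Optional[str]:
    """Return the label if raw is a JSON object with an allowed label, else None."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    label = data.get("label")
    if isinstance(label, str) and label in ALLOWED_LABELS:
        return label
    return None


def run_eval(model: Callable[[str], str]) -> dict:
    """Run the model over the eval set and count passes and malformed outputs."""
    passed = 0
    malformed = 0
    for text, expected in EVAL_SET:
        label = parse_structured(model(text))
        if label is None:
            malformed += 1
        elif label == expected:
            passed += 1
    total = len(EVAL_SET)
    return {
        "total": total,
        "passed": passed,
        "malformed": malformed,
        "pass_rate": passed / total,
    }


def stub_model(text: str) -> str:
    """Stand-in for a real LLM call (assumption): simple keyword rules."""
    lowered = text.lower()
    if "invoice" in lowered:
        return json.dumps({"label": "billing"})
    if "log in" in lowered:
        return json.dumps({"label": "account"})
    return "sorry, I am not sure"  # deliberately malformed output


if __name__ == "__main__":
    print(run_eval(stub_model))
```

Swap `stub_model` for whatever actually calls your model, rerun after every prompt change, and you at least see whether a tweak helped or broke something.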
It’s just a tool like any other. Overuse is currently detestably rife. But its value is there.
Source: ML engineer who secretly hates a lot about ML but is also in awe at the developments of the last few years.
And how often do you need to detect images of birds with an unknown accuracy?
That’s what many tech bros don’t seem to understand: much of the software in this world is boring business crap, and that software mainly needs reliability and explainability. You can’t just throw a product around that poses an incalculable risk. And often enough the specification of these apps is an amalgamation of decades of cruft, and needs to be changed and tweaked often in tiny ways.
I mean, there are certainly cases where AI products have their uses, but those seem to be very small niches.
I agree with you. I just wanted to share some nuance. The point I wanted to make is that it is in fact possible to incorporate LLMs in a fairly controlled way while calculating (estimates of) the risk of failure as well as the associated social and financial costs. I do it every day, but I’m no tech bro and dislike the ‘AI will fix everything’ types as much as everyone here.
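For what "calculating (estimates of) the risk of failure" can look like in practice, here is a minimal sketch under my own assumptions, not a description of any particular production setup. It takes the failure count from an eval run, puts a Wilson score interval around the failure rate, and multiplies the bounds by a request volume and a cost per failure. The function names, the 95% level (`z = 1.96`), and all the numbers in the usage lines are hypothetical.

```python
import math
from typing import Tuple


def wilson_interval(failures: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a failure rate observed as failures out of n trials."""
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= failures <= n:
        raise ValueError("failures must be between 0 and n")
    p = failures / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def expected_cost_bounds(
    failures: int, n: int, requests: int, cost_per_failure: float
) -> Tuple[float, float]:
    """Lower and upper expected cost, assuming each failure costs the same amount."""
    low, high = wilson_interval(failures, n)
    return low * requests * cost_per_failure, high * requests * cost_per_failure


if __name__ == "__main__":
    # Hypothetical numbers: 5 failures in a 200-item eval set,
    # 10,000 requests per month, 2.50 per failure in handling costs.
    print(wilson_interval(5, 200))
    print(expected_cost_bounds(5, 200, 10_000, 2.50))
```

The interval only says something about inputs that look like the eval set, so the estimate is as good as the eval set is representative. It does turn "unknown accuracy" into a number with error bars, though, which you can then weigh against the cost of a failure.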