I-want-to-believe.jpg
Apparently that reddit post itself was generated with AI. Using AI to bash AI is an interesting flex.
How did people find out it was AI generated? Seems natural to me. Scary.
Have any evidence of that? The only thing I saw was commenters in that thread (who were obvious AI-bros) claiming it must be AI generated because “it just wouldn’t happen”…
I’m a data analyst and primary authority on the data model of a particular source system. Most questions for figures from that system that can’t be answered directly and easily in the frontend end up with me.
A manager showed me how a new LLM they were developing (I had contributed some information about the data model to it) could quickly answer some of the questions I usually have to answer manually. It was part of a pitch to get me to switch to his department, so I could apply my expertise to improving this fancy AI instead of answering questions by hand.
He entered a prompt and got a figure that I knew wasn’t correct. I queried my data model for the same info and got a significantly different answer. Given how much said manager leaned on my expertise in the first place, he couldn’t very well challenge my results and got all sheepish about how the AI is still in development and all.
I don’t know how that model arrived at that figure. I don’t know if it generated and ran a query against the data I’d provided. I don’t know if it just invented the number. I don’t know how the devs would figure out the error and how to fix it. But I do know how to explain my own queries, how to investigate errors and (usually) how to find a solution.
Anyone who relies on a random text generator to generate facts, no matter how complex the generation method that makes it sound human, is dangerously inept.
I don’t know how the devs would figure out the error and how to fix it.
This is like the biggest factor that people don’t get when thinking of these models in the context of software. “Oh it got it wrong, but the developers will fix it in an update”. Nope, they can fix traditional software mistakes, but LLM output and machine learning things… They can throw more training data at it (which sometimes just changes what it gets wrong) and hope for the best, they can do a better job at curating the context window to give the model the best shot at outputting the right stuff (e.g. the guy who got Opus to generate a slow crappy buggy compiler had to traditionally write a filter to find and show only the ‘relevant’ compiler output back to the models), they can try to generate code to do what you want and have you review the code and correct issues. But debugging and fixing the model itself… that’s just not a thing at all.
I was in a meeting where a sales executive was bragging about the ‘AI sales agent’ they were working on, but admitting frustration with the developers and a bit confused why the software developers weren’t making progress when those same developers always made decent progress before, and they should be able to do this even faster because they have AI tools to help them… It eternally seemed in a state that almost worked but not quite no matter what model or iteration they went to, no matter how much budget they allocated, when it came down to the specific facts and figures it would always screw up.
I cannot understand how these executives can wade in the LLM pool for so long and still believe in capabilities beyond what anyone has experienced.
I cannot understand how these executives can wade in the LLM pool for so long and still believe in capabilities beyond what anyone has experienced.
They leave the actual work to the boots on the ground so they don’t see how shitty the output is. They listen to marketing about how great it is and mandate everyone use it and then any feedback is filtered through all the brownnosers that report to them.
It eternally seemed in a state that almost worked but not quite no matter what model or iteration they went to, no matter how much budget they allocated, when it came down to the specific facts and figures it would always screw up.
This is probably the biggest misunderstanding since “Project Managers think three developers can produce a baby in three months”: Just throw more time and money at AI model “development” for better results. It supposes predictable, deterministic behaviour that can be corrected, but LLMs aren’t deterministic by design, since that wouldn’t sound human anymore.
Sure, when you’re a developer dedicated to advancing the underlying technology, you may actually produce better results in time, but if you’re just the consumer, you may get a quick turnaround for an alright result (and for some purposes, “alright” may be enough) but eventually you’ll plateau at the limitations of the model.
Of course, executives universally seem to struggle with the concept of upper limits, such as sustainable growth or productivity.
I guarantee you this is how several, if not most, fortune 500 companies currently operate. The 50k DOW is not just propped up by the circlejerk spending on imaginary RAM. There are bullshit reports being generated and presented every day.
I patiently wait. There is a diligent bureaucrat sitting somewhere going through fiscal reports line by line. It won’t add up… receipts will be requested… bubble goes pop
Leopard meets face.
When you delegate, whether to a person, a tool or a process, you check the result. You make sure that the delegated tasks get done, and done correctly, and that the results are what is expected.
Finding that it is not the case after months by luck shows incompetence. Look for the incompetent.
Yeah. Trust is also a thing, like if you delegate to a person that you’ve seen getting the job done multiple times before, you won’t check as closely.
But this person asked to verify and was told not to. Insane.
100%
Hallucinations are widely known, this is a collective failure of the whole chain of leadership.
The problem is that whoever is checking the result in this case has to do the work anyway, and in such a case… why bother with the LLM that can’t be trusted to pull the data anyway?
I suppose they could take the facts and figures that a human pulled and have an LLM verbose it up for people who for whatever reason want needlessly verbose BS. Or maybe an LLM can do a review of the human generated report to help identify potential awkward writing or inconsistencies. But delegating work that you have to do anyway to double check the work seems pointless.
Like someone here said, “trust is also a thing”. Once you have checked a few times that the process is right and the results are right, you don’t need to check more than occasionally. Unfortunately, that’s not what happened in this story.
Tbf at this point corporate economy is made up anyway so as long as investors are gambling their endless generational wealth does it matter?
This is how I’m starting to see it too. Stock market is just the gambling statistics of the ownership class. Line goes down and we’re supposed to pretend it’s harder to grow food and build houses all of a sudden.
There’s a difference. If I go and gamble away my life savings, then I’m on the street. If they gamble away their investments, the government will say ‘poor thing’ and give them money to keep the economy ok.
Ah yes, what a surprise. The random word generator gave you random numbers that aren’t actually real.
Surely this is just fraud, right? Seeing as they have a board of directors, they probably have shareholders? I feel they should at least all get fired, if not prosecuted. This lack of competency is just criminal to me.
Are you suggesting we hold people responsible?
Ask Bernie Madoff. Scamming rich people is the one and only instance where even rich people are held accountable.
In the current world, the one going to jail is probably the one reporting it. So I don’t expect much, no.
This is why I hate search engines promoting AI results when you are researching something. They confidently give incorrect responses. I once asked an LLM for its sources while using Duckduckgo, and it just told me that there are no sources and the information is based on broad knowledge. At one point, I told the AI that it was wrong, but it insisted it wasn’t. It turned out that it was citing a years old source written by a different bot long ago. On the other hand, most of you are probably familiar with the case where the AI is incorrect, you challenge it, and it relents, although it will be just as much of a sycophant when you yourself are actually incorrect. This is Schrödinger’s AI.
I mean it hallucinates numbers when you ask it to extract some numeric data publicly available online so yeah…
Even when it does pull numeric data, it gets very confused.
I asked about rough price of something and of course the AI summary came back and said something like:
It typically costs 400-500 but could cost up to $200 in extreme circumstances, with 750 being the average
Basically it got three figures from three different internet results and combined them into a single sentence in a nonsensical way.
At least in such a scenario, someone with at least a couple of active brain cells would stop and recognize some bullshittery going on, but the executive probably TLDRs the sentence and stops after ‘400-500’.
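For what it’s worth, that kind of “stop and recognize the bullshittery” check is easy to do mechanically once the figures are pulled out of the sentence. A minimal sketch in Python, assuming the numbers have already been extracted by hand; the function name and the rules are my own illustration, not anything a search engine actually runs:

```python
from typing import List


def price_summary_problems(typical_low: float, typical_high: float,
                           maximum: float, average: float) -> List[str]:
    """Return a list of internal contradictions in a quoted price summary."""
    problems = []
    if typical_low > typical_high:
        problems.append("typical range is reversed")
    if maximum < typical_high:
        problems.append("maximum is below the top of the typical range")
    if average > maximum:
        problems.append("average is above the maximum")
    if average < 0 or typical_low < 0:
        problems.append("negative price")
    return problems


# The summary quoted above: typically 400-500, up to 200, average 750.
for problem in price_summary_problems(400, 500, 200, 750):
    print(problem)
```

It only catches a summary that contradicts itself. It says nothing about whether any of the three numbers is actually right.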
My broseph in Christ, what did you think an LLM was?
Bro, just give us a few trillion dollars, bro. I swear bro. It’ll be AGI this time next year, bro. We’re so close, bro. I just need some money, bro. Some money and some god-damned faith, bro.
User: Hi big corp AI(LLM), do this task
Big Corp AI: Here is output
User: Hi big corp your AI’s output is not up to standard I guess it’s a waste of…
Big Corp: use this agent which ensures correct output (for more energy)
User: it still doesn’t work…guess I was wrong all along let me retry…
And the loop continues until they get a few trillion dollars
You can make something AI based that does this, but it’s not cheap or easy. You have to make agents that handle data retrieval and programmatically make the LLM choose the right agent. We set one up at work, it took months. If it can’t find the data with a high certainty, it tells you to ask the analytics dept.
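Roughly the shape of that setup, as a toy sketch in Python. Everything here is made up for illustration: the agent names, the canned numbers, the 0.9 threshold, and the keyword match standing in for the step where the LLM picks an agent. The real thing is the part that takes months.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Retrieval:
    value: Optional[float]
    confidence: float  # 0.0 to 1.0, however the agent estimates it


# Hypothetical retrieval agents. In practice each would run a fixed,
# reviewable query against the source system instead of returning a constant.
def revenue_agent(question: str) -> Retrieval:
    return Retrieval(value=1250.0, confidence=0.95)


def headcount_agent(question: str) -> Retrieval:
    return Retrieval(value=None, confidence=0.0)  # nothing found


AGENTS = {"revenue": revenue_agent, "headcount": headcount_agent}
CONFIDENCE_THRESHOLD = 0.9  # assumed cut-off
FALLBACK = "Not confident enough, please ask the analytics dept."


def route(question: str) -> Optional[str]:
    """Stand-in for the LLM choosing an agent: plain keyword matching."""
    lowered = question.lower()
    for name in AGENTS:
        if name in lowered:
            return name
    return None


def answer(question: str) -> str:
    name = route(question)
    if name is None:
        return FALLBACK
    result = AGENTS[name](question)
    # The number always comes from the agent, never from generated text.
    if result.value is None or result.confidence < CONFIDENCE_THRESHOLD:
        return FALLBACK
    return f"{name}: {result.value}"


print(answer("What was revenue last month?"))
print(answer("What is the headcount in Q3?"))
```

The point of the design is that the LLM only decides which query to run. The figure itself comes from code a human can read and debug, and anything below the threshold goes back to a person.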
Large Lying Model?
To everyone I’ve talked to about AI, I’ve suggested a test. Take a subject that you know you are an expert in. Then ask AI questions that you already know the answers to. See what percentage AI gets right, if any. Often they find that plausible sounding answers are produced. However, if you know the subject, you know that what is produced isn’t quite fact. A recovery from an injury might be listed as 3 weeks when the average is 6-8, or similar. Someone who did not already know the correct information could be harmed by the “guessed” response of AI. AI can have uses but it needs to be heavily scrutinized before passing on anything it generates. If you are good at something, that usually means you have to waste time in order to use AI.
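If you want to keep score while doing that test, something like this works. It is a minimal sketch in Python that assumes numeric answers and an acceptable range per question; the questions, the ranges and the fake_model stand-in are invented for the example, and in practice you would type in whatever the AI actually answered.

```python
from typing import Callable, Dict, Tuple


def fraction_correct(known_answers: Dict[str, Tuple[float, float]],
                     ask: Callable[[str], float]) -> float:
    """Share of questions whose answer falls inside the range you know is right."""
    if not known_answers:
        raise ValueError("need at least one question")
    correct = 0
    for question, (low, high) in known_answers.items():
        if low <= ask(question) <= high:
            correct += 1
    return correct / len(known_answers)


# Hypothetical expert knowledge: question -> (lowest, highest) acceptable answer.
known = {
    "Average recovery time in weeks for this injury?": (6, 8),
    "How many days are in a week?": (7, 7),
}

# Stand-in for the AI: one plausible but wrong answer, one right answer.
canned = {
    "Average recovery time in weeks for this injury?": 3,
    "How many days are in a week?": 7,
}


def fake_model(question: str) -> float:
    return canned[question]


print(fraction_correct(known, fake_model))
```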
Do the same to any person online, most blogs by experts, or journalists.
Even apparently easy to find data, like the specs of a car. Sucking and lying is not exclusive to LLMs.
Literally nobody suggested it was.
It was implicit in the test suggestion
I had a very simple script. All it does is trigger an action on a monthly schedule.
I passed the script to Copilot to review.
It caught some typos. It also said the logic of the script was flawed and it wouldn’t work as intended.
I didn’t need it to check the logic of the script. I knew the logic was sound because it was a port of a script I was already using. I asked because I was curious about what it would say.
After restating the prompt several times, I was able to get it to confirm that the logic was not flawed, but the process did not inspire any confidence in Copilot’s abilities.
Happy cake day, and this absolutely. I figured out its game the first time I asked it a spec for an automotive project I was working on. I asked it the torque specs for some head bolts and it gave me the wrong answer. But not just the wrong number, the wrong procedure altogether. Modern engines have torque to yield specs, meaning essentially you torque them to a number and then add additional rotation to permanently distort the threads to lock it in. This car was absolutely not that and when I explained back to it the error it had made IT DID IT AGAIN. It sounded very plausible but someone following those directions would have likely ruined the engine.
So, yeah, test it and see how dumb it really is.
Love it.
I fucking love this. It’s amazing.