AI crap - Why ML will make the world worse, not better

pnutzh4x0r@lemmy.ndlug.org · 1 year ago


lloram239@feddit.de · 1 year ago

This is a very one-sided way to look at things. Yes, people will use AI to generate spam and stuff. What it is missing is that people will also use AI to filter it all away. The nice thing about ChatGPT and friends is that they give me access to information in whatever format I desire. I don’t have to visit dozens of websites to find what I am looking for; the AI will do that for me and report back with what it has found.

Simply put, AI is a possible path to the Semantic Web, which previously failed since ads and SoC were the driver of the Web, not information.

Sometimes I really wonder in what magical wonderland those people complaining about AI live, since as far as I am concerned, the Web and a lot of other stuff went to shit a long while ago, long before AI got any mass traction. AI is our best hope to drag ourselves out of the mud.

The real problem is that AI isn’t good enough yet. It can handle Wikipedia-like questions quite well. But try to use it for product and price information and all you get is garbage.

botengang@feddit.de · 1 year ago

which previously failed since ads and SoC were the driver of the Web, not information.

Can you elaborate on why you think the ads wouldn’t sneak in again? The semantic web is a fantastic concept, but I don’t immediately see the AI connection. AI doesn’t magically pay for authored content and there is still an incentive to somehow get ads into LLM answers.

lloram239@feddit.de · 1 year ago

Can you elaborate on why you think the ads wouldn’t sneak in again?

You can run an LLM at home on your own PC. Think of it less as a replacement for Google and more like the computer from Star Trek. You tell it what you want and it goes to search the net for you. What you see is just the answer, in a format specified by you, not the websites it came from.

Google, Bing and Co. will of course add ads to their services, but that’s a short-term issue. AI will fundamentally reshape how we interact with computers and information in the long run.

The semantic web is a fantastic concept, but I don’t immediately see the AI connection.

The semantic web relies on humans doing the markup, and that’s doomed to fail: nobody has the time for that, and even if they spent the effort, they would miss a whole lot of information that is in the text. An LLM can extract semantic information directly from the text without any markup, and you can query that information with natural language. That’s not only way easier on the creator’s side, but also way more powerful on the user’s end.

xavier666@lemm.ee · 1 year ago

You can run an LLM at home on your own PC. You tell it what you want and it goes to search the net for you.

Unless it’s open-source and connected to a proper crowdsourced dataset, hosted on a paid server managed by a community instead of a big corporation, I don’t see how ads are NOT getting in.

lloram239@feddit.de · 1 year ago

Yes, but that’s already the case. There are numerous Open Source’ish language models around that you can run on your own PC, no server required:

And some of them are getting pretty damn close to ChatGPT performance:

https://tatsu-lab.github.io/alpaca_eval/

There is of course still plenty of work that needs to be done in letting LLMs interact with the outside world, use a web browser and stuff, but there are projects for that as well, e.g. AutoGPT. It’s just a matter of time until that stuff becomes good enough to be usable.

botengang@feddit.de · 1 year ago

Thank you very much. My concern is rather in the direction of inserting ads or “promotional information” into the training material, much like SEO plagues search today. If the info is from the web it can still be malicious, even if you run your own LLM.

lloram239@feddit.de · 1 year ago

They’ll certainly try, though it’ll be quite a bit trickier due to LLMs providing direct answers, not just a list of sites. You can’t really sneak a product in there when it doesn’t actually fit the question. I think the bigger problem is just the lack of good information out there. Finding trustworthy reviews these days is getting really hard. Most of the time all you have is the product description and some Amazon reviews, which, even when done well, fail to show how product X compares to product Y. No matter how smart the AI will be, that always leaves a ton of room for error and misinformation.

Hard to tell how things will end up. For the time being, LLMs are pretty much completely useless for product search, ChatGPT just doesn’t know enough and BingChat will just summarize the first three SEO-filled Bing search results. The deep knowledge LLMs have on Wikipedia-like topics is missing when it comes to products and services, and they can’t really do calculations either, so price information is almost always wrong. This will need some specific optimization.

david@feddit.uk · 1 year ago

I don’t know why you want to use an AI to purchase goods and learn about products. That’s what the current www is really really strong at. Lots of people are spending an awful lot of money to make that information really easy to discover, and popular search engines definitely prioritise that information.

Also, if an AI is to give you price and product information it’s going to have to be reading live web pages, which will of course be full of ads. SEO will become AIO/LLMO. There is no end to the time and money advertisers are prepared to pour into getting products in front of users. The irony is that you seem to want to view products and you have this weird perspective where you’re keen to avoid ads for products so that you can view marketing information about products without the ads.

It’s already fairly hard to tell without knowing some good websites or reading through to conclusions and using some common sense whether a review website is honest or biased. I don’t know why you think an AI with access to the Internet will filter out fake reviews and content crafted to lead you to specific products over others.

Also, downloading and configuring your own AI is unlikely to be the way the “AI revolution” comes. Amazon, Google, Microsoft, Apple and other mega corporations will be funding the “AI revolution” and will not sit idly by allowing their kingdoms to crumble.

The number of people who will be saved from the corporations that run the online world by open source grass roots AI will be smaller than the number of people who are saved by Linux from proprietary products and SaaS.

Yeah, everyone will get used to using an AI to interact with the web, but it will be freely supplied by a corporation, and I PROMISE you the enshittification of AI has been planned long before we even reach step one of making it awesome for the masses.

lloram239@feddit.de · 1 year ago

That’s what the current www is really really strong at.

You must be using a different WWW than I am, since product search for me is absolutely terrible. Even the simplest of queries can’t be answered, e.g. something as trivial as “what’s the cheapest thing that matches query” fails due to some products coming in different package sizes (e.g. 100g vs 1000g). If you want to buy a movie or game, and want to know about sequels and prequels, you have to go to Wikipedia to find out, since I have yet to see a single shop that organizes that well. Or try to find the equivalent of a product in another country where the original product isn’t available. Or try to search for the cheapest way to buy multiple products at once, taking shipping cost into account. Even just figuring out the size or what’s actually in the box is often impossible; I have yet to see another site that gives you a full CAD model of the products like McMaster-Carr.
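To make the package-size point concrete, here is a minimal sketch of the comparison a product search would need to do. The shop names, prices and the `Offer` class are made up for illustration; the only point is that you have to normalize to a price per kilogram before picking the cheapest offer.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    shop: str      # hypothetical shop name
    price: float   # sticker price of the whole package
    grams: float   # package size in grams

    @property
    def price_per_kg(self) -> float:
        # Normalize to a common unit so 100g and 1000g packs are comparable.
        return self.price * 1000.0 / self.grams


def cheapest_per_kg(offers):
    """Return the offer with the lowest price per kilogram."""
    return min(offers, key=lambda o: o.price_per_kg)


# Made-up example data: the small pack looks cheaper but is not.
offers = [
    Offer("shop-a", 1.50, 100),    # 15.00 per kg
    Offer("shop-b", 9.00, 1000),   # 9.00 per kg
]

best = cheapest_per_kg(offers)
print(best.shop, best.price_per_kg)  # shop-b 9.0
```

Sorting by sticker price alone would pick shop-a here, which is exactly the failure described above.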

Product search on the Web is utter garbage. I am kind of surprised that nobody ever put serious effort into making that work well. Google’s product search is garbage and most other search engines don’t even have a specific product search. A product search engine that automatically bundles up information from different shops, YouTube videos and comments doesn’t exist as far as I know.

Lots of people are spending an awful lot of money to make that information really easy to discover

Amazon deliberately puts sponsored products on top to make it harder to discover what you want. Some small shops put effort into it and let you search products according to the specs, but that only works in that single shop. I have yet to see a search engine that can handle that across multiple shops and with any semblance of reliability.

Also, if an AI is to give you price and product information it’s going to have to be reading live web pages, which will of course be full of ads.

Yes, but that’s irrelevant as long as only the AI reads it. I don’t care what ads my adblocker reads either.

I don’t know why you think an AI with access to the Internet will filter out fake reviews

I am not looking for reviews, but for reliable and detailed product information. An LLM can help gather that information from multiple different sources and format it in a unified way. SEO has limited influence on that, as either the product has those specs or it has not, in which case the LLM should be able to find contradictions in the information and automatically write a letter to whatever consumer protection office is responsible for false advertisement.
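As a rough sketch of the contradiction check, assume the LLM has already turned each source into a flat dictionary of specs. The field names and values below are made up, and `find_contradictions` is a hypothetical helper, not part of any existing tool:

```python
from collections import defaultdict


def find_contradictions(records):
    """Return every spec field on which the sources disagree."""
    seen = defaultdict(set)
    for record in records:
        for field, value in record.items():
            # Compare case-insensitively so "RJ45" and "rj45" count as equal.
            seen[field].add(str(value).strip().lower())
    return {field: sorted(values) for field, values in seen.items() if len(values) > 1}


# Made-up extraction results from three hypothetical sources.
sources = [
    {"Connector Type": "RJ45", "Ports": 8},
    {"Connector Type": "rj45", "Ports": 8},
    {"Connector Type": "RJ45", "Ports": 12},
]

conflicts = find_contradictions(sources)
print(conflicts)  # {'Ports': ['12', '8']}
```

This only flags that the sources disagree; deciding which one is right still needs a trusted reference or a human.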

Also, downloading and configuring your own AI is unlikely to be the way the “AI revolution” comes.

Given the way privacy is getting traction in the public consciousness, I wouldn’t be so sure. Look at how many people already use adblockers, around 40% or so. That’s quite a lot, and many of them will be upgrading to some form of AI-driven adblocking and information gathering sooner or later.

david@feddit.uk · edit-2 1 year ago

You know that an LLM is a statistical word prediction thing, no? That LLMs “hallucinate”. That this is an inevitable consequence of how they work. They’re designed to take in a context and then sound human, or sound formal, or sound like an excellent programmer, or sound like a lawyer, but there’s no particular reason why the content that they present to you would be accurate. It’s just that their training data contains an awful lot of accurate data which has a surprisingly large amount of commonality of meaning.

You say that the current crop of LLMs are good at Wikipedia style questions, but that’s because their authors have trained them with some of the most reliable and easy to verify information on the Web. A lot of that is Wikipedia style stuff. That’s their core knowledge, what they grew up reading, the yardstick by which they were judged. And yet they still go off on inaccurate tangents because there’s nothing inherently accurate about statistically predicting the next word based on your training and the context and content of the prompt.

Yes, LLMs sound like they understand your prompt and are very knowledgeable, but the output is fundamentally not a fact-based thing, it’s a synthesized thing, engineered to sound like its training data.
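A toy illustration of “statistically predicting the next word”: the sketch below is a plain bigram counter over a made-up sentence, which is drastically simpler than a real LLM (those are neural networks over tokens with long contexts). It only shows the basic idea that the output is whatever followed most often in the training text, true or not.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for every word, which words follow it in the text."""
    counts = defaultdict(Counter)
    words = text.lower().split()
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts


def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Made-up training text.
corpus = "the cable is black the cable is short the panel is black"
model = train_bigrams(corpus)

print(predict_next(model, "is"))   # black
print(predict_next(model, "the"))  # cable
```

The model says “black” after “is” because that is the most common continuation in its data, not because it checked the colour of anything.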

lloram239@feddit.de · edit-2 1 year ago

You do not query the LLM directly. The LLM just provides the baseline language understanding. You use the LLM to extract information out of websites and convert it into a machine readable format. You can do that with ChatGPT today:

Prompt: Extract important product information out of this text and format it as json:

[copy and paste random Amazon.com website]

Answer:
Here's the important product information extracted from the text and formatted as JSON:
{
  "Product Name": "kwmobile 8 Port Patch Panel - RJ45 Cat6 Shielded Network Splitter Panel with Ground Wire",
  "Price": {
    "Discounted Price": "$20.99",
    "Typical Price": "$22.99"
  },
  "Color": "Black",
  "Brand": "Kwmobile",
  "Connector Type": "RJ45",
  "Cable Type": "Ethernet",
 ...
}

That’s the power of LLMs. They aren’t a better Google, they are a way to interface with semantic information stored in human readable text (or pictures or sound). And with that extracted information you can go and build a better Google or just let the LLM browse the web and search for information relevant to you.
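Once the answer is JSON, ordinary code can take over, including the arithmetic that LLMs are bad at. A minimal sketch, using only the fields shown in the answer above pasted in as a string (no LLM call is made here, and `parse_price` is a hypothetical helper):

```python
import json
from decimal import Decimal

# A subset of the fields from the ChatGPT answer above, hard-coded for the example.
answer = """
{
  "Product Name": "kwmobile 8 Port Patch Panel - RJ45 Cat6 Shielded Network Splitter Panel with Ground Wire",
  "Price": {"Discounted Price": "$20.99", "Typical Price": "$22.99"},
  "Brand": "Kwmobile",
  "Connector Type": "RJ45"
}
"""


def parse_price(text):
    """Turn a string like "$20.99" into an exact Decimal."""
    return Decimal(text.replace("$", "").replace(",", ""))


product = json.loads(answer)
discounted = parse_price(product["Price"]["Discounted Price"])
typical = parse_price(product["Price"]["Typical Price"])
savings = typical - discounted

print(product["Brand"], savings)  # Kwmobile 2.00
```

The LLM does the messy text-to-structure step; the price comparison itself is done by normal code, so it can’t be “hallucinated”.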

acastcandream@beehaw.org · 1 year ago

deleted by creator

hglman@lemmy.ml · 1 year ago

It’s either that or extreme fragmentation and/or de-informationalization

acastcandream@beehaw.org · 1 year ago

deleted by creator

hglman@lemmy.ml · 1 year ago

It’s not a problem if it’s removed by improved AI; it would be a transient fear that never manifests.

acastcandream@beehaw.org · 1 year ago

deleted by creator

acastcandream@beehaw.org · edit-2 1 year ago

deleted by creator

lloram239@feddit.de · 1 year ago

You never wanted to have the computer from Star Trek, the Holodeck or the Universal Translator? Modern AI provides a fundamental shift in how we can interact with data and allows us to do things that would have been impossible by classic means.

And it’s not like you can escape it anyway: phone cameras use AI, spell checkers use AI, mobile phone keyboards use AI. It’s already everywhere and we have barely started.