I know a lot of people want to interpret copyright law so that allowing a machine to learn concepts from a copyrighted work is copyright infringement, but I think what people will need to consider is that all that’s going to do is keep AI out of the hands of regular people and place it specifically in the hands of people and organizations who are wealthy and powerful enough to train it for their own use.
If this isn’t actually what you want, then what’s your game plan for placing copyright restrictions on AI training that will actually work? Have you considered how it’s likely to play out? Are you going to be able to stop Elon Musk, Mark Zuckerberg, and the NSA from training an AI on whatever they want and using it to push propaganda on the public? As far as I can tell, all that copyright restrictions will accomplish is to concentrate the power of AI (which we’re only beginning to explore) in the hands of the sorts of people who are the least likely to want to do anything good with it.
I know I’m posting this in a hostile space, and I’m sure a lot of people here disagree with my opinion on how copyright should (and should not) apply to AI training, and that’s fine (the jury is literally still out on that). What I’m interested in is what your end game is. How do you expect things to actually work out if you get the laws that you want? I would personally argue that an outcome where Mark Zuckerberg gets AI and the rest of us don’t is the absolute worst possibility.
For something to be a fact, it needs to actually be true. AI is currently accessible to everyone.
I disagree. I can barely run a 13B parameter model locally, much less a 175B parameter model like GPT-3. Or GPT-4, whatever that model truly is. Or whatever behemoth of a model the NSA almost certainly has and just hasn’t told anyone about. I’ll eat my sock if the NSA doesn’t have a monster LLM along with a myriad of other special purpose models by now.
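For a rough sense of scale, here is a back-of-the-envelope sketch of how much memory the weights alone take at the 13B and 175B sizes. It assumes 2 bytes per parameter for fp16 and about half a byte per parameter for 4-bit quantization, and it ignores activations, the context cache, and framework overhead, so real requirements are higher. The function name is made up for illustration.

```python
def weight_memory_gib(n_params: float, bytes_per_param: float) -> float:
    """Approximate memory needed for the model weights alone, in GiB."""
    return n_params * bytes_per_param / 2**30


if __name__ == "__main__":
    # Parameter counts taken from the comment above; precisions are assumptions.
    models = [("13B", 13e9), ("175B", 175e9)]
    precisions = [("fp16", 2.0), ("4-bit", 0.5)]
    for model_name, n_params in models:
        for precision_name, bytes_per_param in precisions:
            gib = weight_memory_gib(n_params, bytes_per_param)
            print(f"{model_name} at {precision_name}: about {gib:.0f} GiB")
```

Under these assumptions a 13B model at fp16 just about fills a 24 GB card, while a 175B model is well beyond any single consumer GPU even when quantized.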
And even though the research has (mostly) been public so far, the resources needed to train these massive models are out of reach for all but the most privileged. We can train a GPT-2 or GPT-Neo if we’re dedicated, but you and I aren’t training an open version of GPT-4.
But you can run it.
I’ve got a commodity GPU and I’ve been doing plenty of work with local image generation. I’ve also run and fine-tuned LLMs, though more out of idle interest than for serious usage yet. If I needed to do more serious work, renting time on cloud computing for this sort of thing actually isn’t all that expensive.
The fact that the very most powerful AIs aren’t “accessible” doesn’t mean that AI in general isn’t accessible. I don’t have a Formula 1 racing car but automobiles are still accessible to me.
If we’re just talking about what you can do, then these laws aren’t going to matter because you can just pirate whatever training material you want.
But that is beside my actual point, which is that there is a practical real-world limit to what you, the little guy, and they, the big guys, can do. That disparity is the privilege that OP way back up at the top mentioned.
I have no idea what that original commenter’s opinion on copyright vs training is. Personally I agree with the OP-OP of the whole thread. Training isn’t copying, and even if it were, the public interest outweighs the interests of the copyright holders in this regard. I’m just saying that in the real world there is a privilege that the elites and ultra-corps have over us, regardless of what systems we set up, unless capitalism and society as a whole are upended.
At this point we’re just bickering over semantics.
So clearly we do agree on most of this stuff, but I did want to point out a possibility you may not have considered.
This depends on the penalty and how strictly it’s enforced. If it’s enforced like normal copyright law, then you’re right; your chances of getting in serious trouble just for downloading stuff are essentially nil – the worst thing that will happen to you is your ISP will three-strikes you and you’ll lose internet access. On the other hand, there’s a lot of panic surrounding AI, and the government might use that as an excuse to pass laws that would give people prison time for possessing one, and then fund strict enforcement. I hope that doesn’t happen, but with rumblings of insane laws that would give people prison time for using a VPN to watch a TV show outside of the country, I’m a bit concerned.
As for the parent comment’s motivations, it’s hard to say for sure with any particular individual, but I have noticed a pattern among neoliberals where they say things like “well, the rich are already powerful and we can’t do anything about it, so why try” or “having universal health care, which the rest of the first world has implemented successfully, is unrealistic, so why try” and so on. It often boils down to giving lip service to progressive social values while steadfastly refusing to do anything that might actually make a difference. It’s economic conservatism dressed as progressivism. Even if that’s not what they meant (and it would be unwise of me to just assume that), I feel like that general attitude needs to be confronted.
If I’m the “parent comment” you’re referring to, then that’s very much not my motivation. I’m just pointing out that “AI is accessible to everyone” is not a hard binary situation, and that while it may be true that big giant corporations have an advantage due to being big giant corporations with a ton of resources to throw at this stuff, AI is indeed still accessible to some degree to the average consumer.
Well, again, “the average consumer” being first-world individuals with the resources to buy a nice computer and spend time playing with it. These things are a continuum and that’s not the end point of it; you can always go further down the resource rankings and find people for whom AI is not “accessible” by whatever standards. Unfortunately it’s kind of accepted as a given that people on the poor end of the spectrum don’t have access to this kind of stuff or will have to depend on external service providers.
You’re not. I was talking about the thread parent: “Many things in life are a privilege for these groups. AI is no different.” I should have been more specific.
At any rate, I personally feel that we have a moral responsibility to make it accessible to as many people as possible.
Okay, just wanted to make sure since I’m upstream of the comment.
I agree that making these things as accessible as possible is ideal; it’s just that the “as possible” part is tricky with expensive new technology like this. My personal desire is to see UBI implemented on the backs of AI and robot labor, which hopefully will come a lot closer to making universal access possible.
AI is more than just ChatGPT.
When we talk about reinterpreting copyright law in a way that makes AI training essentially illegal for anything useful, it also restricts smaller and potentially more focused networks. They’re discovering that smaller networks can perform very well (not at the level of GPT-4, but well enough to be useful) if they’re trained in a specific way where reasoning steps are spelled out in the training.
Also, there are used Nvidia cards currently selling on Amazon for under $300 with 24 GB of RAM and AI performance almost equal to a 3090, which puts mixture-of-experts models like a smaller version of GPT-4 within reach of people who aren’t ultra-wealthy.
There’s also the fact that there are plenty of companies currently working on hardware that will make AI significantly cheaper and more accessible to home users. Systems like ChatGPT aren’t always going to be restricted to giant data centers, unless (as some people really want) laws are passed to prevent that hardware from being sold to regular people.
I want to be clear that I don’t disagree with your premise and your assertion that AI training should be legal regardless of copyright of the training material. My only point was that the original commenter said the ultra-elites have privilege over us little guys, and he was right in that regard. I have no idea how that plays into his opinion on this whole matter, only that what he said on its face is accurate.