Edit: To clarify:
Is it even possible, financially speaking, to keep adding storage? I mean, advertisements don’t even make a lot of money, is the indefinite growth of server storage even sustainable?
Or will they do what Twitch does with old content and just delete them?
Storage is cheap, especially at the corporate scale.
Make two simplifying assumptions: pretend that Google is paying consumer prices for storage, and pretend that Google doesn’t need to worry about data redundancy. In truth Google will pay a lot less than consumer prices, but they’ll also need more than 1 byte of storage for each byte of data they have, so for the sake of envelope math we can just pretend they cancel out.
Western Digital sells a 22TB HDD for $400. Seagate has a 20TB HDD for $310. I don’t like Seagate but I do like round numbers, so for simplicity we’ll call it $300 for 20TB. This works out to $15/TB. According to Wikipedia, YouTube had just under $29b of revenue in 2021. If YouTube spent just $100m of that (0.34%) they’d be able to buy 333,333 of those hard drives. In a single year. That’s 333,333 x 20TB = 6,666,660 TB of storage, also known as about 6.7^note 1^ exabytes.
That’s a lot of storage. A quick search tells me that YouTube’s compression for 4k/25fps is 45Mbps, which is about 5.6 megabytes/s. At that rate the drives would hold roughly 37,500 years of 4k video content. All paid for with 0.34% of YouTube’s annual revenue.
Note 1: I am using SI units here. If you want to use 1024^n for data names, then the SI prefixes aren’t correct. It’d be about 5.8 exbibytes instead.
EDIT: I initially did the price wrong, fixed now.
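For anyone who wants to redo the envelope math, here is a minimal Python sketch that uses only the round numbers quoted in the comment ($100m, $300 per 20 TB drive, 45 Mbps). The 365-day year is an assumption of the sketch. Note that dividing the budget by $300 gives a drive count, while dividing it by $15/TB gives terabytes directly; the sketch goes via the drive count.

```python
# Inputs are the round numbers quoted in the comment; nothing here is measured.
BUDGET_USD = 100_000_000            # $100m of annual revenue
DRIVE_PRICE_USD = 300               # rounded retail price of one 20 TB drive
DRIVE_CAPACITY_TB = 20
BITRATE_MBPS = 45                   # quoted 4K bitrate, in megabits per second
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumption: 365-day year

drives = BUDGET_USD // DRIVE_PRICE_USD              # whole drives the budget buys
total_bytes = drives * DRIVE_CAPACITY_TB * 10**12   # SI terabytes -> bytes

exabytes = total_bytes / 10**18     # SI prefix: 1 EB = 10^18 bytes
exbibytes = total_bytes / 2**60     # binary prefix: 1 EiB = 2^60 bytes

bytes_per_second = BITRATE_MBPS * 10**6 / 8         # megabits/s -> bytes/s
years_of_video = total_bytes / bytes_per_second / SECONDS_PER_YEAR

print(f"{drives:,} drives, {exabytes:.2f} EB ({exbibytes:.2f} EiB)")
print(f"about {years_of_video:,.0f} years of 4K video")
```

Changing any one constant (drive price, bitrate, budget) shows how sensitive the result is to that guess.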
I wouldn’t assume Google pays less for storage. They need to pay for land use in many countries, power usage, redundancy and the staff that manages all of it.
They also need powerful servers with fast caching storage and a lot of RAM. They also need to pay for the bandwidth.
As far as I know, they save multiple copies of each video in all resolutions they serve. So an 8K video will also have 4K + 1440p + 1080p + 720p + 480p + 240p + 144p. Possibly also 60Hz and 30Hz for some of them, and also HDR versions.
You have to add all that to the cost per TB. Finally, there is the question of how much additional storage they need per year, 100 PB/yr? Presumably also increasing yearly?
I wasn’t calculating server costs, just raw storage. Google is not buying hard drives at retail prices. I wouldn’t be surprised if they’re paying as little as 50% of the retail price to buy at volume.
All of what you say is true but the purpose was to get a back of the envelope estimation to show that the cost of storage is not a truly limiting factor for a company like youtube. My point was to answer the question.
With the level of compression YouTube uses, the storage cost of everything below 4k is substantially lower than that of 4k by itself: for back of envelope purposes we can just ignore those resolutions.
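A rough way to sanity-check that is to compare pixel counts across the resolution ladder. This is only a sketch: the frame sizes are the usual 16:9 ones, but treating stored bytes as proportional to pixel count is an illustrative assumption, not how YouTube's encoder actually allocates bitrate. Under that assumption, all the renditions below 4K together come to less than one extra 4K copy, so ignoring them moves the estimate by less than a factor of two.

```python
# Frame sizes for a typical 16:9 ladder. Treating stored bytes as proportional
# to pixel count is an illustrative assumption, not a measured bitrate table.
LADDER = {
    "2160p": (3840, 2160),
    "1440p": (2560, 1440),
    "1080p": (1920, 1080),
    "720p": (1280, 720),
    "480p": (854, 480),
    "240p": (426, 240),
    "144p": (256, 144),
}

pixels = {name: width * height for name, (width, height) in LADDER.items()}
top = pixels["2160p"]
lower_total = sum(count for name, count in pixels.items() if name != "2160p")

# Extra storage for the lower renditions, as a fraction of the 4K copy.
overhead = lower_total / top
print(f"lower renditions add about {overhead:.0%} on top of the 4K copy")
```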
Do you absolutely know they’re storing those qualities individually? It’s perfectly plausible that they do on the fly transcoding.
what is that note notation?
It’s a superscript. You can see it in the comment editor options. It’s:
^text^
which renders as superscript text.
You can also check a comment’s source by clicking on the icon that looks like a dog-eared piece of paper at the bottom of it.
ah it must be my client not visualizing
Open in your browser to see: https://lemmy.world/comment/2847886
I get invalid response
Ah, the Home Instance button for lemmy.world comments is broken. Try lemmy.ml instead: https://lemmy.ml/comment/3143665
You can use footnotes now[1].
They are neat and don’t look too bad if unsupported by the interpreter.
I know you are saying Google doesn’t have to worry about redundancy to simplify the math but I think that makes it completely useless.
Redundancy is not just about having another copy in case of data loss; more importantly for enterprises, redundancy allows for more throughput. If each video was on a single hard drive the site would not be able to function, as even the fastest multi-actuator hard drive can only do 524 MB/s in a perfect vacuum.
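To put a number on the throughput point, here is a small sketch that divides the quoted best-case drive throughput by the 45 Mbps 4K bitrate mentioned earlier in the thread. It assumes perfectly sequential reads with no seek overhead, so a real drive serving many viewers would manage fewer streams than this.

```python
DRIVE_THROUGHPUT_MB_S = 524   # best-case figure quoted in the comment above
STREAM_BITRATE_MBPS = 45      # 4K bitrate quoted earlier in the thread

stream_mb_s = STREAM_BITRATE_MBPS / 8   # megabits/s -> megabytes/s

# Upper bound on simultaneous 4K viewers served from one drive,
# assuming ideal sequential reads (an optimistic assumption).
max_streams = int(DRIVE_THROUGHPUT_MB_S // stream_mb_s)
print(f"one drive tops out at about {max_streams} concurrent 4K streams")
```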
It’s useless for answering a question that wasn’t asked, sure. But I didn’t pretend to answer that question. What it is useful for is answering the topic question. You know, the whole damn point?
How much of a factor off do you think the estimate is? You think they need three drives of redundancy each? Ten? Chances are they’re paying half (or less) for storage drives compared to retail pricing. The estimate of what they could get with $100m was also well into the exabytes, a mind-boggling amount of storage. I wouldn’t be surprised if they’re using up on the order of 1 EB/year in needed storage. There’s also a lot more room in their budget than 0.34%.
The point is to get a quick and simple estimate to show that there really will not be a problem in Google acquiring sufficient storage. If you want a very accurate estimate of their costs you’ll need data that we do not have. I was not aiming to get a highly accurate estimate of their costs. I made this clear, right from the beginning.
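The "how far off could it be" question can be sketched as a small sensitivity check. The $100m budget and $15/TB retail price come from the original estimate; the replication counts (1, 3, 10) and the 50% volume discount are the guesses floated in this comment, not known Google figures. With these inputs, even ten stored copies per byte at half of retail still leaves more than 1 EB of usable storage per $100m.

```python
BUDGET_USD = 100_000_000
RETAIL_USD_PER_TB = 15        # $300 for 20 TB, as in the original estimate

def usable_exabytes(replication: int, price_factor: float) -> float:
    """Usable storage in SI exabytes for the budget.

    replication  -- stored copies per byte of video (guess)
    price_factor -- fraction of retail price actually paid (guess)
    """
    raw_tb = BUDGET_USD / (RETAIL_USD_PER_TB * price_factor)
    return raw_tb / replication / 10**6   # 1 EB = 10^6 TB

for replication in (1, 3, 10):
    eb = usable_exabytes(replication, price_factor=0.5)
    print(f"{replication:>2} copies at half retail: {eb:.2f} EB usable")
```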
If each video was on a single hard drive the site would not be able to function as even the fastest multi actuator hard drive can only do 524 MB/s in a perfect vacuum.
The most popular videos are all going to be kept in RAM, they don’t read them all off disk with every single view request. If you wanted a comment going over the finer details of server architecture, you shouldn’t have looked at the one saying it was doing back of the envelope math on storage costs only, eh?
YouTube is known to reduce the quality of old videos. The resolution is often the same (e.g. 1080p), but the image quality is way worse compared to when those videos were new. They’re probably doing it to reduce their storage cost.
They still keep the originals. They’re degrading the quality of older videos for bandwidth reasons, not storage.
Also, it’s not because of the price of bandwidth: It’s because they can use up so much bandwidth in a given region that they can cause slowdowns (e.g. hogging too much of what’s available).
I mean, advertisements don’t even make a lot of money
Advertising has made Alphabet one of the richest companies in the world. I assure you that it does make a lot of money.
I think OP is conflating the amount that a YT channel sees per ad vs the amount that YT would keep. These are not the same thing.
Plus, YT gets their share of every single ad seen every day. The economy of scale obviously is paying off.
Google datacenters are global. They store petabytes of data in each and are constantly evolving to Google Cloud customer needs and their own, meaning they are always expanding their storage network with new servers and drives.
In addition to the hardware expansion, YouTube engineers are experimenting with ways to encode videos in much more efficient formats, such as AV1. Basically, encoding is how a video is stored. The engineers are trying new standards to retain original video quality in much smaller file sizes, leading to more video storage capacity without the need to upgrade servers as quickly.
lol you can store petabytes in one rack, much less an entire data center.
Storage for YT is not like storage for your computer. The question to ask is not if Google has enough hard drive space to keep old videos, but what it costs Google to keep all YT videos available enough to meet demand.
First keep in mind that there isn’t just one giant server at YouTube. Everything is replicated onto many parallel servers. And enormous datasets that are too large for any one server are “sharded” across many. Perhaps it takes 1000 server clusters to store one copy of everything.
Now you have to parallelize copies of those 1000 so there are redundant servers that can scale up to meet viewer capacity. This is a server “grid.”
But only some videos are being watched millions of times today. Only those server nodes need 100x redundancy for scale. The long tail of less watched videos might barely need a single node to be kept available.
So there is a massive “head” of videos that need tremendous server capacity to be available enough, and a very long and thin “tail” of videos that don’t require much resources at all.
The “head” grows as YT’s overall audience gets bigger. It’s very resource hungry and is probably their main challenge.
The “tail” gets longer as the total library of videos grows. But the tail is thin and making it longer isn’t that expensive.
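The head/tail idea can be sketched as a replica count that scales with demand but never drops below one copy. Everything here is hypothetical: the per-replica capacity and the example view counts are made up for illustration, and real systems would cache and shard rather than count whole replicas.

```python
import math

# Hypothetical figure: how many views per day a single replica can serve.
VIEWS_PER_REPLICA_PER_DAY = 100_000

def replicas_needed(views_per_day: int) -> int:
    """At least one copy is always kept; more are added as demand grows."""
    return max(1, math.ceil(views_per_day / VIEWS_PER_REPLICA_PER_DAY))

# Hypothetical catalogue: a couple of "head" videos and a long, thin "tail".
catalogue = {
    "viral_today": 10_000_000,
    "popular": 500_000,
    "old_vlog": 12,
    "never_watched": 0,
}

for name, views in catalogue.items():
    print(f"{name}: {replicas_needed(views)} replica(s)")
```

The tail entries all cost exactly one replica no matter how many of them there are, which is why lengthening the tail is cheap compared with growing the head.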
I’m sure there is also a threshold below which they will drop videos. Made over 1 year ago. More than 30 seconds long. Has never been viewed once. Author account hasn’t been visited in a year either. Drop it. No one will ever know.
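Written out as a rule, that guess might look like the sketch below. To be clear, this is the commenter's speculation turned into code, not a known YouTube policy; the `Video` fields and the `is_drop_candidate` name are invented for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class Video:
    # Hypothetical record; field names are made up for this sketch.
    uploaded_at: datetime
    duration_seconds: int
    view_count: int
    author_last_seen: datetime

ONE_YEAR = timedelta(days=365)

def is_drop_candidate(video: Video, now: datetime) -> bool:
    """True only if every condition from the comment above holds."""
    return (
        now - video.uploaded_at > ONE_YEAR            # made over 1 year ago
        and video.duration_seconds > 30               # more than 30 seconds long
        and video.view_count == 0                     # never viewed once
        and now - video.author_last_seen > ONE_YEAR   # author inactive for a year
    )

now = datetime(2024, 1, 1)
abandoned = Video(datetime(2020, 1, 1), 120, 0, datetime(2021, 1, 1))
watched = Video(datetime(2020, 1, 1), 120, 3, datetime(2021, 1, 1))
print(is_drop_candidate(abandoned, now), is_drop_candidate(watched, now))
```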
No, they just add more servers.
and maybe delete videos that have been, you know… abandoned…
Google is a trillion dollar company. TRILLION DOLLARS!
If your product scales with your size, your pure revenue doesn’t matter as much. Video is expensive
i imagine they will delete content. they have already started deleting google accounts with 2 years of inactivity.
That specifically excludes accounts with YouTube channels though, if I remember correctly
If you’re asking if YouTube has a finite amount of storage, the answer is yes. Assuming no safeguards were in place, you could theoretically fill up all their storage.
If you’re asking whether they will run out of storage… probably not while it is considered important. YouTube can buy additional storage space (the good ending), or they can delete content they deem unimportant (the bad ending). Or, they could decide that YouTube is “finished” and elect not to increase its storage. It’s their storage, so they call the shots.
Really, everything hosted “in the cloud” is hosted locally on someone else’s storage. If that storage dies, the data dies, unless you or someone else has a backup.
Edit: fixed for clarity
Confidently incorrect.
YouTube is owned by Google. Google is a cloud provider. Therefore YouTube is hosted on its own cloud.
Services are set up to automatically spin up more resources as needed.
Your claim that the cloud can lose data because of hard drive failure is ridiculous.
You do not understand how any of this works.
Your claim that the cloud can lose data because of hard drive failure is ridiculous.
Yes, that was a simplification of the reality that the data exists in storage somewhere. Killing one drive shouldn’t cause the data to be destroyed, but if you killed enough of their data centers, eventually you would see data loss.
Services are set up to automatically spin up more resources as needed.
Eventually, you can find a load large enough to overwhelm these services. My point really was that theoretically you could overwhelm the system, but that it is unlikely to happen.
YouTube is owned by Google. Google is a cloud provider. Therefore YouTube is hosted on its own cloud.
That’s a bit of a cop-out. I guess I should have said “in a cloud that isn’t self-hosted”. Like yeah if I build my own cloud then I trivially control my data, but that’s usually not the case.
You do not understand how any of this works.
Well I’m not in the IT department but I do have a baseline understanding of how cloud computing works. Your data has to “live” somewhere, possibly multiple “somewheres”. If you compromise all the “somewheres”, or at least the locations of the desired data in the “somewheres”, the data is gone.
Edit: I edited my original comment to reference “storage” rather than hard drives specifically.
Removed by mod
You can disagree and claim the contrary without being a dick.
Maybe other people can, but that doesn’t mean I can.
Cool, then don’t comment, or we’ll ban you.
That’s an interesting way to build a community.
Didn’t realize this was a safe place.
I see why people are sticking with Reddit rather than these fiefdoms.
That sounds like an underlying mental issue. I highly recommend seeking a therapist!
I have what is called allouttafucks syndrome. I can no longer suffer fools.
I work in cloud computing and it’s amazing to me how magical people like you think it is. Yes, Google owns YouTube, but YouTube could still run out of resources if Google so chooses; it is still at the mercy of its provider.
Services may be set up to dynamically grow, but they are still consuming finite physical resources and would run out if the provider doesn’t expand those resources.
The cloud most certainly can lose data due to hard drive failure and other hardware issues; the services are designed to make that very unlikely, but cloud services also have disaster recovery options you must implement if you want to be truly isolated from a given hardware footprint.
So Google only runs out of space if it is willing to. That is quite circular.
The amount of data YouTube is processing is not going to be affected by a small brigade.
Please cite when YouTube has lost content because of storage failure.
Do you understand the infrastructure Google has built around their services to prevent data loss?
I’m not arguing any fool with a cloud account cannot lose data. This is specifically about YouTube.
Your claim that the cloud can lose data because of hard drive failure is ridiculous.
Did he claim that then?
He modified the comment
There’s a physical limit to everything (I’m talking more about land space here; they can’t just keep adding new drives forever), especially since a site like YouTube will only get more expensive to run as time goes on. I expect them to start deleting content that doesn’t make them money, like videos with under 5 views that are over a year old, relatively soon, especially with the current economy. This would make sense for them and would free up a lot of their storage. Google doesn’t disclose the profit margins of YouTube, but I am pretty sure they are not very large, especially now with 8K HDR videos being available.
They’re already wiping inactive google accounts and all related content. It’s going to be problematic for old videos where the owners haven’t used the account in some time.
I have a friend that passed away past the limit, I’m going to need to make sure to archive all of his stuff or else it’ll all fall into the youtube void.
They’re already wiping inactive google accounts and all related content
GDPR requires that inactive accounts older than four years be wiped
I got the memo but accounts with YouTube videos seem to be exempt from this. This is good because I uploaded some content a few years back and forgot the passwords.
But first, will YouTube run out of video IDs? Tom Scott answers: https://youtube.com/watch?v=gocwRvLhDf8
Just based on the URL you shared: 11 characters needed. 26 lowercase letters + 26 uppercase letters + 10 numbers = 62 possible character choices. 62^11 = 5.2x10^19 possible IDs.
Edit: oooh I was close
It’s base 64: 26 uppercase + 26 lowercase + 10 digits + - and _. And there are 11 places, so in total it’s 64^11.
Your superscripts are messed up. Lemmy uses the syntax
a^b^
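The two ID counts discussed above are easy to check exactly, since Python integers don't overflow. This sketch assumes only what the comments state: 11-character IDs drawn from either 62 or 64 symbols.

```python
ID_LENGTH = 11

ids_base62 = 62 ** ID_LENGTH   # letters and digits only
ids_base64 = 64 ** ID_LENGTH   # letters, digits, '-' and '_'

print(f"62^11 = {ids_base62:,} (about {ids_base62:.1e})")
print(f"64^11 = {ids_base64:,} (about {ids_base64:.1e})")
print(f"the two extra symbols give {ids_base64 / ids_base62:.2f}x as many IDs")
```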
Go back into an old email account and try old links. I did that and found links that work from the early 00’s but there are a lot removed.
My first reddit link was from 2007. Memories…
Google has said they’ll start deleting long inactive accounts. I’m guessing they’ll go for unlisted next. Eventually they’ll try and prune their catalog to not have tons of unprofitable videos.
can’t wait until Google has a planet dedicated as its server ☺️
If we build a planet sized computer I got a question for it.
42
Yes, we know that. But I need to know the question.
What is 6 x 9?
Are you a mouse, by any chance?
No.
On an unrelated note, do you have any cheese?
All I have is this towel, I already gave all the fish to the dolphins…
They could call it something clever, like Google Earth.
Or the death star.
Can’t wait to live on the Jupiter Brain with the homies
Oh they might already have it: Planet Earth
I have wondered the same from time to time, but I really don’t think so. Tech keeps getting better and smaller (for the most part), so it wouldn’t be crazy to think we can double or triple the current amount of normal storage in a few years without compromising on size at all, and Google has the funds to adopt these new technologies sooner than most folks.