If you go to the store and buy a 4 gigabyte flash drive, you will notice it doesn’t show up as 4 gigabytes on your computer*. It shows up as slightly less. You might even get curious and ask someone why this happens. One of the most common responses seems to be that part of the storage is taken up by the drive for formatting. I’ve also met people who think that the manufacturers just try to put 4 gigabytes in, but the amount can vary. This has to be my favorite answer, because it brings to mind a factory worker who’s just too lazy to keep pouring in bytes from the byte hose on the assembly line before putting on the USB cap. Then, at the end of the day, the janitor sweeps up all the extra bytes that spilled out onto the floor. Unfortunately, neither answer is correct.
You see, when you plug that 4 gigabyte flash drive into your computer, your computer reads it as… 4 gigabytes. Now, I know that sounds wrong, and I know that when you plug in your 4 gigabyte drive your computer says it reads 3.73 GB*. But the sad truth is that your computer is lying. This isn’t a malicious lie, though. Microsoft isn’t trying to convince you that you have less storage as part of some nefarious world domination plot. It actually started as a small lie a long time ago that has grown exponentially over time. It’s a story of base 2 and base 10, the metric system, the importance of defining variables, and a bit of math. So if you’re not into numbers, you should probably stop reading now and just go back to the picture of a hard drive being filled by a hose with bits and bytes.
The first thing that we need to understand is the bit. A bit is the smallest piece of digital information. It essentially describes something being on or off, and is represented with either a 1 or a 0. Every computer, and everything digital, is at its most basic level made up of 1s and 0s. Your processor’s calculations, memory, hard drive storage, and even the digital signal that makes up your HD display can all be broken down into 1s and 0s. Numbers in this system are represented in base 2, or binary. Each place can hold only a 0 or a 1, and rolls over to the next place if another 1 is added. Ex:
Base 10|Base 2
1|0001
2|0010
3|0011
4|0100
5|0101
6|0110
7|0111
8|1000
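If you want to generate that table yourself, here is a minimal Python sketch. The function name `to_binary` and the 4-digit width are my own choices for illustration, picked to match the table above.

```python
def to_binary(n, width=4):
    """Return n written in base 2, zero-padded to `width` digits.

    The 4-digit default is only an illustrative choice that matches
    the table in the text.
    """
    return format(n, "0{}b".format(width))


# Print the same rows as the table: base 10 on the left, base 2 on the right.
print("Base 10|Base 2")
for n in range(1, 9):
    print("{}|{}".format(n, to_binary(n)))
```

Notice how 7 (0111) rolls over to 8 (1000): every place was already holding a 1, so adding one more carries all the way to the next place.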
The next step up from a bit is a byte. Generally, a byte is 8 bits and can take 256 values (0-255). It used to be that each value of a byte represented a character, such as a letter of the alphabet. You will also see 255 show up in a lot of older video games, since most variables were only one byte. Since the byte was more useful than the bit, it became the more common term when talking about computer storage.
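The 256 and the 255 both fall straight out of the number of bits. Here is a small Python sketch of that arithmetic; the names `value_count` and `largest_value` are made up for this example.

```python
BITS_PER_BYTE = 8


def value_count(bits):
    """How many distinct values fit in this many bits."""
    return 2 ** bits


def largest_value(bits):
    """The biggest number those bits can hold, counting up from 0."""
    return value_count(bits) - 1


print(value_count(BITS_PER_BYTE))                 # 256
print(largest_value(BITS_PER_BYTE))               # 255
print(format(largest_value(BITS_PER_BYTE), "b"))  # 11111111
```

The largest value is one less than the count because counting starts at 0, which is why a one-byte variable tops out at 255.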
The missing megabyte problem started when we tried to use the metric system to label the number of bytes in a system. Normally, in the metric system, when you add the kilo prefix to something, it means 1000. The problem is that 1000 is 1111101000 in binary, which was not easily calculated by early computers. Working with it used more processor time, and on older computers processor power was precious. The solution was to use 1024 for kilo instead of 1000. Since 1024 is a power of 2, it is represented by 10000000000 in binary, and it is only off by 24 bytes per kilobyte. When a computer with 16K of RAM was impressive, this didn’t seem like too large of an issue.
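To see why 1024 is the friendlier number for a binary machine, here is a rough Python sketch. The helper name `bytes_to_k` is hypothetical. It relies on the fact that dividing a non-negative whole number by 1024 gives the same result as dropping its last ten binary digits; that this shortcut is what made 1024 attractive on early hardware is my reading of the paragraph above, not something the code proves.

```python
# The two candidates for "kilo", written in base 2.
print(format(1000, "b"))  # 1111101000
print(format(1024, "b"))  # 10000000000


def bytes_to_k(n):
    """Convert a byte count to 1024-byte units by shifting right 10 places.

    For non-negative integers this equals n // 1024, with no division needed.
    """
    return n >> 10


# A "16K" machine, counted in 1024-byte units.
print(bytes_to_k(16 * 1024))  # 16
```

There is no equivalent trick for 1000, since its binary form is not a single 1 followed by zeros.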
The problem was that 1024 bytes was still being called a kilobyte, instead of its correct name, kibibyte. The ‘ibi scale works off of multiples of 1024, as opposed to the multiples of 1000 the metric system normally uses. As computer storage grew larger, the problem grew larger. Next time you see a hard drive box, look at the side and you’ll find a handy chart telling you that 1TB=1,000,000,000,000 bytes. Your computer reads a megabyte as 1024^2 bytes, while the actual (and hard drive manufacturer) definition of a megabyte is 1000^2 bytes. This makes a megabyte 4.63% smaller than a mebibyte, compared with the 2.34% gap between a kilobyte and a kibibyte. This gap gets worse and worse every time you go up a unit, to the point where it looks like you have “lost” 69 gigabytes on a terabyte hard drive.
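Here is a short Python sketch of how that gap grows with each prefix. The function name `shortfall_percent` is my own, and "power" just means how many times you multiply by 1000 or 1024 (1 for kilo, 2 for mega, and so on).

```python
PREFIXES = ["kilo", "mega", "giga", "tera"]


def shortfall_percent(power):
    """Percent by which the 1000-based unit falls short of the 1024-based one."""
    decimal_unit = 1000 ** power
    binary_unit = 1024 ** power
    return (1 - decimal_unit / binary_unit) * 100


# Walk up the prefixes and print the gap at each step.
for power, prefix in enumerate(PREFIXES, start=1):
    print("{}: {:.2f}%".format(prefix, shortfall_percent(power)))
```

By the time you reach tera, the gap has grown from 2.34% to a little over 9%.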
Nowadays, it takes virtually no processing power to go from base 2 to base 10 readings on hard drives. Currently, the only major operating system to read hard drives in base 10 is OS X Snow Leopard. Yet most computers still read the size of hard drives incorrectly. But now when someone complains to you about buying a new 500GB hard drive only to have it “filled” with 465GB, or when someone talks about “formatting space” being taken up on the missing .27GB of their 4GB flash drive, you can know better.
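If you want to check the numbers on your own drives, here is a minimal Python sketch. The function name `reported_size` is hypothetical, and it assumes the box counts in multiples of 1000 while the computer counts in multiples of 1024, as described above.

```python
def reported_size(advertised, power=3):
    """Size a computer shows for a drive advertised in 1000-based units.

    power=3 means gigabytes: the box counts 1000**3 bytes per unit,
    while the computer divides the same byte count by 1024**3.
    """
    total_bytes = advertised * 1000 ** power
    return total_bytes / 1024 ** power


print("{:.2f}".format(reported_size(4)))     # 3.73
print("{:.2f}".format(reported_size(500)))   # 465.66
print("{:.2f}".format(reported_size(1000)))  # 931.32
```

The last line is the terabyte drive: 1000 advertised gigabytes show up as about 931, which is where the “lost” 69 gigabytes come from.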
*unless you’re running OS X 10.6 or higher
Author: Alex McKenzie