Data Storage
📚 Data Storage Measurements
In this reading, you will learn about the different names for measurements of data storage capacities and file sizes. Data storage capacity increases in step with the evolution of computer hardware technology. Larger storage capacities allow for dynamic growth in file sizes. These advances make it possible for companies like Netflix and Hulu to store thousands of feature-length films in high video quality formats.
📊 Standardized Terminology
There are standardized sets of terms used to name the ever-expanding sizes of data storage and files. For example, the common terms used to describe file sizes and hard drive storage capacity include: bytes, kilobytes, megabytes, gigabytes, and terabytes. However, if you are a computer engineer, you might use a different set of terms.
🔢 Decimal Nomenclature
Table illustrating decimal values for data storage measurements
The decimal naming system for computer storage uses the metric system of prefixes from the International System of Units: kilo, mega, giga, tera, peta, exa, zetta, and yotta. These prefixes may also be referred to as the decimal system of prefixes. The metric/decimal nomenclature represents a base-10 approximation of the actual amount of data storage bytes. The metric system prefixes were selected to simplify the marketing of computer products.
🔢 Binary Nomenclature
Table illustrating binary values for data storage measurements
The binary naming system is a standard set by the International Organization for Standardization (ISO) in partnership with the International Electrotechnical Commission (IEC). The ISO 80000 and IEC 80000 guides to units of measurement define the International System of Quantities (ISQ). The prefixes kibi-, mebi-, gibi-, tebi-, pebi-, exbi-, zebi-, and yobi- were created by the IEC. Each is a blend of the first two letters of the metric prefix and the first two letters of the word "binary" (example: mega + binary + byte = mebibyte).
Because computers address data in powers of 2, binary measurements describe computer data more precisely than decimal measurements. Decimal nomenclature is commonly used to market computers and computer parts to the general public, while binary nomenclature is often used in computer engineering for numerical accuracy.
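To make the difference between the two naming systems concrete, here is a minimal Python sketch that formats the same byte count with decimal and with binary prefixes. The function name `format_bytes` and the two unit lists are illustrative choices, not part of any standard library.

```python
# Unit labels for each naming system (illustrative constants).
DECIMAL_UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]
BINARY_UNITS = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB", "ZiB", "YiB"]


def format_bytes(n: int, binary: bool = False) -> str:
    """Return n bytes as a string using decimal (base 1000) or binary (base 1024) units."""
    base = 1024 if binary else 1000
    units = BINARY_UNITS if binary else DECIMAL_UNITS
    value = float(n)
    index = 0
    # Step up one unit at a time until the value fits below the base.
    while value >= base and index < len(units) - 1:
        value /= base
        index += 1
    return f"{value:.2f} {units[index]}"


drive = 10**12  # a capacity of exactly one decimal terabyte
print(format_bytes(drive))               # 1.00 TB
print(format_bytes(drive, binary=True))  # 931.32 GiB
```

The same number of bytes reads as 1.00 TB in decimal nomenclature and as 931.32 GiB in binary nomenclature, which is why a capacity can appear smaller when it is reported in binary units.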
📏 One bit
One bit, also called a binary digit, stores one of two values: 1, which represents the presence of an electric signal, or 0, which represents its absence and is also the default value of a bit. These two possible values are the basis of the binary number system (base-2) that computers use. In a base-2 number, each position is worth a power of 2, so place values double from right to left.
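As a small illustration of base-2 place values, the following Python sketch adds up the power of 2 for every bit that is set to 1. The helper name `bits_to_int` is hypothetical; Python's built-in `int(text, 2)` performs the same conversion.

```python
def bits_to_int(bits: str) -> int:
    """Convert a string of 0s and 1s to its decimal value by summing powers of 2."""
    total = 0
    # The rightmost bit is position 0 and is worth 2**0 = 1.
    for position, bit in enumerate(reversed(bits)):
        if bit == "1":
            total += 2 ** position
    return total


print(bits_to_int("1"))     # 1
print(bits_to_int("1010"))  # 8 + 2 = 10
```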
📏 One byte
One byte stores eight bits, each a one or a zero, that together translate to a symbol or a basic computer instruction. Examples: In the ASCII encoding, the byte 01101101 translates to the letter "m." The byte 01111111 tells the computer to delete the character to the right of the cursor.
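The two example bytes can be checked with a few lines of Python. This sketch assumes the ASCII encoding, in which code 109 is the lowercase letter "m" and code 127 is the delete control character; the variable names are illustrative.

```python
# The byte 01101101 written as a Python binary literal.
byte_m = 0b01101101
print(byte_m)       # 109
print(chr(byte_m))  # m

# Going the other way: the letter "m" as eight bits.
print(format(ord("m"), "08b"))  # 01101101

# The byte 01111111 is code 127, the ASCII delete (DEL) control character.
byte_del = 0b01111111
print(byte_del)  # 127
```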
📏 One kilobyte (1 KB)
Kilobyte (KB) decimal format: 10^3 = 1,000 bytes
Kibibyte (KiB) binary format: 2^10 = 1,024 bytes
Decimal inaccuracy: Off by -2.4% or -24 bytes
Name origin: "Kilo-" is a French derivation from the Ancient Greek word for "thousand." A kilobyte is one thousand bytes.
1 KB can hold: A short text file or a small icon as a 16x16 pixel .gif file.
📏 One megabyte (1 MB)
Megabyte (MB) decimal format: 10^6 = 1,000,000 bytes
Mebibyte (MiB) binary format: 2^20 = 1,048,576 bytes
Decimal inaccuracy: Off by -4.9% or -48,576 bytes
Name origin: "Mega-" is derived from the Ancient Greek word for "large." A megabyte is a large number of bytes.
1 MB can hold: Approximately one minute of music in the compressed (lossy) .mp3 format or a short novel.
📏 One gigabyte (1 GB)
Gigabyte (GB) decimal format: 10^9 = 1,000,000,000 bytes
Gibibyte (GiB) binary format: 2^30 = 1,073,741,824 bytes
Decimal inaccuracy: Off by -7.4% or -73,741,824 bytes
Name origin: "Giga-" is derived from the Greek word for "giant." A gigabyte is a giant number of bytes.
1 GB can hold: A high-definition movie or a large collection of photos.
📏 One terabyte (1 TB)
Terabyte (TB) decimal format: 10^12 = 1,000,000,000,000 bytes
Tebibyte (TiB) binary format: 2^40 = 1,099,511,627,776 bytes
Decimal inaccuracy: Off by -10.0% or -99,511,627,776 bytes
Name origin: "Tera-" is derived from the Greek word for "monster." A terabyte is a monstrous number of bytes.
1 TB can hold: A large library of books or hundreds of hours of video footage.
📏 One petabyte (1 PB)
Petabyte (PB) decimal format: 10^15 = 1,000,000,000,000,000 bytes
Pebibyte (PiB) binary format: 2^50 = 1,125,899,906,842,624 bytes
Decimal inaccuracy: Off by -12.6% or -125,899,906,842,624 bytes
Name origin: "Peta-" is derived from the Greek word for "five." A petabyte is 1000^5 bytes, which makes it 1,000 times larger than a terabyte.
1 PB can hold: A vast amount of data, such as the entire written works of humankind.
📏 One exabyte (1 EB)
Exabyte (EB) decimal format: 10^18 = 1,000,000,000,000,000,000 bytes
Exbibyte (EiB) binary format: 2^60 = 1,152,921,504,606,846,976 bytes
Decimal inaccuracy: Off by -15.3% or -152,921,504,606,846,976 bytes
Name origin: "Exa-" is derived from the Greek word for "six." An exabyte is 1000^6 bytes, which makes it 1,000 times larger than a petabyte.
1 EB can hold: A massive amount of data, such as the total digital content ever created.
📏 One zettabyte (1 ZB)
Zettabyte (ZB) decimal format: 10^21 = 1,000,000,000,000,000,000,000 bytes
Zebibyte (ZiB) binary format: 2^70 = 1,180,591,620,717,411,303,424 bytes
Decimal inaccuracy: Off by -18.1% or -180,591,620,717,411,303,424 bytes
Name origin: "Zetta-" is derived from the Italian word for "seven." A zettabyte is 1000^7 bytes, which makes it 1,000 times larger than an exabyte.
1 ZB can hold: An astronomical amount of data, far beyond current storage needs.
📏 One yottabyte (1 YB)
Yottabyte (YB) decimal format: 10^24 = 1,000,000,000,000,000,000,000,000 bytes
Yobibyte (YiB) binary format: 2^80 = 1,208,925,819,614,629,174,706,176 bytes
Decimal inaccuracy: Off by -20.9% or -208,925,819,614,629,174,706,176 bytes
Name origin: "Yotta-" is derived from the Greek word for "eight." A yottabyte is 1000^8 bytes, which makes it 1,000 times larger than a zettabyte.
1 YB can hold: An inconceivable amount of data, surpassing any current or foreseeable storage requirements.
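Please note that the decimal inaccuracy listed for each unit is due to the difference in base calculation between the decimal and binary systems. The binary system is based on powers of 2, while the decimal system is based on powers of 10. Each percentage expresses the difference in bytes relative to the decimal value, and the gap widens with every larger prefix.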
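The decimal inaccuracy figures can be reproduced with a short calculation. The sketch below assumes the percentage is the byte difference divided by the decimal value; the function name `decimal_shortfall` and the prefix list are illustrative.

```python
PREFIXES = ["kilo", "mega", "giga", "tera", "peta", "exa", "zetta", "yotta"]


def decimal_shortfall(power: int) -> tuple[int, float]:
    """Return (difference in bytes, difference in percent) for 1000**power vs 1024**power."""
    decimal_bytes = 1000 ** power  # for example, 10^3 for kilo
    binary_bytes = 1024 ** power   # for example, 2^10 for kibi
    difference = decimal_bytes - binary_bytes
    # Percentage is taken relative to the decimal value (an assumption of this sketch).
    percent = difference / decimal_bytes * 100
    return difference, percent


for power, prefix in enumerate(PREFIXES, start=1):
    difference, percent = decimal_shortfall(power)
    print(f"{prefix:>5}: {difference:,} bytes ({percent:.1f}%)")
```

Running the loop prints one line per prefix, from -24 bytes (-2.4%) for kilo to roughly -20.9% for yotta.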