I’ve been covering data storage on and off for 20 years, which nicely coincides with the first 20% of the century. I decided to compile a list of the most important developments in data storage since 2000.
First, consider where we were on December 31, 1999: Nobody was quite sure what would happen to computer systems when the clock struck midnight; security mattered but not in the post-September 11 way of constant cyberwar; legacy companies like Digital Equipment Corp. were still relevant although under the Compaq banner; broadband internet access was just beginning to surpass dial-up for home users; Google was a startup; social media was a fad for teenagers; PalmPilots were still hip, and smartphones were in their infancy.
Following are the top 11 new mainstream movements in storage since that time. I’m reluctant to use the word “inventions” because the underlying technologies mostly evolved from computer science labs in the 1960s, 1970s, and 1980s. I present the list alphabetically—feel free to debate their order of importance among your colleagues!
Cloud storage

What if a large portion of your company’s data resided off-site, on infrastructure managed by a third party? Pros: Someone else maintains the hardware; in some cases, you can add or reduce capacity at will; and you’re automatically protected from local disasters. Cons: Do you trust non-employees to secure your data? What if your internet access goes down? And are the cost savings real or a mirage?
Data deduplication (dedupe)

There used to be no easy way to know whether you were paying to store duplicate information; then deduplication, or dedupe, came along. Suppose 100 employees each stored a copy of a company policy document or a sales spreadsheet; it would be far more efficient if they all accessed a single copy. Dedupe software constantly scans your storage to purge redundancies. Pros: It saves money. Cons: It can slow down your network.
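The core trick behind dedupe is content hashing: identical bytes produce identical fingerprints, so only one physical copy needs to be kept while everyone else holds a reference. A minimal sketch, assuming a simple in-memory chunk store (the function and variable names here are illustrative, not any vendor’s API):

```python
import hashlib

def store(block, chunk_store, refs):
    """Keep one physical copy per unique chunk; duplicates become references."""
    digest = hashlib.sha256(block).hexdigest()  # content fingerprint
    if digest not in chunk_store:               # first time we've seen these bytes
        chunk_store[digest] = block
    refs.append(digest)                         # everyone else just points at it

chunk_store, refs = {}, []
for _ in range(100):                            # 100 employees, same policy doc
    store(b"company policy v3", chunk_store, refs)
print(len(refs), len(chunk_store))              # 100 references, 1 stored copy
```

Real products dedupe at the sub-file (chunk) level and do this inline or as a background scan, which is where the network-slowdown concern comes from.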
Incremental backup

Would you trade slower recovery for faster backup? Many companies happily make that bet. Before this method, you saved all of your data all of the time; after it, you save everything only now and then and, most of the time, save only the changes. The risk is that recovery takes longer and is more likely to go wrong if you need to completely rebuild a system. But for many companies it is a necessity, because data is growing faster than the amount of non-critical time available for performing full backups.
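The trade-off can be sketched in a few lines: making an incremental backup is cheap because only changed files are copied, while recovery must replay the full backup plus every increment in order. A toy version, with illustrative function names (deleted files are ignored for brevity):

```python
def full_backup(files):
    """Copy everything: slow to make, fast to restore."""
    return dict(files)

def incremental_backup(files, last_state):
    """Copy only files that changed since the last known state."""
    return {name: data for name, data in files.items()
            if last_state.get(name) != data}

def restore(full, increments):
    """Replay the full backup plus every increment, in order.
    More steps than restoring a full backup -- more ways to go wrong."""
    state = dict(full)
    for inc in increments:
        state.update(inc)
    return state

# Monday: full backup. Tuesday: one file changes.
monday = {"policy.doc": "v1", "sales.xls": "q1"}
full = full_backup(monday)
tuesday = {"policy.doc": "v2", "sales.xls": "q1"}
inc = incremental_backup(tuesday, monday)
print(inc)                      # only the changed file is saved
print(restore(full, [inc]))     # replaying the chain recovers everything
```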
JBOD (just a bunch of disks)

The big storage hardware vendors, such as EMC, Hitachi, IBM, and later Network Appliance (NetApp), all wanted you to believe that network-attached storage (NAS) and storage area networks (SANs) had to be built from expensive enterprise-class hard drives. People gradually realized that cheap commodity drives were almost as good at a fraction of the price, and thus the term “just a bunch of disks” was born. The idea is that you can run NAS or SAN software on your own commodity hardware, rather than paying the big boys for potentially over-built and almost always overpriced systems. It’s no coincidence that the large companies put all of their research and marketing dollars into software in the 2010s.
Non-volatile memory express (NVMe)
There are many benefits to replacing your traditional hard disks with solid-state drives (see below), but one problem is that the motherboard connections between processors and storage were not designed to handle the new speeds. It’s like buying a Ferrari and being limited to 55 miles per hour. NVMe is quickly becoming the standard for faster connections between components, thereby reducing the bottleneck. (NVMe also enables storage-class memory, which is the farther-out idea that disks of any type can be replaced with non-volatile RAM.)
Object storage

Object storage became mainstream throughout the 2010s.
Enterprise data has always lacked context, residing on servers as files or on storage networks as blocks. Object storage came out of research labs to provide not just your data but also metadata and unique identifiers, which are perfect for unstructured data such as healthcare records and your Facebook posts.
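The data-plus-metadata-plus-identifier bundle is the whole idea. A minimal sketch of an object store, with an entirely hypothetical interface (real systems such as S3 expose this over HTTP, not a Python class):

```python
import uuid

class ObjectStore:
    """Each object bundles data, metadata, and a globally unique ID."""

    def __init__(self):
        self._objects = {}

    def put(self, data, **metadata):
        oid = str(uuid.uuid4())  # unique, location-independent identifier
        self._objects[oid] = {"data": data, "metadata": metadata}
        return oid

    def get(self, oid):
        return self._objects[oid]

store = ObjectStore()
# Unlike a bare file or block, the context travels with the bytes:
oid = store.put(b"<scan bytes>", record_type="X-ray", department="radiology")
obj = store.get(oid)
print(obj["metadata"]["record_type"])  # X-ray
```

Because the identifier is flat and globally unique rather than a path or a block address, objects can live anywhere, which is why the model fits sprawling unstructured data so well.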
Solid-state drives (SSDs)
There will always be magnetic tapes for slow long-term storage, better known as cold storage, but conventional spinning hard disks are quickly becoming obsolete thanks to falling prices for solid-state drives. Compared with hard-disk drive (HDD) products, SSD technology is dramatically faster, uses less energy, makes less noise, requires less space, and is almost as cheap. The big concerns are whether SSD products will last long enough between failures and whether they’re more prone to corrupting your data.
Storage-class memory

Few people can remember back to a time when computers weren’t made of spinning disks for storage and volatile RAM for memory. What if RAM could be non-volatile? The chips that are used to build memory are faster than the ones in solid-state drives, so you could have a computer that’s entirely based around the memory and processor. No separate drive of any type is needed. Researchers are also working on ways to put memory immediately adjacent to, or on top of, a processor, rather than someplace else on a motherboard. Eliminate most of that pathway entirely and enjoy superfast computing. It is coming soon.
Storage standards

Wouldn’t it be great if all major storage hardware and storage software companies made their products work to a single standard? That is, great for users, and not so great for vendors. The idea was huge in the 2000s but proved to be a pipe dream. Storage networking industry groups are still trying to make it happen, and I wish them the best of luck.
Storage virtualization

Everyone talks about server virtualization, but storage virtualization is just as important. The idea is to manage all of your storage (a network here, a file system there, perhaps some objects and data lakes and tapes and server silos someplace else) as one massive pool. It is an idea that works in doses (byte-sized, you could say) but not in any Utopian way. The exception, where it works as drawn up, is when all of your storage comes from one company, or perhaps when you’re running a supercomputer installation.
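The "one massive pool" idea boils down to an abstraction layer that hides which backend actually holds the bytes. A deliberately naive sketch (backend names, the `free_gb` field, and the placement policy are all hypothetical; real products do far more than first-fit):

```python
class StoragePool:
    """Present heterogeneous backends (NAS, SAN, tape, object) as one pool."""

    def __init__(self, backends):
        self.backends = backends

    def free_capacity_gb(self):
        # The caller sees a single number, not five silos
        return sum(b["free_gb"] for b in self.backends)

    def place(self, size_gb):
        # Naive first-fit placement: first backend with enough room
        for b in self.backends:
            if b["free_gb"] >= size_gb:
                b["free_gb"] -= size_gb
                return b["name"]
        raise RuntimeError("pool exhausted")

pool = StoragePool([{"name": "nas1", "free_gb": 10},
                    {"name": "san1", "free_gb": 500}])
print(pool.free_capacity_gb())  # 510 -- one pool, two silos
print(pool.place(100))          # san1 -- nas1 is too small
```

The hard part, and the reason the idea works only "in doses," is making that placement decision sensibly across silos with wildly different performance, cost, and failure characteristics.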
USB memory sticks

USB memory sticks debuted in 2000 offering 8 megabytes in a device the size of your finger.
USB memory sticks are still popular today, but now capacities start at 8 gigabytes, nicely following a Moore’s Law curve. Will that curve hold through 2040 to give us 8 terabytes, or roughly 8.8 trillion bytes, in the same tiny package? What kind of data will fill that space?
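The curve is easy to check: doubling every two years turns 8 MB in 2000 into roughly 8 GB by 2020 (2^10 = 1024, close to the 1,000x observed), and the same projection reaches about 8 TB by 2040. A quick back-of-the-envelope calculation (the function name is mine, and the doubling period is an assumption, not a law):

```python
def projected_capacity_mb(year, base_mb=8, base_year=2000, doubling_years=2):
    """Project capacity assuming it doubles every `doubling_years` years."""
    return base_mb * 2 ** ((year - base_year) / doubling_years)

print(projected_capacity_mb(2020) / 1024)       # 8.0 -- 8 GB, matching today
print(projected_capacity_mb(2040) / 1024 ** 2)  # 8.0 -- 8 TB if the curve holds
```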