“Data storage cannot be a set-and-forget exercise”
The preeminence of the data-driven business has grown over the years; CIOs without a practical plan to make it happen will not survive. Yet evolving into a data-driven organization requires a multi-faceted strategy, from technology and hiring decisions to organizational priority setting.
Analytics and AI tools, people, skills and culture are of course necessary ingredients for data-driven operations. What is often overlooked goes deeper: the way the data itself is stored and managed. Data storage can no longer be a set-and-forget exercise. Our world has changed too much in the past decade.
Without a thoughtful plan and process for the ongoing management of data from the perspective of cost, performance and business needs, CIOs face impending disaster.
JAXenter: Considering this setup, remind us how we got to the current data management problem in the first place. How did it start?
Kumar Goswami: Twenty years ago, all but the largest companies had only one or two file servers in the data center, and the data stored was largely transactional. Data sprawl was not a problem. Yet with the rise of smartphones, online business models, IoT and cloud applications, a deluge of data has ensued. At the same time, computing power has become cheap enough, thanks in part to the cloud, that businesses can analyze massive volumes of data like never before. Meanwhile, data volumes have grown exponentially in recent years as technology has become ubiquitous in all aspects of work and home life, with smart home devices, industrial automation, medical devices, advanced computing and more.
As a result, a host of new storage technologies have come into play, including software-defined storage (SDS), high-performance all-flash NAS arrays, HCI, and many variants of cloud storage. But innovation in storage has not solved the problem, and in some cases it has made things worse because of all the silos. It has become impossible to keep pace with the growth and diversity of data, which these days is mostly unstructured.
JAXenter: Despite the data explosion you just described, I’m guessing IT organizations haven’t necessarily changed their storage strategies, have they?
Kumar Goswami: That’s right. They keep buying expensive storage devices because uncompromising performance is required for critical or “hot” data. The reality is that not all data is diamonds. Some of it is emeralds and some of it is glass. By treating all data the same, businesses create unnecessary costs and complexity.
For example, let’s look at backups. The purpose of regular backups is to protect hot or critical data, which departments need reliable and regular access to for day-to-day operations. Yet as hot data continues to grow, the backup process becomes slow. So you buy expensive high-end backup solutions to speed up this operation, but you still need more storage for all those copies of your data. The ratio of unique data (created and captured) to replicated data (copied and consumed) is approximately 1:9. By 2024, IDC expects this ratio to be 1:10. Most organizations back up and replicate data that is in fact rarely accessed and better suited to low-cost archives, such as those in the cloud.
Beyond the costs of backup and storage, organizations also need to secure all of this data. A one-size-fits-all policy means all data is secured to the level required by the most sensitive and important data. Large organizations spend 15% of their IT budget on security, according to a recent survey.
JAXenter: So where should large enterprises go from here? What’s the best approach for modern enterprise data management?
Kumar Goswami: It’s time for IT leaders to create a sustainable enterprise data management model that is fit for the digital age. By doing so, organizations can not only save significantly on storage and backup costs, but they will also be able to better leverage “hidden” and cold data for analysis. Here are the principles of this new model:
- Automation. It’s no longer enough to do an annual spring cleaning of data assets. Discovery needs to be a continuous, automated process that uses analytics to provide insight into the data (date, location, usage, file type) and then categorizes it as hot, warm, cool, or cold. Having ad hoc conversations with department managers is inefficient and no longer scales. Get data about your data.
- Segmentation. At a minimum, create two data buckets: hot and cold. The cold bucket will always be much larger than the hot one, which should remain relatively stable over time. On average, data turns cold after 90 days, but this can vary by industry. For example, a healthcare organization that stores large image files might consider a file to be warm after three days and cold after 60 days. Select a data management solution that can automatically move data to the storage device or cloud service appropriate to its age.
- Planning for dead data. It can be difficult to know for sure when and how IT can delete data, especially in a highly regulated organization. Deleting data is part of the overall lifecycle management process, although some organizations never delete anything. Scans can often indicate what data can be safely deleted. A good example is data left behind by former employees. Companies often unknowingly store large amounts of email data and files from employees who have left the company. In many cases this data can be purged, provided you know where it is and what it contains.
- A scalable storage ecosystem. New data storage technologies are always around the corner: DNA storage is just one example. Organizations need to work towards a flexible and agile data management strategy so that they can move data from one technology or cloud service to another without the high cost of vendor lock-in, which can take the form of the excessive cloud egress and rehydration charges that are common with many proprietary storage systems. This view can cause conflict in IT shops with strong vendor relationships and the desire to have “a throat to choke”. Over time, however, relying on a single vendor can be a limiting strategy with more drawbacks than benefits.
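As a rough illustration of the automation and segmentation principles above, the following Python sketch sorts file records into hot and cold buckets by the time since they were last accessed. The `FileRecord` type, the function names and the sample paths are hypothetical and are not taken from any particular data management product. The 90-day cutoff is simply the average mentioned above and would need to be tuned per industry.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Assumption: one cutoff, using the 90-day average cited in the interview.
COLD_AFTER = timedelta(days=90)


@dataclass
class FileRecord:
    """Hypothetical metadata record produced by a storage scan."""
    path: str
    size_bytes: int
    last_accessed: datetime


def classify(record: FileRecord, now: datetime,
             cold_after: timedelta = COLD_AFTER) -> str:
    """Return 'cold' if the file has not been accessed within cold_after."""
    return "cold" if now - record.last_accessed >= cold_after else "hot"


def summarize(records: list[FileRecord], now: datetime,
              cold_after: timedelta = COLD_AFTER) -> dict[str, int]:
    """Total the bytes that fall into each bucket."""
    totals = {"hot": 0, "cold": 0}
    for record in records:
        totals[classify(record, now, cold_after)] += record.size_bytes
    return totals


if __name__ == "__main__":
    now = datetime(2024, 1, 1)
    # Made-up sample data for illustration only.
    records = [
        FileRecord("/share/reports/q4.xlsx", 2_000_000, now - timedelta(days=5)),
        FileRecord("/share/scans/img001.dcm", 500_000_000, now - timedelta(days=400)),
        FileRecord("/share/home/former_employee/mail.pst", 3_000_000_000,
                   now - timedelta(days=120)),
    ]
    print(summarize(records, now))
```

A real deployment would read these attributes from a continuous scan of file servers and cloud stores rather than from a hard-coded list, and would likely add warm and cool tiers plus a check against retention rules before anything is moved or deleted.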
IT managers have the potential to unlock previously untapped stores of unstructured data to boost employee productivity, improve customer experience, and support new revenue streams. To achieve this, we must move beyond traditional data management practices. It’s time to stop treating all data the same, which only perpetuates the endless cycle of backups, mirroring, replication, and storage technology refreshes. By developing a modern model of information lifecycle management built on data analytics, automation and segmentation, organizations can have the right data in the right place, at the right time, and managed at the right price.