What is the future of enterprise storage administration? Will the job of enterprise storage administrator still be necessary in 10 years? A friend in IT storage management recently asked me a question related to that very relevant topic. In a word, I believe the answer to that second question is a resounding yes, but alas, things are changing and we are going to have to embrace the inevitable changes in the industry to stay on top of our game and remain relevant. In recent years I’ve seen the need to broaden my skills and demonstrate how they can actually drive business value. The modern data center is undergoing a tremendous transformation, with hyper-converged systems, open source solutions, software-defined storage, large cloud-scale storage systems that companies can assemble all by themselves, and many others. This transformation is being driven by the need for business agility, and it’s being fueled by software innovation.
As the expansion of data into the cloud influences and changes our day-to-day management, we will begin to see the rise of the IT generalist in the storage world. These inevitable changes and the new tools that manage them will mean that storage will likely move toward being procured and managed by IT generalists rather than specialists like myself. Hyper-converged infrastructures will allow these generalists to manage an entire infrastructure with a single, familiar set of tools. As overall data center responsibilities start to shift to more generalist roles, traditional enterprise storage professionals like myself will need to expand our expertise beyond storage, or focus on more strategic projects where storage performance is critical. I personally see us starting to move away from the day-to-day maintenance of infrastructure and more toward how IT can become a real driver of business value. The glory days of on-prem SAN and storage arrays are nearing an end, but we old-timers in enterprise storage can still be a key part of the success of the businesses we work for. If we didn’t embrace change, we wouldn’t be in IT, right?
Despite all of these new technologies and trends, keep in mind that there are still some good reasons to take the classic architecture into consideration for new deployments. It’s not going to disappear overnight. It’s the business that drives the need for storage, and it’s the business applications that dictate the ideal architecture for your environment. Aside from the application, businesses will also be dependent on their existing in-house skills which will of course affect the overall cost analysis of embracing the new technologies, possibly pushing them off.
So, what are we in for? The following list summarizes my view on the key changes that I think we’ll see in the foreseeable future. I’m guessing you’d see these (along with many others) pop up in pretty much any Google search about the future of storage or storage trends, but these are the most relevant to what I’m personally witnessing.
- The public cloud is an unstoppable force. Embrace it as a storage admin or risk becoming irrelevant.
- Hyper-converged systems will become more and more common and will be driven by market demand.
- Hardware commoditization will continue to eat away at the proprietary hardware business.
- Storage vendors will continue to consolidate.
- We will see the rise of RDMA in enterprise storage.
- Open source storage software will mature and see more widespread acceptance.
- Flash continues to dominate and will be embraced by the enterprise, driving newer technologies like NVMe and diminishing technologies like Fibre Channel.
- GDPR will drive increased spending and an overall focus on data security.
- Scale-out and object solutions will become increasingly important.
- Data management and automation will increase in importance.
I believe that the future of cloud computing is undeniably hybrid. The future data center will likely represent a combination of cloud-based software products and on-prem compute, creating a hybrid IT solution that balances the scalability and flexibility associated with cloud, and the security and control you have with a private data center. With that said, I don’t believe that the cloud is a panacea, as there are always concerns about security, privacy, backups, and especially performance. In my experience, when the companies I’ve worked for have directed us to investigate cloud options for specific applications, on-premises infrastructure costs less than public cloud in the long run. Even so, there is no doubting the inexorable shift of projects, infrastructure, and spending to the cloud, and it will affect compute, networking, software, and storage. I expect I’ll see more and more push to find more efficient solutions that offer lower costs, likely resulting in hybrid solutions. When moving to the cloud, monitoring consumption is the key to cost savings. Cost management tools from the likes of Cloudability, Cloud Cruiser and Cloudyn are available and well worth looking at.
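To illustrate that long-run cost comparison, here’s a back-of-the-envelope sketch. Every number in it is hypothetical (a made-up array price, support contract, capacity, and cloud rate), not a real quote; the point is only that an upfront capex model and an ongoing opex model cross over at some year, and you should find where.

```python
# Hypothetical break-even sketch: upfront on-prem capex plus annual
# support vs. pay-as-you-go cloud opex. All figures are invented.

def onprem_cost(years, capex=200_000, support_per_year=20_000):
    """Total cost of an owned array over its service life."""
    return capex + support_per_year * years

def cloud_cost(years, tb=300, price_per_tb_month=25):
    """Total cost of equivalent cloud capacity at a flat monthly rate."""
    return tb * price_per_tb_month * 12 * years

for years in range(1, 6):
    op, cl = onprem_cost(years), cloud_cost(years)
    cheaper = "on-prem" if op < cl else "cloud"
    print(f"year {years}: on-prem ${op:,} vs cloud ${cl:,} -> {cheaper}")
```

With these made-up numbers the cloud wins for the first two years and on-prem wins from year three on, which matches what I’ve seen for steady-state workloads; your own inputs will move the crossover.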
I’ve also heard, “the cloud is already in our data center, it’s just private”. Contrary to popular belief, private clouds are not simply existing data centers running virtualized, legacy workloads. They are highly modernized application and service environments running on true cloud platforms (like AWS or Azure) residing either on-prem or in a hybrid scenario with a hosting services partner. As we shift more of our data to the cloud, we’ll see industry demand for storage move from “just in case” storage (an upfront capex model) to “just in time” storage (an ongoing opex model). “Just in time” storage has been a running joke for years for me in the more traditional data center environments that I’ve been responsible for, alluding to the fact that we’d get storage budget approved, ordered and installed just days before reaching full capacity. That’s not what I’m referring to in this case… “Just in time” means online service providers run at much higher asset utilization than the typical customer, and can add capacity in more granular increments. The migration to cloud allows for a much more efficient “just in time” model than I’m used to, and allows the switch to an ongoing opex model.
A hyper-converged infrastructure can greatly simplify the management of IT and yes, it could reduce the need for skilled storage administrators: the complexities of storage, servers and networking that require separate skills to manage are hidden ‘under the hood’ by that software layer, allowing it to be managed by staff with more general IT skills through a single administrative interface. Hyper-converged infrastructure is also much easier to scale, and in smaller increments, than traditional integrated systems. Instead of making major infrastructure investments every few years, businesses can simply add modules of hyper-converged infrastructure when they are needed.
It seems like an easy sell. It’s a data center in a box. Fewer components, a smaller data center footprint, reduced energy consumption, lower cooling requirements, reduced complexity, rapid deployment time, fewer high level skill requirements, and reduced cost. What else could you ask for?
As it turns out, there are issues. Hyper-converged systems require a massive amount of interoperability testing, which means hardware and software updates take a very long time to be tested, approved and released. A brand new Intel chipset can take half a year to be approved. There is a tradeoff between performance and interoperability. In addition, you won’t be saving any money over a traditional implementation, hyper-converged requires vendor lock-in, and performance and capacity must be scaled out at the same time. Even with those potential pitfalls, hyper-converged systems are here to stay and will continue to be adopted at a fast pace in the industry. The pros tend to outweigh the cons.
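That “performance and capacity must be scaled out at the same time” point is worth a quick sketch. In a hyper-converged system the node is the only unit of growth, so a lopsided workload strands whichever resource you didn’t need. The node specs below are invented for illustration, not any vendor’s actual configuration.

```python
# Hyper-converged scaling sketch: each node bundles a fixed amount of
# compute and storage, so growing one dimension drags the other along.
import math

NODE_CORES = 32   # hypothetical cores per HCI node
NODE_TB = 20      # hypothetical usable TB per HCI node

def nodes_needed(required_cores, required_tb):
    """Nodes are the only unit of growth, so size to the larger need."""
    return max(math.ceil(required_cores / NODE_CORES),
               math.ceil(required_tb / NODE_TB))

# Storage-heavy workload: modest compute but lots of capacity.
n = nodes_needed(required_cores=64, required_tb=200)
print(f"nodes: {n}, stranded cores: {n * NODE_CORES - 64}")
```

A capacity-driven purchase here buys ten nodes’ worth of compute to get two nodes’ worth of cores used, which is exactly the tradeoff a traditional array with independent shelf expansion avoids.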
The commoditization of hardware will continue to eat away at proprietary hardware businesses. The cost savings from economies of scale always seem to overpower the benefits of specialized solutions. Looking at history, there has been a long pattern of mass-market products completely wiping out low-volume high-end products, even superior ones. Open source software using off-the-shelf hardware will become more common as we move toward the commoditization of storage.
I believe most enterprises in general lack the in-house talent required to combine third-party or open source storage software with commodity hardware in a way that can guarantee the scalability and resilience that would be required. I think we’re moving in that direction, but we’re not likely to see it become prevalent in enterprise storage soon.
The mix of storage vendors in typical enterprises is not likely to be radically changed anytime soon, but it’s coming. Startups, even with their innovative storage software, have to deal with concerns about interoperability, supportability and resilience, and those concerns aren’t going anywhere. While the endorsement of a startup by one of the major vendors could change that, I think the current largest players like Dell/EMC and NetApp might be apprehensive in accelerating the move to storage hardware commoditization.
I believe that software innovation has decisively shifted to open source, and we’re seeing that more and more in the enterprise storage space. You can take a look at many current open source solutions in my previous blog post here. Moreover, I can’t think of a single software market that has a proprietary packaged software vendor that defines and leads the field. Open source allows fast access to innovative software at little or no cost, allowing IT organizations to redirect their budget to other new initiatives.
Enterprise architecture groups, which have generally focused on which proprietary vendor they should lock themselves in to, are now faced with the onerous task of selecting the appropriate open source software components, figuring out how they’ll be integrated, and doing interoperability testing, all while ensuring that they are maintaining a reliable infrastructure for the business. As you might expect, implementing open source requires a much higher level of technical ability than traditional proprietary solutions. Having the programming knowledge to build and support an open source solution is far different than operating someone else’s supported solution. I’m seeing some traditional vendors move to the “milk the installed base” strategy and stifle their own internal innovation. If we want to showcase ourselves as technology leaders, we’re going to have to embrace open source solutions, despite the drawbacks.
While open source software can increase complexity and include poorly tested features and bugs, the overall maturity and general usability of open source storage software has been improving in recent years. With the right staff, implementation risks can be managed. For some businesses, the cost benefits of moving to that model are very tangible. Open source software has become commonplace in the enterprise, especially in the Linux realm. Linux of course pretty much started the open source movement, followed by widely adopted enterprise applications like MySQL, Apache, and Hadoop. Open source software can allow businesses to develop customized, innovative IT solutions to address their challenges while at the same time bringing down acquisition costs by using commodity hardware.
Storage industry analysts have predicted the slow death of Fibre Channel-based storage for a long time. I expect that trend to speed up, with the steadily increasing speed of standard Ethernet all but eliminating the need for proprietary SAN connections and the expensive Fibre Channel infrastructure that comes along with them. NVMe over Ethernet will drive it. NVMe is a high-performance interface for solid-state drives (SSDs) and, predictably, it will be embraced by all-flash vendors moving forward.
All the current storage trends you’ve read about around efficiency, flash, performance, big data, machine learning, object storage, hyper-converged infrastructure, etc. are moving against the current Fibre Channel standard. Production deployments are not yet widespread, but it’s coming. It allows vendors and customers to get the most out of flash (and other non-volatile memory) storage. The rapid growth of all-flash arrays has kept Fibre Channel alive because an all-flash array typically replaces a legacy disk or hybrid Fibre Channel array.
Legacy Fibre Channel vendors like Emulex, QLogic, and Brocade have been acquired by larger companies so those companies can milk the cash flow from the expensive FC hardware before their customers convert to Ethernet. I don’t see any growth or innovation in the FC market moving forward.
In case you haven’t noticed, it’s near the end of 2017 and flash has taken over. It was widely predicted, and from what I’ve seen personally, those predictions absolutely came true. While it still may not rule the data center overall, new purchases have trended that way for quite some time now. Within the past year the organizations I’ve worked for have completely eliminated spinning disk from block storage purchases, relying instead on the value proposition of all-flash, with data reduction capabilities making up for the smaller footprint. SSDs are now growing in capacity faster than HDDs (15TB SSDs have been announced) and every storage vendor now has an all-flash offering.
Consolidate and Innovate
The environment for flash startups is getting harder because all the traditional vendors now offer their own all-flash options. There are still startups making exciting progress in NVMe over Fabrics, object storage, hyper-converged infrastructure, data classification, and persistent memory, but only a few can grow into profitability on their own. We will continue to see acquisitions of these smaller, innovative startups as the larger companies struggle to develop similar technologies internally.
RDMA will continue to become more prevalent in enterprise storage, as it significantly boosts performance. RDMA, or Remote Direct Memory Access, has actually been around in the storage arena for quite a while as a cluster interconnect and for HPC storage. Most high-performance scale-out storage arrays use RDMA for their cluster communications. Examples include Dell FluidCache for SAN, XtremIO, VMAX3, IBM XIV, InfiniDat, and Kaminario. A Microsoft blog I was reading showed 28% more throughput, realized through the reduced IO latency. It also illustrated that RDMA is more CPU efficient, which leaves the CPU available to run more virtual machines. TCP/IP is of course no slouch and is absolutely still a viable deployment option. While not quite as fast and efficient as RDMA, it will remain well suited for organizations that lack the expertise needed for RDMA.
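It may not be obvious why lower latency shows up as higher throughput, so here’s the arithmetic. At a fixed queue depth, IOPS is roughly queue depth divided by per-IO latency (Little’s Law), so shaving latency raises throughput proportionally. The latencies below are illustrative values I picked to reproduce a ~28% gain, not the numbers from that Microsoft post.

```python
# Little's Law sketch: at a fixed queue depth, IOPS ~= qd / latency,
# so a latency reduction translates directly into a throughput gain.
# The two latencies below are hypothetical, chosen for illustration.

def iops(queue_depth, latency_s):
    """Sustained IO rate with queue_depth IOs in flight."""
    return queue_depth / latency_s

tcp_lat = 100e-6    # hypothetical per-IO latency over TCP/IP
rdma_lat = 78e-6    # hypothetical per-IO latency over RDMA
qd = 32

gain = iops(qd, rdma_lat) / iops(qd, tcp_lat) - 1
print(f"throughput gain from reduced latency: {gain:.0%}")
```

Note that the queue depth cancels out of the ratio; only the latency improvement matters for the relative gain.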
The Importance of Scale-Out
Scale-up storage is showing its age. If you’re reading this, you probably know that scale-up is limited by the scalability limits of the storage controllers and has for years led to storage system sprawl. As we move into a multi-data-center architecture, especially in the world of object storage, clusters will be extended by adding nodes in different geographical areas. As object storage is geo-aware (I am in the middle of a global ECS installation), policies can be established to distribute data into these other locations. As a user accesses the storage, the object storage system will return data from the node that provides the best response time to that user. As data storage needs continue to rapidly grow, it’s critical to move toward a scale-out architecture vs. scale-up. The scalability that scale-out storage offers will help reduce costs, complexity, and resource allocation.
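The geo-aware read path can be sketched in a few lines. This is a simplification of what a system like ECS does internally, not its actual API; the site names and latency probes are invented for illustration.

```python
# Geo-aware read routing sketch: serve a request from the replica site
# with the best measured response time. Sites and latencies are made up.

def best_node(latencies_ms):
    """Pick the replica site with the lowest round-trip time."""
    return min(latencies_ms, key=latencies_ms.get)

# Hypothetical probe results as seen by a user on the US east coast.
probes = {"us-east": 12.0, "eu-west": 85.0, "ap-south": 190.0}
print(best_node(probes))
```

In practice the real system also weighs replica freshness and node load, but response-time-based selection is the core idea: add a node in a new geography and nearby users start reading from it.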
The General Data Protection Regulation takes effect in 2018 and applies to any entity doing business within any EU country. Under the GDPR, companies will need to build controls around security roles and levels in regard to data access and data transfer, and must provide tight data-breach mechanisms and notification protocols. As process controls these will probably have little impact on your infrastructure; however, the two main points within the GDPR with the most potential to directly impact storage are data protection by design and data privacy by default.
The GDPR is going to require you to think about the benefits of cloud vs. on-prem solutions. Data will have to meet the principle of privacy by default, be in an easily portable format and meet the data minimization principle. Liability under the new regulation falls on all parties, however, so cloud providers will have to have robust compliance solutions in place as well, meaning it could be a simpler, less-expensive route to look at a cloud or hybrid solution in the future.