We are looking at implementing a storage virtualization device and I started doing a bit of research on EMC’s product offering. Below is a summary of some of the information I’ve gathered, including a description of what VPLEX does as well as some pros and cons of implementing it. This is all info I’ve gathered by reading various blogs, looking at EMC documentation and talking to our local EMC reps. I don’t have any first-hand experience with VPLEX yet.
What is VPLEX?
VPLEX at its core is a storage virtualization appliance. It sits between your arrays and hosts and virtualizes the presentation of storage arrays, including non-EMC arrays. Instead of presenting storage to the host directly, you present it to the VPLEX, configure that storage from within the VPLEX, and then zone the VPLEX to the host. Basically, you attach any storage to it and, like other in-band virtualization devices, it virtualizes and abstracts that storage.
There are three VPLEX product offerings, Local, Metro, and Geo:
Local. VPLEX Local manages multiple heterogeneous arrays from a single interface within a single data center location. VPLEX Local allows increased availability, simplified management, and improved utilization across multiple arrays.
Metro. VPLEX Metro with AccessAnywhere enables active-active, block level access to data between two sites within synchronous distances. Host application stability needs to be considered; depending on the application, it is recommended that inter-site latency for Metro be <= 5ms. The combination of virtual storage with VPLEX Metro and virtual servers allows for the transparent movement of VMs and storage across longer distances and improves utilization across heterogeneous arrays and multiple sites.
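To see why the latency guidance matters, here is a back-of-the-envelope sketch (my own illustration, not an EMC formula): with synchronous, active-active writes, the host's acknowledgement has to wait for both sites, so the inter-site round trip adds directly to every write.

```python
# Illustrative model: synchronous (active-active) writes are not acked
# until both sites have the data, so inter-site round-trip time (RTT)
# adds directly to host-observed write latency.

def metro_write_latency_ms(local_write_ms: float, inter_site_rtt_ms: float) -> float:
    """Approximate host-observed write latency across a Metro pair."""
    return local_write_ms + inter_site_rtt_ms

# A 1 ms local write with a 5 ms inter-site round trip becomes ~6 ms at the host.
print(metro_write_latency_ms(1.0, 5.0))  # 6.0
```

This is why the 5 ms guidance is application-dependent: a latency-sensitive database may not tolerate every write getting several milliseconds slower, while a file server might not notice.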
Geo. VPLEX Geo with AccessAnywhere enables active-active, block level access to data between two sites within asynchronous distances. Geo improves the cost efficiency of resources and power. It provides the same distributed device flexibility as Metro but extends the distance up to 50ms of network latency.
Here are some links to VPLEX content from EMC, where you can learn more about the product:
- EMC VPLEX Local data sheet: http://www.emc.com/collateral/hardware/specification-sheet/h7077-VPLEX-local-ss.pdf
- EMC Engineering tech book for VPLEX Metro: http://www.emc.com/collateral/hardware/technical-documentation/h7113-VPLEX-architecture-deployment.pdf
- EMC’s VPLEX Performance Best Practices: https://www.emc.com/collateral/white-papers/h11299-emc-VPLEX-elements-performance-testing-best-practices-wp.pdf
What are some advantages of using VPLEX?
1. Extra Cache and Increased IO. VPLEX has a large cache (64GB per node) that sits in between the host and the array. It adds a layer of read cache that can greatly improve read performance, on databases especially, because reads served from VPLEX cache are offloaded from the individual arrays.
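As a rough illustration of that benefit (the numbers here are assumptions, not EMC measurements), average read latency is just a weighted blend of cache hits and back-end array reads, so it drops as the hit ratio climbs:

```python
# Illustrative sketch: average read latency as a weighted blend of
# reads served from the virtualization layer's cache vs. the back-end array.

def avg_read_latency_ms(hit_ratio: float, cache_ms: float, array_ms: float) -> float:
    """Weighted average of cache-hit latency and array-read latency."""
    return hit_ratio * cache_ms + (1.0 - hit_ratio) * array_ms

# Hypothetical numbers: 60% hit ratio, 0.5 ms cache hits, 8 ms array reads.
print(avg_read_latency_ms(0.6, 0.5, 8.0))  # 3.5
```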
2. Enhanced options for DR with RecoverPoint. The DR benefits increase when integrating RecoverPoint with VPLEX Metro or Geo to replicate the data in real time. RecoverPoint includes a capacity-based journal for very granular rollback capabilities (think of it as a DVR for the data center). You can also use its native bandwidth reduction features (compression & deduplication), or disable them if you have WAN optimization devices installed, like those from Riverbed. If you want active/active read/write access to data across a large distance, VPLEX is your only option; NetApp's V-Series and HDS USP-V can't do it unless they are in the same data center. Here are a few more advantages:
- DVR-like recovery to any point in time
- Dynamic synchronous and asynchronous replication
- Customized recovery point objectives that support any-to-any storage arrays
- WAN bandwidth reduction of up to 90% of changed data
- Non-disruptive DR testing
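To put the bandwidth-reduction claim in perspective, here is some rough arithmetic (the change rate is hypothetical, and the up-to-90% figure is simply taken from the claim above): shipping a daily change set over the WAN with and without reduction.

```python
# Rough arithmetic sketch: average WAN throughput (Mbit/s) needed to
# replicate a daily change set within 24 hours, for a given reduction ratio.

def wan_mbps(changed_gb_per_day: float, reduction: float) -> float:
    """Average sustained WAN rate to ship the reduced change set in a day."""
    megabits = changed_gb_per_day * (1.0 - reduction) * 8 * 1000  # GB -> Mbit
    return megabits / (24 * 3600)

# Hypothetical 500 GB/day of changed data, at 90% reduction vs. none:
print(round(wan_mbps(500, 0.9), 2))  # ~4.63 Mbit/s
print(round(wan_mbps(500, 0.0), 2))  # ~46.3 Mbit/s
```

In other words, a reduction in that range is the difference between a modest WAN circuit and a tenfold larger one, before any Riverbed-style optimization is considered.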
3. Non-disruptive data mobility & reduced maintenance costs. One of the biggest benefits of virtualizing storage is that you'll never have to take downtime for a migration again. It can take months to migrate production systems, and without virtualization downtime is almost always required. Migration is also expensive: it takes a great deal of resources from multiple groups, as well as the cost of keeping the older array on the floor during the process. Overlapping maintenance costs are expensive too. By shortening the migration timeframe, hardware maintenance costs will drop, saving money. Maintenance can be a significant part of the storage TCO, especially if the arrays are older or are going to be used for a longer period of time. Virtualization can be a great way to reduce those costs and improve the return on assets over time.
4. Flexibility based on application IO. The ability to move and balance LUN I/O among multiple smaller arrays non-disruptively allows you to balance workloads and respond to performance demands quickly. Note that underlying LUNs can be aggregated or simply passed through the VPLEX.
5. Simplified Management and vendor neutrality. Implementing VPLEX for all storage-related provisioning tasks reduces the complexity of running multiple vendors' arrays, since you manage heterogeneous arrays from a single interface. It also makes zoning easier: every host only needs to be zoned to the VPLEX rather than to every array on the floor, which makes it faster and easier to provision new storage to a new host.
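The zoning simplification is easy to quantify with a simple counting sketch (an illustration of the scaling, not a zoning best practice): zoning every host directly to every array grows multiplicatively, while zoning everything only to the virtualization layer grows additively.

```python
# Counting sketch: number of host<->target relationships to maintain.

def zones_direct(hosts: int, arrays: int) -> int:
    """Every host zoned to every array: grows as hosts * arrays."""
    return hosts * arrays

def zones_via_virtualizer(hosts: int, arrays: int) -> int:
    """Hosts zoned only to the virtualizer, arrays behind it: hosts + arrays."""
    return hosts + arrays

# Hypothetical shop with 100 hosts and 6 arrays:
print(zones_direct(100, 6))          # 600
print(zones_via_virtualizer(100, 6)) # 106
```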
6. Increased leverage among vendors. This advantage holds for any virtualization device. When controller-based storage virtualization is employed, there is more flexibility to pit vendors against each other to get the best hardware, software, and maintenance costs. Older arrays could be commoditized, allowing increased leverage to negotiate the best rates.
7. Use older arrays for Archiving. Data could be seamlessly demoted or promoted to different arrays based on an array's age, its performance levels, and its related maintenance costs. Older arrays could be retained for capacity and demoted to a lower tier of service, and even with the increased maintenance costs this could still save money.
8. Scale. You can scale it out and add more nodes for more performance when needed. In a VPLEX Metro configuration, you could configure VPLEX with up to 16 nodes in the cluster across the two sites.
What are some possible disadvantages of VPLEX?
1. Licensing Costs. VPLEX is not cheap. It can also be licensed per frame on VNX but must be licensed per TB on the CX series, so your large, older CX arrays will cost you a lot more to license.
2. It’s one more device to manage. The VPLEX is an appliance, and it’s one more thing (or things) that has to be managed and paid for.
3. Added complexity to infrastructure. Depending on the configuration, there could be multiple VPLEX appliances at every site, adding considerable complexity to the environment.
4. Managing mixed workloads in virtual environments. When heavy workloads are all mixed together on the same array there is no way to isolate them, and the ability to migrate a workload non-disruptively to another array is one of the reasons to implement a VPLEX. In practice, however, those VMs may end up being moved to another array with the same storage limitations as the one they came from. The VPLEX may simply be relocating a problem temporarily rather than solving it.
5. Lack of advanced features. The VPLEX has no advanced storage features such as snapshots, deduplication, replication, or thin provisioning; it relies on the underlying storage array for those types of features. As an example, you may want to add block-based deduplication to an HDS array by placing a NetApp V-Series in front of it and using NetApp's dedupe. That is only possible with a NetApp V-Series or HDS USP-V type device; the VPLEX can't do it.
6. Write cache performance is not improved. The VPLEX uses write-through caching, while its competitors' storage virtualization devices use write-back caching. When there is a write I/O in a VPLEX environment, the I/O is cached on the VPLEX but is also passed all the way back to the virtualized storage array before an ack is sent to the host. The NetApp V-Series and HDS USP-V will store the I/O in their own cache and immediately return an ack to the host; the I/Os are then flushed to the back-end storage array using their respective write-coalescing and cache-flushing algorithms. Because of the write-back behavior, it is possible to see a performance gain above and beyond that of the underlying storage arrays due to the caching on those controllers. There is no performance gain for write I/O in VPLEX environments beyond the existing storage, due to the write-through cache design.
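The difference between the two designs can be sketched in a few lines (the timings are made up for illustration): write-through acks only after the back-end array commits, while write-back acks as soon as the virtualization layer has cached the write and flushes to the array later.

```python
# Conceptual sketch of the two caching policies described above.
# Timings are hypothetical, chosen only to show the shape of the difference.

ARRAY_WRITE_MS = 5.0   # hypothetical back-end array commit time
CACHE_WRITE_MS = 0.3   # hypothetical controller cache insert time

def ack_latency_ms(policy: str) -> float:
    """Host-observed ack time under each caching policy."""
    if policy == "write-through":    # VPLEX-style: wait for the array commit
        return CACHE_WRITE_MS + ARRAY_WRITE_MS
    if policy == "write-back":       # V-Series/USP-V-style: ack from cache,
        return CACHE_WRITE_MS        # flush to the array asynchronously later
    raise ValueError(f"unknown policy: {policy}")

print(ack_latency_ms("write-through"))  # 5.3
print(ack_latency_ms("write-back"))     # 0.3
```

The trade-off, of course, is that write-back designs must protect that cached-but-unflushed data (mirrored, battery-backed cache), which write-through avoids by never acking ahead of the array.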