
Reporting on LUN auto-tier distribution

We have auto-tiering turned on in all of our storage pools, which all use EFD, FC, and SATA disks.  I created a script that generates a list of all of our LUNs and the current tier distribution for each one.  Note that this script is designed to run on Unix; it can also be run with Cygwin installed on a Windows server if you don’t have access to a Unix-based host.

You will first need to create a text file with a list of the hostnames for your arrays (or the IP of one of the storage processors for each array).  Separate lists must be made for VNX and older Clariion arrays, because the naviseccli output changed with VNX; for example, the tier that a CX reports as “Flash” is reported as “Extreme Performance” by a VNX for the same command.  I have one file named san.list for the older arrays and another named san2.list for the VNX arrays.
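The list files are just plain text with one array hostname (or SP IP address) per line.  For illustration, they might look like this (the hostnames are placeholders):

/reports/tiers/san.list:
clariion1_hostname
clariion2_hostname
clariion3_hostname

/reports/tiers2/san2.list:
VNX1_hostname
VNX2_hostname
VNX3_hostname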

As I mentioned in my previous post, our naming convention for LUNs includes the pool ID, LUN number, server name, filesystem/drive letter, last four digits of the array’s serial number, and size (in GB).  Having all of this information embedded in the LUN name is what makes the report truly useful: a plain list of LUN names already contains everything I need for reporting.  If I need to look at tier distribution for a certain server, I simply filter the list in the spreadsheet for the server name (which is part of the LUN name).

Here’s what our LUN names look like: P1_LUN100_SPA_0000_servername_filesystem_150G
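Filtering doesn’t even require a spreadsheet; a quick grep against the finished CSV report does the same thing (the server name here is a placeholder):

grep "servername" /reports/clariion1_hostname.report.csv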

As I said earlier, because of the output differences in the naviseccli command between VNX arrays and older CX arrays, I have two separate scripts.  I’ll include the complete scripts first, then explain in more detail what each section does.

Here is the script for CX series arrays:

#!/bin/bash
# Loop over every CX array listed in san.list
for san in `/bin/cat /reports/tiers/san.list`
do
     # Build a list of LUN names for this array; grep LUN works because our
     # naming convention puts the string "LUN" in every LUN name
     naviseccli -h $san lun -list -tiers |grep LUN |awk '{print $2}' > $san.out
     # Pull the tier distribution for each LUN into its own <lun>.<san>.dat file
     for lun in `cat $san.out`
        do
        sleep 2
        echo $san
        naviseccli -h $san -np lun -list -name $lun -tiers > $lun.$san.dat &
     done
     # Let the backgrounded naviseccli queries finish before building the report
     wait
     # Archive yesterday's report, then start a fresh CSV with a header row
     mv $san.report.csv $san.report.`date +%j`.csv
     echo "LUN Name","FLASH","FC","SATA" > $san.report.csv
     # Parse each per-LUN file and append one CSV row per LUN
     for lun in `cat  $san.out`
        do
        echo $lun
        echo `grep Name $lun.$san.dat |awk '{print $2}'`","`grep -i flash $lun.$san.dat |awk '{print $2}'`","`grep -i fc $lun.$san.dat |awk '{print $2}'`","`grep -i sata $lun.$san.dat |awk '{print $2}'` >> $san.report.csv
     done
done

./csv2htm.pl -e -T -i /reports/clariion1_hostname.report.csv -o /reports/clariion1_hostname.report.html

./csv2htm.pl -e -T -i /reports/clariion2_hostname.report.csv -o /reports/clariion2_hostname.report.html

./csv2htm.pl -e -T -i /reports/clariion3_hostname.report.csv -o /reports/clariion3_hostname.report.html

Here is the script for VNX series arrays:

#!/bin/bash
# Loop over every VNX array listed in san2.list
for san in `/bin/cat /reports/tiers2/san2.list`
do
   # Build a list of LUN names for this array
   naviseccli -h $san lun -list -tiers |grep LUN |awk '{print $2}' > $san.out
   # Pull the tier distribution for each LUN into its own <lun>.<san>.dat file
   for lun in `cat $san.out`
     do
     sleep 2
     echo $san.Generating-LUN-List
     naviseccli -NoPoll -h $san lun -list -name $lun -tiers > $lun.$san.dat &
   done
   # Let the backgrounded naviseccli queries finish before building the report
   wait
   # Archive yesterday's report, then start a fresh CSV with a header row
   mv $san.report.csv $san.report.`date +%j`.csv
   echo "LUN Name","FLASH","FC","SATA" > $san.report.csv
   # Parse each per-LUN file; the VNX tier names are Extreme Performance,
   # Performance, and Capacity rather than Flash, FC, and SATA
   for lun in `cat  $san.out`
      do
      echo $lun
      echo `grep Name $lun.$san.dat |awk '{print $2}'`","`grep -i extreme $lun.$san.dat |awk '{print $3}'`","`grep -i Performance $lun.$san.dat |grep -v Extreme|awk '{print $2}'`","`grep -i Capacity $lun.$san.dat |awk '{print $2}'` >> $san.report.csv
   done
done

./csv2htm.pl -e -T -i /reports/VNX1_hostname.report.csv -o /reports/VNX1_hostname.report.html

./csv2htm.pl -e -T -i /reports/VNX2_hostname.report.csv -o /reports/VNX2_hostname.report.html

./csv2htm.pl -e -T -i /reports/VNX3_hostname.report.csv -o /reports/VNX3_hostname.report.html
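For reference, the tier section of each per-LUN .dat file on a VNX should look roughly like this — this is inferred from the field positions the awk commands above expect, with made-up percentages, so verify it against your own array’s output:

Name:  P1_LUN962_0000_SPB_servername_filesystem_350G
Tier Distribution:
Extreme Performance:  4.74%
Performance:  75.26%
Capacity:  20.00%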
Here is a more detailed explanation of the script (shown for the CX version; the VNX version works the same way, it just parses different tier names).

Section 1:

The entire script runs in a loop over the SAN hostname entries in the list file.  For each array, naviseccli writes a list of LUN names to <san_hostname>.out; we’ll use that list in the next section to pull the tier information for every LUN that needs to be monitored.

for san in `/bin/cat /reports/tiers/san.list`

do

naviseccli -h $san lun -list -tiers |grep LUN |awk '{print $2}' > $san.out
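The grep LUN works because our naming convention puts the string “LUN” in every LUN name, so only the Name: lines match; awk then keeps just the second field, which is the name itself.  A quick check against a sample line of output:

echo "Name:  P1_LUN962_0000_SPB_servername_filesystem_350G" | grep LUN | awk '{print $2}'
P1_LUN962_0000_SPB_servername_filesystem_350G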
Section 2:

This section runs the naviseccli command for every LUN in each of the <san_hostname>.out files and writes a separate text file with the tier distribution for each LUN.  If you have 500 LUNs, 500 text files will be created in the directory you run the script from.  The queries are launched in the background (with a two second pause between them), and the wait that follows this loop in the full script lets them finish before the report is built.

     for lun in `cat $san.out`
        do
        sleep 2
        echo $san
        naviseccli -h $san -np lun -list -name $lun -tiers > $lun.$san.dat &
     done
Each file will be named <lun_name>.<san_hostname>.dat, and the contents look like this:
LOGICAL UNIT NUMBER 962
Name:  P1_LUN962_0000_SPB_servername_filesystem_350G
Tier Distribution: 
Flash:  4.74%
FC:  95.26%
Section 3:

This line simply makes a copy of the previous day’s output file for archiving purposes.  The %j adds the day of the year (001-366, the “Julian date”) to the file name, so each archive file is automatically overwritten after one year.  It’s a self-cleaning archive directory.  🙂

mv $san.report.csv $san.report.`date +%j`.csv
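For example, if the script runs on the 227th day of the year (a date picked just for illustration), the mv expands to something like this for the first array, and next year’s day-227 run will overwrite the archived copy:

date +%j            # prints 227 on the 227th day of the year
mv clariion1_hostname.report.csv clariion1_hostname.report.227.csv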

Section 4:

This section processes each individual LUN file, pulling out only the tier information that we need, and combines everything into one large output file in CSV format.

The first line creates a blank CSV file with the appropriate column headers.

echo "LUN Name","FLASH","FC","SATA" > $san.report.csv

This block of code parses each individual LUN file, using grep to find the line for each column we need in the report and awk to grab only the specific text we want from that line.  For example, if the LUN output file has “Flash:  4.74%” on one line and we want just the “4.74%” with the word “Flash:” stripped off, awk ‘{print $2}’ grabs only the second field on that line.
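A quick illustration of that pipeline on a single sample line:

echo "Flash:  4.74%" | awk '{print $2}'
4.74%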

     for lun in `cat  $san.out`
        do
        echo $lun
        echo `grep Name $lun.$san.dat |awk '{print $2}'`","`grep -i flash $lun.$san.dat |awk '{print $2}'`","`grep -i fc $lun.$san.dat |awk '{print $2}'`","`grep -i sata $lun.$san.dat |awk '{print $2}'` >> $san.report.csv
     done
done
Once every LUN file has been processed and added to the report, I run the csv2htm.pl perl script (from http://www.jpsdomain.org/source/perl.html) to convert the reports to HTML pages for our intranet site.  The csv files are also added as download links on the site.
./csv2htm.pl -e -T -i /reports/clariion1_hostname.report.csv -o /reports/clariion1_hostname.report.html

./csv2htm.pl -e -T -i /reports/clariion2_hostname.report.csv -o /reports/clariion2_hostname.report.html

./csv2htm.pl -e -T -i /reports/clariion3_hostname.report.csv -o /reports/clariion3_hostname.report.html
And finally, the output looks like this:

LUN Name                                        FLASH    FC       SATA
P0_LUN101_0000_SPA_servername_filesystem_100G   24.32%   67.57%   8.11%
P0_LUN102_0000_SPA_servername_filesystem_100G   5.92%    58.77%   35.31%
P1_LUN103_0000_SPA_servername_filesystem_100G   7.00%    81.79%   11.20%
P1_LUN104_0000_SPA_servername_filesystem_100G   1.40%    77.20%   21.40%
P0_LUN200_0000_SPA_servername_filesystem_100G   5.77%    75.06%   19.17%
P0_LUN201_0000_SPA_servername_filesystem_100G   6.44%    71.21%   22.35%
P0_LUN202_0000_SPA_servername_filesystem_100G   4.55%    90.91%   4.55%
P0_LUN203_0000_SPA_servername_filesystem_100G   10.73%   80.76%   8.52%
P0_LUN204_0000_SPA_servername_filesystem_100G   8.62%    88.31%   3.08%
P0_LUN205_0000_SPA_servername_filesystem_100G   10.88%   82.65%   6.46%
P0_LUN206_0000_SPA_servername_filesystem_100G   7.00%    81.79%   11.20%
P0_LUN207_0000_SPA_servername_filesystem_100G   1.40%    77.20%   21.40%
P0_LUN208_0000_SPA_servername_filesystem_100G   5.77%    75.06%   19.17%


Strategies for implementing Multi-tiered FAST VP Storage Pools

After speaking to our local rep and attending many different classes at the most recent EMC World in Vegas, I came away with some good information and a very logical best practice for implementing multi-tiered FAST VP storage pools.

First and foremost, you have to use Flash.  High RPM Fiber Channel drives are neither capacity efficient nor performance efficient; the hottest, highest-IO data needs to be hosted on Flash drives.  The most effective split of drives in a storage pool is 5% Flash, 20% Fiber Channel, and 75% SATA.

Using this example, if you have an existing SAN with 167 15,000 RPM 600GB Fiber Channel drives, you could replace them with 97 drives in the 5/20/75 blend and get the same capacity with much improved performance (a quick capacity check follows the list below):

  • 25 200GB Flash Drives
  • 34 15K 600GB Fiber Channel Drives
  • 38 2TB SATA Drives
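As a rough sanity check on the capacity claim (raw capacity only, ignoring RAID and hot spare overhead), the arithmetic works out like this:

echo "Existing: 167 x 600GB FC  = $((167*600)) GB"               # 100200 GB
echo "Flash:     25 x 200GB     = $((25*200)) GB"                # 5000 GB  (~5%)
echo "FC:        34 x 600GB     = $((34*600)) GB"                # 20400 GB (~20%)
echo "SATA:      38 x 2000GB    = $((38*2000)) GB"               # 76000 GB (~75%)
echo "Blend:     97 drives      = $((25*200+34*600+38*2000)) GB" # 101400 GB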

The ideal scenario is to implement FAST Cache along with FAST VP.  FAST Cache continuously ensures that the hottest data is served from Flash drives.  With FAST Cache, up to 80% of your data IO can come from cache (legacy DRAM cache served up only about 20%).

It can be a hard pill to swallow when you see how much the Flash drives cost, but that cost is offset by increased disk utilization and a reduction in the total number of drives and DAEs you need to buy.  With all-FC pools, disk utilization is sacrificed to get the needed performance: very little of the capacity is actually used, and you buy tons of disks just to get more spindles into the RAID groups.  Flash drives can run at much higher utilization, which reduces their effective cost.

After implementing this at my company I’ve seen dramatic performance improvements.  It’s an effective strategy that really works in the real world.

In addition to this, I’ve also been implementing storage pools in identically sized pairs.  The first pool is designated only for SP A, the second only for SP B.  When I get a request for storage, say for 1 TB, I create a 500GB LUN in the first pool owned by SP A and a 500GB LUN in the second pool owned by SP B.  When the disks are presented to the host server, the server administrator stripes the data across the two LUNs.  Using this method, I can better balance the load across the storage processors on the back end.  A rough sketch of what that pair of LUN creations looks like is below.
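This is only a sketch: the pool names, LUN numbers, and sizes are made up, and the exact lun -create options should be verified against your FLARE/VNX OE release before use.

naviseccli -h array_hostname lun -create -type nonThin -capacity 500 -sq gb -poolName "Pool 1" -sp a -l 500 -name "P1_LUN500_SPA_0000_servername_filesystem_500G"
naviseccli -h array_hostname lun -create -type nonThin -capacity 500 -sq gb -poolName "Pool 2" -sp b -l 501 -name "P2_LUN501_SPB_0000_servername_filesystem_500G"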