Tag Archives: scripts

Auto transferring reports from VNX to an IIS web server via FTP

I previously posted on how to create a script that monitors your Celerra replication jobs.  I have an intranet web page that is updated daily with many other reports (most of which I’ve posted about here), so I thought I’d add this one to the web page as well rather than having to search through my inbox for it every day.

My challenge was finding an easy, automated way to get files from the Celerra to a Windows-based web server, and FTP turned out to be the simplest option.  Because my internal Windows web server is also my internal FTP server, I can place the file directly in the public folder for easy web publishing.  Now that the report is working and updating on the intranet page every day, my next task will be to come up with a more secure method using SSH or SCP, but this works well for now.

The big challenge in creating a bash script using FTP is figuring out how to pass the user id and password.  I tried various methods unsuccessfully and finally settled on using the .netrc file.  Create a file named .netrc in your home directory (in my case I put it in /home/nasadmin) containing a line with the following syntax:

machine <ftp_server_name> login <ftp_login_id> password <ftp_password>

Once that is created, you need to do a chmod 600 on the .netrc file in order for it to work.  If the permissions are not set to 600 on that file the auto-login to the FTP server will fail.
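As a minimal sketch of the two steps above, the following creates a .netrc and locks down its permissions. It writes to a scratch directory rather than the real home directory, and the server name, login, and password are placeholders I made up for illustration. The permission check uses GNU stat, which is an assumption about the system it runs on.

```shell
# Sketch only: use a scratch directory instead of the real home directory.
netrc_dir=$(mktemp -d)
netrc_file="$netrc_dir/.netrc"

# Placeholder values (assumptions): substitute your own server, login, and password.
printf 'machine %s login %s password %s\n' \
    "ftp.example.internal" "reportuser" "changeme" > "$netrc_file"

# Restrict the file to its owner; ftp auto-login is refused otherwise.
chmod 600 "$netrc_file"

# Confirm the resulting mode (GNU stat syntax).
netrc_mode=$(stat -c '%a' "$netrc_file")
echo "$netrc_file has mode $netrc_mode"
```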

My next step was to create the script that sends the replication status report to the IIS web server:

#!/bin/bash
cd /home/nasadmin/scripts
ftp <ftp_server_name> <<SCRIPT
put <filename>.csv
quit
SCRIPT
I always chmod the script to 755 after creating it in vi, which also makes it executable.  The script always ran fine manually, but I struggled for a while getting it to work properly when run from crontab.  I figured out that you must cd to the correct directory in the script before you call the ftp command; if you don’t, you will get “file not found” errors when cron runs it.  I was always running it manually from within that directory, so I didn’t immediately catch that problem. 🙂
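One way to make a script independent of the directory it is called from is to have it change to its own location first. The sketch below is a variation on the hard-coded cd in the script above, not what I actually run; the script name show_dir.sh and the scratch directory are made up for the demonstration.

```shell
# Write a small throwaway script into a scratch directory.
work_dir=$(mktemp -d)
cat > "$work_dir/show_dir.sh" <<'EOF'
#!/bin/bash
# Change to the directory that holds this script; stop if that fails.
cd "$(dirname "$0")" || exit 1
pwd -P
EOF

# Call it from an unrelated directory, much as cron would.
reported_dir=$(cd / && bash "$work_dir/show_dir.sh")
expected_dir=$(cd "$work_dir" && pwd -P)
echo "script ran in: $reported_dir"
```

Because the script resolves its own location, relative file names such as the csv in the put command would be found no matter where the job is started.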

I then added the above script to crontab on the Celerra.  I run it at 6AM every morning with the following entry:

0 6 * * * /scripts/repl_status.sh

For those not familiar with cron, you can add an entry using “crontab -e”, and list your current entries with “crontab -l”.  The first two fields in the line, “0 6”, are the minute and the hour; in this case the script will run at 6:00AM every day.
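To make the field order concrete, here is a small sketch that splits the crontab entry above into its six parts. The variable names are mine; cron itself only cares about the position of each field.

```shell
cron_line='0 6 * * * /scripts/repl_status.sh'

# read splits on whitespace: five schedule fields, then the command.
read -r minute hour day_of_month month day_of_week cron_command <<EOF
$cron_line
EOF

printf 'minute=%s hour=%s day_of_month=%s month=%s day_of_week=%s\n' \
    "$minute" "$hour" "$day_of_month" "$month" "$day_of_week"
printf 'command=%s\n' "$cron_command"
```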

I have a download link to the csv file on my web page, and I also have a script on my web server that converts the csv file to HTML output with a perl script called csv2html.pl so the data can be easily viewed without having to download the csv and open it in excel.  You can find csv2html.pl easily with a google search, I’ve blogged about it in previous posts as well.

That’s it!  An easy way to automatically push your reports to another server from the Celerra.  Now that I have the transfer method down, I’ll be adding more daily reports in the near future.  If anyone has experience doing this type of transfer from a Celerra (or Linux server) to a windows server via SSH or SCP, please comment! 🙂


Reporting on the state of VNX auto-tiering

 

To go along with my previous post (reporting on LUN tier distribution) I also include information on the same intranet page about the current state of the auto-tiering job.  We run auto-tiering from 10PM to 6AM to avoid moving data during business hours or our normal backup window in the evening.

Sometimes the auto-tiering job gets very backed up and would theoretically never finish in the time slot that we have for data movement.  I like to keep tabs on the amount of data that needs to move up or down, and the amount of time that the array estimates until its completion.  If needed, I will sometimes modify the schedule to run 24 hours a day over the weekend and change it back early on Monday morning.  Unfortunately, EMC did not design the auto-tiering scheduler to allow for creating different time windows on different days. It’s a manual process.

This is a relatively simple, one line CLI command, but it provides very useful info and it’s convenient to add it to a daily report to see it at a glance.

I run this script at 6AM every day, immediately following the end of the window for data to move:

naviseccli -h clariion1_hostname autotiering -info -state -rate -schedule -opStatus > c:\inetpub\wwwroot\clariion1_hostname.autotier.txt

naviseccli -h clariion2_hostname autotiering -info -state -rate -schedule -opStatus > c:\inetpub\wwwroot\clariion2_hostname.autotier.txt

naviseccli -h clariion3_hostname autotiering -info -state -rate -schedule -opStatus > c:\inetpub\wwwroot\clariion3_hostname.autotier.txt

naviseccli -h clariion4_hostname autotiering -info -state -rate -schedule -opStatus > c:\inetpub\wwwroot\clariion4_hostname.autotier.txt

 ....
 The output for each individual clariion looks like this:
Auto-Tiering State: Enabled
Relocation Rate: Medium

Schedule Name: Default Schedule
Schedule State: Enabled
Default Schedule: Yes
Schedule Days: Sun Mon Tue Wed Thu Fri Sat
Schedule Start Time: 22:00
Schedule Stop Time: 6:00
Schedule Duration: 8 hours
Storage Pools: Clariion1_SPB, Clariion2_SPA

Storage Pool Name: Clariion2_SPA
Storage Pool ID: 0
Relocation Start Time: 12/05/11 22:00
Relocation Stop Time: 12/06/11 6:00
Relocation Status: Inactive
Relocation Type: Scheduled
Relocation Rate: Medium
Data to Move Up (GBs): 1854.11
Data to Move Down (GBs): 909.06
Data Movement Completed (GBs): 2316.00
Estimated Time to Complete: 9 hours, 12 minutes
Schedule Duration Remaining: None

Storage Pool Name: Clariion1_SPB
Storage Pool ID: 1
Relocation Start Time: 12/05/11 22:00
Relocation Stop Time: 12/06/11 6:00
Relocation Status: Inactive
Relocation Type: Scheduled
Relocation Rate: Medium
Data to Move Up (GBs): 1757.11
Data to Move Down (GBs): 878.05
Data Movement Completed (GBs): 1726.00
Estimated Time to Complete: 11 hours, 42 minutes
Schedule Duration Remaining: None
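If you only want the backlog numbers at a glance, the saved text file can be condensed to one line per pool. This is a rough sketch that assumes the report layout shown above; the sample data is copied from that output, trimmed to the lines that matter, and the temporary file stands in for the real .autotier.txt file.

```shell
# Sample data copied from the report above (trimmed).
autotier_report=$(mktemp)
cat > "$autotier_report" <<'EOF'
Storage Pool Name: Clariion2_SPA
Relocation Status: Inactive
Data to Move Up (GBs): 1854.11
Data to Move Down (GBs): 909.06
Estimated Time to Complete: 9 hours, 12 minutes

Storage Pool Name: Clariion1_SPB
Relocation Status: Inactive
Data to Move Up (GBs): 1757.11
Data to Move Down (GBs): 878.05
Estimated Time to Complete: 11 hours, 42 minutes
EOF

# Split each line on ": " and print one summary line per storage pool.
autotier_summary=$(awk -F': ' '
    /^Storage Pool Name/          { pool = $2 }
    /^Data to Move Up/            { up = $2 }
    /^Data to Move Down/          { down = $2 }
    /^Estimated Time to Complete/ { printf "%s: up %s GB, down %s GB, ETA %s\n", pool, up, down, $2 }
' "$autotier_report")

echo "$autotier_summary"
```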
 
 

Reporting on LUN auto-tier distribution

We have auto-tiering turned on in all of our storage pools, which all use EFD, FC, and SATA disks.  I created a script that will generate a list of all of our LUNs and the current tier distribution for each LUN.  Note that this script is designed to run in unix.  It can be run using cygwin installed on a Windows server if you don’t have access to a unix based server.

You will first need to create a text file with a list of the hostnames for your arrays (or the IP to one of the storage processors for each array).  Separate lists must be made for VNX vs. older Clariion arrays, as the naviseccli output was changed for VNX.  For example, “Flash” in the text output on a CX was changed to “Extreme Performance” as the output from a VNX when you run the same command.  I have one file named san.list for the older arrays, and another named san2.list for the VNX arrays.

As I mentioned in my previous post, our naming convention for LUNs includes the pool ID, LUN number, server name, filesystem/drive letter, last four digits of the array’s serial number, and size (in GB). Having all of this information in the LUN name makes for very easy reporting.  This information is what truly makes this report useful, as simply having a list of LUNs gives me all the information I need for reporting.  If I need to look at tier distribution for a certain server from this report, I simply filter the list in the spreadsheet for the server name (which is included in the LUN name).

Here’s what our LUN names look like: P1_LUN100_SPA_0000_servername_filesystem_150G
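Because every piece of the name is separated by an underscore, the name can be split back into its parts with the shell alone. The sketch below uses the example name above and assumes that server and filesystem names never contain underscores themselves; the variable names are only for illustration.

```shell
lun_name='P1_LUN100_SPA_0000_servername_filesystem_150G'

# Split on underscores; the IFS change applies only to this read.
IFS=_ read -r pool_id lun_id default_sp serial server filesystem size <<EOF
$lun_name
EOF

printf 'pool=%s lun=%s sp=%s serial=%s\n' "$pool_id" "$lun_id" "$default_sp" "$serial"
printf 'server=%s filesystem=%s size=%s\n' "$server" "$filesystem" "$size"
```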

As I said earlier, because of output differences from the naviseccli command on VNX arrays vs. older CX’s, I have two separate scripts.  I’ll include the complete scripts first, then explain in more detail what each section does.

Here is the script for CX series arrays:

for san in `/bin/cat /reports/tiers/san.list`
do
naviseccli -h $san lun -list -tiers |grep LUN |awk '{print $2}' > $san.out 
     for lun in `cat $san.out`
        do
        sleep 2
        echo $san
        naviseccli -h $san -np lun -list -name $lun -tiers > $lun.$san.dat &
     done
     # Wait for the background naviseccli jobs to finish before the .dat files are read
     wait

mv $san.report.csv $san.report.`date +%j`.csv 
echo "LUN Name","FLASH","FC","SATA" > $san.report.csv 
     for lun in `cat  $san.out`
        do
        echo $lun
        echo `grep Name $lun.$san.dat |awk '{print $2}'`","`grep -i flash $lun.$san.dat |awk '{print $2}'`","`grep -i fc $lun.$san.dat |awk '{print $2}'`","`grep -i sata $lun.$san.dat |awk '{print $2}'` >> $san.report.csv
     done
 done

./csv2htm.pl -e -T -i /reports/clariion1_hostname.report.csv -o /reports/clariion1_hostname.report.html

./csv2htm.pl -e -T -i /reports/clariion2_hostname.report.csv -o /reports/clariion2_hostname.report.html

./csv2htm.pl -e -T -i /reports/clariion3_hostname.report.csv -o /reports/clariion3_hostname.report.html

Here is the script for VNX series arrays:

for san in `/bin/cat /reports/tiers2/san2.list`
do
naviseccli -h $san lun -list -tiers |grep LUN |awk '{print $2}' > $san.out
   for lun in `cat $san.out`
     do
     sleep 2
     echo $san.Generating-LUN-List
     naviseccli -NoPoll -h $san lun -list -name $lun -tiers > $lun.$san.dat &
  done
  # Wait for the background naviseccli jobs to finish before the .dat files are read
  wait

mv $san.report.csv $san.report.`date +%j`.csv
echo "LUN Name","FLASH","FC","SATA" > $san.report.csv
   for lun in `cat  $san.out`
      do
      echo $lun
      echo `grep Name $lun.$san.dat |awk '{print $2}'`","`grep -i extreme $lun.$san.dat |awk '{print $3}'`","`grep -i Performance $lun.$san.dat |grep -v Extreme|awk '{print $2}'`","`grep -i Capacity $lun.$san.dat |awk '{print $2}'` >> $san.report.csv
   done
 done

./csv2htm.pl -e -T -i /reports/VNX1_hostname.report.csv -o /reports/VNX1_hostname.report.html

./csv2htm.pl -e -T -i /reports/VNX2_hostname.report.csv -o /reports/VNX2_hostname.report.html

./csv2htm.pl -e -T -i /reports/VNX3_hostname.report.csv -o /reports/VNX3_hostname.report.html
 Here is a more detailed explanation of the script.

Section 1:

The entire script runs in a loop based on the SAN hostname entries.   We’ll use this list in the next section to get the LUN information from each SAN that needs to be monitored.

for san in `/bin/cat /reports/tiers/san.list`

do

naviseccli -h $san lun -list -tiers |grep LUN |awk '{print $2}' > $san.out
 Section 2:

This section will run the naviseccli command for every LUN in each of the <san_hostname>.out files, and output a separate text file with the tier distribution for each LUN.  If you have 500 LUNs, then 500 text files will be created in the same directory that you run the script in.

     for lun in `cat $san.out`
        do
        sleep 2
        echo $san
        naviseccli -h $san -np lun -list -name $lun -tiers > $lun.$san.dat &
     done
     # Wait for the background naviseccli jobs to finish before the .dat files are read
     wait
Each file will be named <lun_name>.<san_hostname>.dat, and the contents of each file look like this:
LOGICAL UNIT NUMBER 962
Name:  P1_LUN962_0000_SPB_servername_filesystem_350G
Tier Distribution: 
Flash:  4.74%
FC:  95.26%
 Section 3:

This line renames the previous day’s output file for archiving purposes.  The %j adds the day of the year (001-366, often called the Julian date) to the file name, so each archived file is automatically overwritten a year later.  It’s a self-cleaning archive directory.  🙂

mv $san.report.csv $san.report.`date +%j`.csv
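The mv above prints an error on the very first run, when there is no previous report to rename. A small variation, sketched here in a scratch directory with a sample hostname, checks for the file first. The existence test is my addition and not part of the original script.

```shell
# Scratch directory and sample report standing in for the real ones.
archive_dir=$(mktemp -d)
san="clariion1_hostname"
echo "LUN Name,FLASH,FC,SATA" > "$archive_dir/$san.report.csv"

# Day of the year as three digits, e.g. 001 for January 1st.
day_of_year=$(date +%j)

# Only rename when a previous report actually exists.
if [ -f "$archive_dir/$san.report.csv" ]; then
    mv "$archive_dir/$san.report.csv" "$archive_dir/$san.report.$day_of_year.csv"
fi

ls "$archive_dir"
```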

Section 4:

This section then processes each individual LUN file pulling out only the tier information that we need, and then combines the list into one large output file in csv format.

The first line creates a blank CSV file with the appropriate column headers.

echo "LUN Name","FLASH","FC","SATA" > $san.report.csv

This block of code parses each individual LUN file, doing a grep for each column item that we need added to the report, and then using awk to grab only the specific text that we want from that line.  For example, if the LUN output file has “Flash:  4.74%” in one line, and we only want the “4.74%” with the word “Flash:” stripped off, we would do an awk ‘{print $2}’ to grab only the second whitespace-separated field.
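To see the field selection in isolation, the sketch below builds one CSV row from the sample LUN file shown earlier. It is a variation rather than the script itself: the matching and the field selection are both done in awk, and the temporary file stands in for a real .dat file. The sample has no SATA line, so the last column comes out empty.

```shell
# Sample LUN file, copied from the example output in Section 2.
dat_file=$(mktemp)
cat > "$dat_file" <<'EOF'
LOGICAL UNIT NUMBER 962
Name:  P1_LUN962_0000_SPB_servername_filesystem_350G
Tier Distribution:
Flash:  4.74%
FC:  95.26%
EOF

# $1 is the label (for example "Flash:") and $2 is the value after it.
lun_name=$(awk '$1 == "Name:"  {print $2}' "$dat_file")
flash_pct=$(awk '$1 == "Flash:" {print $2}' "$dat_file")
fc_pct=$(awk '$1 == "FC:"    {print $2}' "$dat_file")
sata_pct=$(awk '$1 == "SATA:"  {print $2}' "$dat_file")

csv_row="$lun_name,$flash_pct,$fc_pct,$sata_pct"
echo "$csv_row"
```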

     for lun in `cat  $san.out`
        do
        echo $lun
        echo `grep Name $lun.$san.dat |awk '{print $2}'`","`grep -i flash $lun.$san.dat |awk '{print $2}'`","`grep -i fc $lun.$san.dat |awk '{print $2}'`","`grep -i sata $lun.$san.dat |awk '{print $2}'` >> $san.report.csv
     done
done
 Once every LUN file has been processed and added to the report, I run the csv2html.pl perl script (from http://www.jpsdomain.org/source/perl.html) to add to our intranet website.  The csv files are also added as download links on the site.
./csv2htm.pl -e -T -i /reports/clariion1_hostname.report.csv -o /reports/clariion1_hostname.report.html

./csv2htm.pl -e -T -i /reports/clariion2_hostname.report.csv -o /reports/clariion2_hostname.report.html

./csv2htm.pl -e -T -i /reports/clariion3_hostname.report.csv -o /reports/clariion3_hostname.report.html
 And finally, the output looks like this:
LUN Name                                        FLASH    FC       SATA
P0_LUN101_0000_SPA_servername_filesystem_100G   24.32%   67.57%   8.11%
P0_LUN102_0000_SPA_servername_filesystem_100G   5.92%    58.77%   35.31%
P1_LUN103_0000_SPA_servername_filesystem_100G   7.00%    81.79%   11.20%
P1_LUN104_0000_SPA_servername_filesystem_100G   1.40%    77.20%   21.40%
P0_LUN200_0000_SPA_servername_filesystem_100G   5.77%    75.06%   19.17%
P0_LUN201_0000_SPA_servername_filesystem_100G   6.44%    71.21%   22.35%
P0_LUN202_0000_SPA_servername_filesystem_100G   4.55%    90.91%   4.55%
P0_LUN203_0000_SPA_servername_filesystem_100G   10.73%   80.76%   8.52%
P0_LUN204_0000_SPA_servername_filesystem_100G   8.62%    88.31%   3.08%
P0_LUN205_0000_SPA_servername_filesystem_100G   10.88%   82.65%   6.46%
P0_LUN206_0000_SPA_servername_filesystem_100G   7.00%    81.79%   11.20%
P0_LUN207_0000_SPA_servername_filesystem_100G   1.40%    77.20%   21.40%
P0_LUN208_0000_SPA_servername_filesystem_100G   5.77%    75.06%   19.17%

Reporting on Trespassed LUNs

 

All of our production clariions are configured with two large tiered storage pools, one for LUNs on SPA and one for LUNs on SPB.  When storage is created on a server, two identical LUNs are created (one in each pool) and are striped at the host level.  I do it that way to more evenly balance the load on the storage processors.

I’ve noticed that LUNs will occasionally trespass to the other SP.  In order to keep the SPs balanced how I want them, I will routinely check and trespass them back to their default owner.  Our naming convention for LUNs includes the SP that the LUN was initially configured to use, as well as the pool ID, server name, filesystem/drive letter, last four digits of serial number, and size.  Having all of this information in the LUN name makes for very easy reporting.  Having the default SP in the LUN name is required for this script to work as written.

Here’s what our LUN names look like:     P1_LUN100_SPA_0000_servername_filesystem_150G

To quickly check on the status of any mismatched LUNs every morning, I created a script that generates a daily report.  The script first creates output files that list all of the LUNs on each SP, then uses simple grep commands to output only the LUNs whose SP designation in the name does not match the current owner.   The csv output files are then parsed by the csv2html perl script, which converts the csv into easy to read HTML files that are automatically posted on our intranet web site.  The csv2html perl script is from http://www.jpsdomain.org/source/perl.html and is under a GNU General Public License.  Note that this script is designed to run in unix.  It can be run using cygwin installed on a Windows server if you don’t have access to a unix based server.

Here’s the shell script (I have one for each clariion/VNX):

naviseccli -h clariion_hostname getlun -name -owner |grep -i name > /reports/sp/lunname.out

sleep 5

naviseccli -h clariion_hostname getlun -name -owner |grep -i current >  /reports/sp/currentsp.out

sleep 5

paste -d , /reports/sp/lunname.out /reports/sp/currentsp.out >  /reports/sp/clariion_hostname.spowner.csv

./csv2htm.pl -e -T -i /reports/sp/clariion_hostname.spowner.csv -o /reports/sp/clariion_hostname.spowner.html

#Determine SP mismatches between LUNs and SPs, output to separate files

cat /reports/sp/clariion_hostname.spowner.csv | grep 'SP B' > /reports/sp/clariion_hostname_spb.csv

grep SPA /reports/sp/clariion_hostname_spb.csv > /reports/sp/clariion_hostname_spb_mismatch.csv

cat /reports/sp/clariion_hostname.spowner.csv | grep 'SP A' > /reports/sp/clariion_hostname_spa.csv

grep SPB /reports/sp/clariion_hostname_spa.csv > /reports/sp/clariion_hostname_spa_mismatch.csv

#Convert csv output files to HTML for intranet site

./csv2htm.pl -e -d -T -i /reports/sp/clariion_hostname_spa_mismatch.csv -o /reports/sp/clariion_hostname_spa_mismatch.html

./csv2htm.pl -e -d -T -i /reports/sp/clariion_hostname_spb_mismatch.csv -o /reports/sp/clariion_hostname_spb_mismatch.html
The output files look like this (clariion_hostname_spb_mismatch.html from the script, which lists LUNs named for SPA that are currently owned by SP B):
Name: P1_LUN100_SPA_0000_servername_filesystem1_150G      Current Owner: SPB
Name: P1_LUN101_SPA_0000_servername_filesystem2_250G      Current Owner: SPB
Name: P1_LUN102_SPA_0000_servername_filesystem3_350G      Current Owner: SPB
Name: P1_LUN103_SPA_0000_servername_filesystem4_450G      Current Owner: SPB
Name: P1_LUN104_SPA_0000_servername_filesystem5_550G      Current Owner: SPB
 The 0000 represents the last four digits of the serial number of the Clariion.
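The two-step grep approach could also be collapsed into a single pass that compares the SP in the name with the current owner. The sketch below is an alternative I have not run against a live array. The sample rows are made up, and their layout ("Name: ...,Current owner: SP B") is an assumption about what the pasted spowner.csv looks like.

```shell
# Hypothetical sample of the combined name/owner CSV.
spowner_csv=$(mktemp)
cat > "$spowner_csv" <<'EOF'
Name: P1_LUN100_SPA_0000_servername_filesystem1_150G,Current owner: SP B
Name: P1_LUN101_SPB_0000_servername_filesystem2_250G,Current owner: SP B
Name: P1_LUN102_SPB_0000_servername_filesystem3_350G,Current owner: SP A
Name: P1_LUN103_SPA_0000_servername_filesystem4_450G,Current owner: SP A
EOF

# Print only the rows where the SP letter in the name differs from the owner.
mismatches=$(awk -F',' '
    {
        default_sp = ""
        # _SPA_ or _SPB_ in the LUN name marks the default owner
        if (match($1, /_SP[AB]_/)) default_sp = substr($1, RSTART + 3, 1)
        # the last character of the owner column is the current SP letter
        current_sp = substr($2, length($2), 1)
        if (default_sp != "" && default_sp != current_sp) print
    }
' "$spowner_csv")

echo "$mismatches"
```

With the sample rows, only LUN100 and LUN102 are reported, since those are the two whose names disagree with their current owner.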

That’s it, a quick and easy way to report on trespassed LUNs in our environment.

Celerra replication monitoring script

This script allows me to quickly monitor and verify the status of my replication jobs every morning.  It will generate a csv file with six columns for file system name, interconnect, estimated completion time, current transfer size, current transfer size remaining, and current write speed.

I recently added two more remote offices to our replication topology, and I like to keep daily tabs on how much longer they need to complete the initial seeding.  The report also alerts me to any other jobs that are running too long and might need my attention.

Step 1:

Log in to your Celerra and create a directory for the script.  I created a subdirectory called “scripts” under /home/nasadmin.

Create a text file named ‘replfs.list’ that contains a list of your replicated file systems.  You can cut and paste the list out of Unisphere.

The contents of the file should look something like this:

Filesystem01
Filesystem02
Filesystem03
Filesystem04
Filesystem05
 Step 2:

Copy and paste all of the code into a text editor and modify it for your needs (the complete code is at the bottom of this post).  I’ll go through each section here with an explanation.

1: The first section will create a text file ($fs.dat) for each filesystem in the replfs.list file you made earlier.

for fs in `cat replfs.list`
         do
         nas_replicate -info $fs | egrep 'Celerra|Name|Current|Estimated' > $fs.dat
         done
 The output will look like this:
Name                                        = Filesystem_01
Source Current Data Port            = 57471
Current Transfer Size (KB)          = 232173216
Current Transfer Remain (KB)     = 230877216
Estimated Completion Time        = Thu Nov 24 06:06:07 EST 2011
Current Transfer is Full Copy      = Yes
Current Transfer Rate (KB/s)       = 160
Current Read Rate (KB/s)           = 774
Current Write Rate (KB/s)           = 3120
 2: The second section will create a blank csv file with the appropriate column headers:
echo 'Name,System,Estimated Completion Time,Current Transfer Size (KB),Current Transfer Remain (KB),Write Speed (KB)' > replreport.csv

3: The third section will parse all of the output files created by the first section, pulling out only the data that we’re interested in.  It places it in columns in the csv file.

         for fs in `cat replfs.list`

         do

         echo $fs","`grep Celerra $fs.dat | awk '{print $5}'`","`grep -i Estimated $fs.dat |awk '{print $5,$6,$7,$8,$9,$10}'`","`grep -i Size $fs.dat |awk '{print $6}'`","`grep -i Remain $fs.dat |awk '{print $6}'`","`grep -i Write $fs.dat |awk '{print $6}'` >> replreport.csv

        done
 If you’re not familiar with awk, I’ll give a brief explanation here.  When you grep for a certain line in the output code, awk will allow you to output only one word in the line.

For example, if you want the output of “Yes” put into a column in the csv file, but the output code line looks like “Current Transfer is Full Copy      = Yes”, then you could pull out only the “Yes” by typing in the following:

 nas_replicate -info Filesystem01 | grep  Full | awk '{print $7}'

Because the word ‘Yes’ is the 7th item in the line, the output would only contain the word Yes.
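You can try the field selection without touching the Celerra by feeding awk the sample line directly. This sketch uses the line from the example above; the $NF variant is an extra I am adding, which picks the last field regardless of how many words precede it.

```shell
sample_line='Current Transfer is Full Copy      = Yes'

# Field 7 is the value after the equals sign on this particular line.
full_copy=$(echo "$sample_line" | awk '{print $7}')

# $NF always refers to the last field on the line.
last_field=$(echo "$sample_line" | awk '{print $NF}')

echo "field 7: $full_copy"
echo "last field: $last_field"
```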

4: The final section will send an email with the csv output file attached.

uuencode replreport.csv replreport.csv | mail -s "Replication Status Report" user@domain.com

Step 3:

Copy and paste the modified code into a script file and save it.  I have mine saved in the /home/nasadmin/scripts folder. Once the file is created, make it executable by typing in chmod +x scriptfile.sh, or set the full permissions in one step with chmod 755 scriptfile.sh.

Step 4:

You can now add the file to crontab to run automatically.  Add it to cron by typing crontab -e; to view your crontab entries, type crontab -l.  For details on how to add cron entries, do a Google search, as there is a wealth of info available on your options.

Script Code:

for fs in `cat replfs.list`

         do

         nas_replicate -info $fs | egrep 'Celerra|Name|Current|Estimated' > $fs.dat

        done

 echo 'Name,System,Estimated Completion Time,Current Transfer Size (KB),Current Transfer Remain (KB),Write Speed (KB)' > replreport.csv

         for fs in `cat replfs.list`

         do

         echo $fs","`grep Celerra $fs.dat | awk '{print $5}'`","`grep -i Estimated $fs.dat |awk '{print $5,$6,$7,$8,$9,$10}'`","`grep -i Size $fs.dat |awk '{print $6}'`","`grep -i Remain $fs.dat |awk '{print $6}'`","`grep -i Write $fs.dat |awk '{print $6}'` >> replreport.csv

         done

 uuencode replreport.csv replreport.csv | mail -s "Replication Status Report" user@domain.com
 The final output of the script generates a report that looks like the sample below.  Filesystems that have all zeros and no estimated completion time are caught up and not currently performing a data synchronization.
Name               System      Estimated Completion Time     Current Transfer Size (KB)  Current Transfer Remain (KB)  Write Speed (KB)
SA2Users_03        SA2VNX5500                                0                           0                             0
SA2Users_02        SA2VNX5500  Wed Dec 16 01:16:04 EST 2011  211708152                   41788152                      2982
SA2Users_01        SA2VNX5500  Wed Dec 16 18:53:32 EST 2011  229431488                   59655488                      3425
SA2CommonFiles_04  SA2VNX5500                                0                           0                             0
SA2CommonFiles_03  SA2VNX5500  Wed Dec 16 10:35:06 EST 2011  232173216                   53853216                      3105
SA2CommonFiles_02  SA2VNX5500  Mon Dec 14 15:46:33 EST 2011  56343592                    12807592                      2365
SA2commonFiles_01  SA2VNX5500                                0                           0                             0
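The size and remaining columns can also be turned into a percent-complete figure, which is easier to read than two large KB values. The helper below is a hypothetical addition, not part of the script above. It takes the transfer size and the remaining amount in KB, and returns "n/a" for filesystems that are caught up and show zeros.

```shell
# Hypothetical helper: percent complete from total KB and remaining KB.
percent_complete() {
    awk -v size="$1" -v remain="$2" 'BEGIN {
        # A size of zero means no transfer is in progress.
        if (size <= 0) { print "n/a"; exit }
        printf "%.1f\n", (size - remain) * 100 / size
    }'
}

# Values taken from the SA2Users_02 and SA2Users_03 rows of the sample report.
percent_complete 211708152 41788152   # prints 80.3
percent_complete 0 0                  # prints n/a
```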

Auto generating daily performance graphs with EMC Control Center / Performance Manager

This document describes the process I used to pull performance data using the ECC pmcli command line tool, parse the data to make it more usable with a graphing tool, and then use perl scripts to automatically generate graphs.

You must install Perl.  I use ActiveState Perl (Free Community Edition) (http://www.activestate.com/activeperl/downloads).

You must install Cygwin.  Link: http://www.cygwin.com/install.html. I generally choose all packages.

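I use the following Perl modules, which are the ones loaded at the top of the graphing script in Step 3:

  • Text::ParseWords
  • GD::Graph (GD::Graph::lines)
  • Data::Dumper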

Step 1:

Once you have the software set up, the first step is to use the ECC command line utility to extract the interval performance data that you’re interested in graphing.  Below is a sample PMCLI command line script that could be used for this purpose.

:Get the current date

For /f "tokens=2-4 delims=/" %%a in ('date /t') do (set date=%%c%%a%%b)

:Export the interval file for today’s date.

D:\ECC\Client.610\PerformanceManager\pmcli.exe -export -out D:\archive\interval.csv -type interval -class clariion -date %date% -id APM00324532111

:Copy all the export data to my cygwin home directory for processing later.

copy /y e:\san712_interval.csv C:\cygwin\home\<userid>

You can schedule the command script above to run using windows task scheduler.  I run it at 11:46PM every night, as data is collected on our SAN in 15 minute intervals, and that gives me a file that reports all the way up to the end of one calendar day.

Note that there are 95 data collection points from 00:15 to 23:45 every day if you collect data at 15 minute intervals.  The storage processor data resides in the last two lines of the output file.
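The count of 95 follows from simple arithmetic on the first and last collection times, sketched here with shell integer math:

```shell
first_minute=$((0 * 60 + 15))    # 00:15 expressed as minutes after midnight
last_minute=$((23 * 60 + 45))    # 23:45 expressed as minutes after midnight
interval=15

# Number of intervals between first and last, plus one for the first point.
points=$(( (last_minute - first_minute) / interval + 1 ))
echo "$points"
```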

Here is what the output file looks like:

EMC ControlCenter Performance manager generated file from: <path>

Data Collected for DiskStats

Data Collected for DiskStats – 0_0_0

                                         3/28/11 00:15   3/28/11 00:30   3/28/11 00:45   3/28/11 01:00
Number of Arrivals with Non Zero Queue   12              20              23              23
% Utilization                            30.2            33.3            40.4            60.3
Response Time                            1.8             3.3             5.4             7.8
Read Throughput IO per sec               80.6            13.33           90.4            10.3

Great information in there, but the format makes it very hard to do anything meaningful with the data in an Excel chart.  If I want to chart only % utilization, that data is difficult to isolate because it is surrounded by so many other counters that are also being collected.  My next goal was to write a script that reformats the data into a much more usable layout and automatically creates a graph for one specific counter that I’m interested in (like daily utilization numbers), which could then be emailed daily or auto-uploaded to an internal website.

Step 2:

Once the PMCLI data is exported, the next step is to use cygwin bash scripts to parse the csv file and pull out only the performance data that is needed.  Each SAN will need a separate script for each type of performance data.  I have four scripts configured to run based on the data that I want to monitor.  The scripts are located in my cygwin home directory.

The scripts I use:

  • iostats.sh (for total IO throughput)
  • queuestats.sh (for disk queue length)
  • resptime.sh (for disk response time in ms)
  • utilstats.sh (for % utilization)

Here is a sample shell script for parsing the CSV export file (iostats.sh):

#!/usr/bin/bash

#This will pull only the timestamp line from the top of the CSV output file. I’ll paste it back in later.

grep -m 1 "/" interval.csv > timestamp.csv

#This will pull out only lines that begin with “total throughput io per sec”.

grep -i "^Total Throughput IO per sec" interval.csv >> stats.csv

#This will pull out the disk/LUN title info for the first column.  I’ll add this back in later.

grep -i "Data Collected for DiskStats -" interval.csv > diskstats.csv

grep -i "Data Collected for LUNStats -" interval.csv > lunstats.csv

#This will create a column with the disk/LUN number .  I’ll paste it into the first column later.

cat diskstats.csv lunstats.csv > data.csv

#This adds the first column (disk/LUN) and combines it with the actual performance data columns.

paste data.csv stats.csv > combined.csv

#This combines the timestamp header at the top with the combined file from the previous step to create the final file we’ll use for the graph.  There is also a step to append the current date and copy the csv file to an archive directory.

cat timestamp.csv combined.csv > iostats.csv

cp iostats.csv /cygdrive/e/SAN/csv_archive/iostats_archive_$(date +%y%m%d).csv

#  This removes all the temporary files created earlier in the script.  They’re no longer needed.

rm timestamp.csv

rm stats.csv

rm diskstats.csv

rm lunstats.csv

rm data.csv

rm combined.csv

#This strips the last two lines of the CSV (Storage Processor data).  The resulting file is used for the “all disks” spreadsheet.  We don’t want the SP data to skew the graph.  This CSV file is also copied to the archive directory.

sed '$d' < iostats.csv > iostats2.csv

sed '$d' < iostats2.csv > iostats_disk.csv

rm iostats2.csv

cp iostats_disk.csv /cygdrive/e/SAN/csv_archive/iostats_disk_archive_$(date +%y%m%d).csv
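The last-two-lines trim can be tried on a throwaway file to see exactly what it removes. In this sketch the five sample rows are made up to stand in for a timestamp header, three disks, and the two storage processor rows. The two sed passes are chained in a pipe instead of going through an intermediate file.

```shell
# Made-up sample: header, three disk rows, two storage processor rows.
sample_csv=$(mktemp)
printf '%s\n' 'timestamps' 'disk 0_0_0' 'disk 0_0_1' 'disk 0_0_2' 'SP A' 'SP B' > "$sample_csv"

# Each sed '$d' deletes the final line of its input, so two passes drop two lines.
trimmed=$(sed '$d' "$sample_csv" | sed '$d')

echo "$trimmed"
```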

Note: The shell script above can be run in the windows task scheduler as long as you have cygwin installed.  Here’s the syntax:

c:\cygwin\bin\bash.exe -l -c "/home/<username>/iostats.sh"

After running the shell script above, the resulting CSV file contains only Total Throughput (IO per sec) data for each disk and lun.  It will contain data from 00:15 to 23:45 in 15 minute increments.  After the cygwin scripts have run we will have csv datasets that are ready to be exported to a graph.

The Disk and LUN stats are combined into the same CSV file.  It is entirely possible to rewrite the script to only have one or the other.  I put them both in there to make it easier to manually create a graph in excel for either disk or lun stats at a later time (if necessary).  The “all disks graph” does not look any different with both disk and lun stats in there, I tried it both ways and they overlap in a way that makes the extra data indistinguishable in the image.

The resulting data output after running the iostats.sh script is shown below.  I now have a nice, neat excel spreadsheet that lists the total throughput for each disk in the array for the entire day in 15 minute increments.   Having the data formatted in this way makes it super easy to create charts.  But I don’t want to have to do that manually every day, I want the charts to be created automatically.

                                      3/28/11 00:15   3/28/11 00:30   3/28/11 00:45   3/28/11 01:00
Total Throughput IO per sec - 0_0_0   12              20              23              23
Total Throughput IO per sec - 0_0_1   30.12           33.23           40.4            60.23
Total Throughput IO per sec - 0_0_2   1.82            3.3             5.4             7.8
Total Throughput IO per sec - 0_0_3   80.62           13.33           90.4            10.3

Step 3:

Now I want to automatically create the graphs every day using a Perl script.  After the CSV files are exported to a more usable format from the previous step, I use the GD::Graph library from CPAN (http://search.cpan.org/~mverb/GDGraph-1.43/Graph.pm) to auto-generate the graphs.

Below is a sample Perl script that will autogenerate a great looking graph based on the CSV output file from the previous step.

#!/usr/bin/perl

#Declare the libraries that will be used.

use strict;

use Text::ParseWords;

use GD::Graph::lines;

use Data::Dumper;

#Specify the csv file that will be used to create the graph

my $file = 'C:\cygwin\home\<username>\iostats_disk.csv';

#my $file  = $ARGV[0];

my ($output_file) = ($file =~/(.*)\./);

#Create the arrays for the data and the legends

my @data;

my @legends;

#parse csv, generate an error if it fails

open(my $fh, '<', $file) or die "Can't read csv file '$file' [$!]\n";

my $countlines = 0;

while (my $line = <$fh>) {

chomp $line;

my @fields = Text::ParseWords::parse_line(',', 0, $line);

#There are 95 fields generated to correspond to the 95 data collection points in each of the output files.

my @field = @fields[1..95];
push @data, \@field;

if($countlines >= 1){

push @legends, @fields[0];

}

$countlines++;

}

#The splices below keep the header row plus the last 820 lines of the CSV file.  This number will change based on the number of disks in the SAN, and will be different depending on the SAN being reported on.  The legend info is read from the first column of the spreadsheet and creates a color box that corresponds to each graph line.  For the purpose of this graph, I won't be using it because 820+ legend entries look like a mess on the screen.
splice @data, 1, -820;
splice @legends, 0, -820;

#Set Graphing Options
my $mygraph = GD::Graph::lines->new(1024, 768);

# There are many graph options that can be changed using the GD::Graph library.  Check the website (and google) for lots of examples.
$mygraph->set(
    title => 'SP IO Utilization (00:15 - 23:45)',
    y_label => 'IOs Per Second',
    y_tick_number => 4,
    values_vertical => 6,
    show_values => 0,
    x_label_skip => 3,
) or warn $mygraph->error;

#As I said earlier, because of the large number of legend entries for this type of graph, I change the legend to simply read "All Disks".  If you want the legend to actually put the correct entries and colors, use this line instead:  $mygraph->set_legend(@legends);
$mygraph->set_legend('All Disks');

#Plot the data
my $myimage = $mygraph->plot(\@data) or die $mygraph->error;

# Export the graph as a gif image (or whichever format export_format reports).  The images are currently moved to the IIS folder (c:\inetpub\wwwroot) with one of the scripts.  They could also be emailed using a sendmail utility.
my $format = $mygraph->export_format;
open(my $img, '>', "$output_file.$format") or die $!;
binmode $img;
#Call the export method that matches the file extension
print $img $myimage->$format();
close $img;

After this script runs, the resulting image file will be saved in the cygwin home directory (it is saved in the same directory as the CSV file).  One of the nightly scripts I run will copy the image to our internal IIS server’s image directory, and sendmail will email the graph to the SAN Admin team.
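The nightly copy step can be sketched roughly as follows. This is only a minimal illustration, not the actual nightly script: publish_graph is a hypothetical helper name, and the destination folder is an assumption based on the c:\inetpub\wwwroot path mentioned in the script comments (reachable from Cygwin as /cygdrive/c/inetpub/wwwroot with the default drive mapping).

```shell
#!/bin/bash
# publish_graph (hypothetical helper): copy a graph image into a web folder,
# but only if the image exists and is not empty, so a failed Perl run does
# not overwrite yesterday's graph with a zero-byte file.
publish_graph() {
  local src="$1" dest_dir="$2"
  if [ ! -s "$src" ]; then
    echo "publish_graph: '$src' is missing or empty, skipping" >&2
    return 1
  fi
  cp -f -- "$src" "$dest_dir/"
}
```

From cron it might be called as publish_graph "$HOME/iostats_disk.gif" /cygdrive/c/inetpub/wwwroot, with the file name adjusted to whatever extension the Perl script actually wrote.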

That’s it!  You now have lots of pretty graphs with which you can impress your management team. 🙂

Here is a sample graph that was generated with the Perl script:

Tiering reports for EMC’s FAST VP

Note: In a separate blog post, I shared a script to generate a report of the tiering status of all LUNs.

One of the items that EMC did not implement along with FAST VP is the ability to run a canned report on how your LUNs are being allocated among the different tiers of storage.  While there is no canned report, alas, it is possible to get this information from the CLI.

The naviseccli -h {SP IP or hostname} lun -list -tiers command fits the bill. It shows how a specific LUN is distributed across the different drive types.  I still need to come up with a script to pull out only the information that I want, but the info is definitely in the command’s output.

Here’s the sample output:

LOGICAL UNIT NUMBER 6
 Name:  LUN 6
 Tier Distribution:
 Flash:  13.83%
 FC:  86.17%
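As a rough idea of how that output could be condensed, here is a minimal awk sketch that prints one line per LUN. It assumes the layout shown in the sample above (a "LOGICAL UNIT NUMBER" line, then a "Tier Distribution:" line followed by single-word tier names with percentages); tier_summary is a hypothetical function name, not something naviseccli provides.

```shell
#!/bin/bash
# tier_summary (hypothetical helper): read "lun -list -tiers" style text on
# stdin and print "<lun>,<tier>=<pct>,..." for each LUN.
tier_summary() {
  awk '
    /^LOGICAL UNIT NUMBER/ {
      if (lun != "") print lun tiers      # flush the previous LUN
      lun = $4; tiers = ""; in_tiers = 0
      next
    }
    /Tier Distribution:/ { in_tiers = 1; next }
    # Inside the tier block: lines such as " Flash:  13.83%"
    in_tiers && /^[ \t]*[A-Za-z]+:[ \t]+[0-9.]+%/ {
      name = $1; sub(/:$/, "", name)
      tiers = tiers "," name "=" $2
      next
    }
    { in_tiers = 0 }                      # any other line ends the tier block
    END { if (lun != "") print lun tiers }
  '
}
```

It would be fed from the command itself, for example naviseccli -h {SP IP or hostname} lun -list -tiers | tier_summary, and for the sample LUN above it prints 6,Flash=13.83%,FC=86.17%.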

The storagepool report gives some good info as well.  Here’s an excerpt of what you see with the naviseccli -h {SP IP or hostname} storagepool -list -tiers command:

SPA

Tier Name:  Flash
 Raid Type:  r_5
 User Capacity (GBs):  1096.07
 Consumed Capacity (GBs):  987.06
 Available Capacity (GBs):  109.01
 Percent Subscribed:  90.05%
 Data Targeted for Higher Tier (GBs):  0.00
 Data Targeted for Lower Tier (GBs):  11.00

Tier Name:  FC
 Raid Type:  r_5
 User Capacity (GBs):  28981.77
 Consumed Capacity (GBs):  10592.65
 Available Capacity (GBs):  18389.12
 Percent Subscribed:  36.55%

Tier Name:  SATA
 Raid Type:  r_5
 User Capacity (GBs):  11004.67
 Consumed Capacity (GBs):  260.02
 Available Capacity (GBs):  10744.66
 Percent Subscribed:  2.36%
 Data Targeted for Higher Tier (GBs):  3.00
 Data Targeted for Lower Tier (GBs):  0.00
 Disks (Type):

SPB

Tier Name:  Flash
 Raid Type:  r_5
 User Capacity (GBs):  1096.07
 Consumed Capacity (GBs):  987.06
 Available Capacity (GBs):  109.01
 Percent Subscribed:  90.05%
 Data Targeted for Higher Tier (GBs):  0.00
 Data Targeted for Lower Tier (GBs):  25.00

Tier Name:  FC
 Raid Type:  r_5
 User Capacity (GBs):  28981.77
 Consumed Capacity (GBs):  10013.61
 Available Capacity (GBs):  18968.16
 Percent Subscribed:  34.55%
 Data Targeted for Higher Tier (GBs):  25.00
 Data Targeted for Lower Tier (GBs):  0.00

Tier Name:  SATA
 Raid Type:  r_5
 User Capacity (GBs):  11004.67
 Consumed Capacity (GBs):  341.02
 Available Capacity (GBs):  10663.65
 Percent Subscribed:  3.10%
 Data Targeted for Higher Tier (GBs):  20.00
 Data Targeted for Lower Tier (GBs):  0.00

Good stuff in there.   It’s on my to-do list to run these commands periodically, and then parse the output to filter out only what I want to see.  Once I get that done I’ll post the script here too.
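In the meantime, the filtering for the storagepool excerpt could look something like the sketch below. It is only an illustration under the assumption that the output keeps the layout of the excerpt above (an SPA or SPB marker line, then "Tier Name:" blocks with the same field labels); pool_tier_summary is a hypothetical name and the CSV column headings are my own.

```shell
#!/bin/bash
# pool_tier_summary (hypothetical helper): read "storagepool -list -tiers"
# style text on stdin and print one CSV row per SP and tier.
pool_tier_summary() {
  awk '
    function emit() {
      if (tier != "") print sp "," tier "," user "," used "," pct
      tier = ""
    }
    BEGIN { FS = ":[ \t]*"; print "SP,Tier,UserGB,ConsumedGB,Subscribed" }
    { gsub(/^[ \t]+|[ \t]+$/, "") }       # trim the line; awk re-splits the fields
    /^SP[AB]$/    { emit(); sp = $0; next }
    /^Tier Name:/ { emit(); tier = $2; user = used = pct = ""; next }
    /^User Capacity \(GBs\):/     { user = $2 }
    /^Consumed Capacity \(GBs\):/ { used = $2 }
    /^Percent Subscribed:/        { pct  = $2 }
    END { emit() }
  '
}
```

Piping the command output through it (naviseccli -h {SP IP or hostname} storagepool -list -tiers | pool_tier_summary) gives rows that can be redirected to a CSV file; the "Data Targeted" lines could be captured the same way with two more patterns.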

Note: I did create and post a script to generate a report of the tiering status of all LUNs.