Tag Archives: bash

Reporting on Celerra / VNX NAS Pool capacity with a bash script

I recently created a script that I run on all of our celerras and VNX’s that reports on NAS pool size.   The output from each array is then converted to HTML and combined on a single intranet page to provide a quick at-a-glance view of our global NAS capacity and disk space consumption.  I made another post that shows how to create a block storage pool report as well:  http://emcsan.wordpress.com/2013/08/09/reporting-on-celerravnx-block-storage-pool-capacity-with-a-bash-script/

The default command unfortunately outputs in Megabytes with no option to change to GB or TB.  This script performs the MB to GB conversion and adds a comma as the numerical separator (what we use in the USA) to make the output much more readable.

First, identify the ID number for each of your NAS pools.  You’ll need to insert the ID numbers into the script itself.

[nasadmin@celerra]$ nas_pool -list
 id      inuse   acl     name                      storage system
 10      y       0       NAS_Pool0_SPA             AKM00111000000
 18      y       0       NAS_Pool1_SPB             AKM00111000000
Note that the default output of the command that provides the size of each pool is in a very hard to read format.  I wanted to clean it up to make it easier to read on our reporting page.  Here’s the default output:
[nasadmin@celerra]$ nas_pool -size -all
id           = 10
name         = NAS_Pool0_SPA
used_mb      = 3437536
avail_mb     = 658459
total_mb     = 4095995
potential_mb = 0
id           = 18
name         = NAS_Pool1_SPB
used_mb      = 2697600
avail_mb     = 374396
total_mb     = 3071996
potential_mb = 1023998
 My script changes the output to look like the example below.
Name (Site)   ; Total GB ; Used GB  ; Avail GB
 NAS_Pool0_SPA ; 4,000    ; 3,356.97 ; 643.03
 NAS_Pool1_SPB ; 3,000    ; 2,634.38 ; 365.62
 In this example there are two NAS pools and this script is set up to report on both.  It could be easily expanded or reduced depending on the number of pools on your array. The variable names I used include the Pool ID number from the output above, that should be changed to match your ID’s.  You’ll also need to update the ‘id=’ portion of each command to match your Pool ID’s.

Here’s the script:


export NAS_DB

# Set the Locale to English/US, used for adding the comma as a separator in a cron job
export LC_NUMERIC="en_US.UTF-8"


# Gather Pool Name, Used MB, Avaialble MB, and Total MB for First Pool

# Set variable to pull the Name of the pool from the output of 'nas_pool -size'.
name18=`/nas/bin/nas_pool -size id=18 | /bin/grep name | /bin/awk '{print $3}'`

# Set variable to pull the Used MB of the pool from the output of 'nas_pool -size'.
usedmb18=`/nas/bin/nas_pool -size id=18 | /bin/grep used_mb | /bin/awk '{print $3}'`

# Set variable to pull the Available MB of the pool from the output of 'nas_pool -size'.
availmb18=`/nas/bin/nas_pool -size id=18 | /bin/grep avail_mb | /bin/awk '{print $3}'`
# Set variable to pull the Total MB of the pool from the output of 'nas_pool -size'.

totalmb18=`/nas/bin/nas_pool -size id=18 | /bin/grep total_mb | /bin/awk '{print $3}'`

# Convert MB to GB, Add Comma as separator in output

# Remove '...b' variables if you don't want commas as a separator

# Convert Used MB to Used GB
usedgb18=`/bin/echo $usedmb18/1024 | /usr/bin/bc -l | /bin/sed 's/^\./0./;s/0*$//;s/0*$//;s/\.$//'`

# Add comma separator
usedgb18b=`/usr/bin/printf "%'.2f\n" "$usedgb18" | /bin/sed 's/\.00$// ; s/\(\.[1-9]\)0$/\1/'`

# Convert Available MB to Available GB
availgb18=`/bin/echo $availmb18/1024 | /usr/bin/bc -l | /bin/sed 's/^\./0./;s/0*$//;s/0*$//;s/\.$//'`

# Add comma separator
availgb18b=`/usr/bin/printf "%'.2f\n" "$availgb18" | /bin/sed 's/\.00$// ; s/\(\.[1-9]\)0$/\1/'`

# Convert Total MB to Total GB
totalgb18=`/bin/echo $totalmb18/1024 | /usr/bin/bc -l | /bin/sed 's/^\./0./;s/0*$//;s/0*$//;s/\.$//'`

# Add comma separator
totalgb18b=`/usr/bin/printf "%'.2f\n" "$totalgb18" | /bin/sed 's/\.00$// ; s/\(\.[1-9]\)0$/\1/'`

# Gather Pool Name, Used MB, Avaialble MB, and Total MB for Second Pool

# Set variable to pull the Name of the pool from the output of 'nas_pool -size'.
name10=`/nas/bin/nas_pool -size id=10 | /bin/grep name | /bin/awk '{print $3}'`

# Set variable to pull the Used MB of the pool from the output of 'nas_pool -size'.
usedmb10=`/nas/bin/nas_pool -size id=10 | /bin/grep used_mb | /bin/awk '{print $3}'`

# Set variable to pull the Available MB of the pool from the output of 'nas_pool -size'.
availmb10=`/nas/bin/nas_pool -size id=10 | /bin/grep avail_mb | /bin/awk '{print $3}'`

# Set variable to pull the Total MB of the pool from the output of 'nas_pool -size'.
totalmb10=`/nas/bin/nas_pool -size id=10 | /bin/grep total_mb | /bin/awk '{print $3}'`
# Convert MB to GB, Add Comma as separator in output

# Remove '...b' variables if you don't want commas as a separator
# Convert Used MB to Used GB
usedgb10=`/bin/echo $usedmb10/1024 | /usr/bin/bc -l | /bin/sed 's/^\./0./;s/0*$//;s/0*$//;s/\.$//'`

# Add comma separator
usedgb10b=`/usr/bin/printf "%'.2f\n" "$usedgb10" | /bin/sed 's/\.00$// ; s/\(\.[1-9]\)0$/\1/'`

# Convert Available MB to Available GB
availgb10=`/bin/echo $availmb10/1024 | /usr/bin/bc -l | /bin/sed 's/^\./0./;s/0*$//;s/0*$//;s/\.$//'`

# Add comma separator
availgb10b=`/usr/bin/printf "%'.2f\n" "$availgb10" | /bin/sed 's/\.00$// ; s/\(\.[1-9]\)0$/\1/'`

# Convert Total MB to Total GB
totalgb10=`/bin/echo $totalmb10/1024 | /usr/bin/bc -l | /bin/sed 's/^\./0./;s/0*$//;s/0*$//;s/\.$//'`

# Add comma separator
totalgb10b=`/usr/bin/printf "%'.2f\n" "$totalgb10" | /bin/sed 's/\.00$// ; s/\(\.[1-9]\)0$/\1/'`

# Create Output File

# If you don't want the comma separator in the output file, substitute the variable without the 'b' at the end.

# I use the semicolon rather than the comma as a separator due to the fact that I'm using the comma as a numerical separator.

# The comma could be substituted here if desired.

/bin/echo $TODAY > /scripts/NasPool.txt
/bin/echo "Name" ";" "Total GB" ";" "Used GB" ";" "Avail GB" >> /scripts/NasPool.txt
/bin/echo $name18 ";" $totalgb18b ";" $usedgb18b ";" $availgb18b >> /scripts/NasPool.txt
/bin/echo $name10 ";" $totalgb10b ";" $usedgb10b ";" $availgb10b >> /scripts/NasPool.txt
 Here’s what the Output looks like:
Wed Jul 17 23:56:29 JST 2013
 Name (Site) ; Total GB ; Used GB ; Avail GB
 NAS_Pool0_SPA ; 4,000 ; 3,356.97 ; 643.03
 NAS_Pool1_SPB ; 3,000 ; 2,634.38 ; 365.62
 I use a cron job to schedule the report daily and copy it to our internal web server.  I then run the csv2html.pl perl script (from http://www.jpsdomain.org/source/perl.html) to convert it to an HTML output file to add to our intranet report page.

Note that I had to modify the csv2html.pl command to accomodate the use of a semicolon instead of the default comma in a csv file.  Here is the command I use to do the conversion:

./csv2htm.pl -e -T -D “;” -i /reports/NasPool.txt -o /reports/NasPool.html
 Below is what the output looks like after running the HTML conversion tool.



Auto generating daily performance graphs with EMC Control Center / Performance Manager

This document describes the process I used to pull performance data using the ECC pmcli command line tool, parse the data to make it more usable with a graphing tool, and then use perl scripts to automatically generate graphs.

You must install Perl.  I use ActiveState Perl (Free Community Edition) (http://www.activestate.com/activeperl/downloads).

You must install Cygwin.  Link: http://www.cygwin.com/install.html. I generally choose all packages.

I use the follow CPAN Perl modules:

Step 1:

Once you have the software set up, the first step is to use the ECC command line utility to extract the interval performance data that you’re interested in graphing.  Below is a sample PMCLI command line script that could be used for this purpose.

:Get the current date

For /f “tokens=2-4 delims=/” %%a in (‘date /t’) do (set date=%%c%%a%%b)

:Export the interval file for today’s date.

D:\ECC\Client.610\PerformanceManager\pmcli.exe -export -out D:\archive\interval.csv -type interval -class clariion -date %date% -id APM00324532111

:Copy all the export data to my cygwin home directory for processing later.

copy /y e:\san712_interval.csv C:\cygwin\home\<userid>

You can schedule the command script above to run using windows task scheduler.  I run it at 11:46PM every night, as data is collected on our SAN in 15 minute intervals, and that gives me a file that reports all the way up to the end of one calendar day.

Note that there are 95 data collection points from 00:15 to 23:45 every day if you collect data at 15 minute intervals.  The storage processor data resides in the last two lines of the output file.

Here is what the output file looks like:

EMC ControlCenter Performance manager generated file from: <path>

Data Collected for DiskStats

Data Collected for DiskStats – 0_0_0

                                                             3/28/11 00:15       3/28/11 00:30      3/28/11  00:45      3/28/11 01:00 

Number of Arrivals with Non Zero Queue     12                         20                        23                      23 

% Utilization                                                30.2                     33.3                     40.4                  60.3

Response Time                                              1.8                        3.3                        5.4                     7.8

Read Throughput IO per sec                        80.6                    13.33                   90.4                    10.3

Great information in there, but the format of the data makes it very hard to do anything meaningful with the data in an excel chart.  If I want to chart only % utilization, that data is difficult to chart because there are so many counters around it that are also have data collected on them.   My next goal was to write a script to reformat the data in a much more usable format to automatically create a graph for one specific counter that I’m interested in (like daily utilization numbers), which could then be emailed daily or auto-uploaded to an internal website.

Step 2:

Once the PMCLI data is exported, the next step is to use cygwin bash scripts to parse the csv file and pull out only the performance data that is needed.  Each SAN will need a separate script for each type of performance data.  I have four scripts configured to run based on the data that I want to monitor.  The scripts are located in my cygwin home directory.

The scripts I use:

  • Iostats.sh (for total IO throughput)
  • Queuestats.sh (for disk queue length)
  • Resptime.sh (for disk response time in ms)
  • Utilstats.sh (for % utilization)

Here is a sample shell script for parsing the CSV export file (iostats.sh):


#This will pull only the timestamp line from the top of the CSV output file. I’ll paste it back in later.

grep -m 1 “/” interval.csv > timestamp.csv

#This will pull out only lines that begin with “total througput io per sec”.

grep -i “^Total Throughput IO per sec” interval.csv >> stats.csv

#This will pull out the disk/LUN title info for the first column.  I’ll add this back in later.

grep -i “Data Collected for DiskStats -” interval.csv > diskstats.csv

grep -i “Data Collected for LUNStats -” interval.csv > lunstats.csv

#This will create a column with the disk/LUN number .  I’ll paste it into the first column later.

cat diskstats.csv lunstats.csv > data.csv

#This adds the first column (disk/LUN) and combines it with the actual performance data columns.

paste data.csv stats.csv > combined.csv

#This combines the timestamp header at the top with the combined file from the previous step to create the final file we’ll use for the graph.  There is also a step to append the current date and copy the csv file to an archive directory.

cat timestamp.csv combined.csv > iostats.csv

cp iostats.csv /cygdrive/e/SAN/csv_archive/iostats_archive_$(date +%y%m%d).csv

#  This removes all the temporary files created earlier in the script.  They’re no longer needed.

rm timestamp.csv

rm stats.csv

rm diskstats.csv

rm lunstats.csv

rm data.csv

rm combined.csv

#This strips the last two lines of the CSV (Storage Processor data).  The resulting file is used for the “all disks” spreadsheet.  We don’t want the SP
data to skew the graph.  This CSV file is also copied to the archive directory.

sed ‘$d’ < iostats.csv > iostats2.csv

sed ‘$d’ < iostats2.csv > iostats_disk.csv

rm iostats2.csv

cp iostats_disk.csv /cygdrive/e/SAN/csv_archive/iostats_disk_archive_$(date +%y%m%d).csv

Note: The shell script above can be run in the windows task scheduler as long as you have cygwin installed.  Here’s the syntax:

c:\cygwin\bin\bash.exe -l -c “/home/<username>/iostats.sh”

After running the shell script above, the resulting CSV file contains only Total Throughput (IO per sec) data for each disk and lun.  It will contain data from 00:15 to 23:45 in 15 minute increments.  After the cygwin scripts have run we will have csv datasets that are ready to be exported to a graph.

The Disk and LUN stats are combined into the same CSV file.  It is entirely possible to rewrite the script to only have one or the other.  I put them both in there to make it easier to manually create a graph in excel for either disk or lun stats at a later time (if necessary).  The “all disks graph” does not look any different with both disk and lun stats in there, I tried it both ways and they overlap in a way that makes the extra data indistinguishable in the image.

The resulting data output after running the iostats.sh script is shown below.  I now have a nice, neat excel spreadsheet that lists the total throughput for each disk in the array for the entire day in 15 minute increments.   Having the data formatted in this way makes it super easy to create charts.  But I don’t want to have to do that manually every day, I want the charts to be created automatically.

                                                             3/28/11 00:15       3/28/11 00:30      3/28/11  00:45      3/28/11 01:00

Total Throughput IO per sec   – 0_0_0          12                             20                             23                           23 

Total Throughput IO per sec    – 0_0_1        30.12                        33.23                        40.4                         60.23

Total Throughput IO per sec    – 0_0_2         1.82                          3.3                           5.4                              7.8

Total Throughput IO per sec    -0_0_3         80.62                        13.33                        90.4                         10.3 

Step 3:

Now I want to automatically create the graphs every day using a Perl script.  After the CSV files are exported to a more usable format from the previous step, I Use the GD::Graph library from CPAN (http://search.cpan.org/~mverb/GDGraph-1.43/Graph.pm) to auto-generate the graphs.

Below is a sample Perl script that will autogenerate a great looking graph based on the CSV ouput file from the previous step.


#Declare the libraries that will be used.

use strict;

use Text::ParseWords;

use GD::Graph::lines;

use Data::Dumper;

#Specify the csv file that will be used to create the graph

my $file = ‘C:\cygwin\home\<username>\iostats_disk.csv’;

#my $file  = $ARGV[0];

my ($output_file) = ($file =~/(.*)\./);

#Create the arrays for the data and the legends

my @data;

my @legends;

#parse csv, generate an error if it fails

open(my $fh, ‘<‘, $file) or die “Can’t read csv file ‘$file’ [$!]\n”;

my $countlines = 0;

while (my $line = <$fh>) {

chomp $line;

my @fields = Text::ParseWords::parse_line(‘,’, 0, $line);

#There are 95 fields generated to correspond to the 95 data collection points in each
of the output files.

my @field =

push @data, \@field;

if($countlines >= 1){

push @legends, @fields[0];




#The data and legend arrays will read 820 lines of the CSV file.  This number will change based on the number of disks in the SAN, and will be different depending on the SAN being reported on.  The legend info will read the first column of the spreadsheet and create a color box that corresponds to the graph line.  For the purpose of this graph, I won’t be using it because 820+ legend entries look like a mess on the screen.

splice @data, 1, -820;

splice @legends, 0, -820;

#Set Graphing Options

my $mygraph = GD::Graph::lines->new(1024, 768);

# There are many graph options that can be changed using the GD::Graph library.  Check the website (and google) for lots of examples.


title => ‘SP IO Utilization (00:15 – 23:45)’,

y_label => ‘IOs Per Second’,

y_tick_number => 4,

values_vertical => 6,

show_values => 0,

x_label_skip => 3,

) or warn $mygraph->error;

#As I said earlier, because of the large number of legend entries for this type of graph, I change the legend to simply read “All disks”.  If you want the legend to actually put the correct entries and colors, use this line instead:  $mygraph->set_legend(@legends);

$mygraph->set_legend(‘All Disks’);

#Plot the data

my $myimage = $mygraph->plot(\@data) or die $mygraph->error;

# Export the graph as a gif image.  The images are currently moved to the IIS folder (c:\inetpub\wwwroot) with one of the scripts.  The could also be emailed using a sendmail utility.

my $format = $mygraph->export_format;

open(IMG,”>$output_file.$format”) or die $!;

binmode IMG;

print IMG $myimage->gif;

close IMG;

After this script runs the resulting image file will be saved in the cygwin home directory (It saves it in the same directory that the CSV file is located in).  One of the nightly scripts I run will copy the image to our interal IIS server’s image directory, and sendmail will email the graph to the SAN Admin team.

That’s it!  You now have lots of pretty graphs with which you can impress your management team. 🙂

Here is a sample graph that was generated with the Perl script: