The SAN Guy

tips and tricks from an IT veteran…

VPLEX Unisphere Login hung at “Retrieving Meta-Volume Information”

I recently had an issue where I was unable to log in to the Unisphere GUI on the VPLEX. The login would hang with the message “Retrieving Meta-Volume Information” after the progress bar reached about 30%.

This was caused by a hung Java process.  In order to resolve it, you must restart the management server. This will not cause any disruption to hosts connected to the VPLEX.

To do this, run the following command:

ManagementServer:/> sudo /etc/init.d/VPlexManagementConsole restart

If this hangs or does not complete, you will need to run the top command to identify the PID for the java service:

Mem:   3920396k total,  2168748k used,  1751648k free,    29412k buffers
Swap:  8388604k total,    54972k used,  8333632k free,   527732k cached

26993 service   20   0 2824m 1.4g  23m S     14 36.3  18:58.31 java
 4948 rabbitmq  20   0  122m  42m 1460 S      1  1.1  13118:32 beam.smp
    1 root      20   0 10540   48   36 S      0  0.0  12:34.13 init

Once you’ve identified the PID for the java service, you can kill the process with the kill command, and then run the command to restart the management console again.

ManagementServer:/> sudo kill -9 26993
ManagementServer:/> sudo /etc/init.d/VPlexManagementConsole start

Once the management server restarts, you should be able to log in to the Unisphere for VPLEX GUI again.
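
If you would rather not read the PID off the screen, the lookup can be scripted. This is only a sketch: java_pids is a hypothetical helper name, and I am assuming the management server's top supports the usual batch mode (-b -n 1), which I have not verified on every release. The example feeds it the sample lines from above.

```shell
#!/usr/bin/env bash
# java_pids: read top-style process lines on stdin and print the PID
# (first column) of every line whose last column is exactly "java".
# Hypothetical helper name; not part of any VPLEX tooling.
java_pids() {
    awk '$NF == "java" { print $1 }'
}

# Example using the sample top lines from this post:
printf '%s\n' \
    '26993 service   20   0 2824m 1.4g  23m S     14 36.3  18:58.31 java' \
    ' 4948 rabbitmq  20   0  122m  42m 1460 S      1  1.1  13118:32 beam.smp' \
    | java_pids

# On a live system (assumption: top accepts -b -n 1):
#   top -b -n 1 | java_pids
```

Check the PID it prints against the top output before passing it to kill.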

Default Passwords

Here is a collection of default passwords for EMC, HP, Cisco, VMware, TrendMicro and IBM hardware & software.

EMC Secure Remote Support (ESRS) Axeda Policy Manager Server:

  • Username: admin
  • Password: EMCPMAdm7n

EMC VNXe Unisphere (EMC VNXe Series Quick Start Guide, step 4):

  • Username: admin
  • Password: Password123#

EMC vVNX Unisphere:

  • Username: admin
  • Password: Password123#
    NB You must change the administrator password during this first login.

EMC CloudArray Appliance:

  • Username: admin
  • Password: password
    NB Upon first login you are prompted to change the password.

EMC CloudBoost Virtual Appliance:

  • Username: local\admin
  • Password: password
    NB You must immediately change the admin password.
    $ password <current_password> <new_password>

EMC Ionix Unified Infrastructure Manager/Provisioning (UIM/P):

  • Username: sysadmin
  • Password: sysadmin

EMC VNX Monitoring and Reporting:

  • Username: admin
  • Password: changeme

EMC RecoverPoint:

  • Username: admin
    Password: admin
  • Username: boxmgmt
    Password: boxmgmt
  • Username: security-admin
    Password: security-admin

EMC XtremIO:

XtremIO Management Server (XMS)

  • Username: xmsadmin
    password: 123456 (prior to v2.4)
    password: Xtrem10 (v2.4+)

XtremIO Management Secure Upload

  • Username: xmsupload
    Password: xmsupload

XtremIO Management Command Line Interface (XMCLI)

  • Username: tech
    password: 123456 (prior to v2.4)
    password: X10Tech! (v2.4+)

XtremIO Management Command Line Interface (XMCLI)

  • Username: admin
    password: 123456 (prior to v2.4)
    password: Xtrem10 (v2.4+)

XtremIO Graphical User Interface (XtremIO GUI)

  • Username: tech
    password: 123456 (prior to v2.4)
    password: X10Tech! (v2.4+)

XtremIO Graphical User Interface (XtremIO GUI)

  • Username: admin
    password: 123456 (prior to v2.4)
    password: Xtrem10 (v2.4+)

XtremIO Easy Installation Wizard (on storage controllers / nodes)

  • Username: xinstall
    Password: xiofast1

XtremIO Easy Installation Wizard (on XMS)

  • Username: xinstall
    Password: xiofast1

Basic Input/Output System (BIOS) for storage controllers / nodes

  • Password: emcbios

Basic Input/Output System (BIOS) for XMS

  • Password: emcbios

EMC ViPR Controller :
http://ViPR_virtual_ip (the ViPR public virtual IP address, also known as the

  • Username: root
    Password: ChangeMe

EMC ViPR Controller Reporting vApp:

  • Username: admin
    Password: changeme

EMC Solutions Integration Service:
https://<Solutions Integration Service IP Address>:5480

  • Username: root
    Password: emc

EMC VSI for VMware vSphere Web Client:
https://<Solutions Integration Service IP Address>:8443/vsi_usm/

  • Username: admin
  • Password: ChangeMe

After the Solutions Integration Service password is changed, it cannot be modified.
If the password is lost, you must redeploy the Solutions Integration Service and use the default login ID and password to log in.

Cisco Integrated Management Controller (IMC) / CIMC / BMC:

  • Username: admin
  • Password: password

Cisco UCS Director:

  • Username: admin
  • Password: admin
  • Username: shelladmin
  • Password: changeme

Hewlett Packard P2000 StorageWorks MSA Array Systems:

  • Username: admin
  • Password: !admin (exclamation mark ! before admin)
  • Username: manage
  • Password: !manage (exclamation mark ! before manage)

IBM Security Access Manager Virtual Appliance:

  • Username: admin
  • Password: admin

VCE Vision:

  • Username: admin
  • Password: 7j@m4Qd+1L
  • Username: root
  • Password: V1rtu@1c3!

VMware vSphere Management Assistant (vMA):

  • Username: vi-admin
  • Password: vmware

VMware Data Recovery (VDR):

  • Username: root
  • Password: vmw@re (make sure you enter @ as Shift-2 as in US keyboard layout)

VMware vCenter Hyperic Server:

  • Username: root
  • Password: hqadmin


  • Username: hqadmin
  • Password: hqadmin

VMware vCenter Chargeback:

  • Username: root
  • Password: vmware

VMware vCenter Server Appliance (VCSA) 5.5:

  • Username: root
  • Password: vmware

VMware vCenter Operations Manager (vCOPS):

Console access:

  • Username: root
  • Password: vmware


  • Username: admin
  • Password: admin

Administrator Panel:

  • Username: admin
  • Password: admin

Custom UI User Interface:

  • Username: admin
  • Password: admin

VMware vCenter Support Assistant:

  • Username: root
  • Password: vmware

VMware vCenter / vRealize Infrastructure Navigator:

  • Username: root
  • Password: specified during OVA deployment

VMware ThinApp Factory:

  • Username: admin
  • Password: blank (no password)

VMware vSphere vCloud Director Appliance:

  • Username: root
  • Password: vmware

VMware vCenter Orchestrator :
https://Server_Name_or_IP:8281/vco – VMware vCenter Orchestrator
https://Server_Name_or_IP:8283 – VMware vCenter Orchestrator Configuration

  • Username: vmware
  • Password: vmware

VMware vCloud Connector Server (VCC) / Node (VCN):

  • Username: admin
  • Password: vmware
  • Username: root
  • Password: vmware

VMware vSphere Data Protection Appliance:

  • Username: root
  • Password: changeme

VMware HealthAnalyzer:

  • Username: root
  • Password: vmware

VMware vShield Manager:

  • Username: admin
  • Password: default
    type enable to enter Privileged Mode, password is 'default' as well

Teradici PCoIP Management Console:

  • The default password is blank

Trend Micro Deep Security Virtual Appliance (DS VA):

  • Login: dsva
  • password: dsva

Citrix Merchandising Server Administrator Console:

  • User name: root
  • password: C1trix321

VMTurbo Operations Manager:

  • User name: administrator
  • password: administrator
    If DHCP is not enabled, configure a static address by logging in with these credentials:
  • User name: ipsetup
  • password: ipsetup
    Console access:
  • User name: root
  • password: vmturbo

Scripting an alert for checking the availability of individual CIFS server shares

I was recently asked to come up with a method to alert on the availability of specific CIFS file shares in our environment.  This was due to a recent issue we had on our VNX, where a data mover crashed and corrupted a single file system when it came back up.  We were unaware for several hours that the one file system was unavailable on our CIFS server.

This particular script requires maintenance whenever a new file system share is added to a CIFS server.  A unique line must be added for every file system share that you have configured.  If a file system is not mounted and the share is inaccessible, an email alert will be sent.  If the share is accessible, the script does nothing when run from the scheduler.  If it’s run manually from the CLI, it will echo back to the screen that the path is active.

This is a bash shell script. I run it on a Windows server with Cygwin installed, using the ‘email’ package for SMTP.  It should also run fine from a Linux server, and you could substitute sendmail or whatever other mail application you use for the ‘email’ syntax.  I have it scheduled to check the availability of CIFS shares every hour.


DIR1=file_system_1; SRV1=cifs_servername;  echo -ne $DIR1 && echo -ne ": " && [ -d //$SRV1/$DIR1 ] && echo "Network Path is Active" || email -b -s "Network Path \\\\$SRV1\\$DIR1 is offline"

DIR2=file_system_2; SRV1=cifs_servername;  echo -ne $DIR2 && echo -ne ": " && [ -d //$SRV1/$DIR2 ] && echo "Network Path is Active" || email -b -s "Network Path \\\\$SRV1\\$DIR2 is offline"

DIR3=file_system_3; SRV1=cifs_servername;  echo -ne $DIR3 && echo -ne ": " && [ -d //$SRV1/$DIR3 ] && echo "Network Path is Active" || email -b -s "Network Path \\\\$SRV1\\$DIR3 is offline"

DIR4=file_system_4; SRV1=cifs_servername;  echo -ne $DIR4 && echo -ne ": " && [ -d //$SRV1/$DIR4 ] && echo "Network Path is Active" || email -b -s "Network Path \\\\$SRV1\\$DIR4 is offline"

DIR5=file_system_5; SRV1=cifs_servername;  echo -ne $DIR5 && echo -ne ": " && [ -d //$SRV1/$DIR5 ] && echo "Network Path is Active" || email -b -s "Network Path \\\\$SRV1\\$DIR5 is offline"
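
The one-line-per-share approach could also be written as a loop over a list of share names, so that adding a share only means adding a name. This is a rough sketch rather than the script I actually run: SRV, SHARES, check_share and notify are hypothetical names, and notify just echoes where the real script would call the mail command.

```shell
#!/usr/bin/env bash
# Hypothetical names throughout: SRV, SHARES, notify and check_share
# are placeholders for illustration only.
SRV="cifs_servername"
SHARES="file_system_1 file_system_2 file_system_3"

# notify MESSAGE: stand-in for the mail command; replace the echo
# with your own mailer invocation.
notify() {
    echo "ALERT: $1"
}

# check_share ROOT SHARE: report whether ROOT/SHARE is a reachable
# directory; returns non-zero and raises an alert when it is not.
check_share() {
    if [ -d "$1/$2" ]; then
        echo "$2: Network Path is Active"
    else
        notify "Network Path $1/$2 is offline"
        return 1
    fi
}

failures=0
for share in $SHARES; do
    check_share "//$SRV" "$share" || failures=$((failures + 1))
done
echo "$failures share(s) offline"
```

The share list is a plain space-separated string, so it assumes share names without spaces.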

EMC World 2015

I’m at EMC World in Las Vegas this week and it’s been fantastic so far.  I’m excited about the new 40TB XtremIO X-Bricks and how we might leverage them for our largest and most important 80TB Oracle database. I’m also excited about possible use cases for the Virtual VNX in our small branch locations, and about the other upcoming developments that I can’t publicly share because I’m under an NDA with EMC.  Truly exciting and innovative technology is coming from them.  VxBlock was also really impressive, although that’s not likely something my company will implement anytime soon.

I found out for the first time today that the excellent VNX Monitoring and Reporting application is now free for the VNX1 platform as well as VNX2.  If you would like to get a license for any of your VNX1 arrays, simply ask your local  sales representative to submit a zero dollar sales order for a license.  We’re currently evaluating ViPR SRM as a replacement for our soon to be “end of life” Control Center install, but until then VNX MR is a fantastic tool that provides nearly the same performance data for no cost at all.  SRM adds much more functionality beyond just VNX monitoring and reporting (i.e., monitoring SAN switches) and I’d highly recommend doing a demo if you’re also still using Control Center.

We also implemented a VPLEX last year and it’s truly been a lifesaver and is an amazing platform.  We currently have a VPLEX Local implementation in our primary data center, and it has allowed us to migrate workloads from one array to another with no disruption to applications.  I’m excited about the possibilities with RecoverPoint as well, although I’m still learning about it.

If anyone else who’s at EMC World happens to read this, comment!  I’d love to hear your experiences and what you’re most excited about with EMC’s latest technology.

Rescan Storage System command on Celerra results in conflict:storageID-devID error

I was attempting to extend our main production NAS file pool on our NS-960 and ran into an issue.  I had recently freed up 8 SATA disks from a block pool and was attempting to re-use them and extend a Celerra file pool.  I created a new RAID Group and LUN that used the maximum capacity of the RAID Group.  I then added the LUN to the celerra storage group, making sure to set the HLU to a number greater than 15.  I then changed the setting on our main production file pool to auto-extend, and clicked on the “Rescan Storage Systems” option.  Unfortunately rescanning produced an error every time it was run.  I have done this exact same procedure in the past and it’s worked fine.  Here is the error:

conflict:storageID-devID: disk=17 old:symm=APM00100600999,dev=001F new:symm=APM00100600999,dev=001F addr=c16t1l11

I checked the disks on the Celerra using the nas_disk -l command, and the new disk shows up as “in use” even though the rescan command didn’t complete properly.

[nasadmin@Celerra tools]$ nas_disk -l
id   inuse  sizeMB    storageID-devID      type   name  servers
17    y     7513381   APM00100600999-001F  CLATA  d17   <BLANK>

Once the dvol is presented to Celerra (assuming the rescan goes fine) it should not be inuse until it is assigned to a storage pool and a file system uses it.  In this case that didn’t happen.  If you run /nas/tools/whereisfs (depending on your DART version, it may be “.whereisfs” with the dot) it shows a listing of every file system and which disk and which LUN they reside on.  I verified that the disk was not in use using that command.

In order to be on the safe side, I opened an SR with EMC rather than simply deleting the disk.  They suggested that the NAS database is corrupted. I’m going to have EMC’s Recovery Team check the usage of the diskvol and then delete it and re-add it.  In order to engage the recovery team you need to sign a “Data Deletion Form” absolving EMC of any liability for data loss, which is standard practice when they delete volumes on a customer array.  If there are any further caveats or important things to note after EMC has taken care of this I’ll update this post.

VPLEX initiator paths dropped

We recently ran into an SP bug check on one of our VNX arrays, and after the SP came back up several of the initiator paths to the VPLEX did not recover.  We were also seeing IO timeouts.  This is a known bug that is triggered by an SP reboot, and it is fixed in Patch 1 for GeoSynchrony 5.3.  EMC has released a script that provides a workaround until the patch can be applied.

The following pre-conditions need to happen during a VNX NDU to see this issue on VPLEX:
1] During a VNX NDU, SPA goes down.
2] At this point IO timeouts start happening on the IT nexuses pertaining to SPA.
3] The IO timeouts cause the VPLEX SCSI layer to send LU Reset TMFs. These LU Reset TMFs time out as well.

You can review ETA 000193541 on EMC’s support site for more information.  It’s a critical bug and I’d suggest patching as soon as possible.


VPLEX Health Check

This is a brief post to share the CLI commands and sample output for a quick VPLEX health check.  Our VPLEX had a dial home event and below are the commands that EMC ran to verify that it was healthy.  Here is the dial home event that was generated:

SymptomCode: 0x8a266032
SymptomCode: 0x8a34601a
Category: Status
Severity: Error
Status: Failed
Component: CLUSTER
ComponentID: director-1-1-A
SubComponent: stdf
CallHome: Yes
FirstTime: 2014-11-14T11:20:11.008Z
LastTime: 2014-11-14T11:20:11.008Z
CDATA: Compare and Write cache transaction submit failed, status 1 [Versions:MS{D30., D30.0.0.112, D30.60.0.3}, Director{}, ClusterWitnessServer{unknown}] RCA: The attempt to start a cache transaction for a Scsi Compare and Write command failed. Remedy: Contact EMC Customer Support.

Description: The processing of a Scsi Compare and Write command could not complete.
ClusterID: cluster-1

Based on that error the commands below were run to make sure the cluster was healthy.

This is the general health check command:

VPlexcli:/> health-check
 Product Version:
 Product Type: Local
 Hardware Type: VS2
 Cluster Size: 2 engines
 Cluster TLA:
 cluster-1: FNM00141800023

 Cluster    Cluster  Oper   Health  Connected  Expelled  Local-com
 Name       ID       State  State
 ---------  -------  -----  ------  ---------  --------  ---------
 cluster-1  1        ok     ok      True       False     ok

 Meta Data:
 Cluster    Volume                           Volume       Oper   Health  Active
 Name       Name                             Type         State  State
 ---------  -------------------------------  -----------  -----  ------  ------
 cluster-1  c1_meta_backup_2014Nov21_100107  meta-volume  ok     ok      False
 cluster-1  c1_meta_backup_2014Nov20_100107  meta-volume  ok     ok      False
 cluster-1  c1_meta                          meta-volume  ok     ok      True

 Director Firmware Uptime:
 Director        Firmware Uptime
 --------------  ------------------------------------------
 director-1-1-A  147 days, 16 hours, 15 minutes, 29 seconds
 director-1-1-B  147 days, 15 hours, 58 minutes, 3 seconds
 director-1-2-A  147 days, 15 hours, 52 minutes, 15 seconds
 director-1-2-B  147 days, 15 hours, 53 minutes, 37 seconds

 Director OS Uptime:
 Director        OS Uptime
 --------------  ---------------------------
 director-1-1-A  12:49pm up 147 days 16:09
 director-1-1-B  12:49pm up 147 days 16:09
 director-1-2-A  12:49pm up 147 days 16:09
 director-1-2-B  12:49pm up 147 days 16:09

 Inter-director Management Connectivity:
 Director        Checking  Connectivity
 --------------  --------  ------------
 director-1-1-A  Yes       Healthy
 director-1-1-B  Yes       Healthy
 director-1-2-A  Yes       Healthy
 director-1-2-B  Yes       Healthy

 Front End:
 Cluster    Total    Unhealthy  Total       Total  Total     Total
 Name       Storage  Storage    Registered  Ports  Exported  ITLs
            Views    Views      Initiators         Volumes
 ---------  -------  ---------  ----------  -----  --------  -----
 cluster-1  56       0          299         16     353       9802

 Cluster    Total    Unhealthy  Total    Unhealthy  Total  Unhealthy  No     Not visible  With
 Name       Storage  Storage    Virtual  Virtual    Dist   Dist       Dual   from         Unsupported
            Volumes  Volumes    Volumes  Volumes    Devs   Devs       Paths  All Dirs     # of Paths
 ---------  -------  ---------  -------  ---------  -----  ---------  -----  -----------  -----------
 cluster-1  203      0          199      0          0      0          0      0            0

 Consistency Groups:
 Cluster    Total        Unhealthy    Total         Unhealthy
 Name       Synchronous  Synchronous  Asynchronous  Asynchronous
            Groups       Groups       Groups        Groups
 ---------  -----------  -----------  ------------  ------------
 cluster-1  0            0            0             0

 Cluster Witness:
 Cluster Witness is not configured

This command checks the status of the cluster:

VPlexcli:/> cluster status
Cluster cluster-1
operational-status: ok
health-state: ok
local-com: ok
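
If you save the output of the cluster status command to a file, a small filter can confirm that all three fields read “ok”. This is a sketch only: cluster_ok is a hypothetical helper, and how you capture the VPlexcli output is left to you. The example pipes in the sample output shown above.

```shell
#!/usr/bin/env bash
# cluster_ok: read saved "cluster status" output on stdin and succeed
# only if operational-status, health-state and local-com are all "ok".
# Hypothetical helper; it only parses text, it does not talk to the VPLEX.
cluster_ok() {
    awk '
        $1 == "operational-status:" || $1 == "health-state:" || $1 == "local-com:" {
            seen++
            if ($2 != "ok") bad++
        }
        END { if (seen == 3 && bad == 0) exit 0; exit 1 }
    '
}

# Example using the sample output from this post:
printf '%s\n' \
    'Cluster cluster-1' \
    'operational-status: ok' \
    'health-state: ok' \
    'local-com: ok' \
    | cluster_ok && echo "cluster-1 looks healthy"
```

The filter fails if any of the three fields is missing, so truncated output is not mistaken for a healthy cluster.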

This command checks the state of the storage volumes:

VPlexcli:/> storage-volume summary
Storage-Volume Summary  (no tier)
----------------------  --------------------

Health    out-of-date      0
          storage-volumes  203
          unhealthy        0

Vendor    DGC              203

Use       meta-data        4
          used             199

Capacity  total            310T

Matching LUNs and UIDs when presenting VPLEX LUNs to Unix hosts

Our naming convention for LUNs includes the pool ID, LUN number, server name, filesystem/drive letter, last four digits of the array’s serial number, and size (in GB). Having all of this information in the LUN name makes for very easy reporting and identification of LUNs on a server.  This is what our LUN names look like: P1_LUN100_SPA_0000_servername_filesystem_150G

Typically, when presenting a new LUN to our AIX administration team for a new server build, they would assign the LUNs to specific volume groups based on the LUN names. The command ‘powermt display dev=hdiskpower#’ always includes the name & intended volume group for the LUN, making it easy for our admins to identify a LUN’s purpose.  Now that we are presenting LUNs through our VPlex, when they run a powermt display on the server the UID for the LUN is shown, not the name.  Below is a sample output of what is displayed.

root@VIOserver1:/ # powermt display dev=all
Pseudo name=hdiskpower0
VPLEX ID=FNM00141800023
Logical device ID=6000144000000010704759ADDF2487A6 (this would usually be displayed as a LUN name)
state=alive; policy=ADaptive; queued-IOs=0
--------------- Host ---------------   - Stor -   -- I/O Path --   -- Stats ---
###  HW Path    I/O Paths   Interf.   Mode     State   Q-IOs  Errors
  1  fscsi1     hdisk8      CL1-0B    active   alive       0       0
  1  fscsi1     hdisk6      CL1-0F    active   alive       0       0
  0  fscsi0     hdisk4      CL1-0D    active   alive       0       0
  0  fscsi0     hdisk2      CL1-07    active   alive       0       0

Pseudo name=hdiskpower1
VPLEX ID=FNM00141800023
Logical device ID=6000144000000010704759ADDF2487A1 (this would usually be displayed as a LUN name)
state=alive; policy=ADaptive; queued-IOs=0
--------------- Host ---------------   - Stor -   -- I/O Path --   -- Stats ---
###  HW Path    I/O Paths   Interf.   Mode     State   Q-IOs  Errors
  1  fscsi1     hdisk9      CL1-0B    active   alive       0       0
  1  fscsi1     hdisk7      CL1-0F    active   alive       0       0
  0  fscsi0     hdisk5      CL1-0D    active   alive       0       0
  0  fscsi0     hdisk3      CL1-07    active   alive       0       0

In order to easily match up the UIDs with the LUN names on the server, an extra step needs to be taken on the VPlex CLI. Log in to the VPlex using a terminal emulator, and once you’re logged in use the ‘vplexcli’ command. That will take you to a shell that allows for additional commands to be entered.

login as: admin
Using keyboard-interactive authentication.
Last login: Fri Sep 19 13:35:28 2014 from
admin@service:~> vplexcli
Trying ::1...
Connected to localhost.
Escape character is '^]'.

Enter User Name: admin



Once you’re in, run the ls -t command with the additional options listed below. You will need to substitute the STORAGE_VIEW_NAME with the actual name of the storage view that you want a list of LUNs from.

VPlexcli:/> ls -t /clusters/cluster-1/exports/storage-views/STORAGE_VIEW_NAME::virtual-volumes

The output looks like this:

Name             Value
---------------  --------------------------------------------------------------------------------
virtual-volumes  [(0,P1_LUN411_7872_SPB_VIOServer1_VIO_10G,VPD83T3:6000144000000010704759addf2487a6,10G),

Now you can easily see which disk UID is tied to which LUN name.
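
To line the two views up without reading them by eye, the saved ls -t output can be turned into a “UID name” list. This is a sketch under a few assumptions: uid_map is a hypothetical helper, it expects each entry in the (index,name,VPD83T3:uid,size) form shown above, and it uppercases the UID because powermt prints it in uppercase.

```shell
#!/usr/bin/env bash
# uid_map: read saved "ls -t ...::virtual-volumes" output on stdin and
# print one "UID LUN_NAME" pair per virtual volume.
# Hypothetical helper; assumes entries look like
# (index,name,VPD83T3:uid,size) as in the sample output.
uid_map() {
    # Put every parenthesised tuple on its own line, then pick the
    # four-field tuples whose third field carries the VPD83T3: prefix.
    tr '()' '\n\n' | awk -F',' '
        NF == 4 && $3 ~ /^VPD83T3:/ {
            uid = toupper(substr($3, 9))   # drop the 8-character "VPD83T3:" prefix
            print uid, $2
        }
    '
}

# Example using the sample line from this post:
echo 'virtual-volumes [(0,P1_LUN411_7872_SPB_VIOServer1_VIO_10G,VPD83T3:6000144000000010704759addf2487a6,10G),' \
    | uid_map
# prints: 6000144000000010704759ADDF2487A6 P1_LUN411_7872_SPB_VIOServer1_VIO_10G
```

The first column can then be matched against the “Logical device ID” line in the powermt output on the host.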

If you would like to get a list of every storage view and every LUN:UID mapping, you can substitute the storage view name with an asterisk (*).

VPlexcli:/> ls -t /clusters/cluster-1/exports/storage-views/*::virtual-volumes

The resulting report will show a complete list of LUNs, grouped by storage view:

/clusters/cluster-1/exports/storage-views/VIOServer2:
Name             Value
---------------  --------------------------------------------------------------------------------
virtual-volumes  [(0,P1_LUN421_9322_SPB_VIOServer2_root_75G,VPD83T3:6000144000000010704759addf248ad9,75G),

Name             Value
---------------  --------------------------------------------------------------------------------
virtual-volumes  [(1,R2_LUN1025_9322_SPB_VIOServer2_redo2_12G,VPD83T3:6000144000000010704759addf248b09,12G),

Name             Value
---------------  --------------------------------------------------------------------------------
virtual-volumes  [(0,P0_LUN101_3432_SPA_VIOServer3_root_75G,VPD83T3:6000144000000010704759addf248a0a,75G),

Our VPlex has only been installed for a few months and our team is still learning.  There may be a better way to do this, but it’s all I’ve been able to figure out so far.

The steps for NFS exporting a file system on a VDM

I made a blog post back in January 2014 about creating an NFS export on a virtual data mover, but I didn’t give much detail on the commands you need to use to actually do it. As I pointed out back then, you can’t NFS export a VDM file system from within Unisphere. However, when a file system is mounted on a VDM, its path from the root of the physical Data Mover can be exported from the CLI.

The first thing that needs to be done is determining the physical Data Mover where the VDM resides.

Below is the command you’d use to make that determination:

[nasadmin@Celerra_hostname]$ nas_server -i -v name_of_your_vdm | grep server
server = server_4

That will show you just the physical data mover that it’s mounted on. Without the grep statement, you’d get the output below. If you have hundreds of filesystems it will cause the screen to scroll the info you’re looking for off the top of the screen. Using grep is more efficient.

[nasadmin@Celerra_hostname]$ nas_server -i -v name_of_your_vdm
id = 1
name = name_of_your_vdm
acl = 0
type = vdm
server = server_4
rootfs = root_fs_vdm_name_of_your_vdm
I18N mode = UNICODE
mountedfs = fs1,fs2,fs3,fs4,fs5,fs6,fs7,fs8,…
member_of =
status :
defined = enabled
actual = loaded, active
Interfaces to services mapping:
interface=10-3-20-167 :cifs
interface=10-3-20-130 :cifs
interface=10-3-20-131 :cifs

Next you need to determine the file system path from the root of the Data Mover. This can be done with the server_mount command. As in the prior step, it’s more efficient if you grep for the name of the file system. You can run it without the grep command, but it could generate multiple screens of output depending on the number of file systems you have.

[nasadmin@stlpemccs04a /]$ server_mount server_4 | grep Filesystem_03
Filesystem_03 on /root_vdm_3/Filesystem_03 uxfs,perm,rw

The final step is to actually export the file system using this path from the prior step. The file system must be exported from the root of the Data Mover rather than the VDM. Note that once you have exported the VDM file system from the CLI, you can then manage it from within Unisphere if you’d like to set server permissions. The “-option anon=0,access=server_name,root=server_name” portion of the CLI command below can be left off if you’d prefer to use the GUI for that.

[nasadmin@Celerra_hostname]$ server_export server_4 -Protocol nfs -option anon=0,access=server_name,root=server_name /root_vdm_3/Filesystem_03
server_4 : done

At this point the client can mount the path with NFS.
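
If you script these steps, the path can be pulled out of the server_mount output with an exact match on the file system name, which avoids grep also matching names that merely contain it. This is a sketch only: fs_path is a hypothetical helper, and datamover_interface and /mnt/fs03 in the client example are placeholders.

```shell
#!/usr/bin/env bash
# fs_path NAME: read server_mount output on stdin and print the mount
# path of the file system named exactly NAME. Lines are expected in
# the form "NAME on PATH options". Hypothetical helper name.
fs_path() {
    awk -v fs="$1" '$1 == fs && $2 == "on" { print $3 }'
}

# Example using the sample output line from this post:
path=$(printf '%s\n' 'Filesystem_03 on /root_vdm_3/Filesystem_03 uxfs,perm,rw' \
    | fs_path Filesystem_03)
echo "$path"

# On the NFS client (placeholders: datamover_interface, /mnt/fs03):
#   mount -t nfs datamover_interface:/root_vdm_3/Filesystem_03 /mnt/fs03
```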

Dynamic allocation pool limit has been reached

We were having issues with our backup jobs failing on CIFS share backups using Symantec NetBackup.  The jobs died with a “status 24”, which means the backup was losing communication with the source.  Our backup administrator provided me with the exact times and dates of the failures, and I noticed that immediately preceding each failure this error appeared in the server log on the control station:

2012-08-05 07:09:37: KERNEL: 4: 10: Dynamic allocation pool limit has been reached. Limit=0x30000 Current=0x50920 Max=0x0

A quick google search came up with this description of the error:  “The maximum amount of memory (number of 8K pages) allowed for dynamic memory allocation has almost been reached. This indicates that a possible memory leak is in progress and the Data Mover may soon panic. If Max=0(zero) then the system forced panic option is disabled. If Max is not zero then the system will force a panic if dynamic memory allocation reaches this level.”

Because the error shows up right before each backup failure, I saw the correlation.  To fix it, you’ll need to raise the heap limit from the default of 0x00030000 to a larger value.  Here is the command to do that:

.server_config server_2 -v "param kernel mallocHeapLimit=0x40000" (to change the value)
.server_config server_2 -v "param kernel" (will list the kernel parameters).
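
To put the hex values in perspective, the error description says the limit is counted in 8K pages, so the values can be converted to megabytes with shell arithmetic. A small sketch, assuming 8K means 8 KiB; pages_to_mib is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# pages_to_mib PAGES: convert a count of 8 KiB pages (hex or decimal)
# to whole MiB. Hypothetical helper; assumes "8K" means 8 KiB.
pages_to_mib() {
    echo $(( $1 * 8 / 1024 ))
}

pages_to_mib 0x30000   # default limit             -> 1536
pages_to_mib 0x40000   # raised limit              -> 2048
pages_to_mib 0x50920   # "Current" in the log line -> 2578
```

By this arithmetic the Current value in the sample log line is above both limits, so it is worth watching the server log after the change to see whether the message comes back.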

Below is a list of all the kernel parameters:

Name                                                 Location        Current       Default
----                                                 ----------      ----------    ----------
kernel.AutoconfigDriverFirst                         0x0003b52d30    0x00000000    0x00000000
kernel.BufferCacheHitRatio                           0x0002093108    0x00000050    0x00000050
kernel.MSIXdebug                                     0x0002094714    0x00000001    0x00000001
kernel.MSIXenable                                    0x000209471c    0x00000001    0x00000001
kernel.MSI_NoStop                                    0x0002094710    0x00000001    0x00000001
kernel.MSIenable                                     0x0002094718    0x00000001    0x00000001
kernel.MsiRouting                                    0x0002094724    0x00000001    0x00000001
kernel.WatchDog                                      0x0003aeb4e0    0x00000001    0x00000001
kernel.autoreboot                                    0x0003a0aefc    0x00000258    0x00000258
kernel.bcmTimeoutFix                                 0x0002179920    0x00000002    0x00000002
kernel.buffersWatermarkPercentage                    0x0003ae964c    0x00000021    0x00000021
kernel.bufreclaim                                    0x0003ae9640    0x00000001    0x00000001
kernel.canRunRT                                      0x000208f7a0    0xffffffff    0xffffffff
kernel.dumpcompress                                  0x000208f794    0x00000001    0x00000001
kernel.enableFCFastInit                              0x00022c29d4    0x00000001    0x00000001
kernel.enableWarmReboot                              0x000217ee68    0x00000001    0x00000001
kernel.forceWholeTLBflush                            0x00039d0900    0x00000000    0x00000000
kernel.heapHighWater                                 0x00020930c8    0x00004000    0x00004000
kernel.heapLowWater                                  0x00020930c4    0x00000080    0x00000080
kernel.heapReserve                                   0x00020930c0    0x00022e98    0x00022e98
kernel.highwatermakpercentdirty                      0x00020930e0    0x00000064    0x00000064
kernel.lockstats                                     0x0002093128    0x00000001    0x00000001
kernel.longLivedChunkSize                            0x0003a23ed0    0x00002710    0x00002710
kernel.lowwatermakpercentdirty                       0x0003ae9654    0x00000000    0x00000000
kernel.mallocHeapLimit                               0x0003b5558c    0x00040000    0x00030000  (This is the parameter I changed)
kernel.mallocHeapMaxSize                             0x0003b55588    0x00000000    0x00000000
kernel.maskFcProc                                    0x0002094728    0x00000004    0x00000004
kernel.maxSizeToTryEMM                               0x0003a23f50    0x00000008    0x00000008
kernel.maxStrToBeProc                                0x0003b00f14    0x00000080    0x00000080
kernel.memSearchUsecs                                0x000208fa28    0x000186a0    0x000186a0
kernel.memThrottleMonitor                            0x0002091340    0x00000001    0x00000001
kernel.outerLoop                                     0x0003a0b508    0x00000001    0x00000001
kernel.panicOnClockStall                             0x0003a0cf30    0x00000000    0x00000000
kernel.pciePollingDefault                            0x00020948a0    0x00000001    0x00000001
kernel.percentOfFreeBufsToFreePerIter                0x00020930cc    0x0000000a    0x0000000a
kernel.periodicSyncInterval                          0x00020930e4    0x00000005    0x00000005
kernel.phTimeQuantum                                 0x0003b86e18    0x000003e8    0x000003e8
kernel.priBufCache.ReclaimPolicy                     0x00020930f4    0x00000001    0x00000001
kernel.priBufCache.UsageThreshold                    0x00020930f0    0x00000032    0x00000032
kernel.protect_zero                                  0x0003aeb4e8    0x00000001    0x00000001
kernel.remapChunkSize                                0x0003a23fd0    0x00000080    0x00000080
kernel.remapConfig                                   0x000208fe40    0x00000002    0x00000002
kernel.retryTLBflushIPI                              0x00020885b0    0x00000001    0x00000001
kernel.roundRobbin                                   0x0003a0b504    0x00000001    0x00000001
kernel.setMSRs                                       0x0002088610    0x00000001    0x00000001
kernel.shutdownWdInterval                            0x0002093238    0x0000000f    0x0000000f
kernel.startAP                                       0x0003aeb4e4    0x00000001    0x00000001
kernel.startIdleTime                                 0x0003aeb570    0x00000001    0x00000001
                                                     0x0003b00060    0x00000000    0x00000000
kernel.switchStackOnPanic                            0x000208f8e0    0x00000001    0x00000001
kernel.threads.alertOptions                          0x0003a22bf4    0x00000000    0x00000000
kernel.threads.maxBlockedTime                        0x000208f948    0x00000168    0x00000168
kernel.threads.minimumAlertBlockedTime               0x000208f94c    0x000000b4    0x000000b4
kernel.threads.panicIfHung                           0x0003a22bf0    0x00000000    0x00000000
kernel.timerCallbackHistory                          0x000208f780    0x00000001    0x00000001
kernel.timerCallbackTimeLimitMSec                    0x000208f784    0x00000003    0x00000003
kernel.trackIntrStats                                0x000209021c    0x00000001    0x00000001
kernel.usePhyDevName                                 0x0002094720    0x00000001    0x00000001
