Powerpath commands in AIX causing unexpected errors / initialization errors.

We recently had a problem with one of our AIX VIO servers not being able to run any PowerPath commands. Every attempt to run a command resulted in an “unexpected error” or an “initialization error”. According to EMC, the root cause is usually either running out of space on the root filesystem or having the data and stack ulimit parameters set too low after adding a large number of new LUNs. We are running AIX 6.1 on an IBM pSeries 550 with PowerPath 5.3 HF1.
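Both of the usual causes can be checked from the shell before calling support. The following is a minimal sketch, not an EMC procedure: `report_limits` is a hypothetical helper name made up for illustration, and it only reads the soft limits of the current shell and the free space on the root filesystem.

```shell
#!/bin/sh
# Sketch only: look at the two usual suspects named above.
# report_limits is a hypothetical helper, not a PowerPath or AIX tool.

report_limits() {
    # Soft limits of the current shell, in kbytes (or "unlimited").
    echo "data(kbytes)  $(ulimit -d)"
    echo "stack(kbytes) $(ulimit -s)"
}

# Free space on the root filesystem, in 1 KB blocks.
df -k /

report_limits
```

Run it as the same user that runs the powermt commands (root in our case), since limits are set per user.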

Here are the errors that were popping up:

root@vioserver1:/script # powermt config
Unexpected error occured.

root@vioserver1:/script # powermt display dev=all
Initialization error.

root@vioserver1:/script # naviseccli -h <san_dns_name> lun -list -all
evp_enc.c(282): OpenSSL internal error, assertion failed: inl > 0
ksh: 503926 IOT/Abort trap(coredump)

Having too many LUNs caused the issue: we had recently added another 35, for a total of 70. Increasing the data and stack parameters to ‘unlimited’ resolved the problem. ulimit -a now shows:

root@vioserver1:/script # ulimit -a
time(seconds)        unlimited
file(blocks)         unlimited
data(kbytes)         unlimited
stack(kbytes)        unlimited
memory(kbytes)       unlimited
coredump(blocks)     2097151
nofiles(descriptors) 2000
threads(per process) unlimited
processes(per user)  unlimited
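On AIX these limits are normally stored per user in /etc/security/limits, where a value of -1 means unlimited. The sketch below only prints the stanza that would produce the data and stack values shown above; it does not touch the file. `limits_stanza` is a hypothetical helper name, and the stanza layout is an assumption based on the standard AIX file format, so compare it with your own file before editing anything.

```shell
#!/bin/sh
# Sketch only: print (do not apply) a limits stanza for one user.
# limits_stanza is a hypothetical helper; -1 means "unlimited"
# in /etc/security/limits.

limits_stanza() {
    user=$1
    printf '%s:\n' "$user"
    printf '\tdata = -1\n'
    printf '\tstack = -1\n'
}

# Show what the entry for root would look like.
limits_stanza root
```

Changes to /etc/security/limits generally apply to new login sessions, so log out and back in before re-running ulimit -a or powermt.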


4 thoughts on “Powerpath commands in AIX causing unexpected errors / initialization errors.”

  1. Thank you so much for this tip; it helped us through a tricky situation when migrating from one CLARiiON to another. The new LUNs created on the target CLARiiON did not show up after running cfgmgr. We hit the same issue, and changing the values in /etc/security/limits to -1 worked for us as well.

    Cheers

  2. Hi there emcsan,

    great blog. I came across it and was wondering if I could get your “feeling” on my
    situation: I administer several AIX/VIOS-based systems, and one or two of them have
    I/O problems that cannot be explained so far. The result is high wio (>20%
    sustained) and disk utilization that is not normal for the running application
    and load. The SAN guys keep telling me that PowerPath is malfunctioning, since they
    don’t see any such activity on their side or on the optical switches, and that I
    should reboot the systems to refresh the driver. The VIOS seem fine and just
    pass traffic through (NPIV). AIX is 6.1 and PowerPath is 5.5.

    What do you think?

    Cheers,
    Vourasa

    1. Vourasa,

      I’m not 100% sure what your issue is based on your description. Opening a ticket with both IBM and EMC would probably be a good idea. I understand how hard it is to get downtime for a VIO server, but it could very well be a PowerPath issue, and you should consider upgrading to the latest PowerPath version on your VIO server. I’d also check that all of your fiber cards are functioning normally and seeing traffic, as we’ve had a faulty HBA or a faulty fiber patch cable cause strange issues before. Sorry I can’t be of more help; without being able to look at your system directly, it’s hard to guess what the problem is. Good luck.

      Steve

      1. Thanks for this post. I will try setting the ulimits for root. I have a new 9117-MMC with eight fiber connections to an EMC VMAX. When I boot with all eight connections, the LPAR hangs at 999 (I waited 12 hours). The LPAR boots fine with two connections. I have a fairly large storage allocation (125 TB), and each LUN has 32 paths, so there is a very large number of hdisks. I suspect it has something to do with the number of paths; it seems like I am hitting some type of limit.
