How to troubleshoot EMC Control Center WLA Archive issues

We’re running EMC Control Center 6.1 UB12, and we use it primarily for its robust performance data collection and reporting capabilities.  Performance Manager is a great tool and I use it frequently.

Over the years I’ve had occasional issues with the WLA Archives not collecting performance data and I’ve had to open service requests to get it fixed.  Now that I’ve been doing this for a while, I’ve collected enough info to troubleshoot this issue and correct it without EMC’s assistance in most cases.

Check your ..\WLAArchives\Archives directory and look under the Clariion (or Celerra) folder, then the folder with your array’s serial number, then the interval folder.  This is where the “*.ttp” (text) and “*.btp” (binary) performance data files are stored for Performance Manager.  Sort by date.  If no new file has been written in the last few hours, data is not being collected.
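If you’d rather not eyeball the folder every time, the same check can be scripted.  This is only a rough sketch: the function names and the three hour threshold are my own assumptions, not anything that ships with ECC, and you’d point it at whichever interval folder you care about.

```python
import time
from pathlib import Path


def hours_since_newest(interval_dir, suffixes=(".ttp", ".btp"), now=None):
    """Return hours since the newest archive file was modified.

    Returns None when the folder holds no .ttp/.btp files at all.
    """
    now = time.time() if now is None else now
    mtimes = [
        p.stat().st_mtime
        for p in Path(interval_dir).iterdir()
        if p.is_file() and p.suffix.lower() in suffixes
    ]
    if not mtimes:
        return None
    return (now - max(mtimes)) / 3600.0


def is_stale(interval_dir, max_age_hours=3.0):
    """True if there are no archive files or the newest one is too old.

    max_age_hours is an assumed stand-in for "the last few hours".
    """
    age = hours_since_newest(interval_dir)
    return age is None or age > max_age_hours
```

Calling is_stale() on an interval folder returns True when that array has gone quiet, which makes it easy to loop over every serial number folder and list only the arrays that need attention.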

Here are the basic items I generally review when data isn’t being collected for an array:

  1. Log in to every array in Unisphere, go to system properties, and on the ‘General’ tab make sure statistics logging is enabled.  I’ve found that if you don’t have an Analyzer license on your array and start the 7 day data collection for a “naz” file, the stats logging option will be disabled once the 7 days are up.  You’ll have to go back in and re-enable it after the 7 day collection is complete.  If stats logging isn’t enabled on the array, the WLA data collection will fail.
  2. If you recently changed the password on your Clariion domain account, make sure that naviseccli is updated properly for security access to all of your arrays (use the “addusersecurity” CLI option), and rediscover all of your arrays from within the ECC console.  There is no way to update an array’s password from within the ECC console; you must go through the discovery process again for all of them.
  3. Verify the agents are running.  In the ECC console, click on the gears icon in the lower right hand corner.  It opens a window that shows the status of all the agents, including the WLA Archiver.  If WLA isn’t started, you can start it by right clicking on any array, choosing Agents, then Start.  Wait about an hour, then check the WLAArchives directories again to see if data is being collected.

If those basic steps don’t work, checking the logs may point you in the right direction:

  1.  Review the Clariion agent logs for errors.  You’re not looking for anything specific here, just do a search for “error”, “unreachable” or for the specific IPs of your arrays and see if there is anything obviously wrong.
            %ECC_INSTALL_ROOT%\exec\MGL610\MGL.log
            %ECC_INSTALL_ROOT%\exec\MGL610\MGL_Bx.log.gz
            %ECC_INSTALL_ROOT%\exec\MGL610\MGL.ini
            %ECC_INSTALL_ROOT%\exec\MGL610\MGL_Err.log
            %ECC_INSTALL_ROOT%\exec\MGL610\MGL_Bx_Err.log
            %ECC_INSTALL_ROOT%\exec\MGL610\MGL_Discovery.log.gz
 

Here’s an example of an error I found in one case:

            MGL 14:10:18 C P I 2536   (29.94 MB) [MGLAgent::ProcessAlert] => Processing SP
            Unreachable alert. MO = APM00100600999, Context = Clariion, Category = SP
            Element = Unreachable
 

  2.  Review the WLA Agent logs.  Again, just search for errors and see if there is anything obvious that’s wrong.

            %ECC_INSTALL_ROOT%\exec\ENW610\ENW.log
            %ECC_INSTALL_ROOT%\exec\ENW610\ENW_Bx.log.gz
            %ECC_INSTALL_ROOT%\exec\ENW610\ENW.ini
            %ECC_INSTALL_ROOT%\exec\ENW610\ENW_Err.log
            %ECC_INSTALL_ROOT%\exec\ENW610\ENW_Bx_Err.log
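
The keyword search through both sets of logs is easy to script if you have a lot of files to get through.  Here’s a minimal sketch of how I’d approach it; the function names are made up for illustration, and I’m assuming the logs are plain text (or gzipped plain text for the .gz files).

```python
import gzip
from pathlib import Path

# Search terms taken from the steps above; add array IPs via extra_terms.
KEYWORDS = ("error", "unreachable")


def open_log(path):
    """Open a plain or gzipped log as text, tolerating odd characters."""
    path = Path(path)
    if path.suffix == ".gz":
        return gzip.open(path, "rt", encoding="utf-8", errors="replace")
    return open(path, "r", encoding="utf-8", errors="replace")


def scan_log(path, extra_terms=()):
    """Return (line_number, line) pairs that contain any search term.

    Matching is case-insensitive, so "Unreachable" and "ERROR" both hit.
    """
    terms = [t.lower() for t in KEYWORDS + tuple(extra_terms)]
    hits = []
    with open_log(path) as fh:
        for lineno, line in enumerate(fh, start=1):
            lowered = line.lower()
            if any(t in lowered for t in terms):
                hits.append((lineno, line.rstrip("\n")))
    return hits
```

On the ECC server you could expand the path first with os.path.expandvars(r"%ECC_INSTALL_ROOT%\exec\MGL610\MGL.log") and pass the result to scan_log(), adding your array IPs as extra_terms.  Expect some noise, since any line that merely mentions the word “error” will match.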
 

If the logs don’t show anything obvious, here are the steps I take to restart everything.  This has worked on several occasions for me.

  1. From the Control Center console, stop all agents on the ECC Agent server.  Do this by right clicking on the agent server (in the left pane), choose agents and stop.  Follow the prompts from there.
  2. Log in to the ECC Agent server console and stop the master agent.  In Computer Management | Services, stop the service titled “EMC ControlCenter Master Agent”.
  3. From the Control Center console, stop all agents on the Infrastructure server.  Do this by right clicking on that server (in the left pane), choose agents and stop.  Follow the prompts from there.
  4. Verify that all services have stopped properly.
  5. From the ECC Agent server console, go to C:\Windows\ECC\ and delete all .comfile and .lck files.
  6. Restart all agents on the Infrastructure server.
  7. Restart the Master Agent on the Agent server.
  8. Restart all other services on the Agent server.
  9. Verify that all services have restarted properly.
  10. Wait at least an hour and check to see if the WLA Archive files are being written.
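
For the file cleanup in step 5, I like to see what is going to be deleted before anything is removed.  The sketch below is one way to do that; the function names and the dry-run default are my own choices, and it should only be run after every agent and the master agent have been stopped.

```python
from pathlib import Path


def find_stale_files(ecc_dir, suffixes=(".comfile", ".lck")):
    """List the .comfile and .lck files sitting directly in ecc_dir."""
    return sorted(
        p
        for p in Path(ecc_dir).iterdir()
        if p.is_file() and p.suffix.lower() in suffixes
    )


def clean(ecc_dir, dry_run=True):
    """Return the matching files; delete them only when dry_run is False."""
    targets = find_stale_files(ecc_dir)
    for p in targets:
        if not dry_run:
            p.unlink()
    return targets
```

Run clean(r"C:\Windows\ECC") first and review the returned list, then call it again with dry_run=False once you’re happy with what it found.  It does not descend into subfolders, which matches how I read step 5.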

If none of these steps resolve your problem and you don’t see any errors in the logs, it’s time to open an SR with EMC.  I’ve found the EMC staff who support ECC to be very knowledgeable and helpful.

 

 


6 thoughts on “How to troubleshoot EMC Control Center WLA Archive issues”

  1. Thank you! This helped in sorting out some errors I had recently in WLA, but it was on ECC 6.0

    A quick question: do we upgrade to the UB12 bundle ourselves, or do we engage EMC? I have read about some issues with UB12 and VMax. I don’t want to run into multiple issues and have the other arrays go unmanaged for some time with UB12. Any thoughts?

    1. I’m glad my post was able to help you. I’ve applied all the upgrade bundles myself without engaging EMC; I just downloaded and read all of the documentation first. I don’t support VMax hardware and don’t have any experience with it, so if you’re concerned I’d recommend opening an SR to at least confirm that it won’t cause any problems for you. I’ve noticed that arrays will drop out after a FLARE upgrade or if the password is changed on the Clariion/VNX admin ID (not sure about applying update bundles). If you can’t get an existing array recognized (one that previously was), first try upgrading to the latest Unisphere host agent on the ECC servers, then try rediscovering the arrays from the ECC master console.

  2. Great info! I just updated my ECC from UB8 to UB14 since my CX4s were upgraded to FLARE 30 and above. One of my CX boxes was not providing any performance data and, true enough, “statistics logging” was not enabled for that array. The only thing about upgrading ECC is that you need a significant amount of space to hold the backups prior to upgrading.

  3. @emcsan – Thank you for the reply.

    @ewan – glad it worked for you.

    I have a basic question. I’m looking to upgrade our ECC 6.1 to UB14. Is this the same as going to one of the menu options and clicking on Patches and Install Patch? If that is indeed the route, I tried that before without success.

    While I understand the documentation available on EMC website for the upgrade process, is there a simpler step by step that you can provide which will help with the process?

    thanks in advance!

    1. I actually need to upgrade ECC to UB14 as well, it’s been on my to-do list for a while now. I don’t have any steps written out now, but I can certainly take notes and make another post about the patch process once I complete it. I’m going to be doing that sometime in the next week. There is a bit more to it than just clicking on patches and install patch if I remember correctly.
