How to troubleshoot EMC Control Center WLA Archive issues

We’re running EMC Control Center 6.1 UB12, and we use it primarily for its robust performance data collection and reporting capabilities.  Performance Manager is a great tool and I use it frequently.

Over the years I’ve had occasional issues with the WLA Archives not collecting performance data and I’ve had to open service requests to get it fixed.  Now that I’ve been doing this for a while, I’ve collected enough info to troubleshoot this issue and correct it without EMC’s assistance in most cases.

Check your ..\WLAArchives\Archives directory and look under the Clariion (or Celerra) folder, then the folder with your array’s serial number, then the interval folder.  This is where the “*.ttp” (text) and “*.btp” (binary) performance data files are stored for Performance Manager.  Sort by date.  If no new file has been written in the last few hours, data is not being collected.
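
If you’d rather not eyeball the folder every time, the same check can be scripted.  This is only a rough sketch, assuming Python is available on the server; the function names and the four-hour threshold are my own choices, not anything ECC provides.

```python
import time
from pathlib import Path

ARCHIVE_SUFFIXES = {".ttp", ".btp"}

def newest_archive_age_hours(interval_dir, now=None):
    """Return the age in hours of the newest .ttp/.btp file, or None if there are none."""
    now = time.time() if now is None else now
    mtimes = [
        p.stat().st_mtime
        for p in Path(interval_dir).iterdir()
        if p.is_file() and p.suffix.lower() in ARCHIVE_SUFFIXES
    ]
    if not mtimes:
        return None
    return (now - max(mtimes)) / 3600.0

def is_collecting(interval_dir, max_age_hours=4.0, now=None):
    """True if an archive file was written within the last max_age_hours (assumed threshold)."""
    age = newest_archive_age_hours(interval_dir, now)
    return age is not None and age <= max_age_hours
```

Point `is_collecting()` at the interval folder for one array (the folder under the serial number) and it returns False when nothing has been written recently.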

Here are the basic items I generally review when data isn’t being collected for an array:

  1. Log in to every array in Unisphere, go to system properties, and on the ‘General’ tab make sure statistics logging is enabled.  I’ve found that if you don’t have an Analyzer license on your array and start the 7-day data collection for a “naz” file, the stats logging option is disabled once the 7 days are up.  You’ll have to go back in and re-enable it after the 7-day collection is complete.  If stats logging isn’t enabled on the array, the WLA data collection will fail.
  2. If you recently changed the password on your Clariion domain account, make sure that naviseccli is updated properly for security access to all of your arrays (use the “addusersecurity” CLI option), and rediscover all of your arrays from within the ECC console.  There is no way to update an array’s password from within the ECC console; you must go through the discovery process again for all of them.
  3. Verify the agents are running.  In the ECC console, click the gears icon in the lower right-hand corner.  This opens a window that shows the status of all the agents, including the WLA Archiver.  If WLA isn’t started, you can start it by right-clicking any array and choosing Agents, then Start.  Wait about an hour, then check the WLAArchives directories again to see whether data is being collected.
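
For the first two items, the naviseccli side can be sketched as command lines.  The snippet below only builds the argument lists; it doesn’t run anything.  The `setstats` command and the `-AddUserSecurity` flags are written as I remember them from the Navisphere CLI, so treat them as assumptions and check them against the CLI reference for your release.  The function names are hypothetical.

```python
def stats_logging_cmd(sp_address, enable=False):
    """Build a naviseccli call that queries (or, with enable=True, turns on) statistics logging.

    Assumption: 'setstats' with no option reports the current state and
    'setstats -on' enables logging.
    """
    cmd = ["naviseccli", "-h", sp_address, "setstats"]
    if enable:
        cmd.append("-on")
    return cmd

def add_user_security_cmd(user, password, scope=0):
    """Build the naviseccli call that refreshes the stored security credentials.

    Assumption: scope 0 means a global (domain) account.
    """
    return [
        "naviseccli", "-AddUserSecurity",
        "-user", user,
        "-password", password,
        "-scope", str(scope),
    ]
```

The lists can be passed to `subprocess.run()` once you’re happy the flags match your CLI version.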

If those basic steps don’t work, checking the logs may point you in the right direction:

  1. Review the Clariion agent logs for errors.  You’re not looking for anything specific here; just search for “error”, “unreachable”, or the specific IPs of your arrays and see if anything is obviously wrong.
            %ECC_INSTALL_ROOT%\exec\MGL610\MGL.log
            %ECC_INSTALL_ROOT%\exec\MGL610\MGL_Bx.log.gz
            %ECC_INSTALL_ROOT%\exec\MGL610\MGL.ini
            %ECC_INSTALL_ROOT%\exec\MGL610\MGL_Err.log
            %ECC_INSTALL_ROOT%\exec\MGL610\MGL_Bx_Err.log
            %ECC_INSTALL_ROOT%\exec\MGL610\MGL_Discovery.log.gz
 

Here’s an example of an error I found in one case:

            MGL 14:10:18 C P I 2536   (29.94 MB) [MGLAgent::ProcessAlert] => Processing SP
            Unreachable alert. MO = APM00100600999, Context = Clariion, Category = SP
            Element = Unreachable
 

  2. Review the WLA Agent logs.  Again, just search for errors and see if anything is obviously wrong.

            %ECC_INSTALL_ROOT%\exec\ENW610\ENW.log
            %ECC_INSTALL_ROOT%\exec\ENW610\ENW_Bx.log.gz
            %ECC_INSTALL_ROOT%\exec\ENW610\ENW.ini
            %ECC_INSTALL_ROOT%\exec\ENW610\ENW_Err.log
            %ECC_INSTALL_ROOT%\exec\ENW610\ENW_Bx_Err.log
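
The searches in both steps are easy to script, including the gzipped logs.  Here’s a minimal sketch, assuming Python on the server; the helper names are mine, and the search terms are just the ones mentioned above plus whatever array IPs you add.

```python
import gzip
import re
from pathlib import Path

def open_log(path):
    """Open a plain or gzip-compressed log file as text."""
    path = Path(path)
    if path.suffix == ".gz":
        return gzip.open(path, "rt", errors="replace")
    return open(path, "r", errors="replace")

def search_log(path, terms):
    """Return (line_number, line) pairs for lines containing any term, case-insensitively."""
    if not terms:
        return []
    pattern = re.compile("|".join(re.escape(t) for t in terms), re.IGNORECASE)
    hits = []
    with open_log(path) as fh:
        for lineno, line in enumerate(fh, start=1):
            if pattern.search(line):
                hits.append((lineno, line.rstrip("\n")))
    return hits
```

Calling `search_log(path, ["error", "unreachable", "<array IP>"])` on each of the MGL and ENW logs gives you a quick list of lines worth reading.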
 

If the logs don’t show anything obvious, here are the steps I take to restart everything.  This has worked on several occasions for me.

  1. From the Control Center console, stop all agents on the ECC Agent server.  Do this by right clicking on the agent server (in the left pane), choose agents and stop.  Follow the prompts from there.
  2. Log in to the ECC Agent server console and stop the master agent.  You can do this in Computer Management | Services by stopping the service titled “EMC ControlCenter Master Agent”.
  3. From the Control Center console, stop all agents on the Infrastructure server.  Do this by right clicking on the Infrastructure server (in the left pane), choose agents and stop.  Follow the prompts from there.
  4. Verify that all services have stopped properly.
  5. From the ECC Agent server console, go to C:\Windows\ECC\ and delete all .comfile and .lck files.
  6. Restart all agents on the Infrastructure server.
  7. Restart the Master Agent on the Agent server.
  8. Restart all other services on the Agent server.
  9. Verify that all services have restarted properly.
  10. Wait at least an hour and check to see if the WLA Archive files are being written.
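
Step 5 is the one I’m most likely to fumble by hand, so here’s a sketch of it in Python.  It assumes the agents and services are already stopped (steps 1 through 4), and it defaults to a dry run that only lists what it would delete.  The function names are hypothetical.

```python
from pathlib import Path

STALE_SUFFIXES = {".comfile", ".lck"}

def find_stale_files(ecc_dir):
    """List the .comfile and .lck files directly inside ecc_dir."""
    return sorted(
        p for p in Path(ecc_dir).iterdir()
        if p.is_file() and p.suffix.lower() in STALE_SUFFIXES
    )

def delete_stale_files(ecc_dir, dry_run=True):
    """Delete the stale files; with dry_run=True (the default) nothing is removed."""
    files = find_stale_files(ecc_dir)
    if not dry_run:
        for p in files:
            p.unlink()
    return files
```

Run `delete_stale_files(r"C:\Windows\ECC")` first to review the list, then again with `dry_run=False` to actually remove the files.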

If none of these steps resolve your problem and you don’t see any errors in the logs, it’s time to open an SR with EMC.  I’ve found the EMC staff who support ECC to be very knowledgeable and helpful.

 

 


Disabling Telnet on Brocade Switches

We were recently directed by audit requirements to disable telnet access on all of our Brocade switches.  We’re going to use ssh only for remote access.  The steps for disabling telnet aren’t obvious, although the process isn’t difficult.  I’ve outlined two different procedures below, as it’s different if you’re running an FOS version below 5.3.x.
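
If you manage a mix of switches, picking the right procedure per switch can be scripted.  This is a small sketch, assuming the version is available as a string such as “v5.3.0” (that format is my assumption); the function names are hypothetical.

```python
import re

def parse_fos_version(version):
    """Extract (major, minor) from a version string such as 'v5.3.0'."""
    m = re.search(r"(\d+)\.(\d+)", version)
    if not m:
        raise ValueError(f"unrecognized FOS version: {version!r}")
    return int(m.group(1)), int(m.group(2))

def telnet_procedure(version):
    """Return 'ipfilter' for FOS 5.3.x and above, 'configure' for 5.2.x and below."""
    return "ipfilter" if parse_fos_version(version) >= (5, 3) else "configure"
```

Comparing the version as a tuple of integers avoids the string-comparison trap where “5.10” would sort below “5.3”.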

Commands for disabling telnet for ipv4 and ipv6

For FOS 5.3.x and above:

You cannot change the default filter sets, so you have to clone default_ipv4 and default_ipv6 to new sets.  While logged on to the switch using ssh, enter the following commands:

ipfilter --clone BlockPort23 -from default_ipv4

ipfilter --clone BlockPort23ipv6 -from default_ipv6

A filter set is built on a list of numbered rules.   You need to verify the number of the rule for the telnet port (23). This can be done with this command:

ipfilter --show

The default rule for telnet is 2.
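
If you capture the output of the show command to a file, the rule number can be pulled out with a short script rather than by eye.  This is a sketch only: it assumes the column layout shown in the sample output later in this post (rule number, source IP, protocol, destination port or port range, action), and the function name is mine.

```python
import re

NAME_RE = re.compile(r"^Name:\s*([^,\s]+)")
RULE_RE = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+(tcp|udp)\s+(\d+)(?:\s*-\s*(\d+))?\s+(permit|deny)\s*$"
)

def find_port_rules(show_output, port=23, proto="tcp"):
    """Return (policy_name, rule_number, action) for each rule covering the port."""
    results = []
    policy = None
    for line in show_output.splitlines():
        m = NAME_RE.match(line)
        if m:
            policy = m.group(1)  # remember which filter set the rules belong to
            continue
        m = RULE_RE.match(line)
        if not m:
            continue  # header or blank line
        low = int(m.group(4))
        high = int(m.group(5)) if m.group(5) else low
        if m.group(3) == proto and low <= port <= high:
            results.append((policy, int(m.group(1)), m.group(6)))
    return results
```

The same function works as a check after the change: the telnet rule should come back with the action “deny”.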

The next step is to delete the old rule and create a new one.  Change the -rule 2 to the appropriate rule number from the previous step, if needed.

ipfilter --delrule BlockPort23 -rule 2

ipfilter --delrule BlockPort23ipv6 -rule 2

ipfilter --addrule BlockPort23 -rule 2 -sip any -dp 23 -proto tcp -act deny

ipfilter --addrule BlockPort23ipv6 -rule 2 -sip any -dp 23 -proto tcp -act deny

Next you need to save the new filter set and activate it:

ipfilter --save BlockPort23

ipfilter --save BlockPort23ipv6

ipfilter --activate BlockPort23

ipfilter --activate BlockPort23ipv6

Now all traffic on port 23 is blocked.  You can verify it by typing ipfilter --show again:

Name: BlockPort23ipv6, Type: ipv6, State: active
Rule    Source IP                               Protocol   Dest Port   Action
1     any                                            tcp       22     permit 
2     any                                            tcp       23     deny 
3     any                                            tcp      897     permit 
4     any                                            tcp      898     permit 
5     any                                            tcp      111     permit 
6     any                                            tcp       80     permit 
7     any                                            tcp      443     permit 
8     any                                            udp      161     permit 
9     any                                            udp      111     permit 
10    any                                            udp      123     permit 
11    any                                            tcp      600 - 1023     permit 
12    any                                            udp      600 - 1023     permit 
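
With more than a handful of switches, it helps to generate the whole command sequence per switch and paste it into the ssh session.  Here’s a sketch that just builds the strings.  It assumes the double-hyphen form of the options (`--clone`, `--delrule`, and so on) and takes the telnet rule number as a parameter, since it may not be 2 on every switch.  The function name is hypothetical.

```python
def telnet_block_commands(policy, source_policy, rule=2):
    """Build the ipfilter commands that clone a default set and deny TCP port 23."""
    return [
        f"ipfilter --clone {policy} -from {source_policy}",
        f"ipfilter --delrule {policy} -rule {rule}",
        f"ipfilter --addrule {policy} -rule {rule} -sip any -dp 23 -proto tcp -act deny",
        f"ipfilter --save {policy}",
        f"ipfilter --activate {policy}",
    ]

def all_commands(rule=2):
    """Commands for both the IPv4 and IPv6 filter sets, using the names from this post."""
    return (
        telnet_block_commands("BlockPort23", "default_ipv4", rule)
        + telnet_block_commands("BlockPort23ipv6", "default_ipv6", rule)
    )
```

Printing `"\n".join(all_commands())` gives the ten commands in the order used above, IPv4 set first.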

For FOS 5.2.x and below:

It’s a bit simpler for the older FOS versions.  Simply type “configure” at the prompt, type yes for system services, then ‘off’ for telnetd.

switchname:admin> configure
Not all options will be available on an enabled switch. To disable the switch, use the "switchDisable" command.
Configure...
  System services (yes, y, no, n): [no] y
    rstatd (on, off): [off]
    rusersd (on, off): [off]
    telnetd (on, off): [on] off
    ssl attributes (yes, y, no, n): [no]
    http attributes (yes, y, no, n): [no]
    snmp attributes (yes, y, no, n): [no]
    rpcd attributes (yes, y, no, n): [no]
    cfgload attributes (yes, y, no, n): [no]
    webtools attributes (yes, y, no, n): [no]