Alarms and Warnings Appendix

This appendix summarizes all the Alarms and Warnings and the actions to take when they occur.

Alarms and Warnings related to Network Issues detected by WBC Network Health Monitor. All the following will generate emails.

  • New Device, A new device joined the network
    • ACTION: Investigate the device if it is unknown. NOTE: Emails are disabled during the first 2 hours of setting a new top parent.
  • New Mac, A new MAC joined which has never been seen before

    • WBC Network Health Monitor maintains a table of MAC addresses at the host and supervisor level. New MACs which have never been seen by the system create events.
  • Critical Disconnect, A critical Always-On device disconnected
    • A device expected to always be connected has disconnected.
    • ACTION: Find the cause and resolve it. Delete the device in IntraVUE if it is no longer in service. If this is causing excessive alarms, either change the device's state from Always On or use the Disable Alarms for this device option.
  • Possible duplicate IP, Many MAC changes for an IP indicate a duplicate IP

    • If the IntraVUE event log shows repeated messages about changed MACs for the same IP address (e.g., device 1.2.3.4 changed mac from A to B), there is a possible duplicate IP in the network. WBC Network Health Monitor checks the last 24 hours of events to create this alarm. Call Tech Support for assistance validating the messages.
  • Top Parent Disconnect, Top parent not responding

    • If the top parent is disconnected, IntraVUE will not be able to get SNMP ARP and other data.
    • ACTION: If the top parent IP is no longer available, remove the network in the IntraVUE settings and add a new network with the new top parent IP address.
  • New Cloaked Mac, A new MAC discovered in ARP data which is not responding to Pings

    • An IP that is unknown to IntraVUE has been discovered in ARP data. IntraVUE only discovers devices which respond to pings. See XXXX
    • ACTION: Awareness. If possible, find the switch reporting the MAC on a port to determine what the device really is.
  • NHM successive high incidents, The number of critical devices having over-range NHM Incidents exceeds the current limit in successive minutes.

    • A large percentage of devices are exceeding their nominal high value indicating a possible network issue.
    • ACTION: If this occurs often, contact Tech Support for possible adjustment in the values used to determine the expected limits for devices.
  • Possible Broadcast Storm. A broadcast storm is detected when a default 20% of the devices in a network exceed a default 30% Receive bandwidth.
    • ACTION: Check the IntraVUE‎ event log for the parent of the first device reporting receive bandwidth exceeding the default limit.
  • Managed Switch Lost SNMP. When a Managed Switch loses SNMP it can no longer provide position information about devices or other important network information.
    • ACTION: Verify the SNMP community. Run a SwitchProbe report and check the results. Call Tech Support.
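
The Possible Broadcast Storm rule described above can be illustrated with a minimal Python sketch. This is not the WBC Network Health Monitor implementation. The function name, the input format (a mapping of device name to receive bandwidth as a percentage), and the reading of "20% of the devices" as "at least 20%" are assumptions made for illustration. Only the two default values, 20% of devices and 30% receive bandwidth, come from the description above.

```python
def possible_broadcast_storm(rx_percent_by_device,
                             device_share_limit=20.0,
                             rx_bandwidth_limit=30.0):
    """Return True when enough devices exceed the receive bandwidth limit.

    rx_percent_by_device is a hypothetical input such as {"plc-1": 42.0},
    giving each device's receive bandwidth as a percentage.
    """
    if not rx_percent_by_device:
        return False  # no devices, nothing to evaluate

    # Devices whose receive bandwidth exceeds the limit (default 30%).
    over = [name for name, rx in rx_percent_by_device.items()
            if rx > rx_bandwidth_limit]
    share_over = 100.0 * len(over) / len(rx_percent_by_device)

    # Assumption: "20% of the devices" is read as "at least 20%".
    return share_over >= device_share_limit


# Hypothetical sample: one of five devices is above 30% receive bandwidth.
sample = {"plc-1": 45.0, "hmi-1": 5.0, "drive-1": 2.0, "io-1": 1.0, "io-2": 3.0}
print(possible_broadcast_storm(sample))
```

Both limits are keyword arguments because the text calls them defaults, which suggests they can be adjusted per installation.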

Alarms and Warnings related to IntraVUE settings or operation

  • Scanner Stopped, IntraVUE Scanner has stopped.
    • ACTION: Restart or reboot the IntraVUE host. Call WBC support if the scanner does not restart.
  • Scanner Speed Low, IntraVUE setting for scanner speed is too low, pings may be missed.
    • The scanner speed is continuously monitored. If it is too slow to get ping and bandwidth data from all the devices each minute, an alarm is issued. The time it takes to get all the SNMP data from the switches is also monitored, and a warning is issued if that appears to take too long.
    • ACTION: Increase the scanner speed setting.
  • Too many Unknowns, Too many devices have an Unknown critical state.

    • WBC Network Health Monitor focuses on the state of Critical Always On and Critical Intermittent devices. Devices which are not critical should be set as Ignored.
    • ACTION: If a large percentage of devices are in the unknown state, use the Critical State Recommendations tool to assign critical states.
  • Too many Unresolved, There are too many connected devices under the unresolved node.

    • Normally, newly discovered devices are initially placed under the unresolved node until their MAC address can be discovered from the top parent or a router. In a properly configured network, devices are not expected to stay under unresolved and will quickly move under the appropriate switch when the MAC is discovered. If many devices remain under unresolved, it is an indication that the wrong top parent has been selected or that the top parent SNMP configuration needs to change.
    • ACTIONS:
      • If the cause is the top parent's SNMP, resolve the SNMP issue so data is returned.
      • If the devices are remote and the top parent is not expected to get the MAC addresses, manually move the devices from unresolved to the top parent, or create a manually inserted node below the top parent as a holding node with an appropriate name and move the devices there.
      • If the device MAC addresses are expected to be resolved and the top parent is providing SNMP, the top parent selection must be wrong.
  • Host KPI settings, The host KPI settings include irrelevant event types.

    • IntraVUE's Analytics counts every event except ping over threshold as an IntraVUE event. This includes such things as trap messages, registration events, and device verification. If WBC Network Health Monitor recognizes that the default settings have not been changed, a warning is issued.
    • ACTION: Fix the situation using KPI Management.
  • No Critical Devices, There are no critical devices, essential for good health management.

    • A warning is issued if all devices are set to the unknown critical state or if only the Top Parent is Critical Always On (set automatically by WBC Network Health Monitor).
    • ACTION: Use Critical State Recommendations to set the devices to an appropriate critical state.
  • IntraVUE Restored

    • This is a warning. Action will already have been taken to adjust data to a possible earlier time period.
    • ACTION: If it appears that data collected by WBC Network Health Monitor is not correct, go to Configuration / Connections, select Remove Host, and choose to delete the database. Then add the host back in and the data will be good.
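
The No Critical Devices condition described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the product's logic. The function name, the input mapping, and the exact state labels are hypothetical. "Only the Top Parent is Critical Always On" is interpreted here as "the top parent is the only device in a critical state".

```python
# Hypothetical labels for the critical states named in this appendix.
CRITICAL_STATES = {"Critical Always On", "Critical Intermittent"}


def no_critical_devices_warning(state_by_device, top_parent):
    """Return True when the No Critical Devices warning would apply.

    state_by_device is a hypothetical mapping of device name to its
    critical state, e.g. {"switch-1": "Unknown"}.
    """
    # Case 1: every device is still in the unknown critical state.
    all_unknown = all(state == "Unknown"
                      for state in state_by_device.values())

    # Case 2 (assumed interpretation): the top parent is the only
    # device that has been given a critical state.
    critical = {name for name, state in state_by_device.items()
                if state in CRITICAL_STATES}
    only_top_parent = critical == {top_parent}

    return all_unknown or only_top_parent


devices = {
    "top-switch": "Critical Always On",  # set automatically per the text
    "plc-1": "Unknown",
    "hmi-1": "Ignored",
}
print(no_critical_devices_warning(devices, "top-switch"))
```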

Alarms and Warnings related to the operating environment

  • JVM Memory Low, The JVM free memory is low.
    • This is usually benign. The application has locked all the memory it needs and will start reusing memory as needed.
    • ACTION: Usually none required, but restarting the host will free memory.
  • Database Memory Low, The database free memory is low.
    • This is usually benign. The application has locked all the memory it needs and will start reusing memory as needed.
    • ACTION: Usually none required, but restarting the host, the IntraVUE mysql client, or the WBC Network Health Monitor mysql host will start over.

Alarms and Warnings related to WBC Network Health Monitor

  • Host License Expired, The WBC-INS license for the IntraVUE host has expired and data collection has stopped.

    • ACTION: Contact WBC-INS tech support to resolve.
  • Host Connection Failure, Unable to connect to host.

    • The host may have been replaced if the host shows as disconnected yet all other devices are connected. This will result in IntraVUE not being able to get the SNMP data needed to locate devices.
    • ACTION: If the IP of a top parent changes and cannot be set back to the original IP, the IntraVUE network will need to be deleted and a new network added using the new IP.
  • Host License Expiring Soon, The WBC-INS license for the host expires in less than 30 days.

    • ACTION: Check to see that a quote has been received or call tech support.
  • Collection Stopped, The WBC-NHM collection service stopped.

    • The WBC Network Health Monitor software may have been busy. This warning is issued after a 30-second check. Collection can actually stop for up to 6 hours and the software will catch up on all events and threshold data. (6 hours is the amount of 1-minute resolution threshold data maintained by IntraVUE; WBC Network Health Monitor maintains the last 48 hours of 1-minute threshold data.)
    • ACTION: Usually none required, but in rare cases restart the WBC Network Health Monitor host computer.
  • ivDashboard not responding, The ivDashboard war file is not deployed to IntraVUE or has stopped.

    • ACTION: Contact WBC tech support
  • Host DB Connections, The Host DB unsuccessful connection count increased.

    • ACTION: None required. This is a benign statistic; after an unsuccessful connection, another attempt is made.
  • Host DB Clients, The Host DB unsuccessful client count increased.

    • ACTION: None required. This is a benign statistic useful to tech support.
  • IvDashboard.war out of date, IvDashboard.war version is not current

    • Each time WBC Network Health Monitor starts, it checks the version of ivDashboard in the IntraVUE host. If it is not current, an attempt is made to automatically update it.
    • ACTION: If this alarm persists, restart the Windows service 'Apache Tomcat 8.5.54 wbc Tomcat8'. If that does not resolve it, contact tech support.
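
The catch-up window mentioned under Collection Stopped can be made concrete with a small Python sketch. It assumes that 1-minute threshold data older than the 6 hours kept by IntraVUE can no longer be collected. That is an inference from the retention figure above, not documented behavior. The function and constant names are hypothetical.

```python
from datetime import timedelta

# From the Collection Stopped description: IntraVUE keeps 6 hours of
# 1-minute resolution threshold data.
INTRAVUE_RETENTION = timedelta(hours=6)


def catch_up_summary(outage):
    """Split a collection outage into recoverable and unrecoverable parts.

    Assumption: threshold data older than INTRAVUE_RETENTION is no longer
    available from IntraVUE once collection resumes.
    """
    recoverable = min(outage, INTRAVUE_RETENTION)
    lost = outage - recoverable
    return recoverable, lost


# A 2-hour stop is fully within the window; an 8-hour stop is not.
for hours in (2, 8):
    recoverable, lost = catch_up_summary(timedelta(hours=hours))
    print(f"{hours} h outage: recoverable {recoverable}, lost {lost}")
```

Under this reading, restarting the WBC Network Health Monitor host within 6 hours of a collection stop leaves no gap in the 1-minute threshold data.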