12.04 - About Server Management Events and Alerts - Server Management

Teradata Server Management Web Services User Guide

prodname: Server Management
vrm_release: 12.04
created_date: January 2017
category: User Guide
featnum: B035-5350-017K

The CMIC collects operational events, problem events, and changes in system and component status for all components in the collectives it manages, and stores them in a database.

An event indicates a change in condition that may require intervention. The change may be good, bad, or insignificant to the ongoing operation of the system. For example, events include changes due to operations performed on the system, such as power off, and problem events reported to node event logs by node software.

Events that are serious enough to affect the operation of the system generate alerts. An alert is a report of a condition that requires attention to protect system availability. The system consolidates alerts and stores them in the alert database. The severity of an alert reflects how urgent it is to correct the problem condition. The system and Teradata software include alert detection agents that report problem conditions. The alert detection agents are configured to detect problem conditions for the following:
  • Teradata Database
  • BYNET
  • SCSI adapters
  • Disk arrays
  • Node operating systems
  • Platform hardware
  • Tape libraries
The conditions reflect known issues, such as configuration issues or potential hardware or software failures. The alert detection agents may also report improvements to a condition (for example, a condition that is AutoSolved).
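
The relationship between events and alerts described above can be illustrated with a small data-model sketch. It is not the Server Management implementation or its web services API: the `Severity` scale, the `Event` and `Alert` classes, the `ALERT_THRESHOLD` value, and the sample events are all hypothetical names and values chosen for illustration. The sketch only shows the idea that every change is recorded as an event, that only sufficiently serious events become alerts, and that severity expresses urgency.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List


class Severity(IntEnum):
    # Hypothetical scale; the product's actual severity levels may differ.
    INFO = 0
    WARNING = 1
    DEGRADED = 2
    CRITICAL = 3


@dataclass
class Event:
    component: str      # e.g. a node, disk array, or BYNET component
    description: str
    severity: Severity


@dataclass
class Alert:
    component: str
    description: str
    severity: Severity


# Assumption: events at or above this level are treated as serious enough
# to affect system operation and therefore generate an alert.
ALERT_THRESHOLD = Severity.WARNING


def alerts_by_urgency(events: List[Event]) -> List[Alert]:
    """Turn serious events into alerts, most urgent first."""
    alerts = [
        Alert(e.component, e.description, e.severity)
        for e in events
        if e.severity >= ALERT_THRESHOLD
    ]
    # Higher severity means the condition is more urgent to correct.
    return sorted(alerts, key=lambda a: a.severity, reverse=True)


if __name__ == "__main__":
    sample = [
        Event("node1", "power off requested by operator", Severity.INFO),
        Event("array2", "drive failure predicted", Severity.DEGRADED),
        Event("node3", "fan speed low", Severity.WARNING),
    ]
    for alert in alerts_by_urgency(sample):
        print(alert.severity.name, alert.component, alert.description)
```

In this sketch the operator-initiated power off remains an event only, while the two problem conditions become alerts ordered by severity.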

Summary alerts are groups of alerts known to occur together during a problem condition. The summary identifies the root cause, the components reporting the issue, and the recommended action. Alerts with a status of degraded or worse that do not align with a known root cause are grouped as Uncategorized Alerts.
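
The grouping rule for summary alerts can be sketched as follows. This is a minimal illustration under stated assumptions, not the product's logic: the `STATUS_ORDER` scale, the `Alert` fields, the `summarize` function, and the example root cause string are hypothetical. The source does not say what happens to alerts that are better than degraded and have no known root cause, so the sketch simply leaves them out of the summary groups.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical status scale, ordered from best to worst.
STATUS_ORDER = ["ok", "informational", "warning", "degraded", "critical", "fatal"]

UNCATEGORIZED = "Uncategorized Alerts"


@dataclass
class Alert:
    component: str
    status: str
    root_cause: Optional[str] = None  # None when no known root cause matches


def is_degraded_or_worse(status: str) -> bool:
    """Return True if status is 'degraded' or anything worse on the scale."""
    return STATUS_ORDER.index(status) >= STATUS_ORDER.index("degraded")


def summarize(alerts: List[Alert]) -> Dict[str, List[Alert]]:
    """Group alerts by known root cause; collect the rest as uncategorized."""
    groups: Dict[str, List[Alert]] = defaultdict(list)
    for alert in alerts:
        if alert.root_cause is not None:
            groups[alert.root_cause].append(alert)
        elif is_degraded_or_worse(alert.status):
            groups[UNCATEGORIZED].append(alert)
        # Assumption: milder alerts with no known root cause are not grouped.
    return dict(groups)


if __name__ == "__main__":
    sample = [
        Alert("node1", "critical", "example root cause A"),
        Alert("node2", "degraded", "example root cause A"),
        Alert("array3", "degraded"),
        Alert("node4", "warning"),
    ]
    for name, members in summarize(sample).items():
        print(name, [a.component for a in members])
```

Each key in the returned dictionary stands in for one summary alert, and its list holds the components reporting the issue.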

Data bundles are packages of data used to diagnose problems. Software subsystems generate data bundles when they detect fault conditions. Data bundles reported during known fault conditions are automatically escalated to the TVI backend as soon as they are available, and they then appear as attachments to the associated summary alert.