Certain service actions can disrupt the operation of the system or severely degrade its performance.
Failure to follow these guidelines can result in data loss and extended downtime that Teradata may not support.
Coordinate any service action that disrupts infrastructure resources with Teradata to minimize impact. The following table describes the prerequisite for each service action, its observed impact, and the recovery action:
Service Action / Event | Prerequisite action | Impact | Recovery action |
---|---|---|---|
VM pause/stop/restart (1 VantageCore VMware VM affected) | Engage HSN to move database off affected VM | Database is restarted once to transition from affected VM | No additional recovery needed. System can remain running on HSN |
VM pause/stop/restart (>1 VantageCore VMware VM affected) | Database offline | Database needs to stay offline until all Analytics Database VMs are available | Start database after all Analytics Database VMs are available |
SQLE VM movement | vMotion not yet supported for VantageCore VMware VMs in this implementation | NA | NA |
Host Reboot (1 VantageCore VMware VM affected) | Engage HSN to move database off affected Host | Database is restarted once to transition to HSN | No additional recovery needed. System can remain running on HSN |
Host Reboot (>1 VantageCore VMware VM affected) | Database offline | Database needs to stay offline until all Analytics Database VMs are available | Start database after all Analytics Database VMs are available |
Planned Storage LUN unavailability (all paths to any LUN unavailable) | Database offline | Database needs to stay offline until all LUNs are available | Start database after all LUNs are available |
Unplanned Storage LUN Failure/Corruption (LUN RAID failure) | Database Business Continuity up to date (Backup, Fallback) | Depending on the number and location of the missing LUNs, the database may remain available on Fallback | Database recovery through Fallback or BC Recovery |
Loss of Storage Array Redundancy (one controller offline/controller reboot) | No action | Expect reduced performance while the controller is offline | Database performance returns to normal once storage performance is restored |
Loss of Storage Drive Redundancy (drive removal/failure) | No action | Expect reduced performance while the RAID is degraded | Database performance returns to normal once storage performance is restored |
Loss of Network Redundancy (one switch reboot, power-off, or replacement) | No action | Performance and resiliency are impacted while the network is degraded | Database performance returns to normal once network performance is restored |
Loss of Network Node-Switch Connectivity (1 cable failure/replacement) | No action | Possible performance impact on the affected node | No additional action required |
Loss of Network Storage-Switch Connectivity (cable failure/replacement) | No action | Performance and resiliency are impacted while the network is degraded | No additional action required |
Loss of Network Switch-Switch Connectivity (cable failure/replacement) | No action | Performance and resiliency are impacted while the network is degraded | No additional action required |
Management Cluster node outage (1 node) | For planned outages, moving VMs off the affected node is recommended | No database impact; a temporary loss of BYNET performance may occur | No additional action required |
Management Cluster node outage/failure (>1 node) | If sufficient resources remain after the outage to maintain management services, the event can be handled like a 1-node failure; otherwise the database must be taken offline | Database needs to stay offline until all Management Cluster applications have restarted | Restart Management Cluster services, then restart the database |
Management Cluster vSAN offline/failure | Database offline | Database needs to stay offline until all Management Cluster applications have restarted | If no data is lost, restart the Management Cluster, then restart the database. If data is lost, redeploy the applications as required |
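Where these service actions feed into runbook automation, the table above can be encoded as a simple lookup so scripts resolve the prerequisite, impact, and recovery step consistently. The sketch below is illustrative only: the event keys and the `plan_service_action` helper are hypothetical names, not part of any Teradata or VMware tooling.

```python
# Minimal sketch: encode a few rows of the service-action table as a lookup.
# Event names and this helper are hypothetical; adapt them to your runbook.
SERVICE_ACTIONS = {
    "vm_restart_single": {
        "prerequisite": "Engage HSN to move database off affected VM",
        "impact": "Database is restarted once to transition from affected VM",
        "recovery": "No additional recovery needed; system can remain running on HSN",
    },
    "vm_restart_multiple": {
        "prerequisite": "Database offline",
        "impact": "Database stays offline until all Analytics Database VMs are available",
        "recovery": "Start database after all Analytics Database VMs are available",
    },
    "storage_lun_planned_outage": {
        "prerequisite": "Database offline",
        "impact": "Database stays offline until all LUNs are available",
        "recovery": "Start database after all LUNs are available",
    },
}

def plan_service_action(event: str) -> dict:
    """Return the prerequisite/impact/recovery entry for a known event."""
    try:
        return SERVICE_ACTIONS[event]
    except KeyError:
        raise ValueError(f"No runbook entry for event: {event}") from None

# Example lookup:
plan = plan_service_action("vm_restart_multiple")
print(plan["prerequisite"])  # -> Database offline
```

Keeping the table and the lookup in one place makes it harder for an operator script to drift from the documented procedure; unknown events fail loudly rather than defaulting to "no action".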