Introduction to Backing up Data - Aster Analytics on AWS

Teradata Aster Analytics on AWS Getting Started Guide

Product: Aster Analytics on AWS
Release Number: 1.5
Published: October 2017
Language: English (United States)
Last Update: 2018-04-13
Product Category: Cloud

You can back up Teradata Aster Analytics on AWS data in the AWS public cloud by using EBS snapshots.

For an EBS snapshot to be an effective backup mechanism, make sure that the /data directory is the mount point of an EBS volume that contains all the data, and that this volume is separate from any other volume. Whenever a backup is needed, the system administrator can trigger an EBS snapshot from the AWS Console.
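One way to check the mount-point precondition on a node is to inspect the kernel mount table. The following is a minimal sketch, not part of the Aster tooling: the helper names `parse_mounts` and `is_dedicated_mount` and the device names in the usage note are assumptions made for illustration. It takes text in the /proc/mounts format and reports whether a path is a mount point whose device is not mounted anywhere else. It does not verify that the device is actually an EBS volume.

```python
def parse_mounts(mounts_text):
    """Return (device, mount_point) pairs from /proc/mounts-style text."""
    entries = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            entries.append((fields[0], fields[1]))
    return entries


def is_dedicated_mount(mounts_text, path="/data"):
    """True if `path` is a mount point and its device backs no other mount point."""
    entries = parse_mounts(mounts_text)
    devices = [dev for dev, mnt in entries if mnt == path]
    if not devices:
        return False  # the path is not a mount point at all
    device = devices[-1]  # the most recent mount on this path is the visible one
    others = [mnt for dev, mnt in entries if dev == device and mnt != path]
    return not others
```

On a live node, the input would typically be the contents of /proc/mounts, for example `is_dedicated_mount(open("/proc/mounts").read(), "/data")`.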

Creating a new snapshot is described in the Creating an Amazon EBS Snapshot section in the Amazon EC2 User Guide for Linux Instances.

Deleting snapshots that are no longer required is described in the Deleting an Amazon EBS Snapshot section in the Amazon EC2 User Guide for Linux Instances.

Starting a snapshot of the volume begins the backup process, which creates a copy of the data written to the EBS volume up to the point when the snapshot command was issued.

Snapshot creation is an asynchronous process, which means that you can continue using the node or volume while the backup is being taken. Any changes made between the time the snapshot command is issued and the time the asynchronous operation finishes are not backed up. As a result, there is no guesswork about what was backed up: the snapshot contains the data that was on the volume when the command was issued.
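Because the operation is asynchronous, a script that needs to know when the backup is finished usually polls the snapshot state. The following is a minimal sketch, assuming the JSON printed by the AWS CLI command `aws ec2 describe-snapshots --snapshot-ids <id>` has been captured as a string. The function names `snapshot_status` and `is_completed` are hypothetical helpers, not part of any AWS or Aster tool.

```python
import json


def snapshot_status(describe_output):
    """Return (state, progress) for the first snapshot in describe-snapshots JSON."""
    snapshot = json.loads(describe_output)["Snapshots"][0]
    # "Progress" is reported as a percentage string and may be absent early on.
    return snapshot["State"], snapshot.get("Progress", "")


def is_completed(describe_output):
    """True once the snapshot reports the 'completed' state."""
    state, _ = snapshot_status(describe_output)
    return state == "completed"
```

A snapshot that is still being copied reports the state `pending`. The volume can stay in use during that time, as described above.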

Constraints

An EBS snapshot is a copy of only the data that is already written to the EBS volume. Any data that a process has written but that has not yet reached the disk (because it is still waiting in a kernel buffer) is not backed up.

To work around this issue, Amazon recommends that you always unmount the volume before issuing the snapshot command. The unmount is done only to force the kernel to flush its file buffers to the volume. Because EBS snapshot creation is an asynchronous process, the volume does not need to stay offline for the whole operation: once the command is issued, you can remount the disk immediately without waiting for the snapshot to finish.

There are other constraints on the number of snapshot operations that can run in parallel on the same volume. These are discussed in the Creating an Amazon EBS Snapshot section in the Amazon EC2 User Guide for Linux Instances.

A summary of the EBS snapshot process is:
  • Unmount the volume from within the node by using the umount command.
  • Issue the EBS Snapshot from the AWS Console or use the AWS CLI create-snapshot command.
  • Mount the volume back on the node for the node operation to resume.
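The three steps above can be sketched as a small script. This is an illustration under stated assumptions, not the documented Aster procedure: the function names, the default description text, and the volume ID and device name in the usage note are hypothetical. It also assumes that the AWS CLI is installed and configured on the node and that the script runs with enough privileges to mount and unmount volumes.

```python
import subprocess


def snapshot_commands(volume_id, device, mount_point="/data",
                      description="Aster data backup"):
    """Build the three commands as argument lists, in the order they must run."""
    return [
        ["umount", mount_point],
        ["aws", "ec2", "create-snapshot",
         "--volume-id", volume_id, "--description", description],
        ["mount", device, mount_point],
    ]


def run_snapshot(volume_id, device, mount_point="/data"):
    """Unmount, start the snapshot, and remount even if the snapshot call fails."""
    unmount, snapshot, remount = snapshot_commands(volume_id, device, mount_point)
    subprocess.run(unmount, check=True)
    try:
        # create-snapshot returns as soon as the snapshot has been started.
        subprocess.run(snapshot, check=True)
    finally:
        subprocess.run(remount, check=True)
```

For example, `run_snapshot("vol-0123456789abcdef0", "/dev/xvdf")` would unmount /data, start a snapshot of the given volume, and mount the device back on /data. The remount sits in a `finally` clause so that a failed snapshot request does not leave the volume unmounted.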

Because the volume must be unmounted momentarily, leaving the node running during this step may result in unexpected behavior. To avoid this issue, bring the node offline momentarily so that the above process can complete without problems.

Considerations

An Aster cluster consists of Queen and Worker nodes, and users expect backup and restore to be a cluster-wide operation, in the sense that a backup consists of data from all the nodes at the same point in time.

AWS does not currently provide a mechanism to create such a snapshot.

Teradata Aster provides a Cluster-Wide Backup and Restore Utility to support this requirement.