Potential Data Risks When Archiving/Restoring Selected Partitions - TARA/ABU

Teradata Archive/Recovery Utility Reference

Product
TARA/ABU
Release Number
16.10
Published
May 2017
Language
English (United States)
Last Update
2018-05-01
dita:mapPath
utr1488824663491.ditamap
dita:ditavalPath
Audience_PDF_include.ditaval
dita:id
B035-2412
lifecycle
previous
Product Category
Teradata Tools and Utilities

Be careful when archiving partitioned tables: a number of undesirable conditions can occur. For additional issues that might occur during restore operations, see Considerations Before Restoring Data.

The following cases generally do not display an error or give any indication that a problem has occurred. In most instances, the only indication is that data is incorrect or is missing from a table.
  • Use Correct Specifications – The incorrect use of specifications causes the following problems:
    • An incorrect PARTITIONS WHERE specification during backup can result in an incomplete archive or difficulties during a restore operation.
    • An incorrect PARTITIONS WHERE or ALL PARTITIONS specification during restore can result in data lost from a table or the restoration of stale data to a table if the archive being restored contains partial, incomplete, or stale versions of an already existing partition.
  • Restrict Updates to Active Partitions – It is not possible to determine which partitions have been modified since the last backup. If changed partitions are not re-archived, the changes are lost when restored.

    For example, if the following scenarios exist for a table:

    • The backup strategy is to back up only the active (latest) partition of the table.
    • A change is made to a non-active partition (to fix an incorrect update).

      The only way to archive the change is to archive the changed partitions separately.

      The remedy for this situation is to restrict updates to the active partitions only (by using views to control which rows/partitions are updated) or to re-archive all modified partitions.

  • Do Not Change Values of Functions or Variables – If a built-in SQL function or variable is used in the PARTITIONS WHERE condition, and the value of the function or variable changes during the job, a different set of partitions might be archived (or restored) for some objects in that single archive.

    For example, if an archive job uses the CURRENT_DATE built-in function to determine which is the active partition, and the backup runs past midnight, the date change causes a different partition to be selected. This means that objects archived after midnight will archive the new (and probably empty) partition.

    The remedy for this situation is to do one of the following:

    • Avoid using a changing function or variable in the PARTITIONS WHERE condition.
    • Run the backup at a time when the value will not change.
    • Modify the PARTITIONS WHERE condition to take the value change into account when selecting partitions. For example, define a range, such as ‘BETWEEN CURRENT_DATE – n AND_CURRENT_DATE’ to archive the active partition even if the date changes.
  • Always Specify PARTITIONS WHERE or ALL PARTITIONS – If PARTITIONS WHERE or ALL PARTITIONS are not specified for a RESTORE or COPY operation, the default action is to overwrite the entire table with the archived table definition and data. Essentially, this is the same as a full-table restore.

    For example, if PARTITIONS WHERE is omitted when restoring a single-partition backup, data is dropped from the table and the single partition stored on the archive is restored.

    To solve this problem, always specify PARTITIONS WHERE or ALL PARTITIONS when restoring partitions into an existing table. Otherwise, the existing table will be overwritten.

  • Know What Partitions are Being Deleted – In a RESTORE or COPY operation, all partitions that match the PARTITIONS WHERE condition are deleted, even if they are not stored on the archive.

    For example, if an archive is restored that contains the data for April 2007, and has a PARTITIONS WHERE condition that matches both March and April 2007, then the data for March and April 2007 are deleted, and only April 2007 is restored.

    Therefore, be careful when using PARTITONS WHERE. If there is any doubt about which partitions are affected, COPY the selected partition backup to a staging table, and manually copy the desired partition(s) into the target table using INSERT ... SELECT and/or DELETE.

  • Avoid Restoring From a Previous Partitioning Scheme – When changing the partitioning expression for a table, changing the boundaries of existing partitions is feasible. If these partitions are restored, Teradata might drop more data than expected or restore less data than expected, if the archive does not include data for all of the selected partitions.

    For example, if an archive is done on a table partitioned by month with the archive data corresponding to March 2004, and the table is re-partitioned by week, then a PPI restore of the March backup (using ALL PARTITIONS) overwrites the data for all weeks that contain at least one day in March. As a result, the last few days of February and the first few days of April might be deleted and not restored.

    Therefore, avoid restoring partition backups from a previous partitioning scheme to an updated table. Or, use LOG WHERE for the weeks that contain days in both March and February/April, and manually copy the rows into the table.

  • Track the Partitions in Each Archive – Manual steps are required to determine which partitions are archived by a given backup job, or to determine which backup job has the latest version of a given partition. ANALYZE displays the Teradata-generated bounding condition that defines the archived partitions. (This differs from a user-entered condition that might only qualify partial partitions.) In this case, inconsistent or old data might be restored to the table if the wrong archive is restored for a partition, or if partition-level archives are restored out-of-order and the archives contain an overlapping set of partitions.

    For example, updated data is lost in the following situation. Assume that a final backup for a March 2007 partition is performed on April 1, 2007. On April 5, a mistake is found in a row dated March 16, so the row is updated, and a new backup of the March partition is done. If, for instance, the table is accidentally deleted a month later, and an attempt is made to restore the April 1 backup instead of the April 5 backup, the updated data is lost.

    To determine the partitions in each archive, keep track of the partition contents of each archived table, retain the output listing associated with a tape, or run ANALYZE jobs on archives.