Not all data integrity problems result from keypunch errors or semantic integrity violations. Problems originating in disk drive and disk array firmware can also corrupt user data, typically at the block or sector level. Such block- and sector-level errors are the most common source of disk I/O corruption in user data.
A major problem with handling this type of data corruption is that it is generally not detected until some time after it has occurred. As a result, queries against the corrupted data return semantically correct, but factually incorrect, answer sets, and update or delete operations can either miss relevant rows or change rows in error. Once the system detects the corruption, the affected AMP is typically taken offline, utilities such as ScanDisk and CheckTable are run, and the data is either repaired or reloaded. Each of these actions either removes access to the data warehouse or reduces its availability to users until the corrections have been made.
Levels of Disk I/O Integrity Checking
To protect against physical data corruption, Vantage permits you to select among several levels of disk I/O integrity checking for your table, hash index, and join index data. Secondary index subtables inherit the level of disk I/O integrity checking that is defined for their parent table or join index. These checks detect corruption of disk blocks (checksum sampling can also detect some forms of bit and byte corruption) using one of the following integrity methods, which are ranked in order of their ability to detect errors:
- Full end-to-end checksums.
Detects lost writes and most bit, byte, and byte string errors.
- Statistically sampled partial end-to-end checksums.
Detects lost writes and intermediate levels of bit, byte, and byte string errors.
- No checksum integrity checking.
Detects some forms of lost writes using standard file system metadata verification.
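The difference between full and sampled checksums can be illustrated with a minimal sketch. It is not the Vantage implementation: the CRC-32 function, the 4-byte word size, the 1-in-8 sampling stride, the 1024-byte block, and the function names are all assumptions chosen for illustration. The sketch also assumes that the checksum is stored separately from the block it protects, so that a lost write leaves stale data paired with a checksum computed for the new data.

```python
import zlib


def full_checksum(block: bytes) -> int:
    """CRC-32 over every byte of the block (hypothetical 'full' level)."""
    return zlib.crc32(block)


def sampled_checksum(block: bytes, word_size: int = 4, stride: int = 8) -> int:
    """CRC-32 over one word out of every `stride` words (hypothetical sampling)."""
    step = word_size * stride
    sample = b"".join(block[i:i + word_size] for i in range(0, len(block), step))
    return zlib.crc32(sample)


# Previous on-disk content and the content the system intends to write.
old_block = bytes(range(256)) * 4                  # 1024-byte illustrative block
new_block = b"\xff\xff\xff\xff" + old_block[4:]    # first word changed

# Write path: checksums are computed for the new content and kept elsewhere.
stored_full = full_checksum(new_block)
stored_sampled = sampled_checksum(new_block)

# Case 1: a single byte is corrupted at offset 5, outside every sampled word.
corrupted = bytearray(new_block)
corrupted[5] ^= 0xFF
corrupted = bytes(corrupted)
full_detects_byte_error = full_checksum(corrupted) != stored_full
sampled_detects_byte_error = sampled_checksum(corrupted) != stored_sampled

# Case 2: the write is lost, so the old content is read back instead.
full_detects_lost_write = full_checksum(old_block) != stored_full
sampled_detects_lost_write = sampled_checksum(old_block) != stored_sampled
```

In this sketch the full checksum flags both problems, while the sampled checksum misses the single-byte error because offset 5 is never sampled. The sampled checksum catches the lost write here only because the changed bytes fall inside a sampled word.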
Disk I/O Integrity Checking Detects and Logs Errors But Does Not Fix Them
This feature detects and logs disk I/O errors; it does not fix them. When the system detects data corruption, it removes the affected AMP from service, and you must then take appropriate measures to repair the corrupted data.