When you copy PARTITION statistics, the statistics copied to the target table might not correctly represent the data in the target table because of differences in internal partition number mapping between the source and target tables. This is true even if the table definitions returned by a SHOW TABLE statement are identical and the data is the same in both tables.
If you use a CREATE TABLE … AS … WITH DATA AND STATISTICS statement to create a target table, the PARTITION statistics you copy from the source table are not valid if the internal partition numbers in the target table are different than the source table.
It is critical to understand that there is no way to guarantee that a target table created using a CREATE TABLE … AS … WITH DATA AND STATISTICS statement is identical to the source table from which its statistics are copied down to the level of internal partition numbers, and it is important to understand that even though the 2 tables might appear to be identical from comparing their definitions using the output of SHOW TABLE statements on the tables.
All data in the columns whose statistics are copied using a COLLECT STATISTICS … FROM source_table must be identical in both the source and target tables.
When there is a mismatch between the source and target tables, the request does not abort or return an error. Instead, it returns 1 of 2 possible warnings, depending on the nature of the mismatch.
Warning message | Description |
---|---|
Statistics are not available in the source table |
This warning can be returned when statistics have been collected on the source table, but cannot be copied to the target table for some reason. |
Only some of the statistics were copied to the target table | Not all of the statistics from the source table could be copied to the target table. The database does not inform you which statistics were or were not copied. |
You do not obtain the same results from a COLLECT STATISTICS … FROM source_table statement and a CREATE TABLE … AS … WITH DATA AND STATISTICS statement if the internal partition numbers are different. While the database copies the same statistics using both methods, in the COLLECT STATISTICS … FROM source_table case, there are cases where the statistics are not valid for the data even though data is the same in both tables.
As a general rule, you should always recollect the PARTITION statistics for the target table when you copy them from a source table.