Example: Altering Table Row Partitioning with Partitioning Expression Based on Multiple CURRENT_DATE Functions
Assume that the table is partitioned with the latest 2 quarters in separate row partitions and all other data in another row partition. The following example reconciles CURRENT_DATE at the beginning of each quarter. Assume that the table is created on January 1, 2009; therefore, CURRENT_DATE is resolved to January 1, 2009 at creation time.
CREATE SET TABLE sales, NO FALLBACK ( storeID INTEGER, amount DECIMAL(10,2), sale_date DATE FORMAT 'YYYY/MM/DD' NOT NULL) PRIMARY INDEX (storeID) PARTITION BY CASE_N(sale_date>=CURRENT_DATE/*latest quarter data*/, sale_date<CURRENT_DATE AND sale_date>=CURRENT_DATE-INTERVAL '3' MONTH, NO CASE);
The table contains the following rows with a resolved CURRENT_DATE of January 1, 2009:
sales | |||
---|---|---|---|
StoreID | Amount | Sale_Date | PARTITION |
1 | 1000.00 | 2009-01-31 | 1 |
1 | 2000.00 | 2009-01-01 | 1 |
1 | 3500.00 | 2009-01-15 | 1 |
1 | 500.00 | 2008-09-15 | 2 |
1 | 2000.00 | 2008-12-15 | 2 |
1 | 5000.00 | 2009-04-01 | 3 |
Assume that you submit the following request on April 1, 2009:
ALTER TABLE sales TO CURRENT;
The resolved CURRENT_DATE is changed to April 1, 2009 and the rows in the table are reconciled as follows.
sales | |||
---|---|---|---|
StoreID | Amount | Sale_Date | PARTITION |
1 | 1000.00 | 2009-01-31 | 2 |
1 | 2000.00 | 2009-01-01 | 2 |
1 | 3500.00 | 2009-01-15 | 2 |
1 | 500.00 | 2008-09-15 | 3 |
1 | 2000.00 | 2008-12-15 | 3 |
1 | 5000.00 | 2009-04-01 | 3 |
Suppose you submit the following CREATE TABLE request on January 1, 2009.
CREATE SET TABLE customer, NO FALLBACK ( cust_name CHARACTER(10), cust_no INTEGER, policy_expiration_date DATE FORMAT 'YYYY/MM/DD' NOT NULL) PRIMARY INDEX (cust_no) PARTITION BY CASE_N(policy_expiration_date>=CURRENT_DATE, policy_expiration_date<CURRENT_DATE AND policy_expiration_date>=CURRENT_DATE- INTERVAL '3' MONTH);
The rows in the table are as follows, with the resolved CURRENT_DATE as January 1, 2009.
customer | |||
---|---|---|---|
cust_name | cust_no | policy_expiration_date | PARTITION |
Li | 1 | 2009-01-31 | 1 |
Khan | 2 | 2009-01-01 | 1 |
Reddy | 3 | 2009-01-15 | 1 |
Smith | 5 | 2008-12-15 | 2 |
You submit the following ALTER TABLE … TO CURRENT request on April 1, 2009.
ALTER TABLE customer TO CURRENT;
Because the following row cannot be moved to any row partition based on the revised CURRENT_DATE value, the system returns an error message to the requestor.
customer | |||
---|---|---|---|
cust_name | cust_no | policy_expiration_date | PARTITION |
Smith | 5 | 2008-12-15 | 2 |
To avoid this error, you instead submit the following ALTER TABLE … TO CURRENT request on April 1, 2009:
ALTER TABLE customer TO CURRENT WITH DELETE;
As a result of this ALTER TABLE … TO CURRENT request, Vantage deletes the following row from the table because it is no longer of interest.
customer | |||
---|---|---|---|
cust_name | cust_no | policy_expiration_date | PARTITION |
Smith | 5 | 2008-12-15 | 2 |
The rows in the table are then the following, with the resolved CURRENT_DATE as April 1, 2009:
customer | |||
---|---|---|---|
cust_name | cust_no | policy_expiration_date | PARTITION |
Li | 1 | 2009-01-31 | 2 |
Khan | 2 | 2009-01-01 | 2 |
Reddy | 3 | 2009-01-15 | 2 |
Example: Non-optimized Use of a CURRENT_DATE Function in a CREATE TABLE Request
This example specifies the CURRENT_DATE function in a CASE_N condition in a way that reconciliation using an ALTER TABLE … TO CURRENT request cannot be optimized to scan fewer than all of the partitions. As a result, each time you submit an ALTER TABLE … TO CURRENT request on customer, Vantage must perform a full-table scan to reconcile the row partitioning for the newly resolved date.
CREATE SET TABLE customer, NO FALLBACK ( cust_name CHARACTER(10), cust_no INTEGER, policy_expiration_date DATE FORMAT 'YYYY/MM/DD') PRIMARY INDEX(cust_no) PARTITION BY CASE_N(policy_expiration_date=CURRENT_DATE, NO CASE);
Example: Non-optimized Use of CURRENT_DATE Functions in CREATE TABLE Requests
This example also specifies the CURRENT_DATE function as an argument for CASE_N conditions in a way that reconciliation using an ALTER TABLE … TO CURRENT request cannot be optimized to scan fewer than all of the row partitions of customer.
The request specifies a CURRENT_DATE function for two different partitioning columns, so Vantage cannot determine when all rows in certain row partitions are not impacted because there is no known relationship between the policy_expiration_date and birth_date columns that can be used to determine the impacts of a newly resolved date. As a result, each time you submit an ALTER TABLE … TO CURRENT request on customer, Vantage must perform a full-table scan to reconcile the row partitioning for the new resolution date.
CREATE SET TABLE customer, NO FALLBACK ( cust_name CHARACTER(10), cust_no INTEGER, birth_date DATE, policy_expiration_date DATE FORMAT 'YYYY/MM/DD') PRIMARY INDEX (cust_no) PARTITION BY CASE_N(policy_expiration_date >= CURRENT_DATE, policy_expiration_date >= CURRENT_DATE, birth_date < CURRENT_DATE - INTERVAL '20' YEAR, NO CASE);
Example: Altering the Row Partitioning of a Join Index
Assume that for the table in “Example: Altering the Row Partitioning of a Table When the Partitioning Expression Is Based on Multiple CURRENT_DATE Functions," you define a row-partitioned sparse join index j_sales on January 1, 2009 to contain the data of the current quarter in one row partition and all other data in another row partition for sale_amt greater than 2000.00.
CREATE JOIN INDEX j_sales AS SELECT * FROM sales WHERE sale_amt>=2000.00 PRIMARY INDEX (store_ID) PARTITION BY CASE_N(sale_date >= CURRENT_DATE, NO CASE);
Join index j_sales contains the following rows assuming that the resolved CURRENT_DATE is January 1, 2009:
j_sales | |||
---|---|---|---|
store_ID | amount | sale_date | PARTITION |
1 | 2000.00 | 2009-01-01 | 1 |
1 | 3500.00 | 2009-01-15 | 1 |
1 | 2000.00 | 2008-12-15 | 2 |
1 | 5000.00 | 2008-04-01 | 1 |
On April 1, 2009, you submit the following ALTER TABLE TO CURRENT request.
ALTER TABLE j_sales TO CURRENT;
The rows in join index j_sales are reconciled as follows:
j_sales | |||
---|---|---|---|
store_ID | amount | sale_date | PARTITION |
1 | 2000.00 | 2009-01-01 | 2 |
1 | 3500.00 | 2009-01-15 | 2 |
1 | 2000.00 | 2008-12-15 | 2 |
1 | 5000.00 | 2008-04-01 | 2 |
Example: Altering the Row Partitioning
This example demonstrates how using CURRENT_TIMESTAMP or CURRENT_DATE functions in a partitioning expression is most appropriate when the data must be partitioned as one or more current row partitions and one or more history row partitions, where current and history are defined with respect to the resolved CURRENT_TIMESTAMP or CURRENT_DATE function in the partitioning expression.
The example shows how you can periodically can reconcile the table to move older data from the current row partition into one or more history partitions using an ALTER TABLE … TO CURRENT request rather than redefining the row partitioning for the table using explicit dates that must be determined each time you submit an ALTER TABLE … DROP RANGE or ALTER TABLE … ADD RANGE request. You should be take care to evaluate your intended use of CURRENT_DATE and CURRENT_TIMESTAMP functions in a partitioning expression before you define the expressions on a table or join index.
Assume that the following table definition is created in the year 2009 (the CURRENT year at the time).
CREATE TABLE orders ( o_orderkey INTEGER NOT NULL, o_custkey INTEGER, o_orderstatus CHARACTER(1) CASESPECIFIC, o_totalprice DECIMAL(13,2) NOT NULL, o_orderdate DATE FORMAT 'yyyy-mm-dd' NOT NULL, o_orderpriority CHARACTER(21), o_comment VARCHAR(79)) PRIMARY INDEX(o_orderkey) PARTITION BY RANGE_N(o_orderdate BETWEEN DATE '2004-01-01' AND DATE '2010-12-31' EACH INTERVAL '1' MONTH) UNIQUE INDEX(o_orderkey);
Assume that the table is row-partitioned to record 5 years of historical data, data for the current year, and data for one future year.
If you decide to alter the row partitioning for orders next year to maintain 5 years of history data, data for the current year, and data for one future year, you can submit an ALTER TABLE request such as this in 2011:
ALTER TABLE orders MODIFY PRIMARY INDEX(o_orderkey) DROP RANGE WHERE PARTITION BETWEEN 1 AND 12 ADD RANGE BETWEEN DATE '2011-01-01' AND DATE '2011-12-31' EACH INTERVAL '1' MONTH WITH DELETE;
For this case, you must compute the new dates and specify them explicitly in the ADD RANGE clause of the request. This requires manual intervention every year you submit the request.
Suppose that instead of creating orders as was done the first time, you create it using a CURRENT_DATE function in the partitioning expression to simplify altering its partitions. This request assumes that as of the CREATE TABLE date, the last 5 years of historical data, data for the current year, and data for one future year are to be stored in the table for a total of 7 years, which is identical to the earlier case example.
The CREATE TABLE DDL for this case looks like this.
CREATE TABLE orders ( o_orderkey INTEGER NOT NULL, o_custkey INTEGER, o_orderstatus CHARACTER(1) CASESPECIFIC, o_totalprice DECIMAL(13,2) NOT NULL, o_orderdate DATE FORMAT 'yyyy-mm-dd' NOT NULL, o_orderpriority CHARACTER(21), o_comment VARCHAR(79)) PRIMARY INDEX (o_orderkey) PARTITION BY RANGE_N(o_orderdate BETWEEN CAST(((EXTRACT(YEAR FROM CURRENT_DATE)-5-1900)*10000+0101) AS DATE) AND CAST(((EXTRACT(YEAR FROM CURRENT_DATE)+1-1900)*10000+1231) AS DATE) EACH INTERVAL '1' MONTH) UNIQUE INDEX (o_orderkey);
You can schedule the following ALTER TABLE … TO CURRENT request to be submitted annually. This request rolls the row partition window forward by efficiently dropping and adding partitions.
ALTER TABLE orders TO CURRENT WITH DELETE;
This statement does not need to be changed each time the data needs to be repartitioned based on the new dates, as would be the case if the row partitioning expression had not been specified using a CURRENT_DATE function.
In all of the preceding cases, the row partitioning begins on a year boundary.
The first case alters the partitioning in such a way that it continues to begin on a year boundary, but could be altered to roll forward to start on some month in a year by specifying the desired dates in the ALTER TABLE request.
The second case is designed only to roll forward to start on a year boundary.
You can use the following CREATE TABLE request to roll forward to begin partitioning on the first of a month. The case assumes that, as of the CREATE TABLE date, the last 71 months of history, the current month, and 12 future months are to be stored in the table, for a total of 84 months.
CREATE TABLE orders ( o_orderkey INTEGER NOT NULL, o_custkey INTEGER, o_orderstatus CHARACTER(1) CASESPECIFIC, o_totalprice DECIMAL(13,2) NOT NULL, o_orderdate DATE FORMAT 'yyyy-mm-dd' NOT NULL, o_orderpriority CHARACTER(21), o_comment VARCHAR(79)) PRIMARY INDEX (o_orderkey) PARTITION BY RANGE_N(o_orderdate BETWEEN CAST(((EXTRACT(YEAR FROM CURRENT_DATE)-1900)*10000 + EXTRACT(MONTH FROM CURRENT_DATE)*100+01) AS DATE) - INTERVAL '71' MONTH AND CAST(((EXTRACT(YEAR FROM CURRENT_DATE)+1-1900)*10000 + EXTRACT(MONTH FROM CURRENT_DATE)*100+01) AS DATE)+ INTERVAL '13' MONTH - INTERVAL '1' DAY EACH INTERVAL '1' MONTH) UNIQUE INDEX (o_orderkey);
You can schedule the following ALTER TABLE … TO CURRENT request to be submitted monthly or less frequently, but at some time before you run out of future months.
The request rolls the partition window forward by efficiently dropping and adding partitions so that, as of the ALTER TABLE TO CURRENT date, the last 71 months of history data, the data for the current month, and data for 12 months in the future can be contained in the table for a total of 84 months.
ALTER TABLE orders TO CURRENT WITH DELETE;
The following case uses simpler row partitioning, but possibly is not optimized because the entire table might need to be scanned in order to reconcile its rows when you submit an ALTER TABLE … TO CURRENT request to change its row partitioning. The case assumes that, as of the CREATE TABLE date, approximately 2,191 days of history data, the data for the current day, and approximately 365 days of future data can be contained in the table, for a total of about 7 years.
CREATE TABLE orders ( o_orderkey INTEGER NOT NULL, o_custkey INTEGER, o_orderstatus CHARACTER(1) CASESPECIFIC, o_totalprice DECIMAL(13,2) NOT NULL, o_orderdate DATE FORMAT 'yyyy-mm-dd' NOT NULL, o_orderpriority CHARACTER(21), o_comment VARCHAR(79)) PRIMARY INDEX (o_orderkey) PARTITION BY RANGE_N(o_orderdate BETWEEN CURRENT_DATE - INTERVAL '6' YEAR AND CURRENT_DATE + INTERVAL '1' YEAR EACH INTERVAL '1' MONTH) UNIQUE INDEX (o_orderkey);
You could schedule the following ALTER TABLE … TO CURRENT request to be submitted daily or less frequently, but at some time before you run out of future days. The request rolls the partition window forward by dropping and adding row partitions only if CURRENT_DATE is the same day of the month as the last CREATE TABLE or ALTER TABLE … TO CURRENT request was submitted. Otherwise, the entire table must be scanned to reconcile the rows.
ALTER TABLE orders TO CURRENT WITH DELETE;
This request can be very inefficient if you do not submit the ALTER TABLE … TO CURRENT request on the same day of the month as the last CREATE TABLE or ALTER TABLE … TO CURRENT request was submitted.
Performance degrades as a function of the increase in the number of days between the last resolved date and the new resolved date because of the increasing number rows that must be moved. For example, if the last resolved date was January 1, 2010 and the next ALTER TABLE … TO CURRENT request is submitted on February 2, 2010, then Vantage would have to move all of the orders table rows to new partitions.
Also note that using the partitioning specified for this case, the system returns an error message reporting a non-valid date error if the CREATE TABLE or ALTER TABLE … TO CURRENT request is submitted on February 29.