Prerequisites
- Vantage downtime is required because scaling out reconfigures the database and migrates the EBS data volumes. Schedule a time that minimally impacts users.
- Make sure you have increased your limits. Read about COP entries and confirm that your system is properly configured.
- Create a new system image if a previous restore image was generated or if you have upgraded software since deploying an instance that you are scaling out.
Use this procedure for any of the following:
- If you did not initially scale out a system during deployment
- If you previously scaled out a system during deployment and want to scale out again after deployment
- [First time you scale out] Check if your system can be scaled out:
# tdc-scale-out -d
- Stop the database.
# tpareset -x -y stop for scaling out
- Verify the database is in a DOWN/HARDSTOP state.
# pdestate -a
PDE state: DOWN/HARDSTOP
Putting the database in this state may take several minutes.
- Run the scale-out command, where node_count is the number of nodes to expand to and must be greater than the current node count:
# tdc-scale-out [node_count]
The output displays how the configuration will change after scaling out the system. In the following example, the node count is changed from 5 to 6.
Current Configuration:
===========================================================================
Nodes:
    Node Count:   5
---------------------------------------------------------------------------
CPU(Core)/Mem(GB):
    CPUs/Node:    16          CPUs Total:   80
    Mem/Node:     63          Mem Total:    315
---------------------------------------------------------------------------
AMPs/PEs:
    *AMPs/Node:   21          AMPs Total:   96
    PEs/Node:     2           PEs Total:    10
* Only highest AMPs/Node is displayed.
===========================================================================
This system can be unfolded to ['6', '7', '8', '11', '16', '32'] nodes.
Please enter a valid unfolding option [6/7/8/11/16/32]:6
scale-out (unfold) the current system to [6] nodes:
===========================================================================
Nodes:
    Node Count:   5 => 6
---------------------------------------------------------------------------
CPU(Core)/Mem(GB):
    CPUs/Node:    16 == 16    CPUs Total:   80 => 96
    Mem/Node:     63 == 63    Mem Total:    315 => 378
---------------------------------------------------------------------------
AMPs/PEs:
    *AMPs/Node:   21 => 18    AMPs Total:   96 == 96
    PEs/Node:     2 == 2      PEs Total:    10 => 12
* Only highest AMPs/Node is displayed.
===========================================================================
Note:
1. Scaling out (unfolding) a system will INCREASE the node count by provisioning additional instances and other needed resources, including network interfaces, IP addresses. Therefore, the system will COST MORE for both infrastructure and software.
2. The additional IP addresses in the scale out operation will consume additional subnet space. If the subnet this system is operating in does not have enough IP addresses for the new instances being added to the system, this operation will fail.
3. Scaling out a system will NOT INCREASE data storage. The database capacity will NOT be changed after scaling out.
4. Scaling out will boost the overall performance of the system by adding more computation nodes (i.e., CPU and Memory) and increase the total storage bandwidth available to the system by decreasing the data volumes managed per node.
5. A system can always be scaled back (scale in) after scaling out.
Continue? [yes/no] yes
In the previous example, after expanding from five to six nodes, the AMPs are not evenly distributed across all nodes. The AMPs/PEs section therefore includes the note "* Only highest AMPs/Node is displayed."
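The CPU, memory, and PE totals in the example follow directly from the per-node values, which do not change when scaling out. The following is a minimal shell sketch of that arithmetic using the numbers from the example; the function name project_totals is illustrative and is not part of tdc-scale-out.

```shell
#!/bin/sh
# Illustrative helper (not part of tdc-scale-out): project the totals
# for a target node count from the per-node values shown in the output.
project_totals() {
    nodes=$1
    cpus_per_node=$2
    mem_per_node=$3
    pes_per_node=$4
    echo "CPUs Total: $((nodes * cpus_per_node))"
    echo "Mem Total: $((nodes * mem_per_node))"
    echo "PEs Total: $((nodes * pes_per_node))"
}

# Values from the example: 6 nodes, 16 CPUs, 63 GB, and 2 PEs per node.
project_totals 6 16 63 2
```

With the example values this prints 96 CPUs, 378 GB of memory, and 12 PEs, matching the right-hand side of the "=>" columns. The AMP total is left out because the example shows it unchanged (96 == 96).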
- Type yes.
When the process completes, the new configuration appears under Current Configuration.
- [Optional] Check the database status.
# pdestate -a
- Bring up the Teradata system configuration to confirm the number of nodes is correct.
# tdinfo
- If you are using Teradata DSC to run jobs, type the following command on all Vantage nodes to update the configuration of the media server.
# /etc/init.d/clienthandler restart-hwupgrade
- [Optional] Check the logs for troubleshooting.
If you encounter the following error while scaling out:
Error: Task Error:[Snapshot Pdisk Information] Failed to execute command /usr/pde/bin/psh -sum 0 nvme list -o json. Execution timeout
turn off the cloudwatch_log option:
# /usr/local/bin/tdc-scale-out <node-count> -a -t --cloudwatch_log=no
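Because reaching the DOWN/HARDSTOP state may take several minutes, the verification step can be repeated in a loop rather than by hand. The following is a minimal sketch that polls pdestate -a until the state appears. The function name wait_for_hardstop, the retry count, and the delay are assumptions chosen for illustration, not documented values.

```shell
#!/bin/sh
# Illustrative helper: poll `pdestate -a` until it reports DOWN/HARDSTOP.
# The default retry count and delay are assumptions, not documented values.
wait_for_hardstop() {
    attempts=${1:-30}   # how many times to check
    delay=${2:-10}      # seconds to wait between checks
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if pdestate -a 2>/dev/null | grep -q "DOWN/HARDSTOP"; then
            echo "Database is in DOWN/HARDSTOP state."
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "Database did not reach DOWN/HARDSTOP in time." >&2
    return 1
}
```

One way to use it is to call wait_for_hardstop after stopping the database with tpareset, and to proceed to tdc-scale-out only if the function returns 0.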