The SQL Parser
The SQL Parser is a component of the Parsing Engine (PE).
SQL requests are sent to a PE’s Parser in CLIv2 Request parcels. Request parcels consist of the following elements:
- Zero or more SQL statements
- Control information
- Optional USING request modifier data
The Parsing Engine has the following components related to query processing:
- Request cache
- Discretionary access control security checking
- Object use count cache
- Statistics cache
- Parameterized Value Peek
- Query Rewrite
- Push projections into spooled views
- Convert outer joins to inner joins
- Fold views
- Solve satisfiability and transitive closure
- Push conditions into spooled views
- Eliminate joins
- Access planning
- Join planning
- Join index planning
- Complex outer join planning
- Subquery planning
- Aggregation planning
- Insert, delete, and update planning
- Index Wizard
- Optimizer query rewrites
- Plastic steps generation
- Steps packaging
- Concrete steps generation
- Execution control
- Response control
- Transaction and request abort management
- Queue table cache management
The Parser also maintains a dictionary cache for the Resolver and Security, and manages sessions between client applications and the Teradata platform.
The parsing engine generates AMP steps from Request parcels, as shown in the following illustration:
The diagram does not include the Teradata Index Wizard because it is not used to process queries in a production environment. For more information about the Teradata Index Wizard, see Teradata Index Wizard User Guide.
Parsing Engine Component Processes
Parsing engine components perform the following functions for each SQL request sent to the Teradata platform from a client application:
- The Syntaxer analyzes the high-level syntax of the statement for errors.
If the syntax passes the check, then the SQL request components are converted into a data structure called a parse tree. This structure is an exact mapping of the original query text (see Parse Tree Representations of an SQL Request).
This skeletal parse tree is called a SynTree, which the Syntaxer then passes on to the Resolver. The SynTree is also referred to as the Black Tree for the query.
- The Resolver takes the SynTree and fleshes it out with information about any required data conversions and discretionary access control security checks, adds column names and notes any underlying relationships with other database objects, and then passes the more fleshed out tree, now known as a ResTree, to Parameterized Value Peek.
The ResTree is also referred to as the Red Tree for the query.
- The Parameterized Value Peek subsystem examines the ResTree for any parameterized request or DATE, CURRENT_DATE, TEMPORAL_DATE, CURRENT_TIMESTAMP, or USER constant data that can be used to generate more specific query plans than are possible without having such resolved data. Parameterized Value Peek then passes the ResTree to Query Rewrite.
- The Query Rewrite subsystem takes the ResTree from Parameterized Value Peek and rewrites the SQL text to prepare it for optimization.
This subsystem includes query rewrites, such as:
- Converting outer joins to inner joins
- Type 1 and Type 2 View folding
- Pushing projections into views
- Pushing conditions into views
- Satisfiability and transitive closure
- Join elimination
The Query Rewrite subsystem then passes the revised, semantically equivalent, ResTree ’ to the Optimizer. See Query Rewrite and Optimization for further information.
The Optimizer analyzes the ResTree´ using various statistical and configuration data about the database and the system hardware components to determine the optimum plans to access and join the tables specified by the request. This optimization may include further query rewrites.
The Optimizer first checks the statistics cache to see if the statistics it needs have already been retrieved from the Data Dictionary. If not, the Optimizer retrieves them from the Data Dictionary.
The Optimizer then examines any locks placed by the SQL request and attempts to optimize their placement to enhance performance and avoid deadlocks.
The Optimized Parse Tree, now transformed from a simple statement tree to a complete operation tree, is then passed to the Steps Generator for further processing.
This optimized version of the parse tree is referred to as the White Tree, or Operation Tree, for the request. As the ResTree is transformed into the Operation Tree, it is sometimes referred to as a Pink Tree because at that intermediate point it is a mix of red and white, hence pink.
When you perform an EXPLAIN of a request, the report the system produces is a verbal description of the White Tree the Optimizer produces for the request plus some additional information about uncosted steps that Teradata Database inserts into the White Tree for use by the Teradata dynamic workload management software.
See Query Rewrite and Optimization, Teradata Workload Analyzer User Guide, and the documentation for the EXPLAIN modifier in SQL Data Manipulation Language and Teradata Visual Explain User Guide for further information.
- The Steps Generator creates Plastic Steps from the White Tree. Plastic Steps are, except for statement literals, a data-free skeletal tree of AMP directives derived from the Optimized Parse Tree.
The completed Plastic Steps tree is then passed to the Request Cache and to Steps Packaging for further processing.
- Steps Packaging adds context to the Plastic Steps by integrating various user- and session-specific information.
If any Data parcels were passed to the Parser via a parameterized request, then that data is also added to the steps tree (in this chapter, the term Data parcel always refers to a Data parcel set. A non-iterated request is associated with only one Data parcel, while an iterated request is associated with multiple Data parcels. A request can also have no Data parcels associated with it.).
The final product of this process is referred to as Concrete Steps.
- Steps Packaging passes the Concrete Steps to the Dispatcher for assignment to the AMPs.
- The Dispatcher sequentially, incrementally, and atomically transmits the Concrete Steps, called AMP Steps or Spoil Steps at this point in the process, across the BYNET to the appropriate AMPs for processing.
- The Dispatcher manages any abort processing that might be required.
- The Dispatcher receives the results of the AMP Steps from the BYNET and returns them to the requesting application.