The SQL Parser
The SQL Parser is a component of the Parsing Engine (PE).
- Zero or more SQL statements
- Control information
- Optional USING request modifier data (data parcels)
The Parsing Engine has two main components related to query processing: the Parser and the Dispatcher.
- Dictionary cache
- Statistics cache
- Object use count cache
- Request cache
- Syntaxer
- Dynamic parameterization of literals (DPL)
- Resolver
- Discretionary access control security checking
- Parameterized value peek
- Query rewrite
- Optimizer
- Generator
- Plastic steps generation
- Steps packaging
- OptApply
- Concrete steps generation
- Execution control
- Response control
- Transaction and request abort management
- Queue table cache management
The parsing engine generates AMP steps from request parcels, as shown in the following simplified illustration.
Parsing Engine Component Processes
The following is an overview of the major parsing engine components.
- The Syntaxer analyzes the high-level syntax of the statement for errors.
If the syntax passes the check, then the SQL request components are converted into a data structure called a parse tree. This structure is an exact mapping of the original query text (see Parse Tree Representations of an SQL Request).
This skeletal parse tree is called a SynTree, which the Syntaxer then passes on to the Resolver. The SynTree is also referred to as the Black Tree for the query.
- The Resolver takes the SynTree and fleshes it out with information about any required data conversions and discretionary access control security checks, adds column names and notes any underlying relationships with other database objects, and then passes the more fleshed out tree, now known as a ResTree, to Parameterized Value Peek.
The ResTree is also referred to as the Red Tree for the query.
- The Query Rewrite subsystem takes the ResTree and performs various rewrites such as:
- Converting outer joins to inner joins
- Type 1 and Type 2 View folding
- Pushing projections into views
- Pushing conditions into views
- Satisfiability and transitive closure
- Join elimination
The Query Rewrite subsystem then passes the revised, semantically equivalent, ResTree’ to the Optimizer. See Query Rewrite, Statistics, and Optimization for further information.
- The Optimizer analyzes the ResTree´ using various statistical and configuration data about the database and the system hardware components to determine the optimum plans for the request. This subsystem includes:
- Access planning
- Join planning
- Join index planning
- Complex outer join planning
- Subquery planning
- Aggregation planning
- Insert, delete, and update planning
- Optimizer query rewrites
The Optimizer first checks the statistics cache to see if the statistics it needs have already been retrieved from the Data Dictionary. If not, the Optimizer retrieves them from the Data Dictionary.
The Optimizer then examines any locks placed by the SQL request and attempts to optimize their placement to enhance performance and avoid deadlocks.
The Optimized Parse Tree, now transformed from a simple statement tree to a complete operation tree, is then passed to the Steps Generator for further processing.
This optimized version of the parse tree is referred to as the White Tree, or Operation Tree, for the request. As the ResTree is transformed into the Operation Tree, it is sometimes referred to as a Pink Tree because at that intermediate point it is a mix of red and white.
When you perform an EXPLAIN of a request, the report the system produces is a verbal description of the White Tree the Optimizer produces for the request plus some additional information about uncosted steps that Teradata Vantage™ inserts into the White Tree for use by the Teradata dynamic workload management software.
See Query Rewrite, Statistics, and Optimization and the documentation for the EXPLAIN modifier in Teradata Vantage™ - SQL Data Manipulation Language, B035-1146 for further information.
- The Steps Generator creates Plastic Steps from the White Tree. Plastic Steps are, except for statement literals, a data-free skeletal tree of AMP directives derived from the Optimized Parse Tree.
The completed Plastic Steps tree is then passed to the Request Cache and to Steps Packaging for further processing.
- Steps Packaging adds context to the Plastic Steps by integrating various user- and session-specific information.
If any data parcels were passed to the Parser via a parameterized request, then that data is also added to the steps tree (in this discussion, the term data parcel always refers to a data parcel set. A non-iterated request is associated with only one data parcel, while an iterated request is associated with multiple data parcels. A request can also have no data parcels associated with it.).
The final product of this process is referred to as Concrete Steps.
- Steps Packaging passes the Concrete Steps to the Dispatcher for assignment to the AMPs.
- The Dispatcher sequentially, incrementally, and atomically transmits the Concrete Steps, called AMP Steps or Spoil Steps at this point in the process, across the BYNET to the appropriate AMPs for processing.
- The Dispatcher manages any abort processing that might be required.
- The Dispatcher receives the results of the AMP Steps from the BYNET and returns them to the requesting application.