Teradata Parallel Transporter (TPT) introduction with basic structure of creation of TPT script

Teradata Parallel Transporter (TPT)

Teradata Parallel Transporter aka TPT is new generation utility tool aim to create one stop tool for all the activities related to loading and exporting of date from/to Teradata database. Teradata advise to prefer TPT over existing utilities such as fastload, fastexport, multiload & tpump.

Some of the operation that can be performed using TPT:
TPT job/script consist of three major parts:

  1. DEFINE JOB: It is mandatory to define the TPT job. It is the outermost layer where 'Operators','Schema' and 'Apply' can be defined for TPT job.
    Syntax: Without defining character set(no UNICODE data)
    DEFINE JOB [tpt_job_name] DESCRIPTION '[description of tpt job, description statement is optional]'
    (
      DEFINE OPERATOR [operator_name]
      [Define operator statements & attribute];
      
      DEFINE SCHEMA [schema_name]
      [Define operator statements & attribute];
      
      APPLY
      [apply statements to for defined operators];
    );
    

    Syntax: Defining character set(UNICODE data)
    USING CHARACTER SET UTF8
    DEFINE JOB [tpt_job_name] DESCRIPTION '[description of tpt job, description statement is optional]'
    (
       DEFINE OPERATOR [operator_name]
       [Define operator statements & attribute];
       
       DEFINE SCHEMA [schema_name]
       [Define operator statements & attribute];
       
       APPLY
       [apply statements to for defined operators];
    );
    

    Syntax: With external Schema using 'Include'
    USING CHARACTER SET UTF8
    DEFINE JOB [tpt_job_name] DESCRIPTION '[description of tpt job, description statement is optional]'
    (
    INCLUDE '$HOME/tpt/schema/external_schema_file.schema';
       DEFINE OPERATOR [operator_name]
       [Define operator statements & attribute];
    
       
       APPLY
       [apply statements to for defined operators];
    );
    

    APPLY: This is required to execute the operations defined in the TPT job. Basic syntax of apply statement
    APPLY ('DML Statement')
    TO OPERATOR (tpt_consumer_operator_name1 [n])
    SELECT * FROM (OPERATOR tpt_producer_operator_name);
    
    Note: 1. 'n' specify the number of instances of each operator to control the degree of parallelism.
          2. Supports multiple data sources using the UNION ALL query.
          3. Supports multiple data targets using multiple APPLY clauses.
          4. Supports derivation of new columns using CASE clause.
    

    There can be three types of TPT jobs
    • Single APPLY statement in the job without 'STEP' command, example defined above are this type of jobs.
    • Single APPLY statement enclosed in job 'STEP' command
    • Multiple APPLY statement enclosed in job 'STEP' command
    Sometime it is required to execute commands in sequence i.e. some steps prior to some other steps. Example:table should be dropped first before recreation the table.

    Syntax: Single APPLY with single 'STEP'
    DEFINE JOB [tpt_job_name] DESCRIPTION '[description of tpt job, description statement is optional]'
     (
        DEFINE OPERATOR [operator_name]
        [Define operator statements & attribute];
        
        DEFINE SCHEMA [schema_name]
        [Define operator statements & attribute];
        
        STEP [step_name]
        (
            APPLY
            (
              [apply statements to for defined operators (insert into table]
              [other required statements]
    
            )
         );
     );
    

    Syntax: Single APPLY with multiple 'STEP'
    DEFINE JOB [tpt_job_name] DESCRIPTION '[description of tpt job, description statement is optional]'
    (
        DEFINE OPERATOR [operator_name]
        [Define operator statements & attribute];
        
        DEFINE SCHEMA [schema_name]
        [Define operator statements & attribute];
        
        STEP [step_name1]
        (
            APPLY
              [executable statements such as drop table]
              [executable statements such as create table]
              [other required statements]
         );
         
        STEP [step_name2]
        (
            APPLY
            (
              [apply statements to for defined operators (insert into table)]
              [other required statements]
            )
         );
    );
    


  2. DEFINE OPERATOR: One or multiple type of operator/s are required to be defined in order to perform any operations in TPT.

    Available 'Define Operators' are listed below:
    Operator Type Description
    DATACONNECTOR PRODUCER This operator used as a producer to get data from source.
    DATACONNECTOR CONSUMER This operator used as a consumer to push data to target.
    DDL This is standalone operator for perfomring DDL/DML operation.
    EXPORTExport operator works similar to Teradata FASTEXPORT.
    FASTEXPORT OUTMOD FastExport OUTMOD Adapter operator works similar to Teradata FASTEXPORT OUTMOD.
    FASTLOAD INMOD FastLoad INMOD Adapter operator works similar to Teradata FASTLOAD INMOD.
    INSERTERConsumer operator that to insert data into given table.
    LOADLoad operator works similar to Teradata FASTLOAD.
    MULTILOAD INMODMultiLoad INMOD Adapter operator works similar to Teradata MULTILOAD INMOD.
    ODBC This operator allows to read data from ODBC compliant data sources.
    OS COMMAND This is standalone operator for executing OS commands.
    SELECTORProducer operator to retrieve data from a given table.
    STREAMStream operator works similar to Teradata TPUMP.
    UPDATE Update operator works similar to Teradata MULTILOAD.

    Syntax: Defining a operator in TPT
    DEFINE OPERATOR [tpt_required_operator_name]  DESCRIPTION '[desc of operator, desc statement is optional]'
     TYPE [Operator type ]
     SCHEMA [* or actual tpt_schema_name]
     ATTRIBUTES
     (
         VARCHAR attribute_name1 [optionally can assign value here with "='atrribute_value'"]
         INTEGER attribute_name2
     );
    
    Note:  1. Attributes name defined should be valid for the operator.
           2. '*' defer schema identification until run time.
    

    Example: Defining a operator in TPT
    DEFINE OPERATOR define_consumer_operator
     TYPE LOAD
     SCHEMA *
     ATTRIBUTES
     (
       VARCHAR UserName,
       VARCHAR UserPassword,
       VARCHAR LogTable,
       VARCHAR TargetTable,
       VARCHAR Tdpid
     );
    

    Note: For more detail on INMOD (Input modification) and OUTMOD(output modification), click here which will redirect to teradata site

  3. DEFINE SCHEMA: This is used to describe the structure of the source dataset or target dataset. Each field definition will contain column name and its data type. One defined schema can be used to point multiple data objects.

    Syntax: Defining a schema in TPT
    DEFINE SCHEMA [tpt_schema_name]   DESCRIPTION '[description of schema, description statement is optional]'
     ( 
        column1 VARCHAR(11), 
        column3 DATE), 
        column2 INTEGER
     ); 
    
    Note: If export/import format of the file is delimited with some character/pattern then all the fields in schema should be defined as VARCHAR only.
    

    Example: Defining a schema in TPT
    DEFINE SCHEMA Define_Schema_structure
     (
         dept_no INTEGER,
         department_name VARCHAR(30),
         loc_name  VARCHAR(30)
     );