Teradata Join Index
Join Index provides better performance by providing an alternate access path
Join Index(JI) characteristics:
- Popularly compared with the Oracle's materialized view.
- Create its own new physical data table as per the definition.
- Optimizer will decide during runtime whether to use Join Index or not,users cannot directly query Join Index.
- Teradata automatically updates Join Index whenever there is any change in base tables.
- Can have different primary index than underlying base tables.
- Fallback protection can be used
- Column can be specified as value ordered
- A table can have up to 32 indexes (including join/hash & secondary indexes).
- Can be defined using both left and right outer joins, but full outer joins are prohibited.
Join Index drawbacks:
- More space consumption.
- System overhead involved in automatic maintenance of Join Index.
- MultiLoad, FastLoad, or Restore utilities cannot be used against tables that have join indexes defined on them.
When to use a Join Index:
- Data retrieval benefits are greater than the cost of maintaining and storing the join index.
- Table is consistently join with other tables on non-primary index columns.
Below points briefly describe how Join Index is different from Hash Index, visit
Hash Index Vs Join Index to get more detail.
- By default,RowId of the row is included in the Hash Index. This is not the case with Join Index.
- Hash index automatically include the base table 'Primary Index' if no primary index is specified but join index does not.
- Join index can be created on NOPI(No Primary index) table but hash index cannot be.
- Join index can be created on column partitioned table but hash index cannot be.
- Column partition can be created on Join index but it cannot be created on Join index.
- PPI can be defined on Join index but it cannot be created defined on hash index.
- Multivalue compression from the base table may be carried over to a join index, but it is not carried over to a hash index.
There are 3 basic types of Join Indexes
- Single Table Join Index: Fewer columns from single table with different primary index.
Example 1:
CREATE JOIN INDEX Single_table_Join_Index_name
AS
SELECT
manager_id,
hire_date,
salary
FROM tutorial_db.employee
PRIMARY INDEX (manager_id) ;
- Aggregate Join Index: Aggregates one or more columns from the base tables.
Example 2:
CREATE JOIN INDEX Aggregate_Join_Index_name
AS
SELECT
dept_no
,SUM (salary)
FROM tutorial_db.department
GROUP BY 1;
** Only COUNT and SUM clause are allowed in aggregate join index.
- Multi-Table Join Index: Pre-Joins multiple tables from the base tables.
Example 3:
CREATE JOIN INDEX Multi_table_Join_Index_name AS
SELECT
emp.emp_name,
emp.job_title,
emp.salary,
emp.dept_no,
dept.department_name,
dept.loc_name
FROM tutorial_db.employee as emp
INNER JOIN tutorial_db.department as dept
ON emp.dept_no = dept.dept_no
PRIMARY INDEX(dept_no);
Cover Query: When all columns in a query can be satisfied by a join index table it is called a COVER QUERY!
Example 4: All the column for below query is satisfied by Multi_table_Join_Index_name Join Index.
SELECT
emp.emp_name,
dept.department_name,
emp.salary
FROM tutorial_db.employee as emp
INNER JOIN tutorial_db.department as dept
ON emp.dept_no = dept.dept_no
Sparse Index: A Sparse Index is a join index with a 'WHERE' Clause. Sparse Index(Join Index) table are smaller since data is filtered with 'WHERE' clause and as a result.
- queries run faster
- lesser system maintainence and overhead
CREATE JOIN INDEX Sparse_Index_name AS
SELECT
emp.emp_name,
emp.job_title,
emp.salary,
emp.dept_no,
dept.department_name,
dept.loc_name
FROM tutorial_db.employee as emp
INNER JOIN tutorial_db.department as dept
ON emp.dept_no = dept.dept_no
WHERE emp.dept_no=300
PRIMARY INDEX(dept_no);
Global Join Index: Global join index is a type of join index where ROWID of base table/tables is added in the column list. Teradata internally uses these RowIds to found other columns (not listed in Join Index) quickly.
CREATE JOIN INDEX Global_Multi_table_Join_Index_name AS
SELECT
emp.emp_name,
emp.job_title,
emp.salary,
emp.dept_no,
dept.department_name,
dept.loc_name,
emp.ROWID,
dept.ROWID
FROM tutorial_db.employee as emp
INNER JOIN tutorial_db.department as dept
ON emp.dept_no = dept.dept_no
PRIMARY INDEX(dept_no);
Value Ordered Join Index: By default, join index subtable rows are sorted locally on each AMP by the row hash value of the NUPI column,it is possible to store rows in value order.
CREATE JOIN INDEX Emp_Ordered_Join_Index AS
SELECT
emp.emp_name,
emp.job_title,
emp.salary,
emp.dept_no,
dept.department_name,
dept.loc_name
FROM tutorial_db.employee as emp
INNER JOIN tutorial_db.department as dept
ON emp.dept_no = dept.dept_no
ORDER BY emp.dept_no
PRIMARY INDEX(emp_name);