Teradata Hash Index
Hash Index provides better performance by providing an alternate access path
Hash Index characteristics:
- Hash Indexes are limited to single table
- Can have different primary index than underlying base table.
- Create its own new physical data table as per the definition.
- Optimizer will decide during runtime whether to use Hash Index or not,users cannot directly query Index.
- Teradata automatically updates Hash Index table whenever there is any change in base table.
- Fallback protection can be used
- Column can be specified as value ordered or hash ordered
- A table can have up to 32 indexes (including join/hash & secondary indexes).
Hash Index drawbacks:
- More space consumption.
- System overhead involved in automatic maintenance of Hash Index.
- MultiLoad, FastLoad, or Restore utilities cannot be used against tables that have hash indexes defined on them.
- Triggers cannot be defined on the tables that have Hash Indexes.
- Order by column must be less than or equal to 4 bytes for 'Value Ordered Hash Index'
When to use a Hash Index:
- Data retrieval benefits are greater than the cost of maintaining and storing the hash index.
Below points briefly describe how Hash Index is different from Join Index, visit
Hash Index Vs Join Index to get more detail.
- By default,RowId of the row is included in the Hash Index. This is not the case with Join Index
- Hash index automatically include the base table 'Primary Index' if no primary index is specified but join index does not.
- Hash index cannot be created on NOPI(No Primary index) table but single table join index can be.
- Hash index cannot be created on column partitioned table but single table join index can be.
- Column partition cannot be created on Hash index but it can be created on single table join index.
- PPI can not be defined on Hash index but it can be created on single table join index.
- Multivalue compression from the base table may be carried over to a join index, but it is not carried over to a hash index.
Note: Unlike Join Index & tables,columns used for data distribution is specified by 'By' clause and not by 'Primary Index'.
Remember that the columns used for data distribution must be part of the columns which make up the Hash Index.
If no 'By' clause is mentioned then hash index will distribute data considering base table's primary index
Example 1: Normal Hash Index
CREATE HASH INDEX hash_index_name(manager_id,salary)
ON tutorial_db.employee;
Example 2: Hash Ordered Hash Index different PI and hash column test
CREATE HASH INDEX employee_hash_index_ordered (manager_id,salary) ON tutorial_db.employee
BY (manager_id) ORDER BY HASH (salary);
Output: *** Failure 5566 Error in Hash Index DDL, Primary index and order by hash columns should be the same.
Example 3: Hash Ordered Hash Index
CREATE HASH INDEX employee_hash_index_ordered (manager_id,salary) ON tutorial_db.employee
BY (manager_id) ORDER BY HASH (manager_id);
Example 4: Value Ordered Hash Index 4 byte test
CREATE HASH INDEX employee_hash_index_ordered (manager_id,salary) ON tutorial_db.employee
BY (manager_id) ORDER BY (salary);
Output: *** Failure 5566 Error in Hash Index DDL, Order by column is non-numeric, or is more than 4 bytes.
Example 5: Value Ordered Hash Index
CREATE HASH INDEX employee_hash_index_ordered (manager_id,emp_no) ON tutorial_db.employee
BY (manager_id) ORDER BY (emp_no);