Surrogate Key is a unique, database supplied or generated identifier generally used as the primary key/index of a table.
➠ When to use a surrogate key
➠ Advantages of surrogate key
➠ Disadvantages of surrogate key
➠ How to generate surrogate key in Teradata
SELECT
(SELECT ZEROIFNULL(MAX(emp_no)) FROM employee) + CSUM(1,1) AS surrogate_key_1_new_emp_no
,emp_no
,emp_name
from employee;
Single AMP (usually vproc 0) generally processes all the data when using CSUM(1,1)
SELECT
(SELECT ZEROIFNULL(MAX(emp_no)) FROM employee) + SUM(1) OVER (ROWS UNBOUNDED PRECEDING) AS surrogate_key_2_new_emp_no
,emp_no
,emp_name
FROM employee;
SELECT
(SELECT ZEROIFNULL(MAX(emp_no)) FROM employee) + ROW_NUMBER() OVER (ORDER BY 1) AS surrogate_key_3_new_emp_no
,emp_no
,emp_name
FROM employee;
Note: "(SELECT ZEROIFNULL(MAX(emp_no)) FROM employee)" is used in above examples to generate new sequence which must be greater than the current max value present in the table.
CREATE MULTISET TABLE surrogate_key_test1
( Id DECIMAL(38,0) GENERATED ALWAYS AS IDENTITY
(
START WITH 1
INCREMENT BY 1
MAXVALUE 4000000
NO CYCLE
),
name VARCHAR(100) NOT NULL
) PRIMARY INDEX (id);
INSERT INTO surrogate_key_test(name) VALUES('Teradata');
INSERT INTO surrogate_key_test(name) VALUES('hive');
INSERT INTO surrogate_key_test(name) VALUES('oracle');
SELECT * FROM surrogate_key_test;
*** Query completed. 3 rows found. 2 columns returned.
*** Total elapsed time was 1 second.
Id Name
--- ---------
3 oracle
1 Teradata
2 hive
Note: There will always be gaps in generated number when using identity column because in Teradata it is not one sequence but multiple parallel sequences(one on each AMP).