Teradata Regular Expression Functions

Teradata Regular Expression Functions

There are multiple Teradata Regular Expression Functions which helps to perform variety of String related operation which are not possible using regular string functions. There are five regular expression functions available in Teradata, click on the function below to go directly to that function.
All regexp function has regular expression string as one of the input parameter which distinguishes them from normal functions. These regexp string can be created using metacharacters. Lets see some of available metacharacters used in regexp functions.

Regexp Metacharacters: Lets understand or revise regular expression metacharacters which will be helpful in creating regular expression string used in regexp functions. These metacharacters work same way in most of the modern programming languages.

Complementary Metacharacters: Below list of metacharacters works as a supplement to previous character class or expression.


Last argument for all the regular function is same and it called match parameter and below values are possible inputs for that parameter of functions.

  1. REGEXP_SUBSTR: Extracts a substring from source string that matches a regular expression string parameter.
    
    Syntax: REGEXP_SUBSTR(source string, regular expression string [,start from position number, nth occurrence, match argument ])
    
    Note: This function supports 2, 3, 4, or 5 parameters.

    Examples:
    • With two parameters: Function will assume to start from beginning of the string and return first matched occurrence.
      
      SELECT regexp_substr('tractor,scooter,bike,','[s][a-z]+');
      
      Output: scooter
      

    • With three parameters: This will return first matched occurrence starting from the specified position.
      
      SELECT regexp_substr('tractor,scooter,bike,scooty','[s][a-z]+',15); 
      
      Output: scooty
      

    • With four parameters: This will return nth matched occurrence. In the below 2 examples, 1st occurrence & 2nd occurrence is returned starting from position 1.
      
      1st occurrence
      SELECT regexp_substr('tractor,scooter,bike','[s][a-z]+',1,1);
      Output: scooter
      
      2nd occurrence
      SELECT regexp_substr('tractor,scooter,bike,scooty','[s][a-z]+',1,2);
      Output: scooty
      

    • With five parameters: Match argument can be passed as specified in the table above. This example will do case insensitive matching.
      
      SELECT regexp_substr('tractor,scooter,bike,SCOOTY','[s][a-z]+',1,2,'mi');
      
      Output: SCOOTY
      

    • Find first word of Sentence:
      
      SELECT regexp_substr('tractor scooter bike','[^ ]+',1,1);
      
      Output: tractor
      


  2. REGEXP_REPLACE: Replaces portions of source string that match regular expression string with the replace string.
    
    Syntax: REGEXP_REPLACE(source string, regular expression string [,replace string, start from position number, nth occurrence, match argument ])
    
    Note: This function supports 2, 3, 4, 5 or 6 parameters.

    Examples:
    • With two parameters: Function will assume replace string as blank & to start from beginning of the string and replace all matched occurrence with blank and return string.
      
      SELECT regexp_replace('tractor,scooter,bike','[s][a-z]+');
      
      Output: tractor,,bike
      

    • With three parameters: Function will assume to start from beginning of the string and replace all matched occurrence with replace string and return string.
      
      SELECT regexp_replace('tractor,scooter,bike,summer','[s][a-z]+','scooty'); 
      
      Output: tractor,scooty,bike,scooty
      

    • With four parameters: Function will assume to start from specified position of the string (15 in the below example) and replace all matched occurrence with replace string after that position and return string.
      
      SELECT regexp_replace('tractor,scooter,bike,summer','[s][a-z]+','scooty',15); 
      
      Output: tractor,scooter,bike,scooty
      

    • With five parameters: Function will assume to start from beginning position of the string and replace nth matched occurrence with replace string and return string. If "0" is passed then it will replace all occurence.
      
      2nd occurrence
      SELECT regexp_replace('tractor,scooter,bike,summer','[s][a-z]+','scooty',1,2);
      
      Output: tractor,scooter,bike,scooty
      

    • With six parameters:
      
      SELECT regexp_replace('tractor,scooter,bike,SCOOTY','[s][a-z]+','scooty',1,2,'mi');
      
      Output: tractor,scooter,bike,scooty
      


  3. REGEXP_INSTR: Searches source string for a regular expression string match.
    
    Syntax: REGEXP_INSTR(source string, regular expression String [start from position argument, nth occurrence, return option,  match argument ])
    
    Note: This function supports 2, 3, 4, 5 or 6 parameters.

    Examples:
    • With two parameters: Function will assume to start from beginning of the string and return first matched occurrence start position.
      
      SELECT regexp_instr('tractor,scooter,bike,','[s][a-z]+'); --start position of scooter
      
      Output: 9
      

    • With three parameters: This will return first matched occurrence starting position from the specified position.
      
      SELECT regexp_instr('tractor,scooter,bike,scooty','[s][a-z]+',15);  --start position of scooty
      
      Output: 22
      

    • With four parameters: This will return nth matched occurrence starting position. In the below example, position of 2nd occurrence is returned starting from position 1.
      
      2nd occurrence
      SELECT regexp_instr('tractor,scooter,bike,scooty,baby','[b][a-z]+',1,2); --start position of baby
      
      Output: 29
      

    • With five parameters: Fifth parameter is whether to return start position or end position of matched string. possible values are 0 & 1, 0 specify to return start position of the string and 1 specify to end position of string.
      
      SELECT regexp_instr('tractor,scooter,bike,scooty,baby','[b][a-z]+',1,2,1); --end position of baby
      
      Output: 33
      
    • With six parameters: Match argument can be passed as specified in the table above. Belowa example will do case insensitive matching and return position.
      
      SELECT regexp_instr('tractor,scooter,bike,SCOOTY','[s][a-z]+',1,2,1,'mi'); --end position of SCOOTY
      
      Output: 28
      


  4. REGEXP_SIMILAR: Compares source string to regular expression string and returns integer value.
    
    Syntax: REGEXP_SIMILAR(source string, regular expression String [, match argument ])
    
    Note: This function supports 2 or 3 parameters.

    Example: Function will try to match whole string with the regular expression to return 0 or 1. 1 specify that string is match and 0 specify string is not matching.
    
    SELECT user_id,regexp_similar(user_id,'(b|p)eta','i') FROM alphanumeric_test;
    
    Output: 
    User_Id     regexp_similar(User_Id,'(b|p)eta','i')
    ----------  --------------------------------------
    alpha                                            0
    324BETA                                          0
    alpha4532                                        0
    PETA                                             1
    beta                                             1
    


  5. REGEXP_SPLIT_TO_TABLE: Splits source string into a table of strings using regular expression string as the delimiter. This function is explained separately on String to table tutorials.