Break/split the string on delimiters and use the sub-strings in In-clause of subquery

Teradata: Convert String for In-Clause

There are scenarios where we get data in the form of strings/character sequences with some delimiter. And we want to break/split the string on delimiters and use the sub-strings in In-clause of subquery.


Lets assume, we have got a string of vehicle type with comma as delimiter and want to break into each vehicle category, For example: 'tractor,scooter,bike'

Syntax : Here is the example to break the String and get values.

SELECT regexp_substr('tractor,scooter,bike','[^,]+',1,day_of_calendar) fields
FROM sys_calendar.calendar
WHERE day_of_calendar <= (CHAR('tractor,scooter,bike') - CHAR(oreplace('tractor,scooter,bike',',','')))+1;

Output:

fields
~~~~~~~~
tractor
scooter
bike

Here is the systax to use this in WHERE clause

SELECT * FROM  
WHERE yourtable.requiredColumn in 
(
  SELECT regexp_substr('tractor,scooter,bike','[^,]+',1,day_of_calendar) fields
  FROM sys_calendar.calendar
  WHERE 
    day_of_calendar <= (CHAR('tractor,scooter,bike') -  CHAR(oreplace('tractor,scooter,bike',',','')))+1
);


How the query is working: Basically, the query is executing in 3 steps.
  1. First step is identifying the number of values present in each received string.
    And below statement is calculating the number of occurence of commas in the string by calculating length difference before and after all the commas are replaced from the received String.
    
    SELECT (CHAR('tractor,scooter,bike') - CHAR(oreplace('tractor,scooter,bike',',','')))+1;
    
    Output: 3
    
    Here CHAR is used to get the length of string and visit Teradata String Functions to learn more about string functions

  2. Second step is to get rows with 1 to 'n' number, where n is equivalent to number of values in the String.
    
    SELECT day_of_calendar
    FROM sys_calendar.calendar
    WHERE day_of_calendar <= (CHAR('tractor,scooter,bike') - CHAR(oreplace('tractor,scooter,bike',',','')))+1;
    
    Output:
    
    day_of_calendar
    ~~~~~~~~~~~~~~~
    1
    2
    3
    
  3. In the third step, String is being split for each comma using regular expression substring function.
    
    regexp_substr('tractor,scooter,bike','[^,]+',1,n);
    
    **Here n is number and will return nth value in the string after splitting on delimiter.
    
There is another way to split the input string and use it in In-Clause. It can be achieved by breaking the string to table using in-built Teradata functions(strtok_split_to_table & regexp_split_to_table) . If you are interested then you may get more detail here on String to Table functions