Sqoop Hive Import

Sqoop Hive Import

Sqoop can be used to import data seamlessly into Hive tables from RDBMS systems. This Sqoop tutorial will cover simple hive import, hive import with overwrite existing table, hive import specific columns and Hive import examples.

Generic Arguments to Hive Import command

Attribute

Description

--hive-import

This attribute indicate that this import is Hive import.

--hive-database

This attribute is used to specify hive database where hive table is present.

--hive-table

This attribute is used to specify hive table where data need to be imported.

--hive-overwrite

This attribute is used to overwrite existing hive table where data need to be imported.

--map-column-hive

This attribute is used to specify column names and datatype for the custom hive import.



→ Simple Hive Import: This will import data to Hive table and import utility will create table if not present.
sqoop import  \
 --connect jdbc:mysql://localhost:3306/retail_db \
 --username root \
 --password mysqlrootpassword \
 --driver com.mysql.cj.jdbc.Driver \
 --table orders \
 --hive-import \
 --hive-database retail \
 --hive-table orders_hive 


→ Import Overwrite Hive Table: "--hive-overwrite" attribute can be used to truncate existing hive table before importing data again to that hive table.
 sqoop import  \
 --connect jdbc:mysql://localhost:3306/retail_db \
 --username root \
 --password mysqlrootpassword \
 --driver com.mysql.cj.jdbc.Driver \
 --table order_items \
 --hive-import \
 --hive-database retail \
 --hive-table order_items_hive \
 --hive-overwrite \
 -m 1


→ Import Custom Columns in Hive: "--map-column-hive" attribute can be used to specify data types of the custom import columns
  sqoop import  \
 --connect jdbc:mysql://localhost:3306/retail_db \
 --username root \
 --password mysqlrootpassword \
 --driver com.mysql.cj.jdbc.Driver \
 --table orders \
 --columns order_status,order_id,order_date \
 --bindir $SQOOP_HOME/lib/ \
 --split-by order_id \
 --hive-import \
 --hive-database retail \
 --hive-table orders_hive_1 \
 --map-column-hive "order_status=STRING,order_id=INT,order_date=STRING"
Note: Hive import will fail if column name is mentioned different in --map-column-hive than RDBMS table name then it will fail with below error.
ERROR [main] tool.ImportTool (ImportTool.java:run(630)) - Import failed: No column by the name order_ts_string found while importing data