Sqoop Export Part 2

Sqoop Export Part 2

Sqoop can be used to export data seamlessly from HDFS into RDBMS systems. Sqoop provides many options to handle different scenarios. Many of them will be explained in multiple tutorials with examples. This Sqoop tutorial will cover simple export with null values, null handling with export and export hive table.

Generic Arguments to export command

Attribute

Description

--export-dir

This is used to specify HDFS directory from where data need to be exported.

--table

This is used to specify RDBMS table name where data need to be exported.

--hcatalog-database

This is used to specify hive database where table is present for data need to be exported.

--hcatalog-table

This is used to specify hive table name from where data need to be exported.



→ Export Data with Nulls: If nulls are not handled properly then null data may be exported as blank string for string columns. There are different arguments to handle nulls in string and number. Both " --input-null-string" & "--input-null-non-string" clauses can be used in a single export.

In the below example, blank string is used as argument for "--input-null-string" to tell Sqoop that blank strings for string columns should be treated as NULL. And "100" is used as argument to "--input-null-non-string" to tell Sqoop that all non string field with value as 100 should be treated as NULL.
sqoop export \
 --connect jdbc:mysql://localhost:3306/retail_db \
 --username root \
 --password mysqlrootpassword \
 --driver com.mysql.cj.jdbc.Driver \
 --table order_export_null_test \
 --export-dir hdfs://localhost:9000/user/username/scoop_import/query_orders_null/part-m-00000 \
 --input-fields-terminated-by "," \
 --bindir $SQOOP_HOME/lib/ \
 --input-null-string "" \
 --input-null-non-string "100" 
Note: Blank for non string columns are always considered as null.


→ Export Hive table: "--hcatalog-database" and "--hcatalog-table" attributes can be used to specify hive database and tablename from where data need to be exported.
sqoop export \
 --connect jdbc:mysql://localhost:3306/retail_db \
 --username root \
 --password mysqlrootpassword \
 --driver com.mysql.cj.jdbc.Driver \
 --table orders_sqoop \
 --bindir $SQOOP_HOME/lib/ \
 --hcatalog-database retail \
 --hcatalog-table orders_hive