Hadoop - Sqoop

1 - About

Sqoop is designed to:

  • import tables from a database into HDFS
  • export HDFS data into a database

Sqoop is a Hadoop command-line program that transfers data between data sources (typically a relational database and HDFS) through MapReduce programs.

You can use Sqoop to import and export data.

Sqoop is a collection of related tools.
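
The import and export directions described above can be sketched as two command lines. This is a minimal sketch, not a tested recipe: the JDBC URL, user name, table names and HDFS directory are hypothetical placeholders, and the run_sqoop helper is an illustration that only prints the command unless RUN=1 is set and sqoop is on the PATH.

```shell
# Hypothetical connection details (assumptions, replace with real values)
JDBC_URL="jdbc:mysql://db.example.com/corp"
DB_USER="someuser"
HDFS_DIR="/user/example/employees"

# Illustrative helper: print the sqoop command line, and execute it only
# when RUN=1 is set and the sqoop program can be found on the PATH.
run_sqoop() {
    echo "sqoop $*"
    if [ "${RUN:-0}" = "1" ] && command -v sqoop >/dev/null 2>&1; then
        sqoop "$@"
    fi
}

# Import: copy the (hypothetical) EMPLOYEES table from the database into HDFS.
# -P asks for the password on the console instead of putting it on the command line.
run_sqoop import --connect "$JDBC_URL" --username "$DB_USER" -P \
    --table EMPLOYEES --target-dir "$HDFS_DIR"

# Export: push the files of an HDFS directory back into an existing database table.
run_sqoop export --connect "$JDBC_URL" --username "$DB_USER" -P \
    --table EMPLOYEES_COPY --export-dir "$HDFS_DIR"
```

For an export, the target table is expected to exist already in the database.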

3 - Architecture

The sqoop command-line program is a wrapper which runs the bin/hadoop script shipped with Hadoop.
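
To make the wrapper idea concrete, here is a simplified sketch in the style of such a launcher. It is not the script shipped with Sqoop: the function name sqoop_wrapper is hypothetical, the default directory is the Apache Bigtop location, and the main class org.apache.sqoop.Sqoop is assumed to be the one used by Sqoop 1.4.x.

```shell
# Assumption: HADOOP_COMMON_HOME points at a Hadoop installation;
# when it is empty or unset, fall back to the Apache Bigtop default.
HADOOP_COMMON_HOME="${HADOOP_COMMON_HOME:-/usr/lib/hadoop}"

# Hypothetical wrapper: delegate to bin/hadoop, passing the Sqoop main class
# followed by the tool name and its arguments.
sqoop_wrapper() {
    hadoop_bin="$HADOOP_COMMON_HOME/bin/hadoop"
    if [ -x "$hadoop_bin" ]; then
        # bin/hadoop builds the Hadoop classpath and starts the JVM
        "$hadoop_bin" org.apache.sqoop.Sqoop "$@"
    else
        # No Hadoop found: only show what would have been executed
        echo "would run: $hadoop_bin org.apache.sqoop.Sqoop $*"
    fi
}

sqoop_wrapper version
```

Delegating to bin/hadoop means the Sqoop process inherits the classpath and configuration of the local Hadoop installation.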

4 - Installation / Configuration

################
## Binary location
################
export HADOOP_COMMON_HOME=/path/to/some/hadoop
export HADOOP_MAPRED_HOME=/path/to/some/hadoop-mapreduce
# if HADOOP_COMMON_HOME and HADOOP_MAPRED_HOME are not set, Sqoop falls back to HADOOP_HOME
export HADOOP_HOME=
# if HADOOP_HOME is not set either, Sqoop uses the default installation locations for Apache Bigtop:
#    * /usr/lib/hadoop
#    * and /usr/lib/hadoop-mapreduce
 
################
## Configuration location
################
# default value 
export HADOOP_CONF_DIR=$HADOOP_HOME/conf/
 
 
################
## Run
################
sqoop import --arguments...
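
The lookup order described in the comments above can be sketched as two small shell functions. This is an illustration only: the function names are hypothetical, and an empty variable is treated the same as an unset one.

```shell
# Resolve the Hadoop common directory:
# HADOOP_COMMON_HOME, else HADOOP_HOME, else the Apache Bigtop default.
resolve_common_home() {
    echo "${HADOOP_COMMON_HOME:-${HADOOP_HOME:-/usr/lib/hadoop}}"
}

# Resolve the Hadoop MapReduce directory:
# HADOOP_MAPRED_HOME, else HADOOP_HOME, else the Apache Bigtop default.
resolve_mapred_home() {
    echo "${HADOOP_MAPRED_HOME:-${HADOOP_HOME:-/usr/lib/hadoop-mapreduce}}"
}

echo "common: $(resolve_common_home)"
echo "mapred: $(resolve_mapred_home)"
```

Running it with different combinations of the three variables shows which directory would be picked.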

5 - Management

5.1 - Configuration

To use Sqoop, you must configure Sqoop properties in a JDBC connection and run the mapping in the Hadoop environment.
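
When Sqoop is driven from the command line, one way to keep the JDBC connection properties in a single place is an options file passed with --options-file. The sketch below writes such a file and prints the matching command; the file location, host, database, user and table name are hypothetical placeholders.

```shell
# Hypothetical location of the options file
OPTIONS_FILE="${TMPDIR:-/tmp}/sqoop-import-options.txt"

# An options file holds one option or value per line; lines starting with # are comments.
cat > "$OPTIONS_FILE" <<'EOF'
# Tool being invoked
import
# JDBC connect string (hypothetical host and database)
--connect
jdbc:mysql://db.example.com/corp
# Database user (hypothetical)
--username
someuser
EOF

# The remaining arguments are given on the command line as usual.
echo "sqoop --options-file $OPTIONS_FILE --table EMPLOYEES"
```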

5.2 - Location

Users of a packaged deployment of Sqoop (such as an RPM shipped with Apache Bigtop) will find this program installed in:

/usr/bin/

5.3 - Tool

The tools are:

  • sqoop - the main CLI, which calls the other tools
  • sqoop-codegen
  • sqoop-create-hive-table
  • sqoop-eval - primitive SQL execution shell
  • sqoop-export
  • sqoop-help
  • sqoop-import
  • sqoop-import-all-tables - import all tables from a database into HDFS
  • sqoop-job
  • sqoop-list-databases - list the available database schemas
  • sqoop-list-tables - list the available tables within a schema
  • sqoop-merge
  • sqoop-metastore
  • sqoop-version
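
Each tool in the list above can be reached in two ways: as a sub-command of the main sqoop program, or through its own sqoop-toolname script. The loop below only prints both forms for every tool and runs nothing; the helper name invocation_forms is hypothetical.

```shell
# Tool names from the list above, without the "sqoop-" prefix
TOOLS="codegen create-hive-table eval export help import import-all-tables job list-databases list-tables merge metastore version"

# Print the two equivalent invocation forms for one tool.
invocation_forms() {
    echo "sqoop $1 | sqoop-$1"
}

for tool in $TOOLS; do
    invocation_forms "$tool"
done
```

The help tool is a convenient starting point: sqoop help lists the available tools, and sqoop help followed by a tool name prints the usage of that tool.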

5.4 - Run / Job

6 - Documentation / Reference

db/hadoop/sqoop/start.txt · Last modified: 2019/05/28 09:39 by gerardnico