sqoop 操作
sqoop help
Available commands:codegen Generate code to interact with database recordscreate-hive-table Import a table definition into Hiveeval Evaluate a SQL statement and display the resultsexport Export an HDFS directory to a database tablehelp List available commandsimport Import a table from a database to HDFSimport-all-tables Import tables from a database to HDFSimport-mainframe Import datasets from a mainframe server to HDFSjob Work with saved jobslist-databases List available databases on a serverlist-tables List available tables in a databasemerge Merge results of incremental importsmetastore Run a standalone Sqoop metastoreversion Display version informationsqoop import/export/job/merge/list-databases/list-tables --connect 其它属性
列出数据库
sqoop list-databases \
--connect jdbc:mysql://hadoop01:3306 \
--username root --password 123456;
列出表
sqoop list-tables \
--connect jdbc:mysql://hadoop01:3306/urldb \
--username root --password 123456;
导入数据到hdfs
sqoop import --connect jdbc:mysql://hadoop01:3306/urldb \
--username root --password 123456 \
--table stu \
-m 1 \
--fields-terminated-by '\t' \
--compress \
--columns 'id,name,age' \
--target-dir hdfs://hadoop01:9000/sq24/02CREATE TABLE loz_test1(
id bigint,
firstname string,
age string
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
location '/user/hive/warehouse/loz_test'
;
使用query导入
–query 和 --table 不能同时存在
sql写成1行
query后使用’’ 和 “” 的区别 注意:$CONDITIONS
sqoop import --connect jdbc:mysql://hadoop01:3306/test \
--driver com.mysql.jdbc.Driver \
--username root --password root \
-m 1 \
--split-by id \
--fields-terminated-by '\t' \
--query 'SELECT s1.id, s1.`name`, s1.age FROM stu s1 LEFT JOIN stu1 s2 ON s1.id = s2.id WHERE s2.`name` is NULL AND s1.id > 3 and $CONDITIONS' \
--as-parquetfile \
--target-dir hdfs://hadoop01:9000/sq24/06sqoop import --connect jdbc:mysql://hadoop01:3306/test \
--driver com.mysql.jdbc.Driver \
--username root --password root \
-m 1 \
--split-by id \
--fields-terminated-by '\t' \
--query 'SELECT s1.id, s1.`name`, s1.age FROM stu s1 LEFT JOIN stu1 s2 ON s1.id = s2.id WHERE s2.`name` is NULL AND s1.id > 3 and $CONDITIONS' \
--as-parquetfile \
--delete-target-dir \
--target-dir hdfs://hadoop01:9000/sq24/08
导入到hive表:
CREATE TABLE sq1(
id bigint,
firstname string,
age string
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
;sqoop import --connect jdbc:mysql://hadoop01:3306/urldb \
--username root --password 123456 \
--table stu \
--columns 'id,name,age' \
--null-string '\\N' \
--null-non-string '\\N' \
--delete-target-dir \
--fields-terminated-by '\t' \
--hive-import \
--hive-overwrite \
--hive-table sq1
增量导入
增量导入:
全量导入:(整张表全部导出)
增量分为两种:append 和 lastmodified(使用时间戳)
sqoop import --connect jdbc:mysql://hadoop01:3306/test \
--driver com.mysql.jdbc.Driver \
--username root --password root \
--table stu \
-m 1 \
--incremental append \
--check-column id \
--last-value 0 \
--target-dir hdfs://hadoop01:9000/sq24/09sqoop import --connect jdbc:mysql://hadoop01:3306/test \
--driver com.mysql.jdbc.Driver \
--username root --password root \
--table stu \
-m 1 \
--incremental append \
--check-column id \
--last-value 7 \
--target-dir hdfs://hadoop01:9000/sq24/09
sqoop job
sqoop的job:--create <job-id> Create a new saved job--delete <job-id> Delete a saved job--exec <job-id> Run a saved job--help Print usage instructions--list List saved jobs--meta-connect <jdbc-uri> Specify JDBC connect string for themetastore--show <job-id> Show the parameters for a saved job--verbose Print more information while workingsqoop job --list
sqoop job --create myjob -- import --connect jdbc:mysql://hadoop01:3306/test \
--driver com.mysql.jdbc.Driver \
--username root --password root \
--table stu \
-m 1 \
--incremental append \
--check-column id \
--last-value 0 \
--target-dir hdfs://hadoop01:9000/sq24/10sqoop job --exec myjob
sqoop 导出
sqoop的导出:
sqoop export \
--connect jdbc:mysql://hadoop01:3306/test \
--username root \
--password root \
--table stu2 \
--input-null-string '' \
--input-null-non-string '' \
--export-dir hdfs://hadoop01:9000/sq24/10/* CREATE TABLE `stu2` (`id` int(11) NOT NULL DEFAULT '0',`student_id` int(11) DEFAULT NULL,`name` varchar(45) DEFAULT NULL,`age` int(1) DEFAULT NULL,`sex` varchar(1) DEFAULT NULL,`birthday` varchar(45) DEFAULT NULL,PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;