
Sqoop operations practice

Views: 77 | Published: 2023-12-29 14:43:15

Sqoop operations

sqoop help

Available commands:
  codegen            Generate code to interact with database records
  create-hive-table  Import a table definition into Hive
  eval               Evaluate a SQL statement and display the results
  export             Export an HDFS directory to a database table
  help               List available commands
  import             Import a table from a database to HDFS
  import-all-tables  Import tables from a database to HDFS
  import-mainframe   Import datasets from a mainframe server to HDFS
  job                Work with saved jobs
  list-databases     List available databases on a server
  list-tables        List available tables in a database
  merge              Merge results of incremental imports
  metastore          Run a standalone Sqoop metastore
  version            Display version information

General usage: sqoop import/export/job/merge/list-databases/list-tables --connect <jdbc-uri> [other options]

List databases

sqoop list-databases \
--connect jdbc:mysql://hadoop01:3306 \
--username root --password 123456;

List tables

sqoop list-tables \
--connect jdbc:mysql://hadoop01:3306/urldb \
--username root --password 123456;

Import data into HDFS

sqoop import --connect jdbc:mysql://hadoop01:3306/urldb \
--username root --password 123456 \
--table stu \
-m 1 \
--fields-terminated-by '\t' \
--compress \
--columns 'id,name,age' \
--target-dir hdfs://hadoop01:9000/sq24/02

Hive table definition:

CREATE TABLE loz_test1(
id bigint,  
firstname string,
age string
) 
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' 
location '/user/hive/warehouse/loz_test'
;
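Once the import finishes, the part files can be inspected straight from HDFS; a sketch, assuming the target directory used above (since `--compress` was enabled, the files are compressed, and `hdfs dfs -text` decompresses them on the fly):

```shell
hdfs dfs -ls /sq24/02
hdfs dfs -text /sq24/02/part-m-* | head
```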

Import with --query

--query and --table cannot be used together.
Write the SQL as a single line.
Note the difference between wrapping the query in single quotes '' and double quotes "". Pay attention to $CONDITIONS.
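The quoting rule can be checked locally without Sqoop; a minimal bash sketch:

```shell
# Single quotes: the shell passes $CONDITIONS through literally, so Sqoop
# can substitute its own split predicate at run time.
single='WHERE id > 3 AND $CONDITIONS'
# Double quotes: the shell would expand $CONDITIONS (usually to an empty
# string), so it must be escaped as \$CONDITIONS to reach Sqoop intact.
double="WHERE id > 3 AND \$CONDITIONS"
echo "$single"
echo "$double"
# Both lines print: WHERE id > 3 AND $CONDITIONS
```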

sqoop import --connect jdbc:mysql://hadoop01:3306/test \
--driver com.mysql.jdbc.Driver \
--username root --password root \
-m 1 \
--split-by id \
--fields-terminated-by '\t' \
--query 'SELECT s1.id, s1.`name`, s1.age FROM stu s1 LEFT JOIN stu1 s2 ON s1.id = s2.id WHERE s2.`name` is NULL AND s1.id > 3 and $CONDITIONS' \
--as-parquetfile \
--target-dir hdfs://hadoop01:9000/sq24/06

sqoop import --connect jdbc:mysql://hadoop01:3306/test \
--driver com.mysql.jdbc.Driver \
--username root --password root \
-m 1 \
--split-by id \
--fields-terminated-by '\t' \
--query 'SELECT s1.id, s1.`name`, s1.age FROM stu s1 LEFT JOIN stu1 s2 ON s1.id = s2.id WHERE s2.`name` is NULL AND s1.id > 3 and $CONDITIONS' \
--as-parquetfile \
--delete-target-dir \
--target-dir hdfs://hadoop01:9000/sq24/08

Import into a Hive table:

CREATE TABLE sq1(
id bigint,  
firstname string,
age string
) 
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' 
;

sqoop import --connect jdbc:mysql://hadoop01:3306/urldb \
--username root --password 123456 \
--table stu \
--columns 'id,name,age' \
--null-string '\\N' \
--null-non-string '\\N' \
--delete-target-dir \
--fields-terminated-by '\t' \
--hive-import \
--hive-overwrite \
--hive-table sq1 
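After the Hive import, the table can be spot-checked from the Hive CLI; a sketch (assumes `hive` is on the PATH):

```shell
hive -e "SELECT * FROM sq1 LIMIT 5;"
```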

Incremental import

Incremental import: only rows added or changed since the last run are imported.
Full import: the entire table is exported.
Incremental import has two modes: append and lastmodified (timestamp-based).
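In append mode, Sqoop effectively adds a predicate of the form check-column > last-value to the generated SELECT. A minimal local sketch of that predicate, using the column and starting value from the command below (no cluster required; the table name and column list are illustrative):

```shell
# Sketch of the WHERE clause Sqoop's append mode builds (assumed table/columns).
CHECK_COLUMN="id"
LAST_VALUE=0
echo "SELECT id, name, age FROM stu WHERE ${CHECK_COLUMN} > ${LAST_VALUE}"
# Prints: SELECT id, name, age FROM stu WHERE id > 0
```

On the next run, --last-value is raised to the highest id already imported, so only newer rows are fetched.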
sqoop import --connect jdbc:mysql://hadoop01:3306/test \
--driver com.mysql.jdbc.Driver \
--username root --password root \
--table stu \
-m 1 \
--incremental append \
--check-column id \
--last-value 0 \
--target-dir hdfs://hadoop01:9000/sq24/09

sqoop import --connect jdbc:mysql://hadoop01:3306/test \
--driver com.mysql.jdbc.Driver \
--username root --password root \
--table stu \
-m 1 \
--incremental append \
--check-column id \
--last-value 7 \
--target-dir hdfs://hadoop01:9000/sq24/09

sqoop job

Sqoop job options:
  --create <job-id>            Create a new saved job
  --delete <job-id>            Delete a saved job
  --exec <job-id>              Run a saved job
  --help                       Print usage instructions
  --list                       List saved jobs
  --meta-connect <jdbc-uri>    Specify JDBC connect string for the metastore
  --show <job-id>              Show the parameters for a saved job
  --verbose                    Print more information while working

sqoop job --list
sqoop job --create myjob -- import --connect jdbc:mysql://hadoop01:3306/test \
--driver com.mysql.jdbc.Driver \
--username root --password root \
--table stu \
-m 1 \
--incremental append \
--check-column id \
--last-value 0 \
--target-dir hdfs://hadoop01:9000/sq24/10

sqoop job --exec myjob
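A saved job stores its parameters in the metastore, including the --last-value that Sqoop updates automatically after each successful run; the job can be inspected and removed with the subcommands listed above. A sketch:

```shell
sqoop job --show myjob     # print the stored parameters, including the current --last-value
sqoop job --delete myjob   # remove the saved job
```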

Sqoop export

sqoop export \
--connect jdbc:mysql://hadoop01:3306/test \
--username root \
--password root \
--table stu2 \
--input-null-string '' \
--input-null-non-string '' \
--export-dir hdfs://hadoop01:9000/sq24/10/*

CREATE TABLE `stu2` (
  `id` int(11) NOT NULL DEFAULT '0',
  `student_id` int(11) DEFAULT NULL,
  `name` varchar(45) DEFAULT NULL,
  `age` int(1) DEFAULT NULL,
  `sex` varchar(1) DEFAULT NULL,
  `birthday` varchar(45) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
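Note that the Hive import earlier wrote NULLs as \N (--null-string '\\N'); if data in that layout is exported back to MySQL, the export side should be told to parse the same token. A sketch, assuming that layout:

```shell
sqoop export \
--connect jdbc:mysql://hadoop01:3306/test \
--username root \
--password root \
--table stu2 \
--input-null-string '\\N' \
--input-null-non-string '\\N' \
--export-dir hdfs://hadoop01:9000/sq24/10/*
```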