Environment: Ubuntu 16.04, hadoop-2.7.3, hbase-1.3.0, zookeeper-3.4.10, hive-2.1.1
Download sqoop-1.4.6.bin__hadoop-2.0.4-alpha.tar.gz, then extract it into your home directory:
ubuntuserver@Master:~$tar -zxvf sqoop-1.4.6.bin__hadoop-2.0.4-alpha.tar.gz
Rename it:
ubuntuserver@Master:~$sudo mv sqoop-1.4.6.bin__hadoop-2.0.4-alpha sqoop-1.4.6
Move it to /usr/local:
ubuntuserver@Master:~$sudo mv sqoop-1.4.6 /usr/local/
Create and edit the sqoop-env.sh file:
ubuntuserver@Master:/usr/local/sqoop-1.4.6/conf$sudo cp sqoop-env-template.sh sqoop-env.sh
ubuntuserver@Master:/usr/local/sqoop-1.4.6/conf$sudo vim sqoop-env.sh
After editing, the file looks like this:
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# included in all the hadoop scripts with source command
# should not be executable directly
# also should not be passed any arguments, since we need original $*

# Set Hadoop-specific environment variables here.

#Set path to where bin/hadoop is available
#export HADOOP_COMMON_HOME=
export HADOOP_COMMON_HOME=/usr/local/hadoop

#Set path to where hadoop-*-core.jar is available
#export HADOOP_MAPRED_HOME=
export HADOOP_MAPRED_HOME=/usr/local/hadoop

#set the path to where bin/hbase is available
#export HBASE_HOME=
export HBASE_HOME=/usr/local/hbase-1.3.0

#Set the path to where bin/hive is available
#export HIVE_HOME=
export HIVE_HOME=/usr/local/apache-hive-2.1.1-bin

#Set the path for where zookeper config dir is
#export ZOOCFGDIR=
export ZOOCFGDIR=/usr/local/zookeeper-3.4.10
Edit the environment variables:
ubuntuserver@Master:~$sudo vim /etc/profile
Add the following lines:
export SQOOP_HOME=/usr/local/sqoop-1.4.6
export PATH=$SQOOP_HOME/bin:$PATH
Save the file, then run source /etc/profile to make the environment variables take effect.
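As an optional sanity check, you can ask Sqoop to print its version once the PATH change is loaded; at this stage the command may still print the HCatalog/Accumulo warnings discussed below:
ubuntuserver@Master:~$ sqoop version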
Run the configure-sqoop command. If it reports errors like the following:
ubuntuserver@Master:~$ configure-sqoop
Warning: /usr/local/sqoop-1.4.6/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /usr/local/sqoop-1.4.6/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
ubuntuserver@Master:~$
then edit the configure-sqoop file in the $SQOOP_HOME/bin directory and comment out the following lines:
#if [ -z "${HCAT_HOME}" ]; then
#  if [ -d "/usr/lib/hive-hcatalog" ]; then
#    HCAT_HOME=/usr/lib/hive-hcatalog
#  elif [ -d "/usr/lib/hcatalog" ]; then
#    HCAT_HOME=/usr/lib/hcatalog
#  else
#    HCAT_HOME=${SQOOP_HOME}/../hive-hcatalog
#    if [ ! -d ${HCAT_HOME} ]; then
#      HCAT_HOME=${SQOOP_HOME}/../hcatalog
#    fi
#  fi
#fi
#if [ -z "${ACCUMULO_HOME}" ]; then
#  if [ -d "/usr/lib/accumulo" ]; then
#    ACCUMULO_HOME=/usr/lib/accumulo
#  else
#    ACCUMULO_HOME=${SQOOP_HOME}/../accumulo
#  fi
#fi

## Moved to be a runtime check in sqoop.
#if [ ! -d "${HCAT_HOME}" ]; then
#  echo "Warning: $HCAT_HOME does not exist! HCatalog jobs will fail."
#  echo 'Please set $HCAT_HOME to the root of your HCatalog installation.'
#fi
#if [ ! -d "${ACCUMULO_HOME}" ]; then
#  echo "Warning: $ACCUMULO_HOME does not exist! Accumulo imports will fail."
#  echo 'Please set $ACCUMULO_HOME to the root of your Accumulo installation.'
#fi
Next, verify that Sqoop can actually be used. First run start-dfs.sh and start-yarn.sh to start the Hadoop cluster, then run the hive command to enter the Hive CLI.
In the Hive CLI you can now see the table to be exported (techbbs_2017_06_18), its structure, and its contents.
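For reference, that step amounts to something like the following (the table name comes from this walkthrough; the start scripts are assumed to be on the PATH, as in a standard setup):
ubuntuserver@Master:~$ start-dfs.sh
ubuntuserver@Master:~$ start-yarn.sh
ubuntuserver@Master:~$ hive
hive> show tables;
hive> desc techbbs_2017_06_18;
hive> select * from techbbs_2017_06_18;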
Then log in to MySQL and create the target database and table.
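The exact DDL depends on the schema of your Hive table; the column list below is only a placeholder and has to be replaced so that it matches techbbs_2017_06_18 column-for-column. A rough sketch:
ubuntuserver@Master:~$ mysql -u root -p
mysql> create database techbbs;
mysql> use techbbs;
mysql> -- hypothetical columns: replace with the actual schema of your Hive table
mysql> create table techbbs_logs_stat (logdate varchar(10), pv int, reguser int, ip int, jumper int);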
Verify that Sqoop can connect to MySQL by running:
sqoop list-databases --connect jdbc:mysql://localhost:3306 --username root --password 123456
Note: in my testing, the host in the connection string has to match how Hive's metastore database is configured. If a local MySQL instance is used as the metastore, write localhost; if a remote MySQL server is used as Hive's metastore, write that server's IP address instead.
The output lists a techbbs database, which shows that the connection succeeded.
Now export the data from HDFS into MySQL with the following command:
sqoop export --connect jdbc:mysql://localhost:3306/techbbs --username root --password 123456 --table techbbs_logs_stat --fields-terminated-by '\001' --export-dir '/user/hive/warehouse/techbbs_2017_06_18'
Note that the command is entered as a single line. --fields-terminated-by '\001' matches Hive's default field delimiter (Ctrl-A), and --export-dir points to the table's directory in the Hive warehouse.
The job output shows that one record was successfully exported to MySQL. Now check in MySQL that the data matches:
The data in MySQL is exactly the same as the data queried from HDFS.
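A quick way to check is to query the target table directly, using the database and table names from the export command above:
ubuntuserver@Master:~$ mysql -u root -p -e "select * from techbbs.techbbs_logs_stat;"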
If the export fails with an error like the following:
java.util.ServiceConfigurationError: org.apache.hadoop.fs.FileSystem: Provider org.apache.hadoop.fs.s3a.S3AFileSystem could not be instantiated
    at java.util.ServiceLoader.fail(ServiceLoader.java:232)
    at java.util.ServiceLoader.access$100(ServiceLoader.java:185)
    at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:384)
    at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:404)
    at java.util.ServiceLoader$1.next(ServiceLoader.java:480)
    at org.apache.hadoop.fs.FileSystem.loadFileSystems(FileSystem.java:2631)
    at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2650)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2667)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:94)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:172)
    at org.apache.sqoop.mapreduce.ExportJobBase.getInputPath(ExportJobBase.java:223)
    at org.apache.sqoop.mapreduce.ExportJobBase.getInputFileType(ExportJobBase.java:505)
    at org.apache.sqoop.mapreduce.JdbcExportJob.configureInputFormat(JdbcExportJob.java:68)
    at org.apache.sqoop.mapreduce.ExportJobBase.runExport(ExportJobBase.java:424)
    at org.apache.sqoop.manager.SqlManager.exportTable(SqlManager.java:912)
    at org.apache.sqoop.tool.ExportTool.exportTable(ExportTool.java:81)
    at org.apache.sqoop.tool.ExportTool.run(ExportTool.java:100)
    at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
    at org.apache.sqoop.Sqoop.main(Sqoop.java:236)
Caused by: java.lang.NoClassDefFoundError: com/amazonaws/AmazonServiceException
    at java.lang.Class.getDeclaredConstructors0(Native Method)
    at java.lang.Class.privateGetDeclaredConstructors(Class.java:2671)
    at java.lang.Class.getConstructor0(Class.java:3075)
    at java.lang.Class.newInstance(Class.java:412)
    at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:380)
    ... 23 more
Caused by: java.lang.ClassNotFoundException: com.amazonaws.AmazonServiceException
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 28 more
Solution:
Copy the aws-java-sdk*.jar files from $HADOOP_HOME/share/hadoop/tools/lib into Sqoop's lib directory, and the error will no longer occur.
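With the install paths used in this post, that boils down to something like:
ubuntuserver@Master:~$ sudo cp $HADOOP_HOME/share/hadoop/tools/lib/aws-java-sdk*.jar /usr/local/sqoop-1.4.6/lib/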