
Integrating Spark SQL with Hive

Published: 2016-05-05 09:55:50

Hive configuration

Edit $HIVE_HOME/conf/hive-site.xml and add the following:

<property>
  <name>hive.metastore.uris</name>
  <value>thrift://master:9083</value>
  <description>Thrift URI for the remote metastore. Used by metastore client to connect to remote metastore.</description>
</property>
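Note that this <property> element must sit inside the file's <configuration> root element. A minimal sketch of a complete hive-site.xml carrying only this setting (any metastore database settings you already have would sit alongside it):

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Point Hive (and Spark SQL) clients at the remote metastore service -->
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://master:9083</value>
  </property>
</configuration>
```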

Start the Hive metastore

Start the metastore:

    $ hive --service metastore &

Check that it is running:

    $ jobs
    [1]+  Running                 hive --service metastore &

Stop it:

    $ kill %1

(kill %jobid — here 1 is the job id shown by jobs.)
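The start/stop steps above can be wrapped in a small helper. This is a hypothetical sketch, not part of the original article: it records the background PID in a file instead of relying on shell job numbers, which keeps working after the shell that launched the metastore exits. `sleep 60` stands in for `hive --service metastore` so the sketch runs anywhere; both the stand-in command and the PID file path are assumptions.

```shell
#!/bin/sh
# Hypothetical helper: manage a long-running service via a PID file rather
# than shell job ids. On a real node, set SERVICE_CMD="hive --service metastore".
SERVICE_CMD=${SERVICE_CMD:-"sleep 60"}   # stand-in so the sketch is runnable
PID_FILE=${PID_FILE:-/tmp/metastore.pid}

start_service() {
  nohup $SERVICE_CMD >/dev/null 2>&1 &
  echo $! > "$PID_FILE"                  # remember the background PID
}

stop_service() {
  kill "$(cat "$PID_FILE")" 2>/dev/null  # the equivalent of `kill %1`
  rm -f "$PID_FILE"
}

start_service
```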

Spark configuration

- Copy or symlink $HIVE_HOME/conf/hive-site.xml into $SPARK_HOME/conf/
- Copy or symlink $HIVE_HOME/lib/mysql-connector-java-5.1.12.jar into $SPARK_HOME/lib/
- Placing the jar under $SPARK_HOME/lib/ is convenient for Spark standalone mode
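The two copy-or-symlink steps can be scripted. The sketch below creates throwaway stand-in directories with mktemp so it runs anywhere; on a real machine you would point HIVE_HOME and SPARK_HOME at the actual installations and keep only the two ln -sf lines.

```shell
# Stand-in layout so the sketch is self-contained (hypothetical paths);
# on a real cluster HIVE_HOME/SPARK_HOME are the actual install dirs.
HIVE_HOME=$(mktemp -d)
SPARK_HOME=$(mktemp -d)
mkdir -p "$HIVE_HOME/conf" "$HIVE_HOME/lib" "$SPARK_HOME/conf" "$SPARK_HOME/lib"
touch "$HIVE_HOME/conf/hive-site.xml" \
      "$HIVE_HOME/lib/mysql-connector-java-5.1.12.jar"

# The actual integration steps from the article:
ln -sf "$HIVE_HOME/conf/hive-site.xml" "$SPARK_HOME/conf/hive-site.xml"
ln -sf "$HIVE_HOME/lib/mysql-connector-java-5.1.12.jar" \
       "$SPARK_HOME/lib/mysql-connector-java-5.1.12.jar"
```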

Starting spark-sql

  1. Standalone mode (note the master URL is spark://, not spark:):

    ./bin/spark-sql --master spark://master:7077 --jars /home/stark_summer/spark/spark-1.4/spark-1.4.1/lib/mysql-connector-java-5.1.12.jar

  2. yarn-client mode:

    $ ./bin/spark-sql --master yarn-client --jars /home/stark_summer/spark/spark-1.4/spark-1.4.1/lib/mysql-connector-java-5.1.12.jar

    Run a query:

    select count(*) from o2o_app;

    Result:

    302
    Time taken: 0.828 seconds, Fetched 1 row(s)

    The CLI then prints a batch of StatsReportListener INFO lines summarizing the finished stage. Because the stage ran a single task, every percentile (0% through 100%) is identical: task runtime 242.0 ms, fetch wait time 0.0 ms, remote bytes read 31.0 B, task result size 1228.0 B, executor (non-fetch) time 70%, other time 30%.
  3. yarn-cluster mode:

    ./bin/spark-sql --master yarn-cluster --jars /home/dp/spark/spark-1.4/spark-1.4.1/lib/mysql-connector-java-5.1.12.jar
    Error: Cluster deploy mode is not applicable to Spark SQL shell.
    Run with --help for usage help or --verbose for debug output
    2015-09-14 18:28:28,291 INFO  [Thread-0] util.Utils (Logging.scala:logInfo(59)) - Shutdown hook called

    Cluster deploy mode is not supported: spark-sql is an interactive shell, so its driver must run on the client.

Starting spark-shell

  1. Standalone mode (again using a spark:// master URL):

    ./bin/spark-shell --master spark://master:7077 --jars /home/stark_summer/spark/spark-1.4/spark-1.4.1/lib/mysql-connector-java-5.1.12.jar
  2. yarn-client mode:

    ./bin/spark-shell --master yarn-client --jars /home/dp/spark/spark-1.4/spark-1.4.1/lib/mysql-connector-java-5.1.12.jar

    Then, inside the shell, query the Hive table through sqlContext:

    scala> sqlContext.sql("from o2o_app SELECT count(appkey,name1,name2)").collect().foreach(println)

Please respect the original work and do not repost: http://blog.csdn.net/stark_summer/article/details/48443147
