我们公司用的是 HBase
HBase是一个开源的NoSQL产品,它是实现了Google BigTable论文的一个开源产品,和Hadoop和HDFS一起,可用来存储和处理海量column family的数据。官方网址是:http://hbase.apache.org。Hbase的体系结构比较复杂,本文只探讨基本的安装测试问题,首先从镜像下载最新版的HBase:
[root@localhost hbase]# wget http://mirror.bjtu.edu.cn/apache/hbase/stable/hbase-0.90.3.tar.gz
Hbase是Java开发的,先检查下本地的JDK版本是否在1.6以上:
[root@localhost hbase]# java -version
java version “1.6.0_24″
解压并配置数据文件的路径:
[root@localhost hbase]# tar zxvf hbase-0.90.3.tar.gz
[root@localhost hbase]# cd hbase-0.90.3/conf
[root@localhost conf]# vi hbase-site.xml
</configuration>
<configuration><property>
<name>hbase.rootdir</name>
<value>/home/banping/hbase/data</value>
</property>
Hbase依赖于Hadoop来运行,它有两种配置方式,一种是比较简单的单实例方式,它已经内置了0.20版本的Hadoop包和一个本地Zookeeper,它们运行在同一个JVM之下,可以用本地文件系统而不用HDFS。另外一种比较复杂的分布式部署,需要在每一个节点替换自带的Hadoop包以免有版本问题,而且必须有一个HDFS实例,同时需要独立的ZooKeeper集群。为简单起见,我们选择单实例部署的方式,启动Hbase进程:
[root@localhost hbase-0.90.3]# bin/start-hbase.sh
starting master, logging to /home/banping/hbase/hbase-0.90.3/bin/../logs/hbase-root-master-localhost.localdomain.out
[root@localhost webtest]# netstat -tnlp|grep java
tcp 0 0 :::2181 :::* LISTEN 9880/java
tcp 0 0 :::60010 :::* LISTEN 9880/java
tcp 0 0 ::ffff:127.0.0.1:63148 :::* LISTEN 9880/java
tcp 0 0 ::ffff:127.0.0.1:35539 :::* LISTEN 9880/java
tcp 0 0 :::60030 :::* LISTEN 9880/java
可以看到启动了5个端口,可以通过自带的shell命令来进行基本的操作:
[root@localhost hbase-0.90.3]# bin/hbase shell
hbase(main):002:0> create ‘test’,'cf’
0 row(s) in 0.9940 seconds
hbase(main):019:0> list
TABLE
test
1 row(s) in 0.0290 seconds
hbase(main):022:0> put ‘test’, ‘row1′, ‘cf:a’, ‘value1′
0 row(s) in 0.2130 seconds
hbase(main):023:0> put ‘test’, ‘row2′, ‘cf:b’, ‘value2′
0 row(s) in 0.0120 seconds
hbase(main):024:0> put ‘test’, ‘row3′, ‘cf:c’, ‘value3′
0 row(s) in 0.0130 seconds
hbase(main):025:0> scan ‘test’
ROW COLUMN+CELL
row1 column=cf:a, timestamp=1310027460885, value=value1
row2 column=cf:b, timestamp=1310027469458, value=value2
row3 column=cf:c, timestamp=1310027476262, value=value3
3 row(s) in 0.0870 seconds
hbase(main):026:0> get ‘test’, ‘row1′
COLUMN CELL
cf:a timestamp=1310027460885, value=value1
1 row(s) in 0.0250 seconds
hbase(main):027:0> disable ‘test’
0 row(s) in 2.0920 seconds
hbase(main):029:0> drop ‘test’
0 row(s) in 1.1440 seconds
hbase(main):030:0> exit
停止Hbase实例:
[root@localhost hbase-0.90.3]# ./bin/stop-hbase.sh
stopping hbase……
如果使用PHP操作Hbase,一种途径是使用Facebook开源出来的thrift,官网是:http://thrift.apache.org/ ,它是一个类似ice的中间件,用于不同系统语言间信息交换。首先下载最新的版本:
[root@localhost hbase]# wget http://mirror.bjtu.edu.cn/apache//thrift/0.6.1/thrift-0.6.1.tar.gz
安装需要的依赖包:
[root@localhost thrift-0.6.1]# sudo yum install automake libtool flex bison pkgconfig gcc-c++ boost-devel libevent-devel zlib-devel python-devel ruby-devel
[root@localhost thrift-0.6.1]# ./configure –prefix=/home/banping/hbase/thrift –with-php-config=/usr/local/php/bin/[root@localhost thrift-0.6.1]# make[root@localhost thrift-0.6.1]# make install
[root@localhost hbase]# thrift/bin/thrift –gen php /home/banping/hbase/hbase-0.90.3/src/main/resources/org/apache/hadoop/hbase/thrift/Hbase.thrift[root@localhost hbase]# cd gen-php/Hbase[root@localhost Hbase]# lltotal 320-rw-r–r– 1 root root 285433 Jul 7 19:22 Hbase.php-rw-r–r– 1 root root 27426 Jul 7 19:22 Hbase_types.php
[root@localhost Hbase]# cp -a /home/banping/hbase/thrift-0.6.1/lib/php /home/webtest/thrift/[root@localhost Hbase]# cd /home/webtest/thrift/[root@localhost thrift]# mkdir packages[root@localhost thrift]# cp -a /home/banping/hbase/gen-php/Hbase /home/webtest/thrift/packages
据一些资料说也可以使用PHP扩展的方式来使用thrift(php和HBase接口文件仍然是必须的),但是我测试没有通过,做法如下:
[root@localhost thrift]# cd ext/thrift_protocol[root@localhost thrift_protocol]# /usr/local/php/bin/phpize[root@localhost thrift_protocol]# ./configure –with-php-config=/usr/local/php/bin/php-config –enable-thrift_protocol[root@localhost thrift_protocol]# make[root@localhost thrift_protocol]# make install
[root@localhost hbase-0.90.3]# ./bin/start-hbase.sh[root@localhost hbase-0.90.3]# ./bin/hbase-daemon.sh start thriftstarting thrift, logging to /home/banping/hbase/hbase-0.90.3/bin/../logs/hbase-root-thrift-localhost.localdomain.out
thrift默认监听的端口是9090
[root@localhost webtest]# netstat -tnlp|grep java
tcp 0 0 ::ffff:127.0.0.1:9090 :::* LISTEN 10069/java
tcp 0 0 :::2181 :::* LISTEN 9880/java
tcp 0 0 :::60010 :::* LISTEN 9880/java
tcp 0 0 ::ffff:127.0.0.1:63148 :::* LISTEN 9880/java
tcp 0 0 ::ffff:127.0.0.1:35539 :::* LISTEN 9880/java
tcp 0 0 :::60030 :::* LISTEN 9880/java
写一个PHP文件来测试:
[root@localhost webtest]# vi hbase.php<?php$GLOBALS['THRIFT_ROOT'] = ‘/home/webtest/thrift/src’;require_once( $GLOBALS['THRIFT_ROOT'].’/Thrift.php’ );require_once( $GLOBALS['THRIFT_ROOT'].’/transport/TSocket.php’ );require_once( $GLOBALS['THRIFT_ROOT'].’/transport/TBufferedTransport.php’ );require_once( $GLOBALS['THRIFT_ROOT'].’/protocol/TBinaryProtocol.php’ );require_once( $GLOBALS['THRIFT_ROOT'].’/packages/Hbase/Hbase.php’ );$socket = new TSocket( ‘localhost’, 9090 );$socket->setSendTimeout( 10000 ); // Ten seconds (too long for production, but this is just a demo ;)$socket->setRecvTimeout( 20000 ); // Twenty seconds$transport = new TBufferedTransport( $socket );$protocol = new TBinaryProtocol( $transport );$client = new HbaseClient( $protocol );$transport->open();echo( “listing tables…\n” );$tables = $client->getTableNames();sort( $tables );foreach ( $tables as $name ) {echo( ” found: {$name}\n” );}$columns = array(new ColumnDescriptor( array(‘name’ => ‘entry:’,‘maxVersions’ => 10) ),new ColumnDescriptor( array(‘name’ => ‘unused:’) ));$t = “table1″;echo( “creating table: {$t}\n” );try {$client->createTable( $t, $columns );} catch ( AlreadyExists $ae ) {echo( “WARN: {$ae->message}\n” );}$t = “test”;echo( “column families in {$t}:\n” );$descriptors = $client->getColumnDescriptors( $t );asort( $descriptors );foreach ( $descriptors as $col ) {echo( ” column: {$col->name}, maxVer: {$col->maxVersions}\n” );}$t = “table1″;echo( “column families in {$t}:\n” );$descriptors = $client->getColumnDescriptors( $t );asort( $descriptors );foreach ( $descriptors as $col ) {echo( ” column: {$col->name}, maxVer: {$col->maxVersions}\n” );}$t = “table1″;$row = “row_name”;$valid = “foobar-\xE7\x94\x9F\xE3\x83\x93″;$mutations = array(new Mutation( array(‘column’ => ‘entry:foo’,‘value’ => $valid) ),);$client->mutateRow( $t, $row, $mutations );$table_name = “table1″;$row_name = ‘row_name’;$fam_col_name = ‘entry:foo’;$arr = $client->get($table_name, $row_name , $fam_col_name);// $arr = arrayforeach ( $arr as $k=>$v ) {// $k = TCellecho (“value = {$v->value} , <br> ”);echo (“timestamp = {$v->timestamp} <br>”);}$table_name = “table1″;$row_name = “row_name”;$arr = $client->getRow($table_name, $row_name);// $client->getRow return a arrayforeach ( $arr as $k=>$TRowResult ) {// $k = 0 ; non-use// $TRowResult = TRowResultvar_dump($TRowResult);}$transport->close();?>