Hbase--02

3.客户端操作

shell客户端

3.1HBase数据模型概念：

在hive表或者mysql表中说描述哪一个数据都是说的哪个库里面的哪张表里面的哪一行数据中的哪一列，才能定位到这个数据

但是在hbase中没有库的概念，说一个数据说的是哪一个名称空间下的那一张表下的哪一个行键的哪一个列族下面的哪一个列对应的是这个数据

namespace：doit

table：user_info

Rowkey

Column Family1（列族）

Column Family2（列族）

Name

age

gender

phoneNum

address

job

code

rowkey_001

柳岩

女

88888888

北京....

演员

123

rowkey_002

唐嫣

女

66666666

上海....

演员

213

rowkey_003

大郎

男

44444444

南京....

销售

312

rowkey_004

金莲

女

99999999

东京....

销售

321

...

namespace:hbase中没有数据库的概念,是使用namespace来达到数据库分类别管理表的作用

table：表，一个表包含多行数据

Row Key (行键):一行数据包含一个唯一标识rowkey、多个column以及对应的值。在HBase中，一张表中所有row都按照rowkey的字典序由小到大排序。

Column Family（列族）:在建表的时候指定，不能够随意的删减，一个列族下面可以有多个列(类似于给列进行分组，相同属性的列是一个组，给这个组取个名字叫列族)

Column Qualifier (列):列族下面的列，一个列必然是属于某一个列族的行

Cell:单元格，由（rowkey、column family、qualifier、type、timestamp，value）组成的结构，其中type表示Put/Delete操作类型，timestamp代表这个cell的版本。KV结构存储，其中rowkey、column family、qualifier、type以及timestamp是K，value字段对应KV结构的V。

Timestamp(时间戳):时间戳，每个cell在写入HBase的时候都会默认分配一个时间戳作为该cell的版本，用户也可以在写入的时候自带时间戳。HBase支持多版本特性，即同一rowkey、column下可以有多个value存在，这些value使用timestamp作为版本号，版本越大，表示数据越新。

3.2进入客户端命令：

Shell
如果配置了环境变量：在任意地方敲 hbase shell ，如果没有配置环境变量，需要在bin目录下./hbase shell
[root@linux01 conf]# hbase shell
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/app/hadoop-3.1.1/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/app/hbase-2.2.5/lib/client-facing-thirdparty/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
HBase Shell
Use "help" to get list of supported commands.
Use "exit" to quit this interactive shell.
For Reference,please visit: http://hbase.apache.org/2.0/book.html#shell
Version 2.2.5,rf76a601273e834267b55c0cda12474590283fd4c,2020年 05月 21日星期四 18:34:40 CST
Took 0.0026 seconds
hbase(main):001:0> --代表成功进入了hbase的shell客户端

3.3命令大全

3.3.1通用命令

status: 查看HBase的状态，例如，服务器的数量。

Shell
hbase(main):001:0> status
1 active master,0 backup masters,3 servers,0 dead,0.6667 average load
Took 0.3609 seconds

version: 提供正在使用HBase版本。

Shell
hbase(main):002:0> version
2.2.5,2020年 05月 21日星期四 18:34:40 CST
Took 0.0004 seconds

table_help: 表引用命令提供帮助。

Shell
关于表的一些命令参考
如：
To read the data out,you can scan the table:
hbase> t.scan
which will read all the rows in table 't'.

whoami: 提供有关用户的信息。

Shell
hbase(main):004:0> whoami
root (auth:SIMPLE)
groups: root
Took 0.0098 seconds

3.3.2命名空间相关命令

list_namespace：列出所有的命名空间

Shell
hbase(main):005:0> list_namespace
NAMESPACE
default
hbase
2 row(s)
Took 0.0403 seconds

create_namespace：创建一个命名空间

Shell
hbase(main):002:0> create_namespace doit
NameError: undefined local variable or method `doit' for main:Object

hbase(main):003:0> create_namespace 'doit'
Took 0.2648 seconds

注意哦：名称需要加上引号，不然会报错的

describe_namespace：描述一个命名空间

Shell
hbase(main):004:0> describe_namespace 'doit'
DESCRIPTION
{NAME => 'doit'}
Quota is disabled
Took 0.0710 seconds

drop_namespace：删除一个命名空间

Shell
注意：只能删除空的命名空间，如果里面有表是删除不了的
hbase(main):005:0> drop_namespace 'doit'
Took 0.2461 seconds

--命名空间不为空的话
hbase(main):035:0> drop_namespace 'doit'

ERROR: org.apache.hadoop.hbase.constraint.ConstraintException: Only empty namespaces can be removed. Namespace doit has 1 tables
        at org.apache.hadoop.hbase.master.procedure.DeleteNamespaceProcedure.prepareDelete(DeleteNamespaceProcedure.java:217)
        at org.apache.hadoop.hbase.master.procedure.DeleteNamespaceProcedure.executeFromState(DeleteNamespaceProcedure.java:78)
        at org.apache.hadoop.hbase.master.procedure.DeleteNamespaceProcedure.executeFromState(DeleteNamespaceProcedure.java:45)
        at org.apache.hadoop.hbase.procedure2.StateMachineProcedure.execute(StateMachineProcedure.java:194)
        at org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:962)
        at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1662)
        at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1409)
        at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$1100(ProcedureExecutor.java:78)
        at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1979)

For usage try 'help "drop_namespace"'

Took 0.1448 seconds

alter_namespace：修改namespace其中属性

Shell
hbase(main):038:0> alter_namespace 'doit',{METHOD => 'set','PROPERTY_NAME' => 'PROPERTY_VALUE'}
Took 0.2491 seconds

list_namespace_tables:列出一个命名空间下所有的表

Shell
hbase(main):037:0> list_namespace_tables 'doit'
TABLE
user
1 row(s)
Took 0.0372 seconds
=> ["user"]

3.3.3DDL相关命令

list:列举出默认名称空间下所有的表

Shell
hbase(main):001:0> list
TABLE
doit:user
1 row(s)
Took 0.3187 seconds
=> ["doit:user"]

create:建表

Shell
create ‘xx:t1’,{NAME=>‘f1’,VERSION=>5}
创建表t1并指明命名空间xx
{NAME} f1指的是列族
VERSION 表示版本数
多个列族f1、f2、f3
create ‘t2’,{NAME=>‘f1’},{NAME=>‘f2’},{NAME=>‘f3’}

hbase(main):003:0> create 'doit:student' 'f1','f2','f3'
Created table doit:studentf1
Took 1.2999 seconds
=> Hbase::Table - doit:studentf1

# 创建表得时候预分region
hbase(main):106:0> create 'doit:test','f1',SPLITS => ['rowkey_010','rowkey_020','rowkey_030','rowkey_040']
Created table doit:test
Took 1.3133 seconds
=> Hbase::Table - doit:test

drop:删除表

Shell
hbase(main):006:0> drop 'doit:studentf1'

ERROR: Table doit:studentf1 is enabled. Disable it first.

For usage try 'help "drop"'

Took 0.0242 seconds

注意哦：删除表之前需要禁用表
hbase(main):007:0> disable 'doit:studentf1'
Took 0.7809 seconds
hbase(main):008:0> drop 'doit:studentf1'
Took 0.2365 seconds
hbase(main):009:0>

drop_all：丢弃在命令中给出匹配“regex”的表

Shell
hbase(main):023:0> disable_all 'doit:student.*'
doit:student1
doit:student2
doit:student3
doit:studentf1

Disable the above 4 tables (y/n)?
y
4 tables successfully disabled
Took 4.3497 seconds

hbase(main):024:0> drop_all 'doit:student.*'
doit:student1
doit:student2
doit:student3
doit:studentf1

Drop the above 4 tables (y/n)?
y
4 tables successfully dropped
Took 2.4258 seconds

disable：禁用表

Shell
删除表之前必须先禁用表
hbase(main):007:0> disable 'doit:studentf1'
Took 0.7809 seconds

disable_all：禁用在命令中给出匹配“regex”的表

Shell
hbase(main):023:0> disable_all 'doit:student.*'
doit:student1
doit:student2
doit:student3
doit:studentf1

Disable the above 4 tables (y/n)?
y
4 tables successfully disabled
Took 4.3497 seconds

enable：启用表

Shell
删除表之前必须先禁用表
hbase(main):007:0> enable 'doit:student'
Took 0.7809 seconds

enable_all：启用在命令中给出匹配“regex”的表

Shell
hbase(main):032:0> enable_all 'doit:student.*'
doit:student
doit:student1
doit:student2
doit:student3
doit:student4

Enable the above 5 tables (y/n)?
y
5 tables successfully enabled
Took 5.0114 seconds

is_enabled：判断该表是否是启用的表

Shell
hbase(main):034:0> is_enabled 'doit:student'
true
Took 0.0065 seconds
=> true

is_disabled：判断该表是否是禁用的表

Shell
hbase(main):035:0> is_disabled 'doit:student'
false
Took 0.0046 seconds
=> 1

describe：描述这张表

Shell
hbase(main):038:0> describe 'doit:student'
Table doit:student is ENABLED
doit:student
COLUMN FAMILIES DESCRIPTION
{NAME => 'f1',VERSIONS => '1',EVICT_BLOCKS_ON_CLOSE => 'false',NEW_VERSION_BEHAVIOR => 'false',KEEP_DELETED_CELLS => 'FALSE',CACHE_DATA_ON_WRITE => 'false',DATA_BLO
CK_ENCODING => 'NONE',TTL => 'FOREVER',MIN_VERSIONS => '0',REPLICATION_SCOPE => '0',BLOOMFILTER => 'ROW',CACHE_INDEX_ON_WRITE => 'false',IN_MEMORY => 'false',CACHE
_BLOOMS_ON_WRITE => 'false',PREFETCH_BLOCKS_ON_OPEN => 'false',COMPRESSION => 'NONE',BLOCKCACHE => 'true',BLOCKSIZE => '65536'}

{NAME => 'f2',BLOCKSIZE => '65536'}

{NAME => 'f3',BLOCKSIZE => '65536'}

3 row(s)

QUOTAS
0 row(s)
Took 0.0349 seconds

VERSIONS => '1', -- 版本数量
EVICT_BLOCKS_ON_CLOSE => 'false',
NEW_VERSION_BEHAVIOR => 'false',
KEEP_DELETED_CELLS => 'FALSE', 保留删除的单元格
CACHE_DATA_ON_WRITE => 'false',
DATA_BLOCK_ENCODING => 'NONE',
TTL => 'FOREVER',-- 过期时间
MIN_VERSIONS => '0',-- 最小版本数
REPLICATION_SCOPE => '0',
BLOOMFILTER => 'ROW', --布隆过滤器
CACHE_INDEX_ON_WRITE => 'false',
IN_MEMORY => 'false',-- 内存中
CACHE_BLOOMS_ON_WRITE => 'false',--布隆过滤器
PREFETCH_BLOCKS_ON_OPEN => 'false',
COMPRESSION => 'NONE', -- 压缩格式
BLOCKCACHE => 'true',   -- 块缓存
BLOCKSIZE => '65536' -- 块大小

alter：修改表里面的属性

Shell
hbase(main):040:0> alter 'doit:student',NAME => 'cf1',VERSIONS => 5,TTL => 10
Updating all regions with the new schema...
1/1 regions updated.
Done.
Took 2.1406 seconds

alter_async：直接操作不等待，和上面的alter功能一样

Shell
hbase(main):059:0> alter_async 'doit:student',TTL => 10
Took 1.0268 seconds

alter_status：获取alter命令的执行状态

Shell
hbase(main):060:0> alter_status 'doit:student'
1/1 regions updated.
Done.
Took 1.0078 seconds

list_regions：列出一个表中所有的region

Shell

Examples:
hbase> list_regions 'table_name'
hbase> list_regions 'table_name','server_name'
hbase> list_regions 'table_name',{SERVER_NAME => 'server_name',LOCALITY_THRESHOLD => 0.8}
hbase> list_regions 'table_name',LOCALITY_THRESHOLD => 0.8},['SERVER_NAME']
hbase> list_regions 'table_name',{},['SERVER_NAME','start_key']
hbase> list_regions 'table_name','','start_key']

hbase(main):045:0> list_regions 'doit:student'
SERVER_NAME | REGION_NAME | START_KEY | END_KEY | SIZE | REQ | LOCALITY |
--------------------------- | ------------------------------------------------------------- | ---------- | ---------- | ----- | ----- | ---------- |
linux02,16020,1683636566738 | doit:student,1683642944714.39f7c8772bc476c4d38c663e879d50da. | | | 0 | 0 | 0.0 |
1 rows
Took 0.0145 seconds

locate_region：通过表名和row名方式获取region

Shell
hbase(main):062:0> locate_region 'doit:student','key0'
HOST                                            REGION
linux02:16020                                  {ENCODED => 39f7c8772bc476c4d38c663e879d50da,NAME => 'doit:student,1683642944714.39f7c8772bc476c4d38c663e879d50da.',STARTKEY => '',ENDK
                                                EY => ''}
1 row(s)
Took 0.0027 seconds

show_filters：显示hbase的所有的过滤器

Shell
hbase(main):058:0> show_filters
DependentColumnFilter
KeyOnlyFilter
ColumnCountGetFilter
SingleColumnValueFilter
PrefixFilter
SingleColumnValueExcludeFilter
FirstKeyOnlyFilter
ColumnRangeFilter
ColumnValueFilter
TimestampsFilter
FamilyFilter
QualifierFilter
ColumnPrefixFilter
RowFilter
MultipleColumnPrefixFilter
InclusiveStopFilter
PageFilter
ValueFilter
ColumnPaginationFilter
Took 0.0035 seconds

3.3.4DML相关命令

put插入/更新数据【某一行的某一列】（如果不存在，就插入，如果存在就更新）

Shell
hbase(main):007:0> put 'doit:user_info','rowkey_001','f1:name','zss'
Took 0.0096 seconds
hbase(main):008:0> put 'doit:user_info','f1:age','1'
Took 0.0039 seconds
hbase(main):009:0> put 'doit:user_info','f1:gender','male'
Took 0.0039 seconds
hbase(main):010:0> put 'doit:user_info','f2:phone_num','98889'
Took 0.0040 seconds
hbase(main):011:0> put 'doit:user_info','f2:gender','98889'

注意：put中需要指定哪个命名空间的那个表，然后rowkey是什么，哪个列族下面的哪个列名，然后值是什么
一个个的插入，不能一下子插入多个列名的值

get：获取一个列族中列这个cell

Shell
hbase(main):015:0> get 'doit:user_info','f2:gender'
COLUMN                         CELL
f2:gender                     timestamp=1683646645379,value=123
1 row(s)
Took 0.0242 seconds

hbase(main):016:0> get 'doit:user_info','rowkey_001'
COLUMN                         CELL
f1:age                        timestamp=1683646450598,value=1
f1:gender                     timestamp=1683646458847,value=male
f1:name                       timestamp=1683646443469,value=zss
f2:gender                     timestamp=1683646645379,value=123
f2:phone_num                  timestamp=1683646472508,value=98889
1 row(s)
Took 0.0129 seconds

# 如果遇到中文乱码的问题怎么办呢？在最后加上{'FORMATTER'=>'toString'}参数即可
hbase(main):137:0> get 'doit:student',{'FORMATTER'=>'toString'}
COLUMN                            CELL
f1:name                          timestamp=1683864047691,value=张三
1 row(s)
Took 0.0057 seconds
注意：get是hbase中查询数据最快的方式，但是只能每次返回一个rowkey的数据

scan：扫描表中的所有数据

Shell
hbase(main):012:0> scan 'doit:user_info'
ROW                            COLUMN+CELL
rowkey_001                    column=f1:age,timestamp=1683646450598,value=1
rowkey_001                    column=f1:gender,timestamp=1683646458847,value=male
rowkey_001                    column=f1:name,timestamp=1683646443469,value=zss
rowkey_001                    column=f2:gender,timestamp=1683646483495,value=98889
rowkey_001                    column=f2:phone_num,timestamp=1683646472508,value=98889
1 row(s)
Took 0.1944 seconds

scan 'tbname',{Filter（过滤器）}
scan 'itcast:t2'
#rowkey前缀过滤器
scan 'itcast:t2',{ROWPREFIXFILTER => '2021'}
scan 'itcast:t2',{ROWPREFIXFILTER => '202101'}
#rowkey范围过滤器
#STARTROW：从某个rowkey开始，包含，闭区间
#STOPROW：到某个rowkey结束，不包含，开区间
scan 'itcast:t2',{STARTROW=>'20210101_000'}
scan 'itcast:t2',{STARTROW=>'20210201_001'}
scan 'itcast:t2',{STARTROW=>'20210101_000',STOPROW=>'20210201_001'}
scan 'itcast:t2',{STARTROW=>'20210201_001',STOPROW=>'20210301_007'}

|- 在Hbase数据检索，==尽量走索引查询：按照Rowkey条件查询==

- 尽量避免走全表扫描

- 索引查询：有一本新华字典，这本字典可以根据拼音检索，找一个字，先找目录，找字
- 全表扫描：有一本新华字典，这本字典没有检索目录，找一个字，一页一页找

- ==Hbase所有Rowkey的查询都是前缀匹配==

# 如果遇到中文乱码的问题怎么办呢？在最后加上{'FORMATTER'=>'toString'}参数即可
hbase(main):130:0> scan 'doit:student',{'FORMATTER'=>'toString'}
ROW                               COLUMN+CELL
rowkey_001                       column=f1:name,timestamp=1683863389259,value=张三
1 row(s)
Took 0.0063 seconds

incr：一般用于自动计数的，不用记住上一次的值，直接做自增

Shell
注意哦：因为shell往米面设置的value的值是String类型的
hbase(main):005:0> incr 'doit:student','rowkey002','f1:age'
COUNTER VALUE = 1
Took 0.1877 seconds
hbase(main):006:0> incr 'doit:student','f1:age'
COUNTER VALUE = 2
Took 0.0127 seconds
hbase(main):007:0> incr 'doit:student','f1:age'
COUNTER VALUE = 3
Took 0.0079 seconds
hbase(main):011:0> incr 'doit:student','f1:age'
COUNTER VALUE = 4
Took 0.0087 seconds

count：统计一个表里面有多少行数据

Shell
hbase(main):031:0> count 'doit:user_info'
1 row(s)
Took 0.0514 seconds
=> 1

delete删除某一行中列对应的值

Shell
# 删除某一行中列对应的值
hbase(main):041:0> delete 'doit:student','f1:id'
Took 0.0152 seconds

deleteall：删除一行数据

Shell
# 根据rowkey删除一行数据
hbase(main):042:0> deleteall 'doit:student','rowkey_001'
Took 0.0065 seconds

append：追加，假如该列不存在添加新列，存在将值追加到最后

Shell
# 再原有值得基础上追加值
hbase(main):098:0> append 'doit:student','hheda'
CURRENT VALUE = zsshheda
Took 0.0070 seconds

hbase(main):100:0> get 'doit:student','f1:name'
COLUMN                       CELL
f1:name                     timestamp=1683861530789,value=zsshheda
1 row(s)
Took 0.0057 seconds

#注意：如果原来没有这个列，会自动添加一个列，然后将值set进去
hbase(main):101:0> append 'doit:student','f1:name1','hheda'
CURRENT VALUE = hheda
Took 0.0063 seconds
hbase(main):102:0> get 'doit:student','f1:name1'
COLUMN                       CELL
f1:name1                    timestamp=1683861631392,value=hheda
1 row(s)
Took 0.0063 seconds

truncate：清空表里面所有的数据

Shell
#执行流程
先disable表
然后再drop表
最后重新create表

hbase(main):044:0> truncate 'doit:student'
Truncating 'doit:student' table (it may take a while):
Disabling table...
Truncating table...
Took 2.5457 seconds

truncate_preserve:清空表但保留分区

Shell
hbase(main):008:0> truncate_preserve 'doit:test'
Truncating 'doit:test' table (it may take a while):
Disabling table...
Truncating table...
Took 4.1352 seconds
hbase(main):009:0> list_regions 'doit:test'
                 SERVER_NAME |                                                          REGION_NAME | START_KEY |    END_KEY | SIZE |   REQ |   LOCALITY |
--------------------------- | -------------------------------------------------------------------- | ---------- | ---------- | ----- | ----- | ---------- |
linux03,1684200651855 |           doit:test,1684205468848.920ae3e043ad95890c4f5693cb663bc5. |            | rowkey_010 |     0 |     0 |        0.0 |
linux01,1684205091382 | doit:test,rowkey_010,1684205468848.f8a21615be51f42c562a2338b1efa409. | rowkey_010 | rowkey_020 |     0 |     0 |        0.0 |
linux02,1684200651886 | doit:test,rowkey_020,1684205468848.25d62e8cc2fdaecec87234b8d28f0827. | rowkey_020 | rowkey_030 |     0 |     0 |        0.0 |
linux03,1684200651855 | doit:test,rowkey_030,1684205468848.2b0468e6643b95159fa6e210fa093e66. | rowkey_030 | rowkey_040 |     0 |     0 |        0.0 |
linux01,rowkey_040,1684205468848.fb12c09c7c73cfeff0bf79b5dda076cb. | rowkey_040 |            |     0 |     0 |        0.0 |
5 rows
Took 0.1019 seconds

get_counter：获取计数器

Shell
hbase(main):017:0> incr 'doit:student','f1:name2'
COUNTER VALUE = 1
Took 0.0345 seconds
hbase(main):018:0> incr 'doit:student','f1:name2'
COUNTER VALUE = 2
Took 0.0066 seconds
hbase(main):019:0> incr 'doit:student','f1:name2'
COUNTER VALUE = 3
Took 0.0059 seconds
hbase(main):020:0> incr 'doit:student','f1:name2'
COUNTER VALUE = 4
Took 0.0061 seconds
hbase(main):021:0> incr 'doit:student','f1:name2'
COUNTER VALUE = 5
Took 0.0064 seconds
hbase(main):022:0> incr 'doit:student','f1:name2'
COUNTER VALUE = 6
Took 0.0062 seconds
hbase(main):023:0> incr 'doit:student','f1:name2'
COUNTER VALUE = 7
Took 0.0066 seconds
hbase(main):024:0> incr 'doit:student','f1:name2'
COUNTER VALUE = 8
Took 0.0059 seconds
hbase(main):025:0> incr 'doit:student','f1:name2'
COUNTER VALUE = 9
Took 0.0063 seconds
hbase(main):026:0> incr 'doit:student','f1:name2'
COUNTER VALUE = 10
Took 0.0061 seconds
hbase(main):027:0> get_counter 'doit:student','f1:name2'
COUNTER VALUE = 10
Took 0.0040 seconds
hbase(main):028:0>

get_splits：用于获取表所对应的region数个数

Shell
hbase(main):148:0> get_splits 'doit:test'
Total number of splits = 5
rowkey_010
rowkey_020
rowkey_030
rowkey_040
Took 0.0120 seconds
=> ["rowkey_010","rowkey_020","rowkey_030","rowkey_040"]

尖叫总结：实际生产中很少通过hbase shell去操作hbase，更多的是学习测试，问题排查等等才会使用到hbase shell,hbase总的来说就是写数据，然后查询。 前者是通过API bulkload等形式写数据，后者通过api调用查询。

java客户端

3.4导入maven依赖

XML
<dependencies>
    <dependency>
        <groupId>org.apache.zookeeper</groupId>
        <artifactId>zookeeper</artifactId>
        <version>3.4.6</version>
    </dependency>

    <dependency>
        <groupId>org.apache.hbase</groupId>
        <artifactId>hbase-client</artifactId>
        <version>2.2.5</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-client</artifactId>
        <version>3.2.1</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-common</artifactId>
        <version>3.2.1</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hbase</groupId>
        <artifactId>hbase-server</artifactId>
        <version>2.2.5</version>
    </dependency>
    
    <dependency>
        <groupId>org.apache.hbase</groupId>
        <artifactId>hbase-mapreduce</artifactId>
        <version>2.2.5</version>
    </dependency>
    <dependency>
        <groupId>com.google.code.gson</groupId>
        <artifactId>gson</artifactId>
        <version>2.8.5</version>
    </dependency>
    
    <dependency>
        <groupId>org.apache.phoenix</groupId>
        <artifactId>phoenix-core</artifactId>
        <version>5.0.0-HBase-2.0</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-auth</artifactId>
        <version>3.1.2</version>
    </dependency>
</dependencies>
<build>
    <plugins>
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-compiler-plugin</artifactId>
            <version>3.5.1</version>
            <configuration>
                <source>1.8</source>
                <target>1.8</target>
            </configuration>
        </plugin>

        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-assembly-plugin</artifactId>
            <version>2.6</version>
            <configuration>
                <descriptorRefs>
                    <descriptorRef>jar-with-dependencies</descriptorRef>
                </descriptorRefs>
            </configuration>
            <executions>
                <execution>
                    <id>make-assembly</id>
                    
                    <phase>package</phase>
                    <goals>
                        <goal>single</goal>
                    </goals>
                </execution>
            </executions>
        </plugin>

    </plugins>
</build>

获取hbase的连接，list出所有的表

Java
package com.doit.day01;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

import java.io.IOException;

/**
* Hbase的java客户端连接hbase的时候，只需要连接zookeeper的集群
* 就可以找到你Hbase集群的位置
* 核心的对象：
* Configuration：HbaseConfiguration.create();
* Connection:ConnectionFactory.createConnection(conf);
* table:conn.getTable(TableName.valueOf("tb_b")); 对表进行操作 DML
* Admin：conn.getAdmin();操作Hbase系统DDL，对名称空间等进行操作
*/
public class ConnectionDemo {
    public static void main(String[] args) throws Exception {
        //获取到hbase的配置文件对象
        Configuration conf = HBaseConfiguration.create();
        //针对配置文件设置zk的集群地址
        conf.set("hbase.zookeeper.quorum","linux01:2181,linux02:2181,linux03:2181");
        //创建hbase的连接对象
        Connection conn = ConnectionFactory.createConnection(conf);

        //获取到操作hbase的对象
        Admin admin = conn.getAdmin();

        //调用api获取到所有的表
        TableName[] tableNames = admin.listTableNames();

        //获取到哪个命名空间下的所有的表
        TableName[] doits = admin.listTableNamesByNamespace("doit");

        for (TableName tableName : doits) {
            byte[] name = tableName.getName();
            System.out.println(new String(name));
        }

        conn.close();
    }
}

获取到所有的命名空间

创建一个命名空间

Java
package com.doit.day01;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.NamespaceDescriptor;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.protobuf.generated.HBaseProtos;

import java.util.Properties;

/**
* Hbase的java客户端连接hbase的时候，只需要连接zookeeper的集群
* 就可以找到你Hbase集群的位置
* 核心的对象：
* Configuration：HbaseConfiguration.create();
* Connection:ConnectionFactory.createConnection(conf);
* Admin：conn.getAdmin();操作Hbase系统DDL，对名称空间等进行操作
*/
public class CreateNameSpaceDemo {
    public static void main(String[] args) throws Exception {
        //获取到hbase的配置文件对象
        Configuration conf = HBaseConfiguration.create();
        //针对配置文件设置zk的集群地址
        conf.set("hbase.zookeeper.quorum",linux03:2181");
        //创建hbase的连接对象
        Connection conn = ConnectionFactory.createConnection(conf);

        //获取到操作hbase的对象
        Admin admin = conn.getAdmin();

        //获取到命名空间描述器的构建器
        NamespaceDescriptor.Builder spaceFromJava = NamespaceDescriptor.create("spaceFromJava");
        //当然还可以给命名空间设置属性
        spaceFromJava.addConfiguration("author","robot_jiang");
        spaceFromJava.addConfiguration("desc","this is my first java namespace...");
        //拿着构建器构建命名空间的描述器
        NamespaceDescriptor build = spaceFromJava.build();
        //创建命名空间
        admin.createNamespace(build);

        conn.close();
    }
}

创建带有多列族的表

Java
package com.doit.day01;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.NamespaceDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.protobuf.generated.TableProtos;

import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Map;
import java.util.Set;

/**
* Hbase的java客户端连接hbase的时候，只需要连接zookeeper的集群
* 就可以找到你Hbase集群的位置
* 核心的对象：
* Configuration：HbaseConfiguration.create();
* Connection:ConnectionFactory.createConnection(conf);
* Admin：conn.getAdmin();操作Hbase系统DDL，对名称空间等进行操作
*/
public class CreateTableDemo {
    public static void main(String[] args) throws Exception {
        //获取到hbase的配置文件对象
        Configuration conf = HBaseConfiguration.create();
        //针对配置文件设置zk的集群地址
        conf.set("hbase.zookeeper.quorum",linux03:2181");
        //创建hbase的连接对象
        Connection conn = ConnectionFactory.createConnection(conf);

        Admin admin = conn.getAdmin();

        //获取到操作hbase操作表的对象
        TableDescriptorBuilder java = TableDescriptorBuilder.newBuilder(TableName.valueOf("java"));

        //表添加列族需要集合的方式
        ArrayList<ColumnFamilyDescriptor> list = new ArrayList<>();
        //构建一个列族的构造器
        ColumnFamilyDescriptorBuilder col1 = ColumnFamilyDescriptorBuilder.newBuilder("f1".getBytes(StandardCharsets.UTF_8));
        ColumnFamilyDescriptorBuilder col2 = ColumnFamilyDescriptorBuilder.newBuilder("f2".getBytes(StandardCharsets.UTF_8));
        ColumnFamilyDescriptorBuilder col3 = ColumnFamilyDescriptorBuilder.newBuilder("f3".getBytes(StandardCharsets.UTF_8));
        //构建列族
        ColumnFamilyDescriptor build1 = col1.build();
        ColumnFamilyDescriptor build2 = col2.build();
        ColumnFamilyDescriptor build3 = col3.build();
        //将列族添加到集合中去
        list.add(build1);
        list.add(build2);
        list.add(build3);

        //给表设置列族
        java.setColumnFamilies(list);
        //构建表的描述器
        TableDescriptor build = java.build();
        //创建表
        admin.createTable(build);

        conn.close();
    }
}

向表中添加数据

Java
package com.doit.day01;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.*;

import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;

/**
* 注意：put数据需要指定往哪个命名空间的哪个表的哪个rowKey的哪个列族的哪个列中put数据，put的值是什么
*/
public class PutDataDemo {
    public static void main(String[] args) throws Exception {
        //获取到hbase的配置文件对象
        Configuration conf = HBaseConfiguration.create();
      //针对配置文件设置zk的集群地址
        conf.set("hbase.zookeeper.quorum",linux03:2181");
        //创建hbase的连接对象
        Connection conn = ConnectionFactory.createConnection(conf);

        Admin admin = conn.getAdmin();

        //指定往哪一张表中put数据
        Table java = conn.getTable(TableName.valueOf("java"));
        //创建put对象，设置rowKey
        Put put = new Put("rowkey_001".getBytes(StandardCharsets.UTF_8));
        put.addColumn("f1".getBytes(StandardCharsets.UTF_8),"name".getBytes(StandardCharsets.UTF_8),"xiaotao".getBytes(StandardCharsets.UTF_8));
        put.addColumn("f1".getBytes(StandardCharsets.UTF_8),"age".getBytes(StandardCharsets.UTF_8),"42".getBytes(StandardCharsets.UTF_8));

        Put put1 = new Put("rowkey_002".getBytes(StandardCharsets.UTF_8));
        put1.addColumn("f1".getBytes(StandardCharsets.UTF_8),"xiaotao".getBytes(StandardCharsets.UTF_8));
        put1.addColumn("f1".getBytes(StandardCharsets.UTF_8),"42".getBytes(StandardCharsets.UTF_8));

        Put put2 = new Put("rowkey_003".getBytes(StandardCharsets.UTF_8));
        put2.addColumn("f1".getBytes(StandardCharsets.UTF_8),"xiaotao".getBytes(StandardCharsets.UTF_8));
        put2.addColumn("f1".getBytes(StandardCharsets.UTF_8),"42".getBytes(StandardCharsets.UTF_8));

        java.put(Arrays.asList(put,put1,put2));

        conn.close();
    }
}

get表中的数据

Java
package com.doit.day01;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.*;

import java.nio.charset.StandardCharsets;

/**
* 注意：put数据需要指定往哪个命名空间的哪个表的哪个rowKey的哪个列族的哪个列中put数据，put的值是什么
*/
public class GetDataDemo {
    public static void main(String[] args) throws Exception {
        //获取到hbase的配置文件对象
       Configuration conf = HBaseConfiguration.create();
        //针对配置文件设置zk的集群地址
        conf.set("hbase.zookeeper.quorum",linux03:2181");
        //创建hbase的连接对象
        Connection conn = ConnectionFactory.createConnection(conf);

      //指定往哪一张表中put数据
        Table java = conn.getTable(TableName.valueOf("java"));

        Get get = new Get("rowkey_001".getBytes(StandardCharsets.UTF_8));
//        get.addFamily("f1".getBytes(StandardCharsets.UTF_8));
        get.addColumn("f1".getBytes(StandardCharsets.UTF_8),"name".getBytes(StandardCharsets.UTF_8));
        Result result = java.get(get);
        boolean advance = result.advance();
        if(advance){
            Cell current = result.current();
            String family = new String(CellUtil.cloneFamily(current));
            String qualifier = new String(CellUtil.cloneQualifier(current));
            String row = new String(CellUtil.cloneRow(current));
            String value = new String(CellUtil.cloneValue(current));
           System.out.println(row+","+family+","+qualifier+","+value);
        }

        conn.close();
    }
}

scan表中的数据

Java
package com.doit.day01;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.*;
import org.apache.hadoop.hbase.client.*;

import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Iterator;

/**
* 注意：put数据需要指定往哪个命名空间的哪个表的哪个rowKey的哪个列族的哪个列中put数据，put的值是什么
*/
public class ScanDataDemo {
    public static void main(String[] args) throws Exception {
        //获取到hbase的配置文件对象
        Configuration conf = HBaseConfiguration.create();
        //针对配置文件设置zk的集群地址
        conf.set("hbase.zookeeper.quorum",linux03:2181");
        //创建hbase的连接对象
        Connection conn = ConnectionFactory.createConnection(conf);

        //指定往哪一张表中put数据
        Table java = conn.getTable(TableName.valueOf("java"));

        Scan scan = new Scan();
        scan.withStartRow("rowkey_001".getBytes(StandardCharsets.UTF_8));
        scan.withStopRow("rowkey_004".getBytes(StandardCharsets.UTF_8));

        ResultScanner scanner = java.getScanner(scan);
        Iterator<Result> iterator = scanner.iterator();
        while (iterator.hasNext()){
            Result next = iterator.next();
            while (next.advance()){
                Cell current = next.current();
                String family = new String(CellUtil.cloneFamily(current));
                String row = new String(CellUtil.cloneRow(current));
                String qualifier = new String(CellUtil.cloneQualifier(current));
                String value = new String(CellUtil.cloneValue(current));
                System.out.println(row+","+value);
            }
        }

        conn.close();
    }
}

删除一行数据

Java
package com.doit.day02;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.*;

import java.io.IOException;
import java.nio.charset.StandardCharsets;

public class _12_删除一行数据 {
    public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        conf.set("hbase.zookeeper.quorum","linux01");

        Connection conn = ConnectionFactory.createConnection(conf);

        Table java = conn.getTable(TableName.valueOf("java"));

        Delete delete = new Delete("rowkey_001".getBytes(StandardCharsets.UTF_8));

        java.delete(delete);

    }
}

Hbase的客户端操作特别的麻烦 .

使用JAVA客户端连接也是特别麻烦的一个.代码输入繁琐不好记忆

越往后学习会发现前面的内容对后面也非常的重要 . 比如JAVA的内容学习会用影响到会面使用,特别像这种需要建立连接内容的 . 将其他的服务端连接到JAVA .

细水长流,前面的内容会直接的影响后面的使用 . 特别是像我这种记性特别差 . 学了就忘的人 . 经常的整理笔记,写Xmind (思维导图) 加深映像 .

Hbase--02

相关文章