HBase table size on HDFS is 4x the actual input file
I'm new to this forum and to HDFS/HBase.
I created a table in HBase on HDFS. The file I loaded has 10 million records and is 1 GB in size on the Windows disk. After loading it into HDFS, the size of the table on HDFS is:
root@narmada:~/agni/hdfs/hadoop-1.1.2# ./bin/hadoop fs -dus /hbase/hdfs_10m
hdfs://192.168.5.58:54310/hbase/hdfs_10m    4143809619
Can anyone please help me reduce this size?
Table details:
DESCRIPTION                                                          ENABLED
 'hdfs_10m', {NAME => 'v', DATA_BLOCK_ENCODING => 'NONE',            true
 BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', VERSIONS => '3',
 COMPRESSION => 'NONE', MIN_VERSIONS => '0', TTL => '2147483647',
 KEEP_DELETED_CELLS => 'false', BLOCKSIZE => '65536',
 IN_MEMORY => 'false', ENCODE_ON_DISK => 'true', BLOCKCACHE => 'true'}
1 row(s) in 0.2340 seconds
Generally, when you load a file into HDFS, it is divided into blocks of equal size; the default block size is 64 MB. Hadoop maintains 3 duplicates of each block, which means that if you want to store a 1 TB file on HDFS, you need hardware to store 3 TB. Each block is stored on 3 different data nodes.
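As a quick sketch of that arithmetic (HDFS does not pad the last block, so raw usage is simply the logical file size multiplied by the replication factor; the 1 TB figure is just the example from above):

```shell
# Raw HDFS storage needed for a file at the default replication factor.
FILE_BYTES=$((1024 ** 4))   # a 1 TB file, as in the example above
REPLICATION=3               # HDFS default (dfs.replication)
echo $((FILE_BYTES * REPLICATION))   # 3298534883328 bytes, i.e. 3 TB
```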
Ref: http://hadooptutor.blogspot.com/2013/07/replication.html
If you don't need replication of the data, place the following property in your HBase and Hadoop config files:
<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>
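Note that this setting only applies to files written after the change. For data already in HDFS, you can lower the replication factor explicitly with `hadoop fs -setrep` (a sketch, using the table path from your question):

```shell
# Lower replication to 1 for the existing table files, recursively.
./bin/hadoop fs -setrep -R 1 /hbase/hdfs_10m

# Re-check the table's footprint afterwards.
./bin/hadoop fs -dus /hbase/hdfs_10m
```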