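HDFS Quotas Guide: official documentation link, version 2.9.2
HDFS Quotas Guide: official documentation link, version 2.7.1
Introduction: Hadoop provides two kinds of quota: name quota and space quota. A name quota limits the number of files and directories under a path; a space quota limits the "disk" space that path consumes. Together they make it possible to manage storage usage per team.
Usage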
- Set the name quota to be N for each directory. Best effort for each directory, with faults reported if N is not a positive long integer, the directory does not exist or it is a file, or the directory would immediately exceed the new quota.
$ hdfs dfsadmin -setQuota <max_number> <directory>
- Set the space quota to be N bytes for each directory. This is a hard limit on the total size of all the files under the directory tree. The space quota takes replication into account, i.e. one GB of data with replication of 3 consumes 3 GB of quota. N can also be specified with a binary prefix for convenience, e.g. 50g for 50 gigabytes and 2t for 2 terabytes. Best effort for each directory, with faults reported if N is neither zero nor a positive integer, the directory does not exist or it is a file, or the directory would immediately exceed the new quota.
$ hdfs dfsadmin -setSpaceQuota <max_size> <directory>
- Remove any name quota for each directory. Best effort for each directory, with faults reported if the directory does not exist or it is a file. It is not a fault if the directory has no quota.
$ hdfs dfsadmin -clrQuota <directory>
- Remove any space quota for each directory. Best effort for each directory, with faults reported if the directory does not exist or it is a file. It is not a fault if the directory has no quota.
$ hdfs dfsadmin -clrSpaceQuota <directory>
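After setting or clearing a quota, it helps to confirm what is actually in effect. The sketch below assumes the standard `hdfs dfs -count -q` report, whose columns are QUOTA, REMAINING_QUOTA, SPACE_QUOTA, REMAINING_SPACE_QUOTA, DIR_COUNT, FILE_COUNT, CONTENT_SIZE and PATHNAME. The helper name `summarize_quota` is hypothetical, and the sample line is made up for illustration rather than captured from a real cluster.

```shell
# Summarize one line of `hdfs dfs -count -q` output read from stdin.
# Assumed column order: QUOTA REMAINING_QUOTA SPACE_QUOTA REMAINING_SPACE_QUOTA
#                       DIR_COUNT FILE_COUNT CONTENT_SIZE PATHNAME
# Note: this simple version assumes the path contains no spaces.
summarize_quota() {
  awk '{
    printf "path=%s name_quota=%s name_remaining=%s space_quota=%s space_remaining=%s\n", $8, $1, $2, $3, $4
  }'
}

# Against a live cluster (requires the hdfs client):
#   hdfs dfs -count -q /name_quota | summarize_quota

# Illustrative sample line (made up): name quota 5, fully used, no space quota.
sample='           5               0            none             inf            1            4                  0 /name_quota'
echo "$sample" | summarize_quota
```

A directory with no quota set reports `none` and `inf` in the corresponding columns, so the same helper works for paths with only one kind of quota.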
Name Quota Example
Start by testing the name quota with the following steps:
- Prepare an HDFS path and set a name quota that limits it to 5 names
$ hdfs dfs -mkdir /name_quota
$ hdfs dfsadmin -setQuota 5 /name_quota
$ touch 1
$ hdfs dfs -copyFromLocal 1 /name_quota/1
$ hdfs dfs -copyFromLocal 1 /name_quota/2
$ hdfs dfs -copyFromLocal 1 /name_quota/3
$ hdfs dfs -copyFromLocal 1 /name_quota/4
$ hdfs dfs -copyFromLocal 1 /name_quota/5
copyFromLocal: The NameSpace quota (directories and files) of directory
/name_quota is exceeded: quota=5 file count=6
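This creates an empty local file named "1" and copies it repeatedly to the HDFS path. The fifth copy fails with a message that the quota has been reached, which shows the quota is in effect. Note that the quota was set to 5, yet the directory was full after only 4 files: the limit counts directories as well, including /name_quota itself.
This makes sense: in HDFS, files and directories share the same basic abstraction and occupy roughly the same amount of metadata, so a name quota is in effect a quota on metadata.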
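The arithmetic behind this result can be sketched in a few lines. The function name `max_entries` is hypothetical; the only rule it encodes is that a directory counts against its own name quota, so a quota of 1 forces the directory to stay empty.

```shell
# max_entries QUOTA
# Print how many files and subdirectories fit under a directory
# whose name quota is QUOTA. The directory itself uses one unit.
max_entries() {
  quota=$1
  echo $(( quota - 1 ))
}

max_entries 5   # prints 4, matching the test above
max_entries 1   # prints 0: the directory must remain empty
```

To hold N files in a flat directory, then, the name quota should be set to at least N + 1.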
Space Quota Example
- Create a test path and set a space quota
$ hdfs dfs -mkdir /space_quota
$ hdfs dfsadmin -setSpaceQuota 50m /space_quota
$ mkfile 10m 10m
$ hdfs dfs -copyFromLocal 10m /space_quota/10m_1
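The mkfile command creates a 10 MB file (I am on a Mac; on Linux, use dd to create a file of a given size). I expected this HDFS path to hold five 10 MB files and reject the sixth, but in fact the very first copy failed with the following error: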
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.DSQuotaExceededException): The DiskSpace quota of /space_quota is exceeded: quota = 52428800 B = 50 MB but diskspace consumed = 134217728 B = 128 MB
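The message explains it. HDFS stores data in blocks, and a block allocation fails if the quota would not allow a full block to be written, even when the file itself is only 10 MB. My HDFS block size is 128 MB, so a 50 MB space quota cannot cover even one block and the write fails. I therefore changed the setup as follows: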
$ hdfs dfsadmin -setSpaceQuota 1024m /space_quota
$ mkfile 128m 128m
$ hdfs dfs -copyFromLocal 128m /space_quota/128m_1
$ hdfs dfs -copyFromLocal 128m /space_quota/128m_2
$ hdfs dfs -copyFromLocal 128m /space_quota/128m_3
$ hdfs dfs -copyFromLocal 128m /space_quota/128m_4
$ hdfs dfs -copyFromLocal 128m /space_quota/128m_5
$ hdfs dfs -copyFromLocal 128m /space_quota/128m_6
$ hdfs dfs -copyFromLocal 128m /space_quota/128m_7
$ hdfs dfs -copyFromLocal 128m /space_quota/128m_8
$ hdfs dfs -copyFromLocal 128m /space_quota/128m_9
The 1024 MB quota holds eight 128 MB files; the ninth copy fails, and the key message is:
18/05/18 16:13:01 WARN hdfs.DFSClient: DataStreamer Exception org.apache.hadoop.hdfs.protocol.DSQuotaExceededException: The DiskSpace quota of /space_quota is exceeded: quota = 1073741824 B = 1 GB but diskspace consumed = 1207959552 B = 1.13 GB
This matches expectations: the quota successfully limits the storage space used under the given path.
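Both space quota results follow from one check, sketched below: before a block is written, the space already consumed plus one full block (times the replication factor) must fit within the quota. The function name `can_allocate` is hypothetical, and a replication factor of 1 is an assumption inferred from the error messages above, where a single block is reported as 128 MB consumed.

```shell
MB=$((1024 * 1024))

# can_allocate QUOTA CONSUMED BLOCK_SIZE REPLICATION
# QUOTA, CONSUMED and BLOCK_SIZE are in bytes.
# Succeeds (exit status 0) if reserving one more full block fits in the quota.
can_allocate() {
  needed=$(( $2 + $3 * $4 ))
  [ "$needed" -le "$1" ]
}

# Case 1: 50 MB quota, nothing stored yet, 128 MB block size.
if can_allocate $((50 * MB)) 0 $((128 * MB)) 1; then
  echo "case 1: ok"
else
  echo "case 1: quota exceeded"
fi

# Case 2: 1024 MB quota; count how many 128 MB files fit.
consumed=0
files=0
while can_allocate $((1024 * MB)) "$consumed" $((128 * MB)) 1; do
  consumed=$(( consumed + 128 * MB ))
  files=$(( files + 1 ))
done
echo "case 2: $files files fit, next write would need $(( consumed + 128 * MB )) bytes"
```

Case 1 reports the quota as exceeded, and case 2 stops at 8 files with the next write needing 1207959552 bytes, the same figure as in the DSQuotaExceededException above. With a replication factor of 3, the same quota would be consumed three times as fast.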
In addition, testing (not detailed here) shows that a name quota and a space quota can both be applied to the same path.