HDFS provides the fsck command for checking the health of files and directories on HDFS, and for retrieving the block and location information of files.

Options:

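-move: move corrupted files to the /lost+found directory
-delete: delete corrupted files
-openforwrite: print files that are currently open for write
-list-corruptfileblocks: print the missing or corrupt blocks and the files they belong to
-files: print the files being checked
-blocks: print the detailed block report (must be used together with -files)
-locations: print the location of every block (must be used together with -files -blocks)
-racks: print the rack (network topology) of each block location (must be used together with -files -blocks)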
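The dependencies between the reporting options can be sketched as a small helper that assembles the command line. This is only an illustration: `build_fsck_command` and the `REQUIRES` table are hypothetical names, and the table assumes the grouping used in the Hadoop documentation, where -locations and -racks are given together with -files -blocks.

```python
import shlex

# Assumed prerequisites for the reporting options (hypothetical table,
# based on how the Hadoop documentation groups these flags).
REQUIRES = {
    "-blocks": {"-files"},
    "-locations": {"-files", "-blocks"},
    "-racks": {"-files", "-blocks"},
}

# Preferred order for the reporting options in the final command.
ORDER = ["-files", "-blocks", "-locations", "-racks"]


def build_fsck_command(path, *options):
    """Return the argv list for `hdfs fsck`, adding missing prerequisites."""
    wanted = list(options)
    for opt in options:
        for dep in sorted(REQUIRES.get(opt, ())):
            if dep not in wanted:
                wanted.append(dep)
    # Reporting options first in a stable order, then everything else.
    ordered = [o for o in ORDER if o in wanted]
    ordered += [o for o in wanted if o not in ORDER]
    return ["hdfs", "fsck", path] + ordered


if __name__ == "__main__":
    cmd = build_fsck_command("/your_file_path", "-racks")
    # Print a shell-safe version of the command without running it.
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

The helper only builds the argument list; passing it to something like `subprocess.run` is left to the caller, since that needs a working HDFS client on the machine.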

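For example, to see how the blocks of a particular file are distributed in HDFS, run the following (the older `hadoop fsck` form takes the same arguments but is deprecated in favor of `hdfs fsck`):
hdfs fsck /your_file_path -files -blocks -locations -racks
Example: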

#hdfs fsck /tmp/test/input/word.log  -files -blocks  -locations -racks 
Connecting to namenode via http://c12-138:50070
FSCK started by root (auth:SIMPLE) from /10.254.13.141 for path /tmp/test/input/word.log at Wed Aug 10 17:17:45 CST 2016
/tmp/test/input/word.log 70 bytes, 1 block(s): OK
0. BP-1390896613-10.254.12.138-1467366342106:blk_1074295976_555178 len=70 Live_repl=3 [/default/10.254.13.141:50010, /default/10.254.12.139:50010, /default/10.254.12.138:50010]

Status: HEALTHY
Total size: 70 B
Total dirs: 0
Total files: 1
Total symlinks: 0
Total blocks (validated): 1 (avg. block size 70 B)
Minimally replicated blocks: 1 (100.0 %)
Over-replicated blocks: 0 (0.0 %)
Under-replicated blocks: 0 (0.0 %)
Mis-replicated blocks: 0 (0.0 %)
Default replication factor: 3
Average block replication: 3.0
Corrupt blocks: 0
Missing replicas: 0 (0.0 %)
Number of data-nodes: 10
Number of racks: 1
FSCK ended at Wed Aug 10 17:17:45 CST 2016 in 1 milliseconds


The filesystem under path '/tmp/test/input/word.log' is HEALTHY

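Where:
0. is the index of the block within the file (numbering starts at 0), not the number of blocks;
BP-1390896613-10.254.12.138-1467366342106 is the block pool ID, and blk_1074295976_555178 is the block name, made up of the block ID (1074295976) and the generation stamp (555178);
len=70 is the length of this block in bytes;
Live_repl=3 is the number of live replicas of this block;
the list in square brackets gives the rack and DataNode address (IP:port) of each replica; here all three are in the /default rack.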
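To make the field layout concrete, here is a minimal sketch that splits a block line like the one above into its parts. `parse_block_line` and the regular expression are hypothetical and match only the format shown in this example; the exact layout of fsck output differs between Hadoop versions, so treat the pattern as an assumption rather than a stable interface.

```python
import re

# Pattern for a block line as printed in the sample output above.
BLOCK_LINE = re.compile(
    r"^(?P<index>\d+)\.\s+"                     # block index within the file
    r"(?P<pool>BP-[^:]+):"                      # block pool ID
    r"(?P<block>blk_\d+)_(?P<genstamp>\d+)\s+"  # block ID and generation stamp
    r"len=(?P<length>\d+)\s+"                   # block length in bytes
    r"Live_repl=(?P<live_repl>\d+)\s+"          # number of live replicas
    r"\[(?P<locations>[^\]]*)\]"                # rack/DataNode list
)


def parse_block_line(line):
    """Parse one fsck block line; return a dict, or None if it does not match."""
    match = BLOCK_LINE.match(line.strip())
    if match is None:
        return None
    locations = []
    for entry in match.group("locations").split(","):
        entry = entry.strip()
        if entry:
            # "/default/10.254.13.141:50010" -> ("/default", "10.254.13.141:50010")
            rack, _, datanode = entry.rpartition("/")
            locations.append((rack, datanode))
    return {
        "index": int(match.group("index")),
        "block_pool": match.group("pool"),
        "block_id": match.group("block"),
        "generation_stamp": int(match.group("genstamp")),
        "length": int(match.group("length")),
        "live_replicas": int(match.group("live_repl")),
        "locations": locations,
    }


SAMPLE = (
    "0. BP-1390896613-10.254.12.138-1467366342106:blk_1074295976_555178 "
    "len=70 Live_repl=3 [/default/10.254.13.141:50010, "
    "/default/10.254.12.139:50010, /default/10.254.12.138:50010]"
)

if __name__ == "__main__":
    info = parse_block_line(SAMPLE)
    for rack, datanode in info["locations"]:
        print(info["block_id"], rack, datanode)
```

Lines that do not follow this layout return None, so the function can be applied to every line of the fsck output and the non-block lines skipped.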