Linux Notes: The HDFS Replica Mechanism
1. Replica placement policy

The NameNode has a RackAware (rack awareness) capability, and it is configurable.
(For the common case, when the replication factor is three, HDFS's placement policy is to put one replica on the local machine if the writer is on a datanode, otherwise on a random datanode, another replica on a node in a different (remote) rack, and the last on a different node in the same remote rack. This policy cuts the inter-rack write traffic which generally improves write performance. The chance of rack failure is far less than that of node failure; this policy does not impact data reliability and availability guarantees. However, it does reduce the aggregate network bandwidth used when reading data since a block is placed in only two unique racks rather than three. With this policy, the replicas of a file do not evenly distribute across the racks. One third of replicas are on one node, two thirds of replicas are on one rack, and the other third are evenly distributed across the remaining racks. This policy improves write performance without compromising data reliability or read performance.)

By default, Hadoop rack awareness is not enabled. It must be turned on by setting an option in hadoop-site.xml on the NameNode machine, for example:

<property>
  <name>topology.script.file.name</name>
  <value>/path/to/script</value>
</property>
The value of this option points to an executable, usually a script, that takes arguments and prints one value per argument. Each argument is typically a datanode's IP address, and the corresponding output is the rack ID that datanode belongs to, e.g. "/rack1". When the NameNode starts, it checks whether this option is set. If it is non-empty, rack awareness is enabled: the NameNode locates the configured script, and whenever it receives a heartbeat from a datanode, it passes that datanode's IP address to the script and saves the output as the datanode's rack in an in-memory map.

Official sample scripts: https://wiki.apache.org/hadoop/topology_rack_awareness_scripts

Script (bash; it looks up each argument in topology.data):

HADOOP_CONF=/root/tmp
while [ $# -gt 0 ] ; do
  nodeArg=$1
  exec< ${HADOOP_CONF}/topology.data
  result=""
  while read line ; do
    ar=( $line )
    if [ "${ar[0]}" = "$nodeArg" ] ; then
      result="${ar[1]}"
    fi
  done
  shift
  if [ -z "$result" ] ; then
    echo -n "/default/rack "
  else
    echo -n "$result "
  fi
done
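The lookup logic above can be exercised locally, without a running cluster. A minimal sketch (the temp directory and the hostnames in the sample mapping file are made up for illustration):

```shell
#!/bin/sh
# Exercise the same lookup logic as the topology script, against a
# throwaway topology.data written to a temp directory.
HADOOP_CONF=$(mktemp -d)
cat > "${HADOOP_CONF}/topology.data" <<'EOF'
ceph-1 /rack1
192.168.1.44 /rack1
192.168.1.43 /rack2
EOF

resolve() {
  # One rack ID is printed per argument; hosts not found in
  # topology.data fall back to /default/rack, as in the script above.
  while [ $# -gt 0 ] ; do
    nodeArg=$1
    result=""
    while read -r host rack ; do
      [ "$host" = "$nodeArg" ] && result="$rack"
    done < "${HADOOP_CONF}/topology.data"
    shift
    if [ -z "$result" ] ; then
      printf '/default/rack '
    else
      printf '%s ' "$result"
    fi
  done
  echo
}

resolve ceph-1 192.168.1.43 unknown-host   # prints: /rack1 /rack2 /default/rack
```

The sketch replaces the original's `exec<` redirection and bash array with a POSIX `while read host rack` loop, but the matching and fallback behavior is the same.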
topology.data:

ceph-1 /rack1
ceph-2 /rack2
ceph-3 /rack3
192.168.1.44 /rack1
192.168.1.43 /rack2
192.168.1.42 /rack3

When no rack information is configured, Hadoop places every machine in the same default rack, named "/default-rack". In that case every datanode, whether or not it physically sits in the same rack, is treated as belonging to one rack, so the inter-rack network load mentioned earlier can easily occur. Without rack information, the NameNode puts all slave machines under /default-rack, and when a block is written the choice of the three datanodes is completely random.

2. Replication factor greater than the number of datanodes

The actual number of replicas equals the number of datanodes.

3. Viewing a file's storage details, its blocks, and the block locations

hdfs fsck /lucy/etcd-v3.3.5-linux-amd64.tar.gz -files -blocks -locations

4. Automatic deletion of redundant HDFS blocks

During routine maintenance of a Hadoop cluster, it was observed that redundant data blocks took a long time to be deleted. hdfs-site.xml contains this parameter:

<property>
  <name>dfs.blockreport.intervalMsec</name>
  <value>10000</value>
  <description>Determines block reporting interval in milliseconds.</description>
</property>

The default setting is 3600000 milliseconds, i.e. one hour, so block reports are sent at one-hour intervals; this is why the redundant blocks were only deleted after a long time. Testing confirmed that when the parameter is lowered somewhat (e.g. to 60 seconds), the redundant blocks are indeed deleted quickly.
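The rule in point 2 (replication factor greater than the number of datanodes) follows from the fact that HDFS places at most one replica of a given block per datanode, so the effective replica count is the minimum of the requested factor and the number of live datanodes. A sketch of that arithmetic, with made-up numbers:

```shell
#!/bin/sh
# Effective replica count: HDFS stores at most one replica of a block
# per datanode, so actual replicas = min(replication factor, datanodes).
effective_replicas() {
  factor=$1
  datanodes=$2
  if [ "$factor" -lt "$datanodes" ] ; then
    echo "$factor"
  else
    echo "$datanodes"
  fi
}

effective_replicas 3 5   # 5 datanodes, factor 3 -> 3 replicas
effective_replicas 3 2   # only 2 datanodes   -> 2 replicas (under-replicated)
```

Once more datanodes join the cluster, the NameNode re-replicates such under-replicated blocks back up to the requested factor.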