Apr 22
Hadoop CheatSheet
Introduction
We have decided to aggregate the most important things to know about Hadoop in a single, concise post. Let us know if you have any comments!
Hadoop
##########
## HDFS ##
##########
NameNode # => Manages the filesystem namespace. If you lose it you have no pointers to your data, so you have practically lost your data.
DataNode # => Holds the data; installed on each worker.
Block # => Each file is split into blocks B1, B2, ..., where each block is 128MB by default; replication is done per block. The NameNode knows that file X is split into B1, B2 and where each block lives.
##########
## YARN ##
##########
ResourceManager # => Like `NameNode` for computing; tracks NodeManagers and how available they are for work.
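To make the Block entry concrete, here is a small Python sketch of how a file could be cut into 128MB blocks and how a NameNode-style map from block to DataNodes might look. It is a toy model, not Hadoop code: the function names `split_into_blocks` and `place_blocks`, the DataNode names, the replication factor of 3 and the round-robin placement are all assumptions made for illustration (real HDFS placement is rack-aware).

```python
BLOCK_SIZE = 128 * 1024 * 1024  # 128MB, the block size quoted in the cheat sheet
REPLICATION = 3  # assumption: a replication factor of 3, used here for illustration


def split_into_blocks(file_size, block_size=BLOCK_SIZE):
    """Return the sizes in bytes of blocks B1, B2, ... for a file of file_size bytes."""
    if file_size <= 0:
        return []
    full, rest = divmod(file_size, block_size)
    # Every block is full-sized except possibly the last one.
    return [block_size] * full + ([rest] if rest else [])


def place_blocks(file_name, file_size, datanodes, replication=REPLICATION):
    """Toy NameNode metadata: block id -> list of DataNodes holding a replica.

    Round-robin placement is a simplification chosen for this sketch.
    """
    copies = min(replication, len(datanodes))
    block_map = {}
    for i, _ in enumerate(split_into_blocks(file_size)):
        block_map[f"{file_name}:B{i + 1}"] = [
            datanodes[(i + r) % len(datanodes)] for r in range(copies)
        ]
    return block_map


# Hypothetical 300MB file X on a four-node cluster.
sizes = split_into_blocks(300 * 1024 * 1024)
block_map = place_blocks("X", 300 * 1024 * 1024, ["dn1", "dn2", "dn3", "dn4"])
for block_id, nodes in block_map.items():
    print(block_id, "->", nodes)
```

The 300MB file ends up as two full 128MB blocks plus one 44MB block, and each block is replicated independently. That is why losing the map (the NameNode) means losing the ability to reassemble the file even though the DataNodes still hold the bytes.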