HDFS - FsImage File

1 - About

HDFS stores its file system metadata in a file called the FsImage.

It contains:

  • the entire file system namespace
  • the mapping of blocks to files
  • and file system properties

3 - Management

3.1 - Location

The FsImage is stored as a file in the NameNode’s local file system.

The location is defined by the dfs.namenode.name.dir property in the HDFS configuration file (hdfs-site.xml). Example:

hdfs-site.xml
<property>
	<name>dfs.namenode.name.dir</name>
	<value>file:/hadoop/data/dfs/namenode</value>
</property>

The effective value can be read with getconf. Example:

hdfs getconf -confKey dfs.namenode.name.dir
/hadoop/hdfs/namenode
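
When no Hadoop client is at hand, the same property can be read straight from the configuration file. The following is a minimal Python sketch, not part of Hadoop: the function name name_dirs is hypothetical, the sample assumes the usual <configuration> root element, and it assumes the value may list several directories separated by commas. It does not resolve defaults or variable substitution the way getconf does.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Sample configuration; the property is copied from the hdfs-site.xml example above.
HDFS_SITE = """<configuration>
<property>
	<name>dfs.namenode.name.dir</name>
	<value>file:/hadoop/data/dfs/namenode</value>
</property>
</configuration>"""

def name_dirs(xml_text, key="dfs.namenode.name.dir"):
    """Return the local paths listed under `key`, or [] if the key is not set."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == key:
            value = prop.findtext("value", "")
            # Assumption: several directories may be given, separated by commas,
            # each either a plain path or a file: URI.
            return [urlparse(v.strip()).path for v in value.split(",") if v.strip()]
    return []

print(name_dirs(HDFS_SITE))
```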

3.2 - Modification

Reading an FsImage is efficient, but making incremental edits directly to it is not. Instead of modifying the FsImage for each edit, the NameNode persists the edits in the EditLog. During a checkpoint, the changes from the EditLog are applied to the FsImage.
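
The relationship between the two files can be illustrated with a toy model in Python. This is only a sketch: the real FsImage and EditLog are binary on-disk formats, and the operation names ("create", "delete") and the dictionary layout are hypothetical, chosen for illustration.

```python
# Toy model only: the image maps a path to its list of block IDs.
def checkpoint(fsimage, editlog):
    """Replay every logged edit onto a copy of the image and return the new image."""
    new_image = dict(fsimage)
    for op, path, value in editlog:
        if op == "create":
            new_image[path] = value
        elif op == "delete":
            new_image.pop(path, None)
        else:
            raise ValueError(f"unknown operation: {op}")
    return new_image

fsimage = {"/data/a.txt": ["blk_1"]}

# Edits are appended to the log instead of rewriting the image each time.
editlog = [
    ("create", "/data/b.txt", ["blk_2", "blk_3"]),
    ("delete", "/data/a.txt", None),
]

fsimage = checkpoint(fsimage, editlog)
editlog = []  # edits covered by the checkpoint no longer need to be replayed
print(fsimage)
```

Appending to a log is cheap, while rewriting the whole image is expensive, so the expensive step is batched into the checkpoint.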

3.3 - XML

The Offline Image Viewer (OIV) dumps the contents of HDFS fsimage files to a human-readable format and provides a read-only WebHDFS API, which allows offline analysis and examination of a Hadoop cluster’s namespace.

Example:

hdfs oiv -p XML -i fsimage_0000000000000307728 -o fsimage.xml

Result: the namespace is written to the file fsimage.xml.
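
Once dumped, the XML can be analyzed with ordinary XML tooling. The sketch below counts inodes by type in Python. It assumes the dump contains inode elements with a type child under an INodeSection, which may differ between Hadoop versions; the sample document is hand-written for illustration and the function name count_inode_types is hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hand-written sample; a real dump is far larger and carries more sections and fields.
SAMPLE = """<?xml version="1.0"?>
<fsimage>
  <INodeSection>
    <inode><id>16385</id><type>DIRECTORY</type><name></name></inode>
    <inode><id>16386</id><type>DIRECTORY</type><name>data</name></inode>
    <inode><id>16387</id><type>FILE</type><name>a.txt</name></inode>
  </INodeSection>
</fsimage>"""

def count_inode_types(xml_text):
    """Count inode elements grouped by the text of their <type> child."""
    root = ET.fromstring(xml_text)
    return Counter(inode.findtext("type") for inode in root.iter("inode"))

counts = count_inode_types(SAMPLE)
print(counts["DIRECTORY"], counts["FILE"])
```

For a real fsimage.xml, which can be very large, a streaming parser such as xml.etree.ElementTree.iterparse is likely a better fit than loading the whole document into memory.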

3.4 - Download

The -fetchImage <local directory> option of dfsadmin downloads the most recent fsimage from the NameNode and saves it in the specified local directory.

Example: the log shows the client fetching the image over HTTP from the imagetransfer endpoint of the NameNode

hdfs dfsadmin -D "fs.default.name=hdfs://headnode/" -fetchImage .
18/04/09 14:37:40 INFO namenode.TransferFsImage: Opening connection to http://hn0.ax.internal.cloudapp.net:30070/imagetransfer?getimage=1&txid=latest
18/04/09 14:37:40 INFO namenode.TransferFsImage: Image Transfer timeout configured to 60000 milliseconds
18/04/09 14:37:41 INFO namenode.TransferFsImage: Combined time for fsimage download and fsync to all disks took 0.05s. The fsimage download took 0.05s at 108.70 KB/s. Synchronous (fsync) write to disk of /tmp/./fsimage_0000000000000307728 took 0.00s.
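
The downloaded file is named fsimage_<txid>, as the log above shows. When a directory holds several images, the newest one can be picked by transaction ID. The following Python sketch assumes that naming scheme and that any companion files (for example .md5 checksums) should be skipped; latest_fsimage is a hypothetical helper, not a Hadoop API.

```python
import re
import tempfile
from pathlib import Path

# Matches names such as fsimage_0000000000000307728 and captures the txid.
FSIMAGE_NAME = re.compile(r"fsimage_(\d+)")

def latest_fsimage(directory):
    """Return the fsimage_<txid> file with the highest transaction ID, or None."""
    candidates = []
    for path in Path(directory).iterdir():
        # fullmatch rejects names with a suffix, such as fsimage_<txid>.md5
        match = FSIMAGE_NAME.fullmatch(path.name)
        if match and path.is_file():
            candidates.append((int(match.group(1)), path))
    if not candidates:
        return None
    return max(candidates, key=lambda item: item[0])[1]

# Demonstration on empty placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("fsimage_0000000000000307000",
                 "fsimage_0000000000000307728",
                 "fsimage_0000000000000307728.md5"):
        (Path(tmp) / name).touch()
    newest = latest_fsimage(tmp).name

print(newest)
```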

3.5 - Version

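The VERSION file in the current subdirectory of the FsImage directory records the identifiers and the layout version of the NameNode storage. Syntax:

cat FSIMAGE_HOME/current/VERSION

Example:

# cat /hadoop/hdfs/namenode/current/VERSION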
#Mon Apr 09 08:57:32 UTC 2018
namespaceID=1498378884
clusterID=CID-f09ee152-c799-471f-8849-ebed190b31fe
cTime=0
storageType=NAME_NODE
blockpoolID=BP-272822339-10.10.6.20-1521626942449
layoutVersion=-63
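
The file uses the key=value format of a Java properties file, so it is easy to read from a script. The following Python sketch parses the content shown above; parse_version is a hypothetical helper and handles only the simple key=value and # comment lines seen here, not the full properties syntax.

```python
# Content copied from the VERSION example above.
VERSION_TEXT = """#Mon Apr 09 08:57:32 UTC 2018
namespaceID=1498378884
clusterID=CID-f09ee152-c799-471f-8849-ebed190b31fe
cTime=0
storageType=NAME_NODE
blockpoolID=BP-272822339-10.10.6.20-1521626942449
layoutVersion=-63
"""

def parse_version(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props

props = parse_version(VERSION_TEXT)
print(props["storageType"], int(props["layoutVersion"]))
```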