HDFS - Client Connection


1 - About

A client establishes a connection to a configurable TCP port on the NameNode machine and communicates with the NameNode using the ClientProtocol.
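As a rough illustration of "a configurable TCP port", the sketch below derives the host and port a client would connect to from an `hdfs://` URI such as the one typically set in `fs.defaultFS`. The helper name `namenode_endpoint` and the example host names are hypothetical, and the fallback port 8020 is an assumption that depends on the cluster configuration. The sketch also ignores HA setups, where the URI authority is a logical nameservice rather than a host.

```python
from typing import Tuple
from urllib.parse import urlsplit

# Assumption: 8020 is used as the fallback NameNode RPC port when the URI
# does not specify one; the real value depends on the cluster configuration.
DEFAULT_NAMENODE_RPC_PORT = 8020


def namenode_endpoint(default_fs: str) -> Tuple[str, int]:
    """Return the (host, port) a client would open its TCP connection to."""
    parts = urlsplit(default_fs)
    if parts.scheme != "hdfs" or parts.hostname is None:
        raise ValueError(f"not an hdfs:// URI with a host: {default_fs!r}")
    # Use the explicit port if present, otherwise fall back to the default.
    return parts.hostname, parts.port or DEFAULT_NAMENODE_RPC_PORT
```

For example, `namenode_endpoint("hdfs://namenode.example.com:9000")` yields the explicit port, while `namenode_endpoint("hdfs://namenode.example.com")` falls back to the assumed default.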

A Remote Procedure Call (RPC) abstraction wraps both the Client Protocol and the DataNode Protocol.


3 - Client Operations

3.1 - Read

When a client retrieves file contents, it performs a data integrity check on each block by verifying the data it received against the block's checksum. If the check fails, the client can opt to retrieve a replica of that block from another DataNode.
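The read-side fallback can be sketched as follows. This is a toy model, not the HDFS client code: replicas are plain byte strings held in a list, the function name `read_block` is hypothetical, and `zlib.crc32` stands in for whatever checksum algorithm the cluster is configured to use.

```python
import zlib


def read_block(replicas, expected_crc):
    """Return (datanode, data) for the first replica whose checksum matches.

    replicas: list of (datanode_name, block_bytes) pairs, tried in order.
    expected_crc: checksum recorded for the block when it was written.
    """
    for datanode, data in replicas:
        if zlib.crc32(data) == expected_crc:
            return datanode, data
        # Checksum mismatch: treat this replica as corrupt and try the next one.
    raise IOError("no replica passed the integrity check")
```

With one corrupted and one intact copy, for example `[("dn1", b"hellx hdfs"), ("dn2", b"hello hdfs")]` and the checksum of `b"hello hdfs"`, the function skips `dn1` and returns the data from `dn2`.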

3.2 - Write

Lazy Persist writes: the DataNodes keep the written data in memory and flush it to disk asynchronously, which removes expensive disk I/O and checksum computations from the performance-sensitive write path. See Memory Storage Support in HDFS.
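The idea behind an asynchronous flush can be illustrated with a small, self-contained model. Everything here is hypothetical (the class `LazyPersistStore`, its methods, and the block ids); it only shows a write returning once the data is in memory while a background thread persists it later, and it does not reflect how a DataNode is actually implemented.

```python
import os
import queue
import tempfile
import threading


class LazyPersistStore:
    """Toy stand-in for a node that acknowledges writes from memory."""

    def __init__(self, directory):
        self.directory = directory
        self.memory = {}               # block_id -> bytes, available immediately
        self._pending = queue.Queue()  # block ids waiting to be flushed
        self._flusher = threading.Thread(target=self._flush_loop, daemon=True)
        self._flusher.start()

    def write(self, block_id, data):
        # The write returns as soon as the data is held in memory.
        self.memory[block_id] = data
        self._pending.put(block_id)

    def _flush_loop(self):
        # Background thread: persist queued blocks to disk one by one.
        while True:
            block_id = self._pending.get()
            path = os.path.join(self.directory, block_id)
            with open(path, "wb") as f:
                f.write(self.memory[block_id])
            self._pending.task_done()

    def wait_until_persisted(self):
        # Block until every queued block has been written to disk.
        self._pending.join()
```

A caller would create the store over a scratch directory (for instance one returned by `tempfile.mkdtemp()`), call `write("blk_1", b"abc")`, and only call `wait_until_persisted()` when it needs the data to be on disk. Until the flush completes, the data exists only in memory, which is the trade-off this write mode accepts in exchange for lower latency.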

4 - Type

4.1 - API

4.2 - Web UI

4.3 - Command line


4.4 - Mount

  • NFS gateway: through the NFS gateway, HDFS can be mounted as part of the client’s local file system.