Hadoop Note 3: Core

Java Virtual Machines

  1. Hadoop processes (daemons) each run in their own JVM
  2. JVMs do not share state, so one Hadoop process cannot see another's in-memory data
  3. The set of JVM processes differs between Hadoop 1.0 (JobTracker, TaskTracker) and 2.0 (YARN ResourceManager, NodeManager)
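
The "no shared state" point can be seen without Hadoop at all. The following is a minimal sketch, not Hadoop code: the class name `JvmStateDemo`, the `counter` field and the `report` helper are made up for illustration. It assumes Java 9 or later (for `ProcessHandle`) and that the class has been compiled with `javac` first so the child JVM can find it on the classpath.

```java
import java.io.File;

public class JvmStateDemo {
    // Static state lives inside a single JVM process only.
    static int counter = 0;

    // Describes the current JVM: its process id and its own copy of counter.
    public static String report() {
        return "pid=" + ProcessHandle.current().pid() + " counter=" + counter;
    }

    public static void main(String[] args) throws Exception {
        if (args.length > 0 && args[0].equals("child")) {
            // Running in the second JVM: counter was never modified here.
            System.out.println("child:  " + report());
            return;
        }

        // Change the state in the parent JVM only.
        counter = 42;
        System.out.println("parent: " + report());

        // Start a second JVM running the same class with the same classpath.
        String javaBin = System.getProperty("java.home")
                + File.separator + "bin" + File.separator + "java";
        Process child = new ProcessBuilder(
                javaBin, "-cp", System.getProperty("java.class.path"),
                "JvmStateDemo", "child")
                .inheritIO()
                .start();
        child.waitFor();
    }
}
```

The parent and the child print different process ids, and the child still sees `counter` at its initial value because the assignment happened in another JVM. Hadoop daemons are in the same position, which is why they exchange information over the network rather than through shared memory.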

Hadoop File Systems

  • HDFS (Hadoop Distributed File System)
    • Used in fully distributed or pseudo-distributed mode
  • Regular (local) file system
    • Used in standalone mode
  • Cloud file systems
    • AWS: S3, Azure: Blob Storage
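
In practice, the file system is usually selected by the scheme of the path URI. The sketch below uses only `java.net.URI` and does not call any Hadoop API. It maps the commonly used schemes (`hdfs`, `file`, `s3a`, `wasb`/`wasbs`) to the file systems listed above. The class name `FsSchemeDemo`, the host `namenode`, and the bucket, container and account names are placeholders for illustration.

```java
import java.net.URI;

public class FsSchemeDemo {
    // Maps the scheme of a path URI to the kind of file system it refers to.
    public static String describe(String uri) {
        String scheme = URI.create(uri).getScheme();
        if (scheme == null) {
            return "no scheme (resolved against the configured default file system)";
        }
        switch (scheme) {
            case "hdfs":
                return "HDFS";
            case "file":
                return "local file system";
            case "s3a":
                return "Amazon S3";
            case "wasb":
            case "wasbs":
                return "Azure Blob Storage";
            default:
                return "unknown";
        }
    }

    public static void main(String[] args) {
        // Example paths; the hosts and bucket names are placeholders.
        String[] examples = {
            "hdfs://namenode:8020/data/input",
            "file:///tmp/input",
            "s3a://my-bucket/input",
            "wasb://container@account.blob.core.windows.net/input",
            "/data/input"
        };
        for (String example : examples) {
            System.out.println(example + " -> " + describe(example));
        }
    }
}
```

A path without a scheme, such as `/data/input`, is resolved against whatever default file system the cluster configuration names, so the same path can point at the local disk in standalone mode and at HDFS in the other modes.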

File Systems and JVMs

  1. Standalone (single node): local file system + a single JVM
  2. Pseudo-distributed: HDFS + each daemon in its own JVM, all on one machine


The image shows fully distributed mode with three separate physical servers. Each server runs several daemons, shown in green.

Other daemons control data distribution: the NameNode and the DataNodes. Each runs in its own JVM, and they do not share state.
