Monthly Archives: July 2009

A Glance at the Hadoop Failure Model

Hadoop is designed to be a fault-tolerant system. Jobs should be resilient to nodes going down and other random failures. Hadoop isn’t perfect, however: I still see jobs fail for seemingly random reasons every now and again. I decided to investigate how much each of the different factors contributes to a job failing. A [...]
Posted in Hadoop, MapReduce | 2 Comments