Monthly Archives: December 2009

Cycles of Doom in Batch Processing Workflows

We integrate new data into our databases via a large batch processing workflow. The execution time of this workflow directly affects how quickly new data reaches our customers, so keeping the runtime short is of paramount importance to us. There’s an interesting effect that can occur, which we’ve dubbed the “cycle [...]
Posted in Hadoop, MapReduce