Author Archives: takashi

Pseudo-Combiners in Cascading

To get maximum performance from MapReduce, you need to minimize the amount of data that you transfer across the network. If nearly your entire input must be transferred from your mappers to your reducers, then you’ll be putting a great deal of stress on your disks and network. One thing that [...]
Posted in Cascading