Small files are the bane of Hadoop MapReduce. Keeping 300GB of data in a few large files rather than in thousands of small ones can make a 100x difference in the performance of jobs run over that data. For this reason, it is of paramount importance to keep files on HDFS large. There are many reasons for this. With larger files, [...]