Monthly Archives: August 2010

Analyzing some interesting networks for Map/Reduce clusters

In a previous post, I described how Rapleaf had built a conceptual model for determining the average aggregate peak throughput (which we’ll call T) that a given network architecture could support. This post applies that model to a variety of network topologies you might consider for your cluster. Just as a brief refresher, T represents [...]

Posted in Miscellaneous | 2 Comments

Analyzing network load in Map/Reduce

Hadoop Map/Reduce can take a heavy toll on your network. Just how heavy, though, isn’t obvious. This is an especially important consideration when you are expanding your cluster. Rapleaf recently encountered this situation, and in the process we devised a neat theoretical model for analyzing how network topology affects Map/Reduce. When does Hadoop put the [...]

Posted in Hadoop | Leave a comment

Thrift 0.4.0 Released

During the time it took for us to get everything sorted out for the 0.3.0 release, we accumulated more than enough changes to justify 0.4.0. There are a ton more changes in this release. (You can find the full summary here.) In addition to the usual bug fixes and performance improvements, there are two really [...]

Posted in Miscellaneous | Leave a comment

Cassandra Summit

I had the pleasure of attending Riptano’s very well-executed Cassandra Summit on Tuesday. Having always been interested in big databases, not to mention my stint as a committer on HBase, I’ve long been roughly aware that Cassandra exists, but never managed to learn about it very deeply. As such, the Summit provided some really interesting opportunities [...]

Posted in Miscellaneous | Leave a comment

Very Fast Batch Superset Queries

When we wrote our last Anonymouse blog post, we were looking into ways to increase cluster coverage by nondeterministically assigning an entity to more than one cluster. We promised that if we found a way to quickly count the number of people that could be in a cluster, we’d let you know how we did [...]

Posted in Miscellaneous | 1 Comment

Reading files quickly in Java

I came across a really interesting, well-done blog post today about the quickest way to do high-performance file IO in Java. It does a really good job of breaking down the alternatives of how to get bytes into memory, covering both traditional and NIO options in a good amount of detail. It’s a must-read for [...]

Posted in Miscellaneous | Leave a comment

Thrift 0.3.0 Released

After seven separate release candidates, Thrift 0.3.0 is finally released! This version includes many, many fixes over Thrift 0.2 in areas of stability, features, and performance. If you’ve been holding off on upgrading, then now is the perfect opportunity. You can find the distribution here.

Posted in Miscellaneous | Leave a comment