Thrift Union Pattern

At Rapleaf, we use Thrift structs as the cornerstone of many of our processes. I won’t go into detail about Thrift in general; suffice it to say that Thrift is easy and flexible enough to serve as our primary means of storing data and communicating between our various components.

One limitation of Thrift is that it supports neither polymorphism nor variant types. Superficially, this suggests that some use cases would be cumbersome to implement. For instance, a service with a generic “processMessage” method couldn’t work, since you’d need a method per message type. However, we’ve found a design pattern, which we’ve been calling a “Thrift union”, that gives us the functionality we want with very little added complexity.

Let’s look at an example Thrift file:

struct StructA {
  // StructA-specific fields
}

struct StructB {
  // StructB-specific fields
}

struct StructAOrStructB {
  1: required i32 specific_struct_type;

  2: optional StructA struct_a;
  3: optional StructB struct_b;
}

StructAOrStructB contains two important parts. First, it contains one optional field for each of the possible subtypes. Second, it contains a required field that indicates which of the optional fields should be used. Effectively, the required field encodes the type and name of the value the struct holds, and by convention, only the optional field it points to is set; all the others are left unset.
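To make the convention concrete, here is a minimal sketch in Python of how a consumer might check and dispatch on such a struct. The classes are hand-written stand-ins for what Thrift would generate, and the tag values (1 and 2), the `name` and `count` fields, and the `validate` and `process_message` helpers are all assumptions made for illustration; none of them come from the Thrift file above.

```python
from dataclasses import dataclass
from typing import Optional


# Hypothetical stand-ins for Thrift-generated classes.
@dataclass
class StructA:
    name: str = ""   # illustrative field, not from the original IDL


@dataclass
class StructB:
    count: int = 0   # illustrative field, not from the original IDL


# Assumed tag values; the original post does not say which integers are used.
TYPE_STRUCT_A = 1
TYPE_STRUCT_B = 2


@dataclass
class StructAOrStructB:
    specific_struct_type: int
    struct_a: Optional[StructA] = None
    struct_b: Optional[StructB] = None


def validate(union: StructAOrStructB) -> None:
    """Enforce the convention: exactly the field named by the tag is set."""
    fields = {TYPE_STRUCT_A: union.struct_a, TYPE_STRUCT_B: union.struct_b}
    if union.specific_struct_type not in fields:
        raise ValueError(f"unknown type tag {union.specific_struct_type}")
    for tag, value in fields.items():
        selected = tag == union.specific_struct_type
        if selected and value is None:
            raise ValueError("the field selected by the tag is not set")
        if not selected and value is not None:
            raise ValueError("a field not selected by the tag is set")


def process_message(union: StructAOrStructB) -> str:
    """A single generic entry point that dispatches on the type tag."""
    validate(union)
    if union.specific_struct_type == TYPE_STRUCT_A:
        return f"handled StructA named {union.struct_a.name}"
    return f"handled StructB with count {union.struct_b.count}"
```

With this shape, one `process_message` entry point can accept either subtype, which is the generic-method use case that seemed out of reach without polymorphism.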

There you have it. What you’ve created is a union-like structure that can only take on the types and names that you’ve pre-allowed. Note that this is not a true variant type (since it can’t contain just any value), but in practice, it seems that a true variant isn’t all that useful. It is much better to predefine what types you can understand up front so that you know what to do in every case; after all, that’s what the Thrift IDL is there to do, right?

This approach seems to perform pretty well, though there is at least a minor performance hit for very large structs with lots of optional fields. I think that serialization speed would roughly double (in Java) if we were clever enough to serialize only the one field that was set, rather than checking every optional field. To that end, we’ve opened an issue to create a more proper union implementation in Thrift itself.
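The idea behind that optimization can be sketched as follows. This is only an illustration, not Thrift’s serializer or wire format: it uses plain dicts and JSON, and the `FIELD_BY_TAG` table and both function names are hypothetical. The first function visits every optional field to see whether it is set; the second uses the type tag to go straight to the one field that matters.

```python
import json

# Hypothetical mapping from type tag to field name (assumed tag values).
FIELD_BY_TAG = {1: "struct_a", 2: "struct_b"}


def serialize_checking_every_field(union: dict) -> str:
    """Visit every optional field and emit the ones that are set."""
    out = {"specific_struct_type": union["specific_struct_type"]}
    for name in FIELD_BY_TAG.values():
        value = union.get(name)
        if value is not None:
            out[name] = value
    return json.dumps(out, sort_keys=True)


def serialize_selected_only(union: dict) -> str:
    """Use the type tag to emit only the selected field, skipping the rest."""
    tag = union["specific_struct_type"]
    name = FIELD_BY_TAG[tag]
    out = {"specific_struct_type": tag, name: union[name]}
    return json.dumps(out, sort_keys=True)
```

For a struct that follows the convention, both functions produce the same output; the second simply does work proportional to one field instead of to the total number of optional fields.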


Follow Bryan on Twitter: @bryanduxbury
