tech.CurrencyFair

I’ve been looking forward to writing a post about an interesting technology called Thrift - an exciting framework that helps you separate your application tiers without having to worry about communication between the tiers themselves.

I’ve written up a brief (despite the length of this page...) introduction to Thrift and how you might go about using it, with a simple demo application tacked on. There are plenty of code snippets and examples here, so read on, and let me know how you get on in the comments below the post.

What is Thrift?

Thrift, from Facebook (and made available through Apache), is a framework for the easy development of cross-language software services. Using automatically generated code, entities can be passed between server and client without worrying about language differences or hand-building a dedicated internal data transfer API. (Technically, I suppose, this Thrift layer becomes that internal API.)

From the Thrift whitepaper (which is a great reference for more detailed information on some of the topics discussed below):

Thrift is a software library and set of code-generation tools developed at Facebook to expedite development and implementation of efficient and scalable backend services.

Its primary goal is to enable efficient and reliable communication across programming languages by abstracting the portions of each language that tend to require the most customization into a common library that is implemented in each language

Thrift uses a combination of IDL (Interface Definition Language) files to build up a model of the base types, entities (which are called Structs in Thrift) and available services that make up your system or application.

Using this model, it can then auto-generate both the client and server interface implementations in a variety of languages, which allows you to get straight into coding your systems, without having to worry about how multiple layers will communicate with each other.

Supported Languages

Thrift supports the following languages:

C++        Java
C#         OCaml
Cocoa      Perl
D          PHP
Delphi     Python
Erlang     Ruby
Haskell    Smalltalk

Thrift Types

The Thrift Type system contains pre-defined base types, user-defined custom types, user-defined structs (which are referred to in this post as entities), containers, services and exceptions.

Base types

Type                  Description
bool                  A boolean value, true or false
byte                  A signed byte
i16                   A 16-bit signed integer
i32                   A 32-bit signed integer
i64                   A 64-bit signed integer
double                A 64-bit floating point number
string                An encoding-agnostic text or binary string
list <type>           A list of entities
set <type>            A set of unique entities
map <type1, type2>    A map of unique keys (type1) to entities (type2)

Custom types

Custom types can be defined at any point, by using a typedef, which allows you to create any number of custom types, based on either a Base Type, or another Struct at its root.

// Manufacturer isn't a type, so let's alias it to a string so we can treat it differently in our code if we want to
typedef string Manufacturer

Structs

Although I’ve been referring to them almost exclusively in this post as entities, a struct can be considered the same as a class in OO languages, containing a set of strongly typed, uniquely named fields.

struct Note {
    1: required i64    noteId;
    2: required string content;
    3: required string title = "Default title - change me"; // a default value for this string is set
    4: required string authorName;
}

Containers

Consisting of list, set and map, Containers are language-specific implementations of what their names imply:

  • A list is an ordered list of entities, which may contain duplicates
  • A set is an unordered collection of unique entities
  • A map is a map of unique keys to (optionally duplicate) entities, referring to a HashMap or associative array in the various target languages
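To make those three behaviours concrete on the Java side, here is a minimal sketch using only the standard java.util collections, which is what the handler later in this post works with. The class and method names (ContainerSemantics, asList, asSet, countOccurrences) are made up for this illustration and are not part of Thrift.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical demo class, not generated by Thrift: it only shows how the
// three container kinds behave as plain Java collections.
class ContainerSemantics {

    // list<string>: keeps insertion order and allows duplicates
    static List<String> asList(String... values) {
        List<String> list = new ArrayList<String>();
        for (String value : values) {
            list.add(value);
        }
        return list;
    }

    // set<string>: duplicates collapse into a single element
    static Set<String> asSet(String... values) {
        return new HashSet<String>(asList(values));
    }

    // map<string, i32>: keys are unique, values may repeat
    static Map<String, Integer> countOccurrences(String... values) {
        Map<String, Integer> counts = new HashMap<String, Integer>();
        for (String value : values) {
            Integer current = counts.get(value);
            counts.put(value, current == null ? 1 : current + 1);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(asList("php", "java", "php"));
        System.out.println(asSet("php", "java", "php").size());
        System.out.println(countOccurrences("php", "java", "php").get("php"));
    }
}
```

The same three values go in each time; only the container decides whether the repeated "php" is kept, dropped, or counted.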

Required/optional

A field with no keyword is not required: it falls back to Thrift’s default requiredness, which for most purposes behaves like optional. You should still make a habit of stating the keyword explicitly in your IDLs, to make things clearer when examining them later.

You should pay attention to this setting, though - any time you want to send an object from your client code to your server code, you will need to obey these settings. When it comes to transporting the entities across the wire, Thrift will throw an exception if a required field is not set.

Also worth noting: as your Thrift codebase changes over time, you will likely want to decommission old fields. Once a field is defined as required, it needs to stay that way to support backwards compatibility. For this reason, you’re better off declaring fields optional and implementing some form of entity validation in your own codebase for the fields you really need to be present.
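As a rough idea of what that application-level validation could look like, here is a small Java sketch. NoteDraft is a hand-written stand-in for the Note entity (a real project would validate the class Thrift generates instead), and NoteValidator and missingFields are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hand-written stand-in for a Note entity; null means "field not set".
// In a real project this would be the class generated from the IDL.
class NoteDraft {
    Long noteId;
    String content;
    String title;
    String authorName;
}

// Hypothetical validator: the IDL fields stay optional, and the application
// decides which of them it insists on before using the entity.
class NoteValidator {

    // Returns the names of the fields the application needs but did not get
    static List<String> missingFields(NoteDraft note) {
        List<String> missing = new ArrayList<String>();
        if (note.noteId == null) {
            missing.add("noteId");
        }
        if (isBlank(note.content)) {
            missing.add("content");
        }
        if (isBlank(note.authorName)) {
            missing.add("authorName");
        }
        return missing;
    }

    private static boolean isBlank(String value) {
        return value == null || value.trim().isEmpty();
    }
}
```

Because the rule lives in your own code rather than in the IDL, you can relax or retire it later without breaking older clients.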


Exceptions

Exceptions are, for all intents and purposes, special structs. They support the same field-definition capability as structs do, but extend from a base Exception class in each language, so that they may be treated as Exceptions correctly in those languages.
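The "special struct" idea can be approximated in plain Java as follows. This is a hand-written sketch, not generated code: DemoServiceException mirrors an exception with an i32 "what" and a string "why" field, and UserLookup with its hard-coded user is invented purely for the illustration.

```java
// Hand-written approximation of an exception entity with two fields.
// It carries data like a struct, but can be thrown and caught like any
// other Java exception.
class DemoServiceException extends Exception {
    private final int what;
    private final String why;

    DemoServiceException(int what, String why) {
        super(why);
        this.what = what;
        this.why = why;
    }

    int getWhat() {
        return what;
    }

    String getWhy() {
        return why;
    }
}

// Hypothetical caller showing the exception being raised and handled
class UserLookup {

    static String findUserName(long userId) throws DemoServiceException {
        if (userId == 1) {
            return "daniel";
        }
        // The code 404 is an arbitrary choice for this sketch
        throw new DemoServiceException(404, "No user with id " + userId);
    }

    static String describe(long userId) {
        try {
            return findUserName(userId);
        } catch (DemoServiceException e) {
            return e.getWhat() + ": " + e.getWhy();
        }
    }
}
```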


Services

A service is an uninstantiable abstract class/interface, defined in a similar way to a struct. Thrift auto-generates a server and client implementation of this interface, which is what is used by your application code to communicate with your service.
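The shape of that arrangement can be sketched in plain Java without any Thrift classes. Everything here is a stand-in: MessageService plays the role of the generated interface, InMemoryMessageHandler is the implementation you would write on the server side, and MessageCounter is calling code that only knows about the interface. In a real Thrift setup the caller would hold a generated client that forwards each call over the wire.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for the interface that would be generated from a service definition
interface MessageService {
    List<String> getMessagesFor(String userName);
}

// Server side: the handler you write yourself, holding the actual logic
class InMemoryMessageHandler implements MessageService {
    // Each entry is {userName, message}
    private final List<String[]> messages = new ArrayList<String[]>();

    InMemoryMessageHandler() {
        messages.add(new String[] {"daniel", "first message"});
        messages.add(new String[] {"daniel", "second message"});
        messages.add(new String[] {"john", "another message"});
    }

    public List<String> getMessagesFor(String userName) {
        List<String> result = new ArrayList<String>();
        for (String[] entry : messages) {
            if (entry[0].equals(userName)) {
                result.add(entry[1]);
            }
        }
        return result;
    }
}

// Calling side: depends only on the interface, never on the handler class
class MessageCounter {
    static int countFor(MessageService service, String userName) {
        return service.getMessagesFor(userName).size();
    }
}
```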

Putting it all together

In order to explain all of the individual components a bit more clearly, let’s put them together to create a simple, read-only, Twitter-like service, called DanTweets.

To begin with, this is defined as a simple PHP front end (client) app, consuming data from a Java backend. (Given the basic nature of this app, converting either tier to another language of your choice should be trivial, so don’t worry if neither of these is your preferred language.)

Remember, one of the major benefits of Thrift, is that it allows you to build services in any of the supported languages, without having to worry about how those services are consumed, so it really doesn’t matter what languages we choose for this demonstration!

Installation

First things first, we need to install Thrift.

The official website has more complete install instructions, but if you’re running OS X (which this post was written on) then you can just follow these easy steps:

# Download and extract Thrift
wget http://ftp.heanet.ie/mirrors/www.apache.org/dist/thrift/0.9.0/thrift-0.9.0.tar.gz
tar -zxvf thrift-0.9.0.tar.gz

# Configure, build and install from inside the extracted directory
cd thrift-0.9.0
./configure
make
sudo make install

Required files

Jars

In order to compile your various class files, you’ll need to download a couple of Jars. Go and grab SLF4J and Log4J.

The jars I have in my lib directory are:

  • libthrift-0.9.0.jar (you’ll get this from the Thrift download)
  • log4j-1.2-api-2.0-beta7.jar
  • log4j-api-2.0-beta7.jar
  • log4j-core-2.0-beta7.jar
  • slf4j-api-1.7.5.jar
  • slf4j-log4j12-1.7.5.jar

PHP Thrift Source

You’ll also need to grab the Thrift source from the main site. Place the extracted thrift-0.9.0 directory into your test application directory.


Now, for the code

Thrift is now installed, the jar files are in place, and we have the Thrift PHP source files. Next, recreate the directory structure below for our application:

.
 - client
 - lib
 - server
 - thrift
 - thrift-0.9.0 // this is the Thrift source directory
    - lib
       - php
          - lib
So, let’s get started with our files…

First up, we need to define our IDLs. These are used as the base for our application communication layer, so we’ll begin with a common library that can be used by our services.

./thrift/Common.thrift

namespace java com.dantweets.common  // defines the Java namespace
namespace php ThriftClient.common  // defines a PHP namespace

// Date isn't a type, so let's encode it as a string and decode it back on the other side
// also, note no semicolon
typedef string Date

exception ServiceException {
  1: i32    what,
  2: string why
}

struct User {
  1: required i64    userId;
  2: required string userName;
}

struct Tweet {
  1: required i64    tweetId;
  2: required User   user;
  3: required string message;
  4: required Date   created;
  5: optional bool   promoted;
  // another way to define this field could have been:
//5: required bool   promoted = false;
}
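Since the Date typedef above travels as a plain string, each side needs to agree on how to encode and decode it. The following is one possible sketch in Java, assuming the yyyy-MM-dd form that the demo data in this post uses; DateCodec and its method names are hypothetical helpers, not something Thrift provides.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical helper: converts between a real date type and the string
// that is sent over the wire for the Date typedef.
class DateCodec {
    // ISO_LOCAL_DATE formats and parses dates as yyyy-MM-dd
    private static final DateTimeFormatter FORMAT = DateTimeFormatter.ISO_LOCAL_DATE;

    // Sending side: turn the date into its wire string
    static String encode(LocalDate date) {
        return date.format(FORMAT);
    }

    // Receiving side: turn the wire string back into a date
    static LocalDate decode(String wireValue) {
        return LocalDate.parse(wireValue, FORMAT);
    }
}
```

A client in another language would need the equivalent pair of functions using the same format, which is the price of encoding a richer type as a string.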

Next up, is our service itself:

./thrift/DanTweetService.thrift

namespace java com.dantweets.server  // defines the Java namespace
namespace php ThriftClient.server  // defines the PHP namespace (this must match the one the client code uses)

include "Common.thrift" // Note that doing this operates as a simple Thrift namespace

// DanTweet Service
service DanTweetService {
    list<Common.Tweet> getAllTweets() throws (1:Common.ServiceException ex);
    list<Common.Tweet> getTweetsForUser(1:Common.User user) throws (1:Common.ServiceException ex); // See the 'Common' namespace in use here?
    Common.User getUserById(1:i64 userId) throws (1:Common.ServiceException ex); // called by the client and implemented by the handler
}

Now, we can define the client/consumer code:

client/consumer.php

<?php

namespace DanTweet;

$thriftRoot = '../thrift-0.9.0/lib/php/lib';
require_once($thriftRoot . '/Thrift/ClassLoader/ThriftClassLoader.php');
$classLoader = new \Thrift\ClassLoader\ThriftClassLoader();
$classLoader->registerNamespace('Thrift', $thriftRoot);
$classLoader->register();

use ThriftClient\server\DanTweetServiceClient;

require_once '../thrift/gen-php/ThriftClient/common/Types.php';
require_once '../thrift/gen-php/ThriftClient/server/Types.php';
require_once '../thrift/gen-php/ThriftClient/server/DanTweetService.php';

$host = 'localhost';
$port = '4000';

echo "Connecting to {$host}:{$port}...\n";
try {
    // Create a socket
    $socket    = new \Thrift\Transport\TSocket($host, $port);
    $transport = new \Thrift\Transport\TBufferedTransport($socket);
    $protocol  = new \Thrift\Protocol\TBinaryProtocol($transport);

    // Instantiate a client
    $client = new DanTweetServiceClient($protocol);

    $transport->open(); //open the transport
    $tweetList = $client->getAllTweets(); //get the total list of tweets
    $transport->close(); //close the transport

    //dump the count of tweets
    echo "All-tweets count:\t" . count($tweetList) . "\n";

    //there should be 3 of them
    foreach ($tweetList as $tweet) {
        echo "{$tweet->user->userName}: {$tweet->message}\n";
    }

    echo "-------------------------------------------------\n";

    $transport->open();
    $user = $client->getUserById(1); //get user by id
    $transport->close();

    $transport->open();
    $userTweets = $client->getTweetsForUser($user); //get the total list of tweets for this user
    $transport->close();

    echo "Tweet count for " . $user->userName . ":\t" . count($userTweets) . "\n";

    //there should be 2 of them
    foreach ($userTweets as $tweet) {
        echo "{$tweet->user->userName}: {$tweet->message}\n";
    }
} catch (\Thrift\Exception\TException $e) {
    echo 'Something went horribly, horribly wrong: ' . $e->getMessage();
}

And finally, the two Java files that make up our service layer:

./server/Server.java

import org.apache.thrift.*;
import org.apache.thrift.server.*;
import org.apache.thrift.transport.*;
import com.dantweets.server.*;
import com.dantweets.common.*;

public class Server {
    public static void main(String[] args) {
        try {
            DanTweetServiceHandler danTweetServiceHandler = new DanTweetServiceHandler();

            int port = 4000;

            TServerSocket serverTransport = new TServerSocket(port);
            TThreadPoolServer.Args threadServer = new TThreadPoolServer.Args(serverTransport);
            TServer server = new TThreadPoolServer(threadServer.processor(danTweetServiceHandler.getProcessor()));

            System.out.println(" - Starting server on port " + port);
            server.serve();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

./server/DanTweetServiceHandler.java

import org.apache.thrift.TException;
import org.apache.thrift.TProcessor;

import com.dantweets.common.ServiceException;
import com.dantweets.common.Tweet;
import com.dantweets.common.User;
import com.dantweets.server.DanTweetService;

import java.util.List;
import java.util.ArrayList;
import java.util.Date;
import java.util.Map;
import java.util.HashMap;

public class DanTweetServiceHandler implements DanTweetService.Iface {
    private List<Tweet> tweets = new ArrayList<Tweet>();
    private Map<Integer, User> users = new HashMap<Integer, User>();

    public DanTweetServiceHandler () {
        /**
         * For demo purposes, we'll pre-define a list of 3 users on our system
         * We'll also create 3 fake tweets (for 2 of our users in total)
         */
        String[] names = {"daniel", "john", "paul"};

        int count = 1;

        for (String name : names) {
            User user = new User();
            user.setUserId(count++);
            user.setUserName(name);

            Integer key = new Integer((int)user.getUserId());
            users.put(key, user);
        }

        Tweet t1 = new Tweet();
        t1.setTweetId(1);
        t1.setUser(users.get(new Integer(1)));
        t1.setMessage("Dan's first tweet");
        t1.setCreated("2013-07-02");
        tweets.add(t1);

        Tweet t2 = new Tweet();
        t2.setTweetId(2);
        t2.setUser(users.get(new Integer(1)));
        t2.setMessage("Dan's second tweet");
        t2.setCreated("2013-07-03");
        tweets.add(t2);

        Tweet t3 = new Tweet();
        t3.setTweetId(3);
        t3.setUser(users.get(new Integer(2)));
        t3.setMessage("John's first tweet");
        t3.setCreated("2013-07-04");
        tweets.add(t3);
    }

    public TProcessor getProcessor() {
        return new DanTweetService.Processor(this);
    }

    /**
     * This function, and getTweetsForUser(), simply return the pre-defined list of tweets in this class
     * In a real-world situation, we would actually be querying Twitter from this point in the code, to
     * get the data we wanted
     *
     * For demo purposes, it's simpler to just return some data so we can see how it all works, without
     * overcomplicating things
     */
    public List<Tweet> getAllTweets() throws ServiceException, TException {
        return tweets;
    }

    public List<Tweet> getTweetsForUser(User user) throws ServiceException, TException {
        List<Tweet> userTweets = new ArrayList<Tweet>();

        for (Tweet tweet : tweets) {
            if (tweet.getUser().getUserId() == user.getUserId()) {
                userTweets.add(tweet);
            }
        }

        return userTweets;
    }

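    public User getUserById(long userId) throws ServiceException, TException {
        Integer key = new Integer((int)userId);

        if (users.containsKey(key)) {
            return users.get(key);
        }

        // A Thrift service method cannot send null back as its result, so an
        // unknown user is reported through ServiceException instead
        // (the 404 code is an arbitrary choice for this demo)
        ServiceException notFound = new ServiceException();
        notFound.setWhat(404);
        notFound.setWhy("No user found with id " + userId);
        throw notFound;
    }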
}

Build the Thrift files

Finally, you’ve created the 5 source files required for this demo, and are ready to test.

Generate the thrift files

cd ./thrift
for i in *.thrift; do thrift --gen java:server $i; done
for i in *.thrift; do thrift --gen php:oop $i; done

Providing you’ve installed Thrift correctly, this should generate 2 directories - gen-java and gen-php, containing both the server and client files for Java, and the client files for PHP.

It’s these files that we will be using in our small application, so remember, if you make any changes to your IDL files, you will need to re-run this generation process!

Oh - you won’t need to move these files anywhere for this demo. All of the example code, compilation and execution commands take into account these auto-generated filenames.

Compile, then run, the java service code

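# Run from the application root directory
cd ./server
# The leading "." keeps the current directory on the classpath, so that "java Server" can find the compiled classes
export CLASSPATH=.:../lib/libthrift-0.9.0.jar:../lib/slf4j-api-1.7.5.jar:../lib/slf4j-log4j12-1.7.5.jar:../thrift/gen-java/
javac *.java
java Server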

At this point, you should see the output from your server code: - Starting server on port 4000

Run your client code

# Run in a second terminal, from the application root directory, while the server is still running
cd ./client
php consumer.php

Now, if everything is working as expected, you should see the following output:

Connecting to localhost:4000...
All-tweets count:   3
daniel: Dan's first tweet
daniel: Dan's second tweet
john: John's first tweet
-------------------------------------------------
Tweet count for daniel: 2
daniel: Dan's first tweet
daniel: Dan's second tweet

Recap

So, what have we managed to do?

We have:

  • Learned about, and installed Thrift
  • Created 2 IDL files
    • A common IDL (you could consider it a library)
    • And a service IDL - remember, you can create as many services as you want within this file, or within individual files. You just need to instantiate them on the client and server side to use them as you would expect. (the next Thrift post will cover more advanced usages of services, so stay tuned!)
  • Created a basic Java service layer, that returns lists of tweets, filtered lists of tweets, and users
  • Created a basic PHP consumer/client layer, that queries the service tier for data and displays it

What next?

Well, you have now created an end-to-end test implementation of Thrift, in two different languages! Granted, there are considerably more complex approaches that could (and, really, should) be taken if you’re going to apply any of this to a real-world application, but this should give you a good taste of how Thrift works at a very basic level.

Let me know in the comments if you have any questions about Thrift, or how you may implement it in your own projects. It’s not an easy thing to get your head around at first, but believe me, once you integrate it into your systems and workflow, you’ll realise just how useful this sort of approach can be.

Finally, if you’re looking for more information on Thrift, make sure to check out the official Thrift website, or you can read the excellent Thrift: The Missing Guide by Diwaker Gupta.

Until next time,

Daniel.

