Rust Pickling 

by simbo1905

The first item to research on my Rust spike is pickling. A quick survey of the landscape indicates that serde is the current de facto standard framework. Pickling was an area that I chose to hand code in the TRex Scala implementation. Why? Because Scala’s pickling engine crashed my JVM and wasn’t yet at the stage of having a stable disk format. I want to have no external dependencies for the inner Paxos library and as few as possible elsewhere. As Scala’s official pickling engine wasn’t ready for prime time, I was on my own. I had some fun writing my own ByteChain-based pickling, where I wrote a compact binary wire encoding for unsigned integers. With Rust, does serde put me in a better place?

Looking at the serde examples, the easy route is to annotate your structs so that the framework generates the serialization and deserialization logic:

use serde::{Deserialize, Serialize};

#[derive(Serialize, Deserialize)]
pub struct BallotNumber {
    pub counter: u32,
    pub node_identifier: u32,
    pub era: u32,
}

I got this going very quickly over on GitHub. serde provides the generic framework that generates the code for your struct. Different projects then supply support for specific encodings. This gives you a good choice of wire formats, including BSON.

A quick skim of the documentation suggests that you get fast zero-copy performance out of the box. You don’t need much imagination to guess how that could work in Rust. Typically you serialize an object into an intermediate structure that is then copied to an output buffer that is flushed to the wire. The intermediate format can borrow the underlying data within the objects you are pickling. Rust’s sophisticated lifetime model should be able to ensure that the intermediate format doesn’t live longer than the object itself. Rust should also be able to reclaim the space for everything when the longest-lived thing falls out of scope.
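To make that guess concrete, here is a minimal standard-library sketch of the borrowing idea. It is not how serde is implemented internally. The `Accept` message, the `Frame` view and the 12-byte header layout are all hypothetical names and choices made up for illustration.

```rust
use std::io::{self, Write};

/// Hypothetical message type that owns its payload bytes.
pub struct Accept {
    pub slot: u64,
    pub value: Vec<u8>,
}

/// Intermediate structure that borrows the payload rather than copying it.
/// The lifetime 'a ties the frame to the message it was built from, so the
/// compiler rejects any frame that would outlive its message.
pub struct Frame<'a> {
    header: [u8; 12],
    body: &'a [u8],
}

impl<'a> Frame<'a> {
    /// Build a frame: 8 bytes of slot, 4 bytes of payload length (both
    /// big-endian, an assumed layout), then a borrowed view of the payload.
    pub fn new(msg: &'a Accept) -> Frame<'a> {
        let mut header = [0u8; 12];
        header[..8].copy_from_slice(&msg.slot.to_be_bytes());
        header[8..].copy_from_slice(&(msg.value.len() as u32).to_be_bytes());
        Frame { header, body: &msg.value }
    }

    /// Copy the header and the borrowed payload into the output buffer.
    pub fn write_to<W: Write>(&self, out: &mut W) -> io::Result<()> {
        out.write_all(&self.header)?;
        out.write_all(self.body)
    }
}

/// Encode a message into a fresh buffer; the payload is copied exactly once,
/// when the frame is written out.
pub fn encode(msg: &Accept) -> io::Result<Vec<u8>> {
    let frame = Frame::new(msg);
    let mut out = Vec::with_capacity(12 + msg.value.len());
    frame.write_to(&mut out)?;
    Ok(out)
}
```

Calling `encode(&msg)` returns the header followed by the payload. The borrow checker will not compile code that drops `msg` while a `Frame` built from it is still in use, which is the guarantee the paragraph above is guessing at.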

Okay, knowing the principles is not the same as getting all that working yourself. I am certainly a long way from knowing enough Rust to have an easy time reading the serde codebase. In the meantime, the good news is that I can expect the serde library to do a good job. I can hope that swapping between wire formats provided by third parties won’t break the core serde logic. That’s the theory anyway.

A problem with the easy route is that I don’t want to put annotations from a third-party framework onto my core library objects. Luckily, a quick skim of the serde documentation shows that it supports writing custom serialization logic by hand. Having the choice between automatically generated and manually written logic is great. I can prototype using the annotations. Once my message structures are stable, I can hand-crank the serializer logic into a module outside of the core Paxos library. That can be shipped in a separate crate. At the same time, as unsigned integers aren’t a valid BSON type, I can upgrade them to long values and then compact them as byte arrays, as I did in Scala. Happy days.
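As a rough sketch of that plan, the code below keeps the core struct free of annotations and puts the pickling in a separate module. It uses only the standard library. The module names `paxos` and `pickle`, the function names, and the 7-bits-per-byte variable-length encoding are my assumptions for illustration. The encoding is not necessarily the one used in the Scala ByteChain code, and serde’s own traits would take the place of these free functions in a real serde-based version.

```rust
/// Core library module: plain data with no serialization annotations.
pub mod paxos {
    #[derive(Debug, PartialEq, Clone, Copy)]
    pub struct BallotNumber {
        pub counter: u32,
        pub node_identifier: u32,
        pub era: u32,
    }
}

/// Separate pickling module (hypothetical; it could live in its own crate).
pub mod pickle {
    use super::paxos::BallotNumber;
    use std::convert::TryFrom;

    /// Append `value` as a variable-length byte sequence: 7 bits per byte,
    /// low bits first, high bit set on every byte except the last.
    pub fn write_varint(mut value: u64, out: &mut Vec<u8>) {
        while value >= 0x80 {
            out.push(((value & 0x7f) as u8) | 0x80);
            value >>= 7;
        }
        out.push(value as u8);
    }

    /// Read one varint from the front of `input`, returning the value and
    /// the remaining bytes, or None if the input ends mid-value.
    pub fn read_varint(input: &[u8]) -> Option<(u64, &[u8])> {
        let mut value = 0u64;
        for (i, &byte) in input.iter().enumerate().take(10) {
            value |= u64::from(byte & 0x7f) << (7 * i);
            if byte & 0x80 == 0 {
                return Some((value, &input[i + 1..]));
            }
        }
        None
    }

    /// Widen each u32 field to a signed 64-bit "long", then compact it.
    pub fn encode_ballot(b: &BallotNumber) -> Vec<u8> {
        let mut out = Vec::new();
        for &field in [b.counter, b.node_identifier, b.era].iter() {
            let widened: i64 = i64::from(field); // never negative for a u32
            write_varint(widened as u64, &mut out);
        }
        out
    }

    /// Decode three varints and narrow each back to u32, rejecting overflow.
    pub fn decode_ballot(input: &[u8]) -> Option<BallotNumber> {
        let (counter, rest) = read_varint(input)?;
        let (node_identifier, rest) = read_varint(rest)?;
        let (era, _rest) = read_varint(rest)?;
        Some(BallotNumber {
            counter: u32::try_from(counter).ok()?,
            node_identifier: u32::try_from(node_identifier).ok()?,
            era: u32::try_from(era).ok()?,
        })
    }
}
```

With this encoding, small values take a single byte each, so a ballot number with a small counter, node identifier and era needs only a few bytes rather than the fixed twelve of three raw `u32` values.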

As we will see in the next post, Rust also has a great feature to lower the boilerplate overhead of putting the serialization logic into a separate module.