Professional Open Source: Maintaining API, Binary, and Wire Compatibility

We’re in the process of defining some community standards for Akka.NET, part of which is expanding and modernizing our contributor guidelines to help users answer the question “how do I know if my pull request will be merged?” before having to ask anyone on the development team.

Why bother doing any of this? Well, step 1 in the formula for creating sustainable open source software projects is to treat your project with a high degree of professionalism. This is a baseline requirement for gaining the critical mass of end-user adoption you’ll need for any type of financial viability, should you choose to go down that route.

There are a lot of things that go into professionalizing an open source project: documentation, repeatable builds, active issue management, and so on. But the issue we’re going to address today is one of the most difficult to maintain: binary, API, and wire compatibility.

Why Compatibility Matters

Compatibility is the expectation that when I upgrade from one version of a library to a newer one, I’ll experience either no breaking changes or some, depending on whether I’m moving between revisions or across a major version. This is the idea behind Semantic Versioning (SemVer) - a convention that uses version numbers to explicitly set these expectations for users.
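As a rough sketch of how SemVer encodes those expectations, the helper below classifies an upgrade by which part of the version number changed. The `UpgradeRisk` enum and `SemVerExpectations.Classify` are hypothetical names made up for this illustration, and because the sketch leans on `System.Version` for parsing, pre-release tags such as `-beta1` are out of scope.

```csharp
using System;

// Hypothetical names for illustration; not part of Akka.NET or any SemVer library.
public enum UpgradeRisk
{
    NoChange,
    BugFixesOnly,           // patch bump: backwards-compatible fixes
    NewFeatures,            // minor bump: backwards-compatible additions
    BreakingChangesAllowed  // major bump, or anything in the 0.y.z range
}

public static class SemVerExpectations
{
    public static UpgradeRisk Classify(string from, string to)
    {
        var current = Version.Parse(from); // expects "major.minor.patch"
        var target = Version.Parse(to);

        if (current == target) return UpgradeRisk.NoChange;

        // SemVer treats 0.y.z as initial development: anything may change.
        if (current.Major == 0 || target.Major != current.Major)
            return UpgradeRisk.BreakingChangesAllowed;

        if (target.Minor != current.Minor) return UpgradeRisk.NewFeatures;

        return UpgradeRisk.BugFixesOnly;
    }
}
```

Under these assumptions, going from `1.4.10` to `1.4.11` classifies as `BugFixesOnly`, while going from `1.4.11` to `2.0.0` classifies as `BreakingChangesAllowed`.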

The expectation of compatibility is essential to building trust between the consumers and producers of OSS. Some consumers are early adopters who have no problem consuming the latest nightly builds; others are conservative consumers, like major banks and health-care companies, who have to plan upgrades of a critical dependency on a quarterly or biannual basis. A consistent, robust versioning and compatibility management strategy is key to satisfying that entire spectrum of consumers, which is one of the things needed to build the aforementioned critical mass.

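Let’s define the three types of compatibility, using Akka.NET as an example:

  1. Source compatibility - if I upgrade to a newer version of Akka.NET, my existing code still compiles without being rewritten;
  2. Binary compatibility - applications and libraries compiled against an older version of Akka.NET keep working against the newer version without being recompiled; and
  3. Wire compatibility - the serialized output from one version of Akka.NET is readable by another version.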

Compatibility matters because there are real risks and uncertainty that go into upgrading a dependency developed by a third party. If I upgrade to a new version of Akka.NET and suddenly my code can’t compile, learning why that happened and what to do about it presents a new and perhaps totally unexpected cost for me. Users don’t like those types of surprises.

By following a scheme like SemVer and explicitly maintaining strict versioning rules over time, you reduce the uncertainty for consumers by both accurately setting and meeting their expectations. Projects that do this consistently become less risky over time and are rewarded for it with increased trust and adoption.

Binary and Source Compatibility

Consider the following data type in C#:

/// <inheritdoc />
/// <summary>
/// Represents a "buy"-side event
/// </summary>
public sealed class Bid : IWithStockId, IWithOrderId
{
    public Bid(string stockId, string orderId, decimal bidPrice, 
        double bidQuantity, DateTimeOffset timeIssued)
    {
        StockId = stockId;
        BidPrice = bidPrice;
        BidQuantity = bidQuantity;
        TimeIssued = timeIssued;
        OrderId = orderId;
    }

    public string StockId { get; }

    public decimal BidPrice { get; }

    public double BidQuantity { get; }

    public DateTimeOffset TimeIssued { get; }

    public string OrderId { get; }
}

I want to add support for “margin trading” - the ability to buy / sell equities on credit - to my application. The accounting is different when a trade is executed on margin so I need to have some way of differentiating a margin trade from a cash trade.

I’ll add a new property to my immutable class, so I change the code to the following:

/// <inheritdoc />
/// <summary>
/// Represents a "buy"-side event
/// </summary>
public sealed class Bid : IWithStockId, IWithOrderId
{
    public Bid(string stockId, string orderId, decimal bidPrice, 
        double bidQuantity, DateTimeOffset timeIssued, bool onMargin)
    {
        StockId = stockId;
        BidPrice = bidPrice;
        BidQuantity = bidQuantity;
        TimeIssued = timeIssued;
        OrderId = orderId;
        OnMargin = onMargin;
    }

    public string StockId { get; }

    public decimal BidPrice { get; }

    public double BidQuantity { get; }

    public DateTimeOffset TimeIssued { get; }

    public string OrderId { get; }

    public bool OnMargin { get; }
}

This change is neither source compatible nor binary compatible. It changes the method signature of the Bid class constructor, so any code that depends on this Bid message is going to need to be rewritten to pass in the additional onMargin parameter value.

What if we tried using optional parameters in C# to pass in a default value for onMargin?

/// <inheritdoc />
/// <summary>
/// Represents a "buy"-side event
/// </summary>
public sealed class Bid : IWithStockId, IWithOrderId
{
    public Bid(string stockId, string orderId, decimal bidPrice, 
        double bidQuantity, DateTimeOffset timeIssued, 
        bool onMargin = false) // defaults to cash, which 100% of older trades were
    {
        StockId = stockId;
        BidPrice = bidPrice;
        BidQuantity = bidQuantity;
        TimeIssued = timeIssued;
        OrderId = orderId;
        OnMargin = onMargin;
    }

    public string StockId { get; }

    public decimal BidPrice { get; }

    public double BidQuantity { get; }

    public DateTimeOffset TimeIssued { get; }

    public string OrderId { get; }

    public bool OnMargin { get; }
}

This change is source compatible but not binary compatible. It’s source compatible because any code that previously called Bid’s constructor as-is can still compile without being rewritten. It’s not binary compatible because the method signature of Bid’s constructor has actually changed, even though that’s hidden at the source level. Applications and downstream libraries compiled against an older version of this library may therefore throw a MissingMethodException, as the expected signature and the actual signature are no longer the same.
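One way to see the binary break for yourself is to look for the old five-parameter constructor with reflection, which is loosely analogous to what the runtime does when it resolves a call that was compiled against the previous version. This is only a sketch: `BidWithOptionalParameter` is a trimmed-down stand-in for the Bid class above, and `ConstructorProbe` is a hypothetical helper, not something from Akka.NET.

```csharp
using System;

// Trimmed-down stand-in for the Bid class above (hypothetical name).
public sealed class BidWithOptionalParameter
{
    public BidWithOptionalParameter(string stockId, string orderId, decimal bidPrice,
        double bidQuantity, DateTimeOffset timeIssued, bool onMargin = false)
    {
        StockId = stockId;
        OnMargin = onMargin;
    }

    public string StockId { get; }

    public bool OnMargin { get; }
}

// Hypothetical helper for illustration.
public static class ConstructorProbe
{
    // The parameter list that callers compiled against the original Bid expect.
    private static readonly Type[] OriginalSignature =
    {
        typeof(string), typeof(string), typeof(decimal),
        typeof(double), typeof(DateTimeOffset)
    };

    // True only if a public constructor with exactly the original signature exists.
    public static bool HasOriginalConstructor(Type type) =>
        type.GetConstructor(OriginalSignature) != null;
}
```

Here `ConstructorProbe.HasOriginalConstructor(typeof(BidWithOptionalParameter))` returns `false`: the only constructor in the compiled type takes six parameters, and the default value is filled in by the compiler at each call site rather than by the runtime.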

So how can we introduce this change in a way that is binary compatible?

/// <inheritdoc />
/// <summary>
/// Represents a "buy"-side event
/// </summary>
public sealed class Bid : IWithStockId, IWithOrderId
{
    public Bid(string stockId, string orderId, decimal bidPrice, 
        double bidQuantity, DateTimeOffset timeIssued) 
        : this(stockId, orderId, bidPrice, bidQuantity,
            timeIssued, false) // default to cash
    {
       
    }

    // new signature
    public Bid(string stockId, string orderId, decimal bidPrice, 
        double bidQuantity, DateTimeOffset timeIssued, 
        bool onMargin) 
    {
        StockId = stockId;
        BidPrice = bidPrice;
        BidQuantity = bidQuantity;
        TimeIssued = timeIssued;
        OrderId = orderId;
        OnMargin = onMargin;
    }

    public string StockId { get; }

    public decimal BidPrice { get; }

    public double BidQuantity { get; }

    public DateTimeOffset TimeIssued { get; }

    public string OrderId { get; }

    public bool OnMargin { get; }
}

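This change is binary compatible because the original five-parameter constructor still exists with exactly the same signature, so code compiled against the older version of the library still binds to it. The new onMargin value is only exposed through an additional overload, which also keeps the change source compatible.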

In addition to that, all of the defaults are safe - you couldn’t make any margin trades in a previous version of the software, so automatically adapting all Bid messages to OnMargin == false is behaviorally compatible as well.
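A check along these lines can guard the old signature and its default against accidental removal in a later release. It’s a minimal sketch under assumptions: `BidWithOverload` is a trimmed-down stand-in for the Bid class above, and `CompatibilityGuard` is a hypothetical name rather than part of any test framework.

```csharp
using System;

// Trimmed-down stand-in for the Bid class above (hypothetical name).
public sealed class BidWithOverload
{
    // Original signature, preserved for callers compiled against the old version.
    public BidWithOverload(string stockId, string orderId, decimal bidPrice,
        double bidQuantity, DateTimeOffset timeIssued)
        : this(stockId, orderId, bidPrice, bidQuantity, timeIssued, false)
    {
    }

    // New signature.
    public BidWithOverload(string stockId, string orderId, decimal bidPrice,
        double bidQuantity, DateTimeOffset timeIssued, bool onMargin)
    {
        StockId = stockId;
        OnMargin = onMargin;
    }

    public string StockId { get; }

    public bool OnMargin { get; }
}

// Hypothetical guard for illustration.
public static class CompatibilityGuard
{
    public static bool OldCallersStillGetCashTrades()
    {
        // Look up the constructor exactly as older callers compiled against it.
        var original = typeof(BidWithOverload).GetConstructor(new[]
        {
            typeof(string), typeof(string), typeof(decimal),
            typeof(double), typeof(DateTimeOffset)
        });
        if (original == null) return false; // the binary-compatible overload is gone

        var bid = (BidWithOverload)original.Invoke(new object[]
        {
            "MSFT", "order-1", 10.5m, 100.0, DateTimeOffset.UtcNow
        });
        return !bid.OnMargin; // old callers must keep producing cash trades
    }
}
```

Running something like `CompatibilityGuard.OldCallersStillGetCashTrades()` as part of a test suite covers both halves of the argument: the old signature is still there, and it still defaults to a cash trade.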

Wire Compatibility

Wire compatibility refers to whether the serialized output from one version of the library is readable by another version of the same library.

Let’s imagine we’re using Json.NET to perform polymorphic serialization between multiple Akka.NET nodes participating in a shared Akka.NET cluster and we introduce a binary-incompatible change into our Bid message:

[Figure: breaking wire change inside an Akka.NET cluster]

N.B.: Polymorphic serialization is the practice of using your application’s types to perform double-duty as your wire types - in .NET polymorphic serialization relies heavily on reflection. It’s a good practice for rapidly prototyping applications, but in production it can also be quite non-performant (reflection is slow), brittle (changes in your code can easily break wire compat), and insecure (it instantiates types on the fly). You’re almost always better off using schema-based serialization, such as Google Protobuf, as it allows you to explicitly control these factors and separate your concerns appropriately.

When a breaking wire change is introduced, older versions of the software lose the ability to communicate with the newer version because Json.NET’s reflection can’t find a constructor signature that matches the newer type introduced in version v1.2 of the software.

This is an even more severe problem when you’re working with persistent data, such as file formats and database storage, because a break in wire compatibility can result in losing the ability to read years’ worth of application data.

To evolve the wire format of an application, you have to design a strategy that accomplishes one or both of the following:

  1. Backwards-compatible - older versions of the wire format can be correctly read by newer versions of the library; and
  2. Forwards-compatible - older versions of the software can read the wire format produced by newer versions of the library.

Akka.NET is in the unfortunate position of having to maintain both forwards and backwards compatibility because we’re used in live, highly available networking systems and updating those systems typically requires both new and older versions of the software being used concurrently within the same network for brief periods of time during a deployment.

So how do we maintain wire compatibility in both directions?

  1. Adopt extend-only design - I first encountered extend-only design in the world of SQL, where it gave me the ability to introduce changes to our SQL schema well in advance of actually deploying our applications. Extend-only design is simple: you never remove or modify anything from previous versions of your wire format; the only changes allowed are extensions and additions to what was there before. This means that anything from v1 of your wire format will also be included in v1.1, v2.0, and beyond. Google Protocol Buffers’ guide to “Updating a Message Type” explains how to do this well when using their format. Extend-only design preserves backwards compatibility.
  2. Adopt the Tolerant Reader pattern - the Tolerant Reader pattern is necessary for forwards-compatibility, as it allows older versions of the software to simply ignore parts of the wire format they don’t recognize or use without failing or faulting. The degree to which this is feasible varies application by application.
  3. Introduce new wire types before you use them - sometimes the Tolerant Reader isn’t quite enough because you don’t want to lose data that can’t be read by older versions of the software. So the solution to this problem is rather straightforward: introduce new wire types and the tools for reading / deserializing those wire types before you introduce the code that produces those types. That way those new wire types can be safely introduced and read without data loss at some point in the future.
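The first two points can be sketched with a pair of explicit wire types. A loud caveat: this example uses System.Text.Json, which ships with modern .NET, rather than Json.NET or Protobuf, purely to stay self-contained. It relies on that serializer’s default behavior of skipping JSON properties it can’t map and leaving missing ones at their default values. `BidWireV1`, `BidWireV2`, and `WireCompatDemo` are hypothetical names.

```csharp
using System;
using System.Text.Json;

// Hypothetical wire type, kept separate from the domain Bid class.
public sealed class BidWireV1
{
    public string StockId { get; set; }
    public string OrderId { get; set; }
    public decimal BidPrice { get; set; }
}

// Extend-only: every V1 field is kept as-is and OnMargin is purely additive.
public sealed class BidWireV2
{
    public string StockId { get; set; }
    public string OrderId { get; set; }
    public decimal BidPrice { get; set; }
    public bool OnMargin { get; set; } // absent in V1 payloads, so it reads as false (cash)
}

public static class WireCompatDemo
{
    // Backwards compatibility: a newer reader consumes an older payload.
    public static BidWireV2 NewReaderOldPayload()
    {
        string v1Json = JsonSerializer.Serialize(
            new BidWireV1 { StockId = "MSFT", OrderId = "order-1", BidPrice = 10.5m });
        return JsonSerializer.Deserialize<BidWireV2>(v1Json);
    }

    // Forwards compatibility: an older reader tolerates the unknown OnMargin field.
    public static BidWireV1 OldReaderNewPayload()
    {
        string v2Json = JsonSerializer.Serialize(
            new BidWireV2 { StockId = "MSFT", OrderId = "order-2", BidPrice = 11m, OnMargin = true });
        return JsonSerializer.Deserialize<BidWireV1>(v2Json);
    }
}
```

Note that the older reader in `OldReaderNewPayload` silently drops `OnMargin = true`. That’s exactly the kind of data loss the third point is meant to avoid, by shipping the V2 reader before anything starts producing V2 payloads.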

When we do all of these things we can maintain backwards compatibility:

[Figure: backwards wire format compatibility]

And maintain forwards compatibility:

[Figure: forwards wire format compatibility]

For most applications and libraries backwards compatibility is all that you need. But for OSS that is used for constructing distributed systems you need to have a model for handling both forwards and backwards compatibility. The strategy listed above is simple and works well.

Other Types of Compatibility Expectations

There are some other compatibility expectations that matter to end-users, which must be balanced alongside the technical expectations such as wire and binary compatibility:

Behavioral Compatibility

Behavioral compatibility is the expectation that all of the functions in Akka.NET v1.4.11 perform the same role they did in v1.4.10. If you change a default configuration value or the internal implementation of a method in such a way that it radically alters that feature’s scope or behavior, you’re effectively introducing a “breaking” change even though that change may be binary and source compatible.

Example: a couple of years ago we introduced a change in a minor revision of Akka.NET that caused Akka.Cluster.Sharding entity actors to automatically “passivate” (gracefully shut down after persisting their state). This is actually a breaking change even though it’s binary and source compatible, and it should never have been introduced in a revision.
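A change like this never touches a public signature, which is why neither the compiler nor a binary-compatibility check will catch it. The sketch below uses a hypothetical `EntitySettings` class, which is not the real Akka.Cluster.Sharding API, to show how a test can pin a default so that changing it becomes a deliberate decision instead of an accident.

```csharp
using System;

// Hypothetical settings type for illustration; not the Akka.Cluster.Sharding API.
public sealed class EntitySettings
{
    public EntitySettings(TimeSpan? passivateIdleAfter)
    {
        PassivateIdleAfter = passivateIdleAfter;
    }

    // null means "never passivate automatically".
    public TimeSpan? PassivateIdleAfter { get; }

    // Changing this to, say, TimeSpan.FromMinutes(2) compiles everywhere and keeps
    // every signature intact, yet existing entities would start shutting down.
    public static EntitySettings Default { get; } = new EntitySettings(null);
}

// Hypothetical pin for illustration.
public static class BehaviorPins
{
    // Fails loudly if the default behavior changes between releases.
    public static void DefaultsDoNotPassivate()
    {
        if (EntitySettings.Default.PassivateIdleAfter != null)
            throw new InvalidOperationException(
                "Default passivation changed: treat this as a breaking change.");
    }
}
```

Calling `BehaviorPins.DefaultsDoNotPassivate()` from a test suite forces whoever changes the default to also change the pin, which is a natural point to ask whether the release needs a bigger version bump.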

Version Longevity

One other type of expectation is version longevity - “how long can I expect the version of the software I’m on to be maintained?”

In other words: if I adopt Akka.NET v1.4, is that going to be the current major.minor version for the next six months? A year? Two years?

The reason why this matters: companies with conservative update cadences will actually slow down the rate at which they upgrade if they expect major version changes to be released frequently.

If I’m on .NET Core 3.1 and it’s working well enough, why would I upgrade to .NET 5 if I know that .NET 6 is coming out in 12 months and .NET 7 12 months after that? Maybe I’ll just stick to doing one big upgrade every three years? The overhead to these conservative adopters is the switching costs between versions and the risk management that goes into making sure the new version doesn’t introduce any new unknowns.

This is why Microsoft is marking some .NET releases as LTS (long-term support) and others as not-LTS - these labels tell more risk-averse users which releases they can skip and which ones are worth the costs.

Ultimately, you can’t please everybody when it comes to release plans so it’s not worthwhile to try. Despite that, I’d argue that it’s best to:

  1. Push small patches frequently;
  2. Set a timetable for when major / minor version releases will be available; and
  3. Do a thorough job selling the benefits for upgrading and write a migration guide to help de-risk the upgrade for all users.

Conclusion

Ultimately, we go through the trouble of reasoning about compatibility and versioning in order to help set and meet expectations for end-users. We always want to make sure the experience a user has during an upgrade doesn’t come with unpleasant surprises, so putting the effort into eliminating and preventing those surprises in our OSS is one of the things that will help professionalize it and earn the trust of end-users.

I'm the CTO and founder of Petabridge, where I'm making distributed programming for .NET developers easy by working on Akka.NET, Phobos, and more.