The full code and documentation is available on Github. Cover photograph by Samyuktha Sridhar.

Introduction


Concurrency is hard. Some languages like Rust are capable of statically detecting concurrency errors, or race conditions, that occur when multiple threads on a single machine simultaneously operate on shared data. But most languages, including Rust, do little to guarantee correctness when confronted with simultaneous operations on data shared across multiple machines, so architects of distributed systems are forced to rely explicitly on unintuitive, error-prone synchronization mechanisms like distributed locks to safely coordinate concurrent actions across a cluster.

Caustic is a robust, transactional programming language for building safe distributed systems. Programs written in Caustic may be distributed arbitrarily, but they will always operate safely on data stored within any transactional key-value store without any explicit synchronization.

Background


A race condition is a situation in which the order in which operations are performed impacts the result. As a motivating example, suppose there exist two machines A and B that each would like to increment a shared counter x. Formally, each machine reads x, sets x' = x + 1, and writes x'. If B reads after A finishes writing, then B reads x' and writes x' + 1. However, if B reads before A finishes writing, then B reads x and also writes x'. Clearly, this is a race condition because the value of the counter (x' + 1 or x') depends on the order in which A and B perform reads and writes. This particular race condition may seem relatively benign. Who cares if two increments were successfully performed, but the effect of only one was recorded? Imagine if the value of x corresponded to your bank balance, and the increments corresponded to deposits. What if your bank only recorded every second deposit? Still don’t care? While race conditions manifest themselves in subtle ways in distributed systems, they can often have catastrophic consequences.

A transaction is a sequence of operations that are atomic, consistent, isolated, and durable. These ACID properties (from which Caustic derives its name!) make transactions a formidible tool for eliminating race conditions.

  • Atomic: Transactions are all-or-nothing. Either all of their operations complete successfuly, or none of them do.
  • Consistent: Transactions must see the effect of all successfully completed transactions.
  • Isolated: Transactions cannot see the effect of in-progress transactions.
  • Durable: Transaction effects are permanent.

If the machines in the previous example had instead transactionally incremented x if and only if the value of x remained unchanged, then whenever B read before A finished writing, B would detect the modification to x by A when writing x' and would fail to complete successfully. Because the value of x now depends only on the number of successful increments and not on the order in which they are applied, the race condition no longer exists.

A key-value store is a data structure that asssociates a unique value to any key. For example, a dictionary is a key-value store that associates a unique definition to any word. Key-value stores are the essence of every storage system; memory is a key-value store that associates a unique sequence of bytes to any address, and databases are key-value stores that associate blobs of data to any primary key. A transactional key-value store is simply a key-value store that supports transactions. While transactions are challenging to correctly implement, there are an enourmous number of storage systems that are capable of handling them. Examples range from software transaction memory solutions for single machines to powerful databases like Cassandra and MySQL for larger clusters.

Clearly, transactions are a useful primitive for building correct distributed systems and there are a plethora of storage systems capable of handling them. However, these transactional storage systems each have their own unique language for specifying transactions that are often lacking in functionality and performance. Recent years have marked an explosion in NoSQL databases, that scale well by shedding functionality. These databases were not popularized because of their query languages, they were in spite of them. Some like CQL and AQL attempt to mimic SQL, but, while similar in name and intent, most fall short of implementing the entire SQL specification. Others like MongoDB and DynamoDB have their own bespoke interfaces that are often so complicated that they require classes. But even SQL is not beyond reproach. In his article “Some Principles of Good Language Design”, CJ Date, one of the fathers of relational databases, outlined a number of inherent flaws in the SQL language including its lack of a canonical implementation and its ambiguous syntax. While these storage systems provide the necessary transactional guarantees that are required to build safe distributed systems, their lack of a robust interface makes it impossible to design nontrivial applications.

Caustic is a powerful and performant programming language for expressing and executing transactions against any transactional key-value store. Caustic couples the robust functionality of a modern programming language with the ACID guarantees of a transactional key-value store, both of which are necessary to architect correct distributed systems.

Runtime


The Caustic Runtime is a virtual machine that executes transactions on any transactional key-value store that supports: get and cput. The get operation retrieves the values of a set of keys, and the cput, or conditional put, operator updates the values of a set of keys if and only if a set of dependent keys remain unchanged. Internally, the Caustic Runtime uses Multiversion Concurrency Control to detect modifications to keys. Each key is associated with a version number that is incremented on each update. Transactions are executed through partial evaluation. The runtime reads all the keys that accessed or modified by the transaction, and then evaluates all other operations. Because reads may be nested arbitrarily (in the case of a pointer dereference for example), the runtime may require multiple iterations of partial evaluation to completely evaluate a transaction. Because database reads are batched together, the runtime is guaranteed to perform a minimal number of roundtrips to and from the database which should significantly improve performance of most nontrivial programs.

The runtime also integrates intermediate write-through caching for any non-transactional key-value store that supports: fetch, update and invalidate. The fetch operation retrieves the cached values of a set of keys, the update operation changes the values of a set of keys, and the invalidate operation removes a set of keys from cache. Multiversion Concurrency Control allows the runtime to cache incoherently without sacrificing data integrity. The runtime speculates about the value of a key by reading a potentially stale version from cache, and then validates the cached version number on commit. The runtime automatically maintains cache coherency by evicting cached key-value pairs that cause version conflicts on commit!

The runtime may operate as a distributed cluster of transaction execution units. Runtimes register themselves in ZooKeeper and are automatically discoverable by clients who can remotely execute transactions over Thrift.

Compiler


Caustic transactions are composed of exactly 29 different operations and 4 different types. These operations include read and write, which retrieve and update the value of a database key, branch, which performs a conditional branch, repeat which performs a conditional loop, and arithmetic, logical, and comparison operations like add, equal, and less. Crucially, every feature of the Caustic programming language can be represented by these 29 operations and 4 types, and so everything in Caustic is transactional and race-free.

While very much a work in progress, the Caustic programming language is still packed with features. It has all the common constructs like pointers, loops, conditionals, variables, functions, and objects. It is statically and structurally typed, but features aggressive type inference. It also compiles into a Scala library that is compatible with all existing frameworks, tooling, and infrastructure for the JVM. Consider the following example of a distributed counter written in Caustic. This program can be executed without modification on any transactional key-value store, and run simulatenously without error on any number of machines.

module caustic.example

/**
 * A total quantity.
 * 
 * @param value Current total.
 */
record Total {
  value: Int
}

/**
 * A distributed counting service.
 */
service Counter {
  
  /**
   * Increments the total and returns the current value.
   * 
   * @param x Total reference.
   * @return Current value.
   */
  def increment(x: Total&): Int = {
    if x.value {
      x.value += 1
    } else {
      x.value = 1
    } 
  }

}