Ashwin MadavanThe work and thoughts of a programmer
http://ashwin153.github.io/
Tue, 10 Apr 2018 16:59:02 +0000Tue, 10 Apr 2018 16:59:02 +0000Jekyll v3.7.3Paxos Made Simpler<p>The only certainty in distributed systems is that machines will fail.
<a href="http://www.sns.ias.edu/pitp2/2012files/Probabilistic_Logics.pdf">The key challenge in building fault-tolerant distributed systems is
constructing reliable systems from unreliable components.</a>
Distributed databases replicate data across a cluster of
machines. By storing copies in different places, distributed databases
are tolerant of individual machine failures.</p>
<p>Replication solves the problem of fault-tolerance, but creates another;
the various replicas must be kept consistent with each other. A naïve
approach is to require all replicas to first agree to apply any
modifications made to the database. This would ensure that all replicas
remained identical. However, this approach is not fault-tolerant. If any
replica were to fail, then no modifications could ever be made to the
database until it recovered. Instead, fault-tolerant distributed
databases require only a majority of replicas to reach agreement before
modifications can be safely applied. This ensures that any majority of
replicas will contain at least one that has the latest copy of the
database while remaining tolerant of a minority of individual machine
failures. Reaching majority agreement in a distributed system is known
as consensus, and it is a relatively well-studied problem. In this
section, we explore different consensus algorithms culminating with the
approach taken in Beaker.</p>
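<p>The claim that any two majorities share at least one replica can be checked by brute force on a small cluster. The sketch below is illustrative only; the function names <code class="highlighter-rouge">majorities</code> and <code class="highlighter-rouge">always_intersect</code> are hypothetical and do not come from Beaker.</p>

```python
from itertools import combinations


def majorities(replicas):
    """Yield every subset of replicas that forms a strict majority."""
    quorum = len(replicas) // 2 + 1
    for size in range(quorum, len(replicas) + 1):
        for subset in combinations(replicas, size):
            yield set(subset)


def always_intersect(replicas):
    """Return True if every pair of majorities shares at least one replica."""
    quorums = list(majorities(replicas))
    return all(a & b for a in quorums for b in quorums)


if __name__ == "__main__":
    print(always_intersect(["r1", "r2", "r3", "r4", "r5"]))
```

<p>Because every pair of majorities overlaps, a modification that reached one majority is visible to at least one member of any other majority.</p>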
<h1 id="terminology">Terminology</h1>
<hr />
<p>We define a <strong>proposal</strong> as a candidate operation on a group of replicas
and a <strong>proposer</strong> as the replica that initiates consensus on a
proposal. Over the course of this discussion, we will gradually refine
this definition of a proposal until we arrive at the one used by Beaker.</p>
<p>Replicas communicate by sending messages. We assume that delivered
messages cannot be reordered; if message <script type="math/tex">A</script> was delivered before
message <script type="math/tex">B</script>, then <script type="math/tex">A</script> was sent before <script type="math/tex">B</script>. In practice, this assumption
is satisfied by most networking protocols including TCP.</p>
<h1 id="two-phase-commit">Two-Phase Commit</h1>
<hr />
<p>In Two-Phase Commit, the proposer <em>prepares</em> a proposal by first
acquiring locks on a majority of replicas. If it successfully acquires
all locks, the proposer informs all replicas to <em>learn</em> the proposal.
When a proposal is learned, its operation is applied and all locks are
subsequently released.</p>
<p>Two-Phase Commit is not fault-tolerant. If the proposer were to fail
after it successfully prepared a proposal but before it requested that
it be learned, the locks that it acquired would never be released and no
new proposals could ever be learned. We will see in Paxos how we can
modify the protocol to guarantee fault-tolerance.</p>
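<p>The blocking behavior described above can be reproduced with a small in-memory simulation. This is a sketch under simplifying assumptions (a single process, no real network); the <code class="highlighter-rouge">Replica</code> class, the <code class="highlighter-rouge">propose</code> function and the <code class="highlighter-rouge">crash_before_learn</code> flag are hypothetical names introduced for illustration.</p>

```python
class Replica:
    def __init__(self):
        self.locked_by = None   # proposer currently holding this replica's lock
        self.log = []           # operations learned so far

    def prepare(self, proposer):
        """Grant the lock if it is free."""
        if self.locked_by is None:
            self.locked_by = proposer
            return True
        return False

    def learn(self, proposer, operation):
        """Apply the operation and release the lock held by the proposer."""
        self.log.append(operation)
        if self.locked_by == proposer:
            self.locked_by = None


def propose(proposer, operation, replicas, crash_before_learn=False):
    """Return True if the operation was learned."""
    majority = len(replicas) // 2 + 1
    granted = [r for r in replicas if r.prepare(proposer)]
    if len(granted) < majority:
        for r in granted:       # back off: release whatever was acquired
            r.locked_by = None
        return False
    if crash_before_learn:
        return False            # simulated failure: the locks are never released
    for r in replicas:
        r.learn(proposer, operation)
    return True
```

<p>After a call with <code class="highlighter-rouge">crash_before_learn=True</code>, every later call to <code class="highlighter-rouge">propose</code> returns <code class="highlighter-rouge">False</code> because the locks remain held by the failed proposer.</p>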
<h1 id="paxos">Paxos</h1>
<hr />
<p><a href="http://doi.acm.org/10.1145/279227.279229">Paxos</a> makes two modifications to Two-Phase Commit to address its
fault-intolerance. First, it associates each proposal with a
monotonically-increasing, globally-unique <strong>ballot</strong> number. Proposals
are totally-ordered and uniquely-identified by their <strong>ballot</strong>. Each
replica keeps track of the latest ballot that it has seen. Second, it
introduces an intermediate <em>accept</em> phase to the protocol. We will see
that this additional phase allows the system to recover from proposer
failure.</p>
<p>In Paxos, the proposer prepares a proposal by assigning it a ballot
greater than any it has seen and sending it to all replicas. If the
ballot is greater than the latest it has seen, a replica <em>promises</em> not
to accept any proposal less than it and returns any proposal that it has
already accepted. Otherwise, the replica ignores the request.
Intuitively, ballots function as a kind of a lock that the proposer
holds until another proposer prepares a greater ballot. If a majority of
replicas do not respond to its prepare request, the proposer retries
with a greater ballot. Otherwise, the proposer <em>selects</em> a proposal to
be accepted. If any replica returned an accepted proposal, then the
proposer must select the latest accepted proposal and set its ballot to
the one that it prepared. Otherwise, the proposer selects its own
proposal. Intuitively, this allows the system to pick up where it left
off when a proposer fails after convincing a majority to accept its
proposal but before it could be learned. A replica accepts a proposal if
and only if it has not promised not to. When a replica accepts a
proposal, it requests that all replicas learn it. A replica learns a
proposal when a majority of replicas have requested that it be learned.
When a proposal is learned, its operation is applied and any accepted
proposals are removed. Intuitively, this allows the system to reset and
begin consensus on another proposal.</p>
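<p>The prepare and accept phases can be sketched as follows. This is a minimal single-process illustration, not a complete implementation: the learn phase and the removal of accepted proposals are omitted, and the names <code class="highlighter-rouge">Acceptor</code> and <code class="highlighter-rouge">run_round</code> are hypothetical.</p>

```python
class Acceptor:
    def __init__(self):
        self.promised = 0       # greatest ballot promised so far
        self.accepted = None    # (ballot, value) most recently accepted, if any

    def prepare(self, ballot):
        """Promise to ignore lower ballots; return (ok, accepted proposal)."""
        if ballot > self.promised:
            self.promised = ballot
            return True, self.accepted
        return False, None

    def accept(self, ballot, value):
        """Accept unless a promise was made to a greater ballot."""
        if ballot >= self.promised:
            self.promised = ballot
            self.accepted = (ballot, value)
            return True
        return False


def run_round(acceptors, ballot, value):
    """Run one prepare/accept round; return the accepted value or None."""
    majority = len(acceptors) // 2 + 1
    replies = [a.prepare(ballot) for a in acceptors]
    promises = [accepted for ok, accepted in replies if ok]
    if len(promises) < majority:
        return None             # the caller retries with a greater ballot
    prior = [p for p in promises if p is not None]
    if prior:
        # Adopt the latest accepted value, re-issued under the new ballot.
        value = max(prior, key=lambda p: p[0])[1]
    votes = sum(a.accept(ballot, value) for a in acceptors)
    return value if votes >= majority else None
```

<p>If two of three acceptors have already accepted a value under ballot 1, a later round with ballot 2 and a different value still returns the earlier value, which is how the system picks up where a failed proposer left off.</p>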
<p>Paxos guarantees that all non-faulty replicas will learn proposals in
the same order. Often, this guarantee is unnecessary because a large
number of operations on a distributed system are commutative and so they
may be performed in any order. For example, reads and writes to
different keys in a database may be performed in any order without
compromising consistency. We will see in Generalized Paxos that we can
exploit commutativity to improve performance.</p>
<p><a href="http://doi.acm.org/10.1145/3149.214121">It is known that no deterministic fault-tolerant consensus protocol can
guarantee progress in an asynchronous network.</a> Paxos is no
exception. If a higher ballot is continuously prepared before any
proposal can be accepted, no proposal will ever be learned.
Implementations of Paxos typically elect a distinguished replica, called
a <strong>leader</strong>, to which all other replicas forward their proposals to
guarantee liveness. Whenever leaders fail, replicas run an instance of
Paxos to acquire leadership of the cluster. The reliance on the
existence of a single, stable leader is both an important simplifying
assumption and a performance limitation. If there exists a leader, then
prepare messages are superfluous. Intuitively, the leader implicitly
holds a lock on all replicas because no other replica can initiate
proposals. This allows proposals to be learned in just two message
delays. However, the reliance on the leader to initiate all proposals is
also a significant bottleneck at scale. The entire system moves at the
rate of the leader. In fact, this is the fundamental limitation in
implementations of Paxos-style protocols like <a href="http://dl.acm.org/citation.cfm?id=1855840.1855851">ZooKeeper</a> and
<a href="http://dl.acm.org/citation.cfm?id=1298455.1298487">Chubby</a>. We will see in Egalitarian Paxos that we can remove
the dependence on a leader to improve performance.</p>
<h1 id="generalized-paxos">Generalized Paxos</h1>
<hr />
<p><a href="https://www.microsoft.com/en-us/research/publication/generalized-consensus-and-paxos/">Generalized Paxos</a> addresses the scalability of
Paxos by exploiting commutativity. An operation <script type="math/tex">A</script> commutes with <script type="math/tex">B</script> if
performing <script type="math/tex">A</script> after <script type="math/tex">B</script> has the same effect as performing <script type="math/tex">B</script> after
<script type="math/tex">A</script>. For example, addition is commutative but division is not;
<script type="math/tex">4 + 3 = 3 + 4</script> but <script type="math/tex">\frac{4}{3} \ne \frac{3}{4}</script>. In fact, most
operations on a distributed database are commutative. Reads commute with
each other and reads and writes to different keys commute.</p>
<p>Generalized Paxos associates each proposal with a sequence of
operations. We say that proposal <script type="math/tex">A</script> is <em>equivalent</em> to proposal <script type="math/tex">B</script> if
all non-commutative operations in <script type="math/tex">A</script> and <script type="math/tex">B</script> are in the same order. All
equivalent proposals have the same effect. Generalized Paxos permits
replicas to learn different proposals as long as they are equivalent.</p>
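<p>A minimal sketch of commutativity and proposal equivalence, assuming operations are modeled as distinct <code class="highlighter-rouge">(kind, key)</code> pairs where <code class="highlighter-rouge">kind</code> is <code class="highlighter-rouge">"read"</code> or <code class="highlighter-rouge">"write"</code>. This representation and the function names are assumptions made for illustration.</p>

```python
from itertools import combinations


def commutes(a, b):
    """Two operations commute if they touch different keys or both are reads."""
    return a[1] != b[1] or (a[0] == "read" and b[0] == "read")


def equivalent(p, q):
    """True if p and q contain the same operations and order every
    non-commuting pair identically. Assumes operations are distinct."""
    if sorted(p) != sorted(q):
        return False
    position = {op: i for i, op in enumerate(q)}
    return all(
        commutes(a, b) or position[a] < position[b]
        for a, b in combinations(p, 2)   # a precedes b in p
    )
```

<p>For example, moving a read of <code class="highlighter-rouge">y</code> past a write of <code class="highlighter-rouge">x</code> yields an equivalent proposal, while moving a read of <code class="highlighter-rouge">x</code> past a write of <code class="highlighter-rouge">x</code> does not.</p>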
<p>In Generalized Paxos, proposers do not forward their requests to a leader.
Instead, they immediately request that all replicas accept their
proposed operation. A replica appends the operation to its currently
accepted proposal and requests that all replicas learn it. A proposal is
learned when a majority of replicas have requested that it or
an equivalent proposal be learned. If no majority of replicas can agree
on the ordering of non-commutative operations, it is the responsibility
of the leader to select one and to run an instance of Paxos to convince
the other replicas to accept its choice before resuming normal
operation.</p>
<p>Like Paxos, Generalized Paxos relies on the existence of a single,
stable leader to mediate ordering disagreements between replicas and
guarantees that all commutative operations will be learned in two
message delays. Unlike Paxos, it does not require all proposals to
originate from the leader. If most operations are commutative, the
leader will rarely be required to arbitrate. However, the existence of a
leader can still be a scalability bottleneck. We will see in Egalitarian
Paxos that we can remove the dependence on a leader to improve the
performance of the system.</p>
<h1 id="egalitarian-paxos">Egalitarian Paxos</h1>
<hr />
<p><a href="http://doi.acm.org/10.1145/2517349.2517350">Egalitarian Paxos</a> makes a subtle modification to Generalized
Paxos to remove its dependence on a leader. Egalitarian Paxos associates
each proposal with a directed acyclic graph of operations. The benefit
of using a directed acyclic graph is that its independent (weakly connected)
components can be performed in parallel. This has huge ramifications for
performance, particularly in databases because reads and writes are
relatively expensive operations.</p>
<p>In Egalitarian Paxos, an operation depends on all accepted proposals with
which it does not commute. The proposer builds a dependency graph for a
proposal from any proposals that it has already accepted and requests
that all replicas accept it. A replica supplements the dependency graph
of the proposal with any proposals that it has accepted and requests
that the result be learned. If no majority of replicas can agree on the
dependency graph of a proposal, it is the responsibility of the proposer
to select one and to run an instance of Paxos to convince the other
replicas to accept its choice before resuming normal operation.</p>
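<p>The dependency rule can be sketched on a single replica as follows. Operations are again modeled as distinct <code class="highlighter-rouge">(kind, key)</code> pairs, which is an assumption made for illustration, and the functions <code class="highlighter-rouge">build_graph</code> and <code class="highlighter-rouge">components</code> are hypothetical. The second function groups operations that share no dependencies, which is what allows them to be performed in parallel.</p>

```python
def commutes(a, b):
    """Two operations commute if they touch different keys or both are reads."""
    return a[1] != b[1] or (a[0] == "read" and b[0] == "read")


def dependencies(accepted, operation):
    """Accepted operations that the new operation does not commute with."""
    return {other for other in accepted if not commutes(operation, other)}


def build_graph(operations):
    """Map each operation to its dependencies on earlier operations."""
    graph = {}
    for op in operations:
        graph[op] = dependencies(list(graph), op)
    return graph


def components(graph):
    """Group operations into independent sets that share no dependencies."""
    groups = []
    for op, deps in graph.items():
        merged = {op} | deps
        for group in [g for g in groups if g & merged]:
            merged |= group
            groups.remove(group)
        groups.append(merged)
    return groups
```

<p>With a write and a read on <code class="highlighter-rouge">x</code> and a write and a read on <code class="highlighter-rouge">y</code>, the graph splits into two groups, one per key.</p>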
<p>Egalitarian Paxos implicitly assumes that operations are idempotent. An
operation <script type="math/tex">A</script> is idempotent if repeated non-sequential applications of
<script type="math/tex">A</script> have the same effect as a single application of <script type="math/tex">A</script>. For example,
multiplication by one is idempotent but by two is not;
<script type="math/tex">4 * 1 = 4 * 1 * 1</script> but <script type="math/tex">4 * 2 \ne 4 * 2 * 2</script>. In <a href="https://madavan.me/projects/beaker.html">Beaker</a>, we
show how Egalitarian Paxos can be modified to implement a distributed,
fault-tolerant database.</p>
Tue, 10 Apr 2018 04:20:00 +0000
http://ashwin153.github.io/projects/paxos.html
http://ashwin153.github.io/projects/paxos.htmlProjectsBeaker<p>The full code and documentation are available on <a href="https://github.com/ashwin153/beaker">GitHub</a>. Cover photograph by
<a href="https://samyusridhar.github.io/">Samyuktha Sridhar</a>.</p>
<h1 id="beaker">Beaker</h1>
<p>Beaker is a distributed, transactional key-value store that is consistent and available. Beaker uses
a leader-less variation of <a href="https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/tr-2005-33.pdf">Generalized Paxos</a> to consistently execute transactions. Beaker
permits a minority of failures, so a cluster of <code class="highlighter-rouge">N</code> beakers tolerates fewer than <code class="highlighter-rouge">N / 2</code> failures. Beaker assumes that
failures are fail-stop. It makes no assumptions about the underlying network except that messages
are received in the order they were sent. Most networking protocols, including <a href="https://en.wikipedia.org/wiki/Transmission_Control_Protocol">TCP</a>, satisfy
this requirement.</p>
<h2 id="introduction">Introduction</h2>
<p>A <strong>database</strong> is a transactional key-value store. Databases map keys to versioned values, called
<strong>revisions</strong>. Revisions are uniquely identified and totally ordered by their version. A
<strong>transaction</strong> depends on the versions of a set of keys, called its <em>readset</em>, and changes the
values of a set of keys, called its <em>writeset</em>. Transactions may be <em>committed</em> if and only if the
versions they depend on are greater than or equal to their versions in the database. Revisions
are monotonic; if a transaction changes a key for which there exists a newer revision, the
modification is discarded. This ensures that transactions cannot undo the effect of other
transactions. We say that a transaction <code class="highlighter-rouge">A</code> <em>conflicts with</em> <code class="highlighter-rouge">B</code> if either reads or writes a
key that the other writes.</p>
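<p>The commit rule, monotonic revisions and the conflict relation can be sketched as follows. This is an illustration under assumptions, not Beaker's actual API: the <code class="highlighter-rouge">Database</code> class, its method names, the representation of readsets and writesets as dictionaries, and the treatment of missing keys as version 0 are all hypothetical.</p>

```python
class Database:
    def __init__(self):
        self.revisions = {}   # key -> (version, value)

    def version(self, key):
        """Current version of a key; absent keys are treated as version 0."""
        return self.revisions.get(key, (0, None))[0]

    def can_commit(self, readset):
        """readset maps key -> version that the transaction depends on."""
        return all(v >= self.version(k) for k, v in readset.items())

    def commit(self, readset, writeset):
        """writeset maps key -> (version, value). Return True on commit."""
        if not self.can_commit(readset):
            return False
        for key, (version, value) in writeset.items():
            if self.version(key) > version:
                continue      # a newer revision exists: discard this change
            self.revisions[key] = (version, value)
        return True


def conflicts(a, b):
    """a and b are (readset, writeset) pairs; True if either reads or
    writes a key that the other writes."""
    (reads_a, writes_a), (reads_b, writes_b) = a, b
    return bool(
        (set(reads_a) | set(writes_a)) & set(writes_b)
        or (set(reads_b) | set(writes_b)) & set(writes_a)
    )
```

<p>A transaction that read version 0 of a key cannot commit once version 1 exists, and a write that carries an older version than the stored revision is silently dropped.</p>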
<p>A distributed database is a collection of <strong>beakers</strong>. Each beaker maintains its own replica of the
database. In order to maintain consistency across beakers, a majority of beakers must agree to
commit every transaction. Reaching agreement in a distributed system is often referred to as
<a href="https://en.wikipedia.org/wiki/Consensus_(computer_science)">consensus</a>, and it is a relatively well studied problem in computer science. There are a variety
of algorithms that solve this problem, most notably <a href="https://en.wikipedia.org/wiki/Paxos_(computer_science)">Paxos</a>, that have proven to be correct
and performant. Beaker employs a variation of Paxos that has several desirable properties.
First, beakers may simultaneously commit non-conflicting transactions. Second, beakers automatically
repair replicas that have stale revisions. Third, beakers may safely commit transactions as long as
they are connected to at least a majority of their non-faulty peers.</p>
<h2 id="consensus">Consensus</h2>
<p>Beakers reach consensus on <strong>proposals</strong>. A proposal is a collection of non-conflicting
transactions. These transactions may conditionally apply changes or unconditionally repair stale
revisions. Proposals are uniquely identified and totally ordered by a <strong>ballot</strong> number. We say that
a proposal <code class="highlighter-rouge">A</code> <em>conflicts with</em> <code class="highlighter-rouge">B</code> if a transaction that is applied by <code class="highlighter-rouge">A</code> conflicts
with a transaction that is applied by <code class="highlighter-rouge">B</code>. We say that a proposal <code class="highlighter-rouge">A</code> is <em>older than</em>
<code class="highlighter-rouge">B</code> if <code class="highlighter-rouge">A</code> conflicts with <code class="highlighter-rouge">B</code> and <code class="highlighter-rouge">A</code> has a lower ballot than <code class="highlighter-rouge">B</code>. We say that
a proposal <code class="highlighter-rouge">A</code> <em>matches</em> <code class="highlighter-rouge">B</code> if <code class="highlighter-rouge">A</code> applies the same transactions as <code class="highlighter-rouge">B</code>. Proposals
<code class="highlighter-rouge">A</code> and <code class="highlighter-rouge">B</code> may be <em>merged</em> by taking the maximum of their ballots, combining the
transactions they apply choosing the transactions in the newer proposal in the case of conflicts,
and combining their repairs choosing highest revision changes in the case of duplicates.</p>
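<p>The merge rule can be sketched as follows. The data model is an assumption made for illustration: transactions carry only the sets of keys they read and write, repairs map each key to a <code class="highlighter-rouge">(version, value)</code> pair, and the names <code class="highlighter-rouge">Transaction</code>, <code class="highlighter-rouge">Proposal</code> and <code class="highlighter-rouge">merge</code> are hypothetical.</p>

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Transaction:
    reads: frozenset
    writes: frozenset

    def conflicts(self, other):
        """True if either transaction reads or writes a key the other writes."""
        return bool(
            (self.reads | self.writes) & other.writes
            or (other.reads | other.writes) & self.writes
        )


@dataclass
class Proposal:
    ballot: int
    transactions: list = field(default_factory=list)
    repairs: dict = field(default_factory=dict)   # key -> (version, value)


def merge(a, b):
    """Merge two proposals, preferring the newer one on conflicts."""
    newer, older = (a, b) if a.ballot >= b.ballot else (b, a)
    kept = [
        t for t in older.transactions
        if t not in newer.transactions
        and not any(t.conflicts(n) for n in newer.transactions)
    ]
    repairs = dict(older.repairs)
    for key, revision in newer.repairs.items():
        # Keep the highest revision when both proposals repair the same key.
        if key not in repairs or revision[0] > repairs[key][0]:
            repairs[key] = revision
    return Proposal(newer.ballot, newer.transactions + kept, repairs)
```

<p>The result carries the maximum ballot, every transaction of the newer proposal, and only those transactions of the older proposal that do not conflict with them.</p>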
<p>The leader for a proposal <code class="highlighter-rouge">P</code> first <em>prepares</em> <code class="highlighter-rouge">P</code> on a majority of beakers. If a beaker has
not made a promise to a newer proposal, it responds with a <strong>promise</strong> not to accept any proposal
that conflicts with the proposal it returns and has a lower ballot than <code class="highlighter-rouge">P</code>. If a beaker has
already accepted proposals older than <code class="highlighter-rouge">P</code>, it merges them together and returns the result.
Otherwise, it returns the proposal with a zero ballot. If the leader does not receive a majority of
promises, it retries with a higher ballot. Otherwise, it merges the returned promises into a single
proposal <code class="highlighter-rouge">P'</code>. If <code class="highlighter-rouge">P</code> does not match <code class="highlighter-rouge">P'</code>, it retries with <code class="highlighter-rouge">P'</code>. Otherwise, the
leader <em>gets</em> the latest versions of the keys that are read by <code class="highlighter-rouge">P</code> from a majority of beakers.
The leader discards all transactions in <code class="highlighter-rouge">P</code> that cannot be committed given the latest versions,
and sets its repairs to the latest revisions of keys that are read - but not written - by <code class="highlighter-rouge">P</code>
for which the beakers disagree on their version. The leader then requests a majority of beakers to
<em>accept</em> <code class="highlighter-rouge">P</code>. A beaker accepts a proposal if it has not promised not to. When a beaker accepts a
proposal, it discards all older accepted proposals and broadcasts a <strong>vote</strong> for it. We say that a
proposal is <em>accepted</em> if a majority of beakers vote for it. A beaker <em>learns</em> a proposal once a
majority of beakers vote for it. If a beaker learns a proposal, it commits its transactions and
repairs on its replica of the database.</p>
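<p>The final step, in which a beaker learns a proposal once a majority of beakers vote for it, can be sketched as a vote tally. The <code class="highlighter-rouge">Learner</code> class is a hypothetical illustration that identifies proposals by ballot only and leaves out committing transactions and repairs.</p>

```python
from collections import defaultdict


class Learner:
    def __init__(self, cluster_size):
        self.majority = cluster_size // 2 + 1
        self.votes = defaultdict(set)   # ballot -> beakers that voted for it
        self.learned = []               # ballots in the order they were learned

    def on_vote(self, beaker, ballot):
        """Record a vote; return True if this vote causes the proposal to be learned."""
        self.votes[ballot].add(beaker)
        if len(self.votes[ballot]) >= self.majority and ballot not in self.learned:
            self.learned.append(ballot)
            return True
        return False
```

<p>Votes are stored in a set per ballot, so a duplicate vote from the same beaker does not count twice toward the majority.</p>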
<h3 id="correctness">Correctness</h3>
<p>The proof of correctness relies on the assumption of <em>connectivity</em> (beakers are always connected to
all of their non-faulty peers) and the fact of <em>quorum intersection</em> (any two majorities of beakers
contain at least one beaker in common).</p>
<p><strong>Liveness.</strong> An accepted proposal <code class="highlighter-rouge">A</code> will eventually be learned. <strong>Proof.</strong> By quorum
intersection, at least one promise will contain <code class="highlighter-rouge">A</code>. Therefore, <code class="highlighter-rouge">A</code> must be proposed until enough
beakers learn <code class="highlighter-rouge">A</code> such that <code class="highlighter-rouge">A</code> is no longer accepted by a majority. By assumption of
connectivity, if any beaker learns a proposal then all beakers will eventually learn it.</p>
<p><strong>Linearizability.</strong> If a proposal <code class="highlighter-rouge">A</code> is accepted, then any conflicting proposal <code class="highlighter-rouge">B</code> that
is accepted after <code class="highlighter-rouge">A</code> will be learned after <code class="highlighter-rouge">A</code>. <strong>Proof.</strong> Because <code class="highlighter-rouge">A</code> was accepted
before <code class="highlighter-rouge">B</code>, the majority that accepted <code class="highlighter-rouge">A</code> before <code class="highlighter-rouge">B</code> will vote for <code class="highlighter-rouge">A</code> before
<code class="highlighter-rouge">B</code>. Because messages are delivered in order, <code class="highlighter-rouge">A</code> will be learned before <code class="highlighter-rouge">B</code>.</p>
<p><strong>Commutativity.</strong> Let <code class="highlighter-rouge">R</code> denote the repairs for an accepted proposal <code class="highlighter-rouge">A</code>. Any accepted
proposal <code class="highlighter-rouge">B</code> that conflicts with <code class="highlighter-rouge">A + R</code> but not <code class="highlighter-rouge">A</code> commutes with <code class="highlighter-rouge">A + R</code>.
<strong>Proof.</strong> Because <code class="highlighter-rouge">B</code> conflicts with <code class="highlighter-rouge">A + R</code> but not <code class="highlighter-rouge">A</code>, <code class="highlighter-rouge">B</code> must read a key
<code class="highlighter-rouge">k</code> that is read by <code class="highlighter-rouge">A</code>. By linearizability, <code class="highlighter-rouge">B</code> must read the latest version of
<code class="highlighter-rouge">k</code> because <code class="highlighter-rouge">B</code> is accepted. Suppose that <code class="highlighter-rouge">B</code> is committed first. Because <code class="highlighter-rouge">B</code> reads
and does not write <code class="highlighter-rouge">k</code>, <code class="highlighter-rouge">A + R</code> can still be committed. Suppose that <code class="highlighter-rouge">A + R</code> is
committed first. Because <code class="highlighter-rouge">A + R</code> writes the latest version of <code class="highlighter-rouge">k</code> and <code class="highlighter-rouge">B</code> reads the
latest version, <code class="highlighter-rouge">B</code> can still be committed.</p>
<p><strong>Consistency.</strong> If a proposal <code class="highlighter-rouge">A</code> is accepted, it can be committed. <strong>Proof.</strong> Suppose there
exists a transaction that cannot be committed. Then, the transaction must read a key for which there
exists a newer version. This implies that there exists a proposal <code class="highlighter-rouge">B</code> that was accepted after
but learned before <code class="highlighter-rouge">A</code> that changes a key <code class="highlighter-rouge">k</code> that is read by <code class="highlighter-rouge">A</code>. By linearizability,
<code class="highlighter-rouge">B</code> cannot conflict with <code class="highlighter-rouge">A</code>. Therefore, <code class="highlighter-rouge">B</code> must repair <code class="highlighter-rouge">k</code>. By commutativity,
<code class="highlighter-rouge">A</code> may still be committed.</p>
<h2 id="reconfiguration">Reconfiguration</h2>
<p>Each beaker is required to be connected to a majority of non-faulty peers in order to guarantee
correctness. However, this correctness condition is only valid when the cluster is static. In
practical systems, beakers may join or leave the cluster arbitrarily as the cluster grows or shrinks
in size. In this section, we describe how <em>fresh</em> beakers are <em>bootstrapped</em> when they join an
existing cluster. When a fresh beaker joins a cluster, its database is initially empty. In order to
guarantee correctness, its database must be immediately populated with the latest revision of every
key-value pair. Otherwise, if <code class="highlighter-rouge">N + 1</code> fresh beakers join a cluster of size <code class="highlighter-rouge">N</code>, it
is possible for a quorum to consist entirely of fresh beakers.</p>
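<p>The arithmetic behind this threshold can be checked directly. The helper names below are hypothetical; the sketch assumes a quorum is a strict majority of all beakers, fresh or not.</p>

```python
def majority(cluster_size):
    """Smallest number of beakers that forms a strict majority."""
    return cluster_size // 2 + 1


def fresh_quorum_possible(existing, fresh):
    """True if the fresh beakers alone are enough to form a quorum."""
    return fresh >= majority(existing + fresh)
```

<p>With 3 existing beakers, 4 fresh beakers give a cluster of 7 whose majority is 4, so the fresh beakers alone can form a quorum. With only 3 fresh beakers the cluster has 6 members and a majority of 4, so they cannot.</p>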
<p>A naive solution might be for the fresh beaker to propose a read-only transaction that depends on
the initial revision of every key-value pair in the database and conflicts with every other
proposal. Then, the fresh beaker would automatically repair itself in the process of committing this
transaction. However, this is infeasible in practical systems because databases may contain
arbitrarily many key-value pairs. This approach would inevitably saturate the network because for a
database of size <code class="highlighter-rouge">D</code> such a proposal consumes <code class="highlighter-rouge">D * (3 * N / 2 + N * N)</code> in bandwidth.
Furthermore, it prevents any proposals from being accepted in the interim.</p>
<p>We can improve this solution by decoupling bootstrapping and consensus. A fresh beaker joins the
cluster as a non-voting member; it learns proposals, but does not participate in consensus. The
fresh beaker reads the contents of the database from a quorum. It then assembles a repair
transaction and commits it on its replica. It then joins the cluster as a voting member. This
approach consumes just <code class="highlighter-rouge">D * N / 2</code> in bandwidth and permits concurrent proposals.</p>
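<p>To compare the two bandwidth expressions, they can be evaluated side by side. The function names are hypothetical and the results are in whatever units <code class="highlighter-rouge">D</code> is measured in. Dividing the first expression by the second gives a ratio of <code class="highlighter-rouge">2 * N + 3</code>, so the saving grows with the size of the cluster.</p>

```python
def naive_bandwidth(d, n):
    """Bandwidth of the read-only bootstrap proposal: D * (3 * N / 2 + N * N)."""
    return d * (3 * n / 2 + n * n)


def decoupled_bandwidth(d, n):
    """Bandwidth of bootstrapping from a quorum: D * N / 2."""
    return d * n / 2


if __name__ == "__main__":
    d, n = 1000, 5
    print(naive_bandwidth(d, n) / decoupled_bandwidth(d, n))
```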
Tue, 13 Mar 2018 04:20:00 +0000
http://ashwin153.github.io/projects/beaker.html
http://ashwin153.github.io/projects/beaker.htmlProjectsCaustic<p>The full code and documentation are available on <a href="https://github.com/ashwin153/caustic">GitHub</a>. Cover photograph by
<a href="https://samyusridhar.github.io/">Samyuktha Sridhar</a>.</p>
<h1 id="introduction">Introduction</h1>
<hr />
<p>Concurrency is hard. Some languages like <a href="https://blog.rust-lang.org/2015/04/10/Fearless-Concurrency.html">Rust</a> are capable of statically detecting concurrency
errors, or race conditions, that occur when multiple threads on a single machine simultaneously
operate on shared data. But most languages, including Rust, do little to guarantee correctness when
confronted with simultaneous operations on data shared across multiple machines, so architects of
distributed systems are forced to rely <em>explicitly</em> on unintuitive, error-prone synchronization
mechanisms like <a href="https://en.wikipedia.org/wiki/Distributed_lock_manager">distributed locks</a> to safely coordinate concurrent actions across a cluster.</p>
<p>Caustic is a robust, transactional programming language for building safe distributed systems.
Programs written in Caustic may be distributed arbitrarily, but they will <em>always</em> operate safely on
data stored within <em>any</em> transactional key-value store without <em>any</em> explicit synchronization.</p>
<h1 id="background">Background</h1>
<hr />
<p>A <strong>race condition</strong> is a situation in which the order in which operations are performed impacts the
result. As a motivating example, suppose there exist two machines <code class="highlighter-rouge">A</code> and <code class="highlighter-rouge">B</code> that each
would like to increment a shared counter <code class="highlighter-rouge">x</code>. Formally, each machine reads <code class="highlighter-rouge">x</code>, sets
<code class="highlighter-rouge">x' = x + 1</code>, and writes <code class="highlighter-rouge">x'</code>. If <code class="highlighter-rouge">B</code> reads <em>after</em> <code class="highlighter-rouge">A</code> finishes writing, then
<code class="highlighter-rouge">B</code> reads <code class="highlighter-rouge">x'</code> and writes <code class="highlighter-rouge">x' + 1</code>. However, if <code class="highlighter-rouge">B</code> reads <em>before</em> <code class="highlighter-rouge">A</code> finishes
writing, then <code class="highlighter-rouge">B</code> reads <code class="highlighter-rouge">x</code> and also writes <code class="highlighter-rouge">x'</code>. Clearly, this is a race condition
because the value of the counter (<code class="highlighter-rouge">x' + 1</code> or <code class="highlighter-rouge">x'</code>) depends on the order in which <code class="highlighter-rouge">A</code>
and <code class="highlighter-rouge">B</code> perform reads and writes. This particular race condition may seem relatively benign. Who
cares if two increments were successfully performed, but the effect of only one was recorded?
Imagine if the value of <code class="highlighter-rouge">x</code> corresponded to your bank balance, and the increments corresponded
to deposits. What if your bank only recorded every second deposit? Still don’t care? While race
conditions manifest themselves in subtle ways in distributed systems, they can often have
catastrophic consequences.</p>
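<p>The two orderings can be replayed deterministically. The sketch below is written in Python rather than Caustic and uses generators to pause each machine between its read and its write; the names <code class="highlighter-rouge">increment_steps</code> and <code class="highlighter-rouge">run</code> are hypothetical.</p>

```python
def increment_steps(store):
    """An increment split into two steps: read x, then write x + 1."""
    value = store["x"]        # step 1: read x
    yield
    store["x"] = value + 1    # step 2: write x' = x + 1


def run(schedule):
    """Execute one step of machine A or B for each letter in the schedule."""
    store = {"x": 0}
    machines = {"A": increment_steps(store), "B": increment_steps(store)}
    for name in schedule:
        next(machines[name], None)
    return store["x"]
```

<p>The schedule <code class="highlighter-rouge">"AABB"</code>, in which <code class="highlighter-rouge">B</code> reads after <code class="highlighter-rouge">A</code> writes, leaves the counter at 2. The schedule <code class="highlighter-rouge">"ABAB"</code>, in which both read before either writes, leaves it at 1.</p>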
<p>A <strong>transaction</strong> is a sequence of operations that are atomic, consistent, isolated, and durable.
These <a href="https://en.wikipedia.org/wiki/ACID">ACID</a> properties (from which Caustic derives its name!) make transactions a formidable
tool for eliminating race conditions.</p>
<ul>
<li><strong>Atomic</strong>: Transactions are all-or-nothing. Either all of their operations complete successfully,
or none of them do.</li>
<li><strong>Consistent</strong>: Transactions must see the effect of all successfully completed transactions.</li>
<li><strong>Isolated</strong>: Transactions cannot see the effect of in-progress transactions.</li>
<li><strong>Durable</strong>: Transaction effects are permanent.</li>
</ul>
<p>If the machines in the previous example had instead <em>transactionally</em> incremented <code class="highlighter-rouge">x</code> <em>if and
only if the value of <code class="highlighter-rouge">x</code> remained unchanged</em>, then whenever <code class="highlighter-rouge">B</code> read before <code class="highlighter-rouge">A</code> finished
writing, <code class="highlighter-rouge">B</code> would detect the modification to <code class="highlighter-rouge">x</code> by <code class="highlighter-rouge">A</code> when writing <code class="highlighter-rouge">x'</code> and
would fail to complete successfully. Because the value of <code class="highlighter-rouge">x</code> now depends only on the
<em>number</em> of successful increments and not on the <em>order</em> in which they are applied, the race
condition no longer exists.</p>
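<p>A conditional increment of this kind can be sketched with a compare-and-set primitive. This is Python rather than Caustic, and the <code class="highlighter-rouge">Counter</code> class and its <code class="highlighter-rouge">compare_and_set</code> method are hypothetical. Retrying after a failed attempt is an assumption added here; the text above only requires that the conflicting increment fail.</p>

```python
class Counter:
    def __init__(self):
        self.value = 0

    def compare_and_set(self, expected, new):
        """Write new only if the value is still the one that was read."""
        if self.value != expected:
            return False
        self.value = new
        return True


def increment(counter):
    """Retry the conditional increment until it succeeds."""
    while True:
        seen = counter.value
        if counter.compare_and_set(seen, seen + 1):
            return
```

<p>If <code class="highlighter-rouge">A</code> and <code class="highlighter-rouge">B</code> both read 0, the first conditional write succeeds and the second is rejected, so no increment is silently lost.</p>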
<p>A <strong>key-value store</strong> is a data structure that associates a unique value to any key. For example, a
dictionary is a key-value store that associates a unique definition to any word. Key-value stores
are the essence of every storage system; memory is a key-value store that associates a unique
sequence of bytes to any address, and databases are key-value stores that associate blobs of data to
any primary key. A <strong>transactional key-value store</strong> is simply a key-value store that supports
transactions. While transactions are challenging to correctly implement, there are an enormous
number of storage systems that are capable of handling them. Examples range from
<a href="https://en.wikipedia.org/wiki/Software_transactional_memory">software transactional memory</a> solutions for single machines to powerful databases like
<a href="https://en.wikipedia.org/wiki/Distributed_lock_manager">Cassandra</a> and <a href="https://en.wikipedia.org/wiki/Database_transaction">MySQL</a> for larger clusters.</p>
<p>Clearly, transactions are a useful primitive for building correct distributed systems and there
are a plethora of storage systems capable of handling them. However, these transactional storage
systems each have their own unique language for specifying transactions, and these languages are often lacking in
functionality and performance. Recent years have marked an explosion in NoSQL databases that scale
well by shedding functionality. These databases were popularized not because of their query
languages, but <em>in spite of them</em>. Some like <a href="https://docs.datastax.com/en/cql/3.1/cql/cql_intro_c.html">CQL</a> and <a href="https://docs.arangodb.com/3.1/AQL/">AQL</a> attempt to mimic SQL,
but, while similar in name and intent, most fall short of implementing the entire SQL specification.
Others like <a href="https://www.mongodb.com">MongoDB</a> and <a href="https://aws.amazon.com/dynamodb/">DynamoDB</a> have their own bespoke interfaces that are often so
complicated that they require <a href="https://university.mongodb.com/">classes</a>. But even SQL is not beyond reproach. In his article
<a href="https://tinyurl.com/yc7hjvvz">“Some Principles of Good Language Design”</a>, C. J. Date, one of the fathers of relational
databases, outlined a number of inherent flaws in the SQL language including its lack of a canonical
implementation and its ambiguous syntax. While these storage systems provide the necessary
transactional guarantees that are required to build safe distributed systems, their lack of a robust
interface makes it impossible to design nontrivial applications.</p>
<p>Caustic is a powerful and performant programming language for expressing and executing transactions
against <em>any</em> transactional key-value store. Caustic couples the robust functionality of a modern
programming language with the ACID guarantees of a transactional key-value store, both of which are
necessary to architect correct distributed systems.</p>
<h1 id="runtime">Runtime</h1>
<hr />
<p>The Caustic Runtime is a virtual machine that executes transactions on any transactional key-value
store that supports two operations: <code class="highlighter-rouge">get</code> and <code class="highlighter-rouge">cput</code>. The <code class="highlighter-rouge">get</code> operation retrieves the values of a set
of keys, and the <code class="highlighter-rouge">cput</code>, or conditional put, operation updates the values of a set
of keys if and only if a set of dependent keys remain unchanged. Internally, the Caustic Runtime
uses <a href="https://en.wikipedia.org/wiki/Multiversion_concurrency_control">Multiversion Concurrency Control</a> to detect modifications to keys. Each key is associated
with a version number that is incremented on each update. Transactions are executed through
partial evaluation. The runtime reads all the keys that are accessed or modified by the transaction, and
then evaluates all other operations. Because reads may be nested arbitrarily (in the case of a
pointer dereference, for example), the runtime may require multiple iterations of partial evaluation
to completely evaluate a transaction. Because database reads are batched together, the runtime
is guaranteed to perform a minimal number of roundtrips to and from the database, which should
significantly improve the performance of most nontrivial programs.</p>
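<p>To make the <code class="highlighter-rouge">get</code> and <code class="highlighter-rouge">cput</code> contract concrete, here is a minimal single-threaded sketch in Python. It is not Caustic's implementation: the class name <code class="highlighter-rouge">VersionedStore</code>, the helper <code class="highlighter-rouge">increment</code>, and the exact signatures are assumptions made purely for illustration. Each key carries a version number, and a conditional put succeeds only if every dependent key is still at the version the caller observed.</p>

```python
class VersionedStore:
    """Hypothetical in-memory key-value store; not part of Caustic."""

    def __init__(self):
        # key -> (version, value); a missing key is treated as version 0.
        self._data = {}

    def get(self, keys):
        """Return {key: (version, value)} for every requested key."""
        return {k: self._data.get(k, (0, None)) for k in keys}

    def cput(self, depends, changes):
        """Apply `changes` only if every key in `depends` is still at the
        version the caller observed. Returns True on success."""
        for key, seen in depends.items():
            current, _ = self._data.get(key, (0, None))
            if current != seen:
                return False  # Conflict: the key was updated in the meantime.
        for key, value in changes.items():
            current, _ = self._data.get(key, (0, None))
            self._data[key] = (current + 1, value)
        return True


def increment(store, key):
    """Read-modify-write that retries until the conditional put succeeds."""
    while True:
        version, value = store.get([key])[key]
        updated = (value or 0) + 1
        if store.cput({key: version}, {key: updated}):
            return updated
```

<p>A real store would have to make <code class="highlighter-rouge">cput</code> atomic; this sketch only shows the version check that detects conflicting updates.</p>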
<p>The runtime also integrates intermediate write-through caching for any non-transactional key-value
store that supports: <code class="highlighter-rouge">fetch</code>, <code class="highlighter-rouge">update</code> and <code class="highlighter-rouge">invalidate</code>. The <code class="highlighter-rouge">fetch</code> operation
retrieves the cached values of a set of keys, the <code class="highlighter-rouge">update</code> operation changes the
values of a set of keys, and the <code class="highlighter-rouge">invalidate</code> operation removes a set of keys from cache.
Multiversion Concurrency Control allows the runtime to cache <em>incoherently</em> without sacrificing data
integrity. The runtime speculates about the value of a key by reading a potentially stale version
from cache, and then validates the cached version number on commit. The runtime <em>automatically</em>
maintains cache coherency by evicting cached key-value pairs that cause version conflicts on commit!</p>
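<p>The speculate-then-validate idea can be sketched as follows. This is an illustrative Python model under stated assumptions, not the runtime's code: the <code class="highlighter-rouge">Cache</code> class, the function <code class="highlighter-rouge">commit_increment</code>, and the use of a plain dictionary as the database are all hypothetical. A stale cached version fails validation on commit and is evicted, so the next attempt reads the fresh value.</p>

```python
class Cache:
    """Hypothetical non-transactional cache with fetch/update/invalidate."""

    def __init__(self):
        self._entries = {}

    def fetch(self, keys):
        # Return only the keys that are currently cached.
        return {k: self._entries[k] for k in keys if k in self._entries}

    def update(self, entries):
        self._entries.update(entries)

    def invalidate(self, keys):
        for k in keys:
            self._entries.pop(k, None)


def commit_increment(cache, database, key):
    """Increment `key` using a possibly stale cached (version, value).

    `database` is a dict mapping key -> (version, value) that stands in
    for the transactional store. Returns True if the commit succeeded.
    """
    cached = cache.fetch([key])
    # Speculate: prefer the cached entry, fall back to the database.
    version, value = cached.get(key) or database.get(key, (0, None))
    # Validate: the version we read must still be the current one.
    current_version, _ = database.get(key, (0, None))
    if version != current_version:
        cache.invalidate([key])  # Evict the entry that caused the conflict.
        return False
    database[key] = (version + 1, (value or 0) + 1)
    cache.update({key: database[key]})  # Write through to the cache.
    return True
```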
<p>The runtime may operate as a distributed cluster of transaction execution units. Runtimes register
themselves in ZooKeeper and are automatically discoverable by clients who can remotely execute
transactions over Thrift.</p>
<h1 id="compiler">Compiler</h1>
<hr />
<p>Caustic transactions are composed of exactly 29 different operations and 4 different types. These
operations include <code class="highlighter-rouge">read</code> and <code class="highlighter-rouge">write</code>, which retrieve and update the value of a database
key, <code class="highlighter-rouge">branch</code>, which performs a conditional branch, <code class="highlighter-rouge">repeat</code>, which performs a conditional
loop, and arithmetic, logical, and comparison operations like <code class="highlighter-rouge">add</code>, <code class="highlighter-rouge">equal</code>, and
<code class="highlighter-rouge">less</code>. Crucially, every feature of the Caustic programming language can be represented by
these 29 operations and 4 types, and so everything in Caustic is transactional and race-free.</p>
<p>While very much a work in progress, the Caustic programming language is still packed with features.
It has all the common constructs like pointers, loops, conditionals, variables, functions, and
objects. It is statically and structurally typed, but features aggressive type inference. It also
compiles into a Scala library that is compatible with all existing frameworks, tooling, and
infrastructure for the JVM. Consider the following example of a distributed counter written in
Caustic. This program can be executed without modification on <em>any</em> transactional key-value store,
and run simultaneously without error on <em>any</em> number of machines.</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>module caustic.example
/**
 * A total quantity.
 *
 * @param value Current total.
 */
record Total {
  value: Int
}
/**
 * A distributed counting service.
 */
service Counter {
  /**
   * Increments the total and returns the current value.
   *
   * @param x Total reference.
   * @return Current value.
   */
  def increment(x: Total&): Int = {
    if x.value {
      x.value += 1
    } else {
      x.value = 1
    }
  }
}
</code></pre></div></div>
Mon, 06 Nov 2017 04:20:00 +0000
http://ashwin153.github.io/projects/caustic.html
Projects
Nightcrawler
<p>Our news industry <em>thrives</em> off instability and uncertainty, and it’s because <a href="http://www.roanoke.com/news/wire_headlines/amnesty-up-to-hanged-in-syria-s-slaughterhouse/article_de909053-fa36-5ee0-ae8c-597f774cdb99.html">Up to 13,000 hanged in Syria’s ‘slaughterhouse’</a> makes for a much better headline than “Everybody Got Along Today.”</p>
<p>Dan Gilroy’s <a href="http://www.imdb.com/title/tt2872718/">Nightcrawler</a> expertly depicts the sensationalism of modern news. It features Jake Gyllenhaal as a sociopathic freelance news cameraman. Jake Gyllenhaal’s character, Louis Bloom, begins the movie trying to make a quick buck recording local crime; however, he quickly finds out that “if it bleeds, it leads.” The more graphic the content, the larger his paycheck. As the movie progresses, he begins to tamper with crime scenes in order to stage them for the best possible shot. He breaks into people’s homes, films the deaths of peers, and shows no restraint in his pursuit of bloody headlines. The movie climaxes in a triple homicide hit-and-run, in which Louis withholds evidence in order to stage a shootout in a crowded restaurant and capture it on camera.</p>
<p>The movie highlights that the news is not designed to best inform; it’s designed to best perform in the ratings. Each night, Nina, the news director at the local news station and Louis’s love interest, sits together with her crew in a room and decides which stories are newsworthy. Nina admits that they “find [their] viewers are more interested in urban crime creeping into the suburbs. What that means is a victim or victims, preferably well-off and/or white, injured at the hands of the poor, or a minority.” Nina isn’t giving her audience the news that they need to hear; she’s giving them what they want to hear. She and her team selectively purchase and report on news that fits into their narrative. Even after they realized that the triple homicide was not a suburban home invasion but a drug robbery, Nina refused to air the story because it didn’t fit into her portrait of reality. In a conversation with Nina, Louis says that he’s “focusing on framing. A proper frame not only draws the eye into a picture, but keeps it there longer, dissolving the barrier between the subject and the outside of the frame.” Louis almost exclusively films graphic crime scenes and car accidents, but he wants to “dissolve the barrier” between his audience and his images. He doesn’t want to just portray an image of “urban crime creeping into the suburbs”; he wants his audience to <em>feel</em> like they are right there alongside the blood and gore. It’s clear that the purpose of the news today isn’t to inform; it’s to create a visceral reaction that supports whatever narrative is pushed by the media.</p>
<p>Despite being surrounded by news on a daily basis, it seems that, more than ever, people are uninformed. It’s not because they aren’t watching the news or paying attention to current events. It’s because a bunch of people in suits get together each night and decide what the world will talk about tomorrow. Towards the end of the movie, Louis admits to the police detective that he’d “like to think if you’re seeing me you’re having the worst day of your life.” Why can’t the news see us on our best days?</p>
<p>Cover photograph by <a href="https://images5.alphacoders.com/625/625941.png">images5.alphacoders.com</a>.</p>
Tue, 28 Feb 2017 04:20:00 +0000
http://ashwin153.github.io/reviews/nightcrawler.html
Reviews
Prime Product Compression
<h1 id="introduction">Introduction</h1>
<p>This project was inspired by the famous <a href="http://mathforum.org/library/drmath/view/55812.html">Lockers Riddle</a>. Suppose that there are 1000 closed lockers in a school with 1000 students. The first student opens every locker, the second student closes every second locker, the third student toggles every third locker (opening it if it’s closed and closing it if it’s open), and so on. Which lockers will be open once every student has had their turn? The answer to this puzzle is relatively straightforward: locker <em>k</em> is toggled once for each divisor of <em>k</em>, and only perfect squares have an odd number of divisors, so every perfect-square numbered locker will be open and all others will be closed. However, we may generalize this notion of lockers and students to compression of arbitrary sequences of binary data.</p>
<p>We may think of these 1000 lockers as a sequence of bits, or bit string, in which the value of each bit indicates whether the corresponding locker is open or closed. Furthermore, we may think of these students as applying periodic bit flips to this bit string: the first student flips every bit, the second student flips every second bit, etc. We may then represent <em>any</em> bit string as the sequence of periodic bit flips required to generate it. For example, the bit string 0101 may be generated by flipping every second bit of the zero bit string. Therefore, we may represent this four-bit string using just the two bits required to encode the number 2.</p>
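<p>The riddle itself is easy to check by brute force. The sketch below (the function name <code class="highlighter-rouge">toggled_odd_times</code> is an assumption, not part of the project) applies every student's periodic flip to an all-zero bit string and reports which positions end up flipped, that is, which lockers finish in the opposite state from where they started.</p>

```python
def toggled_odd_times(n):
    """Return the 1-based lockers whose state differs from the start
    after students 1..n have each toggled every i-th locker."""
    flipped = [False] * (n + 1)  # Index 0 is unused.
    for student in range(1, n + 1):
        for locker in range(student, n + 1, student):
            flipped[locker] = not flipped[locker]
    return [k for k in range(1, n + 1) if flipped[k]]


if __name__ == "__main__":
    print(toggled_odd_times(20))
```

<p>For 20 lockers the result is the perfect squares 1, 4, 9, and 16, because only perfect squares have an odd number of divisors.</p>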
<h1 id="algorithm">Algorithm</h1>
<p>Given some bit string <script type="math/tex">B \in \{0, 1\}^N</script>, allocate the zero bit string <script type="math/tex">B' = 0^N</script>. For each <script type="math/tex">i = 1, \ldots, N</script>, if <script type="math/tex">B_i \neq B'_i</script> then for all <script type="math/tex">\alpha i \leq N</script> for <script type="math/tex">\alpha \in \mathbb{Z}^{+}</script> set <script type="math/tex">B'_{\alpha i} = \neg B'_{\alpha i}</script> and append <script type="math/tex">i</script> to some sequence <script type="math/tex">X</script>. This procedure corresponds to the <script type="math/tex">i^{th}</script> student opening every <script type="math/tex">i^{th}</script> locker if it is closed and closing it if it is open. Clearly, the original bit string <script type="math/tex">B</script> may be recovered by applying the periodic bit flips in <script type="math/tex">X</script> to the zero bit string.</p>
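<p>As a minimal sketch of this procedure, separate from the prime product representation, the following Python computes the sequence of periods and replays it. The function names <code class="highlighter-rouge">flip_periods</code> and <code class="highlighter-rouge">apply_flips</code> are illustrative assumptions; periods are 1-based as in the description, while the lists are 0-based.</p>

```python
def flip_periods(bits):
    """Return the 1-based periods whose flips turn all zeros into `bits`."""
    n = len(bits)
    current = [0] * n
    periods = []
    for i in range(1, n + 1):
        if bits[i - 1] != current[i - 1]:
            periods.append(i)
            # Flip every i-th bit: positions i, 2i, 3i, ...
            for j in range(i, n + 1, i):
                current[j - 1] ^= 1
    return periods


def apply_flips(periods, n):
    """Replay the periodic flips on the zero bit string of length n."""
    bits = [0] * n
    for i in periods:
        for j in range(i, n + 1, i):
            bits[j - 1] ^= 1
    return bits


if __name__ == "__main__":
    print(flip_periods([0, 1, 0, 1]))
    print(apply_flips([2], 4))
```

<p>The bit string 0101 yields the single period 2, while a string such as 1001 needs all four periods, which hints at why the representation is not always compact.</p>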
<h2 id="correctness">Correctness</h2>
<p><strong>Theorem:</strong> Upon termination, <script type="math/tex">B = B'</script>. <strong>Proof:</strong> If <script type="math/tex">B_i \neq B'_i</script>, the algorithm flips every <script type="math/tex">i^{th}</script> bit. Therefore, after the <script type="math/tex">i^{th}</script> iteration of the algorithm <script type="math/tex">B_i = B'_i</script>. Because the algorithm iterates over increasing <script type="math/tex">i</script>, future iterations are guaranteed to preserve previous results. Clearly, if <script type="math/tex">j > i</script> and <script type="math/tex">B_j \neq B'_j</script> then <script type="math/tex">\alpha j > i</script> because <script type="math/tex">j > i</script> and <script type="math/tex">\alpha \geq 1</script> so the <script type="math/tex">i^{th}</script> bit is unaffected by the <script type="math/tex">j^{th}</script> iteration. Therefore, upon termination, for each <script type="math/tex">i</script>, <script type="math/tex">B_i = B'_i</script>. This implies that <script type="math/tex">B = B'</script> upon termination.</p>
<h2 id="complexity">Complexity</h2>
<p>The <script type="math/tex">i^{th}</script> iteration of the algorithm may require up to <script type="math/tex">\frac{N}{i}</script> bit flips. Therefore, in the worst case, all <script type="math/tex">N</script> iterations may require up to <script type="math/tex">N \sum_{i=1}^{N} \frac{1}{i}</script> bit flips. Clearly, <script type="math/tex">\sum_{i=1}^{N} \frac{1}{i}</script> is the <script type="math/tex">N^{th}</script> partial sum of the harmonic series, which is <a href="http://math.stackexchange.com/questions/496116/is-there-a-partial-sum-formula-for-the-harmonic-series">O(ln N)</a>. Therefore, the algorithm is <script type="math/tex">O(N \ln N)</script>.</p>
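<p>The bound can be checked numerically. In the sketch below, both function names are assumptions: <code class="highlighter-rouge">worst_case_flips</code> counts the flips performed if every one of the <em>N</em> iterations triggers, and <code class="highlighter-rouge">harmonic_bound</code> evaluates <em>N</em> times the <em>N</em>-th harmonic number.</p>

```python
import math


def worst_case_flips(n):
    """Bit flips performed if every iteration i = 1..n flips every i-th bit."""
    return sum(n // i for i in range(1, n + 1))


def harmonic_bound(n):
    """N times the N-th partial sum of the harmonic series."""
    return n * sum(1.0 / i for i in range(1, n + 1))


if __name__ == "__main__":
    for n in (10, 100, 1000):
        # The harmonic number is at most ln(n) + 1, so the bound is O(n ln n).
        print(n, worst_case_flips(n), harmonic_bound(n), n * (math.log(n) + 1))
```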
<h2 id="implementation">Implementation</h2>
<p>The algorithm encodes a binary string as a sequence of periodic bit flips and is correctly able to decode this sequence of bit flips back into the original binary string. We may encode this sequence of periodic bit flips as a single integer <script type="math/tex">p = \prod prime(X_i)</script> where <script type="math/tex">prime(n)</script> returns the <script type="math/tex">n^{th}</script> prime number. Because the elements of <script type="math/tex">X</script> are distinct and prime factorizations are unique, this number can be decomposed into its prime factors to recover the sequence of periodic bit flips. The implementation of the algorithm and the prime product representation is included below.</p>
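<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">try</span><span class="p">:</span>
    <span class="kn">from</span> <span class="nn">itertools</span> <span class="kn">import</span> <span class="n">izip</span>  <span class="c"># Python 2</span>
<span class="k">except</span> <span class="nb">ImportError</span><span class="p">:</span>
    <span class="n">izip</span> <span class="o">=</span> <span class="nb">zip</span>  <span class="c"># Python 3: the built-in zip is already lazy</span>
<span class="kn">from</span> <span class="nn">functools</span> <span class="kn">import</span> <span class="n">reduce</span>  <span class="c"># reduce is not a built-in on Python 3</span>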
<span class="kn">import</span> <span class="nn">random</span>
<span class="kn">import</span> <span class="nn">math</span>
<span class="c"># The first 168 prime numbers (http://primos.mat.br/indexen.html), preceded by -1 at index 0 so that a flip of every bit is recorded in the sign of the product.</span>
<span class="n">primes</span> <span class="o">=</span> <span class="p">[</span>
<span class="o">-</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="mi">7</span><span class="p">,</span> <span class="mi">11</span><span class="p">,</span> <span class="mi">13</span><span class="p">,</span> <span class="mi">17</span><span class="p">,</span> <span class="mi">19</span><span class="p">,</span> <span class="mi">23</span><span class="p">,</span> <span class="mi">29</span><span class="p">,</span> <span class="mi">31</span><span class="p">,</span> <span class="mi">37</span><span class="p">,</span> <span class="mi">41</span><span class="p">,</span> <span class="mi">43</span><span class="p">,</span> <span class="mi">47</span><span class="p">,</span> <span class="mi">53</span><span class="p">,</span> <span class="mi">59</span><span class="p">,</span> <span class="mi">61</span><span class="p">,</span> <span class="mi">67</span><span class="p">,</span> <span class="mi">71</span><span class="p">,</span> <span class="mi">73</span><span class="p">,</span> <span class="mi">79</span><span class="p">,</span>
<span class="mi">83</span><span class="p">,</span> <span class="mi">89</span><span class="p">,</span> <span class="mi">97</span><span class="p">,</span> <span class="mi">101</span><span class="p">,</span> <span class="mi">103</span><span class="p">,</span> <span class="mi">107</span><span class="p">,</span> <span class="mi">109</span><span class="p">,</span> <span class="mi">113</span><span class="p">,</span> <span class="mi">127</span><span class="p">,</span> <span class="mi">131</span><span class="p">,</span> <span class="mi">137</span><span class="p">,</span> <span class="mi">139</span><span class="p">,</span> <span class="mi">149</span><span class="p">,</span> <span class="mi">151</span><span class="p">,</span> <span class="mi">157</span><span class="p">,</span> <span class="mi">163</span><span class="p">,</span> <span class="mi">167</span><span class="p">,</span> <span class="mi">173</span><span class="p">,</span>
<span class="mi">179</span><span class="p">,</span> <span class="mi">181</span><span class="p">,</span> <span class="mi">191</span><span class="p">,</span> <span class="mi">193</span><span class="p">,</span> <span class="mi">197</span><span class="p">,</span> <span class="mi">199</span><span class="p">,</span> <span class="mi">211</span><span class="p">,</span> <span class="mi">223</span><span class="p">,</span> <span class="mi">227</span><span class="p">,</span> <span class="mi">229</span><span class="p">,</span> <span class="mi">233</span><span class="p">,</span> <span class="mi">239</span><span class="p">,</span> <span class="mi">241</span><span class="p">,</span> <span class="mi">251</span><span class="p">,</span> <span class="mi">257</span><span class="p">,</span> <span class="mi">263</span><span class="p">,</span> <span class="mi">269</span><span class="p">,</span>
<span class="mi">271</span><span class="p">,</span> <span class="mi">277</span><span class="p">,</span> <span class="mi">281</span><span class="p">,</span> <span class="mi">283</span><span class="p">,</span> <span class="mi">293</span><span class="p">,</span> <span class="mi">307</span><span class="p">,</span> <span class="mi">311</span><span class="p">,</span> <span class="mi">313</span><span class="p">,</span> <span class="mi">317</span><span class="p">,</span> <span class="mi">331</span><span class="p">,</span> <span class="mi">337</span><span class="p">,</span> <span class="mi">347</span><span class="p">,</span> <span class="mi">349</span><span class="p">,</span> <span class="mi">353</span><span class="p">,</span> <span class="mi">359</span><span class="p">,</span> <span class="mi">367</span><span class="p">,</span> <span class="mi">373</span><span class="p">,</span>
<span class="mi">379</span><span class="p">,</span> <span class="mi">383</span><span class="p">,</span> <span class="mi">389</span><span class="p">,</span> <span class="mi">397</span><span class="p">,</span> <span class="mi">401</span><span class="p">,</span> <span class="mi">409</span><span class="p">,</span> <span class="mi">419</span><span class="p">,</span> <span class="mi">421</span><span class="p">,</span> <span class="mi">431</span><span class="p">,</span> <span class="mi">433</span><span class="p">,</span> <span class="mi">439</span><span class="p">,</span> <span class="mi">443</span><span class="p">,</span> <span class="mi">449</span><span class="p">,</span> <span class="mi">457</span><span class="p">,</span> <span class="mi">461</span><span class="p">,</span> <span class="mi">463</span><span class="p">,</span> <span class="mi">467</span><span class="p">,</span>
<span class="mi">479</span><span class="p">,</span> <span class="mi">487</span><span class="p">,</span> <span class="mi">491</span><span class="p">,</span> <span class="mi">499</span><span class="p">,</span> <span class="mi">503</span><span class="p">,</span> <span class="mi">509</span><span class="p">,</span> <span class="mi">521</span><span class="p">,</span> <span class="mi">523</span><span class="p">,</span> <span class="mi">541</span><span class="p">,</span> <span class="mi">547</span><span class="p">,</span> <span class="mi">557</span><span class="p">,</span> <span class="mi">563</span><span class="p">,</span> <span class="mi">569</span><span class="p">,</span> <span class="mi">571</span><span class="p">,</span> <span class="mi">577</span><span class="p">,</span> <span class="mi">587</span><span class="p">,</span> <span class="mi">593</span><span class="p">,</span>
<span class="mi">599</span><span class="p">,</span> <span class="mi">601</span><span class="p">,</span> <span class="mi">607</span><span class="p">,</span> <span class="mi">613</span><span class="p">,</span> <span class="mi">617</span><span class="p">,</span> <span class="mi">619</span><span class="p">,</span> <span class="mi">631</span><span class="p">,</span> <span class="mi">641</span><span class="p">,</span> <span class="mi">643</span><span class="p">,</span> <span class="mi">647</span><span class="p">,</span> <span class="mi">653</span><span class="p">,</span> <span class="mi">659</span><span class="p">,</span> <span class="mi">661</span><span class="p">,</span> <span class="mi">673</span><span class="p">,</span> <span class="mi">677</span><span class="p">,</span> <span class="mi">683</span><span class="p">,</span> <span class="mi">691</span><span class="p">,</span>
<span class="mi">701</span><span class="p">,</span> <span class="mi">709</span><span class="p">,</span> <span class="mi">719</span><span class="p">,</span> <span class="mi">727</span><span class="p">,</span> <span class="mi">733</span><span class="p">,</span> <span class="mi">739</span><span class="p">,</span> <span class="mi">743</span><span class="p">,</span> <span class="mi">751</span><span class="p">,</span> <span class="mi">757</span><span class="p">,</span> <span class="mi">761</span><span class="p">,</span> <span class="mi">769</span><span class="p">,</span> <span class="mi">773</span><span class="p">,</span> <span class="mi">787</span><span class="p">,</span> <span class="mi">797</span><span class="p">,</span> <span class="mi">809</span><span class="p">,</span> <span class="mi">811</span><span class="p">,</span> <span class="mi">821</span><span class="p">,</span>
<span class="mi">823</span><span class="p">,</span> <span class="mi">827</span><span class="p">,</span> <span class="mi">829</span><span class="p">,</span> <span class="mi">839</span><span class="p">,</span> <span class="mi">853</span><span class="p">,</span> <span class="mi">857</span><span class="p">,</span> <span class="mi">859</span><span class="p">,</span> <span class="mi">863</span><span class="p">,</span> <span class="mi">877</span><span class="p">,</span> <span class="mi">881</span><span class="p">,</span> <span class="mi">883</span><span class="p">,</span> <span class="mi">887</span><span class="p">,</span> <span class="mi">907</span><span class="p">,</span> <span class="mi">911</span><span class="p">,</span> <span class="mi">919</span><span class="p">,</span> <span class="mi">929</span><span class="p">,</span> <span class="mi">937</span><span class="p">,</span>
<span class="mi">941</span><span class="p">,</span> <span class="mi">947</span><span class="p">,</span> <span class="mi">953</span><span class="p">,</span> <span class="mi">967</span><span class="p">,</span> <span class="mi">971</span><span class="p">,</span> <span class="mi">977</span><span class="p">,</span> <span class="mi">983</span><span class="p">,</span> <span class="mi">991</span><span class="p">,</span> <span class="mi">997</span>
<span class="p">]</span>
<span class="k">def</span> <span class="nf">simulate</span><span class="p">(</span><span class="n">n</span><span class="p">,</span> <span class="n">length</span><span class="p">):</span>
    <span class="s">"""
    Perform a simulation of n trials of bit strings of the specified length. Prints the
    average encoded size in bits and the fraction of bit strings that were compressible.
    :param n: Number of trials.
    :param length: Length of bit string.
    """</span>
    <span class="k">def</span> <span class="nf">step</span><span class="p">():</span>
        <span class="c"># Construct and encode a random sequence of bits.</span>
        <span class="n">r</span> <span class="o">=</span> <span class="p">[</span><span class="nb">bool</span><span class="p">(</span><span class="n">random</span><span class="o">.</span><span class="n">getrandbits</span><span class="p">(</span><span class="mi">1</span><span class="p">))</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">length</span><span class="p">)]</span>
        <span class="n">e</span> <span class="o">=</span> <span class="n">encode</span><span class="p">(</span><span class="n">r</span><span class="p">)</span>
        <span class="c"># Ensure that the decoding is consistent with the original.</span>
        <span class="k">assert</span> <span class="n">r</span> <span class="o">==</span> <span class="n">decode</span><span class="p">(</span><span class="o">*</span><span class="n">e</span><span class="p">)</span>
        <span class="c"># Calculate the size of the encoding: the bits of the magnitude plus one sign bit.</span>
        <span class="k">return</span> <span class="n">math</span><span class="o">.</span><span class="n">ceil</span><span class="p">(</span><span class="n">math</span><span class="o">.</span><span class="n">log</span><span class="p">(</span><span class="nb">abs</span><span class="p">(</span><span class="n">e</span><span class="p">[</span><span class="mi">1</span><span class="p">])</span> <span class="o">+</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">))</span> <span class="o">+</span> <span class="mi">1</span>
    <span class="c"># Run exactly n trials.</span>
    <span class="n">s</span> <span class="o">=</span> <span class="p">[</span><span class="n">step</span><span class="p">()</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">n</span><span class="p">)]</span>
    <span class="k">print</span><span class="p">(</span><span class="s">"Average bits: "</span> <span class="o">+</span> <span class="nb">str</span><span class="p">(</span><span class="nb">sum</span><span class="p">(</span><span class="n">s</span><span class="p">)</span> <span class="o">/</span> <span class="nb">len</span><span class="p">(</span><span class="n">s</span><span class="p">)))</span>
    <span class="k">print</span><span class="p">(</span><span class="s">"Compressible: "</span> <span class="o">+</span> <span class="nb">str</span><span class="p">(</span><span class="nb">sum</span><span class="p">(</span><span class="nb">map</span><span class="p">(</span><span class="k">lambda</span> <span class="n">x</span><span class="p">:</span> <span class="mf">1.0</span> <span class="k">if</span> <span class="n">x</span> <span class="o"><</span> <span class="n">length</span> <span class="k">else</span> <span class="mf">0.0</span><span class="p">,</span> <span class="n">s</span><span class="p">))</span> <span class="o">/</span> <span class="nb">len</span><span class="p">(</span><span class="n">s</span><span class="p">)))</span>
<span class="k">def</span> <span class="nf">encode</span><span class="p">(</span><span class="n">bits</span><span class="p">):</span>
    <span class="s">"""
    Encodes the specified bits as the product of the prime numbers corresponding to the
    indices in a bit string of all zeroes or all ones (depending on which is smaller),
    that need to be flipped to generate the bits.
    :param bits: Bits to encode.
    :return: Compressed bits.
    """</span>
    <span class="n">copy</span> <span class="o">=</span> <span class="p">[</span><span class="bp">False</span><span class="p">]</span> <span class="o">*</span> <span class="nb">len</span><span class="p">(</span><span class="n">bits</span><span class="p">)</span>
    <span class="n">flip</span> <span class="o">=</span> <span class="p">[]</span>
    <span class="c"># Determine the necessary bit flips required to copy the bit string.</span>
    <span class="k">for</span> <span class="n">i</span><span class="p">,</span> <span class="p">(</span><span class="n">b</span><span class="p">,</span> <span class="n">c</span><span class="p">)</span> <span class="ow">in</span> <span class="nb">enumerate</span><span class="p">(</span><span class="n">izip</span><span class="p">(</span><span class="n">bits</span><span class="p">,</span> <span class="n">copy</span><span class="p">)):</span>
        <span class="k">if</span> <span class="n">b</span> <span class="o">!=</span> <span class="n">c</span><span class="p">:</span>
            <span class="n">flip</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">i</span><span class="p">)</span>
            <span class="k">for</span> <span class="n">j</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">i</span><span class="p">,</span> <span class="nb">len</span><span class="p">(</span><span class="n">copy</span><span class="p">),</span> <span class="n">i</span> <span class="o">+</span> <span class="mi">1</span><span class="p">):</span>
                <span class="n">copy</span><span class="p">[</span><span class="n">j</span><span class="p">]</span> <span class="o">=</span> <span class="ow">not</span> <span class="n">copy</span><span class="p">[</span><span class="n">j</span><span class="p">]</span>
    <span class="k">if</span> <span class="ow">not</span> <span class="n">flip</span><span class="p">:</span>
        <span class="c"># If there are no bit flips, then return 0.</span>
        <span class="k">return</span> <span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">bits</span><span class="p">),</span> <span class="mi">0</span><span class="p">)</span>
    <span class="k">else</span><span class="p">:</span>
        <span class="c"># Otherwise, return the product of the prime numbers associated with each index.</span>
        <span class="n">product</span> <span class="o">=</span> <span class="nb">reduce</span><span class="p">(</span><span class="k">lambda</span> <span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">:</span> <span class="n">x</span> <span class="o">*</span> <span class="n">y</span><span class="p">,</span> <span class="nb">map</span><span class="p">(</span><span class="k">lambda</span> <span class="n">x</span><span class="p">:</span> <span class="n">primes</span><span class="p">[</span><span class="n">x</span><span class="p">],</span> <span class="n">flip</span><span class="p">))</span>
        <span class="k">return</span> <span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">bits</span><span class="p">),</span> <span class="n">product</span><span class="p">)</span>
<span class="k">def</span> <span class="nf">decode</span><span class="p">(</span><span class="n">length</span><span class="p">,</span> <span class="n">product</span><span class="p">):</span>
    <span class="s">"""
    Decode the prime product by finding its prime factorization and performing the
    required bit flips on a bit string of all zeroes or all ones (depending on the
    sign of the product).
    :param length: Length of bit string.
    :param product: Prime product.
    :return: Decoded bits.
    """</span>
    <span class="c"># Find the prime factorization of the product to determine the bit flips.</span>
    <span class="k">if</span> <span class="n">product</span> <span class="o">==</span> <span class="mi">0</span><span class="p">:</span>
        <span class="n">flip</span> <span class="o">=</span> <span class="p">[]</span>
    <span class="k">else</span><span class="p">:</span>
        <span class="n">flip</span> <span class="o">=</span> <span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="k">if</span> <span class="n">product</span> <span class="o"><</span> <span class="mi">0</span> <span class="k">else</span> <span class="p">[]</span>
        <span class="n">flip</span><span class="o">.</span><span class="n">extend</span><span class="p">([</span><span class="n">i</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="n">length</span><span class="p">)</span> <span class="k">if</span> <span class="nb">abs</span><span class="p">(</span><span class="n">product</span><span class="p">)</span> <span class="o">%</span> <span class="n">primes</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">==</span> <span class="mi">0</span><span class="p">])</span>
    <span class="c"># Perform the bit flips in order on the bit string.</span>
    <span class="n">bits</span> <span class="o">=</span> <span class="p">[</span><span class="bp">False</span><span class="p">]</span> <span class="o">*</span> <span class="n">length</span>
    <span class="k">for</span> <span class="n">f</span> <span class="ow">in</span> <span class="n">flip</span><span class="p">:</span>
        <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">f</span><span class="p">,</span> <span class="n">length</span><span class="p">,</span> <span class="n">f</span> <span class="o">+</span> <span class="mi">1</span><span class="p">):</span>
            <span class="n">bits</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">=</span> <span class="ow">not</span> <span class="n">bits</span><span class="p">[</span><span class="n">i</span><span class="p">]</span>
    <span class="k">return</span> <span class="n">bits</span>
</code></pre></div></div>
<h1 id="results">Results</h1>
<p>The algorithm proved to have relatively poor compression ratios for randomly generated bit strings. However, it is possible that for certain kinds of files this compression strategy may prove to be performant. More rigorous validation is definitely required, but initial results are not promising.</p>
<p>Cover photograph by <a href="https://tctechcrunch2011.files.wordpress.com/2016/08/gettyimages-532955831.jpg">Tech Crunch</a>.</p>
Wed, 08 Feb 2017 04:20:00 +0000
http://ashwin153.github.io/projects/compress.html
Projects
Bricks
<p>The full code and documentation is available on <a href="https://github.com/ashwin153/bricks">GitHub</a>.</p>
<h1 id="introduction">Introduction</h1>
<p>Facebook Messenger recently launched <a href="https://techcrunch.com/2016/11/29/messenger-instant-games/">Instant Games</a>, a new social gaming platform directly integrated into the Messenger application. However, what should have been a fun competition among friends quickly became an obsession. Tired of waking up every morning to six new notifications about how my high score in <a href="https://www.gameeapp.com/game/tC8eBB">Brick Pop</a> had been bested the night before, I set about doing what anyone else would have done in my position - writing an automated solver.</p>
<h1 id="mechanics">Mechanics</h1>
<p>Brick Pop is similar to other tile-matching games like Bejeweled and Candy Crush. The game begins with a 10x10 matrix of bricks, each assigned one of six colors (red, teal, yellow, blue, purple, and gray). The goal of the game is to remove adjacent bricks with matching colors until no bricks remain on the board. Each time that a brick is removed, the bricks above it fall down. Each time an entire column of bricks is removed, the bricks to its right move left. For example, the screenshot below is of a sample game.</p>
<p><img src="/img/bricks-game.png" alt="Game" title="Sample Game" /></p>
<h1 id="solver">Solver</h1>
<p>Because the game is relatively small, it's sufficient to provide a brute force solution; even the brute force solution terminates in a handful of milliseconds. The brute force algorithm recursively removes adjacent bricks with matching colors and terminates when no bricks remain or all sequences of removals have been tried. By greedily selecting the largest neighborhood of matching color bricks first, the algorithm is likely to find a maximum score path; however, it is not guaranteed to do so.</p>
<ul>
<li><strong>Determine the set of all neighborhoods of similarly colored bricks.</strong> Neighborhoods can be determined using a variation of flood fill, which is guaranteed to visit each brick exactly once.</li>
<li><strong>Sort neighborhoods by size.</strong> Removing more bricks at a time produces an exponentially higher score. By greedily trying the largest neighborhood first, the algorithm tends to find a high-scoring solution without having to try all possibilities, although the highest possible score is not guaranteed.</li>
<li><strong>Remove and recurse.</strong> Remove each neighborhood and recursively try to solve the remaining board, until some sequence of removals produces an empty board or until all sequences of removals have been tried. Because it is impossible to solve a board in which there exists exactly one brick of any particular color, we can also stop recursing when this condition is met.</li>
</ul>
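<p>The three steps above can be sketched in a few lines of Python. This is a minimal illustration rather than the code from the repository: the board representation (a list of columns, each listing colors from bottom to top), the function names, and the rule that a lone brick cannot be removed are all assumptions made for the example.</p>

```python
from collections import Counter


def neighborhoods(board):
    """Group same-colored, orthogonally adjacent bricks using flood fill.

    `board` is a list of columns; each column lists colors bottom-to-top.
    Every brick is visited exactly once.
    """
    seen, groups = set(), []
    for c, column in enumerate(board):
        for r, color in enumerate(column):
            if (c, r) in seen:
                continue
            seen.add((c, r))
            group, stack = [], [(c, r)]
            while stack:
                x, y = stack.pop()
                group.append((x, y))
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if (0 <= nx < len(board) and 0 <= ny < len(board[nx])
                            and (nx, ny) not in seen and board[nx][ny] == color):
                        seen.add((nx, ny))
                        stack.append((nx, ny))
            if len(group) > 1:  # assumption: a lone brick cannot be removed
                groups.append(group)
    return groups


def remove(board, group):
    """Remove a neighborhood; bricks above fall and empty columns close up."""
    gone = set(group)
    columns = [[color for r, color in enumerate(col) if (c, r) not in gone]
               for c, col in enumerate(board)]
    return [col for col in columns if col]


def solve(board):
    """Return a sequence of removals that empties the board, or None."""
    if not board:
        return []
    counts = Counter(color for col in board for color in col)
    if 1 in counts.values():  # exactly one brick of some color: unsolvable
        return None
    # Try the largest neighborhoods first, then recurse on the remaining board.
    for group in sorted(neighborhoods(board), key=len, reverse=True):
        rest = solve(remove(board, group))
        if rest is not None:
            return [group] + rest
    return None
```

<p>Storing columns bottom-to-top keeps the game mechanics cheap to simulate: falling bricks are just a filtered list, and shifting columns left is just dropping the empty ones.</p>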
<h1 id="image-processing">Image Processing</h1>
<p>Because manual input of individual brick positions proved to be an extraordinarily difficult task, I wrote some simple image processing code that enables you to upload a screenshot of the game from which brick positions are extracted.</p>
<h1 id="future-work">Future Work</h1>
<p>It would be cool to simulate iPhone touches which would allow the solver to directly play the game without manual intervention.</p>
<p>Cover photograph by <a href="http://wallpaperswide.com/arcade_game_machine-wallpapers.html">Wallpapers Wide</a>.</p>
Wed, 08 Feb 2017 04:20:00 +0000
http://ashwin153.github.io/projects/bricks.html
Projects
Swara
<p>The full code and documentation is available on <a href="https://github.com/ashwin153/swara">Github</a>. Cover photograph by <a href="http://samyusridhar.github.io">Samyuktha Sridhar</a>.</p>
<h1 id="introduction">Introduction</h1>
<p><a href="https://en.wikipedia.org/wiki/Swara">Swara</a> is the South Indian word for a musical note. It’s an appropriate name for a musical machine learning project that attempts to imitate the legends of South Indian music, who leverage their vast and varied musical experiences to extemporaneously produce profoundly intricate <a href="https://en.wikipedia.org/wiki/Ragam_Thanam_Pallavi">sequences of swaras</a>.</p>
<p>Musical machine learning is an area of active research (e.g. <a href="https://magenta.tensorflow.org/welcome-to-magenta">Google Magenta</a> and <a href="https://www.jukedeck.com/">Jukedeck</a>). In this article, I’ll describe my approach to the development of a variety of musical applications and the various musical and machine learning models that enable them.</p>
<h1 id="musical-modeling">Musical Modeling</h1>
<p>The first challenge with Swara was to develop a library, <code class="highlighter-rouge">swara-music</code>, for digitally representing sheet music. The library enables the programmatic construction and modification of the various musical elements of a song while retaining their underlying musical meaning. For example, a waltz tempo may be constructed using the library in the following manner.</p>
<div class="language-scala highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// Waltz Tempo.
</span><span class="k">val</span> <span class="n">waltz</span> <span class="k">=</span> <span class="nc">Tempo</span><span class="o">(</span>
  <span class="n">signature</span> <span class="k">=</span> <span class="nc">Length</span><span class="o">(</span><span class="mi">3</span><span class="o">,</span> <span class="mi">4</span><span class="o">),</span>
  <span class="n">bpm</span> <span class="k">=</span> <span class="mf">80.0</span>
<span class="o">)</span>
</code></pre></div></div>
<p>The <code class="highlighter-rouge">swara-music</code> library currently supports the following musical elements: <code class="highlighter-rouge">Song</code>, <code class="highlighter-rouge">Fragment</code>, <code class="highlighter-rouge">Key</code>, <code class="highlighter-rouge">Tempo</code>, <code class="highlighter-rouge">Phrase</code>, <code class="highlighter-rouge">Voice</code>, <code class="highlighter-rouge">Chord</code>, <code class="highlighter-rouge">Note</code>, <code class="highlighter-rouge">Pitch</code>, and <code class="highlighter-rouge">Length</code>. These simple musical primitives can be combined to form highly complex musical arrangements. For example, this <a href="https://gist.github.com/ashwin153/d86292dbfc346b48d7e8f9e79db463fd">code</a> produces the following fragment of sheet music (rendered using <a href="https://musescore.org/en/2.0">MuseScore 2</a>).</p>
<p><img align="center" src="/img/sample-song.png" /></p>
<p>However, the library is still too unwieldy to be frequently used to directly write music. To facilitate interoperability with existing music writing tools, the library provides adapters for various file formats (MIDI, JSON, etc.).</p>
<ul>
<li>Working with MIDI is notoriously difficult</li>
<li>Cool builder pattern</li>
</ul>
<h1 id="machine-learning">Machine Learning</h1>
<p>The next challenge with Swara was to develop <code class="highlighter-rouge">swara-learn</code>, a library of generalized machine learning models. <strong>Disclaimer:</strong> this library does not pretend to compete with high-performance machine learning implementations like TensorFlow and Caffe. This library is simply the result of genuine curiosity into the inner workings of artificial intelligence.</p>
<h3 id="discrete-markov-chain">Discrete Markov Chain</h3>
<p>A <code class="highlighter-rouge">DiscreteMarkovChain</code> is an unsupervised random process that undergoes transitions from one state to another. It is a special case of a family of stochastic models known as Markov models, which are used to model random processes in which future states depend only on some number of previous states and not on any states prior. More formally, this “memoryless” property is the requirement that for some discrete sequence <script type="math/tex">x_0, \ldots, x_n</script> the probability <script type="math/tex">P(x_n \vert x_{n-1}, \ldots, x_{0}) = P(x_n \vert x_{n-1}, \ldots, x_{n-k})</script> for some fixed <script type="math/tex">k > 0</script>.</p>
<p>The implementation of discrete Markov chains is built off a simple, but high-performance custom <a href="https://en.wikipedia.org/wiki/Trie">Trie</a> and, unlike other implementations, does not require an explicit definition of the state space and the various transition probabilities between states. The <code class="highlighter-rouge">DiscreteMarkovChain</code> is fully thread-safe; therefore, it may simultaneously be trained and used.</p>
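<p>To make the idea concrete, here is a minimal Python sketch of a Markov chain that learns its state space from the training data instead of requiring it up front. It is not the <code class="highlighter-rouge">swara-learn</code> implementation: the class name, the method names, and the use of a dictionary of counters in place of a trie are assumptions made for illustration, and the sketch makes no attempt at thread safety.</p>

```python
import random
from collections import Counter, defaultdict


class MarkovChain:
    """Order-k chain that discovers states and transitions from data."""

    def __init__(self, order=1):
        self.order = order
        # Maps a context (tuple of the previous `order` states) to the
        # number of times each next state followed it.
        self.counts = defaultdict(Counter)

    def train(self, sequence):
        """Count every (context, next state) pair in the sequence."""
        for i in range(len(sequence) - self.order):
            context = tuple(sequence[i:i + self.order])
            self.counts[context][sequence[i + self.order]] += 1

    def probability(self, context, state):
        """Estimate P(state | context) from the observed counts."""
        seen = self.counts.get(tuple(context))
        if not seen:
            return 0.0
        return seen[state] / sum(seen.values())

    def generate(self, context, length, rng=random):
        """Extend `context` by up to `length` randomly sampled states."""
        out = list(context)
        for _ in range(length):
            seen = self.counts.get(tuple(out[-self.order:]))
            if not seen:
                break  # this context was never observed during training
            states, weights = zip(*seen.items())
            out.append(rng.choices(states, weights=weights)[0])
        return out
```

<p>Because contexts are only created when they are observed, memory grows with the training data rather than with the size of the theoretical state space.</p>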
<h3 id="hidden-markov-model">Hidden Markov Model</h3>
<p>A <code class="highlighter-rouge">HiddenMarkovModel</code> is a supervised learning technique that models a system in which every sequence of observations <script type="math/tex">O_1, \ldots, O_n</script> is generated by a Markov process whose state <script type="math/tex">H_t</script> is hidden from the observer. Hidden Markov models are especially useful for sequence prediction (e.g. speech-to-text) and generation (e.g. speech generation). The implementation was designed with the following considerations in mind:</p>
<ul>
<li><strong>Unknown state space</strong>: In many problems, the state space is not known a priori. For example, the state space of all possible chords is infinite because they may contain any number of any note; however, every song only ever uses a finite number of distinct chords. Traditional dynamic programming algorithms like the Viterbi Algorithm require you to preallocate an <script type="math/tex">n</script> by <script type="math/tex">\vert H \vert</script> matrix. This is impossible if the underlying state space <script type="math/tex">H</script> is unknown. Therefore, I was forced to use an A* variation instead.</li>
<li><strong>Concurrency</strong>: The implementation will be trained on massive datasets, so it should be usable for prediction and generation while it is being trained.</li>
</ul>
<h3 id="genetic-algorithms">Genetic Algorithms</h3>
<p>A genetic algorithm is an algorithm that mimics the processes of biological evolution to find optimal solutions to problems. In high school biology, we learned that when two organisms reproduce, their genomes are recombined and mutated to produce offspring. Over many generations, favorable traits are naturally selected and become more predominant within a population. By mathematically defining the genetic operators that enable evolution (recombination, mutation, natural selection), we can harness the power of nature to solve arbitrary problems.</p>
<p>The genetic algorithm library exposes the <code class="highlighter-rouge">Selector</code>, <code class="highlighter-rouge">Mutator</code>, <code class="highlighter-rouge">Evaluator</code>, and <code class="highlighter-rouge">Recombinator</code> traits which allow the rules of evolution to be arbitrarily defined for any problem. A <code class="highlighter-rouge">Population</code> may then be evolved according to these defined rules.</p>
<ul>
<li><code class="highlighter-rouge">Evaluator</code>: Maps each individual in the population to an evolutionary fitness score, which represents the likelihood an individual will survive and reproduce (i.e. analogue of survival of the fittest).</li>
<li><code class="highlighter-rouge">Selector</code>: Defines which individuals are selected for reproduction (i.e. analogue of natural selection).</li>
<li><code class="highlighter-rouge">Recombinator</code>: Defines how two parents who have been selected for reproduction produce offspring (i.e. analogue of cellular reproduction).</li>
<li><code class="highlighter-rouge">Mutator</code>: Defines how an individual’s genome is mutated. Essential to encourage diversity and variation within a population (i.e. analogue of cellular mutation).</li>
</ul>
<p>For example, consider the problem of function maximization. Suppose we would like to maximize some function <script type="math/tex">f \colon \mathbb{R} \to \mathbb{R}</script> over some interval <script type="math/tex">(from, until) \subset \mathbb{R}</script> but <script type="math/tex">f</script> is not differentiable. Clearly, the standard method of locating maxima by finding points at which the derivative vanishes is not applicable. However, we may still use a genetic algorithm if we first define the rules of evolution in the context of function maximization. We construct a population of randomly selected points <script type="math/tex">x \in (from, until)</script> and define the evaluator to be <script type="math/tex">eval(x) = f(x)</script>, the recombinator to be the average <script type="math/tex">recombine(x_1, x_2) = \frac{x_1 + x_2}{2}</script>, the mutator to jitter by a Gaussian random variable <script type="math/tex">mutate(x) = x + \epsilon</script> for <script type="math/tex">\epsilon \sim N(0, 1)</script>, and the selector to be a standard <a href="https://en.wikipedia.org/wiki/Tournament_selection">Tournament Selector</a>.</p>
<div class="language-scala highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">val</span> <span class="n">generator</span> <span class="k">=</span> <span class="nc">Rand</span><span class="o">.</span><span class="n">uniform</span><span class="o">.</span><span class="n">map</span><span class="o">(</span><span class="k">_</span> <span class="o">*</span> <span class="o">(</span><span class="n">until</span> <span class="o">-</span> <span class="n">from</span><span class="o">)</span> <span class="o">+</span> <span class="n">from</span><span class="o">)</span>
<span class="k">var</span> <span class="n">population</span> <span class="k">=</span> <span class="nc">Population</span><span class="o">(</span><span class="nc">Seq</span><span class="o">.</span><span class="n">fill</span><span class="o">(</span><span class="mi">100</span><span class="o">)(</span><span class="n">generator</span><span class="o">.</span><span class="n">draw</span><span class="o">()))</span>
<span class="o">(</span><span class="mi">1</span> <span class="n">to</span> <span class="mi">50</span><span class="o">).</span><span class="n">foreach</span> <span class="o">{</span> <span class="k">_</span> <span class="k">=></span>
  <span class="n">population</span> <span class="k">=</span> <span class="n">population</span><span class="o">.</span><span class="n">evolve</span><span class="o">(</span>
    <span class="n">selector</span> <span class="k">=</span> <span class="k">new</span> <span class="nc">TournamentSelector</span><span class="o">(</span><span class="mi">5</span><span class="o">),</span>
    <span class="n">recombinator</span> <span class="k">=</span> <span class="o">(</span><span class="n">f</span><span class="o">,</span> <span class="n">m</span><span class="o">)</span> <span class="k">=></span> <span class="o">(</span><span class="n">f</span> <span class="o">+</span> <span class="n">m</span><span class="o">)</span> <span class="o">/</span> <span class="mf">2.0</span><span class="o">,</span>
    <span class="n">mutator</span> <span class="k">=</span> <span class="n">x</span> <span class="k">=></span> <span class="n">x</span> <span class="o">+</span> <span class="nc">Rand</span><span class="o">.</span><span class="n">gaussian</span><span class="o">.</span><span class="n">draw</span><span class="o">(),</span>
    <span class="n">evaluator</span> <span class="k">=</span> <span class="n">x</span> <span class="k">=></span> <span class="n">f</span><span class="o">(</span><span class="n">x</span><span class="o">),</span>
  <span class="o">)</span>
<span class="o">}</span>
<span class="n">population</span><span class="o">.</span><span class="n">members</span><span class="o">.</span><span class="n">maxBy</span><span class="o">(</span><span class="n">f</span><span class="o">)</span>
</code></pre></div></div>
<h1 id="future-work">Future Work</h1>
<p>The next challenge of Swara will be to build the <code class="highlighter-rouge">swara-core</code> library, which will apply the <code class="highlighter-rouge">swara-music</code> and <code class="highlighter-rouge">swara-learn</code> libraries to build exciting musical technologies like:</p>
<ul>
<li><strong>Algorithmic Composition</strong>: Generating original, but representative scores of music. Create a <a href="https://soundcloud.com/swara-labs">SoundCloud Profile</a> with entirely computer generated content.</li>
<li><strong>Fingerprinting</strong>: Generating a musical fingerprint, or identifier, which uniquely defines a piece of music. This fingerprint can then be used to perform musical identification and search (like Shazam).</li>
<li><strong>Musical Translation</strong>: Extracting musical information from audio input sources. It is the musical analogue of speech-to-text.</li>
</ul>
Fri, 23 Sep 2016 04:20:00 +0000
http://ashwin153.github.io/projects/swara.html
Projects
Algorithms (CS 331H)
<h1 id="workshop-selection">Workshop Selection</h1>
<p>Let <script type="math/tex">D</script> be the set of all possible days to hold a workshop and let <script type="math/tex">P</script> be the set of people interested in attending the workshop such that each person <script type="math/tex">p_{i}</script> wants to attend during <script type="math/tex">[s_{i}, f_{i}] \subset D</script>. Each day <script type="math/tex">d_{i}</script> costs <script type="math/tex">c_{i}</script>, but each person <script type="math/tex">p_{i}</script> is willing to pay <script type="math/tex">v_{i}</script> to attend. <strong>Suppose there is no limit to the number of people that can attend on a particular day. What days should you select to maximize profit?</strong></p>
<p>Construct a flow network <script type="math/tex">N = (V, E)</script> as follows:</p>
<ul>
<li>Create a source vertex <script type="math/tex">s \in V</script> and a sink vertex <script type="math/tex">t \in V</script></li>
<li>Create a vertex for each person <script type="math/tex">p_{i} \in P</script></li>
<li>Create a vertex for each day <script type="math/tex">d_{j} \in D</script></li>
<li>Create a directed edge <script type="math/tex">(s, p_{i})</script> of capacity <script type="math/tex">v_{i}</script> for all <script type="math/tex">p_{i} \in P</script></li>
<li>Create a directed edge <script type="math/tex">(d_{i}, t)</script> of capacity <script type="math/tex">c_{i}</script> for all <script type="math/tex">d_{i} \in D</script></li>
<li>Create a directed edge <script type="math/tex">(p_{i}, d_{j})</script> of capacity <script type="math/tex">\infty</script> for all <script type="math/tex">p_{i}</script> and <script type="math/tex">d_{j}</script> such that <script type="math/tex">d_{j} \in [s_{i}, f_{i}]</script></li>
</ul>
<p>For some subset <script type="math/tex">U \subset V</script>, let <script type="math/tex">rev(U) = \sum\nolimits_{p_{i} \in U} v_{i}</script> and <script type="math/tex">cost(U) = \sum\nolimits_{d_{i} \in U} c_{i}</script>. Clearly, <script type="math/tex">profit(U) = rev(U) - cost(U)</script>.</p>
<p>Find the minimum cut <script type="math/tex">S, T</script> of <script type="math/tex">N</script>. Because every <script type="math/tex">s-t</script> path contains an edge of finite capacity by construction, the minimum cut must always be finite. Therefore, the cut-set of the minimum cut <script type="math/tex">S, T</script> may only contain edges of the form <script type="math/tex">(s, p_{i})</script> or <script type="math/tex">(d_{i}, t)</script>. Because no infinite-capacity edge <script type="math/tex">(p_{i}, d_{j})</script> may cross the cut, all the desired days for each person in <script type="math/tex">S</script> must also be in <script type="math/tex">S</script>. Consequently, the cut capacity is:</p>
<script type="math/tex; mode=display">c(S, T) = \sum\nolimits_{p_{i} \in T} v_{i} + \sum\nolimits_{d_{j} \in S} c_{j} = rev(T) + cost(S)</script>
<p>Therefore,</p>
<script type="math/tex; mode=display">% <![CDATA[
\begin{align*}profit(S) &= rev(S) - cost(S) \\
&= rev(V) - (rev(T) + cost(S)) \\
&= rev(V) - c(S, T)\end{align*} %]]></script>
<p>Because <script type="math/tex">c(S, T)</script> is minimal, <script type="math/tex">profit(S)</script> must be maximal. Therefore, the days <script type="math/tex">d_{i} \in S</script> generate the maximal profit of <script type="math/tex">rev(V) - c(S, T)</script>.</p>
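<p>The construction can be checked on small inputs with a short Python sketch. It builds the network exactly as described, computes a maximum flow with the Edmonds-Karp algorithm (breadth-first augmenting paths), and reads the chosen days off the source side of the resulting minimum cut. The function name and the input format (days numbered from 0, people given as <code class="highlighter-rouge">(start, finish, value)</code> tuples) are assumptions made for the example.</p>

```python
from collections import deque


def select_days(costs, people):
    """Pick the most profitable days via a minimum s-t cut.

    costs[d] is the cost of day d; people is a list of (start, finish, value)
    tuples with inclusive day ranges. Returns (chosen_days, profit).
    """
    INF = float("inf")
    cap = {}  # residual capacities: cap[u][v]

    def edge(u, v, c):
        cap.setdefault(u, {})[v] = c
        cap.setdefault(v, {}).setdefault(u, 0)  # reverse residual edge

    for i, (start, finish, value) in enumerate(people):
        edge("s", ("p", i), value)
        for d in range(start, finish + 1):
            edge(("p", i), ("d", d), INF)
    for d, cost in enumerate(costs):
        edge(("d", d), "t", cost)

    def reachable():
        """Breadth-first search over edges with spare residual capacity."""
        parent = {"s": None}
        queue = deque(["s"])
        while queue:
            u = queue.popleft()
            for v, c in cap.get(u, {}).items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    flow = 0
    while True:
        parent = reachable()
        if "t" not in parent:
            break  # no augmenting path left: the flow is maximal
        path, v = [], "t"
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(cap[u][w] for u, w in path)
        for u, w in path:
            cap[u][w] -= push
            cap[w][u] += push
        flow += push

    # Vertices still reachable from s form the S side of a minimum cut.
    chosen = sorted(d for d in range(len(costs)) if ("d", d) in parent)
    profit = sum(value for _, _, value in people) - flow
    return chosen, profit
```

<p>By the max-flow min-cut theorem the final flow value equals <script type="math/tex">c(S, T)</script>, so the returned profit is <script type="math/tex">rev(V) - c(S, T)</script> as derived above.</p>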
<h1 id="detecting-arbitrage">Detecting Arbitrage</h1>
<p>Let <script type="math/tex">C</script> be a set of currencies and let <script type="math/tex">R</script> be the set of exchange rates between currencies such that <script type="math/tex">r(c_{i}, c_{j}) \in R</script> is the exchange rate between currencies <script type="math/tex">c_{i}</script> and <script type="math/tex">c_{j}</script>. Clearly, <script type="math/tex">r(c_{i}, c_{j}) = \frac{1}{r(c_{j}, c_{i})}</script>. Arbitrage occurs when there are opportunities for riskless profit. <strong>How do you detect arbitrage opportunities between currencies?</strong></p>
<p>In the graph <script type="math/tex">G = (C, R)</script> arbitrage occurs where there exists a cycle of currencies <script type="math/tex">c_{1},c_{2},\ldots,c_{k},c_{1}</script> such that:</p>
<script type="math/tex; mode=display">% <![CDATA[
\begin{align*}r(c_{1}, c_{2}) r(c_{2}, c_{3}) \ldots r(c_{k-1}, c_{k}) r(c_{k}, c_{1}) &> 1\\
\log r(c_{1}, c_{2}) + \log r(c_{2}, c_{3}) + \ldots + \log r(c_{k-1}, c_{k}) + \log r(c_{k}, c_{1}) &> 0\\
\log \frac{1}{r(c_{1}, c_{2})} + \log \frac{1}{r(c_{2}, c_{3})} + \ldots + \log \frac{1}{r(c_{k-1}, c_{k})} + \log \frac{1}{r(c_{k}, c_{1})} &< 0\end{align*} %]]></script>
<p>Therefore, detecting arbitrage opportunities in <script type="math/tex">G</script> is equivalent to finding negative cost cycles in the graph <script type="math/tex">G' = (C, \{ \log \frac{1}{r} : r \in R\})</script>, which can be found in <script type="math/tex">O(\lvert C \rvert ^{3})</script> using the Floyd-Warshall Algorithm.</p>
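<p>A minimal Python sketch of this reduction follows. It weights each edge by <script type="math/tex">\log \frac{1}{r} = -\log r</script>, runs Floyd-Warshall, and reports arbitrage when some currency can reach itself at negative total cost. The function name, the nested-dictionary input format, and the small tolerance used to absorb floating point rounding are assumptions made for the example.</p>

```python
import math


def has_arbitrage(rates, tolerance=1e-12):
    """Detect arbitrage with Floyd-Warshall on -log(rate) edge weights.

    rates[a][b] is the number of units of currency b bought by one unit of a.
    """
    currencies = sorted(set(rates) | {b for row in rates.values() for b in row})
    dist = {a: {b: math.inf for b in currencies} for a in currencies}
    for a, row in rates.items():
        for b, rate in row.items():
            dist[a][b] = -math.log(rate)
    for k in currencies:
        for i in currencies:
            for j in currencies:
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    # A negative entry on the diagonal means a negative cost cycle exists,
    # i.e. the product of rates around that cycle exceeds 1.
    return any(dist[c][c] < -tolerance for c in currencies)
```

<p>The tolerance matters because reciprocal rates such as <script type="math/tex">r</script> and <script type="math/tex">\frac{1}{r}</script> rarely cancel exactly in floating point, which would otherwise show up as a spurious, vanishingly small negative cycle.</p>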
<h1 id="tiling">Tiling</h1>
<p><strong>How many ways can one tile a 3 x n rectangle using 2 x 1 tiles?</strong> It is immediately clear that <script type="math/tex">n</script> must be even, because it is impossible to tile an odd area with even area tiles. Therefore, <script type="math/tex">n = 2 \cdot k</script> and <script type="math/tex">k \in \mathbb{N}</script>. By enumerating the various ways to tile 2 x 1 tiles on a 3 x <script type="math/tex">2 \cdot k</script> rectangle for <script type="math/tex">k \in \{1, 2, 3\}</script>, I determined that the number of tilings satisfies:</p>
<script type="math/tex; mode=display">\begin{equation*}T(k) = 3 \cdot T(k - 1) + 2 \cdot T(k - 2) + 2 \cdot T(k - 3) + ... + 2 \cdot T(0)\end{equation*}</script>
<script type="math/tex; mode=display">\begin{equation*}T(k - 1) = 3 \cdot T(k - 2) + 2 \cdot T(k - 3) + ... + 2 \cdot T(0)\end{equation*}</script>
<p>By subtraction of these two equations,</p>
<script type="math/tex; mode=display">% <![CDATA[
\begin{align*}
T(k) - T(k - 1) &= [3 \cdot T(k - 1) + 2 \cdot T(k - 2) + ... + 2 \cdot T(0)] - \\
&\ \qquad [3 \cdot T(k - 2) + 2 \cdot T(k - 3) + ... + 2 \cdot T(0)] \\
&= 3 \cdot T(k - 1) - T(k - 2)\end{align*} %]]></script>
<p>Therefore, <script type="math/tex">T(k) = 4 \cdot T(k - 1) - T(k - 2)</script>. The characteristic equation of this linear homogeneous recurrence relation is <script type="math/tex">r^{2} = 4 \cdot r - 1</script>. The roots of this characteristic equation are <script type="math/tex">a_{1} = \frac{4 + \sqrt{12}}{2} = 2 + \sqrt{3}</script> and <script type="math/tex">a_{2} = \frac{4 - \sqrt{12}}{2} = 2 - \sqrt{3}</script>. Suppose the closed-form solution of the recurrence relation is of the form</p>
<script type="math/tex; mode=display">T(k) = c_{1} a_{1}^{k} + c_{2} a_{2}^{k} = c_{1} (2 + \sqrt{3})^{k} + c_{2} (2 - \sqrt{3})^{k}</script>
<script type="math/tex; mode=display">% <![CDATA[
\begin{align*}T(0) = 1 \implies 1 &= c_{1} + c_{2} \implies c_{2} = 1 - c_{1} \\
T(1) = 3 \implies 3 &= c_{1} (2 + \sqrt{3}) + c_{2} (2 - \sqrt{3}) \\
&= c_{1} (2 + \sqrt{3}) + (1 - c_{1}) (2 - \sqrt{3}) \\
&= 2 \sqrt{3} c_{1} + 2 - \sqrt{3}\end{align*} %]]></script>
<p>Consequently, <script type="math/tex">c_{1} = \frac{1 + \sqrt{3}}{2 \sqrt{3}}</script> and <script type="math/tex">c_{2} = 1 - \frac{1 + \sqrt{3}}{2 \sqrt{3}} = \frac{\sqrt{3} - 1}{2 \sqrt{3}}</script>. Therefore, the closed-form solution to the recurrence relation is <script type="math/tex">T(k) = \frac{1 + \sqrt{3}}{2 \sqrt{3}} (2 + \sqrt{3})^{k} + \frac{\sqrt{3} - 1}{2 \sqrt{3}} (2 - \sqrt{3})^{k}</script> and solves the problem in <script type="math/tex">O(1)</script> time and <script type="math/tex">O(1)</script> space.</p>
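<p>As a sanity check, the recurrence and the closed form can be compared numerically. The Python sketch below (function names are my own) iterates <script type="math/tex">T(k) = 4 \cdot T(k - 1) - T(k - 2)</script> with exact integers and evaluates the closed form in floating point, rounding to the nearest integer.</p>

```python
import math


def tilings_recurrence(k):
    """T(k) = 4 T(k-1) - T(k-2) with T(0) = 1 and T(1) = 3, in exact integers."""
    a, b = 1, 3
    for _ in range(k):
        a, b = b, 4 * b - a
    return a


def tilings_closed_form(k):
    """Evaluate c1 (2 + sqrt 3)^k + c2 (2 - sqrt 3)^k and round the result."""
    s = math.sqrt(3)
    c1 = (1 + s) / (2 * s)
    c2 = (s - 1) / (2 * s)
    return round(c1 * (2 + s) ** k + c2 * (2 - s) ** k)
```

<p>The two agree for small <script type="math/tex">k</script>, but the closed form relies on floating point arithmetic, so for large <script type="math/tex">k</script> the integer recurrence is the safer choice when an exact count is needed.</p>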
<p><strong>How many ways can one tile a k x n rectangle using 2 x 1 tiles?</strong> Assuming we can tile the first n-1 columns of the rectangle, how many ways are there to tile the last column? We can either tile elements in the last column one-at-a-time using a horizontal tile or two-at-a-time using a vertical tile. Therefore, the number of ways to tile the last column is <script type="math/tex">f(k) = f(k-2) + f(k-1)</script>. Because the <script type="math/tex">n^{th}</script> Fibonacci number <script type="math/tex">F(n) \approx \frac{\phi^{n}}{\sqrt{5}}</script>, the number of ways to tile the last column grows exponentially with <script type="math/tex">k</script>. Because there are <script type="math/tex">n</script> columns in the matrix, the complexity of the algorithm is <script type="math/tex">O(\phi^{nk})</script>.</p>
<p>Cover photograph by <a href="http://aspireblog.org/wp-content/uploads/2013/04/chalkboard.jpg">aspireblog</a>.</p>
Wed, 06 Apr 2016 04:20:00 +0000
http://ashwin153.github.io/classes/algorithms.html
Classes
Originals: How Non-Conformists Move the World Review
<p>The book mentions a fascinating social experiment in which Teresa Amabile created book reviews that had identical content but opposite tones and asked people to rate the intelligence of their authors. Amabile found that, “people rated the critical reviewer as 14% more intelligent, and having 16 percent greater literary expertise, than the complimentary reviewer.” Therefore, despite thoroughly enjoying this book, I’ll focus on its faults first in order to look smarter and more literate than I really am.</p>
<p>I found the most glaring weakness to be the lack of sufficient quantitative evidence. Adam Grant frequently cited the results of studies and social experiments when making arguments about the causes of human creativity; however, he left out details about the statistical significance and margin of error of these results. By omitting the more complete statistical picture, Grant forces his readers to take him at his word. It’s difficult to disagree with his interpretation of these scientific findings, because he only provides the data that supports his claims.</p>
<p>That being said, this was easily one of my favorite books. It was rereadable and highly actionable. I especially liked how Grant used a variety of real life examples to support his claims about originality. The following are ten of my favorite quotes from the book.</p>
<ol>
<li>“People who suffer the most from a given state of affairs are paradoxically the least likely to question, challenge, reject, or change it.” A reminder that when the going gets tough, fucking do something about it.</li>
<li>“Practice makes perfect, but it doesn’t make new.” Doing the same thing over and over again is going to lead to the same results.</li>
<li>“When it comes to idea generation, quantity is the most predictable path to quality.” Don’t bring any biases or prejudices to the idea generation process. Every idea may not be a good idea, but every idea is useful.</li>
<li>“When we judge their greatness, we focus not on their averages, but on their peaks.” People are remembered by who they were at their best.</li>
<li>“It’s true that the early bird gets the worm, but we can’t forget that the early worm gets caught.” Sometimes it’s the first mover disadvantage. Being first to market can be a disadvantage. It’s much more important to be the best than it is to be the first.</li>
<li>“Timing accounted for forty-two percent of the difference between success and failure.” The best idea requires the best timing to be successful.</li>
<li>“The secret to success is sincerity.” You have to be genuine. Mean everything you say. Do everything you believe.</li>
<li>“Strong opinions, weakly held… Argue like you’re right and listen like you’re wrong.” It’s good to be steadfast in your principles, but it’s bad for your principles to be steadfast. You should always be open to new thoughts, ideas, and perspectives.</li>
<li>“Never doubt that a small group of thoughtful citizens can change the world; indeed, it’s the only thing that ever has.” Preach.</li>
<li>“Becoming original is not the easiest path in the pursuit of happiness, but it leaves us perfectly poised for the happiness of pursuit.” It’s about the journey, not the destination.</li>
</ol>
<p>Cover photograph from <a href="https://mycreativejourney2015.files.wordpress.com/2015/04/creativity1.jpg">mycreativejourney2015</a>.</p>
Fri, 25 Mar 2016 04:20:00 +0000
http://ashwin153.github.io/reviews/originals.html
Reviews
VIX Futures Roll
<h1 id="introduction">Introduction</h1>
<h3 id="what-are-futures">What are futures?</h3>
<p>A futures contract is an agreement between two parties, in which the parties agree to exchange some underlying asset at a set price, at some specified time in the future.</p>
<p>For example, consider an upstream oil company (exploration & production). The upstream oil company benefits from higher oil prices, because higher prices allow the company to sell extracted oil for more profit. Suppose the company believes that oil prices will fall next year. If the company does nothing, they risk losing revenues to falling oil prices next year. Instead, the company can sell futures contracts in order to guarantee the current price point for oil sold next year. By locking in the current price, the company is protected against falling prices, but loses out on unrealized revenues if the price of oil rises. Clearly, the seller of futures contracts believes that prices <em>will</em> fall and the buyer of futures contracts believes that prices <em>will</em> rise.</p>
<p>One of the important principles of futures contracts is that their price will always converge to the price of the underlying asset at expiration. In other words, the price of Dec ‘16 oil futures will always converge to the price of oil in Dec ‘16. Why? To eliminate arbitrage opportunities. If the price of a Dec ‘16 oil futures contract was $40 and the price of oil in Dec ‘16 was $20, a smart trader could sell Dec ‘16 contracts at $40, fulfill the contract with oil purchased at $20, and profit off the difference. Similarly, if the price of a Dec ‘16 oil futures contract was $20 and the price of oil in Dec ‘16 was $40, a smart trader could short oil at $40, close the short position with the oil purchased from Dec ‘16 contracts at $20, and profit off the difference. As traders exploit these arbitrage opportunities, the price of oil futures and the price of oil will gradually converge.</p>
<p>Futures have two purposes: hedging exposure to prices (as in the case of oil companies) and speculating on the future price of an asset. The purpose of this investment strategy is to speculate on the price of VIX futures.</p>
<h3 id="what-is-vix">What is VIX?</h3>
<p>Volatility measures how much a security’s price fluctuates. High volatility securities are likely to experience dramatic fluctuations in price and low volatility securities are likely to experience smaller shifts in price. There are two kinds of volatility: implied volatility and realized volatility. Implied volatility represents the market’s expectation for future volatility and is determined from options prices. Realized volatility represents the actual volatility experienced by the market.</p>
<p>VIX is an index that tracks the 30-day implied volatility of the S&P 500. The price of VIX futures contracts represents the market’s expectation for the 30-day implied volatility of the S&P 500 at the expiration of the contract.</p>
<h1 id="strategy">Strategy</h1>
<p>My trading strategy makes two hypotheses about VIX futures:</p>
<ol>
<li><strong>The price of VIX futures contracts will move toward the price of the subsequent futures contract.</strong> For example, the price of Jan ‘15 contracts will move toward that of Feb ‘15 and the price of Dec ‘15 contracts will move toward that of Jan ‘16.</li>
<li><strong>The larger the difference in price between a VIX futures contract and the subsequent futures contract, the larger the magnitude of the move in price will be.</strong> For example, if the price of Jan ‘15 contracts is $15, the price of Feb ‘15 contracts is $20, and the price of March ‘15 contracts is $30, then we can expect the price of Feb ‘15 contracts to move the most, because the $10 gap between Feb ‘15 and March ‘15 is larger than the $5 gap between Jan ‘15 and Feb ‘15.</li>
</ol>
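<p>As a small illustration of the second hypothesis, the snippet below computes the price difference between each contract and the subsequent one, using the example prices from the list above. The helper name <code>adjacent_differences</code> and the sign convention (subsequent price minus current price) are assumptions made for this sketch.</p>

```python
def adjacent_differences(prices):
    """Difference between each contract and the subsequent one (subsequent minus current)."""
    return [nxt - cur for cur, nxt in zip(prices, prices[1:])]


# Example prices from the text, ordered by expiration.
prices = {"Jan '15": 15.0, "Feb '15": 20.0, "Mar '15": 30.0}
labels = list(prices)
diffs = adjacent_differences(list(prices.values()))

# The contract with the largest gap to its successor is expected to move the most.
largest = max(range(len(diffs)), key=lambda i: abs(diffs[i]))
print(labels[largest], diffs[largest])  # Feb '15 10.0
```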
<p>To test these hypotheses, I computed the correlation between the differences in prices of adjacent futures contracts (e.g., Jan ‘15 and Feb ‘15) and the next-day price changes (e.g., the price of Jan ‘15 on 3/21 and on 3/22). The resulting correlation coefficient of r = 0.934 shows a high degree of correlation between price differences and price changes, which supports both hypotheses. Note that a strong correlation is evidence for the hypotheses, not proof of them.</p>
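<p>A minimal sketch of this kind of correlation test is shown below. It is not the code used to produce the figure above: the function names, the sign convention for the price difference, and the sample prices are all made up for illustration.</p>

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def spreads_and_next_day_changes(front, back):
    """Pair each day's price difference with the front contract's next-day change.

    front, back: daily closing prices of a contract and of the subsequent contract.
    """
    spreads = [b - f for f, b in zip(front[:-1], back[:-1])]
    changes = [front[i + 1] - front[i] for i in range(len(front) - 1)]
    return spreads, changes


# Made-up closing prices, for illustration only.
front = [15.0, 15.5, 15.2, 16.0]
back = [20.0, 19.0, 19.5, 19.0]
spreads, changes = spreads_and_next_day_changes(front, back)
r = pearson(spreads, changes)
print(f"r = {r:.3f}")
```

<p>On real data, the same pairing would be repeated for every pair of adjacent contracts and every trading day before computing r.</p>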
<h3 id="attempt-1-long-positive-price-differences">Attempt 1: Long Positive Price Differences</h3>
<p>My first attempt at the strategy was to purchase the future with the most positive price difference. Because we have shown positive price differences are highly correlated with positive price changes, I expected this attempt to perform extremely well. However, when I backtested the strategy against 9 years of VIX futures data, I ended up losing all my money! The graph below shows the cumulative returns curve. The curve represents the change in portfolio value at each point in time.</p>
<p><img align="center" style="margin: 0 auto; display: block;" src="/img/vix-a1.png" /></p>
<h3 id="attempt-2-short-negative-price-differences">Attempt 2: Short Negative Price Differences</h3>
<p>Clearly, my first attempt at the strategy was a complete failure. However, I was not yet ready to give up. My next attempt at the strategy sold the future with the most negative price difference (the opposite of attempt 1). This strategy performed significantly better than the previous attempt. It produced positive returns and had a Sharpe ratio of 1.28 since 3/21/12. The Sharpe ratio is the return in excess of the risk-free rate divided by the standard deviation of returns; in other words, it measures risk-adjusted returns. As a rule of thumb, Sharpe ratios greater than 1 are good, greater than 2 are great, and greater than 3 are excellent. The graph below shows the cumulative returns curve.</p>
<p><img align="center" style="margin: 0 auto; display: block;" src="/img/vix-a2.png" /></p>
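<p>For reference, a Sharpe ratio can be computed from a series of daily returns as the mean excess return divided by the standard deviation of excess returns. The sketch below is an assumption about how such a figure might be calculated, not the code behind the numbers in this post; in particular, annualizing with 252 trading days per year is a common convention that I am assuming here, and the sample returns are made up.</p>

```python
import statistics
from math import sqrt


def sharpe_ratio(daily_returns, daily_risk_free=0.0, periods_per_year=252):
    """Annualized Sharpe ratio: mean excess return over its sample standard deviation.

    periods_per_year=252 is an assumed trading-day convention, not taken from the post.
    """
    excess = [r - daily_risk_free for r in daily_returns]
    return statistics.mean(excess) / statistics.stdev(excess) * sqrt(periods_per_year)


# Made-up daily returns, for illustration only.
returns = [0.01, -0.005, 0.007, 0.002]
print(round(sharpe_ratio(returns), 2))
```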
<h3 id="attempt-3-hybrid">Attempt 3: Hybrid</h3>
<p>When analyzing futures, it is often useful to define two terms: backwardation and contango. Contango refers to periods of time in which futures contracts are trading at a premium to the current price of the underlying asset and backwardation refers to periods of time in which futures contracts are trading at a discount. The blue line in the graph below represents the current price of VIX (the current 30-day implied vol of the S&P 500) and the line in red represents the average price of all VIX futures contracts. Periods of time in which the red line is below the blue line signify backwardation and periods of time in which the red line is above the blue line signify contango.</p>
<p><img align="center" style="margin: 0 auto; display: block;" src="/img/vix-a3.png" /></p>
<p>Why were returns so good in the second attempt and so poor in the first attempt? Weren’t both attempts implementations of the same strategy? The answer lies in the graph above. Careful analysis of this graph will reveal an interesting phenomenon. The period of backwardation beginning October 2008 and ending December 2008 lines up perfectly with the spike in returns in the first attempt over the same period. Similarly, the period of contango beginning October 2011 and ending October 2014 lines up perfectly with the spike in returns in the second attempt over the same period.</p>
<p>This suggests that the best implementation of the strategy is a combination of the previous attempts. In other words, we should purchase the futures contract with the most positive price difference during periods of backwardation and sell the futures contract with the most negative price difference during periods of contango. This hybrid strategy had by far the best results, with a Sharpe ratio of 1.68 since 3/21/12. The graph below shows the cumulative returns of this attempt.</p>
<p><img align="center" style="margin: 0 auto; display: block;" src="/img/vix-a4.png" /></p>
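<p>The hybrid rule can be sketched as follows. This is an illustrative reading of the rule rather than the backtested implementation: the function names are hypothetical, the regime is classified by comparing the average futures price to the current VIX level (as in the contango/backwardation graph), and a price difference is assumed to mean the subsequent contract’s price minus the current contract’s price.</p>

```python
def regime(spot_vix, futures_prices):
    """Classify the market by comparing the average futures price to spot VIX."""
    avg = sum(futures_prices) / len(futures_prices)
    if avg > spot_vix:
        return "contango"
    if avg < spot_vix:
        return "backwardation"
    return "flat"


def hybrid_signal(spot_vix, futures_prices):
    """Return (action, contract index) or None. futures_prices is ordered by expiration."""
    # Assumed convention: difference = subsequent contract price - current contract price.
    diffs = [nxt - cur for cur, nxt in zip(futures_prices, futures_prices[1:])]
    state = regime(spot_vix, futures_prices)
    if state == "backwardation":
        # Attempt 1: buy the contract with the most positive price difference.
        i = max(range(len(diffs)), key=diffs.__getitem__)
        if diffs[i] > 0:
            return ("buy", i)
    elif state == "contango":
        # Attempt 2: sell the contract with the most negative price difference.
        i = min(range(len(diffs)), key=diffs.__getitem__)
        if diffs[i] < 0:
            return ("sell", i)
    return None


# Made-up prices, for illustration only.
print(hybrid_signal(30.0, [28.0, 27.0, 29.0, 26.0]))  # ('buy', 1)
print(hybrid_signal(14.0, [15.0, 16.0, 15.5, 17.5]))  # ('sell', 1)
```

<p>When no contract has a price difference of the required sign, the sketch returns <code>None</code> and takes no position.</p>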
<h1 id="future-work">Future Work</h1>
<p>When professional traders analyze futures, they typically use a three-dimensional model: price, volume, and open interest. Volume represents the total number of contracts that changed hands and open interest represents the total number of outstanding contracts. Changes in volume and open interest are important indicators of futures market behavior:</p>
<table>
<thead>
<tr>
<th style="text-align: center">Volume</th>
<th style="text-align: center">Open Interest</th>
<th style="text-align: left">Effect</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: center">rises</td>
<td style="text-align: center">rises</td>
<td style="text-align: left">Confirms a trend in price</td>
</tr>
<tr>
<td style="text-align: center">rises</td>
<td style="text-align: center">falls</td>
<td style="text-align: left">Liquidate position</td>
</tr>
<tr>
<td style="text-align: center">falls</td>
<td style="text-align: center">rises</td>
<td style="text-align: left">Period of congestion; selling, but not buying</td>
</tr>
<tr>
<td style="text-align: center">falls</td>
<td style="text-align: center">falls</td>
<td style="text-align: left">Period of accumulation; buying, but not selling</td>
</tr>
</tbody>
</table>
<p>Another important factor to consider when trading futures is time to expiration. A futures contract closer to expiration is likely to converge faster to the market price than a futures contract farther from expiration.</p>
<p>Cover photograph by <a href="http://www.stockideas.org/wp-content/uploads/2014/01/stocks-ranked-by-volatility.jpg">stockideas.org</a>.</p>
Tue, 20 Oct 2015 04:20:00 +0000
http://ashwin153.github.io/projects/vix.html
http://ashwin153.github.io/projects/vix.htmlProjects