Verne
=====

An experiment in Social Computing


Status Quorum
-------------

A semaphore is a signalling system that grants coordinating participants safe passage, provided they follow some simple rules. Indeed, in some languages the word "semaphore" is used to refer to traffic lights. Another example of a semaphore is an air traffic control system. These systems provide signals that navigators can use to avoid colliding with one another. The consequences of ignoring them are dire: things are lost, and the things that are lost are often very valuable.

In computer programming, the situation is the same. If two threads try to perform a non-atomic operation on the same data at the same time, a value is lost. Values, too, are valuable.
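
To make this concrete, here is a minimal Python sketch of the classic lost update (the counts are arbitrary): two threads increment a shared counter without a lock, and increments can vanish when the read-modify-write sequences interleave.

    import threading

    counter = 0

    def work():
        global counter
        for _ in range(100_000):
            counter += 1    # read, add, write: not atomic

    threads = [threading.Thread(target=work) for _ in range(2)]
    for t in threads: t.start()
    for t in threads: t.join()
    print(counter)          # can be less than 200000: values were lost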

Let's take the example of a group of workers performing some kind of work which adds value. The value is added to a central state which is shared between all participants. Say that the work a worker does to generate a unit of value requires exclusive access to the shared state for some length of time, so workers take turns adding value to this central shared state. This will not scale well, as only one worker can operate on the shared state at a time.

If our problem is divisible into multiple independent chunks that can be worked on separately, as problems often are, then we can set half of our workers working on task A while the other half work on task B, and we have increased the amount of work they can do by making the locks more finely grained. This pattern can be seen in computer systems (prerequisite steps performed in parallel) as well as in institutions (bureaucratic hierarchies).
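
As a minimal Python sketch of this finer-grained locking (the two-shard split and the names are illustrative assumptions): each chunk of the shared state gets its own lock, so two workers can add value at the same time.

    import threading

    state = {"task_a": 0, "task_b": 0}
    locks = {"task_a": threading.Lock(), "task_b": threading.Lock()}

    def worker(shard):
        for _ in range(100_000):
            with locks[shard]:      # exclusive access to one shard only
                state[shard] += 1   # the other shard stays available

    threads = [threading.Thread(target=worker, args=(s,)) for s in state]
    for t in threads: t.start()
    for t in threads: t.join()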

But there's a catch: we need to appoint an additional agent to keep track of the resulting intermediary values and compile them together into a final value. This agent resides at the top, and they must be empowered with final authority over the data in order to do their job, which is to provide consistency. If they are not the authority over the data, then they cannot resolve disputes and thus cannot provide consistency.
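
Continuing the sketch above with an assumed fan-in design (not prescribed by anything in this paper): workers hand their intermediary values to a queue, and a single aggregator, the only writer of the final value, compiles them together.

    import queue, threading

    results = queue.Queue()
    final_value = 0

    def aggregator(expected):
        global final_value
        for _ in range(expected):
            final_value += results.get()   # sole authority over the final value

    agg = threading.Thread(target=aggregator, args=(2,))
    agg.start()
    results.put(100_000)   # intermediary value from task A
    results.put(100_000)   # intermediary value from task B
    agg.join()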

The uniqueness of this all-powerful agent is problematic, as their role is critical and they are not easy to replace. A value network depends on them, and their loss would usually cause major inconvenience, or even failure of the network. Their job is also very difficult, as they have nobody to pass the buck to; if the value network that depends on them is not efficient and well organised, they may have a very high workload to deal with. As the size of the network increases, their relative importance increases, while the relative importance of an agent closer to the bottom decreases, such agents being ever more replaceable.

Eventually something must be done to correct the imbalance, and that is to discard the requirement for strict consistency: either move to eventual consistency, or simply run an inconsistent system, one which can appear different from different vantage points.


Collaborative Computing
-----------------------

So why is it so hard to build a decentralised system for people to collaborate on the internet? The answer is that large-scale collaboration involves disagreements, and today's internet is designed not to be in a state of disagreement. DNS, the Web, blockchains, and certificate authorities are all (eventually) consistent networks, and thus tend towards centralisation. The requirement for consistency mandates that they retain authority over all value, making this value unavailable to the people who want to use it.

However, there is a class of collaborative computing which is designed to accommodate multiple versions of the truth: distributed version control. This is a critical feature for software development, because it allows developers to delay the resolution of conflicts until such time as they are ready to deal with them. In development, context switches are expensive and it can be advantageous to process tasks asynchronously; there can also be economies of scale in processing groups of similar tasks together. Importantly, it also allows a project to be taken in a different direction at extremely small overhead.

Development can be seen as an iterative process where the inputs are requirements, the outputs are code, and the algorithm is a combined effort between a machine and a user. A typical pattern is a developer who changes the code, then executes the program to see if it gives the desired behaviour. If the developer is happy with their code, they may commit it to a branch.

The DVCS is designed not to make destructive changes; its default behaviour is to append. Unless instructed otherwise by the user, no destructive changes are made, and even when a state needs to be reverted, a DVCS may append a negation of a change rather than deleting it. The history that forms is a timeline of events; in the typical case, each event is a point in time at which the developer wanted to record something. This policy of non-destruction and of recording change events in sequence allows two working states of the development process to be merged together.
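
A minimal Python model of this append-only policy (illustrative only, not how a DVCS actually stores data): reverting appends a negating event instead of deleting anything, so the full timeline survives.

    history = []

    def apply(change):
        history.append(change)          # always append, never delete

    def revert(change):
        apply({"negates": change})      # undo by appending a negation

    apply({"add": "feature X"})
    revert({"add": "feature X"})        # both events remain in the timeline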


Versioning Distributed State Machine
------------------------------------

This paper proposes the "versioning distributed state machine" (VDSM) as a new class of event-driven distributed application based on distributed version control. A VDSM is a kind of networked application which can exist in many different states at the same time. It does not have a typical lifecycle; it is not dependent on a host computer to run it to completion. It uses a DVCS repository to track changes to its working state, which means that its execution can be performed in stages over many different machines.
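
As an illustration, here is a minimal Python sketch of one execution step under assumed conventions (a configured remote, a single `state` file, and the transition function are all hypothetical): any machine holding a clone can pull the latest state, advance it, and offer the result back to its peers.

    import subprocess

    def git(*args):
        subprocess.run(["git", *args], check=True)

    def step(transition):
        git("pull")                             # fetch the latest working state
        with open("state") as f:
            state = f.read()
        with open("state", "w") as f:
            f.write(transition(state))          # advance the state machine
        git("commit", "-am", "step")            # record the step as an event
        git("push")                             # offer it to peers

    step(str.upper)                             # hypothetical transition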

A process implemented as a VDSM cannot break in the way that traditional computer processes can break, since in the event that an erroneous circumstance is detected, the culprit branch can be discarded and a new, fixed branch curated in its place. This undo mechanism makes it more or less foolproof, by constraining the amount of value that can be lost in the event of a problem.
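
A sketch of the undo mechanism itself, with hypothetical branch and commit names: discard the erroneous branch and curate a fresh one from the last known-good commit.

    import subprocess

    def git(*args):
        subprocess.run(["git", *args], check=True)

    git("branch", "-D", "culprit")                # discard the erroneous branch
    git("checkout", "-b", "fixed", "last-good")   # curate a replacement from the
                                                  # last known-good commit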

As well as the application state, a VDSM may contain its own code. It is advantageous to host code with data, because then we can treat code as a first-class input to an event-driven system, for the case where a data output depends on the version of a code function. We also reap the benefit that the development and execution lifecycles are unified; separating them would incur additional deployment requirements and inhibit the application's ability to propagate through the network.

The lack of a hard requirement for consistency, plus the non-destructive style of adding values, also makes programming a VDSM much easier than programming consistent distributed systems: the task of merging the data is generalised by the version control system, and maintaining consistency is the job of the user.

A VDSM is propagated around the network between interested users. Each time a change is produced, a user may broadcast the availability of this change to their peers, who may then synchronise it into their own repositories. Due to the lifecycle model of the VDSM, it is feasible to simply clone applications between users of the network and execute them locally. It is therefore possible for users to collaborate on the execution of a VDSM in a peer-to-peer fashion.
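
A minimal sketch of this synchronisation in Python, assuming each peer is simply configured as another git remote (the names are made up):

    import subprocess

    def git(*args):
        subprocess.run(["git", *args], check=True)

    peers = ["alice", "bob"]    # hypothetical remotes, one per peer

    for peer in peers:
        git("fetch", peer)      # pull announced changes into our repository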


Social Applications
-------------------

Let's take an example of a social app and explore it in detail: a game of Chess. Chess is a two-player game where players take turns modifying the game state until a fixed end condition is reached.

It is possible to play a game of chess to completion using a VDSM as a medium for propagating game state while encoding everything by hand, i.e. with no game logic implemented whatsoever. But there are benefits to implementing the game logic and verifying the moves made by players: presenting a nice interface, saving the players the inconvenience of minor errors, and preventing the propagation of invalid game states around the network.

The game state consists of a "move file" to which moves are written, and a file for each piece on the board, where the filename is the piece's position on the board (D2 etc.) and the file contents are the colour and type of the piece.
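
For example, a minimal Python sketch that materialises part of the opening position in this layout (the exact contents format is an assumption):

    from pathlib import Path

    game = Path("game")
    game.mkdir(exist_ok=True)
    (game / "move").touch()                  # moves are appended here
    (game / "D2").write_text("white pawn")   # filename: square on the board
    (game / "E7").write_text("black pawn")   # contents: colour and type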

Player A informs the app of their move, and the app encodes that move into the repository. The move file is updated with the parameters of the move, signed by the user, and the new state of the board is reflected in the piece files. These changes are committed and synchronised to player B.
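
A sketch of this encoding step, continuing the layout above (the move notation and signature format are assumptions):

    import subprocess
    from pathlib import Path

    def git(*args):
        subprocess.run(["git", *args], check=True)

    # Player A plays D2-D4, signing the move file
    Path("game/move").write_text("D2-D4\n<player A's signature>")
    Path("game/D2").rename("game/D4")        # the pawn's file moves with it
    git("add", "-A")
    git("commit", "-m", "move: D2-D4")       # one commit per move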

Player B does not immediately merge the branch into their working branch. Rather, the new commit is verified first. In pseudocode (untested):


    HEAD  = argv[1]  # HEAD of the user's working branch
    MERGE = argv[2]  # branch carrying the new, unverified move
    assert (MERGE^1 == HEAD)                    # exactly one new commit on top
    move = readBlob(MERGE, "move")              # the incoming signed move
    verifySig(move)
    lastMove = readBlob(MERGE~2, "move")        # this player's previous move
    if (lastMove) { assert (signatory(move) == signatory(lastMove)) }
    assert (MERGE == playMove(HEAD, move))      # replaying the move on the old
                                                # state must reproduce the new one
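
For grounding, readBlob could be implemented on top of the git CLI; a minimal Python sketch (the function name matches the pseudocode, the rest is an assumption):

    import subprocess

    def readBlob(rev, path):
        # `git show <rev>:<path>` prints the file as it exists at that revision
        out = subprocess.run(["git", "show", f"{rev}:{path}"],
                             capture_output=True, text=True)
        return out.stdout if out.returncode == 0 else None   # None if absent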


If the signatures are correct and the move is valid, the game continues and it is player B's turn. If not, the branch is rejected, and the conflict must either be resolved by the players or the game abandoned.

In the case that a user attempts to merge two conflicting game states together, the merge should be rejected, either by a merge hook or due to a conflict on the move file.

This game, once complete, could be archived, or deleted. If a standard data structure for Chess can be identified, then the game and the interface for the game can be merged from separate branches, so that the game data is portable between interfaces.


Currency
--------

A simple Bitcoin-style cryptocurrency is possible, but without the guarantees of consistency provided by a blockchain. The double-spending problem is "solved" by enforcing the sequence of transactions from a particular address, such that trying to merge a double spend results in a conflict. Fungibility of the currency is trust based: the recipient of a transaction must trust that the sender is not double spending. If a user does double spend, the network will continue to function, but recipients of the double-spent outputs will not be able to transact with each other, and neither will the respective recipient groups of any dependent transactions. Issue of currency is also trust based, users creating new units of the currency as they require them; the sender must be trusted to honour the value of a transaction at some agreed rate of exchange.
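
A minimal Python sketch of the sequence rule (the transaction format is an assumption): each address numbers its transactions, so two histories carrying different transactions in the same (address, sequence) slot cannot be merged cleanly.

    def find_double_spends(ours, theirs):
        # index our history by (address, sequence number)
        seen = {(tx["addr"], tx["seq"]): tx for tx in ours}
        return [tx for tx in theirs
                if seen.get((tx["addr"], tx["seq"]), tx) != tx]

    ours   = [{"addr": "A", "seq": 1, "to": "B"}]
    theirs = [{"addr": "A", "seq": 1, "to": "C"}]    # a double spend from A
    print(find_double_spends(ours, theirs))          # the conflicting transaction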


Conclusion
----------

In this paper we have explored a new model for collaborative, decentralised, trust-based computing, one that enables systems in an inconsistent state to continue to function. Users of this system form a graph that can be traversed by applications in a viral fashion. Since propagation is dependent on the consent of users, traversal of this graph will be time consuming and challenging, but likely very rewarding. The ability of localised private entities to collaborate together, in a distributed fashion and without regard for a global namespace, will form the basis for strong local culture and sovereignty on the internet.

Thanks to Satoshi Nakamoto for inspiration and to Linus Torvalds for excellent software.

- Sadgit