Sunday, September 28, 2014

Themes from Strangeloop 2014

One of the many things that make the Strangeloop conference special is its interdisciplinary perspective, taking on themes as diverse as core functional programming concepts (what could fit this description better than the infinite tower of interpreters seen in Nada Amin's keynote?) and upcoming software deployment approaches.

Still, some common themes seem to have emerged. One candidate is the spreadsheet as inspiration for programming. One thinker who seems to have drawn on it for his work is Jonathan Edwards, who opened the Future of Programming workshop with a talk showcasing the latest version of his research language Subtext. Earlier prototypes explored the idea of programming without names, directly linking tree nodes to each other via a kind of cut-and-paste mechanism. In its latest incarnation it appears to have evolved into a reactive language with a completely observable evaluation model: the entire evaluation tree is always available for exploration, and a two-stage reactive architecture allows for relating input events to evaluation steps. The user interface is auto-generated and shares the environment with the code, much like its older reactive cousin, the spreadsheet.

Kaya, a new language created by David Broderick, explores the spreadsheet metaphor in a more literal manner: what if spreadsheets and cells were composable, allowing for naturally nested structures? Moreover, what if we could query this structure in a SQL-like manner? The result is a small set of abstractions generating complex emergent behavior, including, as in Subtext, a generic user interface.

Evaluation driven by a data dependency graph is an important part of both modern functional reactive programming languages and of all spreadsheet packages since 1979's VisiCalc. We saw some of the former in Evan Czaplicki's highly approachable talk "Controlling Time And Space: Understanding The Many Formulations Of FRP", and a bit of the latter in Felienne Hermans's talk "Spreadsheets for Developers", which turned the metaphor around and looked to software engineering for inspiration to improve spreadsheet usage.
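To make the idea concrete, here is a minimal sketch of dependency-graph-driven evaluation in Python. It is not taken from any of the talks: the Sheet class and its method names are hypothetical, and for simplicity it recomputes every cell on each change, whereas a real spreadsheet or FRP runtime would only recompute what depends on the changed cell.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


class Sheet:
    """Toy spreadsheet (hypothetical): a cell is a constant or a formula over other cells."""

    def __init__(self):
        self.formulas = {}  # cell name -> (function, list of input cell names)
        self.values = {}    # cell name -> last computed value

    def set_value(self, name, value):
        # A constant is just a formula with no inputs.
        self.formulas[name] = (lambda: value, [])
        self.recalculate()

    def set_formula(self, name, func, inputs):
        self.formulas[name] = (func, list(inputs))
        self.recalculate()

    def recalculate(self):
        # Build the dependency graph: each cell maps to the cells it reads.
        graph = {name: deps for name, (_, deps) in self.formulas.items()}
        self.values = {}
        # static_order() yields inputs before the cells that depend on them
        # and raises graphlib.CycleError on circular references.
        for name in TopologicalSorter(graph).static_order():
            if name not in self.formulas:
                continue  # referenced but never defined
            func, deps = self.formulas[name]
            if all(dep in self.values for dep in deps):
                self.values[name] = func(*(self.values[dep] for dep in deps))


sheet = Sheet()
sheet.set_value("A1", 2)
sheet.set_value("A2", 3)
sheet.set_formula("A3", lambda a, b: a + b, ["A1", "A2"])
sheet.set_formula("A4", lambda total: total * 10, ["A3"])
first = dict(sheet.values)   # A3 is 5, A4 is 50

sheet.set_value("A1", 5)     # the change propagates through A3 to A4
second = dict(sheet.values)  # A3 is 8, A4 is 80
```

The topological order is what guarantees that a cell is never evaluated before its inputs, which is the property both spreadsheets and signal graphs rely on.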

One of the great aspects of spreadsheets is the experience of direct manipulation of a live environment. This is at the crux of what Bret Victor has been demonstrating in many of his demos, showing how different programming could be if our tools were guided by this principle. Though he did not present, the idea of direct manipulation was present at Strangeloop in several of the talks. Subtext's transparency and live reflection of code updates on the generated UI move in this direction. Also at the Future of Programming workshop, Shadershop is an environment whose central concept seems to be directly manipulating real-valued functions by composing simpler functions while inspecting the resulting plots live. Stephen Wolfram's keynote was an entertaining demonstration of his latest product, the Wolfram Language. Its appeal was due, among other reasons, to the interactive exploration environment, particularly the visual representation of non-textual data and the seamless jump from evaluating expressions to building small exploratory UIs.

Czaplicki's talk discussed several of the decisions involved in designing Elm, his functional reactive programming language. I found it noteworthy that many of those were made in order to allow live updates of running code and an awesome time-traveling debugger.
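One way to see why such design decisions help is the following Python sketch. It is an illustration of the general idea only, not Elm's actual implementation: the update functions and the state_at helper are hypothetical names. If state changes only through a pure update function applied to recorded events, then any past state can be rebuilt by replaying a prefix of the event log, and new code can be tried against the same history.

```python
from functools import reduce


def update(state, event):
    """Pure update function: returns a new state, never mutates the old one."""
    kind, amount = event
    if kind == "inc":
        return state + amount
    if kind == "reset":
        return 0
    return state


def update_v2(state, event):
    """A 'live-edited' version of update: increments now count double."""
    kind, amount = event
    if kind == "inc":
        return state + 2 * amount
    return update(state, event)


def state_at(update_fn, events, step, initial=0):
    """Time travel: replay the first `step` recorded events from the initial state."""
    return reduce(update_fn, events[:step], initial)


# The recorded input history of a running program.
events = [("inc", 1), ("inc", 2), ("reset", 0), ("inc", 5)]

before_reset = state_at(update, events, 2)    # 3
after_reset = state_at(update, events, 3)     # 0
final = state_at(update, events, 4)           # 5

# Swapping in the edited code and replaying the same history.
final_v2 = state_at(update_v2, events, 4)     # 10
```

The approach only works when the update function has no hidden side effects, which is one reason purity matters so much for this style of tooling.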

A different perspective on the buzzword du jour, reactive, suggests another candidate theme for this year's Strangeloop: the taming of callbacks. They were repeatedly mentioned as an evil to be banished from the world of programming, including in Joe Armstrong's keynote, "The mess we are in", and all the functional reactive programming content took aim at the construct. The effort was not limited to functional languages: another gem from this year's Future of Programming workshop was the imperative reactive programming language Céu. Created by Francisco Sant'Anna at PUC-Rio - the home of the Lua programming language - Céu compiles an imperative language with embedded event-based concurrency constructs down to a deterministic state machine in C, achieving, among other tricks, fully automated memory management without a garbage collector.

Befitting our age of microservices and commodity cloud computing, another interesting current was the look at modern approaches to testing distributed systems. Michael Nygard illustrated simulation testing - which can be characterized as property-based testing in the large - with Simulant, a Clojure framework to prepare, run, record events, make assertions and analyze the results of sophisticated black-box tests. Kyle @aphyr Kingsbury delivered another amazing performance torturing distributed databases to their breaking point. Most interesting were the lengths he had to go to in order to control the combinatorial explosion of the state space and actually verify global ordering properties like linearizability.

Speaking of going to great lengths to torture database systems, we come to what might have been my favorite talk at the conference, by the FoundationDB team, "Testing Distributed Systems w/ Deterministic Simulation". Like @aphyr's Jepsen, they control the external environment to inject failures while generating transactions and asserting that the system maintains its properties. They take great care to mock out all sources of non-determinism, including time and random number generation, and even extend C++ to add better-behaved concurrency abstractions.

Tests run thousands of times each night, and nondeterministic behavior is weeded out by running each set of parameters twice and checking that the outputs don't change. FoundationDB's team goes further than Jepsen in the types of failures they can inject: not only causing network partitions and removing entire nodes from the cluster, but also simulating network lag, data corruption, and even operator mistakes, like swapping data files between nodes! Of course the test harness itself could be buggy, failing to exercise certain failure conditions; to overcome this specter, they generate real hardware failures with programmable power supplies connected to physical servers (they report that no bugs were found in FoundationDB with this strategy, but that defects surfaced in Linux and ZooKeeper - the latter isn't in use anymore).
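The "run it twice and compare" check can be sketched in a few lines of Python. This is a toy model under my own assumptions, not FoundationDB's harness: the run_simulation and check_determinism functions, the three-replica setup and the 30% message loss rate are all made up for illustration. The point is that every source of non-determinism (randomness, time) is derived from a single seed, so two runs with the same seed must produce identical traces.

```python
import random


def run_simulation(seed, steps=100):
    """Toy simulation (hypothetical): quorum writes to 3 replicas over a lossy network."""
    rng = random.Random(seed)  # single seeded PRNG: the only source of randomness
    now = 0                    # simulated clock; the wall clock is never read
    replicas = [set(), set(), set()]
    acked = []
    trace = []
    for write_id in range(steps):
        now += rng.randint(1, 50)  # simulated network delay
        # Injected fault: each replica misses the message with probability 0.3.
        reached = [i for i in range(3) if rng.random() > 0.3]
        for i in reached:
            replicas[i].add(write_id)
        if len(reached) >= 2:      # acknowledge only when a majority stored it
            acked.append(write_id)
        trace.append((now, write_id, tuple(reached)))
    # Property under test: every acknowledged write lives on a majority of replicas.
    for write_id in acked:
        assert sum(write_id in replica for replica in replicas) >= 2
    return trace


def check_determinism(seed):
    """Run the same parameters twice; any difference means hidden non-determinism."""
    return run_simulation(seed) == run_simulation(seed)


# A nightly job would sweep many seeds; a failing seed reproduces the exact run.
results = [check_determinism(seed) for seed in range(20)]
```

If the simulation accidentally read the real clock or an unseeded random source, the two traces would eventually diverge and the comparison would flag it, which is what makes a failing seed such a useful bug report.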

What I particularly enjoyed about this talk was the attitude towards the certainty of failures in production. Building a database is a serious endeavor: data loss is simply not acceptable, and they understood this from the start.

Closing the conference in the perfect key was Carin Meier and Sam Aaron's keynote demonstrating the one true underlying theme: Our Shared Joy of Programming.