Showing posts with label review. Show all posts

Saturday, July 07, 2012

Types and Bugs

There are certain discussions in our biz that are so played out they provoke instant boredom upon encounter. A major one is the old dynamic vs. static skirmish, recently resurfaced in a blog post by Evan Farrer. Which is a shame as the post is quite interesting, describing his results transliterating well-tested code in a dynamic language to a static language to see if the type system found any bugs. Which it did.

The full-length paper is a great read as well. He describes his translation methodology and gives some detail on each bug found. At first it may seem the author could be stacking the odds towards the static language as the translation was manually done by himself, but I found his description of the process pretty convincing of its fairness. The choice of Python as a source language probably helped given the pythonic inclination towards straightforward code that avoids sophisticated abstraction and metaprogramming mechanisms.

But the real meat of the paper is in the description of the bugs he found. Upon a not particularly discriminating reading, a clear pattern jumped out. Most of the bugs fell into one of two categories:
  • Assuming that a variable always references a valid value when it can contain a null value. 
  • Referencing constructs that no longer exist in the source.
The first category is also the largest, comprising several places where the original code could be coerced into letting a variable be set to a null value, usually by just leaving it uninitialized, and a subsequent call would attempt to dereference it assuming it contained a valid value. Haskell's type system avoids the problem as it simply doesn't have any notion of null. Code that has to deal with optional values must do it through algebraic data types.
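A minimal Python sketch of the same failure mode (the names here are hypothetical, not from the paper): a lookup that may yield None, and a caller that must explicitly handle the missing case instead of assuming a valid value, playing the role Haskell's algebraic data types play in the translated code.

```python
from typing import Optional

def find_user(users: dict[str, str], name: str) -> Optional[str]:
    """Return the user's email, or None when the user is unknown."""
    return users.get(name)

def greeting(users: dict[str, str], name: str) -> str:
    email = find_user(users, name)
    # The buggy pattern from the paper would assume `email` is always valid;
    # calling email.upper() on a missing user raises AttributeError.
    # Handling the optional case explicitly forces a decision here:
    if email is None:
        return f"{name}: no email on file"
    return f"{name}: {email.upper()}"

users = {"ana": "ana@example.com"}
print(greeting(users, "ana"))   # ana: ANA@EXAMPLE.COM
print(greeting(users, "bob"))   # bob: no email on file
```

In Python the None check is a convention a programmer can forget; in Haskell, a Maybe value simply cannot be dereferenced without pattern-matching on both cases, which is exactly why this bug category vanished in the translation.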

How the second category of bug comes about is easy to guess from the projects' histories: some method or variable was present but changed, perhaps renamed or subsumed, and not all references were updated to reflect the change. Even pervasive unit tests can't hope to catch these kinds of regressions, as the problem lies in the integration between units of code; the units themselves are just fine. A type system helps when the change affects the signature of the referenced construct, which is often but not always the case.
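A hypothetical sketch of this second category (the names are mine, not the paper's): a method was renamed, its own unit tests were updated, but a caller in another unit still uses the old name. A dynamic language only surfaces the stale reference when that line actually executes, whereas a static checker (mypy here, or Haskell's compiler in the study) flags it without running anything.

```python
class Report:
    def __init__(self, title: str) -> None:
        self.title = title

    def render_html(self) -> str:  # formerly named `render`
        return f"<h1>{self.title}</h1>"

def publish(report: Report) -> str:
    # Stale reference: still uses the old name. Report's own unit tests
    # pass; the breakage lives at the integration point and only shows
    # up at runtime, when this line is actually reached.
    return report.render()  # AttributeError in a dynamic language

try:
    publish(Report("Q3"))
except AttributeError as err:
    print(f"caught only at runtime: {err}")
```

Note that if the rename had kept the old name but changed only behavior, neither the type checker nor this runtime error would help, which is the "often but not always" caveat above.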

If the study's findings are generalizable and my observations are correct, these are the main takeaways:
  • If you have a type system at your service, it's prudent to structure code such that behavior-breaking changes are reflected in the types.
  • End-to-end integration tests are a necessary complement to both a suite of unit tests and a type system. In my experience, how far these tests should stray off the happy path is a difficult engineering trade-off.
  • If your type system allows nulls, as Java's does, its role in bug prevention is greatly diminished. The proportion of null-dereference bugs in the analysed code bases makes it clear just how big a mistake it is to allow nulls in a programming language.

Wednesday, November 17, 2010

Book review: Growing Object Oriented Software Guided by Tests

I’m a curious kind of guy. It is with some surprise, then, that I catch myself re-reading a technical book: Nat Pryce and Steve Freeman’s Growing Object Oriented Software Guided By Tests. It’s a very practical book, stuffed with code and useful advice, but it’s also more than that.

The first section of the book gives an overview of the basic principles of object-oriented design and test-driven development; not much is new but everything is clearly explained. The third and final section is a potpourri of test-centric techniques to identify problems and improve code quality. But the meat of the book is in the second section, a long walkthrough of the development of a sample program. It's a much larger example than usually found in programming books, tackling thorny issues such as asynchronous inter-process communication and end-to-end testing of GUIs. And it reads like a real software project: there are missteps, refactorings are rolled back, some of the work is almost clerical, but then there are the great design breakthroughs, the elegant ideas that simultaneously solve several difficulties, the joy in seeing the product grow. It felt like reading a good novel. And that's kind of what it is, a programming book that is a narrative, not an exposition. Besides making for a good read, the narrative aspect of the book is important for showing how modern object-orientation practice takes place.

OO criticisms nowadays are a dime a dozen, and though they sometimes present good points, many of the arguments are directed at practices that aren't so common, or at least that shouldn't be so common. For instance, inheritance hierarchies, rampant mutable state, and patternitis (the FactoryFactory syndrome) are not indictments of object orientation, just symptoms of bad programming practice. One of the great things about GOOS is that it provides a great example of actual non-trivial object orientation. And the same goes for test-driven development: any abstract discussion of the benefits of TDD is bound to seem hand-wavy; this book helps ground the understanding in real coding, done step by step in front of the reader's eyes.

Anyway, I've read some pretty good technical books this year, but this one was the best.

Saturday, October 06, 2007

Four books

Another blog post, still no inspiration for anything creative, so let's just rehash that old bloggers' recipe of stuffing some "cultural" reviews in a post and hoping it passes for content. Excited yet?

Concepts, Techniques, and Models of Computer Programming



First up is Peter Van Roy's "Concepts, Techniques, and Models of Computer Programming". Don't let that big title scare you away. The book is pretty hefty in itself, but don't let that scare you either, it is a great read. But what is it about, you may ask? Well, CTM - as it is affectionately called - could sit comfortably on the "programming paradigms" shelf, alongside Sebesta and Kamin. All books that aim to take the reader through a stroll down the computing Zoo, allowing him or her to gaze in awe at the strength of higher-order functions, be amused by the quirkiness of dataflow variables, marvel at the elegant logical predicates lying under a sunny...

Ok, I took the metaphor too far, sorry about that. What I was trying to say is that CTM doesn't limit itself to enumerating paradigms, accompanying each with a brief description and a couple of examples and leaving it at that. Van Roy's text goes further by discussing in reasonable depth the programming techniques applicable to each computation model (the authors prefer to avoid the term "paradigm") and, more than that, advising the reader on how best to integrate them.

The technical approach that enables this leveling is to describe the models in terms of a kernel language that is expanded throughout the book. Each chapter shows how the kernel language needs to be augmented to support the required features, how it is interpreted by an abstract machine and what syntactic sugar can be added on top of the kernel to ease programming.

It would not be a fair review if I didn't relate at least one negative point, but it is a minor one. I think that the approach to logic/relational programming would be more representative of the usual intent if the language were more predicate-and-fact based. Or, to put it in other words, I like the Prolog syntax better than the "Relational Oz" one. As the authors explain, both approaches are semantically equivalent in their core, so I'm nitpicking. Overall, I can safely say that I recommend this book. It is, if you pardon the cliché, an eye-opener, making it clear that the "mainstream" imperative and stateful programming model is but one of many equally significant alternatives.

Engines of Logic



If you've ever been subject to any formal instruction in computing (or "informatics" or Information Systems or whatever), you probably had to endure at least one lecture on the "history of computing", which usually amounts to a lengthy enumeration of machines. If you were particularly unlucky, it started with some blabber about the abacus back in who-the-fuck-cares AD, and it invariably went on to spend a great deal of time discussing punched cards and looms. Yeah, freaking looms! I'm sure Joseph Marie Jacquard is a swell guy and all, but is a rudimentary mechanical input system all that important in the grand scheme of things? My answer, of course, is no. As Dijkstra put it: "Computer science is as much about computers as astronomy is about telescopes". And that is why Engines of Logic is such a great little book, it seeks to give an account of the history of ideas that culminated in modern computing.

We see how Leibniz's utopia of a machine to automate human reasoning, up to the point of forever settling all disputes and intellectual arguments, evolved into a series of formal mathematical systems for "calculating with thoughts" (mathematical logic) by the hand of such great men as Boole, Frege, Cantor, Gödel, and others, culminating with the notion of "universal computers" and their actual realization. The book reads like a good popular science work, with amusing biographical anecdotes scattered throughout the nine chapters. Although, contrary to many works in this genre, Engines of Logic is not afraid of stating formulas and proving theorems when deeper insight is required*. Check out a small excerpt from the chapter on David Hilbert for a sample of the lighter side of the book:
During my own graduate student days in the late 1940s, anecdotes about Göttingen in the 1920s were still being repeated from one generation of students to the next. We heard about the endless cruel pranks that Carl Ludwig Siegel played on the hapless Bessel-Hagen, who remained ever gullible. My own favorite story was about the time that Hilbert was seen day after day in torn trousers, a source of embarrassment to many. The task of tactfully informing Hilbert of the situation was delegated to his assistant, Richard Courant. Knowing the pleasure Hilbert took in strolls in the countryside while talking mathematics, Courant invited him for a walk. Courant managed matters so that the pair walked through some thorny bushes, at which point Courant informed Hilbert that he had evidently torn his pants on one of the bushes. "Oh no," Hilbert replied, "they've been that way for weeks, but nobody notices".
Also of note in the paragraph I quoted is the personal touch given at times by the author, Martin Davis. He is a theoretical computer scientist, with the distinction of having been present at Princeton back in the 1950s, in the company of chaps like John von Neumann, Kurt Gödel, Hermann Weyl and Albert Einstein. As an author, Davis is probably best known for writing more technical books on computability and complexity. But please, make no mistake, this is emphatically not an academic textbook; it goes to great pains to clearly explain subtle concepts like Cantor's diagonal method, achieving a balance between rigor and ease that is hard to come by**.

1984



It is sad that I only got around to reading this book now. "Now" meaning late 2006, as these reviews are a little bit behind schedule... Anyway, as I'm having a hard time finding worthy adjectives, I guess something I could say is that after finishing 1984 I felt utterly stunned. It is powerful and it is important, so put it on your reading list if you haven't already.

A final observation is that the edition I'm linking to - a combined printing of Animal Farm and 1984 published by Harcourt - is cheap and pretty good. The preface is signed by Christopher Hitchens.

Snow Crash



I'm getting lazy (well, lazier) so this will be short: good book, so-so plot, so-so characters, awesome ambiance.









* To be fair, some of the trickiest proofs for non-crucial topics are left to end notes. Still, those notes are far easier to read than most academic mathematical tomes.
** Off the top of my head, I can only think of Nagel and Newman's book on Gödel's proof.

Monday, July 10, 2006

add1

One was missing.
  • Is The Little Schemer a good book?
    • Yes, it is.
  • What for?
    • For learning functional programming.
  • How does it teach functional programming?
    • The reader sees a question, thinks for a bit, compares with the book's answer, and then does the same with the next question.
  • After finishing the book, can you go off and program in Lisp or Scheme?
    • No, you can't.
  • OK, so the book doesn't focus on practice. But is it strong on theory?
    • Not really. Concepts like continuations and closures are treated informally. The lambda calculus isn't even mentioned.
  • Hmm. So why is it a good book?
    • For learning functional programming.

Wednesday, July 05, 2006

963 pages

As I'm out of creativity to write anything original, I'll fall back on the traditional blogger solution. No, I'm not going to post pictures of pets; I'm talking about the other blogger remedy for intellectual laziness: book reviews.

I admit this was a pretty weak semester for reading; I only had time for four books:
  • "Bartleby, the Scrivener", by Melville, L&PM pocket edition. I don't know much about literary criticism and I'm not presumptuous enough to think I could analyze a literary classic. I'll just say I liked it a lot, and recommend it. Oh, and it's short, just like this comment.


  • "The Design of Everyday Things", by Donald Norman. Required reading for anyone involved in creating anything, the book deals with the interaction between people and everyday (or not so everyday) objects. Norman is a cognitive psychology expert who became interested in the study of usability when he was called to join the panel charged with identifying the failures that led to the Three Mile Island disaster. He found that much of what is usually labeled "human error" is actually caused by objects designed without taking into account the people who will have to operate them. At the plant, he found problems such as banks of very similar controls producing very different actions, and alarms that went off so frequently that they ended up being ignored at the critical moment. The book discusses this example and many others, but it is not just a catalog of usability mistakes. It also formulates a theory of how people learn to use those "everyday things" and offers a series of principles for anyone with the responsibility of designing a nuclear plant, a kettle, or a piece of software. To whet the appetite further, these are the principles: 1. Use both knowledge in the world and knowledge in the head; 2. Simplify the structure of tasks; 3. Make things visible: bridge the gulfs of execution and evaluation; 4. Get the mappings right; 5. Exploit the power of constraints, both natural and artificial; 6. Design for error; 7. When all else fails, standardize. Any downsides? Flipping to the back to read the notes is a pain (and I'm simply incapable of skipping them and leaving them for later...).

  • "Secrets & Lies", by Bruce Schneier. The author is a renowned cryptographer, and one of his earlier books, Applied Cryptography, is the most popular work on the subject. Here he treats digital security broadly. The book is very good, but I expected something different from what I found. To use a term that appears repeatedly in the text, one could say the book is aimed not at security experts, nor at aspiring experts, but at those who hire security experts. That doesn't make it useless reading for the technically inclined, as Schneier's ideas on how to approach security are always very intelligent and sometimes even a little surprising. One of them is the observation that the computer security community puts exaggerated emphasis on preventive measures, as if it were possible to protect against each and every future attack, while neglecting the other phases: detection and response. Another, related thesis is summed up in the mantra "security is not a product, it's a process", one of those things that seem obvious until someone draws our attention to the ramifications. In the end, I don't regret reading it and even recommend it, but the lack of depth bothered me. Especially when I notice the author knows how to handle complex subjects without alienating the lay reader; proof of that is the superb chapter 6, which outlines the main cryptographic techniques without going into theoretical detail, yet manages to illustrate well the basic mechanics and importance of each tool.
  • "The Little Schemer", from MIT Press. Comments on this one will have to wait. I'll just say in advance that it's a great book, but don't buy it without leafing through it first...