Thursday, April 20, 2006

Don't forget to use your brain

Keith Braithwaite, one of the postmodern programming guys, wrote this interesting blog post. I tried to comment on the post page but, sadly, my wise and deep words didn't find their way to the World Wide Web. So, in the immortal words of the prophet, I'm just gonna have to wing it here.

Plenty of bytes and brain cells have been lost, over the years, on the object-relational mapping battleground. Keith's argument is that many of those casualties could be prevented if developers let go of the RDBMS dogma and adopted simpler solutions when possible. If the requirements don't indicate a heavy load of concurrent updates, then there is no need for powerful (and expensive) transactional capabilities. He gives as an example an online shopping web site. Usually, additions and changes to the product catalog don't need to be visible to users immediately. Also, the chance of simultaneous modifications is often negligible here. He proposes* that, in cases like this, the data can be made available in a simple format locally, on the front-end servers. Updates can be periodically pushed from a central database to the front-ends.
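To make the idea concrete, here is a minimal sketch of how a front-end might serve the catalog from a local snapshot file that a central process overwrites from time to time. Everything in it is an assumption for illustration: the names `LocalCatalog` and `push_snapshot`, the JSON format, and the file-based push are mine, not something Keith's post specifies.

```python
import json
import os
import tempfile


def push_snapshot(products, snapshot_path):
    """Central side (hypothetical): write a fresh catalog snapshot.

    The data is written to a temporary file first and then moved into
    place with os.replace, so readers never see a half-written file.
    """
    directory = os.path.dirname(os.path.abspath(snapshot_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump(products, f)
    os.replace(tmp_path, snapshot_path)


class LocalCatalog:
    """Front-end side (hypothetical): read-only catalog held in memory."""

    def __init__(self, snapshot_path):
        self._snapshot_path = snapshot_path
        self._products = {}
        self.reload()

    def reload(self):
        # Load the whole snapshot, then swap the reference in one step.
        with open(self._snapshot_path, encoding="utf-8") as f:
            fresh = json.load(f)
        self._products = fresh

    def get(self, sku):
        # Plain dictionary lookup: no query, no transaction.
        return self._products.get(sku)
```

A front-end would call `reload()` on whatever schedule suits it; until then it keeps serving the previous snapshot, which is exactly the staleness the argument says is acceptable for a product catalog.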

The only nit I have to pick with his post is when he talks about queries:
"For instance, if you want to get hold of an object that (you have to know) happens to be backed by the underlying database then what you do is obtain an entry point to the data, and search over it for the object you want. Unlike the navigation that you might expect to do in an object graph, instead you build...queries. (...)"
Yes, building a whole query just to get hold of one object reference is too much trouble and a violation of DRY (parenthesis: this sort of thing eases this pain somewhat, minus all the factories and spurious abstractions). But I think he somewhat overlooks the fact that often we really need to do a query. It's frequently part of the problem domain, not the solution domain, to put it in other (more pompous) words.

I view queries as inherently declarative operations (given X information, get me Y more information). As such, they are better expressed through declarative means. So, aCollection select: aBlock in Smalltalk is better than the equivalent for loop in Java. Still in OO-land, Evans and Fowler's Specification pattern is even better for more complex cases. Taking this to its logical conclusion, a language specifically designed for searching would be better still. Unfortunately, SQL falls short of this goal in practice, because of the mess that is integrating it with the rest of the application. Microsoft's LINQ project is an intriguing technology in this space.
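As a rough illustration of that progression, the sketch below shows the same selection written three ways in Python: an explicit loop, a declarative comprehension (the closest analogue to select:), and a small take on the Specification pattern. The `Product` class, the sample catalog, and this particular `Specification` shape are assumptions made up for the example; Evans and Fowler describe the pattern more fully than this.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Product:
    name: str
    price: float
    in_stock: bool


class Specification:
    """A reusable, composable predicate over candidate objects."""

    def __init__(self, predicate: Callable[[Product], bool]):
        self._predicate = predicate

    def is_satisfied_by(self, candidate: Product) -> bool:
        return self._predicate(candidate)

    def __and__(self, other: "Specification") -> "Specification":
        # Combine two specifications into one that requires both.
        return Specification(
            lambda c: self.is_satisfied_by(c) and other.is_satisfied_by(c)
        )

    def __invert__(self) -> "Specification":
        # Negate a specification.
        return Specification(lambda c: not self.is_satisfied_by(c))


def select(items: Iterable[Product], spec: Specification) -> List[Product]:
    """Return the items that satisfy the specification, in order."""
    return [item for item in items if spec.is_satisfied_by(item)]


catalog = [
    Product("kettle", 25.0, True),
    Product("toaster", 40.0, False),
    Product("blender", 60.0, True),
]

# 1. Imperative: spell out how to walk the collection.
cheap_loop = []
for product in catalog:
    if product.price < 50:
        cheap_loop.append(product)

# 2. Declarative: state only the condition.
cheap_declarative = [p for p in catalog if p.price < 50]

# 3. Specification: name the conditions and combine them.
cheap = Specification(lambda p: p.price < 50)
in_stock = Specification(lambda p: p.in_stock)
cheap_and_available = select(catalog, cheap & in_stock)
```

The third form costs more up front, but the named specifications can be reused and recombined, which is where it starts to pay off for the more complex cases.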

Anyway, what I wanted to point out is that there is no one size fits all software architecture and every project needs to be thought out** by a team who knows what it's doing and isn't afraid to think outside the vendor-supplied box.


* barring possible misunderstandings on my part.
** This is not a defense of BDUF

2 comments:

keithb said...

Bom dia, Rafael.

I'm sorry that you couldn't comment on the post directly. I'll have to investigate that. In fact, I'm beginning to regret choosing Blogger for my experiment in blogging, for a number of reasons.

Anyway, I'm glad that you found the post interesting. And I'm very interested to note that the "postmodern programming guys" appear to be a coherent enough group to you that you'd identify me as one of them. That was quick.

But that's not what I wanted to say here.

Your observation that "[...] often we really need to do a query. It's frequently part of the problem domain, not the solution domain, to put it in other (more pompous) words." is worth considering (and isn't particularly pompous).

Take one case where I've seen what I'm now thinking might be a pattern candidate, let's call it "Data Corner Shop", applied: a configurator-style web flow.

In a configurator the user's choices determine what options they should be presented with next, so that as they go along they fill in the slots of a data structure in a way that conforms to a certain body of rules, say to select a product with a collection of compatible options.

That problem domain looks like a strong candidate for a declarative, rules-based approach, and in fact these things are really, really easy to code up in Prolog. But usually that's not a permitted solution: management wouldn't allow it even if the developers wanted to use that tool, and the developers don't want to anyway. So instead the tendency is to fall back on complicated database queries or else bring in a third-party tool.

But even so, let's consider a case I'm very familiar with, a configurator that guides a user from identifying their country through to gathering all the information necessary to configure their mobile phone over the air to access an email account.

Well you know, there are only so many countries in the world (where this service is available). And only so many makes and models of mobile phone (that have email clients on them), and only so many network operators (who make data services available to their users in those countries), and only so many email providers (in those countries that have operators that...) and so on.

It turns out that a reasonably efficient encoding of all this knowledge about mobile email (and a lot of other OTA settings) as an object graph only takes up a few megabytes.

And the "queries" that get you from one page to the next, embodying the declarative rules of the domain, are not too onerous to capture as imperative code. Most importantly they aren't ad-hoc (which is where building queries is a huge win).
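A minimal sketch of what such imperative "queries" over an in-memory object graph might look like. The data, the class names, and the two functions are all invented for illustration and are not taken from the system described here; the point is only that each step of the flow is a small, fixed filter rather than an ad-hoc query.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Operator:
    name: str
    country: str


@dataclass(frozen=True)
class Handset:
    model: str
    operators: FrozenSet[str]  # names of operators that support this handset


# Hypothetical reference data, loaded once and held in memory.
OPERATORS = [
    Operator("NorthTel", "BR"),
    Operator("SulMobile", "BR"),
    Operator("IslandNet", "UK"),
]
HANDSETS = [
    Handset("A100", frozenset({"NorthTel", "IslandNet"})),
    Handset("B200", frozenset({"SulMobile"})),
]


def operators_in(country: str) -> List[str]:
    """Page 1 -> page 2: which operators can the user pick in this country?"""
    return sorted(op.name for op in OPERATORS if op.country == country)


def handsets_for(operator: str) -> List[str]:
    """Page 2 -> page 3: which handsets work with the chosen operator?"""
    return sorted(h.model for h in HANDSETS if operator in h.operators)
```

With this toy data, `operators_in("BR")` narrows the choice to two operators, and `handsets_for("NorthTel")` narrows it again to a single handset, mirroring one step of the configurator per function.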

And none of this changes over a period shorter than a month.

So, even in cases where there's a substantial body of places where declarative approaches could be very useful, the overhead of using them can still sometimes be avoided. But not always, and you shouldn't always want to.

As you rightly say: "there is no one size fits all software architecture and every project needs to be thought out by a team who knows what it's doing and isn't afraid to think outside the vendor-supplied box."

I'd add only that they must be prepared to think outside of whatever box their own past history has created, too.

Rafaeldff said...

Boa noite :)

Thanks for responding. I liked your example and I think that object models are indeed an excellent way to represent domain knowledge. I don't have the book nearby, but as I recall there is something in Analysis Patterns about a split between active objects and reference objects (these are almost certainly not the terms used in the book, but my memory isn't helping much, even with Google's aid...) that represents the kind of structure mentioned. As I understood it, the case you mentioned is a progressive refinement (or reduction) of the reference object graph following a business workflow; if you do write it up as a pattern I would be very interested in reading it.


"So, even in cases where there's a substantial body of places where declarative approaches could be very useful, the overhead of using them can still sometimes be avoided. But not always, and you shouldn't always want to."
What caught my attention in this paragraph is the mention of overhead. There are all kinds of overheads here, from possible performance drops because of the complexity that general-case algorithms can bring to the table, to the cognitive overhead of training a team to think declaratively. I think it's worth investigating which of these are accidental difficulties and which are essential. That's why I'm intrigued by where Microsoft's LINQ is going (though I can't say that I know enough about the subject, as I don't even have VS installed...): they are trying to bring declarative querying as an extra ingredient to the OO model.