Intro
May’s rubynortheast covered design issues. Pootsbook delivered an excellent talk, covering just about the right amount of detail and being very clear throughout.
His topic was DCI, or Data, Context and Interaction. Based on work by Trygve Reenskaug, who also developed the MVC approach, it aims to help improve maintainability of code and avoid some of the “pain” often seen in large OO systems where models get bloated and cumbersome. One place where this pain is felt is in testing, where the extra complexity hinders the kind of testing we want to do.
Philip highlighted some trends in the wider Rails community, where design approaches and their effect on testing are being re-evaluated. One important idea is that “testing is not just about testing, it should be about design”.
Briefly, DCI is about separating Data from the various Roles in the system. Too much mixing of data and role implementation leads to bloated models, and also poor modularity – where the boundaries between various roles are blurred. The interaction part deals with the various roles some piece of data may have, eg a person could do employee things or family things. The Context part ties the data and roles together in appropriate combinations as and when required.
DCI is definitely a step in the right direction. The code examples we saw were initially strange, but soon made sense, and I can imagine how the benefits are kept when scaling it up to real examples. But now, I want to explore some of the underlying ideas and concepts from some other angles.
Some “OO” theory
The most interesting book I’ve ever read on OO (and note, that set contains more than one item), is Kim Bruce’s Foundations of Object-oriented Languages (MIT Press, 2002). The DCI ideas chimed with several points from Bruce’s book – always a good sign when the same ideas pop up in different places. Although the book can be a bit theoretical in places, there’s much that I believe is interesting and useful for OO programmers to know. It does help to understand the ideas behind the tools.
Overall, the book is a discussion of the pros and cons of OO languages, then a deconstruction of key ideas then several investigations into new combinations of features. More simply: he takes things apart and has a play with new combinations, and manages to keep connected to real-world concerns. The discussion in the first third of the book covers many of the key OO styles and the associated languages, and their strengths and weaknesses. One important point from this is that there is no canonical form of OO – researchers and practitioners are still exploring different avenues, and most significantly, there’s a lot of exploration still to be done. We can view current languages as experimental stepping stones (or sweet spots), with programmers trying to do the best they can where they are, until a better stepping stone appears which simplifies or eliminates some of the contortions programmers were getting used to… Remember: languages are tools, and we should always try to get better tools, not be forced into corners by them.
The biggest weakness that Bruce identifies in current languages is how functionality and data are lumped together, and how this leads to deeper problems in the languages, especially when combined with ideas like inheritance. Put simply, the hard fact is that functionality inheritance works one way, and data inheritance works another way, and they ain’t the same. Ever noticed how a sensible data design (ie a class hierarchy) sometimes doesn’t fit nicely when you start adding in the inheritance of functionality? Or noticed that the rules for inheritance of methods work differently from the rules for data inheritance? Java is a good example: its language spec has two distinct sets of rules for methods vs instance variables. This gets clearer if you try to work with ‘methods’ tied to instance vars, as you do with JavaScript. (The brave can now go and look up co-variant versus contra-variant inheritance…) So, like in DCI, separation of Data from Functionality definitely seems like a good idea.
In case you’re not sure how separating out the functionality works, then consider some examples, say a piece of person data which has some employee-like functionality and some family-like functionality. The employee functionality corresponds to being able to do employee-like things, such as returning the company the person works for or being able to work for X hours on a given day. From this, you might have more specific kinds of employee, like a manager who is responsible for a group of employees and projects etc. Now, a key idea is that when you start isolating out an employee’s functionality, you are then more able to write code which depends on just that functionality and none of the other stuff that a concrete employee (a person) might be able to do.
Instant modularity! And you’ve not lost much in flexibility either, if anything. You might have seen this used in Java, though in my experience, it’s not really pushed in undergrad courses and so not as understood as it should be.
The rest of the book then works toward an alternative formulation of OO ideas that is no less flexible than existing languages but has some important new strengths, though he does stay within more class-based OO languages. This exploration is wide-ranging: the final chapter even asks if we really need to build unrestricted heterogeneous collections (in layman’s terms, the option to chuck absolutely anything into a list). Is this a key or defining property required in OO languages, which we can’t do without? (Your answer goes here.) Bruce discusses whether we actually gain much in software engineering terms, balanced against the extra complexity and implications for run-time, and IIRC his conclusion was that it wasn’t really worth it.
He doesn’t say as much, but the language he ends up with is actually quite Haskell-like. It has a notion like Java interfaces, but more powerful, and a stronger account of polymorphism. It’s also more amenable to compile-time checking, so eliminates a lot of potentially slow run-time checking. Various implementations exist, if you are interested – even one done as a final year project by a student of mine. I’ve not used these extensively, but informally, they do feel like progress.
To summarise, OO is not something fixed in stone, and we have the ability and the right to experiment with different designs. Informally, it seems sensible to make a bigger separation of functionality and data. And it is possible to build programming languages which work in this way, without losing too much of OO’s claimed flexibility and at the same time, gaining some important and useful features too. And to me, DCI (as a design principle) seems to be walking in the same direction, albeit as a pattern of use rather than something encouraged by the language.
A bit of “deja vu”
Haskell contains a notion of interface too, as a promise of functionality that certain types can supply. You could also call it ‘behaviour’ or ‘role’. Java has a simple version of this idea, but the Haskell version is integrated into the wider language and much more powerful. (You can actually run Prolog programs and more in the type-checker, but that’s another story.) Though not the original goal of this part of Haskell, we can use it to do the kind of interface-directed programming suggested by DCI. What follows is a sketch, rather than actual working examples. (I’m writing this in my lunch hour and want to give a flavour, not get stuck in details of type inference.)
class Product x => Cart_like a x where
  add :: x -> a x -> a x
This says, given some product-like type ‘x’ (could be anything, as long as it fulfils the role), then data types ‘a x’ are “cart_like” if they provide an ‘add’ operation which can add product_like values to the cart. In the above, the ‘a’ part actually denotes some kind of container, eg lists or something more complex built on top of lists, so ‘a x’ ends up meaning “a container with ‘x’ values as its payload”. Bit scary, but it means more descriptiveness and more flexibility. Having declared this interface, we can then use it without committing ourselves to an actual choice of cart or product type, eg adding a list of things.
add_things :: Cart_like a x => [x] -> a x -> a x
add_things stuff = foldr (.) id $ map add stuff
-- in Ruby terms: stuff.map {|x| ->(y) {add(x,y)}}.inject(cart) {|x,y| ... }
Pricing up a cart works similarly: assuming we can get the price of our product-like things, we can walk through the container and sum up the numbers. So, Haskell supports this interface-based programming quite nicely, and the types do help to document the various combinations.
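To make the cart idea a bit more concrete, here is a minimal sketch that should load in GHC. Several names are my own assumptions and not part of the sketch above: the `price` method on Product, the extra `items` operation on Cart_like, the `total` function, the plain-list instance and the `Book` type. The language pragmas are there because multi-parameter classes are a GHC extension.

```haskell
{-# LANGUAGE MultiParamTypeClasses, FlexibleInstances #-}
module Main where

-- Role: anything that can report a price (the 'price' method is assumed).
class Product x where
  price :: x -> Double

-- Role: a container 'a' holding product-like 'x' values.
class Product x => Cart_like a x where
  add   :: x -> a x -> a x
  items :: a x -> [x]      -- assumed extra operation, needed for pricing

-- Add a list of things to any cart-like container.
add_things :: Cart_like a x => [x] -> a x -> a x
add_things stuff = foldr (.) id (map add stuff)

-- Walk through the container and sum up the prices.
total :: Cart_like a x => a x -> Double
total = sum . map price . items

-- One possible choice of cart: a plain list.
instance Product x => Cart_like [] x where
  add   = (:)
  items = id

-- A hypothetical product type, purely for illustration.
data Book = Book { title :: String, cost :: Double }

instance Product Book where
  price = cost

main :: IO ()
main = do
  let cart = add_things [Book "A" 10, Book "B" 2.5] ([] :: [Book])
  print (total cart)
```

Neither add_things nor total mentions lists or books: they only depend on the two roles, so swapping in a different cart or product type should not require touching them.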
Returning to the DCI example, most of it can be done with interfaces. The interesting bit is building the context values. Basically, we don’t have to do much, and can rely on our type system instead. Plus, it is a bit more principled than dynamically combining mixins on an object (anyone else feel uncomfortable with the latter?). Below, we have a (contrived) operation that needs family and employee functionality. The type just says that we can apply this to anything that has both behaviours.
some_employee_family_thing :: (Family a, Employee a) => Date -> String -> a -> IO (Maybe Approval)
some_employee_family_thing when reason person
  = book (when :@ ("Taking my " ++ show (num_kids person) ++ " kids to " ++ reason)) `for` company_of person
We could define a new class FamilyEmployee which has the two behaviours as prerequisites (or super-classes?), but it’s completely up to us. We could also put the above code inside a FamilyEmployee interface, if there are other operations to go in there too, or if we want type-specific implementations of this operation.
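As a rough illustration of the combined-role option, the sketch below declares FamilyEmployee with the two behaviours as superclasses. The `leave_request` function is a hypothetical, pure stand-in for the booking operation (it only builds a message), and the `Person` type with its `kids` and `employer` fields is likewise assumed for the example.

```haskell
module Main where

-- Two independent roles.
class Family a where
  num_kids :: a -> Int

class Employee a where
  company_of :: a -> String

-- Combined role: both behaviours as superclasses, no methods of its own.
class (Family a, Employee a) => FamilyEmployee a

-- Hypothetical stand-in for the booking operation: it just builds the
-- request text, using only what the two roles promise.
leave_request :: FamilyEmployee a => String -> a -> String
leave_request reason person =
  "Taking my " ++ show (num_kids person) ++ " kids to " ++ reason
    ++ " (request to " ++ company_of person ++ ")"

-- A concrete piece of data that can play both roles (assumed for the example).
data Person = Person { kids :: Int, employer :: String }

instance Family Person where
  num_kids = kids

instance Employee Person where
  company_of = employer

instance FamilyEmployee Person

main :: IO ()
main = putStrLn (leave_request "the dentist" (Person 2 "Acme"))
```

Writing the constraint as `(Family a, Employee a)` directly on leave_request would work just as well; the named class mainly earns its keep once there are operations that belong to the combination itself.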
Language pattern or Design pattern?
Last few words. Hopefully, if you’ve made it this far, you have got a wider feeling for what DCI and similar are doing, and have some idea of the underlying concepts.
But I also hope you will be thinking about the deeper questions. Like, do concepts like DCI have to be ‘OO’? And which OO do we want? (Contrast: a functional programmer would ask, what’s the data being manipulated and what kind of DSL would we like to use to solve the problem.)
My last point is this: should these things be a design pattern, used as “good practice” for coding in language X, or should we demand language Y, which provides more direct support, maybe even more convenience and safety? One of Reenskaug’s papers, when motivating DCI, mentioned a concept of (having to) feel our way around the program via tests, rather than being able to read it and understand it directly. More language support would definitely help with the latter.