Roman Numerals

I was recently at a FP-ish meetup group in Newcastle, and one topic of discussion was how to TDD the Roman Numerals kata.

Note: for regular readers, TDD is generally going to mean “test driven development” in this article. I suppose it could also stretch to include “test driven design” too. When instead I mean “type-directed development” I’ll mention it in full.

Of course, I remain sceptical about approaching this kata with TDD. My concern is that it’s hard to include much insight about the problem domain in the code, and from there, hard to produce clear (or even “beautiful”) code. See my earlier piece Uncle Bob and Functional Programming for some discussion on such points.

So, what alternatives are there? Maybe several, but here I’m going to use a flavour of type-directed development. The core principle is to consider the data structures we want to use, particularly which ones will help to simplify the code. Code is the stuff between the types, and if we do a good job on the types, the code will become much easier.

Externally (and as a first approximation), the function we want has type “Int -> String”, but this tells us very little. It also encourages “Stringly programming” – doing too much with strings which we’ll then need to unpick when trying to understand / review / test the code. Put another way: strings lose information, and it’s this kind of information which we will want to rely on later.

We can do better. It’s almost certain that we’re going to be doing something non-trivial with values that have a reliable structure, and such structures can be rendered to string form when needed.

By the way, what is the roman numeral representation of zero or negative integers? I don’t know. The type “Int -> String” suggests it is going to handle any integer. Let’s sidestep that for now by saying the input is going to be “a suitable number” and revisit later. To recap, we have these ideas/targets:

  • some suitable intermediate type T
  • something that will convert T values to a string, ie of type T -> String
  • something that converts “a suitable number” to T, ie of type “a suitable number” -> T
  • and we can compose these functions together if we want etc

What is T? Basically, we need it to encode some details of what roman numerals are, and how they work – and the more info we can pack in, the better.

Is it a string? Hell, no. We’re not stringly programmers.

Only certain symbols are allowed – so how about a list of these? Better, but is [V, M, L, L] allowed? It’s a valid list but not a valid roman numeral – so this is still not good enough.

What are the rules anyway? Let’s assume the core rules:

  • M can appear up to four times
  • then D at most once
  • then C at most four times
  • then L at most once
  • then X at most four times
  • then V at most once
  • then I at most four times
  • there are certain “subtraction” rules to avoid four symbols in a row, but let’s keep it simple and assume we can do this simplification when rendering to a string
  • apparently, the Romans weren’t that strict on rules – carvings exist with numerals that don’t follow the usual pattern; but let’s keep to the BBC version for now.
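
The simplification mentioned above – collapsing runs of four into subtractive pairs at render time – can be sketched as plain string rewriting. This is a naive approach, and the rule table and function names are my own:

```haskell
import Data.List (isPrefixOf)

-- Rewrite runs of four symbols (and five-plus-four runs) into the
-- subtractive pairs, eg "IIII" -> "IV" and "VIIII" -> "IX".
-- The longer patterns must be tried first.
subtractive :: String -> String
subtractive s = foldl replaceAll s rules
  where
    rules = [ ("DCCCC", "CM"), ("CCCC", "CD")
            , ("LXXXX", "XC"), ("XXXX", "XL")
            , ("VIIII", "IX"), ("IIII", "IV") ]
    -- Replace every occurrence of one pattern, scanning left to right.
    replaceAll str (from, to) = go str
      where
        go [] = []
        go whole@(c:rest)
          | from `isPrefixOf` whole = to ++ go (drop (length from) whole)
          | otherwise               = c : go rest
```

For example, a naive rendering of 499 as “CCCCLXXXXVIIII” comes out as “CDXCIX”.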

So how about a tuple of 7 integers? eg (0,0,2,1,3,1,3) to represent CCLXXXVIII (= 288). Better, but it’s still too wide – it has no protection against invalid values like (0,-3,10,2,1,1,-2). Nope, not good enough. Large tuples are also a smell, too easy to get numbers in the wrong place. At the very least, we’d want something record-like with clearly named fields.

One way to represent numeric constraints is with enumerations of the required size. Booleans have two values, True and False, so we could use Bool for the “at most once” symbols or invent our own. We need something for 0–4 anyway.

data UpToOne  = Zero_of_one  | One_of_one
data UpToFour = Zero_of_four | One_of_four
              | Two_of_four  | Three_of_four
              | Four_of_four

So we can now have a record containing seven such values – much better, and it is starting to constrain the problem space. We can no longer represent invalid values. As an exercise, you might like to try writing a show function for such values, then maybe a function to add two such numbers. Then have a think about how to convert “a suitable value” into this type.
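
If you’d like to compare notes after trying the exercises, here’s one possible shape. The field names, the deriving clauses, and the 0–4999 input range are my own choices, not the only ones:

```haskell
-- Repeats the earlier enumerations, with deriving clauses added
-- (Enum gives us fromEnum/toEnum for the counter values).
data UpToOne  = Zero_of_one | One_of_one
  deriving (Show, Eq, Enum)

data UpToFour = Zero_of_four | One_of_four | Two_of_four
              | Three_of_four | Four_of_four
  deriving (Show, Eq, Enum)

-- Seven named counters: invalid values are unrepresentable, and the
-- field names guard against mixing positions up.
data Numeral = Numeral
  { ms :: UpToFour, ds :: UpToOne
  , cs :: UpToFour, ls :: UpToOne
  , xs :: UpToFour, vs :: UpToOne
  , is :: UpToFour
  } deriving (Show, Eq)

-- Render each counter as a run of its symbol (no subtraction rules yet).
render :: Numeral -> String
render n = concat
  [ rep (ms n) 'M', rep (ds n) 'D'
  , rep (cs n) 'C', rep (ls n) 'L'
  , rep (xs n) 'X', rep (vs n) 'V'
  , rep (is n) 'I' ]
  where
    rep count sym = replicate (fromEnum count) sym

-- One reading of "a suitable number": integers in 0..4999,
-- decomposed by repeated divMod.
toNumeral :: Int -> Maybe Numeral
toNumeral i
  | i < 0 || i > 4999 = Nothing
  | otherwise = Just (Numeral (toEnum m) (toEnum d) (toEnum c)
                              (toEnum l) (toEnum x) (toEnum v) (toEnum u))
  where
    (m, r1) = i  `divMod` 1000
    (d, r2) = r1 `divMod` 500
    (c, r3) = r2 `divMod` 100
    (l, r4) = r3 `divMod` 50
    (x, r5) = r4 `divMod` 10
    (v, u)  = r5 `divMod` 5
```

With these definitions, fmap render (toNumeral 288) gives Just "CCLXXXVIII", matching the 288 example above.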

What about tests? Depends what you aren’t confident about. I’ve not written any code yet (in the sense of stuff that transforms values into other values), but already I’m confident that the underlying type is a good basis for developing this code – it’s reducing the risk of working with bad values, and hence both simplifying the code and reducing the scope of what needs to be tested.

Is there anything test-like we can use to check our type definition? I’ll leave that for later discussion. Ditto on which tests you want to have when developing the code. What are the risks? In particular, with the kind of code you’ll be writing, what kind of mistakes are you afraid of making? Accidentally confusing field names is one, eg using the count for M when you want the count for X.


Note that we’re still in Haskell. No funny trickery, no advanced stuff. You could even write this in Scala, I guess. (Send me the code if you do – I’m curious.)

Can we do even better? Forget for a moment the limitations on the various tools you are using, and the style and level of programming that they are forcing you to work at. Instead, what is the program you would like to write?

Also, how certain do you want to be about the code being bulletproof? Or about removing some of the risks of making errors?

I’ll follow up soon with a discussion of how & where the powerful features of dependently-typed languages can start to address these questions.

As a sneak peek, it’s going to involve type definitions which simultaneously allow only valid numerals to be constructed, explain how those numerals map to arabic numbers, and also make it easier to prove key properties of the code. We can probably get the type checker to write some of our code too. All this without writing a single test.

I’m not writing a book

You might have noticed, still no book. Sorry about that. It got shelved early in 2014 for a number of reasons: a bit of running out of steam after a year of writing for the magazine, a bit of “creative differences”, a bit of wanting to spend more time with my family, a bit of wanting to move on to new things, and my general aversion to going over old ground.

However, serious respect to the people who do finish their books. It’s a lot of work, much more than knocking out a few articles. It was something of a culture shock for me (-:

Remember though – the whole message of the book would have been the same as the articles: it’s all about data. Not about classes or objects or processes, just data.

Capture the data structures in the problem domain, and think about transformations on them. Certainly don’t try to cram it into some awkward metaphor. Maybe use languages which offer cool tools for working with richly-structured data, like some of the newer functional languages.  

So, the book was mainly going to say what was in the articles, just at a more gentle and thorough pace.

Some useful material did get written though – maybe about 1/2 of it. I really should get round to releasing bits of it soon.

Until next time!

I’m writing a book!

I’m rather delighted to confirm that I’m now writing a book!

You might have seen my articles in the Pragmatic Programmers’ in-house magazine. (If not, start with this one from August 2012. There are eleven articles in the series.) The book will be an expanded version of those articles, with more careful and detailed explanations and some more examples.

The articles took a wider look at functional programming (FP), about why it is useful and how to use it effectively. Thinking in terms of data played a big part in the articles, as did the use of a good type system to guide the solutions: types help to describe what data we have and how we transform the data. The series ended by looking past “mainstream” FP towards dependent types, which provide a significantly better language for describing data and its transformations, to the point of giving us a new paradigm for programming. The book will take roughly the same path.

Assuming I finish the book to the satisfaction of the publishers (and it looks good so far (-:), it will appear under the Pragmatic Programmers label, hopefully early in 2014 – with a possibility of a beta release before the end of 2013.

I’m rather “stoked” (to use a current term) that the Pragmatic Programmers are publishing the book! They’ve got a great range of high-quality titles, plus I really like their attitude and enthusiasm for technology. (For example, not all publishers have something like or, you know.) I’m also delighted that my editor will be former Dr Dobbs editor Michael Swaine, who also edits the PragPub magazine and several of their books, and I look forward to working with him to get my earlier material into good shape. I hope my book will meet the Prags’ standards and be useful and interesting to their audience.

What would such a tome be titled? Well, my working title is “Beyond Functional Programming”. I chose it to convey a bit of controversy, and to differentiate the book from other recent titles. Firstly, I will be looking at the wider picture of programming and programming languages, and how to use them well. Secondly, I’ll look beyond current languages to the new possibilities opening up with dependent types. There’s also a nod to Bruce Tate’s book “Beyond Java”.

“Programming in X” it isn’t. My aim is more “Programming, with some X”. The comma is very deliberate, and important! I feel that too often, undue emphasis is placed on some language rather than the underlying techniques. Instead, I’ll be using language X as a vehicle to cover certain ideas. I won’t be covering language X in the usual level of detail: for one thing, there are several fine books and resources which already do that. My book aims to complement such resources.

The X in the first half of the book will be Haskell. I could have used other languages, but as a vehicle Haskell is ideal. It is the R1150GS of programming languages IMHO. The R1150GS is an excellent motorbike, very well designed, reliable, good performance – but with a few interesting quirks, like the “agricultural character” of the gearbox. The phrase “they all do that, sir” crops up regularly on forums too. Riders of these bikes (like me) refer to such features as “character”, and most of us love the bike for it.

It’s always a touchy and difficult question for an R1150GS owner over which bike they’d want to ride next. Should they stick with the R1150GS? It’s a feasible option, and the bikes are built to last! But can the bike be bettered? Probably. BMW have other more recent models, like the R1200GS, but these are a bit too smooth for my taste. So, like with programming languages, what else can we have? That will be the question in the last third of the book, as we look towards languages and related technology that use very powerful ideas about types.

Mind the gap please

Or, how all programming language evolution charts are incomplete.

Or even, how so many of you are looking the wrong way!

Just a quick moan this time. No doubt you’ve all seen charts which show how ideas from one programming language have flowed into others, in a kind of tracking-the-evolution sense.

Ever noticed what’s wrong with these diagrams? I think they are missing a crucial detail, and by missing this, they are missing a very important aspect of programming language work: the wider picture.

The same applies during many ‘discussions’ about programming languages etc. The discussion tends to focus on existing languages, as if we had to pick something that already exists. Looking backwards, in a sense.

What I think gets lost is the context. We’re programming to solve problems, and most of the fun parts of problem solving are done in our heads and then transferred to some executable format. There’s a gap between “brain” and “machine” that we need to bridge somehow, with a mix of languages, processes, techniques, and I think you’ll all agree the gap is pretty big at present.

My second favourite joke is relevant here. An eminent Comp Sci professor was asked what was the best programming language. Prof. paused, then answered “graduate student”.

So if we could, wouldn’t it be fun to explain to a graduate student what needed doing and then get a working program back? We’ve made the bridge shorter by using a fairly high-level entity to encode our solution. (YMMV though.) Sadly, there aren’t so many graduate students around and we sometimes have to do our own work. How do we reduce how much needs to be bridged?

Notice that I’m not recommending we work at the machine level. We should aim to do better than this. For example, I don’t program in Haskell: I write programs in my head and then write them down as Haskell programs. It’s the difference between programming with versus programming in. I would like my future languages to make it easier to write down what is in my head, not only because I’m lazy, but because it’s probably a bit safer that way.

One way to view programming language development is as an attempt to narrow the brain-machine gap. We try to provide features which make it easier to encode and think about concepts from the problem domain, and to say how they relate. Quite rightly, there’s a growing interest in techniques like DSLs (domain specific languages) as a way to encode important ideas more directly, without the noise of the host programming language. We can go further though.

I like the flexibility and terseness given by Haskell-style languages – great for saying what you mean without too much ceremony. But, quite often I know more about the problem domain etc and how things work than is possible to write in the code, and this is annoying. I don’t want to write such things down as comments, or write test cases to try to document and confirm additional properties. I think we can do better, and one way is with the new generation of dependently typed languages. (You can find an intro in recent PragPub magazines, in particular starting with the April 2013 issue.) Dependent types aren’t perfect, but it’s a good step forward and they introduce many new ideas to explore.

To sum up, I believe we need to be aware of the brain-machine gap, and of how we need to develop tools (languages and otherwise) plus techniques to get more control over the gap. Don’t turn your back on it!

My PhD thesis is 15 years old

Since I left the organised education industry, my PhD thesis has not been readily available on the web. It’s about time I gave it a proper home.

So: my thesis can now be downloaded from here. (This is a PostScript file converted to PDF, and might look a bit ‘scratchy’ in places. IIRC back then we only had 10 dpi fonts.)

I just found another version, if you prefer, which is a scanned image from the middle-class finishing school down the road. It was added in late 2012, it seems.

I also realised that it was almost 15 years ago to the day that I passed my PhD viva (with minor changes). That seems a long time ago. What do I think about the whole thing from the distance of almost a sixth of a century? Basically, I’m still quite pleased with it. Quite a lot of what I said seems still relevant nowadays, particularly questioning the goals of wider NLP and trying to understand the value we’re creating (or not). I’ll go into more detail in another post.

But for now, here’s the abstract.

This research addresses the question, "how do we evaluate systems like LOLITA?" LOLITA is the Natural Language Processing (NLP) system under development at the University of Durham. It is intended as a platform for building NL applications. We are therefore interested in questions of evaluation for such general NLP systems. The thesis has two parts. The first, and main, part concerns the participation of LOLITA in the Sixth Message Understanding Conference (MUC-6). The MUC-relevant portion of LOLITA is described in detail. The adaptation of LOLITA for MUC-6 is discussed, including work undertaken by the author. Performance on a specimen article is analysed qualitatively, and in detail, with anonymous comparisons to competitors' output. We also examine current LOLITA performance. A template comparison tool was implemented to aid these analyses. The overall scores are then considered. A methodology for analysis is discussed, and a comparison made with current scores. The comparison tool is used to analyse how systems performed relative to each-other. One method, Correctness Analysis, was particularly interesting. It provides a characterisation of task difficulty, and indicates how systems approached a task. Finally, MUC-6 is analysed. In particular, we consider the methodology and ways of interpreting the results. Several criticisms of MUC-6 are made, along with suggestions for future MUC-style events. The second part considers evaluation from the point of view of general systems. A literature review shows a lack of serious work on this aspect of evaluation. A first principles discussion of evaluation, starting from a view of NL systems as a particular kind of software, raises several interesting points for single task evaluation. No evaluations could be suggested for general systems; their value was seen as primarily economic. That is, we are unable to analyse their linguistic capability directly.

I’ve been distracted…

I’ve not posted for a while, at least not here, but I have kept up the habit elsewhere.

If you didn’t know, I’ve been writing a few articles on Haskell and related topics for the Pragmatic Bookshelf’s in-house magazine. My general aim is to talk about higher-level issues of programming and what we can get from a functional approach. The key ideas are about putting data first and about getting the programming language to fit the problem. So conceptually I start by asking what data structures we need and what kinds of transformations, then consider what we’d like from the programming language to make it easier or simpler to write the programs.

The current articles are:

One day I’d like to do a retrospective here, to restate the main points and think about what isn’t being explained clearly enough.

But soon, I need to get cracking on the next article to have it submitted before Christmas! It will probably feature Yesod, a full-scale web framework for Haskell. Or I may take it easy and just do some more Fay!


Open source at its best? A review of Spreeconf Europe 2012

The conference I could not miss

Last time I was in Dublin, I swore I’d never go back to the Guinness Storehouse. (I’ll explain why later, for those who are interested.)

But the Spree guys found the one thing that would drag me back: a two day conference on my favourite open-source project! In fact, I would rather have sold my grandmother than miss this. (Fortunately, the tickets were very reasonably priced thanks to the sponsors and my grandmother is safe. For now.)

I’m writing this the day after, sat in a Dublin hotel watching the traditional Irish summer weather (the wet kind). ‘Awesome’ doesn’t quite cover what happened over the past few days. Looking at the selection of tweets generated, this conference was, quite possibly:

print "legen-"; print ->(){ sleep 10; "dary"}[]

There was a great mix of speakers, and I learnt a lot of useful stuff. But two aspects really lifted it beyond just a good conference: interesting chats with great people outside of the talks, and an intangible but strong sense of community. For me, this made it a truly fantastic and unique experience.

What do I mean? At minimum, most of us had worked with the Spree code and had a vested interest in the general continued success of the project, so some (good) shared experiences and a common purpose. Quite a few of us had contributed to the project in various ways, all important and useful, from significant changes in the core code, to bug fixes and feedback, to writing some of the docs, to helping answer queries on the mailing list or IRC channel. Many of us had been helped by such people, and often we in turn had helped others. And for several of us, it was the first time meeting the people behind the email messages or commits (and a bit sad that a few people I had worked with before couldn’t make it this time – even the argumentative ones (-:). There was also a fair bit of pride that our various contributions over the years had built into something quite significant and important – giving many of us some kind of income with tools we enjoy working with, and making a difference to users of the technology.

Basically, the very best that an open source project can offer!

And it’s such a great experience, I seriously recommend it to anyone who is interested in software: identify some projects which overlap with your interests, track them for a bit, maybe start fiddling with the code and participate in discussions, then maybe start helping out with the bug fixing, new features, and infrastructure. I promise you that you’ll learn a lot, much of which can’t be found in books or courses, and you’ll enjoy it too.

Amongst the various factors that contribute to the success of the Spree project, such as talented, enthusiastic, and helpful people, and tools like Ruby and Rails and their related infrastructures, I believe Sean Schofield (aka schof, uberzealot, …) deserves a huge amount of the credit. His good sense and generosity have been an inspiring model for all of us since the beginning of the project. It is an absolute pleasure to work with him!

Day one: related topics

Chris Mar kicked off with an overview of Spree and some current ideas that SpreeCommerce is investigating. The key points for me (as an old hand…) were:

  • use of the skeleton layout grid for responsive design
  • using Jirafe to help collect and analyse interesting stats (looked nice)
  • several payment gateway customizations for spree now collected under the “spree-gateway” gem
  • SpreeCommerce has partnered with a few of the payment systems, which (IIRC) might lead to lower charges
  • spree-pinterest gem exists, and can be used to get some interesting numbers

Gregg Pollack gave an introduction to CoffeeScript. As you probably all know, Gregg is a great communicator of ideas and did a fine job. I particularly liked his technique of going from JS code to CS code to show where and how the improvements are made. It was a good overview of the basics in the time allowed.

There were some interesting questions, especially the usual one about debugging. Sounds like it is still an issue, though some tools are starting to appear which offer some help. Someone quite rightly pointed out that use of TDD will help to avoid many pitfalls.

However, CoffeeScript is not the only game in town – see for a fairly complete list. My current favourite is the ‘Fay’ language: a cut-down compiler for a subset of Haskell into simple JS code. One advantage is that the code is valid Haskell, so it can be debugged as a Haskell program, using the stronger type system and various Haskell testing tools, before eventual translation to JS. It also covers more language features than CoffeeScript, particularly the things I miss, like nicer syntax for list comprehensions and support for more complex data types and pattern matching. Hey: it even allows you to use semicolons and braces if significant white-space isn’t your thing.
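
As a flavour of what I mean, here is the kind of plain Haskell I have in mind – pattern matching over a data type, and a list comprehension. (The example itself is mine, and I’m not claiming Fay supports every bit of it.)

```haskell
-- A small data type with pattern matching: the sort of thing
-- that is awkward to express in CoffeeScript.
data Shape = Circle Double | Rect Double Double

area :: Shape -> Double
area (Circle r) = pi * r * r
area (Rect w h) = w * h

-- A list comprehension: Pythagorean triples up to a bound.
triples :: Int -> [(Int, Int, Int)]
triples n = [ (a, b, c) | c <- [1 .. n], b <- [1 .. c], a <- [1 .. b]
            , a * a + b * b == c * c ]
```

For instance, triples 5 yields just [(3,4,5)].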

M Scott Ford was talking about testing in the context of spree instances, discussing testing of core and testing of integrations of core plus some extensions.

Unfortunately he had the “talk slot of doom” – right after lunch, and when the buffet contained various items made with Guinness. A few of us might have partaken of some liquid refreshment to wash lunch down. A few of us are also not as young as we once were and might have drifted away. (Which fool turned the lights down low? (-:) You’ll have to ask the younger guys for more info about this talk.

Do look out for a recent blog post by @albertovena on TDD-ing a Spree extension for subscriptions.

Nick Gauthier, author of Backbone Recipes, gave a nice intro to the key ideas of Backbone then launched us into a hack-along session, completing an app that he supplied the skeleton for. It’s always fun to have something to play with. Slides are at and the relevant code repo is at

In the evening, we had a few lightning talks during a github-sponsored open bar.

  • Brian Quinn talked a bit about the various ways of using the Deface gem and how the API has changed (you can play with it online too).
  • I talked briefly about the experimental Spree-based site I developed a while back to help coordinate a two-level selling process, where customers order from a trader and the trader orders from the wholesaler. (I must write this up sometime. Please bug me if I don’t do it soon.)
  • There was a quick demo of a web-based tool to help manage and fill in I18n translations, which looked very useful (apologies: I’ve forgotten your names!)
  • The session ended with Nick Gauthier doing some penance involving some random unseen slides and a 20-second timer. I vaguely remember mention of DHH, watches, cars, and cats; but around then my Horlicks started calling…

Day two: About Spree

Quite appropriately, Sean Schofield began the proceedings by talking about his vision and purpose and what it means for Spree.

The vision is “empowering the world’s sellers”, and the purpose is “helping people and solving interesting problems”. Simple statements, but I think it is significant how well they capture the motivations behind a lot of us who work with Spree. Perhaps it’s one of the reasons why we have such a strong community around Spree: from the top of the project downwards, we share and genuinely believe in these ideas.

Sean made an interesting comparison between publishing and selling, suggesting that Spree could become the WordPress for (e-)commerce by putting more and better functionality in the hands of smaller traders. One strong step towards this is via the SpreeWorks SaaS product. Sean gave us an overview. SpreeWorks gives people an easy entry into Spree-based selling and a good migration path towards full Spree use for when they need it.

Brian Quinn was up next, talking about his experiments with Backbone for streamlining tasks like product editing (in admin) and some of the store customization in SpreeWorks, and for improving the performance of the four high-traffic pages in Spree: product index, product show, taxon show, and order edit.

He has a work-in-progress tool (called ‘scribble’) for rendering json templates on the server via V8 with the aim of removing duplication between EJS and ERB templates. (Rich Quick made the sensible point that one could probably achieve this more directly by dynamically creating one set of templates from the other set with a bit of regexp magic. This is worth looking at.) Some references:

  • PJAX (= push-state + Ajax), for mirroring conventional url use
  • RABL gem to help with creation of JSON templates
  • for a demo of Brian’s experiments
  • for the code

Erik Michaels-Ober talked about code coverage tools. His starting point was “how do you become a better programmer”, and invited suggestions from the audience. Someone suggested Haskell (ok, that was me). And no-one suggested going back to school/college! Erik then gave an interesting overview of code coverage tools and gave a demo of some Ruby coverage tools. He also flagged up an article by the late William Thurston on proof & progress in maths which seems worth a look (or see the selected quote pasted in gist 3434943).

Ryan Bigg, who is doing a great job as Spree’s community manager, talked about managing people’s expectations in open source projects, particularly on the side of requesting help on IRC or other public forums. A lot of it boils down to common sense and consideration, of course, though we do occasionally get a few people who leave their common sense at home and make fools of themselves. There was one nice example of a guy who decided to ask the same question repeatedly in a channel until he got an answer…

Today’s holder of the “talk slot of doom” was Daniel Honig, who talked about his company’s experiences of using Spree in large enterprise projects. (I avoided the lunchtime Guinness and so stayed awake this time.) One such project was for the Chipotle Grill chain, with both B2B and B2C stores. He made several interesting points: the need for better reporting tools inside Spree admin (torquebox mentioned as a possible route?); how some customers want product pages to be more brochure-like or to serve as good marketing tools (and his team were experimenting with the integration of Alchemy for richer CMS support – the demo looked good); and finally, some points about integrating Spree into various other business support systems like CRM and stock management. His team is working on a platform to help Spree inter-operate with some standard systems.

The best talk IMHO was Adil Wali’s, about developing and scaling an online business. He’s been involved in several big projects, and passed on some great insights from both his good and bad experiences of commerce projects. I’m a devout techie, but this guy really got me excited and interested about some of the business issues and how they relate to technical issues. You can see some of this on the sites and I have several pages of notes from this talk. I was going to say watch it online, but sadly the talks weren’t being videoed (apart from @joneslee85’s front-row seat – did he capture anything?). I’d better type up the notes some time.

[Side note: what’s your favourite commerce site? It’d be interesting to collect views on this. My absolute favourite is – completely unique and original, and her team are great on customer service. Free noodles too if you are lucky.]

Paul Campbell, known for his work on Exceptional and Funconf among others, talked about his ticketing site ‘Tito’ and the various design changes and experiments that it has been through. The key point here is his goal to make it the best experience for buying tickets online. He talked about various design issues, and about experiences with JavaScript animations and PJAX.

Worth looking up if you haven’t already – the Stripe payment processing gateway, which is much simpler than the standard gateways. There’s some local interest in that it was started by two brothers from Ireland, and it is gradually working its way around the world from its base in the USA. Also worth a look is the Klarna gateway – these guys have partnered with Spreecommerce on a few projects, and were one of the conference sponsors.

Yehuda Katz closed the proceedings with a discussion on handling cruft in open source projects. He also touched on the strengths and weaknesses of the Unix philosophy of composing simple tools into something bigger, and what it meant in the context of larger projects, eg is it tantamount to pushing a problem from one place to another rather than reducing the problem?

Cruft means the ugly bits of code that were needed at some point to support some edge case on some awkward platform, that aren’t essential to the core project. This kind of code is often a source of annoyance and friction for several of the parties involved. The project teams are mostly aware of the cruft, and really do want to get rid of it as soon as they can, though practical details (like life) often get in the way. Newcomers to the project complain about the warts without understanding some of the circumstances and history. Competing projects derive some amusement from the cruft, particularly if they are young enough to have avoided the issues which made the cruft necessary. (Is there a lesson about parenting in here too?) So, Yehuda was reminding us that cruft is a fact of life, and is something we all need to be explicitly aware of and should try to take a balanced approach to. Worth a look: the jQuery roadmap from 28th June, on the jQuery project blog.

So, a great two days, kudos to Brian and crew for organising it, and I look forward to the next one!


Many thanks to my employer, TranscendIT Ltd, for generously sponsoring my attendance and allowing me the time away. (We’re based in the North East of England, and we handle a variety of bespoke systems from education support to marine fuel testing, but are always interested in Spree projects, from full sites down to helping with technical problems. We’ve recently recruited a great designer too!)

Big thanks as well to my better half for holding the fort and entertaining our three young tearaways in my absence. She’s also had to catch a ferocious mouse all by herself whilst I am some 300 miles away. Yes you can go for a spa day or three now (-;

Postscript: why don’t I like the Borehouse

I’ve been on a few brewery tours in my time, and the Storehouse is IMHO a bit of a shallow, antiseptic marketing exercise that leaves a rather bitter aftertaste. Kind of, that’s a few hours of my life that I’ll never get back, only slightly softened by mixing my end-of-tour pint with a good glug of blackcurrant cordial. You also realise how much of an international empire the brewer is, and what it entails. Compare with your favourite over-marketed IT brand now.

Anyway, do try Guinness shandy one time. It’s interesting. But I suggest you leave Ireland first before ordering it: I can’t be held responsible for the consequences. And if you’re in the North East any time soon, do look up Durham Brewery’s White Stout (yes: white) for a refreshing change, or their mighty 10% Imperial Stout.

I remember the time before monads

I’ve been fortunate enough to get some of my ‘essays’ into PragPub magazine, starting with an overview of how functional programmers think and how they use their languages. See Issue 38 for the first installment. Subsequent months should see articles on types and testing, practical programming, refactoring, and dependent types. And yes, eventually something about monads.

When planning these articles, I’ve been reading around some of the recent books and tutorials in Haskell and Ruby – mainly to get a feel for what people might understand and for places where they might run into stumbling blocks. I’m still not entirely satisfied with current material on Haskell, which is one reason I started to write these pieces. The ‘expert’ writing on Haskell is fairly dry, a bit too academic, and I rarely see much of the enjoyment and wonder that keeps me programming in Haskell even after 20+ years. (I love coding, it must be said, and I like to use tools that let me have fun and do great things.)

The other main camp is the people learning Haskell and trying to explain their experiences to others. Though there’s some very useful and interesting accounts out there, they generally also miss some important ideas or put too much emphasis on certain details (and I suspect that the expert camp is partly to blame, for not explaining some ideas as well as they could).

There’s two main culprits: (a) a sense of denial and (b) monads.


By denial, I mean the suggestion that Haskell etc denies itself the use of things like mutable state that everyone else takes for granted, and that a lot of what follows is an attempt to cope with our asceticism. My perspective is different: I see Haskell as starting from a different set of assumptions and arriving at a different place to the mainstream. Rather than needing higher order functions to cope with not having mutable state, I see HOFs as a useful tool for manipulating data, and the flexibility of HOFs means that we don’t need to rely on things like mutable state so much. Hell, I’ll stick my neck out and propose a new law:

Callaghan’s 1st Law: the need for mutable state in a language is inversely proportional to its flexibility in manipulating data

Put another way, mutable state is a lower-level idea and becomes less important when your language supports higher-level ways of working.

A similar comment applies to our use of a type system. We’re not doing it to put ourselves in a strait-jacket for any silly or conservative reasons. Experienced Haskell programmers know that the type system is a tool for getting work done, and a great language for playing with designs, and they exploit these aspects to help them get their work done. Plus, if things get in the way, we often find ways to remove the obstacles.

So basically, I think FP and Haskell is more about opportunity than denial. We’re super-liberal!


There’s a view that Haskell is 99% monads (or thereabouts), and that monads are some arcane mystical concept which only a few can master. Bullshit to both!

I remember the time before monads. It _was not_ a barren wasteland, where all we could do was write programs to build trees and never communicate with the outside world. We really could do real world stuff, including file operations, console IO, and IPC, though it was a bit clumsy in places. At that time, I was doing PhD work on a large Natural Language Processing system, around 60k lines of Haskell and so one of the largest programs of its time. The program could process and analyse Wall Street Journal articles in a few seconds and build a complex semantic representation of the story, and didn’t use a single monad.

It was however a time of exploration, when researchers explored various ideas to find a good way of both having our cake and eating it. Monads are one of the solutions they found, and essentially gave us a small but flexible API for working with “computations” (like IO operations or state modifications, or various combinations thereof) as opposed to simple data values, and did so elegantly _within_ the standard language (ie no ad hoc extensions needed). It got even better when syntactic sugar was added.

This simple idea provided an excellent structuring pattern to tame a lot of clumsy code, and even more usefully, gave us a solid framework for exploring more powerful ideas.

So monads are highly useful for some aspects of programming work, but they are certainly not an essential or core part. I estimate that 50-80% of most large Haskell programs do not involve monads at all – they are just pure data manipulation. Of the remainder, the monad use is mostly straightforward and follows certain common idioms. Real scary stuff is pretty rare.

Last words

As a new explorer (very warm welcome, by the way!), when you look at Haskell material you may see some very unusual or scary-looking stuff. But do bear in mind that a lot of it is just playing around with abstractions on top of the core language, and probably does translate to something more intelligible. Try to work out what is being said about the data being manipulated, and then it might not look so bad.

Another thing that works is to avoid starting with problems you already know how to solve imperatively, and instead try problems that are new to you. For example, try thinking and playing with various tree operations, like leaf counting, traversals, map/fold/filter… Then, you’ll be less tempted to slip into imperative mode.

I also recommend Graham Hutton’s “Programming in Haskell” as the best book introduction to the language. It is a brisk but surprisingly complete introduction to Haskell and related techniques, and parts of it remind me of K&R. However, it is still an academic textbook for an introductory course, and can be thin on the pragmatics and wider picture.

There’s also a thread on the PragPub forums where I’m collecting ideas/requests for future articles. Please feel free to add comments there.

DCI and Programming with Interfaces


May’s rubynortheast covered design issues. Pootsbook delivered an excellent talk, covering just about the right amount of detail and being very clear throughout.

His topic was DCI, or Data, Context and Interaction. Based on work by Trygve Reenskaug, who also developed the MVC approach, it aims to help improve maintainability of code and avoid some of the “pain” often seen in large OO systems where models get bloated and cumbersome. One place where this pain is felt is in testing, where the extra complexity hinders the kind of testing we want to do.

Philip highlighted some trends in the wider Rails community, where design approaches and their effect on testing are being re-evaluated. One important idea is that “testing is not just about testing, it should be about design”.

Briefly, DCI is about separating Data from the various Roles in the system. Too much mixing of data and role implementation leads to bloated models, and also poor modularity – where the boundaries between various roles are blurred. The interaction part deals with the various roles some piece of data may have, eg a person could do employee things or family things. The Context part ties the data and roles together in appropriate combinations as and when required.

DCI is definitely a step in the right direction. The code examples we saw were initially strange, but soon made sense, and I can imagine how the benefits are kept when scaling it up to real examples. But now, I want to explore some of the underlying ideas and concepts from some other angles.

Some “OO” theory

The most interesting book I’ve ever read on OO (and note, that set contains more than one item), is Kim Bruce’s Foundations of Object-oriented Languages (MIT Press, 2002). The DCI ideas chimed with several points from Bruce’s book – always a good sign when the same ideas pop up in different places. Although the book can be a bit theoretical in places, there’s much that I believe is interesting and useful for OO programmers to know. It does help to understand the ideas behind the tools.

Overall, the book is a discussion of the pros and cons of OO languages, then a deconstruction of key ideas, then several investigations into new combinations of features. More simply: he takes things apart and has a play with new combinations, and manages to keep connected to real-world concerns. The discussion in the first third of the book covers many of the key OO styles and the associated languages, and their strengths and weaknesses. One important point from this is that there is no canonical form of OO – researchers and practitioners are still exploring different avenues, and most significantly, there’s a lot of exploration still to be done. We can view current languages as experimental stepping stones (or sweet spots), with programmers trying to do the best they can where they are, until a better stepping stone appears which simplifies or eliminates some of the contortions programmers were getting used to… Remember: languages are tools, and we should always try to get better tools, not be forced into corners by them.

The biggest weakness that Bruce identifies in current languages is how functionality and data are lumped together, and how this leads to deeper problems in the languages, especially when combined with ideas like inheritance. Put simply, the hard fact is that functionality inheritance works one way, and data inheritance works another way, and they ain’t the same. Ever noticed how a sensible data design (ie a class hierarchy) sometimes doesn’t fit nicely when you start adding in the inheritance of functionality? Or noticed that the rules for inheritance of methods work differently from the rules for data inheritance? Java is a good example: its language spec has two distinct sets of rules for methods vs instance variables. This gets clearer if you try to work with ‘methods’ tied to instance vars, as you do with javascript. (The brave can now go and look up co-variant versus contra-variant inheritance…) So, like in DCI, separation of Data from Functionality definitely seems like a good idea.

In case you’re not sure how separating out the functionality works, then consider some examples, say a piece of person data which has some employee-like functionality and some family-like functionality. The employee functionality corresponds to being able to do employee-like things, such as returning the company the person works for or being able to work for X hours on a given day. From this, you might have more specific kinds of employee, like a manager who is responsible for a group of employees and projects etc. Now, a key idea is that when you start isolating out an employee’s functionality, you are then more able to write code which depends on just that functionality and none of the other stuff that a concrete employee (a person) might be able to do. Instant modularity! And you’ve not lost much in flexibility either, if anything. You might have seen this used in Java, though in my experience, it’s not really pushed in undergrad courses and so not as understood as it should be.

The rest of the book then works toward an alternative formulation of OO ideas that is no less flexible than existing languages but has some important new strengths, though he does stay within more class-based OO languages. This exploration is wide-ranging: the final chapter even asks if we really need to build unrestricted heterogeneous collections (in layman’s terms, the option to chuck absolutely anything into a list). Is this a key or defining property required in OO languages, which we can’t do without? (Your answer goes here.) Bruce discusses whether we actually gain much in software engineering terms, balanced against the extra complexity and implications for run-time, and IIRC his conclusion was that it wasn’t really worth it.

He doesn’t say as much, but the language he ends up with is actually quite Haskell-like. It has a notion like java interfaces, but more powerful, and a stronger account of polymorphism. It’s also more amenable to compile-time checking, so eliminates a lot of potentially slow run-time checking. Various implementations exist, if you are interested – even one done as a final year project by a student of mine. I’ve not used these extensively, but informally, they do feel like progress.

To summarise, OO is not something fixed in stone, and we have the ability and the right to experiment with different designs. Informally, it seems sensible to make a bigger separation of functionality and data. And it is possible to build programming languages which work in this way, without losing too much of OO’s claimed flexibility and at the same time, gaining some important and useful features too. And to me, DCI (as a design principle) seems to be walking in the same direction, albeit as a pattern of use rather than something encouraged by the language.

A bit of “deja vu”

Haskell contains a notion of interface too, as a promise of functionality that certain types can supply. You could also call it ‘behaviour’ or ‘role’. Java has a simple version of this idea, but the Haskell version is integrated into the wider language and much more powerful. (You can actually run Prolog programs and more in the type-checker, but that’s another story.) Though not the original goal of this part of Haskell, we can use it to do the kind of interface-directed programming suggested by DCI. What follows is a sketch, rather than actual working examples. (I’m writing this in my lunch hour and want to give a flavour, not get stuck in details of type inference.)

class Product x => Cart_like a x where
  add :: x -> a x -> a x

This says, given some product-like type ‘x’ (could be anything, as long as it fulfils the role), then data types ‘a x’ are “cart_like” if they provide an ‘add’ operation which can add product_like values to the cart. In the above, the ‘a’ part actually denotes some kind of container, eg lists or something more complex built on top of lists, so ‘a x’ ends up meaning “a container with ‘x’ values as its payload”. Bit scary, but it means more descriptiveness and more flexibility. Having declared this interface, we can then use it without committing ourselves to an actual choice of cart or product type, eg adding a list of things.

add_things :: Cart_like a x => [x] -> a x -> a x
add_things stuff = foldr (.) id $ map add stuff
-- in ruby terms: stuff.map {|x| ->(y) { add(x, y) }}.inject(cart) {|c, f| f.(c) }
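For concreteness, here’s a sketch of what fulfilling these roles could look like. The Widget type and the details of the Product class are invented for illustration; plain lists then make a perfectly good cart:

```haskell
{-# LANGUAGE MultiParamTypeClasses, FlexibleInstances #-}

-- an assumed Product role: anything with a price
class Product x where
  price :: x -> Int

-- the Cart_like interface from above
class Product x => Cart_like a x where
  add :: x -> a x -> a x

-- a hypothetical product type, for illustration only
newtype Widget = Widget Int

instance Product Widget where
  price (Widget p) = p

-- plain lists can play the cart role: adding is just consing
instance Product x => Cart_like [] x where
  add = (:)
```

With this in place, `add (Widget 3) []` builds a one-item cart, and `add_things` above works over it unchanged.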

Pricing up a cart works similarly, assuming we can get the price of our product-like things, then we can walk through the container and sum up the numbers. So, Haskell supports this interface-based programming quite nicely, and the types do help to document the various combinations.
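That walk-and-sum can be sketched in a couple of lines, assuming a Product class offering a price (my invention, as above) and any Foldable container as the cart:

```haskell
import Data.Foldable (toList)

-- an assumed Product role, as before
class Product x where
  price :: x -> Int

-- works for any foldable cart: flatten to a list, price each item, sum
total :: (Product x, Foldable a) => a x -> Int
total = sum . map price . toList

-- a throwaway product type so we can try it out
newtype Widget = Widget Int
instance Product Widget where
  price (Widget p) = p
```

For example, `total [Widget 2, Widget 3]` gives 5, and the same `total` would work unchanged on a fancier cart container.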

Returning to the DCI example, most of it can be done with interfaces. The interesting bit is building the context values. Basically, we don’t have to do much, and can rely on our type system instead. Plus, it is a bit more principled than dynamically combining mixins on an object (anyone else feel uncomfortable with the latter?). Below, we have a (contrived) operation that needs family and employee functionality. The type just says, can apply this to anything that has both behaviours.

some_employee_family_thing :: (Family a, Employee a) => Date -> String -> a -> IO (Maybe Approval)
some_employee_family_thing when reason person
  = book (when :@ ("Taking my " ++ show (num_kids person) ++ " kids to " ++ reason)) `for` company_of person

We could define a new class FamilyEmployee which has the two behaviours as prerequisites (or super-classes?), but it’s completely up to us. We could also put the above code inside a FamilyEmployee interface, if there’s other operations to go in there too, or if we want type-specific implementations of this operation.
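A minimal sketch of that combined class, with invented Family and Employee interfaces standing in for the real roles:

```haskell
-- two independent roles, invented here for illustration
class Family a where
  num_kids :: a -> Int

class Employee a where
  company_of :: a -> String

-- FamilyEmployee just bundles the two roles as prerequisites; any type
-- with both behaviours can be declared an instance in one line
class (Family a, Employee a) => FamilyEmployee a

data Person = Person { kids :: Int, employer :: String }

instance Family Person where
  num_kids = kids

instance Employee Person where
  company_of = employer

instance FamilyEmployee Person
```

Code written against FamilyEmployee then sees exactly the two roles and nothing else a Person can do.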

Language pattern or Design pattern?

Last few words. Hopefully, if you’ve made it this far, you have got a wider feeling for what DCI and similar are doing, and have some idea of the underlying concepts.

But hopefully you will be thinking about the deeper questions. Like, do concepts like DCI have to be ‘OO’? And which OO do we want? (Contrast: a functional programmer would ask, what’s the data being manipulated and what kind of DSL would we like to use to solve the problem.)

My last point is this: should these things be a design pattern, used as “good practice” for coding in language X, or should we demand language Y, which provides more direct support, maybe even more convenience and safety? One of Reenskaug’s papers, when motivating DCI, mentioned a concept of (having to) feel our way around the program via tests, rather than being able to read it and understand it directly. More language support would definitely help with the latter.

An intro to Haskell for Ruby programmers


Stuck for a topic at Newcastle’s Ruby group, I attempted an off the cuff talk on what Haskell is about, and what Ruby programmers could learn from it.

I was hoping to tie this to a kata-style programming exercise: this would give a nice set of examples where we could compare code and approaches in a variety of languages. In the end, this didn’t get organised. (Hey, we need to send a spy to the Edinburgh and Glasgow Ruby groups to see how they manage it.)

This piece here is an attempt to summarise the key points.

So what is FP and Haskell?

It’s possible to spend ages arguing about what FP is and what it means. For me, there is one central idea behind everything that informs the language design and the programming style: FP is about *putting data structures first*.

I don’t often see this view in the tutorials or the textbooks. Sometimes, I fear that there’s generally too much focus on the syntax and semantics of the language(s) and not enough on the _pragmatics_. The latter includes the problem solving techniques and general approach, and is what you really need to grok to use the language well. The syntax etc will get you through small examples and exercises, but you need to absorb some deeper ideas to make it really fly.


This reminds me of my second-favourite joke. An eminent professor was once asked what was the best programming language. He thought for a moment, and answered “graduate student”.

It’s a serious point, think about it. What language would you like to program in? You don’t have to use the ones everyone else does. In fact, you can do whatever you like! Languages are our tools, so let’s use the best tools we can get.

There’s an approach to programming which I shall call the “McBride method”. Obviously, this has been invented several times, and is the kind of thinking behind current popularity of (embedded) DSLs. This version comes from Prof. Fred McBride in the 1970s, around the time he implemented pattern matching in Lisp. It has three steps:

  • write the program P you want to write
  • write another program Q which runs P
  • run P via Q.

Notice that this version is particularly pure: it’s not about fitting a DSL into one’s favourite language, instead more about first designing a good language for your problem – and then wondering about the implementation of it.

Notice that problem solving, and expressing a clear solution for the problem, always comes first. Be sure you know which end of the dog you are dealing with.

What do the languages provide?

So programming in FP (to me) is about focussing on the data structures in a problem domain, and on the transformations between the data. When problem-solving, you should work out what kinds of data you are manipulating and think about ways to transform the input to the output.

(Modern) Functional languages provide excellent tools for working in this way, and the good languages have rich collections of tools which work well together. The effect is greater freedom and flexibility to do complex programming.

Have you ever looked at some of your code and thought, there has to be a better way to do it? Or thought that the key details were getting lost under syntactic and language baggage? FP provides some of the tools you need. These tools include

  • easy to create a range of data structures (not just objects, arrays, hashes)
  • powerful tools for taking apart and manipulating data – pattern matching
  • lots of flexibility to parametrize and reuse code
  • powerful glue, typically via HOFs (higher order functions)

Some examples: data structures

  1. Bool isn’t built in to Haskell, in the sense that it is in many other languages. Instead, we can just define it like any other (enumeration) type, say like days of the week, and then use the general mechanisms in the language. (We also get short-circuit evaluation of conditional expressions for free, but that’s another story.)
  2. The same data type definition mechanism allows definition of records (structs), unions,
    polymorphic types, recursive types, and other stuff you’ve probably never even thought of
  3. Eg “data Maybe a = Nothing | Just a” – is a polymorphic type which is very useful in FP. You might think of it as a box which is either empty or has a value in it. I’ll say more about it later.
  4. Recursive types, like trees, eg “data BTree = BLeaf Int | BNode BTree BTree” for a simple binary tree with Ints at its leaves, or “data PTree a = PLeaf a | PNode (PTree a) (PTree a)” for the same but with anything at the leaves. And we can use these (parametric) polymorphic types with any types, eg “PTree String” for string leaves, or even “PTree [PTree (IO ())]” for trees with trees of IO-actions at the leaves.
  5. How about this? “data X c a = XLeaf a | XNode (c (X c a))” – notice how ‘c’ is being used. Effectively, we have something that is tree-like, but the kind of tree it is can be controlled by changing ‘c’. For example, if c is the type constructor for lists, then we get trees whose nodes have zero or more children. If (c a) is the type (DayOfWeek -> a), then we have a tree whose nodes have a child for each day of the week. Powerful stuff.
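Point 5 is worth playing with. Here’s a self-contained sketch, choosing ‘c’ to be lists so we get rose trees (the RoseTree name and the size function are my own additions):

```haskell
-- the generalised tree from above: 'c' decides the branching structure
data X c a = XLeaf a | XNode (c (X c a))

-- with c = [], a node holds a list of children: a rose tree
type RoseTree a = X [] a

example :: RoseTree Int
example = XNode [XLeaf 1, XNode [XLeaf 2, XLeaf 3], XLeaf 4]

-- counting leaves, written against the list-shaped version
size :: RoseTree a -> Int
size (XLeaf _)  = 1
size (XNode cs) = sum (map size cs)
```

Swapping in a different ‘c’ changes the shape of every node without touching the leaf machinery.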

Haskell can also construct automatically various widgets for the types we define, such as standard equality tests, ordering tests, show functions, enumeration ranges (eg ‘[Monday .. Friday]’).
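Both points can be seen in one small definition. A sketch, using a Day type of my own rather than anything built in:

```haskell
-- an ordinary enumeration type, defined just like Bool could be
data Day = Monday | Tuesday | Wednesday | Thursday | Friday
         | Saturday | Sunday
  deriving (Show, Eq, Ord, Enum, Bounded)

-- the derived Enum instance gives us ranges for free
workdays :: [Day]
workdays = [Monday .. Friday]
```

The derived Show instance means `show Wednesday` is the string "Wednesday", with no code written by us, and the Eq and Ord instances give equality and ordering tests for free too.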

The key point here is freedom and flexibility – we can easily add new types to model parts of a problem domain, or reuse existing types to build the types we need. There’s less overhead than in other languages – a short, one-line definition does a lot of work.

Some examples: pattern matching

Doing stuff with values in the above types is easy: we just write clauses in our functions to deal with the various patterns we expect to see.

For example, mirror-reversing a tree. There are two main cases, leaf or node, and each clause says what to do with the contents of the tree.

mirror :: PTree a -> PTree a
mirror (PLeaf x) = PLeaf x
mirror (PNode l r) = PNode (mirror r) (mirror l)

Notice that we’re covering all possible cases of tree here. A value which is of tree type is either a leaf or a node, and we provide code to handle both cases in full. We’ll never get a run-time error when an unexpected input is received. Some of the compilers track this ‘totality’ for us, and can give warnings when functions don’t cover all cases.
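To try mirror out, here’s a self-contained version, repeating the PTree definition and deriving Eq and Show so we can compare and print results:

```haskell
data PTree a = PLeaf a | PNode (PTree a) (PTree a)
  deriving (Show, Eq)

mirror :: PTree a -> PTree a
mirror (PLeaf x)   = PLeaf x
mirror (PNode l r) = PNode (mirror r) (mirror l)

example :: PTree Int
example = PNode (PLeaf 1) (PNode (PLeaf 2) (PLeaf 3))
-- mirror example == PNode (PNode (PLeaf 3) (PLeaf 2)) (PLeaf 1)
-- and mirroring twice gets us back where we started
```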

Also note, we don’t need if-statements so much now. Pattern matching does most of the checking work for us. We can still have boolean-conditional tests, eg “if 2 > 3 then 'a' else 'b'”, or there’s a short-hand for combining these with the patterns above. We can also define something ourselves.

my_if :: Bool -> a -> a -> a
my_if False _ f = f
my_if True t _ = t

Finally, we can arbitrarily nest patterns for more complex conditions.

foo :: PTree Int -> PTree String
foo (PLeaf x) | x > 10    = PLeaf "big"
              | otherwise = PLeaf "small"
foo (PNode (PLeaf x) r) = PNode (foo (PLeaf x)) (foo r)
foo (PNode l (PLeaf r)) = PNode (foo (PLeaf r)) (foo l)
foo (PNode l r) = PNode (foo l) (foo r)

Some examples: parametrization & reuse

You’ve already seen some examples of this above, eg where we defined a general structure for binary trees and used it in a variety of ways. Tree mirroring doesn’t depend on what’s at the leaves, so we can use it to mirror any tree value (as long as it’s “PTree something”). The nonsense ‘foo’ function shows a conversion from a tree with one type of ‘payload’ to a tree with a different type.

Haskell also has a notion of ‘type class’, which is a bit like Java’s ‘interface’ feature but more powerful. With this, we can declare that certain types implement some interface (as a collection of functions), and then write polymorphic functions to work on all types that obey the interface. For example, showing a PTree:

show_ptree :: Show a => PTree a -> String
show_ptree (PLeaf x) = show x
show_ptree (PNode l r) = "PNode (" ++ show_ptree l ++ ") (" ++ show_ptree r ++ ")"

We can use this to show a tree whose payload has a showable type. In fact, this kind of function is something Haskell can define for us automatically, so we’d rarely write the above. Notice also that Haskell will check that we’re using interfaces appropriately, and tell us when some interface use isn’t ok, eg if we’re trying to show a value in a type which has some non-showable component (and the compilers usually suggest what extra code we need).

Some examples: powerful glue

Functional programs tend to be a collection of small definitions, combined in various ways to solve the overall problem. We often refer to this facility as ‘glue’.

For example, we might have broken our problem solution into a series of linear steps, so we can imagine joining the steps together in a pipeline, with one stage feeding its output to the input of the next, eg ‘unwords $ map reverse $ words "foo bar"’ (read this right to left first).

Higher order functions, where functions receive or return other functions, are another kind of glue. Ruby programmers will know about map, filter/select, and inject. These are frequent in Haskell, and the language makes them easy to use. For example, suppose you want to put two map operations together. In Ruby you’d chain two map calls – but can you do this more directly? Haskell allows this: ‘map (f . g) list’. The dot between the functions is another important tool in Haskell – it is _function composition_, and it’s very useful for building pipelines. Returning to the example above, we can just write ‘unwords . map reverse . words’ and get a new function from the pieces. There’s nothing special about ‘.’ either – it’s not baked into the language, but another thing we can write in vanilla Haskell, and could even redefine if we so wished – eg to give reverse composition or even an OO-like left-to-right kind of chaining.
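A sketch of that last point; the |> name for left-to-right composition is my own invention, not standard Haskell:

```haskell
-- the words/reverse/unwords pipeline, as a reusable function
scramble :: String -> String
scramble = unwords . map reverse . words

-- '.' is ordinary code, so we can define a left-to-right flavour ourselves
(|>) :: (a -> b) -> (b -> c) -> (a -> c)
f |> g = g . f

-- the same pipeline, now reading in execution order
scramble2 :: String -> String
scramble2 = words |> map reverse |> unwords
```

Both give "oof rab" for the input "foo bar".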

Some thoughts

So, a different style of language, which promotes thinking about data structures first and provides many tools to make this work. Flexibility with function types is also an important point. Sometimes, the best design for something is as a function, and the associated DSL will effectively be building a more complex function.

This flexibility does change your style of programming, and maybe can help to bring us closer to “code with obviously no defects” (Hoare). Program like you mean it!

Side effects and the real world

Some cynics may say that the real world is not about data structures, and that FP has its limits. Sorry, but everything is a piece of data; it’s a case of recognising what kind of data and finding good ways to interact with it.

Pure FP doesn’t have a notion of mutable state, but what is state? What does it mean to change state? How about this: a change of state is a bit like a functional operation, mapping ‘before’ to ‘after’; and it turns out we can get a nice account of many real world ideas by wrapping up this function in an abstract interface and allowing only certain operations on it. In particular, we want precise control over the order of side effects – and this is provided by the ‘monad’ functionality in Haskell.

What do you need to know about monads? Basically, we distinguish between simple values (like those of type Int) and _actions_ or _computations_ which eventually yield some values, eg values of type “IO Int” for a side-effecting computation which produces an Int at some time. We can glue such actions together in sequences, and arrange for the output from one stage to be fed to the input of the next. Apart from this, we can use them like any other kind of value in the language, eg building HOFs for them or putting them in trees. We can also have different monads for state (without IO), or pure IO, or IPC, or concurrency, …
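A small sketch of that gluing, using only standard IO actions:

```haskell
-- two actions in sequence; 'name <- getLine' feeds the result of one
-- action into the rest of the computation
greet :: IO ()
greet = do
  putStrLn "What's your name?"
  name <- getLine
  putStrLn ("Hello, " ++ name)

-- actions are first-class values: here we build one combined action
-- from a list of strings with ordinary glue (mapM_)
greetAll :: [String] -> IO ()
greetAll = mapM_ (\n -> putStrLn ("Hello, " ++ n))
```

Nothing happens until an action is actually run from main; until then, greet is just a value like any other.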

To cut a long story short, a nice bit of abstraction allows programming with side effects in a flexible yet powerful way, without it looking like a dog’s dinner. Plus, the type system helps us keep pure code and the dodgier side-effecting stuff separated.

It’s worth adding too: with a more powerful language, there’s less need for side effects. For example, most loops can be re-cast as a combination of mapping, filtering, and folding, so can be implemented with HOFs instead of nested loops and index variables. You might be surprised by how much of your code can be expressed without side-effects.
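For instance, a loop that sums the squares of the even numbers collapses to one line of HOFs:

```haskell
-- filter picks the elements, map transforms them, sum folds them up:
-- no index variable or accumulator in sight
sumSqEven :: [Int] -> Int
sumSqEven = sum . map (^ 2) . filter even
```

So `sumSqEven [1..5]` works through 2 and 4, giving 4 + 16 = 20.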

Web stuff

Again, no limits. There’s several frameworks for doing web programming in Haskell, including Yesod and Happstack, plus libraries for individual aspects like HTML processing or database interfaces. (An O’Reilly book on Yesod has just been released.) Some of these libraries adopt conventional approaches, kind of like a monadic veneer on top of conventional libraries, whereas others explore other approaches to web programming inside a functional setting. For example, the rack level can be programmed as a function from request to response if no side effects are needed, else as a function in some monad if various side effects are needed. Generating HTML docs is easy via a Haml-like library, which is actually an embedded DSL and so can be freely mixed with other code.

OO aspects?

I’ll save this for another time. It depends on what you understand by OO, and what you see as its strengths and weaknesses. We can model a lot of the features inside FP (albeit with a bit of syntactic baggage) – objects are just another kind of data structure with a particular kind of interface. But overall, I rarely find that I need many OO features when programming in Haskell.

So what’s useful for Ruby programmers?

I think the main benefit is to be aware of the data structures that your program is manipulating, and try to write your code to bring this more to the fore; then you’ll see more functional ideas appearing quite naturally in your code.

It also helps to think about the strengths and weaknesses of your existing tools, to use their strengths and to avoid getting caught by their weaknesses.

Similar comments apply to those working with Coffeescript or similar. It does some functional stuff, but still lacks a few useful features. Anyone looked at Roy yet? It’s more Haskell-ish. (But I still want more: watch this space.)