Tuesday, October 16, 2007

A nice puzzle for the ubiquitous language

Multi Lingual Domain

During the Domain Driven Design seminar I had the pleasure of attending at JAOO, a question arose from one of the participants, something like: "we have a domain model which is expressed in Spanish, but then the company bought an Italian one, and then the model had to be extended to the Italian scenario (which is obviously slightly different), and then to a British one…". The main problem is that a model has to be expressed in some language. Countries like the United States, Great Britain, Canada, India etc. are privileged because they seldom have to choose the language in which the model is expressed.

Our usual scenario is instead a mixture of English and Italian; getNome() and getCognome() are typical examples of this language mismatch, with an English verb prefix glued to an Italian noun. You can make an effort to rule this out by declaring an "all Italian" rule (which doesn't necessarily mean spaghetti code) or an "all English" one, but don't expect 100% compliance: you'll always end up with a mess. Technical terms (especially in the banking and insurance industries) are scary enough in Italian, so no developer will readily stick to an English translation of a term, and most analysts would agree with them (or use them as an excuse). Anyway, given that the domain model must be shared with the domain experts, most of the time you simply have no choice.

Defining the core domain

Given that language is already an issue in non-English-speaking countries even in a single-nation scenario, how could the initial puzzle be solved? There is no real answer. Basically, it depends a lot on the specific skills of the team: if you have people with cross-domain competencies, you could probably leverage their knowledge to expand the core domain so that it includes the multi-lingual scenarios. If those people are not there to stay, though, you could still be putting the core domain at risk of becoming unmaintainable.

Given that national domains will be subject to both EU and national regulations, we also have a two-level driver for change. EU regulations push for a shared domain, while national ones call for separate domains. Assuming that you can't really predict where the change will happen (but EU regulations leave you more time to think than national ones), you are left with the basic question: "how much of the model is really shared?" The answer is going to be "a small portion" for now, but the shared portion will probably grow as a deeper knowledge of the system spreads. At the end of the day, the really tricky thing is to weigh the cost of sharing information against the cost of maintaining duplicated code. One can invest in knowledge sharing (wikis, mailing lists and so on), but I guess that if the developers don't "cross the boundaries" spontaneously to attack the duplicated component (or to translate it into English), forcing them to do so will probably be more expensive than keeping separate domains.

Language-specific wrappers

If shared components are to be used, then language becomes an issue again. For simple terms, English is fine. For more complex ones, code readability might suffer. One possibility is to define language-dependent wrappers, whose only purpose is to dress up the original component by exposing its operations under names taken from the language of the local domain. I personally prefer to use English terms and to write multi-lingual Javadoc (although keeping the translations aligned is risky) or wiki documentation. But in some circumstances a wrapper might help.
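To make the wrapper idea a bit more concrete, here is a minimal sketch of what I have in mind. All the names (InsurancePolicy, Polizza, getContraente() and so on) are made up for illustration and do not come from any real system; the insurance vocabulary is just an assumption to keep the example small. The shared component is named in English and carries a two-language Javadoc, while the Italian wrapper contains no logic at all and only delegates.

```java
import java.math.BigDecimal;

/**
 * Shared component, named in English (hypothetical example).
 * (IT) Componente condiviso: polizza assicurativa.
 */
final class InsurancePolicy {
    private final String policyHolder;
    private final BigDecimal premium;

    InsurancePolicy(String policyHolder, BigDecimal premium) {
        this.policyHolder = policyHolder;
        this.premium = premium;
    }

    /** Returns the policyholder. (IT) Restituisce il contraente. */
    String getPolicyHolder() {
        return policyHolder;
    }

    /** Returns the premium. (IT) Restituisce il premio. */
    BigDecimal getPremium() {
        return premium;
    }
}

/**
 * Italian-language wrapper around the shared component (hypothetical).
 * It holds no state of its own and adds no behaviour: every method
 * simply delegates to the English-named original.
 */
final class Polizza {
    private final InsurancePolicy delegate;

    Polizza(InsurancePolicy delegate) {
        this.delegate = delegate;
    }

    String getContraente() {
        return delegate.getPolicyHolder();
    }

    BigDecimal getPremio() {
        return delegate.getPremium();
    }
}

class WrapperDemo {
    public static void main(String[] args) {
        InsurancePolicy shared = new InsurancePolicy("Mario Rossi", new BigDecimal("120.50"));
        Polizza polizza = new Polizza(shared);
        // The local team reads Italian names; the data lives in the shared component.
        System.out.println(polizza.getContraente() + " " + polizza.getPremio());
    }
}
```

The price of this approach is one more class per language to keep in step with the shared component, which is the same alignment risk as the multi-lingual Javadoc, only enforced by the compiler instead of by discipline.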

