On MDA and reverse engineering

This appeared somewhere in a discussion in the MDA forum on LinkedIn:

If legacy code exists, MDA can be used to capture the models from the existing code. Extracting models from legacy code is difficult, but it is much better than having someone go through the code and create a model in their head

I would pose reverse engineering has nothing to do with MDA. MDA is really about transformations in one way only: from more abstract to more specific models (while reverse engineering is the opposite).

I am not saying though that one cannot use models obtained with reverse engineering in a MDA context, but that is out of the scope of MDA. But I’d go as far saying that in the general case reverse engineering is not a good approach for producing platform independent models. Reasons:

  1. good models require care and intent in representing the business rules and constraints in the domain being addressed, and that is really rare to see in handwritten code (Exhibit A: the incipient popularity of Domain Driven Design). If something is not there to begin with, a tool cannot extract it.
  2. manual implementation results naturally in high variability in how the same the domain and technical concerns are addressed (independently and in coordination).

Those two things make it really hard (impossible, I’d say, other than in exceptional cases) for a reverse engineering tool (that is based in some sort of pattern matching) to identify code elements and map them back to platform independent models that are not only accurate and complete, but also well designed.

Reverse engineering can be useful in getting an initial approximation of the PIMs (say, covering structure only, but not dynamics or behavior), but that will require significant manual work to become properly designed models.

How can modeling be harder than programming?

One argument often posed against model-driven development is that not all developers have the skills required for modeling. This recent thread in the UML Forum discussion group includes a very interesting debate on that and started with this statement by Dan George:

Don’t take his comments about the orders of magnitude reduction in code size to mean orders of magnitude reduction in required skill. I think this is the reason model-driven development is not mainstream. The stream is full of programmers that could never even develop the skills necessary to use MDD. Humbly, I know that I’m still a wannabe.

which was contested by H.S. Lahman

I have to disagree (somewhat) with this assessment. Yes, there is a substantial learning curve and OOA models need to be developed much more rigorously than they are in most shops today. Also, one could argue that the learning curve is essentially the same learning curve needed to learn to do OOA/D properly.

and later by Thomas Mercer-Hursh:

There is something confusing about the idea of good modeling being hard. After all, all one is doing is describing how the system is supposed to work without having to worry about the implementation details. If one can’t do that, then how is one supposed to manually create a correct, working system?

I sympathize with Lahman’s and Thomas’ points (and share some of their puzzlement), but I do agree with Dan’s initial point: modeling can be harder than programming.

Separation of concerns? Not in the job description

The fact is that one can deliver software that was apparently appropriately built (from a QA/product owner/user point-of-view) and yet fail to fully understand the constraints and rules of the business domain the software is meant to serve.

Also, even if a developer does understand the business requirements at the time the solution is originally implemented, it is unfortunately very common that they will fail to encode the solution in a way that clearly express the intent in a way that would be easy for other developers (or themselves) at a later time correlate the code to business requirements (as proposed by Domain Driven Design), leading to software that is very hard to maintain (because it is hard to understand, or hard to change without breaking things). Model-driven development is a great approach for proper separation of concerns when building software (the greatest, if you ask me). However, as sad as that is, proper separation of concerns is not a must-have trait for delivering “appropriate” software (from a narrow, external, immediate standpoint). Ergo, one can build software without modeling, even implicitly.

I don’t think those things happen because developers are sociopaths. I think properly understanding and representing the concerns of a business domain when building software is a very desirable skill (I would say critical), but realistically not all that common in software developers. But how can hordes of arguably proficient programmers get away without such skill?

Delivering software the traditional (programming-centric) way often involves carefully patching together a mess of code, configuration and some voodoo to address a complex set of functional and non-functional requirements that works at the time of observation (a castle of cards is an obvious image here). Building software that way makes it too easy for one to be overwhelmed by all the minutia imposed by each technology and the complexity of making them work together and lose track of the high level goals one is trying to achieve – let alone consciously represent and communicate.

Conclusion

So even though I fully agree with the sentiment that proper programming requires a good deal of modeling skills, I do think it is indeed possible to deliver apparently working software (from an external point of view) without consciously doing any proper modeling. If you stick to the externally-facing aspects of software development, all that is valued is time to deliver, correctness, performance, and use of some set of technologies. Unfortunately that is all that is required for most development positions. Easy of maintenance via proper separation of concerns is nowhere in that list. And model-driven development is essentially an approach for separation of concerns on steroids.

What do you think?

MDD meets TDD (part II): Code Generation

Here at Abstratt we are big believers of model-driven development and automated testing. I wrote here a couple of months ago about how one could represent requirements as test cases for executable models, or test-driven modeling. But another very interesting interaction between the model-driven and test-driven approaches is test-driven code generation.

You may have seen our plan for testing code generation before. We are glad to report that that plan has materialized and code generation tests are now supported in AlphaSimple. Follow the steps below for a quick tour over this cool new feature!

Create a project in AlphaSimple

First, you will need a model so you can generate code from. Create a project in AlphaSimple and a simple model.


package person;

enumeration Gender 
  Male, Female
end; 

class Person
    attribute name : String; 
    attribute gender : Gender; 
end;

end.

Enable code generation and automated testing

Create a mdd.properties file in your project to set it up for code generation and automated testing:


# declares the code generation engine
mdd.target.engine=stringtemplate

# imports existing POJO generation template projects
mdd.importedProjects=http://cloudfier.com/alphasimple/mdd/publisher/rafael-800/,http://cloudfier.com/alphasimple/mdd/publisher/rafael-548/

# declares a code generation test suite in the project
mdd.target.my_tests.template=my_tests.stg
mdd.target.my_tests.testing=true

# enables automated tests (model and templates)
mdd.enableTests=true

Write a code generation test suite

A code generation test suite has the form of a template group file (extension .stg) configured as a test template (already done in the mdd.properties above).

Create a template group file named my_tests.stg (because that is the name we declared in mdd.properties), with the following contents:


group my_tests : pojo_struct;

actual_pojo_enumeration(element, elementName = "person::Gender") ::= "<element:pojoEnumeration()>"

expected_pojo_enumeration() ::= <<
enum Gender {
    Male, Female
}
>>

A code generation test case is defined as a pair of templates: one that produces the expected contents, and another that produces the actual contents. Their names must be expected_<name> and actual_<name>. That pair of templates in the test suite above form a test case named “pojo_enumeration”, which unsurprisingly exercises generation of enumerations in Java. pojo_enumeration is a pre-existing template defined in the “Codegen – POJO templates” project, and that is why we have a couple of projects imported in the mdd.properties file, and that is why we declare our template suite as an extension of the pojo_struct template group. In the typical scenario, though, you may would have the templates being tested and the template tests in the same project.

Fix the test failures

If you followed the instructions up to here, you should be seeing a build error like this:



Line	File		Description
3	my_tests.stg	Test "pojo_enumeration" failed: [-public -]enum Gender {n Male, Femalen}

which is reporting the code generated is not exactly what was expected – the template generated the enumeration with an explicit public modifier, and your test case did not expect that. Turns out that in this case, the generated code is correct, and the test case is actually incorrect. Fix that by ensuring the expected contents also have the public modifier (note that spaces, newlines and tabs are significant and can cause a test to fail). Save and notice how the build failure goes away.

That is it!

That simple. We built this feature because otherwise crafting templates that can generate code from executable models is really hard to get right. We live by it, and hope you like it too. That is how we got the spanking new version of the POJO target platform to work (see post describing it and the actual project) – we actually wrote the test cases first before writing the templates, and wrote new test cases whenever we found a bug – in the true spirit of test-driven code generation.

Can you tell this is 100% generated code?

Can you tell this code was fully generated from a UML model?

This is all live in AlphaSimple – every time you hit those URLs the code is being regenerated on the fly. If you are curious, the UML model is available in full in the TextUML’s textual notation, as well as in the conventional graphical notation. For looking at the entire project, including the code generation templates, check out the corresponding AlphaSimple project.

Preconditions

Operation preconditions impose rules on the target object state or the invocation parameters. For instance, for making a deposit, the amount must be a positive value:


operation deposit(amount : Double);
precondition (amount) { return amount > 0 }
begin
    ...
end;

which in Java could materialize like this:


public void deposit(Double amount) {
    assert amount > 0;
    ...
}

Not related to preconditions, another case assertions can be automatically generated is if a property is required (lowerBound > 0):


public void setNumber(String number) {
    assert number != null;
    ...
}

Imperative behavior

In order to achieve 100% code generation, models must specify not only structural aspects, but also behavior (i.e. they must be executable). For example, the massAdjust class operation in the model is defined like this:


static operation massAdjust(rate : Double);
begin
    Account extent.forEach((a : Account) { 
        a.deposit(a.balance*rate) 
    });
end;

which in Java results in code like this:


public static void massAdjust(Double rate) {
    for (Account a : Account.allInstances()) {
        a.deposit(a.getBalance() * rate);
    };
}

Derived properties

Another important need for full code generation is proper support for derived properties (a.k.a. calculated fields). For example, see the Account.inGoodStanding derived attribute below:


derived attribute inGoodStanding : Boolean := () : Boolean { 
    return self.balance >= 0 
};

which results in the following Java code:


public Boolean isInGoodStanding() {
    return this.getBalance() >= 0;
}

Set processing with higher-order functions

Any information management application will require a lot of manipulation of sets of objects. Such sets originate from class extents (akin to “#allInstances” for you Smalltalk heads) or association traversals. For that, TextUML supports the higher-order functions select (filter), collect (map) and reduce (fold), in addition to forEach already shown earlier. For example, the following method returns the best customers, or customers with account balances above a threshold:


static operation bestCustomers(threshold : Double) : Person[*];
begin
    return
        (Account extent
            .select((a:Account) : Boolean { return a.balance >= threshold })
            .collect((a:Account) : Person { return a->owner }) as Person);
end;        

which even though Java does not yet support higher-order functions, results in the following code:


public static Set<Person> bestCustomers(Double threshold) {
    Set<Person> result = new HashSet<Person>();
    for (Account a : Account.allInstances()) {
        if (a.getBalance() >= threshold) {
            Person mapped = a.getOwner();
            result.add(mapped);
        }
    }
    return result;
}

which demonstrates the power of select and collect. For an example of reduce, look no further than the Person.totalWorth attribute:


derived attribute totalWorth : Double := () : Double {
    return (self<-PersonAccounts->accounts.reduce(
        (a : Account, partial : Double) : Double { return partial + a.balance }, 0
    ) as Double);
};  

which (hopefully unsurprisingly) maps to the following Java code:


public Double getTotalWorth() {
    Double partial;
    partial = 0;
    for (Account a : this.getAccounts()) {
        partial = partial + a.getBalance();
    }
    return partial;
}

Would you hire AlphaSimple?

Would you hire a developer if they wrote Java code like AlphaSimple produces? For one thing, you can’t complain about the guy not being consistent. :) Do you think the code AlphaSimple produces needs improvement? Where?

Want to try by yourself?

There are still some bugs in the code generation that we need to fix, but overall the “POJO” target platform is working quite well. If you would like to try by yourself, create an account in AlphaSimple and to make things easier, clone a public project that has code generation enabled (like the “AlphaSimple” project).

11 Dogmas of Model-Driven Development

I prepared the following slides for my Eclipse DemoCamp presentation on AlphaSimple but ended up not having time to cover them. The goal was not to try to convert the audience, but to make them understand where we are coming from, and why AlphaSimple is the way it is.

And here they are again for the sake of searchability:

I – Enterprise Software is much harder than it should be, lack of separation of concerns is to blame.

II – Domain and architectural/implementation concerns are completely different beasts and should be addressed separately and differently.

III – What makes a language good for implementation makes it suboptimal for modeling, and vice-versa.

IV – Domain concerns can and should be fully addressed during modeling, implementation should be a trivial mapping.

V – A model that fully addresses domain concerns will expose gaps in requirements much earlier.

VI – A model that fully addresses domain concerns allows the solution to be validated much earlier.

VII – No modeling language is more understandable to end-users than a running application (or prototype).

VIII – A single architecture can potentially serve applications of completely unrelated domains.

IX – A same application can potentially be implemented according to many different architectures.

X – Implementation decisions are based on known guidelines applied consistently throughout the application, and beg for automation.

XI – The target platform should not dictate the development tools, and vice-versa.

I truly believe in those principles, and feel frustrated when I realize how far the software industry is from abiding by them.

So, what do you think? Do you agree these are important principles and values? Would you call B.S. on any of them? What are your principles and values that drive your vision of what software development should look like?

Interview on perspectives on MDD and UML

I had the honor of being interviewed by Todd Humphries, Software Engineer at Objektum Solutions, on my views on UML and model-driven development. Here is an excerpt of the interview:

Todd Humphries: Did you have a ‘Eureka!’ moment when modelling made sense for the first time and just became obvious or was there one particular time you can think of where your opinion changed?

Rafael Chaves: When I was first exposed to UML back in school it did feel cool to be able to think about systems at a higher level of abstraction, and be able to communicate your ideas before getting down to the code (we often would just model systems but never actually build them). The value of UML modeling for the purpose of communication was evident, but that was about it. I remember feeling a bit like I was cheating, as drawing diagrams gave me no confidence the plans I was making actually made a lot of sense.

After that, still early in my career, I had the opportunity of working in a team where we were using an in-house code generation tool (first, as many have done, using XML and XSLT, and later, using UML XMI and Velocity templates, also common choices). We would get reams of Java code, EJB configuration files and SQL DDL generated from the designer models, and it did feel a very productive strategy for writing all that code. But the interesting bits (business logic) were still left to be written in Java (using the generation gap pattern). It was much better than writing all that EJB boilerplate code by hand, but it was still cumbersome and there was no true gain in the level of abstraction, as we would model thinking of the code that would be generated – no surprise, as there was no escaping the facts that we would rely on the Java compiler and JUnit tests to figure out whether the model had problems, and in order to write the actual business logic in Java, we had to be very familiar with the code that was generated. So even though I could see the practical utility of modeling by witnessing the productivity gains we obtained, there was a hackish undertone to it, and while it worked, it didn’t feel like solid engineering.

It was only later, when…

Visit The Technical Diaries, Objektum team’s blog, for the full interview. Do you agree with my views? Have your say (here or there).

MDD meets TDD: mapping requirements as model test cases

Executable models, as the name implies, are models that are complete and precise enough to be executed. One of the key benefits is that you can evaluate your model very early in the development life cycle. That allows you to ensure the model is generally correct and satisfies the requirements even before you have committed to a particular implementation platform.

One way to perform early validation is to automatically generate a prototype that non-technical stakeholders can play with and (manually) confirm the proposed model does indeed satisfy their needs (like this).

Another less obvious way to benefit from executable models since day one is automated testing.

The requirements

For instance, let’s consider an application that needs to deal with money sums:

  • REQ1: a money sum is associated with a currency
  • REQ2: you can add or subtract two money sums
  • REQ3: you can convert a money sum to another currency given an exchange rate
  • REQ4: you cannot combine money sums with different currencies

The solution

A possible solution for the requirements above could look like this (in TextUML):

package money;

class MixedCurrency
end;

class Money
  attribute amount : Double;
  attribute currency : String;
  
  static operation make(amount : Double, currency : String) : Money;
  begin 
      var m : Money;
      m := new Money;
      m.amount := amount;
      m.currency := currency;
      return m;
  end;
  
  operation add(another : Money) : Money;
  precondition (another) raises MixedCurrency { return self.currency = another.currency }
  begin
      return Money#make(self.amount + another.amount, self.currency);
  end;
  
  operation subtract(another : Money) : Money;
  precondition (another) raises MixedCurrency { return self.currency = another.currency }  
  begin
      return Money#make(self.amount - another.amount, self.currency);      
  end;
  
  operation convert(anotherCurrency : String, exchangeRate : Double) : Money;
  begin
      return Money#make(self.amount * exchangeRate, anotherCurrency);
  end;  
end;
        
end.

Now, did we get it right? I think so, but don’t take my word for it.

The proof

Let’s start from the beginning, and ensure we satisfy REQ1 (a money sum is a pair <amount, currency>:

[Test]
operation testBasic();
begin
    var m1 : Money;
    m1 := Money#make(12, "CHF");
    Assert#assertEquals(12, m1.amount);
    Assert#assertEquals("CHF", m1.currency);
end;

It can’t get any simpler. This test shows that you create a money object providing an amount and a currency.

Now let’s get to REQ2, which is more elaborate – you can add and subtract two money sums:

[Test]
operation testSimpleAddAndSubtract();
begin
    var m1 : Money, m2 : Money, m3 : Money, m4 : Money;
    m1 := Money#make(12, "CHF");
    m2 := Money#make(14, "CHF");

    m3 := m1.add(m2);    
    Assert#assertEquals(26, m3.amount);
    Assert#assertEquals("CHF", m3.currency);
    
    /* if m1 + m2 = m3, then m3 - m2 = m1 */
    m4 := m3.subtract(m2);
    Assert#assertEquals(m1.amount, m4.amount);
    Assert#assertEquals(m1.currency, m4.currency);
end;

We add two values, check the result, them subtract one of them from the result and expect the get the other.

REQ3 is simple as well, and specifies how amounts can be converted across currencies:

[Test]
operation testConversion();
begin
    var m1 : Money, result : Money;
    m1 := Money#make(3, "CHF");
    result := m1.convert("USD", 2.5);
    Assert#assertEquals(7.5, result.amount);
    Assert#assertEquals("USD", result.currency);
end;

We ensure conversion generates a Money object with the right amount and the expected currency.

Finally, REQ4 is not a feature, but a constraint (currencies cannot be mixed), so we need to test for rule violations:

[Test]
operation testMixedCurrency();
begin
    try
        Money#make(12, "CHF").add(Money#make(14, "USD")); 
        /* fail, should never get here */
        Assert#fail("should have failed");
    catch (expected : MixedCurrency)
        /* success */
    end;
end;

We expect the operation to fail due to a violation of a business rule. The business rule is identified by an object of a proper exception type.

There you go. Because we are using executable models, even before we decided what implementation platform we want to target, we already have a solution in which we have a high level of confidence that it addresses the domain-centric functional requirements for the application to be developed.

Can you say “Test-driven modeling”?

Imagine you could encode all non-technical functional requirements for the system in the form of acceptance tests. The tests will run against your models whenever a change (to model or test) occurs. Following the Test-Driven Development approach, you alternate between encoding the next requirement as a test case and enhancing the model to address the latest test added.

Whenever requirements change, you change the corresponding test and you can easily tell how the model must be modified to satisfy the new requirements. If you want to know why some aspect of the solution is the way it is, you change the model and see the affected tests fail. There is your requirement traceability right there.

See it by yourself

Would you like to give the mix of executable modeling and test-driven development a try? Sign up to AlphaSimple now, then open the public project repository and clone the “Test Infected” project (or just view it here).

P.S.: does this example model look familiar? It should – it was borrowed from “Test Infected: Programmers Love Writing Tests“, the classical introduction to unit testing, courtesy of Beck, Gamma et al.

What is the focus of analysis: problem or solution?

Is the purpose of an analysis model understanding the problem or proposing a solution? I have discussed this a few times with different people. This is how I used to see it:

  • Analysis deals with understanding the problem domain and requirements in detail
  • Design deals with actually addressing those (functional and non-functional) requirements
  • A detailed design model can be automatically transformed into a working implementation
  • An analysis model can’t, as in the general case, it is not possible to automatically derive a solution based on the statement of a problem.

Rumbaugh, Blaha et al in “Object-oriented modeling and design” (one of the first OO modeling books) state the purpose of analysis in OO is to model the real-world system so it can be understood; and the outcome of analysis is understanding the problem in preparation for design.

Jacobson, Booch and Rumbaugh (again, now with the other “two amigos”) in “The unified software development process” state that “an analysis model yields a more precise specification of the requirements than we have in the results from requirements capture” and “before one starts to design and implement, one should have a precise and detailed understanding of the requirements”.

Ok, so I thought I was in good company there. However, while reading the excellent “Model-based development: applications“, to my great surprise, H. S. Lahman clearly states that contrary to structured development, where the focus of analysis is problem analysis, in the object-oriented paradigm, problem analysis is done during requirements elicitation, and the goal of object-oriented analysis is to specify the solution in terms of the problem space, addressing functional requirements only, in a way that is independent of the actual computing environment. Also, Lahman states that the OOA model is the same as the platform-independent model (PIM) in MDA lingo, so it can actually be automatically translated into running code.

That is the first time I have seen this position defended by an expert. I am not familiar with the Shlaer-Mellor method, but I won’t be surprised if it has a similar view of analysis, given that Lahman’s method is derived from Shlaer-Mellor. Incidentally, Mellor/Balcer’s “Executable UML: a foundation for model-driven architecture” is not the least concerned with the software lifecycle, briefly mentions use cases as a way of textually gathering requirements, and focuses heavily on solution modeling.

My suspicion is that for the Shlaer-Mellor/Executable UML camp, since models are fully executable, one can start solving the problem (in a way that is removed from the actual concrete implementation) since the very beginning, so there is nothing to be gained by strictly separating problem from a high-level, problem-space focused solution. Of course, other aspects of the solution, concerned with non-functional requirements or somehow tied with the target computing environment, are still left to be addressed during design.

And now I see how that all makes sense – I struggled myself with how to name what you are doing when you model a solution in AlphaSimple. We have been calling it design based on the more traditional view of analysis vs. design – since AlphaSimple models specify a (highly abstract) solution, it couldn’t be analysis. But now I think I understand: for approaches based on executable modeling, the divide between understanding the problem and specifying a high-level solution is so narrow and so cheap to cross, that both activities can and should be brought closer together, and the result of analysis in approaches based on executable modeling is indeed a model that is ready to be translated automatically into a running application (and can be quickly validated by the customer).

But for everybody else (which is the vast majority of software development practitioners – executable modeling is still not well known and seldom practiced) that is just not true, and the classical interpretation still applies: there is value in thoroughly understanding the requirements before building a solution, given that the turnaround between problem comprehension, solution building and validation is so damn expensive.

For those of you thinking that this smells of BigDesignUpFront, and that is not an issue with agile or iterative approaches in general – I disagree. At least as far as typical iterative approaches go, where iterations need to comprise all/several phases of the software development life cycle so they can finally deliver results that can be validated by non-technical stakeholders. As such they are still very wasteful (the use of the word agile feels like a bad joke to me).

Approaches based on executable modeling, on the other hand, greatly shrink the chasm between problem analysis and conceptual solution modeling and user acceptance, allowing for much more efficient and seamless collaboration between the problem domain expert and the solution expert. Iterations become so action packed that they are hardly discernible. Instead of iterations taking weeks to allow for customer feedback, and a project taking months to fully cover all functional requirements, you may get a fully specified solution after locking a customer and a modeler in a boardroom for just a day, or maybe a week for bigger projects.

So, long story short, to answer the question posed at the beginning of this post, the answer is both, but only if you are following an approach based on executable modeling.

What is your view? Do you agree with that? Are you an executable modeling believer or skeptic?

UPDATE: make sure to check the thread on Google+ as well.

Upcoming presentation on code generation and executable models @ VIJUG

This coming Thursday I will be doing a presentation entitled “Code generation – going all the way” to the Vancouver Island Java User Group.

The plan is to take the audience from the most basic ideas around generating code from models, visiting approaches increasingly more sophisticated, analyzing their pros and cons, all the way to full code generation based on executable models.

In the process, we will be taking a cursory look at some code generation tools in the market, culminating with a preview of the upcoming release of AlphaSimple, our online modeling tool, which will support executable modeling and full code generation.

What:

May 2011 VIJUG Meeting – Code generation: going all the way (official announcement)

Where:

Vancouver Island Technology Park, Conference Center Room – 4464 Markham Street, Victoria, BC

When:

Thursday, 26th May 2011, 18:00-20:00

If you are in Victoria and think developing business applications got just way too complicated and labor-intensive, and that there must be a saner way to build and evolve them (no matter what platforms you use), come to this presentation and learn how executable models and full code generation can fix that.

Update: these were the slides presented:

Productivity of UML

Found an old discussion on the MDSN site about a study on the productivity of UML, brought up by the DSM folks. You can see some of the common caveats raised in this comment by MetaCase’s Steve Kelly. Please read his points and come back here.

I actually didn’t notice it was an old thread and replied to it. Call me cheap, but I hate perfectly good arguments going to waste on a dead thread, so I am recycling my original response (now deleted) here as a blog post.

1) repeat with me, UML is not a graphical language – it has a graphical notation, but others are allowed. Criticism of UML as a whole based on the productivity issues around the graphical notation is cherry picking or (at best) a misinformed opinion. If you don’t like the default notation, create one (like we did!) to suit your taste (and it will still be UML). The specs are public, and there are good open source implementations of the metamodel, that are used by many tools.

2) you don’t need to give up on the semantics of UML to map a modeled class to multiple artifacts. That is just plain OO design mapping to real-world implementation technologies. UML is an OO language first and foremost.

3) There is no need to mix languages, UML has support for both structural and behavioral modeling (since 2002!). Action languages are not (or don’t have to be) “other languages” – but just a textual notation on top of the existing abstract syntax and semantics. That is not a marketing ploy, incorporating elements of the Shlaer-Mellor approach was just a sound strategic decision that made UML much better.

4) Annotations (or stereotypes) is an established (see C#, Java) and cost effective way of tailoring a general purpose language to one’s needs. Not everything calls for a DSL. Both approaches have pros and cons, one has to pick what is best for the situation at hand.

5) All the stories of failure or limited success with generating code from UML models I heard or read are caused by the decision of ignoring behavioral modeling in UML and doing partial code generation. That is a losing proposition, no matter the modeling language. Again, just like the notation issue, analyzing UML productivity based exclusively on those narrow minded cases is at best spreading misinformation. Kudos to MetaCase for promoting full code generation, that is the way to go. But full code generation is not an exclusivity of DSL, the Executable UML folk (and other modeling communities) have been doing it successfully for a long time as well.

Can we move away from the pissing contest between modeling approaches? That got old ages ago. There are way more commonalities than differences between DSM and executable modeling with GPLs like UML, productivity gains included. There is room for both approaches, and it would not be wise to limit oneself to one or another.

What is your opinion? Are you still using old school UML and limiting yourself to generating stubs? Why on earth haven’t you moved to the new world of executable models yet?

Anti-patterns in code generation – Part I

So it finally hits Ted, The Enterprise Developer: all his enterprise applications consisted of the same architectural style applied ad nauseum to each of the entities they dealt with. And Ted asks himself: “why am I wasting so much time of my life doing the same stuff again and again, for each new application, module or entity in the system? The implementation is always the same, only the data model and business rules change from entity to entity!”

The Epiphany

So Ted figures: “just like I write code to test my code, I will write code to write my code!”.

Ted decides that, for his next project, he will take the approach of code generation. Ted is going to model all domain entities as UML classes, and have the code generator produce not only the Java (or C#, or whatever) classes, properties, relationships and methods, but all the boilerplate that goes along with it (constructors, getters, setters, lazy initialization, etc). “This is going to be awesome.”

The Compromise

One of the first things Ted realizes is that since his UML models are pretty dumb and contain no behavior (“UML models can have no behavior, right?”), there is no way to fully generate the code. Bummer.

“Wait a minute, that is not totally true.” Ted’s models contain operation names, parameter lists and return types, so Ted can at least generate empty methods (stubs), complete with Javadoc with the operation description. “This *is* awesome!”

Ted still has all these empty methods that need to be filled in for the application to be fully functional. So he starts filling them in with handwritten code.

Reality Kicks In

Things are looking great. Ted is already filling in the stubbed methods for the tenth entity in the system. But then he realizes there is a problem in the generated code. It would be an easy fix in his generator, and rerunning it will fix the problem everywhere (isn’t that beautiful?). However, Ted would end up losing all changes he had made so far. Argh.

Any way out?

Ted thinks: “shoot, this was going so well, look at how much code I produced in so little time. There must be a solution for this.”

He almost feels like backing up his current code somewhere, regenerating the code (losing his changes) with the new generator, and then adding his handwritten code back (“Just this once!”)”. But he knows better. At some point he will need to regenerate the code again (and then again, and again…), and his team won’t buy the approach if it is that complicated to fix problems or to react to changes. It will look pretty bad.

He opens a new browser tab, and starts thinking about the best search terms he should use to search for a solution to this problem…


In the next episode, Ted, The Enterprise Developer, continues his saga in search for a fix to his (currently) broken approach to code generation. If you have any ideas of what he should try next, let me know in the comments.


Model-driven Development with Executable UML models

Last November I did a lecture on Model-driven Development with Executable UML models to a class of Software Engineering undergrad students at UVic. Here are the slides:

I think it gives a good summary of my views on model driven development (with Executable UML or not):

  • even though problem domains are typically not very complex, enterprise software is complex due to the abundance of secondary crosscutting concerns (persistence, concurrency, security, transactions etc)
  • there are two dominant dimensions in enterprise software: business domain concerns and technological concerns
  • they are completely different in nature (change rate, abstraction level) and require different approaches (tools, skills, reuse)
  • MDD is a strategy that handles well that divide: models address business domain concerns, PIM->PSM transformation addresses technological concerns
  • brainstorming, communication, documentation and understanding (rev. engineering) are not primary goals of MDD – to produce running code in a productive and rational way is
  • models in MDD must be well-formed, precise, complete, executable, technology independent
  • graphical representations are not suitable for executable modeling (textual notations are much better)
  • diagrams != models, text != code (that would look good on a t-shirt!)

I guess those who know me won’t have seen anything new above (these ideas make the very foundations of the TextUML Toolkit and AlphaSimple).

Do you agree with those positions?

The value of UML (or lack thereof) in the development process

[What follows was adapted from a discussion on LinkedIn on why companies developing software don't use UML more]

UML proponents argue that by using UML one gains the ability to properly represent important knowledge on the problem domain and intended solution design. That leads to good things such as improved communication and better quality of the software.

I certainly don’t question that. The ability of devising a solution at the right level of abstraction is extremely important. Yet, that on its own is not enough. It doesn’t help if the language that allows you to specify a solution at the right level of abstraction does not lead to running code (via code generation or direct execution). If one has to specify a solution again by implementing by hand (say, in Java or C#) what the model describes (i.e. using the model as a reference), that greatly offsets any gains from modeling. As the agilists say: Don’t Repeat Yourself. I won’t bore you with the consequences of breaking that simple rule, it must be obvious to any developer worth their salt.

If you are going to be pragmatic, and you cannot generate running code from your models, from a developer’s point of view, there is no much value for UML in the development process.

Developers play an increasingly important role in software product development. Non-executable UML models are seen as fluff, an unnecessary burden to software developers, and hence the poor reputation with that crowd, as the majority of the places using UML is unfortunately using it that way.

I see two solutions for this: either get models to be executable, or get programming languages to support a higher level of abstraction. Even though I am a believer of the former (and our work at Abstratt follows that approach – see AlphaSimple and TextUML Toolkit), I won’t be surprised if the latter ends up being the winning approach (see RAD frameworks such as Grails and Rails).

Do you agree? Which approach do you prefer? Why?

On model-to-model transformations

There was some strong (but polite) reaction to some comments I made about the role of model-to-model (M2M) transformations in model-driven development.

My thinking is that what really matters to modeling users (i.e. developers) is that:

  1. they can “program” (i.e. model) at the right level of abstraction, with proper separation of concerns
  2. they can automatically produce running code from those models without further manual elaboration

In that context, M2M is not a requirement. That is not to say that to support #2 above, tools cannot use model-to-model transformations. But that is probably just an implementation detail of that tool, all that modeling users care is that they are able to model their solutions and produce working applications. Of course, modeling experts will be interested in less mundane things, and more advanced aspects of modeling.

Also, my comments were about model-driven development (MDD), and not model-driven engineering (it seems most people disagreeing with me are from the MDE camp). To be honest, I didn’t even know what MDE meant until recently (and I know that MDE contains MDD), and have just a superficial grasp now. To be even more honest, I am not interested in the possibilities of the larger MDE field. At least not for now. I will explain.

You see, I think we still live in the dark ages of software development. I want that situation to change, and the most obvious single thing that will let us do that is to move away from general purpose 3GLs to the next level, where developers can express themselves at the right level of abstraction, and businesses can preserve their investment in understanding their domain while at the same time being able to take advantage of technological innovation. Hence, my deep interest in making MDD mainstream.

I see value in the things beyond MDD that MDE seems to be concerned with (mining existing systems for models, model-level optimization). I just don’t think they are essential for MDD to succeed. Thus, I prefer to just cross that stuff off for now. We need to lower the barrier to adoption as much as we can, and we need to focus our efforts on the essentials. The less concepts we need to cram into people’s minds in order to take MDD to the mainstream, the better. It is already hard enough to get buy-in for MDD (even from very smart developers) as it is now. It does not matter how powerful model technology can be, if it never becomes accessible to the people that create most of the software in the world.

Model interpretation vs. code generation? Both.

Model interpretation vs. code generation? There were recently two interesting posts on this topic, both generating interesting discussions. I am not going to try to define or do an analysis of pros and cons of each approach as those two articles already do a good job at that. What I have to add is that if you use model-driven development, even if you have decided for code generation to take an application to production, it still makes a lot of sense to adopt model interpretation during development time.

For one, model interpretation allows you to execute a model with the fastest turnaround. If the model is valid, it is ready to run. Model interpretation allows you to:

  • play with your model as you go (for instance, using a dynamically generated UI, like AlphaSimple does)
  • run automated tests against it
  • debug it

All without having to generate code to some target platform, which often involves multiple steps of transformation (generating source code, compiling source code to object code, linking with static libraries, regenerating the database schema, redeploying to the application server/emulator, etc).

But it is not just a matter of turnaround. It really makes a lot more sense:

  • you and other stakeholders can play with the model on day 1. No need to commit to a specific target platform, or develop or buy code generators, when all you want to validate is the model itself and whether it satisfies the requirements from the point of view of the domain. Heck, you might not even know yet your target platform!
  • failures during automated model testing expose problems that are clearly in the model, not in the code generation. And there is no need to try to trace back from the generated artifact where the failure occurred back to model element that originated it, which is often hard (and is a common drawback raised against model-driven development);
  • debugging the model itself prevents the debugging context from being littered with runtime information related to implementation concerns. Anyone debugging Java code in enterprise applications will relate to that, where most of the frames on the execution stack belong to 3rd-party middleware code for things such as remoting, security, concurrency etc, making it really hard to find a stack frame with your own code.

Model-driven development is really all about separation of concerns, obviously with a strong focus on models. Forcing one to generate code all the way to the target platform before models can be tried, tested or debugged misses that important point. Not only it is inefficient in terms of turnaround, it also adds a lot of clutter that gets in the way of how one understands the models.

In summary, regardless what strategy you choose for building and deploying your application, I strongly believe model interpretation provides a much more natural and efficient way for developing the models themselves.

What are your thoughts?

UML may suck, but is there anything better?

UML has been getting a lot of criticism from all sides, even from the modeling community. Sure, it has its warts:

  • it is a huge language, that wants to be all things to all kinds of people (business analysts, designers, developers, users)
  • it has a specification that is lengthy, hard to navigate and often vague, incomplete or inconsistent
  • it is modular, but its composition mechanism (package merging) is esoteric and not well understood by most
  • it is extensible, but language extensions (profiles and stereotypes) are 2nd-class citizens
  • it lacks a reference implementation
  • its model interchange specification is so vague that often two valid implementations won’t work with each other
  • its committees work behind closed doors, there is no opportunity for non-members to provide feedback on specifications while they are in progress (membership is paid)
  • <add your own grudges here>

However, even though I see a lot of room for improvement, I still don’t think there is anything better out there. The more I become familiar with the UML specification, the more impressed I am about its completeness, and how issues I had never thought about before were dealt with by its designers. And it seems that the OMG recognizes some of the issues I raised above as shortcomings and is working towards addressing them. Unfortunately, some fundamental problems are likely to remain.

In my opinion (hey, this is my blog!), for a modeling language to beat UML:

  • it must be general purpose, not tailored to a specific architecture or style of software
  • it must not be tailored to an implementation language
  • it must be based on or compatible with the object paradigm
  • it must not be limited to one of the dominant aspects of software (state, structure, behavior)
  • it must be focused on executability/code generation (and thus suitable for MDD) as opposed to documentation/communication
  • it must be modular, and user extensions should be 1st class citizens
  • its specification should follow an open process
  • it must not be owned/controlled by a single company
  • it must not require royalties for adoption/implementation

My suspicion is that the next modeling language that will beat the UML as we know today is the future major release of UML. Honestly, I would rather see a new modeling language built from scratch, focused on building systems, that didn’t carry all that requirement/communication/documentation-oriented crap^H^H^H^Hbaggage that UML has (yes, I am talking about you, use case, sequence, instance and collaboration diagrams!), and developed in a more open and agile process than the OMG can possibly do. But I am not hopeful. The current divide between general purpose and domain specific modeling communities is not helping either.

So, what is your opinion? Do you think there are any better alternatives that address the shortcomings of UML without imposing any significant caveats of their own? Have your say.

Myths that give model-driven development a bad name

It seems that people that resist the idea of model-driven development (MDD) do so because they believe no tool can have the level of insight a programmer can. They are totally right about that last part. But that is far from being the point of MDD anyways. However, I think that unfortunate misconception is one of the main reasons MDD hasn’t caught on yet. Because of that, I thought it would be productive to explore this and other myths that give MDD a bad name.

Model-driven development myths

Model-driven development makes programmers redundant. MDD helps with the boring, repetitive work, leaving more time for programmers to focus on the intellectually challenging aspects. Programmers are still needed to model a solution, albeit using a more appropriate level of abstraction. And programmers are still needed to encode implementation strategies in the form of reusable code generation templates or model-driven runtime engines.

Model-driven development enables business analysts to develop software (a variation of the previous myth). The realm of business analysts is the problem space. They usually don’t have the skills required to devise a solution in software. Tools cannot bridge that gap. Unless the mapping between the problem space and solution space is really trivial (but then you wouldn’t want to do that kind of trivial job anyways, right?).

Model-driven development generates an initial version of the code that can be manually maintained from there on. That is not model-driven, it is model-started at most. Most of the benefits of MDD are missed unless models truly drive development.

Model-driven development involves round-trip engineering. In MDD, models are king, 3GL source code is object code, models are the source. The nice abstractions from the model-level map to several different implementation artifacts that capture some specific aspect of the original abstraction, combined with implementation-related aspects. That mapping is not without loss of information, so it is usually not reversible in a practical way, even less so if the codebase is manually maintained (and thus inherently inconsistent/ill-formed). More on this in this older post, pay attention to the comments as well.

Model-driven development is an all or nothing proposition. You use MDD where it is beneficial, combining with manually developed artifacts and components where appropriate. But avoid mixing manual written code with automatically generated code in the same artifact.

What is your opinion? Do you agree these are myths? Any other myths about MDD that give it a bad name that you have seen being thrown around?

Rafael

Interview at Modeling-Languages.com

Last December I had the pleasure of being interviewed by Jordi Cabot, the maintainer of Modeling-Languages.com, a web site on all things model-driven. We talked mostly about the TextUML Toolkit project, but Jordi also asked about my opinions on more general subjects, such as modeling notations, textual modeling frameworks, DSLs, UML and trends in modeling.

Jordi has recently made a transcription of the interview available on his web site. Take a read, feel free to leave a comment, I am very keen on discussing on any of the topics covered.

Model-driven prototyping presentation @ VIJUG

Last week I did a short presentation on “Model-driven prototyping” for the Vancouver Island Java User Group (VIJUG). It was lots of fun, with good participation from the group. I also showed a quick demo of AlphaSimple, our upcoming service for model-driven prototyping, which seemed to be well received.

For the benefit of those not there, here is a web-version of that presentation, with notes showing on the slides (click here for a full screen view).

Comments are very welcome. I would be very happy to discuss the approach with anyone interested.

Model-driven prototyping with AlphaSimple

It’s been a while since the last post, but I have a good excuse. I have been working on a new MDD product named AlphaSimple.

AlphaSimple is our upcoming web-based service that renders functional prototypes straight from rich domain models. The goal is to bridge the gap between design and requirement analysis, creating a fast feedback loop between those two activities. The result is much more precise and complete requirements early in the project lifecycle, and a sound design model that not only will be a breeze to implement, but that even your customers will understand.

We are craving feedback. If you want to take an alpha version for a spin, sign up at alphasimple.com.

Cheers,

Rafael