On code being model - maybe not what you think
I have heard the mantra ‘code is model’ several times. Even though I always thought I got the idea of what it meant, only now I decided to do some research to find out where it came from. Turns out that it originated from a blog post that MS’ Harry Pierson wrote back in 2005. It is a very good read, insightful, and to the point.
The idea that gave title to Harry’s post is that whenever we use a simpler representation to build something that is more complex and detailed than we want to care about, we are creating models. 3GL source code is a model for object code. Byte code is a model for actual CPU-specific executable code. Hence code is model.
He then goes to ask that if we have been successfully reaping the benefits of increased levels of abstraction by using 3GLs for decades now, what prevents us from taking the next step and using even higher level (modeling) languages? He makes several good points that are at the very foundations of true model-driven development:
- “models must be precise“- models must be amenable to automatic transformation. Models that cannot be transformed into running code are “useless as development artifacts“. If you like them for conceiving or communicating ideas, that is fine, but those belong to a totally different category, one that plays a very marginal role in software development, and have nothing to do with model-driven development. Models created using the TextUML Toolkit are forcefully precise, and can include behavior in addition to structure.
- “models must be intrinsic to the development process” - models need to be “first class citizens of the development process” or they will become irrelevant. That means: everything that makes sense to be modeled is modeled, and running code is generated from models without further manual elaboration, i.e., no manually changing generated code and taking it from there. As a rule, you should refrain from reading generated code or limit yourself to reading the API of the code, unless you are investigating a code generation bug. There is nothing really interesting to see there - that is the very reason why you wanted to generate it in the first place. Build, read, and evolve your models! Generated code is object code.
- “models aren’t always graphical” - of course not. I have written about that before here. The TextUML Toolkit is only one of many initiatives that promote textual notations for modeling (and I mean modeling, not diagramming - see next point).
- “explicitly call out models vs. views” - in other words, always keep in mind that diagrams != models. Models are the real thing, diagrams are just views into them. Models can admit an infinite number of notations, be them graphical, textual, tabular etc. Models don’t need notations. We (and tools) do. Unfortunately, most people don’t really get this.
The funny thing is that, most of the times I read someone citing Harry’s mantra, it is misused.
One misinterpretation of the “code is model” mantra is that we don’t need higher-level modeling languages, as current 3GLs are enough to “model” an application. The fact is: 3GLs do not provide an appropriate level of abstraction for most kinds of applications. For example, for enterprise applications, 4GLs are usually more appropriate than 3GLs. Java (EE) or C# are horrible choices, vide the profusion of frameworks to make them workable as languages for enterprise software - they are much better appropriated for writing system software.
Another unfortunate conclusion people often extrapolate from the mantra is that if code is model, model is code, and thus it should always be possible to translate between them in both directions (round-trip engineering). Round-trip engineering goes against the very essence of model-driven development, as source code often loses important information that can only exist in higher level models. The only reason people need RTE is because they use models to start a design and generate code, but then they switch to evolving and maintaining the application by directly manipulating the generated code. That is a big no-no in true model-driven development - it implies models are not precise or complete enough for full code generation.
So, what is your view? How do you interpret the “code is model” mantra?