mbriggs.io

Objects are Behavioural

February 12, 2019

When learning Object Oriented Design in a school setting, the discussion usually goes something like this:

“This is a picture of a dog. What things can a dog do? Bark, eat, run, walk, these are its methods. What are some things that describe a dog? It is soft, brown, has 4 legs. These are its properties”

I like to refer to this object modelling advice as “Find the Nouns”, and it is wrong. It is not only incorrect; it is also directly responsible for a large amount of waste in the industry, and it puts developers on a path that makes it extraordinarily difficult to gain a more nuanced understanding of object design.

Now, let’s look at Domain Driven Design, arguably the most influential methodology used in Object Oriented programming:

“Application Layer: Defines the jobs the software is supposed to do and directs the expressive domain objects to work out problems. The tasks this layer is responsible for are meaningful to the business or necessary for interaction with the application layers of other systems. This layer is kept thin. It does not contain business rules or knowledge, but only coordinates tasks and delegates work to collaborations of domain objects in the next layer down. It does not have state reflecting the business situation, but it can have state that reflects the progress of a task for the user or the program.”

“Domain Layer: Responsible for representing concepts of the business, information about the business situation, and business rules. State that reflects the business situation is controlled and used here, even though the technical details of storing it are delegated to the infrastructure. This layer is the heart of business software.”

This is not bad advice, but it becomes bad advice when read from a grounding in “Find the Nouns” analysis. It is often translated to:

“Purely behavioural code should be kept at a minimum at the boundaries of your domain layer, mostly to interface with the richer world of Nouns.”

Finally, let’s contrast this with what Dr. Alan Kay had to say when asked what Object Oriented programming meant to him:

I thought of objects being like biological cells and/or individual  computers on a network, only able to communicate with messages (so  messaging came at the very beginning — it took a while to see how to  do messaging in a programming language efficiently enough to be  useful).

I wanted to get rid of data. The B5000 almost did this via its almost unbelievable HW architecture. I realized that the cell/whole-computer metaphor would get rid of data, and that “<-” would be just another message token (it took me quite a while to think this out because I really thought of all these symbols as names for functions and procedures).

My math background made me realize that each object could have several algebras associated with it, and there could be families of these, and that these would be very very useful. The term “polymorphism” was imposed much later (I think by Peter Wegner) and it isn’t quite valid, since it really comes from the nomenclature of functions, and I wanted quite a bit more than functions. I made up a term “genericity” for dealing with generic behaviors in a quasi-algebraic form.

I didn’t like the way Simula I or Simula 67 did inheritance (though I thought Nygaard and Dahl were just tremendous thinkers and designers). So I decided to leave out inheritance as a built-in feature until I understood it better.

OOP to me means only messaging, local retention and protection and hiding of state-process, and extreme late-binding of all things. It can be done in Smalltalk and in LISP. There are possibly other systems in which this is possible, but I’m not aware of them.

It is quite difficult to reconcile this with “Find the Nouns”. To Dr. Kay, OO is about cell-inspired partitioning and isolation of behaviour, where all communication between partitions is done through messaging. We will get back to why this distinction is important, but first I want to address the issues that come about when doing noun-based design.

What is so bad about Nouns

When we design our objects around what they are, as opposed to what they do, what we are really doing is designing around data normalization rules (rules that differ from what drives good RDBMS schema design). We think about how data should be structured in order to be flexible in our object model, and then think about what behaviour fits where, based on what data access it needs. This quite frequently leads to violations of design principles.

You are going to end up in situations where behaviour does not have the data it requires, but does have access to other objects which hold that data. This results in “getter” style queries, which are a form of violating encapsulation and lead to poor isolation of change. Our goal in design is to create code where a change to an implementation detail does not result in cascading errors through the codebase. Since encapsulated data is an implementation detail, it is not difficult to see why violating Tell, Don’t Ask (tell an object what to do rather than asking it for its data) is problematic.
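As a minimal sketch of the difference, assume a hypothetical Wallet object (the class, its methods, and the helper are made up for illustration). The "ask" caller reads internal data and re-implements the wallet's rule, while the "tell" caller sends one message and lets the wallet decide for itself.

```ruby
# Hypothetical example: none of these names come from a real library.
class Wallet
  # Exposed only to demonstrate the "ask" style.
  attr_reader :cents

  def initialize(cents)
    @cents = cents
  end

  # Tell: the wallet applies its own rule and changes its own state.
  def pay(price_cents)
    raise ArgumentError, "insufficient funds" if price_cents > @cents

    @cents -= price_cents
    self
  end
end

# Ask: the caller pulls data out and duplicates the wallet's rule.
# If Wallet changed how it stores money, this method would break.
def ask_style_affordable?(wallet, price_cents)
  wallet.cents >= price_cents
end
```

If the getter were removed, only the ask-style helper would need to change, which is the isolation of change that encapsulation is meant to buy.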

Another big one is the Single Responsibility Principle. When an object has multiple responsibilities, it has multiple reasons to change. If you design behaviour around data partitioning, you will inevitably end up with situations where multiple responsibilities depend on the same data. 

Noun-based design is also a central cause of certain highly problematic anti-patterns. One example is the God Object, which is coupled to a large percentage of the other objects in your system and has many responsibilities. It is exceedingly difficult to abide by “Tell, Don’t Ask” while doing data-centric design without this being the end result.

Mitigation Strategies and the Rise of FP

The above problems can be mitigated (and are, by experienced developers) through various techniques and design patterns. At a certain point, though, people tend to ask “Should this really be so difficult?” and start looking to other programming paradigms, like functional programming.

Functional programming says that instead of designing objects around data, you should consider the flow of data through code. Instead of tying behaviour to data, isolate compositional behaviours and bring them together to create simple, re-usable solutions to large problems. This perspective is a light in the darkness for experienced developers who have realized the problems described above. FP brings a completely different set of challenges with it, but if you believe noun-based OO is inherent to the paradigm of Object Oriented design, it is not difficult to see why so many people see FP as a solution to this specific problem.
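To make "flow of data through code" a little more concrete, here is a rough sketch in Ruby. The pricing steps, the 13% tax rate, and the item shape are all assumptions for illustration; the only real API relied on is Proc#>>, which is available from Ruby 2.6.

```ruby
# Hypothetical pricing steps; each one is a small, independent behaviour.
SUBTOTAL = ->(items) { items.sum { |item| item[:price_cents] * item[:quantity] } }
ADD_TAX = ->(cents) { cents * 113 / 100 } # assumed 13% tax, integer math
FORMAT_DOLLARS = ->(cents) { format("$%.2f", cents / 100.0) }

# Proc#>> composes left to right: the data flows through each step in turn.
ORDER_TOTAL = SUBTOTAL >> ADD_TAX >> FORMAT_DOLLARS

ORDER_TOTAL.call([
  { price_cents: 1_000, quantity: 2 },
  { price_cents: 500, quantity: 1 }
])
```

None of the steps knows about the others, so each can be reused or replaced on its own.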

A Journey to Behaviour

The alternative to switching paradigms is to question whether we have a proper grasp of our fundamentals. For me, this started when I was exploring DCI, a methodology that was popular in Ruby for about a year or so. I did not find DCI helpful in practice, but its questions (and its answers to those questions) opened my mind to there being a better way. I read some of the older work from Jacobson about use case controllers and found that valuable. Finally, I read Rebecca Wirfs-Brock’s work on responsibility driven design and a light went on for me.

Responsibility Driven Design says to focus on the roles and responsibilities of an object and its collaborators during the design process. Behavioural segregation via role is an idea I found to be massively beneficial to my own practice. Viewing OO as neighbourhoods of role-based collaborators led me to a deeper understanding of design. More importantly, I found much greater success in my work using OO.
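As a loose illustration of role-based collaboration (my own sketch, not taken from Wirfs-Brock's work), here is a hypothetical PlaceOrder object. It knows its collaborators only by the roles they play, and anything that responds to call can fill either role.

```ruby
# Hypothetical sketch: class name and roles are made up for illustration.
class PlaceOrder
  # charge: plays the "takes payment" role, responds to call(amount_cents)
  # notify: plays the "tells the customer" role, responds to call(message)
  def initialize(charge:, notify:)
    @charge = charge
    @notify = notify
  end

  # The single responsibility: coordinate the collaborators to place an order.
  def call(amount_cents)
    @charge.call(amount_cents)
    @notify.call("Charged #{amount_cents} cents")
    :placed
  end
end
```

Because the roles are defined by messages rather than by classes, a test can pass in lambdas while production code passes in real collaborators.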

The final realization came from talking with Scott Bellware about design related topics. One of the topics he likes to passionately speak against is what he refers to as “ORM Thinking”. ORM thinking is treating your object model as a side effect of an ORM. That was what clicked for me: he wasn’t talking about the egregious issues that most people raise when arguing against ORMs. He was talking about the pernicious industry issue of data-centric design. He was speaking to “Find the Nouns”.

Reflecting on Scott’s thoughts about the problems of “ORM Thinking” alongside the benefits I saw from integrating Responsibility Driven Design into my practice, I realized that the fundamental benefits I was seeing came specifically from relaxing my noun-based interpretation of objects. I went back to re-read some of Fowler’s work with that realization, and saw something that had never clicked with me before: he spends a lot of time talking about the principle of “co-locating data and behaviour” and how important it is in design to get that right. Reading it with my earlier bias had led me to completely miss the point of what he was saying. I have had similar experiences re-reading other foundational works on OO design.

What does Behaviour-centric OO Look Like?

So what does this look like? Someone new to OO may have something that looks like this (in Ruby):

order.calculate_price

This is context specific code on an object which is likely used in many contexts. That is the path to the God Object.

A more experienced developer may have something that looks like:

calculator = OrderPriceCalculator.new
calculator.calculate_price(order)

So here is the thing: why not just do this?

calculate = Order::CalculatePrice.new
calculate.(<data required for the calculation>)

An OrderPriceCalculator does one thing. You can’t separate what it is and what it does. How it is being thought of and named is actually a contrivance to satisfy the noun/verb dogma present in our thinking about objects and methods. 

CalculatePrice is being precise in its language around what its responsibility is. It does not equivocate and it does not have overly specific descriptors. Having .call syntax built into Ruby is actually a very elegant way of showing you are actuating the responsibility of the object. 
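Filled in, the object might look something like the sketch below. The line-item shape (unit_cents, quantity) and the tax rate are assumptions I am making so the example runs; the only Ruby feature it leans on is that calculate.(args) is shorthand for calculate.call(args).

```ruby
# Hypothetical sketch: the line-item shape and tax rate are assumptions.
class Order
  class CalculatePrice
    def initialize(tax_rate_percent: 13)
      @tax_rate_percent = tax_rate_percent
    end

    # Takes only the data the calculation needs, not a whole Order.
    def call(line_items)
      subtotal = line_items.sum do |item|
        item.fetch(:unit_cents) * item.fetch(:quantity)
      end
      subtotal + (subtotal * @tax_rate_percent / 100)
    end
  end
end

calculate = Order::CalculatePrice.new
calculate.([{ unit_cents: 1_000, quantity: 2 }])
```

Note that the object never sees an order at all, so it cannot grow a dependency on the rest of the order's data.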

Price calculation is a case where experienced developers already have a successful mitigation strategy. What about the following (using Rails)?

order.update!(ship_address: "New Address")

Now, I would expect an experienced Rails developer not to call a method like that directly from anywhere but inside the object, and instead to provide a method like

order.change_address!("New Address")

But what is wrong with

change_address = Order::ChangeAddress.new
change_address.(order, "New Address")

This is still not ideal, but it is working under the constraints of Active Record. You also may not be a fan of the verbosity, but that can easily be condensed to

Order::ChangeAddress.(order, "New Address")
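One way to support that condensed form is a class-level call that builds an instance and delegates to it. The sketch below uses a Struct as a stand-in for the Active Record model so that it runs without Rails; in a real app the body of call would persist the change rather than just assign it.

```ruby
# Stand-in for an Active Record model (an assumption, so this runs without Rails).
Order = Struct.new(:ship_address)

class Order
  class ChangeAddress
    # Class-level shortcut: Order::ChangeAddress.(order, address)
    def self.call(*args)
      new.call(*args)
    end

    # In a Rails app this is where the record would be updated and saved.
    def call(order, new_address)
      order.ship_address = new_address
      order
    end
  end
end

Order::ChangeAddress.(Order.new("Old Address"), "New Address")
```

The instance-level call still exists, so collaborators can be injected through the constructor later without changing any callers.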

At the end of the day, all of these examples are relatively superficial. My intention is:

  1. to show what the code looks like
  2. to show how this is still OO
  3. to show that even trivial transformations like these lead to improvements in design heuristics.

Behaviour-centric Implications

There are deeper implications at play here. When I say “Behaviour-centric” I don’t mean “Never use a noun”. For example, if you have a data store, you may use the word “Store” to describe it. In that case, its behaviour is the act of storing something. Where things get more awkward is naming like “Reporter” or “Writer”. If you are thinking in terms of behaviour, it is not necessary to contort your language; just use “Report” and “Write”. However, the bigger question is more one of what leads you to creating that artifact in your code.

When you think about data concerns first, the behaviour naturally follows. The problem is that the behaviour is the harder part. When you design behaviour first, the data to support that behaviour naturally follows. You end up with much better internal cohesion, and it doesn’t impose artificial constraints on your design process, which are very difficult to work against. It is easier to limit objects to a single responsibility, and collaborations become more obvious and less weird (e.g. I don’t have to collaborate with the User object in half the objects in my system).

Another natural conclusion to this line of thinking is that the Canonical Model is an anti-pattern. This is a topic worthy of a full post by itself, but if you take identifiers out of the equation, there is very little data that has contextual relevance in all situations that the domain entity is relevant in. For example, Product.price is highly relevant in a product catalog, but irrelevant in a warehouse inventory context. The idea that “There is one place in the entire system where Product functionality lives, and its instance corresponds to one row in one table in one database” is an extraordinarily difficult design constraint when building a complex system. Evans touches on this when talking about “Bounded Contexts” in DDD, but if you read that through the lens of Data-centric OO you are going to miss the point, and create a “Common” context that all other contexts depend on.
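As a small sketch of the Product example, here are two hypothetical contexts that share nothing but an identifier. The module names, fields, and behaviours are assumptions made for illustration.

```ruby
# Hypothetical contexts: the only thing they share is the SKU.
module Catalog
  Product = Struct.new(:sku, :name, :price_cents)

  class DisplayPrice
    def call(product)
      format("%s: $%.2f", product.name, product.price_cents / 100.0)
    end
  end
end

module Warehouse
  # No price here: it has no relevance to inventory.
  StockItem = Struct.new(:sku, :on_hand)

  class Reserve
    def call(item, quantity)
      raise ArgumentError, "only #{item.on_hand} on hand" if quantity > item.on_hand

      item.on_hand -= quantity
      item
    end
  end
end
```

Each context can change its own representation freely, because no "Common" Product sits underneath both of them.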

Caveats

First of all, there is a giant exception to all of this, which is data objects. A data object is something whose only responsibility is to hold data, and all functionality is directly related to that data (such as having predicate query methods). A data object should never actually do anything other than hold data, and it should definitely be designed around what data it holds. However, this exception does not disprove the general rule, and if the majority of your system is made of data objects, I would seriously question the soundness of the design.
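For illustration, a data object in this sense might look like the following sketch. The ShippingAddress name and its fields are hypothetical; the point is that it holds data, answers predicate questions about that data, and does nothing else.

```ruby
# Hypothetical data object: holds data and answers questions about it.
ShippingAddress = Struct.new(:street, :city, :country_code, keyword_init: true) do
  # Predicate: is this address in the given country?
  def domestic?(home_country_code)
    country_code == home_country_code
  end

  # Predicate: are all fields filled in?
  def complete?
    [street, city, country_code].none? { |value| value.to_s.strip.empty? }
  end
end
```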

Secondly, if the literature I have been referencing in this post is new or foreign to you, please read it and use it to help you make up your own mind about what I am trying to say. This is a very subtle topic, and it requires theory, reflection, and practice to absorb properly. My goal is to share a major revelation that has been extraordinarily helpful for me on my journey, but I don’t think it is something that is possible to understand without a lot of background.


Matt Briggs

Matt Briggs thinks about programming.
