3


Core concepts of Domain-Driven Design

This chapter covers:

The parts of Domain-Driven Design that have the most effect on security
How domain models are strict representations of a selected part of the problem domain
The well-established building blocks of value objects, entities, and aggregates
How domain models form a language that is ubiquitous when discussing the system
How domain models have limited reach through bounded contexts
How many bounded contexts form a context map for large system integration

During the years we have been developing software, we have found inspiration from many sources, some different, some shared. One of the biggest sources of inspiration we have in common is Domain-Driven Design, often abbreviated DDD. Domain-Driven Design sets the bar a little bit higher in one regard than most system development. A lot of system development talks about ways to make things work. Domain-Driven Design says we don’t just want our systems to work, we want to truly understand what we are building. The beauty we saw in DDD was that it captures that understanding in code; it makes the code speak the language of the problem we are solving. We found that this focus on deep understanding helped us become better developers. It was much later that we realized that this approach also has a profound effect on security.

This chapter is about Domain-Driven Design, but not all aspects of it. Domain-Driven Design is itself a huge and multifaceted subject. It spans from crafting code to system integration, from requirement analysis to testing. It links into other agile-minded methodologies and processes. There are multiple books and an overwhelming number of articles about DDD, so covering it comprehensively in one chapter would be impossible. We will instead focus on those parts of DDD that we have found can drive security.

If you are unfamiliar with Domain-Driven Design, this chapter gives you the understanding of DDD that we are going to use in later chapters. This chapter is also here as a reference. In later chapters we are going to use parts of DDD to promote security, so come back here whenever you need a refresher about value objects, aggregates, context maps, or any other DDD concept. If you are somewhat familiar with DDD, read this chapter as a refresher. If you are a proficient DDDer, please read this chapter anyway, as there are some aspects we want to stress — the aspects that we will use later for promoting security. Also, be aware that we might express some ideas in a somewhat compressed fashion and they might seem somewhat distorted. We are not aiming for completeness, but for an understanding that is enough to talk about the relationship to security.

We will cover domain models, which form the foundation of system development à la DDD. Domain models give an exactness to what the system actually does. From a security perspective this is interesting. When we define what the system should do, it also gives us a very powerful tool to say what the system should not do.

When modeling and thereafter implementing that model as code, it is handy to have some building blocks. Domain models are usually based on value objects and entities. Larger structures are usually represented through aggregates. Using these makes the code more precise, and thus less prone to contain vulnerabilities.

When zooming out from a single system to the integration level, DDD gives us the tools of bounded contexts and context maps. These make it easier to ensure that integration between systems is "tight," so that security can be upheld across several systems.

As Domain-Driven Design is founded on domain models, let’s start with creating strict models to capture deep understanding about the problems we solve with our software.

3.1 Models as a tool for deeper insight

Let’s start by explaining what DDD means by "model," as models are at the center of DDD. In system development the word "model" is used for many things: UML diagrams of flows, the layout of data in database tables, and many other things. In DDD we use the word "model" to describe how we have captured our essential understanding of the business at hand as a selected set of concepts. So why do we need such models, and what should they look like?

Domain-Driven Design isn’t a silver bullet. Domain-Driven Design is at its best when your system handles a problem that is not trivial to grasp. In those cases the most critical problem is understanding the complexity of the domain. Then, understanding and modeling that domain should be your main focus. If you fail to master the complexity of various technical aspects, you get a system that is less useful. But, if you fail to master the complexity of the domain, you get a system that is close to worthless. In that regard the domain is the critical complexity.

Imagine you have a system that handles checked-in baggage at an airport. The complexity of the domain is probably your critical complexity. If you fail to properly represent how baggage is routed from check-in counters to airplanes, via conveyor belts and loading trucks, then the bags might not make it in time for the right flight, or end up on the wrong one. Passengers will be angry and the business will lose goodwill and money.

Even worse, there are important security aspects at stake. If a bag is checked, but the passenger does not show up at the gate, then the baggage system must ensure the bag is unloaded. If the system is not properly crafted it might be possible to trick it into loading a bag onto a specific flight, or not unloading it — something that could have severe security consequences.

If you fail to capture a very deep and precise understanding of baggage handling, you do not just build a flawed system. What you do is build something that is harmful to the business and potentially dangerous to the customers. That’s worse than bad. It’s so bad it makes the system meaningless. The airport might even be better off closed down than with such a flawed system in place. This is not just a hypothetical example; the opening of Denver Airport in the 1990s was delayed a year and a half due to deficiencies in the baggage system.

In this case, understanding and modeling the domain of baggage handling should be the focus of your work. Spending time on optimizing your database connection pool would be a bad choice. The critical complexity is the domain.

	Caution
	Failing to address the critical complexity makes the solution meaningless.

Now the connection to security. It’s hard to capture enough understanding to make a system that behaves well in all possible cases. It’s hard enough to do it just for benevolent, "normal" data, with all the weird cases that can occur. It’s even harder to do it in a way that is resistant to malevolent data. Someone might try to attack your system by sending bizarre data to it, manipulating it into doing something bad. The system should still respond in a sound and safe way.

We saw an example of this in the case study of the online book store in chapter 2. No normal business procedure results in an anti-book (quantity -1) being placed in the shopping cart. Still, a dishonest customer might do so to manipulate the system — in that example, to avoid paying the full price for an order.
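To make this concrete, here is a minimal sketch of how a strict model could rule out the anti-book. The `Quantity` type and its upper limit of 1,000 are our assumptions for illustration, not details from the case study, and records require Java 16 or later.

```java
// Hypothetical value object: the name Quantity and the limit below are
// assumptions made for illustration only.
record Quantity(int value) {
    static final int MAX_PER_ORDER_LINE = 1_000; // assumed business limit

    // Compact constructor: runs on every creation, so an invalid
    // Quantity can never exist.
    Quantity {
        if (value < 1 || value > MAX_PER_ORDER_LINE) {
            throw new IllegalArgumentException(
                "Quantity must be between 1 and " + MAX_PER_ORDER_LINE + ", got " + value);
        }
    }
}
```

With a type like this, an attempt to create `new Quantity(-1)` fails at construction time, so a negative quantity never reaches the shopping cart logic.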

We have found that for security it’s essential to focus on building domain models. Doing so avoids a lot of security problems as a side effect, especially business integrity problems, and to an extent it also shields against some technical attacks.

Of course, Domain-Driven Design is not a silver bullet. There are situations where a main focus on modeling the domain is not the right choice. For example, if you write software for a network router, then I/O throughput will be the most critical thing. Your critical complexity is technical in this case. But even here, you should consider whether a sloppy domain model might be a security issue.

	Caution
	There is always a critical complexity. Be aware of whether it’s a technical aspect or the domain.

We have found the main benefit of domain modeling is that it works as a vehicle for learning, at a very deep level. And learning at that level is crucial. It’s not hard to just "catch the lingo" of the businesspeople, and we can use that same lingo to write a requirements document that looks good. But without deep learning, such a document will contain subtle misunderstandings, inconsistencies, and logical loopholes. These flaws make it impossible to build a solid system that actually does the right thing in tricky situations — with security vulnerabilities as the worst consequence. Working in collaboration with domain experts to create a domain model fuels that learning.

What we need are domain models that support development in a stable and secure way.

For a domain model to be effective, it needs to:

Be simple so we focus on the essentials
Be strict so it can be a foundation for writing code
Capture deep understanding to make the system truly useful and helpful
Be the best choice from a pragmatic viewpoint
Provide us with language we can use whenever we talk about the system

3.1.1 Models are simplifications

A model is a simpler form of reality. It’s a simplification where we have removed irrelevant parts. For example, when you check in a bag at the airport, there is no need for the system to represent your shoe size. On the other hand, it’s probably very relevant to represent how heavy the bag is. To make it easier to understand and code the system, we create a model that contains the weight of the bag, but not the shoe size of the passenger. We keep just the details we think are relevant.

To be clear, models are not diagrams. In many other contexts "model" means a specific diagram type, such as an entity-relationship model, which is often used for database design, or the class diagram from UML. These diagrams are representations of the model, but the model itself is the conceptual understanding of how our simplified view of reality works.

The usage of "model" in Domain-Driven Design is closer to another use of the word, in the phrase "model train." When building model trains, the builders put a great deal of effort into keeping some aspects of reality, while totally ignoring others. Which details to keep and which to distort is key to building model trains, as well as domain models.

Figure 3.1. A model train looks like the real original train.

There’s no doubt that what we see in figure 3.1 is a model train. It looks like a train and moves around on rails, but it is not a real train. We consider it a train model because it has kept some important attributes, while we have allowed it to disregard some other attributes.

Let us list some attributes the model has in common with reality:

Color — We think that the model of a specific train should have the same colors as the original train.
Relative size — We expect the proportions to be maintained. If the doors are twice as high as they are wide in reality, we expect the same ratio on the model train.
Shape — We expect the model train and its details to have the same shape, such as the curvature of the front window.
Movement — We expect the model train to move along rails in the same way as a real train does.

Let us also list some attributes where the model differs from reality, and where we think the difference is fine:

Material — It is OK that the model train is made out of plastic or tin, when the original was built from other materials.
Absolute size — If the real wagons were 30 meters long, we are fine that they are much smaller in the model.
Weight — The model is much lighter, which is OK.
Method of propulsion — The model does not have a steam engine; it runs on electricity.
Rail curvature — The curves in the model are much tighter than in reality, which we accept.

Strangely enough, it is easier to find differences between the model train and a real train than it is to find things they have in common. Still, we have a very firm opinion that this is a proper model of a train. Clearly this specific model has managed to capture the essentials of our understanding of a train.

It seems like "color, relative size, and movement" are enough for us to understand that the model is a train. These three attributes are necessary — if the model does not fulfill them we will not play along and pretend it is a train. And these three are sufficient — if the model fails to fulfill some other expectation, such as material, we will still play along and pretend it is a train.

	Note
	A model is a simplification of reality, one that we still accept as a valid representation of the real thing.

We will now leave the realm of toys and take with us the idea that a model is a simplified understanding of the real thing. This goes for the models we use in system development as well. If we model a person, we might choose to grab onto a few attributes: a person has a name, is of a certain age, has a specific shoe size, and optionally has a pet. Agreed, this is a very crude model, but a model nevertheless.

Figure 3.2. One possible model of people and pets

A model is a simplification, but it must still be general enough so that we can capture some variations that we think are interesting. In our example, we want to allow different names, different ages, and different shoe sizes, and we allow people to have pets or not. All these differences we allow to show up in the model. We do not make any distinctions between people of different height, or pay any attention to their hairdo.

Figure 3.3. Joe, age 34, shoe size 9, and his dog Zarphac together with Jane, age 28, shoe size 6, no pet

We can represent this model in many different ways. We can use plain text to explain what we mean. We can use different kinds of diagrams to illustrate it. We can use code: pseudocode or actual code from a programming language. The important point here is that none of these representations is the model. Class diagrams in particular are often confused with being the model, but the model as such is not any of the representations. The model is the conceptual understanding of what we consider as essential in our modeling — in this case name, age, shoe size, and pet.

Figure 3.4. The same model as before, just another representation

The main benefit of keeping models as really simplified versions of reality is that simple models are easier to make very strict, something that is essential when we later build software from them.

3.1.2 Models are strict

The domain model is not just a watered-down version of reality; what it has lost in richness it has gained in strictness. People are really complex beings with lots of attributes and lots of relations. When we decide to focus on name, age, shoe size, and pets, we lose a lot of richness. But we gain precision in what we mean by "person," a precision that makes it possible to represent this entity in software.

	Note
	The folks that really understand the domain, we call domain experts.

Writing software is a collaboration between two kinds of professionals who come from different directions and who need to meet in a productive way: the businesspeople and the developers. Each has different needs that have to be fulfilled to create great software. Businesspeople need to see the terminology they are used to, not some quasi-technical mumbo-jumbo. If they do not recognize their domain, we have failed them. But it’s not enough to just have some familiar words as labels in the user interface or in the headers of printed-out reports. The system must also behave in a way that businesspeople think is reasonable, consistent, and understandable.

For this to happen, the domain model has to be strict. If the model is not strict and contains ambiguities, then one part of the system might behave in one way and another part in another way. For example, a screen at the check-in counter might talk about "number of bags," another at the gate might say "baggage count," and the tablet used by loading staff might say just "luggage." To make things worse, some of these terms might count the carry-on as part of the number, while others do not. Whenever the personnel speak to each other they each have to remember what screen the other person is seeing and remember to add or subtract the carry-on from the number they are seeing. Sometimes there are misunderstandings and bags are lost. The system fails the business, and not even the domain experts think it makes sense.

	Caution
	Having many almost-synonyms that describe the same concept is often a sign that the model is not very strict.
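One way to remove this kind of ambiguity in code is to give each meaning its own type. The sketch below is only an illustration: the type names, and the decision that carry-on items are counted separately from checked bags, are our assumptions rather than part of any real baggage system.

```java
// Hypothetical types: the names, and the decision that carry-on items are
// counted separately, are assumptions made for illustration.
record CheckedBagCount(int value) {
    CheckedBagCount {
        if (value < 0) {
            throw new IllegalArgumentException("Checked bag count cannot be negative: " + value);
        }
    }
}

record CarryOnCount(int value) {
    CarryOnCount {
        if (value < 0) {
            throw new IllegalArgumentException("Carry-on count cannot be negative: " + value);
        }
    }
}

final class LoadingInstruction {
    private LoadingInstruction() {}

    // Only checked bags go into the hold. Passing a CarryOnCount here is a
    // compile-time error, so the two numbers cannot be mixed up by accident.
    static String forHold(CheckedBagCount checked) {
        return "Load " + checked.value() + " checked bag(s)";
    }
}
```

The point is not the specific types but that one term now has one meaning everywhere it appears, whether on the check-in screen, at the gate, or on the loading tablet.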

Another shameful variant is when a model is consistent in its terminology, but too lenient in its constraints and relations. This is often the result of using a "standard system" and "configuring it" to the domain. This is the usual way of working with, for example, Enterprise Resource Planning (ERP) products. ERP systems were first created for the manufacturing industry, to plan the usage of machines and raw materials. As factories differ, ERP systems were made highly configurable to suit the needs of each factory. Nowadays, they are often described and sold as "standard systems" that can be configured to handle any domain, whereas under the hood they are still the same flow-of-materials system. Still, this line of business has successfully sold such systems to handle customer complaints, police investigations, and other completely different domains. Unfortunately, successfully selling is one thing, and successfully delivering value is another.

If you want to configure a flow-of-materials system to handle police investigations, you need to make some very non-intuitive abstractions: "A police officer can be seen as a machine, and a report about a burglary can be seen as a pile of raw material which is refined by the police-machine during the investigation." In order to shoehorn one domain into another you need to be less and less specific, less and less precise. The result is often a general "object management system" where everything is an "object." Through the user interface you can update the attributes of the objects, but the system carries very little understanding of what those objects actually represent. Often you can fill in any combination of attributes and relationships. A system that is so lenient is of course prone to mistakes, and as we saw in the case study of the online book store such lenience can even result in security flaws.

	Tip
	It takes both happy businesspeople and happy developers to make a good system. Both groups need to have their basic needs fulfilled.

Obviously it is important to pay attention to the businesspeople. They need to recognize the domain they are used to working in, so we should choose terminology that’s familiar to them. It’s a big mistake to fail to meet the needs of the domain professionals. It is an equally big mistake to fail to meet the needs of the other professionals, the developers. As developers, we need strictness. It is not good enough to say that "most people just have one pet." We need to know if "having a pet" is strictly restricted to having just one.

And this is where it takes some courage to be a developer. We need to ask the questions that make the model strict, without ambiguities. If we ask, "Can there be more than one pet?" we might get the answer, "Oh, that is really unusual." This leaves us with two options. Either we think, "Then I need to allow for a list of pets," or we think, "Just one pet allowed." In the first case we end up writing a system with possibly more complexity than necessary, and sooner or later some weird combination will occur. In the second case we disallow multiple pets, just to get hammered a few months later when it turns out that there are some customers — perhaps customers we get when acquiring another company — who actually have two or more pets. To add insult to injury this can even turn into blame-shifting toward us, with businesspeople saying, "We told you it could happen."

The way out of this dilemma is to actively ask what should be in the model: "Shall we allow for multiple pets, or shall we restrict to having just one?" Deciding whether the unusual multi-pet people should be covered or not isn’t a technical decision — it is a business decision. If we do not have system support for them, then they have to be handled through a separate manual routine.

On the other hand, providing scope for lots of diversity doesn’t come for free either. It is tempting to allow for more and more general models; sooner or later everything is in a many-to-many relation with everything else. But that doesn’t make anything better in the long run. It can be hard to foresee and get an overview of the ramifications of a general model.

Say there is a function that allows one person to "swap pets" with another person. If we also allow for multiple pets per person, then we need to figure out what it means to swap pets. Does that mean person A gets all of the pets of person B, and vice versa? Or do we just swap one pet?
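To see how the modeling decision removes the ambiguity, consider a minimal sketch in which the business has decided on at most one pet per person. That decision, and the details of the classes below, are assumptions made for illustration; a multi-pet model would need a different and more carefully specified operation.

```java
import java.util.Objects;
import java.util.Optional;

// Hypothetical sketch: assumes the business decision "at most one pet per person".
final class Animal {
    private final String name;

    Animal(String name) {
        this.name = Objects.requireNonNull(name, "name");
    }

    String name() {
        return name;
    }
}

final class Person {
    private final String name;
    private Animal pet; // null means "no pet"; exposed only as an Optional

    Person(String name, Animal pet) {
        this.name = Objects.requireNonNull(name, "name");
        this.pet = pet;
    }

    String name() {
        return name;
    }

    Optional<Animal> pet() {
        return Optional.ofNullable(pet);
    }

    // With at most one pet each, swapping has exactly one meaning:
    // each person ends up with whatever the other had, possibly nothing.
    void swapPetWith(Person other) {
        Objects.requireNonNull(other, "other");
        Animal mine = this.pet;
        this.pet = other.pet;
        other.pet = mine;
    }
}
```

If Joe owns Zarphac and Jane has no pet, then after `joe.swapPetWith(jane)` Jane owns Zarphac and Joe has no pet. There is no question of "all pets or just one" because the model does not allow the question to arise.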

If we do not let the model reflect the business domain, we let the businesspeople down. If we do not let the model be strict, we let the development people down. A good model must reflect the business domain and be strict.

Having a model be strict means that we eventually are able to build code using the model as a foundation.

When we design software we make similar choices. We make really simple representations of really complex phenomena. Let us have a look at a schoolbook example of object orientation where lots of attributes and relations are ignored, and only a very narrow view of a person is left:

class Person {   // ❶
    private String name;
    private int age;
    private int shoeSize;
    private Animal pet;

    void growOlder() {
        age++;
    }

    void swapPetWith(Person other) {
        ...
    }
}

❶	The model of the domain concept "person," captured as code

In this design we have removed tons of attributes and behaviors that a person might have, reducing it to four attributes that are essential for us. Leaving out details might seem to make the system poorer, but it gives us a great benefit.

What we gain by leaving out details is the possibility to be precise. In the domain of people, a "person" is a very complex being with complex interactions. But in our model of the domain, a Person is something that has a name, an age, a shoe size, and possibly a pet, together with the ability to grow older and to swap pets. Period. That is exactly what we mean when we use the word "person." What we lose in richness we gain in precision.

Domain — A part of the real world where stuff happens, for example the domain of baggage handling

Domain model — A distilled version of the domain where each concept has a specific meaning

Code — An encoded version of the domain model, written in a programming language

3.1.3 Models capture deep understanding

The strict understanding that we capture in the domain model is deeper than most people think. In fact, the knowledge we need to capture is even deeper than the understanding most domain experts exercise in their day-to-day work, when they handle situations on a case-by-case basis. The reason for this is that we don’t only need enough understanding to work in the domain; we need an understanding deep enough to build a machine. Let us compare this with the challenge of riding a bike.

Most of us are experts at riding a bike. We can prove it by taking a bike and riding it, even in pretty challenging conditions such as on a bumpy road and in windy weather, and perhaps even while carrying a large package under one arm. That takes expertise — compare it with the difficulties faced by a child who is just learning to ride on flat ground on a nice sunny summer day.

This expertise is comparable to the expertise of a domain expert. They know how the domain works. For example, a shipping expert knows how to route cargo containers even when conditions get tough, such as when a container is mistakenly unloaded from a ship and there is no other ship leaving for the same destination for a substantial amount of time. The domain expert will be able to handle even tricky cases, taking each case on its own.

Unfortunately, the understanding we need to write a software system is even deeper. We do not have the luxury of being "at the site" to handle any situation that arises, of being able to assess and improvise to resolve the situation. We are writing a program that should do this, without us (with all our expertise) being there in human form. The challenge we face is not so much like riding a bike, but more like building a bike-riding robot.

Figure 3.5. To build a bike-riding robot, you need really deep understanding of how to make a right turn

If we are to build a bike-riding robot, the understanding of bike riding we will need is much deeper than most experts will have, even professional bicycle messengers or BMX pros. For example: how do you turn right while riding a bike? Think about it for a few seconds — you have probably done it a thousand times. Most people spontaneously answer, "I pull on the right handlebar." Unfortunately, doing so would cause you to fall to the left, down onto the asphalt, due to the centrifugal force. What you actually subconsciously do is turn the handlebars left, causing you to fall to the right for a very short period of time. After a few milliseconds you have tilted right just to the appropriate angle, and then you turn the handlebars to the right — taking you into a right turn. Your angle leaning to the right will be exactly what is needed to compensate for the centrifugal force, and you will turn right, safe and stable. You do this without thinking, and without understanding the subtle kinesthetic mechanics.^[24] But if we want to build a bike-riding robot, this is the depth of understanding that we need to have.

This bike-riding robot story gave us some bad news and some good news. The bad news is that if we look inside the head of a domain expert, we find no ready-to-go model. There is no "true" model inside there. We can’t simply ask the domain experts and expect to get all the answers we need. The good news is that working together with domain experts to craft a model is a fun and rewarding job. Doing so is an iterative process of exploring lots of possible models and choosing one that is appropriate for solving the problems we have at hand.

3.1.4 Making a model means choosing one

One of the usual myths of modeling is that there is a "true" model somewhere, often thought to be embedded inside the head of the domain expert. This is not the case. Making a model involves an active choice between many possible models, and we need to choose the one that best suits our needs.

In Domain-Driven Design we sometimes use the phrase "distilling a model." Let us compare ourselves for a while with a whiskey distiller. The whiskey distiller starts with a large batch of fermented wort — something basically undrinkable — then adds some heat and collects the vapors. The distiller throws away the first part, which contains acetone. The middle part consists of most of the alcohol, some of the water, and the natural flavors that are dissolved therein. This is considered the good part and is kept. The last part consists of some alcohol, a lot of water, and some less attractive flavors. This is also discarded. What is kept is what we call whiskey. Your personal attitude toward whiskey or your tastes might vary, but you get the point. When we distill a model, we throw away some parts of reality and keep others.

The important point here is that there are many ways for a distiller to do their job. They have a choice. Keeping the middle part is a choice, because the objective for the distiller is to get a high-alcohol result with some specific flavors. But the distiller could have made other choices. Had the distiller wanted acetone instead, then the distillation would have looked different. The distiller would have kept the first part and thrown away the rest. In the same way, we can distill different models from the same reality depending on what we intend to use the models for.

Our model describing a person with name, age, shoe size, and pet is just one model. Another model could be to describe a person by date of birth, place of birth, mother’s name, and father’s name. Neither of these two models is more correct than the other. They are different, and they are good for different purposes. If we are keeping a registry for a dog owners' club, the first model is clearly superior to the second. If we are studying how a family has spread across the world through migration, the first model is worthless, and the second excellent.

Figure 3.6. Two different models of people — good for different things
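As a rough sketch, the two models could be encoded as two unrelated types. The type and component names below are our own choices for illustration (records require Java 16 or later); what matters is that neither type is a more "correct" person than the other.

```java
import java.time.LocalDate;
import java.util.Optional;

// Model 1: suited to a registry for a dog owners' club.
// Hypothetical encoding; the pet is reduced to its name for brevity.
record ClubMember(String name, int age, int shoeSize, Optional<String> petName) {}

// Model 2: suited to studying how a family has spread through migration.
// Hypothetical encoding of the attributes listed in the text.
record FamilyMember(LocalDate dateOfBirth, String placeOfBirth,
                    String mothersName, String fathersName) {}
```

The club registry has no use for a place of birth, and the migration study has no use for a shoe size, so each model leaves out what the other considers essential.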

When doing modeling, actively try to find different models that express your domain. Try to find three different models and compare how good they are at expressing your domain problems. Finding a good model is important, because it makes it possible to talk about the domain in an efficient and unambiguous way. A good model forms a language.

3.1.5 The model forms the ubiquitous language

An interesting aspect of modeling is that the model creates a language — the language we speak about the system.

To start with we must realize that when domain experts speak with each other they use a language of their own. This is the domain language. It might sound like English, but it’s in one regard a subset of English — there are a lot of common English words that are not used in this domain-expert language. In another regard it is a superset of English — there are a lot of domain-specific terms and idioms that are not used in common English. What domain experts speak to one another is simply a language that is geared to enabling effective communication.

Take a moment to consider the domain-expert language of system developers. Among ourselves we easily throw around terminology that makes perfect sense to us, but is completely impossible to understand for non-developers; we might "pool the connections" or "make that a strategy." And the domain experts of finance, logistics, or health care have their own lingo too.

If we are building a logistics system, it seems like a logical approach to take the terminology from logistics and just encode that as a software system. This is a wonderful idea, but unfortunately flawed. The language used by logistics experts is not logically consistent. This is not because they are particularly sloppy with terminology. We software developers are equally sloppy with our terminology. Listen in on any two seasoned developers talking, and you will find that they might use the words "object," "instance," and "class" interchangeably, as if they were synonyms. And we know they are not, because when we explain object orientation to beginners we are very careful to distinguish between "classes" and "objects." But when two experts discuss, they can be sloppy because they understand each other anyway and the real discussion is elsewhere, on a higher level.

	Tip
	Don’t turn into the language police, correcting domain experts when they talk to each other. They are allowed to be sloppy, and so are you when talking to your peers.

If we are building a logistics system, wouldn’t it be wonderful if we could form a language where we can talk about the system in a precise way without the risk of misunderstanding? This is exactly what the model is. If logistics experts and developers have jointly decided that a "leg" means a transportation from one place to another using the same vehicle, and that "terminating a leg" means that the cargo is unloaded at the destination, then we can use those terms and make ourselves understood. If we say, "If two transports terminate a leg at the same dock then they can be co-transported on the next leg," then that phrase can be unambiguously understood and the functionality can be implemented.

Figure 3.7. The domain model forms a language in common.

When discussing the functionality of a system, use the words and phrasings that are part of the model. By doing so you will quickly realize whether the functionality can be implemented or not. If it is awkward to express the functionality using the terms from the model, this is a sure sign that it will be awkward to implement. It might be a sign that the model needs to be extended to contain a new term, and the system refactored for consistency.

Using the terminology of Domain-Driven Design, we want the model to become the ubiquitous language when talking about the system. By ubiquitous in this case we mean that the terminology should be used everywhere we talk about the system. The same terms should be used in the user interface, in the manuals, in the requirements or user stories, in the code, in the database tables. There is simply no point in calling something a "quantity" in the user interface, referring to it as an "amount" in the manual, and naming the database column "Volume." Insisting on using the same language across disciplines will help in finding ambiguities that could manifest as bugs or security flaws.

Figure 3.8. The model is ubiquitous: "quantity" is used consistently all over the place.

It is worth pointing out that of course the persistence model might be slightly different from the conceptual model. For example, we might have to split concepts into different tables, and we might need join tables or synthetic keys that are not part of the conceptual model. In the same way, the classes in the code might be slightly different from the terms used in the conceptual model, for implementation-specific reasons. Nevertheless, the understanding we capture is still the same, and we try to use terminology from the ubiquitous language as much as possible when we name our constructs (classes or database tables).

This does not mean we are turning into a language police force. The model, or the domain model language, is the ubiquitous language when talking about the system. The domain experts are still allowed to use their ambiguous domain language amongst themselves, in the same way as developers are allowed to be sloppy about "objects" versus "classes" in discussions with other developers.

The important point about being precise in the ubiquitous language is that when we talk about the system we need to be precise. This is especially important when business experts and developers interact, and the risk of misunderstanding is the highest. In these situations we should insist on using the terminology of the ubiquitous language.

	Tip
	Insist on using the words from the domain model in any requirements document. If something is hard to express in the terminology of the domain model, it’s probably hard to write as software.

It’s also worth pointing out that just because language is ubiquitous does not mean that it is universal. It is the ubiquitous language when talking about this specific system, not for talking about other systems (even other logistics systems). Different systems will have different needs, and different focuses. They will have different models and thus different languages. Each domain model language will be the ubiquitous language within its realm, but not outside. The context for the language has an outer bound. In Domain-Driven Design we refer to this as the bounded context for the model. Within the bounded context each word in the model has a very well-defined meaning, but outside the bounded context words can mean something completely different. We will cover bounded contexts more deeply later on in this chapter.

Understanding more about models and their purpose, we can now move on to some more pragmatic aspects. We need to actually build those models, so some typical building blocks are handy to have.

^[24] Classical Mechanics by Herbert Goldstein is an excellent book on the subject.

3.2 Building blocks for your model

Figure 3.9. Fundamental building blocks of a domain model

In order to express your domain model in code you need a set of building blocks. These building blocks should be well defined, and their purpose is to bring order and structure to complex models. They provide a framework that will allow you to keep your domain logic clearly separated from the rest of your code and guide you through the technical difficulties in doing so. The building blocks from DDD that are of special interest in this book are entities, value objects, and aggregates (see figure 3.9). They are interesting because, used in a certain way, they can also be building blocks for software security. Understanding the meaning of these will help you understand the concepts discussed in the rest of this book. In this section you’ll learn the meaning of each of these terms, the details that define them, and how they are used.

3.2.1 Entities

Every part of your domain model has certain characteristics and a certain meaning. Entities are one type of model object that has some very distinct properties. What makes an entity special is that:

It has an identity that defines it and makes it distinguishable from others.
It has an identity that is consistent during its lifecycle.
It can contain other objects, such as other entities or value objects.
It is responsible for the coordination of operations on the objects it owns.

What this means is that if we need to know if two entities are the same, we look at their identities instead of their attributes. It’s the identity of the entity that defines it, regardless of its attributes, and the identity is consistent over time. During the lifecycle of an entity it may transform and take on many different attributes and behaviors, but its identity will always remain the same. Let’s consider a car, for example. Many attributes of a car can change during its existence. It can change owners, have parts replaced, or be repainted. But it’s still the same car. In this case the identity of the car can be defined by its vehicle identification number (VIN), which is a unique 17-character identifier given to every car when it’s manufactured.

Sometimes an entity’s identity is unique within the system, but sometimes its uniqueness is constrained to a certain scope. In certain cases the identity of an entity can even be unique and relevant outside of the current system. The identity is also what is used to reference an entity from other parts of the model.

Another important trait of an entity is that it’s responsible for the behavior and coordination of the objects it owns, not only in order to provide cohesion but also so that it can maintain its internal invariants.

The ability to identify information in a precise manner, and to coordinate and control behavior, is crucial if you want to avoid security bugs sneaking into your code. In upcoming chapters you will see that this is what makes entities a very important tool for designing secure code.

The continuity of identity

Figure 3.10. The attributes of the customer change, but the identity remains the same.

Sometimes a domain object is defined by its attributes. But sometimes those attributes change over time without implying a change of identity of the domain object. For example, a representation of a customer can be defined by its attributes name, age, and address. Most of these attributes can change during the time the customer exists in the system, but it’s still the same customer with the same trail of history in the system, so its identity should not change. It would quickly become quite messy if the system were to create a new customer every time an address got updated. The customer in this case is not defined by its attributes but rather by its identity and should therefore be modeled as an entity. This way the customer’s identity will stay consistent for as long as the customer exists in the system and regardless of how many state changes it goes through during that existence.

Choosing the right way to define an entity’s identity is essential and should be done carefully. The result of that definition will typically be in the form of an identifier. This means that the identity, and uniqueness, of an entity is determined by its identifier. Sometimes the identifier can be a generated unique ID, and sometimes it can be the result of applying some function to a selected set of attributes of the entity. In the latter case you need to pay careful attention to not include any attributes that may change over time. This can be hard because what attributes stay fixed may change during the evolution of the system. As a rule of thumb, favor generated IDs over an identity based on attributes.

It’s also important to note that what we mean by identity in DDD is not the same concept of identity, or equality, that is built into many programming languages. In Java, for example, object equality is by default the same as instance equality. Unless we explicitly define our own method for equality, two object instances representing the same customer will not be equal. In other words, the identity is not dependent on a specific representation of the entity. Regardless of whether the customer is represented as an object instance, a JSON document, or binary data, it’s still the same entity.
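The difference between instance equality and entity identity can be sketched in Java as follows. This is a minimal illustration, not a definitive implementation: the class names CustomerId and Customer, the method moveTo, and the choice of a random UUID as the generated identifier are all assumptions made for the example.

```java
import java.util.Objects;
import java.util.UUID;

// Hypothetical identifier type wrapping a generated ID.
final class CustomerId {
    private final String value;

    private CustomerId(String value) {
        this.value = Objects.requireNonNull(value);
    }

    // Assumption: a random UUID is unique enough for this system.
    static CustomerId generate() {
        return new CustomerId(UUID.randomUUID().toString());
    }

    // Re-creates an identifier for an already known customer.
    static CustomerId of(String value) {
        return new CustomerId(value);
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof CustomerId && value.equals(((CustomerId) o).value);
    }

    @Override
    public int hashCode() {
        return value.hashCode();
    }
}

// Hypothetical entity: attributes may change, the identity may not.
final class Customer {
    private final CustomerId id; // fixed for the whole lifecycle
    private String name;
    private String address;

    Customer(CustomerId id, String name, String address) {
        this.id = Objects.requireNonNull(id);
        this.name = Objects.requireNonNull(name);
        this.address = Objects.requireNonNull(address);
    }

    CustomerId id() { return id; }
    String address() { return address; }

    // A state change that leaves the identity untouched.
    void moveTo(String newAddress) {
        this.address = Objects.requireNonNull(newAddress);
    }

    // Equality looks at the identity only, never at the attributes.
    @Override
    public boolean equals(Object o) {
        return o instanceof Customer && id.equals(((Customer) o).id);
    }

    @Override
    public int hashCode() {
        return id.hashCode();
    }
}
```

With these definitions, two Customer instances created with the same CustomerId compare as equal even after one of them has changed its address, while a customer with a freshly generated ID is a different entity even if every attribute matches.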

Local, global, or external uniqueness

The identity of an entity is important, but the scope in which its identity is unique can vary. Consider for example our customer entity. A system could use an identifier that is unique not only to the current system but also outside of the system. This is an example of an externally unique identifier. An example of this would be to use a national identifier like those used by many countries as a means to identify their citizens. In the United States this would be the Social Security number. Using an externally defined identifier can, however, come with certain drawbacks. One of them is security implications, as we will see in later chapters.

Figure 3.11. Some entities need to be globally unique.

Perhaps more common than externally unique identifiers are identities made to be unique within the scope of the system or within the boundaries of the current model. Such identifiers can be referred to as being globally unique. An example of this is a unique ID generated by the system when a new customer is created. There can be some interesting technical challenges involved here that are worth pointing out. If the method used to generate IDs can guarantee the uniqueness of each ID, assigning them is a fairly straightforward process. But if you are dealing with distributed systems, generating globally unique IDs in an efficient way can be a technical feat in itself.

Figure 3.12. Some entities only have local identities.

Some entities will be contained within another entity. Because such encapsulated entities are managed by the entity that holds them, it’s usually enough if they have an identity that is only unique inside the owning entity. This identity is said to be local to the owning entity. To go back to our customer entity, say our system is a customer management system for retail stores and every customer belongs to one, and only one, store. In this case the identity only needs to be unique within the store the customer belongs to. Modeling an identity to have local uniqueness can simplify the ID generation function. It also makes it clearer that the responsibility for managing those entities lies with the encapsulating entity.

Keep entities focused

One thing to keep in mind when you are modeling entities is to try to only add attributes and behaviors that are essential for the definition of the entity, or help to identify it. Other attributes and behaviors should be moved out of the entity itself and put into other model objects that can then be part of the entity. These model objects can be other entities or they can be value objects, which we will look at in the next section.

Entities are concerned with the coordination of operations on not only themselves but also the objects they own. This is important because there may be certain invariants, or rules, that apply to a certain operation, and because the entity is responsible for maintaining its internal state and encapsulating its behavior, it must also own the operations on the internals. If the operations were to be moved outside of the entity, this would make it anemic.^[25]

Figure 3.13. Entities coordinate operations.

When boarding an airplane, each passenger must present a boarding card in order to verify that they’re about to enter the correct plane, and to make it easy to keep track of whether anyone is missing when the plane is about to depart. If passengers were allowed to freely walk in and out of the airplane the stewards would need to check all the boarding cards after everyone was seated. This would be a lot more time-consuming and possibly cause confusion if passengers had taken a seat in the wrong plane. With this in mind, it makes sense to control and coordinate the boarding of passengers. The same goes for the software model to handle this. If we have the airplane modeled as an entity with a list of boarded passengers, then other parts of the system shouldn’t be allowed to freely add passengers to that list as it would be too easy to bypass the invariants. A passenger should be added by a method board(BoardingCard) on the airplane entity. This way the airplane entity controls the boarding of passengers and can maintain a valid state. It will only allow boarding of passengers with a boarding card that matches the current flight.
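A minimal Java sketch of this coordination could look as follows. Only the method name board(BoardingCard) comes from the text. The fields, the use of plain strings for flight numbers and passenger names, and the exception type are assumptions made to keep the example short, and the airplane's own identity is left out for the same reason.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Objects;

// Hypothetical boarding card; a fuller model would use value objects
// instead of plain strings.
final class BoardingCard {
    private final String flightNumber;
    private final String passengerName;

    BoardingCard(String flightNumber, String passengerName) {
        this.flightNumber = Objects.requireNonNull(flightNumber);
        this.passengerName = Objects.requireNonNull(passengerName);
    }

    String flightNumber() { return flightNumber; }
    String passengerName() { return passengerName; }
}

final class Airplane {
    private final String currentFlight;
    private final List<String> boardedPassengers = new ArrayList<>();

    Airplane(String currentFlight) {
        this.currentFlight = Objects.requireNonNull(currentFlight);
    }

    // The only way to add a passenger: the entity checks its invariant first.
    void board(BoardingCard card) {
        Objects.requireNonNull(card);
        if (!currentFlight.equals(card.flightNumber())) {
            throw new IllegalArgumentException(
                "Boarding card is for flight " + card.flightNumber()
                    + ", not " + currentFlight);
        }
        boardedPassengers.add(card.passengerName());
    }

    // Read-only view, so callers cannot bypass board() by editing the list.
    List<String> boardedPassengers() {
        return Collections.unmodifiableList(boardedPassengers);
    }
}
```

The design choice worth noticing is that the passenger list is never handed out in a modifiable form. Returning the internal list directly would reopen the door that board(BoardingCard) is meant to guard.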

Entities play a central role in representing concepts in a domain model, but not everything in a model is defined by its identity. Some concepts are instead defined by their values. We use value objects to model such concepts.

3.2.2 Value objects

As you learned in the previous section, an entity is often made up of other model objects. Attributes and behaviors can be moved out of the entity itself and put into other objects. Some of them will be other entities, but many of them will be value objects. The key characteristics of a value object are that:

It has no identity that defines it, but rather it is defined by its value.
It is immutable.
It should form a conceptual whole.
It can contain other value objects.
It can reference entities.
It explicitly defines and enforces important constraints.
It can be used as an attribute of entities and other value objects.
It can be short-lived.

As we will see in upcoming chapters, these properties are part of what gives value objects an important role to play when it comes to writing code that is secure by design.

Defined by its value

Because a value object is defined by its value rather than its identity, two value objects of the same type are said to be equal if they have the same value. In other words, we only care about what they are, not who or which they are.^[26] Value objects have no identity. This is the total opposite of how we define entities.

Say we have the concept of money in our domain model. We can choose to model money as a value object because we don’t distinguish between different coins or bills. A $5 bill is worth just as much as another $5 bill. It is the value of the bill that matters, not which bill it is. Note, however, that whether a concept should be treated as a value object without identity or as an entity with a unique identity is dependent on which context we are currently looking at. If we were modeling the domain of a central bank then we probably would choose to model money as an entity, because in the view of a central bank, which is responsible for not only creating banknotes but also keeping track of them and eventually destroying them, each $5 bill is unique. It is created and given a unique serial number that identifies it so it can be distinguished from other $5 bills. It will remain in use until one day it’s time to destroy it (perhaps to be replaced by a new type of bill, with a new identity). In the view of the central bank, money has an identity and a lifecycle.

Immutable

Because a value object is defined by its value it’s important to make sure that the value cannot be changed — if the value is changed, it’s no longer the same value object. This is why a value object must be immutable. If a value object were mutable, then changing its value could break the invariants of some other object containing the value object. Having value objects be immutable also means that they are safe to pass around as arguments and allows for various technical optimizations, such as reusing objects if memory is scarce, and ease of use in multithreaded solutions.
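As a sketch of what value-based equality and immutability can look like in Java, consider the money example. The class below is an illustration under stated assumptions: representing the amount as a whole number of cents, the currency as a plain string, and the add method are choices made for the example, not something prescribed by the text.

```java
import java.util.Objects;

// Hypothetical Money value object: final class, final fields, no setters.
final class Money {
    private final long cents;      // assumption: amounts kept in minor units
    private final String currency; // assumption: a plain currency code

    Money(long cents, String currency) {
        this.cents = cents;
        this.currency = Objects.requireNonNull(currency);
    }

    // "Changing" the value produces a new object; this one stays as it was.
    Money add(Money other) {
        if (!currency.equals(other.currency)) {
            throw new IllegalArgumentException("Currency mismatch");
        }
        return new Money(Math.addExact(cents, other.cents), currency);
    }

    // Two Money objects are equal when their values are equal.
    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Money)) {
            return false;
        }
        Money other = (Money) o;
        return cents == other.cents && currency.equals(other.currency);
    }

    @Override
    public int hashCode() {
        return Objects.hash(cents, currency);
    }
}
```

Because no method modifies the fields, an instance can be shared between entities or threads without any of them being able to affect the others.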

Conceptual whole

A value object can consist of one or more attributes, or other value objects. It can also reference, but not contain, entities. The reason for this is that the value of an entity can change. If the value object contained the entity, rather than referencing it, then the value object itself would change whenever the entity changed, which in turn would break the immutability of the value object.

When modeling a value object and deciding what it should contain it’s important that it forms a conceptual whole. In other words, it should be a whole value.^[27] This means that a value object should not just be a convenient grouping of attributes, objects, and references, but should form a well-defined concept in the domain model. This is true even if it contains just one attribute. When your value object is modeled as a conceptual whole it carries meaning when passed around and it can uphold its constraints.

Figure 3.14. A value object should form a well-defined concept.

In figure 3.14 you can see two different ways to model a customer and its related attributes. In the model to the left all the attributes have been grouped together in a model object called CustomerInfo. In the model to the right, the attributes have been modeled so that they are grouped to form well-defined concepts. Street, zip, and city have been grouped together in a value object called Address. Phone number and email have been put in a value object called ContactInfo. Age became its own value object. Always strive to model your value objects to form a conceptual whole.

It’s also important to understand that a value object is not just a data struct that holds values. It can also encapsulate (sometimes nontrivial) logic associated with the concept it represents. For example, a value object representing a GPS^[28] point could have a method that calculates the distance between itself and another GPS point using nontrivial numerical calculations.
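To make the GPS example concrete, here is one way such a value object might look in Java. The text does not say which calculation is meant, so this sketch assumes the haversine formula on a spherical Earth with a mean radius of 6,371 km. The class name GpsPoint and the method name distanceInKmTo are hypothetical.

```java
// Hypothetical GPS point value object carrying its own domain logic.
final class GpsPoint {
    // Assumption: spherical Earth with this mean radius.
    private static final double EARTH_RADIUS_KM = 6371.0;

    private final double latitude;  // degrees, -90 to 90
    private final double longitude; // degrees, -180 to 180

    GpsPoint(double latitude, double longitude) {
        // The negated range checks also reject NaN.
        if (!(latitude >= -90.0 && latitude <= 90.0)) {
            throw new IllegalArgumentException("Latitude out of range");
        }
        if (!(longitude >= -180.0 && longitude <= 180.0)) {
            throw new IllegalArgumentException("Longitude out of range");
        }
        this.latitude = latitude;
        this.longitude = longitude;
    }

    // Great-circle distance using the haversine formula.
    double distanceInKmTo(GpsPoint other) {
        double phi1 = Math.toRadians(latitude);
        double phi2 = Math.toRadians(other.latitude);
        double halfDeltaPhi = (phi2 - phi1) / 2.0;
        double halfDeltaLambda = Math.toRadians(other.longitude - longitude) / 2.0;
        double a = Math.sin(halfDeltaPhi) * Math.sin(halfDeltaPhi)
            + Math.cos(phi1) * Math.cos(phi2)
                * Math.sin(halfDeltaLambda) * Math.sin(halfDeltaLambda);
        // Clamp to 1.0 to guard against rounding just above the asin domain.
        return 2.0 * EARTH_RADIUS_KM * Math.asin(Math.min(1.0, Math.sqrt(a)));
    }
}
```

Keeping the calculation inside the value object means callers never have to handle raw latitude and longitude numbers to compute a distance.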

Defines and enforces invariants

Figure 3.15. Value objects should enforce their own invariants.

Let’s say we have a value object Age that has one integer value, as seen in figure 3.15. In Java, for example, an integer can by default take the values from -2³¹ to 2³¹-1. You would probably not consider that range to be typical for a person’s age. Therefore, you should model age as a value object with proper constraints, or invariants, so that its definition becomes clear. You could during your modeling come to the conclusion that the age of a person should be between 0 and 150 years.^[29] Or maybe your domain does not allow for young children, so the minimum age might be 18. Whatever range you choose, it will be a lot stricter than allowing the full range of a Java integer.

These types of invariants should be enforced within the value object itself and not be put into other domain objects or utility methods.
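A minimal Java sketch of an Age value object that enforces its own range might look like this. The bounds 0 and 150 are taken from the discussion above, while the constant names, the accessor, and the choice of IllegalArgumentException are assumptions made for the example.

```java
// Hypothetical Age value object that cannot exist in an invalid state.
final class Age {
    static final int MIN_YEARS = 0;
    static final int MAX_YEARS = 150;

    private final int years;

    Age(int years) {
        // The invariant is checked once, at construction, inside the object.
        if (years < MIN_YEARS || years > MAX_YEARS) {
            throw new IllegalArgumentException(
                "Age must be between " + MIN_YEARS + " and " + MAX_YEARS
                    + " but was " + years);
        }
        this.years = years;
    }

    int years() { return years; }

    @Override
    public boolean equals(Object o) {
        return o instanceof Age && years == ((Age) o).years;
    }

    @Override
    public int hashCode() {
        return Integer.hashCode(years);
    }
}
```

Any code that receives an Age can rely on the range without checking it again, which is the point of keeping the invariant inside the value object rather than in a utility method.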

It is also worth noting that the types of invariants we’re talking about are not the types of checks that are commonly referred to as validation. Validation checks are typically performed when asserting that a domain object is valid for a certain operation (that is, that it’s possible to perform a specific action on it). An example of validation would be to check if an order is ready to be sent to the shipping system. The validation could include verifying that the order has been paid for and that it contains the necessary address information. This type of validation often involves multiple domain objects and is generally performed as late as possible.^[30]

3.2.3 Aggregates

When dealing with a model object that has a lifecycle, such as an entity, it’s important to make sure that its state remains valid throughout its entire lifecycle. This can require quite a bit of logic to implement and may involve code to handle locking mechanisms to support concurrent operations, and managed updates to persistent storage. Regardless of whether the entity is being persisted or not, the state change can be said to take place within a transaction.^[31] Transaction management is usually feasible when it comes to a single entity, but in reality your domain model is typically not that simple and involves many connections between various entities and value objects. This means the consistency you need to manage spans over multiple domain objects. Once faced with such a situation the question quickly arises of how to manage transactions that span multiple elements in the model. This is where the aggregate comes in.

An aggregate is a conceptual boundary that we use to group parts of the model together. The purpose of this grouping is to allow us to treat the aggregate as a unit during state changes. In other words, it is the boundary within which transactions must be managed. The boundary that is defined by the aggregate is not randomly chosen, or chosen from a technical point of view. It is carefully selected based on deep insight into the model. An aggregate must follow a strict set of rules for it to work as intended and to fulfill its purpose. These rules, as put forward by Eric Evans,^[32] are:

Every aggregate has a boundary and a root.
The root is a single, specific entity contained in the aggregate.
The root is the only member of the aggregate that objects outside of the boundary can hold references to. Thus:
- The root has global identity.
- The root controls all access to the objects within the boundary.
- Entities other than the root have local identity. Their identities don’t have to be known outside of the aggregate.
- The root may pass references to internal entities to other objects, but those references can only be used transiently and can never be held onto.
- The root may pass copies of value objects to other objects. Once copied, they can be used freely by others.
Invariants between the members of the aggregate are always enforced within each transaction.
Invariants that span multiple aggregates cannot be expected to be consistent all the time, but they can eventually become consistent.
Objects within the aggregate can hold references to the roots of other aggregates.

This is quite a comprehensive set of rules, and you might want to go through them again and think about their meaning and the implications each of these will bring to the design of not only your model but also your code. There are, however, a couple of traits that we would like to expand on a little to make things clearer.

The aggregate is a conceptual boundary, and it contains an entity that is the root of the aggregate. In general, when implementing aggregates the root entity and the aggregate will be the same object, so reasoning about them might become easier if you think of them as being the same.

The root of the aggregate is the only point of reference outside of the boundary. The root also controls all access to everything within the boundary. This makes the root the perfect place for upholding all the invariants that span across the objects within the boundary. And it can’t be bypassed as long as you stick to the rules on how to model aggregates.

Repositories

We will not delve into detail about repositories, but you can think of them as technology-agnostic storage for aggregate roots. You can put aggregate roots into a repository and then get them back at a later time. You can also use repositories to delete previously stored roots if your model supports that.

Another implication of the root being the only point of reference is that the root is the only thing that can be accessed through a repository (see sidebar). This, again, is a way to control how an aggregate is accessed and to make sure an entity within the aggregate cannot be manipulated directly by objects outside of the aggregate.

Aggregates, with their boundaries and upholding of consistent state, will turn out to be of importance when you start looking at how to use them to drive security in your code.

A simple aggregate

Figure 3.16. The company domain model

Let’s take a look at an example of how we could model an aggregate. Our example model consists of a company and its employees. We make the company an entity because it has a clear identity, and because our system can handle many companies it also needs to be globally identifiable. The company has a name, but the name is merely a value, so we make it a value object. It also has employees who work at the company. An employee definitely has an identity, so it is also modeled as an entity. An employee always belongs to a company, so it becomes a child entity of the company. Each employee will have a specific role, or position, but that is also a value, so it becomes a value object. The resulting model can be seen in figure 3.16.

After discussing the nature of an employee together with the domain experts, we realize that it doesn’t have to be identifiable outside of the company. The employee object can have local identity. We also realize that when roles are assigned to employees within the company there are certain roles that can only be held by one person at a time. There can, for example, only be one CTO at any given point. The same goes for many other roles. To uphold these required invariants the company entity should control the assignment of roles to employees.

Figure 3.17. The company modeled as an aggregate

This leads us to the insight that the company, together with its child objects, should be modeled as an aggregate. We make the company the root of the aggregate, and you can see the result in figure 3.17. This means that the company, which is globally identifiable, can be referenced and looked up by others, but the only way to get to an employee is to go through the aggregate root, the company. The same goes for assigning new roles to employees. The role assignment is handled by a method on the company. Since all operations on the aggregate are controlled by the root, it becomes a straightforward task to uphold the invariants regarding the employees.
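The company aggregate can be sketched in Java along the following lines. This is an illustration under several assumptions: the method names hire, assignRole, and roleOf are hypothetical, employees get a simple counter as their local identity, the company name is kept as a plain string instead of a value object, and CTO is the only role treated as exclusive.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical role value; only CTO is exclusive in this sketch.
enum Role { CTO, DEVELOPER }

// Aggregate root: globally identifiable, controls all access to employees.
final class Company {

    // Child entity with local identity, invisible outside the aggregate.
    private static final class Employee {
        private final int localId;
        private final String name;
        private Role role;

        Employee(int localId, String name, Role role) {
            this.localId = localId;
            this.name = Objects.requireNonNull(name);
            this.role = Objects.requireNonNull(role);
        }
    }

    private final String id;   // global identity of the root
    private final String name; // simplified; a value object in a fuller model
    private final Map<Integer, Employee> employees = new HashMap<>();
    private int nextLocalId = 1;

    Company(String id, String name) {
        this.id = Objects.requireNonNull(id);
        this.name = Objects.requireNonNull(name);
    }

    String id() { return id; }

    // Returns the employee's local ID, which only has meaning in this company.
    int hire(String employeeName, Role role) {
        ensureRoleAvailable(role, -1);
        int localId = nextLocalId++;
        employees.put(localId, new Employee(localId, employeeName, role));
        return localId;
    }

    // Role assignment goes through the root so the invariant cannot be bypassed.
    void assignRole(int employeeId, Role role) {
        Employee employee = employees.get(employeeId);
        if (employee == null) {
            throw new IllegalArgumentException("No such employee: " + employeeId);
        }
        ensureRoleAvailable(role, employeeId);
        employee.role = Objects.requireNonNull(role);
    }

    Role roleOf(int employeeId) {
        Employee employee = employees.get(employeeId);
        if (employee == null) {
            throw new IllegalArgumentException("No such employee: " + employeeId);
        }
        return employee.role;
    }

    // Invariant: at most one employee holds the CTO role at any time.
    private void ensureRoleAvailable(Role role, int exceptEmployeeId) {
        if (role != Role.CTO) {
            return;
        }
        for (Employee e : employees.values()) {
            if (e.role == Role.CTO && e.localId != exceptEmployeeId) {
                throw new IllegalStateException("The company already has a CTO");
            }
        }
    }
}
```

Making Employee a private nested class is one way to let the compiler enforce the rule that nothing outside the aggregate can hold a reference to an internal entity. Callers only ever see the root and the local IDs it hands out.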

In this section you have learned the basics about the fundamental building blocks used to create domain models in DDD. We have gone through a lot of material so far, and it might take some time to digest all this information properly. But if you stay with us you will learn about bounded contexts, the next important concept from DDD that you need to be familiar with before you get into the remaining chapters of this book.

^[25] Fowler M., "AnemicDomainModel" (2003), http://www.martinfowler.com/bliki/AnemicDomainModel.html

^[26] Evans E., Domain-Driven Design: Tackling Complexity in the Heart of Software (Addison-Wesley Professional, 2004), p. 98

^[27] Cunningham W., "The CHECKS Pattern Language of Information Integrity: 1. Whole Value" (1994), http://c2.com/ppr/checks.html#1

^[28] Global Positioning System, a satellite-based navigation system that provides accurate positioning on Earth

^[29] 150 might be a bit of a stretch, but people are living longer and longer, so you may want to future-proof your model.

^[30] Cunningham W., "The CHECKS Pattern Language of Information Integrity: 6. Deferred Validation" (1994), http://c2.com/ppr/checks.html#6

^[31] Again, we’re not talking database transactions here but logical state transactions.

^[32] Evans E., Domain-Driven Design: Tackling Complexity in the Heart of Software (Addison-Wesley Professional, 2004)

3.3 Bounded contexts

Another interesting concept is the bounded context pattern, which defines the applicability of the domain model. As it turns out, it’s not only essential in DDD, it’s also important from a security perspective. Some complex attacks are easier to understand when using bounded contexts as a basis for the analysis. To see this, you need to fully understand the concept, and therefore we’ll start by diving into the semantics of the ubiquitous language.

3.3.1 Semantics of the ubiquitous language

Merriam-Webster defines ubiquitous as "existing or being everywhere at the same time." In DDD this translates to a language spoken everywhere at all times by everyone to promote clarity and common understanding — a ubiquitous language as illustrated in figure 3.18.

Figure 3.18. The ubiquitous language is present everywhere, at all times, to promote clarity and common understanding.

It’s easy to think that "everywhere at all times by everyone" means there should be a unified language with terms, operations, and concepts that capture the entire business, but that’s a huge misunderstanding. Anyone who has tried this knows it’s doomed to fail simply because it’s too complex. And the reason is semantics. A term or concept may have the same name in various parts of the business, but each usage may have a different meaning. For example, consider the word "package." If you ask someone in the Shipping department they will say that it’s a box, while in the IT department, they’ll say that it’s a logical grouping of files in the codebase. In other words, the term "package" is used by both departments, but with different semantics. Trying to capture this in a unified language is probably not a good idea because it requires a new term that captures both meanings. The obvious conclusion is to allow two coexisting languages, instead of a unified language with imprecise semantics. With this in mind, let’s see how the ubiquitous language relates to the model and the bounded context.

3.3.2 The relationship between language, model, and bounded context

The relationship between language, model, and bounded context becomes clear when you see it from a semantic point of view. A model is an abstraction of the domain in which concepts, relationships, and terms of the ubiquitous language are found. This makes the language and model tightly coupled, not only through the terms and relationships but also through semantics: a concept found in the model must have the same meaning in the language and vice versa.

A similar relationship is found between the model and the context in which the model applies. As long as the meanings of terms, operations, and concepts remain the same, the model holds. But as soon as the semantics change, the model breaks and the boundary of the context is found. Realizing this is important because this is where the meaning of a concept could change simply because it crossed the boundary. That means that everything within the context adheres to the semantics of the model, but outside the boundary, the same concept may have a different meaning. This certainly makes sense, but it feels a bit theoretical, so let’s dive into an example where we define the ubiquitous language, create a model, and use it to identify the semantic boundary of a context.

	The semantic boundary of a context is interesting from a security perspective
	Data crossing a semantic boundary is of special interest from a security perspective because this is where the meaning of a concept could implicitly change, which could open the door to security weaknesses.

3.3.3 Identifying the bounded context

When identifying a bounded context, a good starting point is to analyze the ubiquitous language. For example, let’s consider the following conversation between a developer and a domain expert in the Shipping department:

Developer: "So, what characterizes an order?"

Expert: "Well, an order contains products that are sellable and non-sellable items."

Developer: "Not sure I understand. What do you mean by non-sellable products?"

Expert: "Non-sellable products are items that are bundled with sellable products when shipped as a package to their destination."

Developer: "Oh, I see. So non-sellable items are products without a price?"

Expert: "No no, all products have a value, but bundled products have a price of zero so they get included for free."

Developer: "Hmm, OK, I guess that makes sense."

Up to this point, lots of confusion exists, but it’s possible to identify significant terms and manifest them as a raw version of a domain model, as seen in figure 3.19.

Figure 3.19. Raw domain model

One of the core principles of the ubiquitous language is to avoid ambiguities, because these create a lot of confusion and misunderstandings. We see this in figure 3.19, where the model has lots of ambiguity and duplicated concepts. Let’s get back to the conversation and see how the language and model evolve:

Developer: "I’m a bit confused about the terminology. Could we agree on just using some of the terms?"

Expert: "Sure, any particular ones in mind?"

Developer: "It seems as if we only have products. Is it OK to stop using words like items, things, non-sellable, sellable?"

Expert: "OK, that makes sense. So from now on we use the term product for all of them."

Developer: "Included and bundled mean the same thing, right?"

Expert: "Yes, so let’s only use bundled."

Developer: "What about price and value?"

Expert: "Same thing. Let’s use price."

Developer: "Why do we need to care whether a product is free or not?"

Expert: "You’re right. We don’t. Let’s not use free."

This "distillation" process results in a much tighter language and a refined domain model, as seen in figure 3.20.

Figure 3.20. A refined domain model with less redundancy

But sometimes distilling also uncovers missing terms, and this is the case here as well:

Developer: "So, an order may have one or more products?"

Expert: "Yes, that’s correct. But an order without products isn’t much of a package."

Developer: "Package?"

Expert: "Oh, sorry. Yes, a package is what we call the box in which we ship everything."

Developer: "OK, makes sense. So how do we know how many products we need to ship in a package?"

Expert: "Well, the quantity of each product is specified in the order."

Developer: "Ah, I see. Let’s introduce quantity and package in our ubiquitous language and add them to the model."

Figure 3.21. Final domain model
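
As a rough illustration, the terms agreed on in the conversation could be captured in code along the following lines. This is only a sketch in Python, not the model from the figure: the class names, the OrderLine grouping, the field types, and the rule that a quantity is at least one are our assumptions. The conversation itself only establishes the terms order, product, price, bundled, quantity, and package, and that an order has one or more products.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Product:
    name: str
    price: Decimal         # "price" is the agreed term; "value" is no longer used
    bundled: bool = False  # "bundled" is the agreed term; "included" is no longer used


@dataclass(frozen=True)
class OrderLine:
    # Hypothetical grouping of a product with the quantity specified in the order
    product: Product
    quantity: int

    def __post_init__(self):
        if self.quantity < 1:  # assumed rule, not stated by the expert
            raise ValueError("quantity must be at least 1")


@dataclass(frozen=True)
class Order:
    lines: tuple  # one or more OrderLine objects

    def __post_init__(self):
        if not self.lines:
            raise ValueError("an order must contain at least one product")


@dataclass(frozen=True)
class Package:
    # The box in which everything in an order is shipped
    order: Order

    def contents(self) -> dict:
        """Return how many of each product go into the package."""
        return {line.product.name: line.quantity for line in self.order.lines}


phone = Product("phone", Decimal("299"))
cable = Product("cable", Decimal("0"), bundled=True)
package = Package(Order((OrderLine(phone, 1), OrderLine(cable, 2))))
```

Note that none of the discarded terms (item, thing, sellable, non-sellable, included, value, free) appear in the code. If they did, the code would no longer speak the ubiquitous language.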

After this last revision, as seen in figure 3.21, the developer and expert are quite confident that they share the same view and understanding of an order. But how far does the model "reach" in the organization? When does the model no longer hold? Determining this is the key to finding the boundary of the context. The developer starts asking around, and everyone in the Shipping department seems to have a common understanding. But when talking to an expert in the Finance department, the model suddenly breaks:

Developer: "Could you please have a look at our model of an order?"

Finance Expert: "Sure. The model makes sense, but you seem to miss a lot of important concepts."

Developer: "Really? Please explain."

Finance Expert: "The payment information and due date are missing. Also, the reserved amount doesn’t seem to be represented."

Developer: "Aha. We seem to have a different definition of an order. Thanks for your time."

	The context boundary is found when the model no longer holds
	As soon as the semantics of the model no longer hold, the boundary of the context is found.

By trying to find where the model’s semantics no longer held, the developer quickly realized that an order in the Finance domain is something different from an order in the Shipping domain. And this tells us where the context boundary is. This can be illustrated as two separate contexts, where an order is present in both but with a different meaning, as seen in figure 3.22. But what happens if we need to communicate and pass an order between the contexts? Are there any other concepts that are similar but with different semantics? This brings us to the next topic: interactions between contexts.

Figure 3.22. Two contexts with the same concept but with different semantics

3.4 Interactions between contexts

The context boundary is especially interesting from a security perspective when you start thinking about interactions between contexts. When data crosses a boundary, it implicitly accepts the semantics of the receiving context’s ubiquitous language and model. This implies that every time data crosses a boundary and no action is taken to translate it, a potential security vulnerability opens up. Although this may be obvious, problems of this kind are surprisingly common, and the root cause may ironically be the attempt to satisfy DRY (Don’t Repeat Yourself). Andrew Hunt and David Thomas^[33] defined the principle as:

	Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.
	-- Andrew Hunt and David Thomas, The Pragmatic Programmer (Addison-Wesley, 2003)

Many interpret this as avoiding "syntactic duplication" (for example, the result of copying and pasting code), but the principle is really about semantics. And this brings us back to the ubiquitous language. The ubiquitous language requires the semantics of the domain model to be unambiguous throughout the context. But if we apply a syntactic interpretation of DRY, how you share data between contexts suddenly becomes a technical matter rather than a semantic one. And this is a huge problem, because if models are shared to reduce syntactic duplication but certain concepts mean different things, it opens the door to all sorts of craziness, including security weaknesses, as discussed in chapter 12. To illustrate this, let’s revisit the example with a Shipping and a Finance context, but this time with a shared model to reduce syntactic duplication.

3.4.1 Sharing a model in two contexts

Both Shipping and Finance use concepts such as order, product, and price. Having a shared model is indeed compelling, as it minimizes duplication. But to do this, we need a few more concepts from the Finance domain, as seen in figure 3.23.

Figure 3.23. A unified order model that is shared between the Shipping context and the Finance context

At first, having a unified model is a great success: the only apparent "downside" is a rich model with some unused concepts. But let’s see what happens when a new business requirement is introduced in the Shipping context:

Expert: "To simplify customs declarations for international shipments, we need to list the actual value of a package."

Developer: "OK. So how should we treat bundled products?"

Expert: "Well, previously we made the product free by faking the price by setting it to zero, but that’s no longer OK."

Developer: "Right, so is it OK to just remove the faked price?"

Expert: "Yes, the sum of all prices is the actual value of the package, so that should work."

Developer: "And then we deduct the bundle prices from the total, right?"

Expert: "No, the reserved amount is what’s charged by Finance, so we don’t need to deduct anything."

Developer: "OK, that makes sense. I’ll only remove the faked price then."

Making the changes doesn’t require much effort, and initially everything works fine, but after a while, strange behaviors start to emerge in Finance. For some reason, every now and then the invariant reserved amount ≥ sum of all prices is false. This seems like a minor problem because it only happens when you have bundled products, but it does in fact start a full-blown security investigation. A violation of the invariant is the same as order tampering, and that’s a serious security problem!

The investigation does not, of course, reveal any security breach, but it’s interesting how a simple change could lead to all of this. Analyzing it further shows that the root cause is in fact having one model to represent two conceptual views of an order. The invariant reserved amount ≥ sum of all prices only makes sense in the Finance context, but as a direct consequence of sharing a model, the Shipping context is forced to respect the invariant even though it doesn’t make sense there. Obviously this isn’t a good thing, because it prevents each context from being independent and master of its own model. But if we don’t share a model, how do we know which concepts need special attention when communicating across context boundaries? The solution is to draw a context map.
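
To make the failure concrete, here is a minimal sketch in Python of how the shared model could behave before and after the Shipping change. The class SharedOrder, its fields, and the amounts are hypothetical and only serve the illustration. The one thing taken from the scenario is the invariant itself: reserved amount ≥ sum of all prices.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class SharedOrder:
    """Hypothetical unified order used by both Shipping and Finance."""
    prices: tuple             # one price per product in the order
    reserved_amount: Decimal  # Finance concept: the amount reserved on the account

    def finance_invariant_holds(self) -> bool:
        # Invariant from the Finance context: reserved amount >= sum of all prices
        return self.reserved_amount >= sum(self.prices, Decimal("0"))


# Before the change: the bundled product carries the faked price of zero.
before = SharedOrder(
    prices=(Decimal("100"), Decimal("0")),
    reserved_amount=Decimal("100"),
)

# After the change in Shipping: the bundled product carries its actual price,
# but the reserved amount is still what Finance charges the customer.
after = SharedOrder(
    prices=(Decimal("100"), Decimal("15")),
    reserved_amount=Decimal("100"),
)

print(before.finance_invariant_holds())  # True
print(after.finance_invariant_holds())   # False
```

Nothing in the Shipping change looks dangerous in isolation. The invariant breaks only because price now means "actual value of the product" to Shipping while Finance still reads it as "amount to charge."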

3.4.2 Drawing a context map

A context map is a conceptual view of how different contexts interact. This could be a graphical drawing, a description in text, or simply an understanding communicated between teams. Regardless of how it’s manifested, the key point is that it helps identify concepts that cross semantic boundaries. An incorrect mapping is often the root cause of misunderstandings that may become exploitable. Identifying the context boundary is therefore of great importance, but it may be easier said than done. If you don’t know where to start, a good strategy is to use Conway’s Law as a starting point.^[34] Mel Conway published the paper "How Do Committees Invent?" in 1968 with the thesis:

	Any organization that designs a system (defined broadly) will produce a design whose structure is a copy of the organization’s communication structure.
	-- Mel Conway, "How Do Committees Invent?"

The implication of the thesis is that the communication structures that exist in the organization are reflected in the architectural design of the system. This also seems to apply to how bounded contexts are defined. As many teams tend to organize around subsystems, bounded contexts often follow the same rules. Laying out the teams is therefore a good starting point when trying to identify the bounded contexts for your map.

To illustrate what a graphical representation of a context map might look like, we need to revisit the Shipping and Finance contexts one more time. To gain deeper insight, a good starting point is to draw a simple, high-level picture of the system interactions when a new order is processed, as seen in figure 3.24.

Figure 3.24. Interaction between Finance and Shipping

Finance receives a new order.
The total value of the order is reserved on the account specified by the payment information.
Finance sends the order to Shipping for processing.
Shipping processes the order and ships it.
Shipping notifies Finance with an updated status.
Finance completes the financial transaction.

The interaction flow diagram is easily converted into a context map where it becomes clear that the Shipping context is downstream of the Finance context, as seen in figure 3.25. This may seem obvious, but the mere understanding that a Finance order must be translated to a Shipping order makes a huge difference. The relationship makes it clear that explicit mapping is required and that communication between the teams is needed to ensure success.

Figure 3.25. The Shipping context is downstream of the Finance context.
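
One way to picture the explicit mapping is as a small translation step at the boundary. The following Python sketch is an illustration under our own assumptions: the classes FinanceOrder and ShippingOrder, their fields, and the function to_shipping_order are hypothetical names, and we assume the Finance order carries the product quantities that Shipping needs.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


@dataclass(frozen=True)
class FinanceOrder:
    """An order as Finance understands it (upstream context)."""
    order_id: str
    payment_information: str
    due_date: date
    reserved_amount: Decimal
    quantities: tuple  # pairs of (product name, quantity)


@dataclass(frozen=True)
class ShippingOrder:
    """An order as Shipping understands it (downstream context)."""
    order_id: str
    quantities: tuple  # pairs of (product name, quantity)


def to_shipping_order(order: FinanceOrder) -> ShippingOrder:
    """Translate at the boundary so only Shipping concepts cross it."""
    if not order.quantities:
        raise ValueError("Shipping cannot process an order without products")
    for name, quantity in order.quantities:
        if quantity < 1:
            raise ValueError(f"invalid quantity for {name!r}: {quantity}")
    # Payment information, due date, and reserved amount stay in Finance.
    return ShippingOrder(order_id=order.order_id, quantities=tuple(order.quantities))


finance_order = FinanceOrder(
    order_id="A-1",
    payment_information="account 1234",
    due_date=date(2024, 1, 31),
    reserved_amount=Decimal("100"),
    quantities=(("phone", 1), ("cable", 2)),
)
shipping_order = to_shipping_order(finance_order)
```

Because the translation is a deliberate step, each context keeps its own definition of an order, and a change in one model surfaces in the mapping instead of silently changing meaning on the other side.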

You’ve now gained a conceptual view of why bounded contexts are important and how context maps are created, but we still need to show how they relate to security. As you’ll see, several examples in the upcoming chapters become easier to understand when you have bounded contexts in mind: for example, failure handling in chapter 7, XSS attacks in chapter 9, and monoliths in chapter 12. There are a few more, so please stay tuned. In the next chapter, you’ll learn about code constructs that promote security by using ideas from this chapter combined with concepts from other fields. As a result, you’ll be able to immediately apply them in your everyday work and learn how to spot exploitable weaknesses in your existing codebase.

^[33] Hunt A. and Thomas D., The Pragmatic Programmer (Addison-Wesley, 2003)

^[34] http://www.melconway.com/Home/Conways_Law.html

3.5 Summary

In this chapter you learned that:

Building domain models is a good way to promote deep learning about the domain.
A domain model should be a strict and unambiguous representation of the domain that captures only the most important aspects.
When creating a domain model you make a choice among many possible models.
The domain model forms a language for communicating about the system.
Entities, value objects, and aggregates are the basic building blocks for your domain model.
Entities have an identity that is consistent during their lifecycle and can contain other entities or value objects.
The uniqueness of entities always has a scope, and that scope depends on your model.
A value object does not have an identity but rather is defined by its value.
A value object must always be immutable and should form a conceptual whole.
An aggregate is a conceptual boundary that groups together other model objects and is responsible for upholding invariants among those objects. It always has an aggregate root, and in code that root is typically the same as the aggregate.
The aggregate root has global identity because this is the only part of the aggregate that other parts of the model can hold a reference to.
The ubiquitous language is spoken by everyone on the team, including domain experts, to ensure a common understanding.
The domain model is bound by the semantics of the ubiquitous language.
The bounded context is the context in which the semantics of the model hold. As soon as the semantics change, the model breaks and the boundary of the context is found.
Using Conway’s Law is a good starting point when trying to find the boundary of a context.
Data crossing a semantic boundary is of special interest from a security perspective because this is where the meaning of a concept could implicitly change.
