1 First piece of the puzzle

This chapter covers

  • Modularity and how it shapes a system
  • Java’s inability to enforce modularity
  • How the new module system aims to fix these issues

We’ve all been in situations where the software we’ve deployed refuses to work the way we want it to. There are myriad possible reasons, but one class of problems is so obnoxious that it earned a particularly evocative moniker: JAR hell. Classic symptoms of JAR hell are misbehaving dependencies: some may be missing but, as if to make up for it, others may be present multiple times, likely in different versions. This is a surefire way to crash or, worse, subtly corrupt running applications.

The root problem underpinning JAR hell is that we see JARs as artifacts with identities and relationships to one another, whereas Java sees JARs as simple class-file containers without any meaningful properties. This difference leads to trouble.

One example is the lack of meaningful encapsulation across JARs: all public types are freely accessible by all code in the same application. This makes it easy to inadvertently depend on types in a library that its maintainers considered implementation details and never polished for public use. They likely hid the types in a package called internal or impl, but that doesn’t stop us from importing them anyway.

Then, when the maintainers change these internals, our code breaks. Or, if we hold enough sway in the library’s community, the maintainers may be forced to leave untouched code they consider internal, preventing refactoring and code evolution. Lacking encapsulation leads to reduced maintainability—for libraries as well as for applications.

Less relevant for everyday development, but even worse for the ecosystem as a whole, is that it’s hard to manage access to security-critical code. In the Java Development Kit (JDK), this led to a number of vulnerabilities, some of which contributed to Java 8’s delayed release after Oracle bought Sun.

These and other problems have haunted Java developers for more than 20 years, and solutions have been discussed for almost as long. Java 9 was the first version to present one that’s built into the language: the Java Platform Module System (JPMS), developed since 2008 under the umbrella of Project Jigsaw. It allows developers to create modules by attaching metainformation to JARs, thus making them more than mere containers. From Java 9 on, the compiler and runtime understand the identity of and relationship between modules and can thus address problems like missing or duplicate dependencies and the lack of encapsulation.

But the JPMS is more than just a Band-Aid. It comes with a number of great features we can use to develop more beautiful, maintainable software. Maybe the biggest benefit is that it brings every individual developer and the community at large face-to-face with the essential concept of modularity. More knowledgeable developers, more modular libraries, better tool support—we can expect these and more from a Java world where modularity is a first-class citizen.

I recognize that many developers will skip past multiple versions of Java when upgrading. For example, it’s common to go straight from Java 8 to Java 11. I’ll call attention to differences between Java 9, 10, or 11 where they occur. Most of the material in the book is the same for all versions of Java, starting with Java 9. In some cases, I write Java 9+ as shorthand for Java 9 or later.

This chapter starts in section 1.1 by exploring what modularity is all about and how we commonly perceive a software system’s structure. The crux is that, at a specific level of abstraction (JARs), the JVM doesn’t see things like we do (section 1.2). Instead, it erases our carefully created structure! This impedance mismatch causes real problems, as we’ll discuss in section 1.3. The module system was created to turn artifacts into modules (section 1.4) and solve the issues arising from the impedance mismatch (section 1.5).

1.1 What is modularity all about?

How do you think about software? As lines of code? As bits and bytes? UML diagrams? Maven POMs?

I’m not looking for a definition but an intuition. Take a moment and think about your favorite project (or one you’re being paid to work on): What does it feel like? How do you visualize it?

1.1.1 Visualizing software as graphs

I see code bases I’m working on as systems of interacting parts. (Yes, that formal.) Each part has three basic properties: a name, dependencies on other parts, and features it provides to other parts.

This is true on every level of abstraction. On a very low level, a part maps to an individual method, where its name is the method’s name, its dependencies are the methods it calls, and its features are the return value or state change it triggers. On a very high level, a part corresponds to a service (did anyone say micro?) or even a whole application.

Imagine a checkout service: as part of an e-shop, it lets users buy the goods they picked out. In order to do that, it needs to call the login and shopping cart services. Again we have all three properties: a name, dependencies, and features. It’s easy to use this information to draw the diagram shown in figure 1.1.

Figure 1.1 If the checkout service and its dependencies are jotted down, they naturally form a small graph that shows their names, dependencies, and features.
c01_01.png

We can perceive parts on different levels of abstraction. Between the extremes of methods and entire applications, we can map them to classes, packages, and JARs. They also have names, dependencies, and features.

What’s interesting about this perspective is how it can be used to visualize and analyze a system. If we imagine, or even draw, a node for every part we have in mind and then connect them with edges according to their dependencies, we get a graph.

This mapping comes so naturally that the e-shop example already did it, and you probably didn’t notice. Take a look at other common ways to visualize software systems, such as those shown in figure 1.2, and graphs pop up everywhere.

Figure 1.2 In software development, graphs are ubiquitous. They come in all shapes and forms: for example, UML diagrams (left), Maven dependency trees (middle), and microservice connectivity graphs (right).
c01_02.png

Class diagrams are graphs. Build tools’ dependency output is structured like trees (if you use Gradle or Maven, try gradle dependencies or mvn dependency:tree, respectively), which are a special type of graph. Have you ever seen those crazy microservice diagrams, where you can’t understand anything? Those are graphs, too.

These graphs look different, depending on whether we’re talking about compile-time or run-time dependencies, whether we look at only one level of abstraction or mix them, whether we examine the system’s entire lifetime or a single moment, and many other possible distinctions. Some of the differences will become important later, but we don’t need to go into them yet. For now, any of the myriad possible graphs will do—just imagine the one you’re most comfortable with.

1.1.2 The impact of design principles

Visualizing a system as a graph is a common way to analyze its architecture. Many principles of good software design directly shape how that graph looks.

Take, for example, the principle that says to separate concerns. Following it, we strive to create software in which each individual part focuses on one task (like “log user in” or “draw map”). Often, tasks are made up of smaller tasks (like “load user” and “verify password” to log in the user) and the parts implementing them should be separated as well. This results in a graph where individual parts form small clusters that implement clearly separated tasks.

Conversely, if concerns are poorly separated, the graph has no clear structure and looks like everything connects to everything else. As you can see in figure 1.3, it’s easy to distinguish the two cases.

Figure 1.3 Two systems’ architectures depicted as graphs. Nodes could be JARs or classes, and edges are dependencies between them. But the details don’t matter: all it takes is a quick glance to answer the question of whether there is good separation of concerns.
c01_03.png

Another example of a principle that impacts the graph is dependency inversion. At run time, high-level code always calls into low-level code, but a properly designed system inverts those dependencies at compile time: high-level code depends on interfaces and low-level code implements them, thus inverting the dependencies upward toward interfaces. Looking at the right variant of the graph (see figure 1.4), you can easily spot these inversions.

Figure 1.4 A system where high-level code depends on low-level code creates a different graph (left) than one where interfaces are used to invert dependencies upward (right). This inversion makes it easier to identify and understand meaningful components within the system.
c01_04.png

The goal of principles like separation of concerns and dependency inversion is to disentangle the graph. If we ignore them, the system becomes a mess, where nothing can be changed without potentially breaking something seemingly unrelated. If we follow them, the system can be organized well.

1.1.3 What modularity is all about

The principles of good software design guide us toward disentangled systems. Interestingly, although maintainable systems are the goal, most principles lead us there on paths that allow us to concentrate on individual parts. The principles focus not on the entire code base, but on single elements, because in the end their characteristics determine the properties of the systems they constitute.

We already glanced at how separation of concerns and dependency inversion promote two positive characteristics: focusing on a single task and depending on interfaces rather than implementations. The most desirable traits of a system’s parts can be summarized as follows.

Essential info

Each module, what I’ve called a part up to now, has clear responsibilities and a well-defined contract it implements. It’s self-contained, it’s opaque to its clients, and it can be replaced by a different module as long as that one implements the same contract. Its few dependencies are APIs, not implementations.

Systems built from such modules are more amenable to changes and, depending on how dependencies are realized, more flexible at launch and maybe even run time. And this is what modularity is all about: achieving maintainability and flexibility as emergent properties of well-designed modules.

1.2 Module erasure before Java 9

You’ve seen how the graph of interacting parts connects to a couple of nice properties that are generally summarized as modularity. But in the end, these are just ideas—ways to talk about software. The graph is just lines of code that, in the case of Java, are eventually compiled to bytecode instructions and executed by the Java Virtual Machine (JVM). It would be great if language, compiler, and JVM (which I’ll crudely and incorrectly summarize under the term Java) could see things like we do.

And often, they do! If you design a class or an interface, then the name you give it is what Java uses to identify it. The methods you define as its API are exactly what other code can call—with the exact method names and parameter types you define. Its dependencies are clearly visible, either as import statements or fully qualified class names, and the compiler and JVM will use classes with those names to fulfill them.

As an example, let’s look at the interface Future, which represents the result of a computation that might or might not yet be finished. The type’s functionality isn’t important, though, because we’re only interested in its dependencies:

public interface Future<V> {

    boolean cancel(boolean mayInterruptIfRunning);

    boolean isCancelled();

    boolean isDone();

    V get() throws InterruptedException, ExecutionException;

    V get(long timeout, TimeUnit unit)
        throws InterruptedException, ExecutionException, TimeoutException;

}

Going through the methods Future declares, it’s easy to enumerate the dependencies:

  • InterruptedException
  • ExecutionException
  • TimeUnit
  • TimeoutException

Applying the same analysis to the types just identified, we can create the dependency graph in figure 1.5. The exact form of the graph isn’t relevant here. What’s important is that the dependency graph we have in mind when we talk about a type and the one Java implicitly creates for it are identical.

Figure 1.5 The dependency graph Java operates on for any given type coincides with our perception of the type’s dependencies. This graph shows the dependencies of the interface Future across the packages java.util.concurrent and java.lang.
c01_05.png

Because of Java’s strongly and statically typed nature, it will tell you immediately if something breaks. A class’s name is illegal? One of your dependencies is gone? A method’s visibility changed, and now callers can’t see it? Java will tell you—the compiler during compilation, and the JVM during execution.

Compile-time checks can be bypassed with reflection (see appendix B for a quick introduction). For this reason, it’s considered a sharp, potentially dangerous tool, only to be used for special occasions. We’re going to ignore it for now but will come back to it in later chapters.
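To make that concrete, here’s a minimal sketch (the class name is hypothetical): the dependency exists only as a string, so neither the compiler nor the JVM can verify it up front; a typo or a missing JAR surfaces only when this code runs.

public class ReflectiveAccess {
    public static void main(String[] args) throws Exception {
        // the dependency is just a string, invisible to compile-time checks
        Class<?> type = Class.forName("com.example.SomeService");
        Object service = type.getDeclaredConstructor().newInstance();
        System.out.println("Created " + service);
    }
}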

As an example of where Java’s perception of dependencies and ours diverge, let’s look at the service or application level. This is outside Java’s scope: it has no idea what an application is called, can’t tell you there’s no “GitHab” service or “Oracel” database (oops), and doesn’t know you changed your service’s API and broke your clients. It has no constructs that map to the collaboration of applications or services. And that’s fine, because Java operates on the level of an individual application.

But one level of abstraction clearly lies within Java’s scope, although before Java 9, it was very poorly supported—so poorly that modularization efforts were effectively undone, leading to what has been called module erasure. That level is the one dealing with artifacts, or JARs in Java’s parlance.

If an application is modularized on this level, it consists of several JARs. Even if it isn’t, it depends on libraries, which might have their own dependencies. Jotting these down, you’ll end up with the already familiar graph, but this time for JARs, not classes.

As an example, let’s consider an application called ServiceMonitor. Without going into too much detail, it behaves as follows: it checks the availability of other services on the network and aggregates statistics, which it writes to a database and makes available via a REST API.

The application’s authors created four JARs:

  • observer—Observes other services and checks availability
  • statistics—Creates statistics from availability data
  • persistence—Reads and writes statistics to the database with hibernate
  • monitor—Triggers data collection and pipes the data through statistics into persistence; implements the REST API with spark

Each JAR has its own dependencies, all of which can be seen in figure 1.6.

Figure 1.6 Given any application, you can draw a dependency graph for its artifacts. Here the ServiceMonitor application is split into four JARs, which have dependencies between them but also on third-party libraries.
c01_06.png

The graph includes everything we discussed earlier: the JARs have names, they depend on each other, and each offers specific features by providing public classes and methods that other JARs can call.

When starting an application, you must list on the class path all the JARs you want to use:

$ java
    --class-path observer.jar:statistics.jar:persistence.jar:monitor.jar
    org.codefx.monitor.Monitor
Essential info

And this is where things go awry—at least, before Java 9. The JVM launches without knowledge of your classes. Every time it encounters a reference to an unknown class, starting with the main class specified on the command line, it goes through all JARs on the class path, looking for a class with that fully qualified name. If it finds one, it loads the class into a huge set of all classes and is finished. As you can see, there’s no run-time concept in the JVM that corresponds to JARs.

Without run-time representation, JARs lose their identity. Although they have filenames, the JVM doesn’t much care about them. Wouldn’t it be nice if exception messages could point to the JAR the problem occurred in, or if the JVM could name a missing dependency?

Speaking of dependencies: they become invisible as well. Operating on the level of classes, the JVM has no concept of dependencies between JARs. Ignoring the artifacts that contained the classes also means there can be no encapsulation of those artifacts. And indeed, every public class is visible to all other classes.

Names, explicit dependencies, clearly defined APIs—neither compiler nor JVM cares much about any of the things we value in modules. This erases the modular structure and turns that carefully designed graph into a big ball of mud, as shown in figure 1.7. This is not without consequences.

Figure 1.7 Neither Java’s compiler nor its virtual machine has concepts for artifacts or the dependencies between them. Instead, JARs are treated as simple containers, out of which classes are loaded into a single namespace. Eventually, the classes end up in a kind of primordial soup, where every public type is accessible to every other.
c01_07.png

1.3 Complications before Java 9

As you’ve seen, Java before version 9 lacked the concepts to properly support modularity across artifacts. And although this causes problems, they obviously aren’t prohibitive (or we wouldn’t use Java). But when they do rear their ugly heads, typically in larger applications, they can be hard or even impossible to solve.

As I mentioned at the beginning of the chapter, the complications that are most likely to affect application developers are commonly summarized under the endearing term JAR hell; but they aren’t the only ones. Security and maintenance problems, more of an issue for JDK and library developers, are also consequences.

I’m sure you’ve seen quite a few of these complications yourself, and over the course of this section we’ll look at them one by one. Don’t worry if you’re not familiar with all of them—quite the opposite, consider yourself lucky that you haven’t had to deal with them yet. If you’re familiar with JAR hell and related problems, feel free to skip to section 1.4, which introduces the module system.

In case you’re getting frustrated with this seemingly endless stream of problems, relax—there will be a catharsis: section 1.5 discusses how the module system overcomes most of these shortcomings.

1.3.1 Unexpressed dependencies between JARs

Has an application of yours ever crashed with a NoClassDefFoundError? This occurs when the JVM can’t find a class on which the currently executing code depends. Finding that code is easy (a look at the stack trace will reveal it), and identifying the missing dependency usually doesn’t require much more work (the missing class’s name often gives it away), but determining why the dependency isn’t present can be tough. Considering the artifact dependency graph, though, a more fundamental question arises: why do we only find out at run time that something’s missing?
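Here’s a minimal sketch of why the error shows up so late (the templating library is hypothetical): classes are loaded lazily, so a forgotten JAR goes unnoticed until the first line that needs it actually runs.

public class ReportGenerator {
    public String render() {
        // if the templating library's JAR was forgotten on the class path,
        // this line (not the application launch) throws NoClassDefFoundError
        return new com.example.templating.Engine().render("monthly-report");
    }
}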

Essential info

The reason is simple: a JAR can’t express which other JARs it depends on in a way the JVM will understand. An external entity is required to identify and fulfill the dependencies.

Before build tools gained the ability to identify and fetch dependencies, that external entity was us. We had to scan the documentation for dependencies, find the correct projects, download the JARs, and add them to the project. Optional dependencies, where a JAR might require another JAR only if we wanted to use certain features, further complicated the process.

For an application to work, it might only need a handful of libraries. But each of those might in turn need a handful of other libraries, and so on. As the unexpressed dependencies compound, resolving them by hand becomes exponentially more labor-intensive and error-prone.

Essential info

Build tools like Maven and Gradle largely solved this problem. They excel in making dependencies explicit so they can hunt down each required JAR along the myriad edges of the transitive dependency tree. Still, having the JVM understand the concept of artifact dependencies would increase robustness and portability.

1.3.2 Shadowing classes with the same name

Sometimes, different JARs on the class path contain classes with the same fully qualified name. This can happen for a number of reasons:

  • There may be two different versions of the same library.
  • A JAR may contain its own dependencies—it’s called a fat JAR or an uber JAR—but some of them are also pulled in as standalone JARs because other artifacts depend on them.
  • A library may have been renamed or split, and some of its types are unknowingly added to the class path twice.

If the variants differ semantically, this can lead to anything from too-subtle-to-notice misbehavior to havoc-wreaking errors. Even worse, the form in which the problem manifests itself can seem nondeterministic. It depends on the order in which the JARs are searched, which may differ across environments: for example, between your IDE (such as IntelliJ, Eclipse, or NetBeans) and the production machine where the code will eventually run.

Take the example of Google’s widely used Guava library, which contains a utility class com.google.common.collect.Iterators. From Guava version 19 to version 20, the method emptyIterator() was removed. As figure 1.8 shows, if both versions end up on the class path and if version 20 comes first, then any code that depends on Iterators will use the new version, thus ending up unable to call 19’s Iterators::emptyIterator. Even though a class containing the method is on the class path, it’s effectively invisible.
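As a sketch, code compiled against Guava 19 might look like the following; with version 20 shadowing it on the class path, the call fails at run time with a NoSuchMethodError even though it compiled fine.

import com.google.common.collect.Iterators;

import java.util.Iterator;

public class Cursor {

    // compiles against Guava 19; if Guava 20 comes first on the class path,
    // this call throws NoSuchMethodError at run time
    public Iterator<String> noResults() {
        return Iterators.emptyIterator();
    }
}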

Shadowing mostly happens by accident. But it’s also possible to purposely use this behavior to override specific classes in third-party libraries with handcrafted implementations, thus patching the library. Although build tools might reduce the chance of this happening accidentally, they generally can’t prevent it.

Figure 1.8 It’s possible that the class path contains the same library in two different versions (top) or two libraries that have a set of types in common (bottom). In both cases, some types are present more than once. Only the first variant encountered during the class path scan is loaded (it shadows all the others), so the order in which the JAR files are scanned determines which code runs.
c01_08.png

1.3.3 Conflicts between different versions of the same project

Version conflicts are the bane of any large software project. Once the number of dependencies is no longer a single digit, the likelihood of conflicts occurring converges to 1 with alarming speed.

Version conflicts typically arise when two required libraries transitively depend on different, incompatible versions of a third library. If both versions are present on the class path, the behavior will be unpredictable. Because of shadowing, classes that exist in both versions will be loaded from only one of them. Worse, if a class that exists in one version but not the other is accessed, that class will be loaded as well. Code calling into the library may thus find a mix of both versions.

On the other hand, if one of the versions is missing, the program most likely won’t function correctly because both versions are required and by assumption not compatible, which means they can’t stand in for each other (see figure 1.9). As with missing dependencies, this manifests as unexpected behavior or as a NoClassDefFoundError.

Figure 1.9 Transitive dependencies on conflicting versions of the same library often aren’t resolvable — one dependency must be eliminated. Here, an old version of RichFaces depends on a different version of Guava than the application wants to use. Unfortunately, Guava 16 removed an API that RichFaces relies on.
c01_09.png

Continuing the Guava example from the section on shadowing, imagine some code depends on com.google.common.io.InputSupplier, a class that was present in 19 but removed in 20. The JVM would first scan Guava 20 and, after not finding the class, load it from Guava 19. Suddenly an amalgam of both Guava versions is running! As a finishing move, imagine InputSupplier calling Iterators::emptyIterator. What do you think—how much fun would it be to debug that?

Essential info

There’s no technical solution for this issue that doesn’t involve existing module systems or manually fiddling with class loaders. Build tools are generally able to detect this scenario. They may warn about it and usually resolve it with simple mechanisms like picking the most current version.

1.3.4 Complex class loading

Our examination of the class-loading mechanism in section 1.2 wasn’t complete. The described behavior is the default, where all application classes are loaded by the same class loader. But developers are free to add additional class loaders, delegating from one to the other to solve some of the problems we’re discussing here.

This is typically done by containers like component systems and web servers. Ideally this implicit use is hidden from application developers; but as we know, all abstractions are leaky. And in some circumstances, developers may explicitly add class loaders to implement features: for example, to allow users to extend the application by loading new classes, or to be able to use conflicting versions of the same dependency.
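As a rough sketch of that last use case (paths and class names are assumptions), an application might load one version of a library in a dedicated class loader so it doesn’t clash with another version on the class path:

import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Paths;

public class IsolatedLoading {
    public static void main(String[] args) throws Exception {
        URL[] jar = { Paths.get("libs/guava-19.0.jar").toUri().toURL() };
        // a dedicated loader keeps these classes separate from the class path
        try (URLClassLoader isolated = new URLClassLoader(jar, null)) {
            Class<?> iterators =
                    Class.forName("com.google.common.collect.Iterators", true, isolated);
            System.out.println(iterators + " loaded by " + iterators.getClassLoader());
        }
    }
}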

Regardless of how multiple class loaders enter the picture, they require a deeper understanding of the class-loading mechanism. And they can quickly lead to a complex delegation hierarchy that exhibits unexpected, hard-to-understand behavior.

1.3.5 Weak encapsulation across JARs

Java’s visibility modifiers are great for implementing encapsulation between classes in the same package. But across package boundaries, there’s only one visibility for types: public.

As you’ve seen, a class loader folds all loaded packages into one big ball of mud—with the consequence that all public classes are visible to all other classes. Due to this weak encapsulation, there’s no way to create functionality that’s visible throughout an entire JAR but not outside of it.

This makes it difficult to properly modularize a system. If some functionality is required by different parts of a module (such as a library or a subproject of your system) but shouldn’t be visible outside of it, the only way to achieve this is to put the classes involved into a single package and use package visibility. You end up erasing the code’s package structure yourself, just to gain encapsulation the JVM won’t otherwise provide. And even where package visibility solves the problem, reflection can still be used to get around it.
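Here’s a minimal sketch of that last point (the internal type name is hypothetical): before Java 9, even package visibility doesn’t keep a determined client out, provided no security manager is installed.

import java.lang.reflect.Constructor;

public class BreakIn {
    public static void main(String[] args) throws Exception {
        // a package-private type in some library's internal package
        Class<?> internal = Class.forName("com.example.lib.internal.Helper");
        Constructor<?> constructor = internal.getDeclaredConstructor();
        constructor.setAccessible(true);   // succeeds before Java 9 (absent a security manager)
        Object helper = constructor.newInstance();
        System.out.println("Created " + helper);
    }
}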

Weak encapsulation lets clients of an artifact break into its internals (see figure 1.10). This can happen accidentally if an IDE suggests importing classes from packages that documentation marks as being internal. More often, it’s done purposefully to overcome problems that seem to have no other solution (which is sometimes the case and sometimes not). But it comes at a high price!

Figure 1.10 The maintainers of Eclipse JGit didn’t intend the types in org.eclipse.jgit.internal for public consumption. Unfortunately, because Java has no concept of JAR internals, there’s nothing the maintainers can do to stop any com.company.Type from compiling against it. Even if it were only package visible, it could still be accessed via reflection.
c01_10.png

Now the clients’ code is coupled to the artifact’s implementation details. This makes updates risky for the clients and, if the maintainers decide to take this coupling into consideration, impedes changing those internals. It can go as far as to slow or even prevent meaningful evolution of the artifact.

In case this sounds like an edge case, it isn’t. The most notorious example is sun.misc.Unsafe, a JDK-internal class that lets us do crazy things (by Java standards) like directly allocating and freeing memory. Many critical Java libraries and frameworks like Netty, PowerMock, Neo4J, Apache Hadoop, and Hazelcast use it. And because many applications depend on those libraries, they also depend on these internals. That way, Unsafe became a critical piece of infrastructure even though it was neither intended nor designed to be.

Another example is JUnit 4. Many tools, especially IDEs, have all kinds of nice features that make testing easier for developers. But because JUnit 4’s API isn’t rich enough to implement all these features, tools break into its internals. This coupling considerably slowed JUnit 4’s development, eventually becoming an important reason to completely start over with JUnit 5.

1.3.6 Security checks have to be handcrafted

An immediate consequence of weak encapsulation across package boundaries is that security-relevant functionality is exposed to all code running in the same environment. This means malicious code can access critical functionality, and the only way to combat that is to manually implement security checks on critical execution paths.

Since Java 1.1, this has been done by invoking SecurityManager::checkPackageAccess—which checks whether the calling code is allowed to access the called package—on every code path into security-relevant code. Or rather, it should be invoked on every such path. Forgetting these calls led to some of the vulnerabilities that plagued Java in the past, particularly during the transition from Java 7 to 8.
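The handcrafted pattern the paragraph describes looks roughly like the following sketch (class and package names are made up); the crucial point is that someone has to remember it on every single entry point.

public final class CriticalOperations {

    public static void wipeUserData() {
        SecurityManager securityManager = System.getSecurityManager();
        if (securityManager != null) {
            // must be repeated on every code path into security-relevant code
            securityManager.checkPackageAccess("com.example.critical.internal");
        }
        // ... proceed with the security-relevant work
    }
}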

It can, of course, be argued that security-relevant code should be double, triple, or quadruple checked. But to err is human, and requiring us to manually insert security checks at module boundaries poses a higher risk than a well-automated variant.

1.3.7 Poor startup performance

Did you ever wonder why many Java applications, particularly web backends that use powerful frameworks like Spring, take so long to load?

One reason is that the class loader has no way to know which JAR a class comes from, so it must execute a linear scan of all JARs on the class path. Similarly, identifying all occurrences of a specific annotation requires the inspection of all classes on the class path.

1.3.8 Rigid Java runtime

This isn’t really a consequence of the JVM’s big-ball-of-mud approach, but as long as I’m ranting, I’ll get it out there: before Java 9, the Java runtime was a monolith. You had to ship or install all of it, even if your application used only a fraction of its functionality.

Although this may be of little relevance for medium-sized computing devices (such as desktop PCs and laptops), it’s obviously important for the smallest devices like routers, TV boxes, cars, and all the other nooks and crannies where Java is used. With the current trend of containerization, it also gains relevance on servers, where reducing an image’s footprint will reduce costs.

Java 8 brought compact profiles, which define three subsets of Java SE. They alleviate the problem but don’t solve it. Compact profiles are fixed and hence unable to cover all current and future needs for partial JREs.

1.4 Bird’s-eye view of the module system

We’ve just discussed quite a few problems. How does the Java Platform Module System address them? The principal idea is pretty simple!

Essential info

Modules are the basic building block of the JPMS (surprise). Like JARs, they’re containers for types and resources; but unlike JARs, they have additional characteristics. These are the most fundamental ones:

  • A name, preferably one that’s globally unique
  • Declarations of dependencies on other modules
  • A clearly defined API that consists of exported packages

1.4.1 Everything is a module

There are different kinds of modules, and section 3.1.4 categorizes them, but it makes sense to take a quick look at them now. During work on Project Jigsaw, the OpenJDK was split up into about 100 modules, the so-called platform modules. Roughly 30 of them have names beginning with java.*; they’re the standardized modules that every JVM must contain (figure 1.11 shows a few of them).

Figure 1.11 A selection of platform modules. The arrows show their dependencies, but some aren’t depicted to keep the graph simpler: the aggregator module java.se directly depends on each module, and each module directly depends on java.base.
c01_11.png

These are some of the more important ones:

  • java.base—The module without which no JVM program functions. Contains packages like java.lang and java.util.
  • java.desktop—Not only for those brave desktop UI developers out there. Contains the Abstract Window Toolkit (AWT; packages java.awt.*), Swing (packages javax.swing.*), and more APIs, among them JavaBeans (packages java.beans.*).
  • java.logging—Contains the package java.util.logging.
  • java.rmi—Remote Method Invocation (RMI).
  • java.xml—Contains most of the XML API word salad: Java API for XML Processing (JAXP), Streaming API for XML (StAX), Simple API for XML (SAX), and the Document Object Model (DOM).
  • java.xml.bind—Java Architecture for XML Binding (JAXB).
  • java.sql—Java Database Connectivity (JDBC).
  • java.sql.rowset—JDBC RowSet API.
  • java.se—References the modules making up the core Java SE API. (This is a so-called aggregator module; see section 11.1.5.)
  • java.se.ee—References the modules making up the full Java SE API (another aggregator).

Then there’s JavaFX. A telltale sign that its high-level architecture is superior to AWT’s and Swing’s is that not only was it sufficiently decoupled from the rest of the JDK to get its own module, it was actually split into seven: bindings, graphics, controls, web view, FXML, media, and Swing interop. All of these module names begin with javafx.*.

Finally, there are about 60 modules whose names begin with jdk. They contain API implementations, internal utilities, tools (such as the compiler, JAR, Java Dependency Analysis Tool [JDeps], and Java Shell Tool [JShell]), and more. They may differ across JVM implementations, so using them is akin to using code from sun. packages: not a future-proof choice but sometimes the only option available.

You can see a list of all modules contained in a JDK or JRE by running java --list-modules. To get details for a single module, execute java --describe-module ${module-name}. (${module-name} is a placeholder, not valid syntax—replace it with your module of choice.)
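If you prefer code over the command line, a rough programmatic counterpart uses the module API introduced in Java 9 (a minimal sketch):

import java.lang.module.ModuleFinder;

public class ListModules {
    public static void main(String[] args) {
        // roughly what java --list-modules prints: every module in the runtime image
        ModuleFinder.ofSystem().findAll().stream()
                .map(reference -> reference.descriptor().toNameAndVersion())
                .sorted()
                .forEach(System.out::println);
    }
}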

Platform modules are packed into JMOD files, a new format created specifically for this purpose. But code outside the JDK can create modules just as well. In that case, they’re modular JARs: plain JARs that contain a new construct, the module descriptor, which defines the module’s name, dependencies, and exports. Finally, there are modules the module system creates on the fly from JARs that weren’t yet transformed into modules.

Essential info

This leads to a fundamental aspect of the module system: everything is a module! (Or, more precisely, no matter how types and resources are presented to the compiler or the virtual machine, they will end up in a module.) Modules are at the heart of the module system and hence of this book. Everything else can ultimately be traced back to them and their name, their declaration of dependencies, and the API they export.

1.4.2 Your first module

That the JDK was modularized is fine and dandy, but what about your code? How does it end up in modules? That’s fairly simple.

The only thing you need to do is add a file called module-info.java, a module declaration, to your source folder and fill it with your module’s name, dependencies on other modules, and the packages that make up its public API:

module my.xml.app {
    requires java.base;    // not strictly necessary: every module implicitly requires java.base
    requires java.xml;
    exports my.xml.api;
}

Looks like the my.xml.app module uses the platform modules java.base and java.xml and exports a package my.xml.api. So far, so good. Now you compile module-info.java with all other sources to .class files and package it into a JAR. (The compiler and the jar tool will automatically do the right thing.) Et voilà, you’ve created your first module.
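What those commands look like depends on your project layout. Assuming the sources live in src, compiled classes go to classes, and the main class is my.xml.app.Main (a hypothetical name), the compile-and-package step could be sketched like this:

$ javac
    -d classes
    $(find src -name '*.java')
$ jar
    --create
    --file mods/my.xml.app.jar
    --main-class my.xml.app.Main
    -C classes .

Passing --main-class records the class in the module descriptor, which is what later lets you launch the module without naming a main class on the command line.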

1.4.3 The module system in action

Let’s launch the XML application and observe the module system in action. To do so, fire off the following command:

java
    --module-path mods
    --module my.xml.app

The module system picks it up from here. It takes a number of steps to improve the situation over the ball of mud you saw in sections 1.2 and 1.3:

  1. Bootstraps itself
  2. Verifies that all required modules are present
  3. Builds an internal representation of the application’s architecture
  4. Launches the initial module’s main method
  5. Stays active while the application runs, to protect the modules’ internals

Figure 1.12 captures all the steps. But let’s not get ahead of ourselves, and study each step in turn.

Figure 1.12 The Java Platform Module System (JPMS) in action. It does most of its work at launch time: after (1) bootstrapping, it (2) makes sure all modules are present while building the module graph, before (3) handing control over to the running application. At run time, it (4) enforces that each module’s internals are protected.
c01_12.png

Loading the base module

The module system is just code, and you’ve learned that everything is a module, so which one contains the JPMS? That would be java.base, the base module. In a considerable chicken-and-egg mind-boggler, the module system and the base module bootstrap each other.

The base module is also the first node in the module graph that the JPMS builds. That’s exactly what it does next.

Module resolution: building a graph that represents the application

The command you issued ended with --module my.xml.app. This tells the module system that my.xml.app is the application’s main module and that dependency resolution needs to start there. But where can the JPMS find the module? That’s where --module-path mods comes in. It tells the module system that it can find application modules in the folder mods, so the JPMS dutifully looks there for the my.xml.app module.

Folders don’t contain modules, though: they contain JARs. So the module system scans all JARs in mods and looks for their module descriptors. In the example, mods contains my.xml.app.jar, and its descriptor claims it contains a module named my.xml.app. Exactly what the module system has been looking for! The JPMS creates an internal representation of my.xml.app and adds it to the module graph—so far, not connected to anything else.

The module system found the initial module. What’s next? Searching for its dependencies. The descriptor of my.xml.app states that it requires the modules java.base and java.xml. Where can the JPMS find those?

The first one, java.base, is already known, so the module system can add a connection from my.xml.app to java.base—the first edge in the graph. Next up is java.xml. It begins with java, which tells the module system it’s a platform module; so the JPMS doesn’t search the module path for it, but instead searches its own module storage. The JPMS finds java.xml there and adds it to the graph with a connection from my.xml.app to it.

Now you have three nodes in the graph, but only two were resolved. The dependencies of java.xml are still unknown, so the JPMS checks them next. It doesn’t have any dependencies other than java.base, though, so module resolution concludes. Starting with my.xml.app and the omnipresent base module, the process built a small graph with three nodes.

If the JPMS can’t find a required module, or if it encounters any ambiguities (like two JARs containing modules with the same name), it will quit with an informative error message. This means you can discover problems at launch time that would otherwise crash the running application at some arbitrary point in the future.
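If you want to watch this process for yourself, the launcher can print each resolution step (the flag is available since Java 9):

$ java
    --show-module-resolution
    --module-path mods
    --module my.xml.app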

Launching the initial module

How did this process start, again? Ah yes, with the command ending in --module my.xml.app. The module system fulfilled one of its core functions—verifying the presence of all required dependencies—and can now hand control over to the application.

The initial module my.xml.app is not only the one where module resolution starts, it must also contain a public static void main(String[]) method. But you don’t necessarily have to specify the class containing that method when launching the app. I skipped past this, but you were diligent when packaging the .class files into a JAR and specified the main class then. That information was embedded in the module descriptor, which is where the JPMS can read it from now.

Because you used --module my.xml.app without specifying a main class, the module system expects to find that information in the module descriptor. Fortunately it does, and it calls main on that class. The application launches, but the JPMS’s work isn’t over yet!

Guarding module internals

Even with the application successfully launched, the module system needs to stay active to fulfill its second essential function: guarding module internals. Remember the line exports my.xml.api in my.xml.app’s module declaration? This is where it and others like it come into play.

Whenever a module first accesses a type in another module, the JPMS verifies that three requirements are met:

  • The accessed type needs to be public.
  • The module owning that type must have exported the package containing it.
  • In the module graph, the accessing module must be connected to the owning one.

When my.xml.app first uses javax.xml.XMLConstants (for example), the module system checks whether XMLConstants is public (it is), whether java.xml exports javax.xml (it does), and whether my.xml.app is connected to java.xml in the module graph (it is). Because all three pan out, my.xml.app can do its thing with XMLConstants.
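In code, such a legal access could look like the following hypothetical class in my.xml.app’s exported package:

package my.xml.api;

import javax.xml.XMLConstants;

public class SchemaDefaults {

    // allowed: XMLConstants is public, java.xml exports javax.xml,
    // and my.xml.app requires java.xml
    public static String schemaNamespace() {
        return XMLConstants.W3C_XML_SCHEMA_NS_URI;
    }
}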

This behavior fixes a critical deficiency of the ball-of-mud approach Java used to take with artifact relationships: that there was no way to distinguish code that’s internal to an artifact from code that can be used publicly. With exports in play, a module can clearly define which parts of its API are public and which are internal and can depend on the module system to enforce its decision.

A more complex example

As a less trivial example, figure 1.13 shows the module graph for the ServiceMonitor application introduced in section 1.2. Its four JARs—monitor, observer, statistics, and persistence—as well as its two dependencies—spark and hibernate—were turned into modules. JDK modules like java.xml and java.base are visible as well, because the application depends on some of them, too.

I find the comparison with figure 1.6, which depicts the dependencies between ServiceMonitor’s JARs, striking. Figure 1.6 shows our understanding of how the application is organized on an artifact level, whereas figure 1.13 shows how the module system sees it. That they’re so similar demonstrates how well the module system can be used to express an application’s architecture.

Figure 1.13 The module graph for the ServiceMonitor application is very similar to the architecture diagram in figure 1.6. The graph shows the four modules containing the application’s code, the two libraries it uses to implement its feature set, and the involved modules from the JDK. Arrows depict the dependencies between them. Each module lists some of the packages it exports.
c01_13.png

1.4.4 Your non-modular project will be fine—mostly

Developers of existing projects, particularly those with large code bases, will be interested in migration paths. Although other module systems are usually all-or-nothing (to use them, everything must be a module), this isn’t an option for the JPMS. To uphold backward compatibility, a regular application running from the class path on Java 8 or earlier must do the same on Java 9. Thus unmodularized applications must run on top of the modularized JDK, which implies that the module system must handle that case.

And it does. I already mentioned in passing that the module system handles JARs that weren’t yet turned into modules. This is the case precisely because of backward compatibility. Although migrating to the module system is beneficial, it’s not compulsory.

As a consequence, the class path, used to specify JARs or plain .class files for the compiler and JVM, works as it did on Java 8 and before. Even modules placed on the class path behave just like non-modular JARs. The underlying assumption is that everything on the class path ends up in the ball of mud discussed in section 1.3.

Parallel to that, a new concept was created: the module path. Here, the underlying assumption is that every artifact on it is a module. Interestingly, this is true even for plain JARs.

Essential info

The coexistence of the class path and the module path and their respective treatment of plain and modular artifacts is the key to incremental migrations of large applications to the module system. Chapter 8 explores this important topic in depth.

Another aspect of the module system that’s important, particularly to legacy projects, is compatibility. The JPMS entails a lot of changes under the hood, and although almost all of them are backward-compatible in the strict meaning of the word, some interact badly with existing code bases. For example:

  • Dependencies on JDK-internal APIs (for example, those in sun.* packages) cause compile-time errors and run-time warnings.
  • JEE APIs must be resolved manually.
  • Different artifacts that contain classes in the same package can cause problems.
  • Compact profiles, the extension mechanism, the endorsed-standards-override mechanism, and similar features were removed.
  • The run-time image layout changed considerably.
  • The application class loader is no longer a URLClassLoader.

In the end, regardless of whether an application is modularized, running on Java 9 or later may break it. Chapters 6 and 7 are dedicated to identifying and overcoming the most common challenges.

At this point, you may have questions like these:

  • Don’t Maven, Gradle, and others already manage dependencies?
  • What about Open Service Gateway Initiative (OSGi)? Why don’t I just use that?
  • Isn’t a module system overkill in times when everybody writes microservices?

And you’re right to ask. No technology is an island, and it’s worth looking at the Java ecosystem as a whole and examining how existing tools and approaches are related to the module system and what their relation might be in the future. I do this in section 15.3; you already know everything you need to understand it, so if you can’t let those questions go, why not read it now?

Section 1.5 describes the high-level goals the module system wants to achieve, and chapter 2 shows a longer example of what a modular application might look like. Chapters 3, 4, and 5 explore in detail how to write, compile, package, and run such applications from scratch. Part 2 of this book discusses compatibility and migration before part 3 turns to advanced features of the module system.

1.5 Goals of the module system

In essence, the Java Platform Module System was developed to teach Java about the dependency graph between artifacts. The idea is that if Java stops erasing the module structure, most of the ugly consequences of that erasure disappear as well.

First and foremost, this should alleviate many of the pain points the current state of affairs is causing. But more than that, it introduces capabilities, new to most developers who haven’t used other module systems, that can further improve the modularization of software. What does this mean on a more concrete level?

Before we come to that, it’s important to note that not all of the module system’s goals are equally important to all kinds of projects. Many predominantly benefit large, long-lived projects like the JDK, for which the JPMS was primarily developed. Most of the goals won’t have a huge impact on day-to-day coding, unlike, for example, lambda expressions in Java 8 or var in Java 10. They will, however, change the way projects are developed and deployed—something we all do on a daily basis (right?).

Among the module system’s goals, two stand out as particularly important: reliable configuration and strong encapsulation. We’ll look at them more closely than the others.

1.5.1 Reliable configuration: Leaving no JAR behind

As you saw in section 1.4.3 when observing the module system in action, individual modules declare their dependencies on other modules and the JPMS analyzes these dependencies. Although we only looked at a JVM launch, the same mechanism is at play at compile time and link time (yep, that’s new; see chapter 14). These operations can thus fail fast when dependencies are missing or conflicting. The fact that dependencies can be found missing at launch time, as opposed to only when the first class is needed, is a big win.

Before Java 9, JARs containing classes with the same fully qualified name weren’t identified as being in conflict. Instead, the runtime would load whichever class it encountered first, shadowing the others, which led to the complications described in section 1.3.2. Starting with Java 9, the compiler and JVM recognize this and many other ambiguities that can lead to problems early on.

1.5.2 Strong encapsulation: Making module-internal code inaccessible

Another key goal of the module system is to enable modules to strongly encapsulate their internals and export only specific functionality.

A class that is private to a module should be private in exactly the same way that a private field is private to a class. In other words, module boundaries should determine not just the visibility of classes and interfaces but also their accessibility.

—Mark Reinhold, “Project Jigsaw: Bringing the Big Picture into Focus” (https://mreinhold.org/blog/jigsaw-focus)

To achieve this goal, both compiler and JVM enforce strict accessibility rules across module boundaries: only access to public members (meaning fields and methods) of public types in exported packages is allowed. Other types aren’t accessible to code outside the module—not even via reflection. Finally we can strongly encapsulate libraries’ internals and be sure applications don’t accidentally depend on implementation details.

This also applies to the JDK, which, as described in the previous section, was turned into modules. As a consequence, the module system prevents access to JDK-internal APIs, meaning packages starting with sun. or com.sun.. Unfortunately, many widely used frameworks and libraries like Spring, Hibernate, and Mockito use such internal APIs, so many applications would break on Java 9 if the module system were that strict. To give developers time to migrate, Java is more lenient: the compiler and JVM have command-line switches that allow access to internal APIs; and, on Java 9 to 11, run-time access is allowed by default (more on that in section 7.1).
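As a sketch of those switches (the file, JAR, and internal packages named here are only examples): --add-exports grants access to a normally encapsulated package at compile or run time, and --add-opens additionally allows deep reflection at run time.

$ javac
    --add-exports java.base/sun.security.x509=ALL-UNNAMED
    LegacyCertificateTool.java
$ java
    --add-opens java.base/java.lang=ALL-UNNAMED
    -jar legacy-app.jar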

To prevent code from accidentally depending on types in indirect dependencies, which may change from one run to the next, the situation is even stricter: in general, a module can only access types of modules that it requires as a dependency. (Some advanced features create deliberate exceptions to that rule.)

1.5.3 Automated security and improved maintainability

The strong encapsulation of module-internal APIs can greatly improve security and maintainability. It helps with security because critical code is effectively hidden from code that doesn’t require its use. It also makes maintenance easier, because a module’s public API can more easily be kept small.

Casual use of APIs that are internal to Java SE Platform implementations is both a security risk and a maintenance burden. The strong encapsulation provided by the proposed specification will allow components that implement the Java SE Platform to prevent access to their internal APIs.

—Java Specification Request (JSR) 376

1.5.4 Improved startup performance

With clearer boundaries around which code can access what, existing optimization techniques can be applied more effectively.

Many ahead-of-time, whole-program optimization techniques can be more effective when it is known that a class can refer only to classes in a few other specific components rather than to any class loaded at run time.

—JSR 376

It’s also possible to index classes and interfaces by their annotations, so that such types can be found without a full class path scan. That wasn’t implemented in Java 9 but may come in a future release.

1.5.5 Scalable Java platform

A beautiful consequence of modules with clearly defined dependencies is that it’s easy to determine the subset of the JDK an application actually needs. Server applications, for example, don’t use AWT, Swing, or JavaFX and can thus run on a runtime without that functionality. The new tool jlink (see chapter 14) makes it possible to create run-time images with exactly the modules an application needs. We can even include library and application modules, thereby creating a self-contained program that doesn’t require Java to be installed on the host system.
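A minimal jlink invocation might look like the following sketch (Unix-style path separator; the module and folder names are assumptions carried over from the earlier example):

$ jlink
    --module-path $JAVA_HOME/jmods:mods
    --add-modules my.xml.app
    --output my-xml-app-image

The resulting image ships its own launcher in the bin folder and contains only the platform modules the application transitively requires.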

This will maintain Java’s position as a key player for small devices as well as for containers.

1.5.6 Non-goals

Unfortunately, the module system is no panacea, and a couple of interesting use cases aren’t covered. First, the JPMS has no concept of versions. You can’t give a module a version or require versions for dependencies. That said, it’s possible to embed such information in the module descriptor and access it using the reflection API, but that’s just metainformation for developers and tools—the module system doesn’t process it.
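For example, a module’s recorded version can be read through the reflection API, but that’s all it is: recorded information the JPMS never acts on. A minimal sketch:

import java.lang.module.ModuleDescriptor;
import java.util.Optional;

public class VersionInfo {
    public static void main(String[] args) {
        // reads whatever version string was recorded in java.base's descriptor;
        // the module system itself never acts on it
        Optional<ModuleDescriptor.Version> version =
                String.class.getModule().getDescriptor().version();
        version.ifPresent(System.out::println);
    }
}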

That the JPMS doesn’t “see” versions also means it won’t distinguish two different versions of the same module. On the contrary, and in line with the goal of reliable configuration, it will perceive this situation as a classic ambiguity—the same module present twice—and refuse to compile or launch. For more on module versions, see chapter 13.

The JPMS offers no mechanism to search for or download existing modules from a centralized repository or to publish new ones. This task is sufficiently covered by existing build tools.

It’s also not the goal of the JPMS to model a dynamic module graph, where individual artifacts can show up or disappear at run time. It’s possible, though, to implement such a system on top of one of the advanced features: layers (see section 12.4).

1.6 Skills, old and new

I’ve described a lot of promises, and the rest of the book explains how the Java Platform Module System aims to achieve them. But make no mistake, these benefits aren’t free! To build applications on top of the module system, you’ll have to think harder than before about artifacts and dependencies, and commit more of those thoughts to code. Certain things that used to work will stop doing so on Java 9, and using certain frameworks will require a little more effort than before.

You can view this as similar to how a statically and strongly typed language requires more work than a dynamic one—at least, while the code is being written. All those types and generics—can’t you just use Object and casts everywhere? Sure, you could, but would you be willing to give up the safety the type system provides, just to save some brain cycles while writing code? I don’t think so.

1.6.1 What you’ll learn

New skills are required! Luckily, this book teaches them. When all is said and done, and you’ve mastered the mechanisms laid out in the following chapters, neither new nor existing applications will defy you.

Part 1, particularly chapters 3–5, goes through the basics of the module system. In addition to practical skills, these chapters teach the underlying mechanisms to give you a deeper understanding. Afterward, you’ll be able to describe modules and their relationships by encapsulating a module’s internals and expressing its dependencies. With javac, jar, and java, you’ll compile, package, and run modules and the applications they form.

Parts 2 and 3 build on the basics and extend them to cover more complex use cases. For existing applications, you’ll be able to analyze possible incompatibilities with Java 9 to 11 and create a migration path to the module system using the various features it offers for that purpose. Toward that end, and also to implement less straightforward module relationships, you can use advanced features like qualified exports, open modules, and services as well as the extended reflection API. With jlink, you’ll create pared-down JREs, optimized for a particular use case, or self-contained application images that ship with their own JREs. Finally, you’ll see the bigger picture, including how the module system interacts with class loading, reflection, and containers.

1.6.2 What you should know

The JPMS has an interesting character when it comes to skill requirements. Most of what it does is brand-new and comes with its own syntax partitioned off in the module declaration. Learning that is relatively easy, if you have basic Java skills. So if you know that code is organized in types, packages, and ultimately JARs; how visibility modifiers, particularly public, work across them; and what javac, jar, and java do, and have a rough idea of how to use them, then you have all it takes to understand part 1 as well as many of the more advanced features introduced in part 3.

But to really understand the problems the module system addresses and to appreciate the solutions it proposes requires more than that. Familiarity with the following and experience working with large applications make it easier to understand the motivation for the module system’s features and their benefits and shortcomings:

  • How the JVM, and particularly the class loader, operates
  • The trouble that mechanism causes (think JAR hell)
  • More advanced Java APIs like the service loader and reflection API
  • Build tools like Maven and Gradle and how they build a project
  • How to modularize software systems

But however knowledgeable you are, you may encounter references or explanations that don’t connect with something you know. For an ecosystem as gigantic as Java’s, that’s natural, and everybody learns something new wherever they turn (believe me, I know that firsthand). So never despair! If the surrounding prose doesn’t help, chances are you can understand the technicalities purely by looking at the code.

With the background colored in, it’s time to get your hands dirty and learn the JPMS basics. I recommend you continue with chapter 2, which cuts across the rest of part 1 and shows code that defines, builds, and runs modular JARs. It also introduces the demo application that appears throughout the rest of the book. If you prefer learning the underlying theory first, you can skip to chapter 3, which teaches the module system’s fundamental mechanisms. If you’re driven by worry about your project’s compatibility with Java 9, chapters 6 and 7 cover that in detail, but those chapters will be hard to understand without a good grasp of the basics.

Summary

  • A software system can be visualized as a graph, which often shows (un)desired properties of the system.
  • On the level of JARs, Java used to have no understanding of that graph. This led to various problems, among them JAR hell, manual security, and poor maintainability.
  • The Java Platform Module System exists to make Java understand the JAR graph, which brings artifact-level modularity to the language. The most important goals are reliable configuration and strong encapsulation as well as improved security, maintainability, and performance.
  • This is achieved by introducing modules: basically, JARs with an additional descriptor. The compiler and runtime interpret the described information in order to build the graph of artifact dependencies and provide the promised benefits.