Happy Packaging!

Creating packages, assigning classes to them and building a package hierarchy is usually not a top priority in software design. That is a missed opportunity to make your design more readable and maintainable. This article helps you take your Java class packaging skills to the next level, not just for the holiday season.

Packages as Namespaces

The first and most basic use of packages is as namespaces for projects. Each project, module or library lives in its own namespace, so the classes it contains are uniquely named among all projects in the world. This is what enables a global marketplace of libraries like the Maven Central Repository.

Package names for projects usually start with the reverse domain name of the company or developer and end with the project name, like:

com.vanillasource.gerec
com.github.robertbraeutigam.tictactoe

While this is a very important aspect of packages, it is neither particularly interesting nor difficult to do, so this article is about what happens inside one application. Packages are hierarchical and do not end at the project level! The project itself is organized into sub-packages, which themselves may have sub-packages, and so on. This is where it gets interesting…

Packages as Grouping

When we first begin to use packages most of us think of them as a way to create some order among classes. If there are too many classes in the same directory, we just want to split them up somehow to be able to find things.

So our first intuition is to group the classes into sub-packages according to some criteria we see fit. Let’s take a look at the Package structure in Project Weld, the official implementation of the CDI specification, included in WildFly JEE Application Server:

AbstractCDI.java
Container.java
annotated/
bootstrap/
logging/
manager/
xml/
...

This project has hundreds of classes, so obviously it needs to be split somehow into packages. The above excerpt shows that the developers created a mostly technical package structure. While bootstrap and annotated could be business-oriented perhaps, packages xml, logging and manager are clearly technical.

Let’s call this packaging strategy “grouping”.

A package structure on a given level is a grouping if the root package depends on any of its sub-packages.

The dependency graph for the above classes and packages shows that the root package does depend on some of its sub-packages:

[Figure: dependency graph of the Weld root package and its sub-packages]

The parallel arrows between most of these packages that point back and forth indicate circular dependencies. It means that every package is essentially dependent on every other package, which means changes will rarely be localized. This is a maintenance nightmare and should be avoided.
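
Circular dependencies like these can also be found mechanically. The following is a minimal sketch, assuming the package dependencies have already been extracted into a map from each package to the packages it depends on. The class name `PackageCycles` and the package names are hypothetical and do not describe Weld's actual dependencies.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Sketch: detects whether a package dependency graph contains a cycle. */
class PackageCycles {

    /** Returns true if following "depends on" edges leads back to a package already on the path. */
    static boolean hasCycle(Map<String, Set<String>> dependencies) {
        Set<String> finished = new HashSet<>();
        for (String start : dependencies.keySet()) {
            if (visit(start, dependencies, new HashSet<>(), finished)) {
                return true;
            }
        }
        return false;
    }

    // Depth-first search; onPath holds the packages on the current path.
    private static boolean visit(String pkg, Map<String, Set<String>> dependencies,
                                 Set<String> onPath, Set<String> finished) {
        if (onPath.contains(pkg)) {
            return true;   // came back to a package on the current path
        }
        if (finished.contains(pkg)) {
            return false;  // already fully explored without finding a cycle
        }
        onPath.add(pkg);
        for (String next : dependencies.getOrDefault(pkg, Set.of())) {
            if (visit(next, dependencies, onPath, finished)) {
                return true;
            }
        }
        onPath.remove(pkg);
        finished.add(pkg);
        return false;
    }

    public static void main(String[] args) {
        // Hypothetical packages that point back and forth at each other.
        Map<String, Set<String>> cyclic = Map.of(
                "app.bootstrap", Set.of("app.manager"),
                "app.manager", Set.of("app.bootstrap"));
        // Hypothetical packages that only depend on their ancestor.
        Map<String, Set<String>> layered = Map.of(
                "app.order", Set.of("app"),
                "app.payment", Set.of("app"));

        System.out.println(hasCycle(cyclic));   // prints "true"
        System.out.println(hasCycle(layered));  // prints "false"
    }
}
```

In practice a dependency analyzer would supply the map; the sketch only shows the check itself.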

Circular dependencies are however not a direct consequence of “grouping”. A “grouping” can be a quite reasonable approach, but has the following more subtle properties:

  • It is mostly made for the convenience of the writer
  • It misses an opportunity to visualize the design
  • It misses an opportunity to communicate with the reader

Why are these points important? Let’s look at how code is read.

Getting to know an Application

How does a reader navigate in code on the package level? Well, that depends on what the reader is after…

The reader might be a new colleague or someone yet unfamiliar with the codebase, trying to get his or her bearings, trying to understand key features and purpose of the application. For a reader like this it is probably a good idea to start reading at the beginning, with key abstractions that do not require additional understanding of the code, i.e. packages that have no further dependencies.

It would be convenient if the key abstractions that do not depend on anything else were in the top (root) package. Otherwise it would take a dependency analyzer to find the “entry point” into the application. So if somebody asks “What is this application/library about?”, the answer should be in the root package.

This also requires the root package to contain not just data structures or Java Beans, but the actual logic, even if only in abstract form as interfaces. If it contained only Java Beans, for example, it would be impossible to tell what the application actually does. A consequence is that sub-packages cannot contain “additional” functionality, only more details. If they contained functionality not visible one level up, the reader could not build a complete picture of what the application does, and would be forced to dive into all the sub-packages to make sure nothing is missed.

Of course if more detail is needed, then a reader might descend into sub-packages. At every level the situation should be the same. The current package would have some abstractions and all sub-packages would have more details of those and only those functionalities.

So that leaves us with the following rules to make our packages reader-friendly:

  1. Packages should never depend on sub-packages
  2. Sub-packages should not introduce new concepts, just more details
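
The first rule can be checked from package names alone. Here is a minimal sketch, assuming dependencies are given as pairs of fully qualified package names. The class `PackageRules` and the `com.example.shop` packages are hypothetical. Rule 2 is about concepts and is not something this check can verify.

```java
/** Sketch: checks rule 1 for one dependency between two packages. */
class PackageRules {

    /** True if candidate lies anywhere below pkg in the package hierarchy. */
    static boolean isSubPackageOf(String candidate, String pkg) {
        // The trailing dot keeps "shop" from matching "shopping".
        return candidate.startsWith(pkg + ".");
    }

    /** Rule 1: a package must not depend on any of its own sub-packages. */
    static boolean violatesRuleOne(String fromPackage, String toPackage) {
        return isSubPackageOf(toPackage, fromPackage);
    }

    public static void main(String[] args) {
        // Hypothetical package names, for illustration only.
        String root = "com.example.shop";
        String checkout = "com.example.shop.checkout";
        String search = "com.example.shop.search";

        System.out.println(violatesRuleOne(root, checkout));    // root -> sub-package: prints "true"
        System.out.println(violatesRuleOne(checkout, root));    // sub-package -> ancestor: prints "false"
        System.out.println(violatesRuleOne(checkout, search));  // sibling -> sibling: prints "false"
    }
}
```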

Let’s look at an example from Project Gerec:

[Figure: dependency graph of the Project Gerec packages]

It is a clear hierarchy, in which the root package does not depend on any of the sub-packages. This means a reader can start with the classes there, without having to understand what the sub-packages are about. Some of the top level classes are:

HttpRequest.java
HttpResponse.java
MediaType.java
Header.java
ResourceReference.java
...

All of these listed classes and most of the other ones on this level are pure interfaces with proper business methods, so the overall logic and feature set of the library can be understood just by looking at the root package. The sub-packages do not introduce new functionality, just more details, specializations, implementations of the root classes.
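
To make the shape of such a hierarchy concrete, here is a small sketch with an interface that would live in the root package and implementations that would live in a sub-package. It is not taken from Gerec: the billing names and the `com.example.billing` packages are hypothetical, and everything is collapsed into one file only so that it compiles on its own.

```java
// Wiring code; in a real project this would live outside the hierarchy below.
class BillingDemo {
    public static void main(String[] args) {
        PaymentStrategy strategy = new ImmediatePaymentStrategy();
        System.out.println(strategy.amountDueNow(1500)); // prints "1500"
    }
}

// Would live in the root package com.example.billing (hypothetical):
// an interface with a business method that tells the reader what the library does.
interface PaymentStrategy {
    /** Returns the amount, in cents, that is due right away for the given order total. */
    long amountDueNow(long orderTotalCents);
}

// Would live in com.example.billing.payment (hypothetical):
// no new concept, just one detailed way of paying.
class ImmediatePaymentStrategy implements PaymentStrategy {
    @Override
    public long amountDueNow(long orderTotalCents) {
        return orderTotalCents; // everything is due immediately
    }
}

// Would also live in com.example.billing.payment (hypothetical).
class EndOfMonthPaymentStrategy implements PaymentStrategy {
    @Override
    public long amountDueNow(long orderTotalCents) {
        return 0; // nothing is due until the end of the month
    }
}
```

The implementations depend on the root interface, never the other way around, so the interface can be read and understood on its own.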

Maintaining an Application

A second common use-case for reading code is trying to find the right place to apply a change. A change is usually some business-relevant detail that has to be modified or implemented. Sometimes the developer making the change does know exactly where to apply the change in code, or is familiar with the application enough to know the possible places to start looking.

If the application or the team is bigger, however, it becomes less likely that the developer getting the task knows exactly where to look. In these cases the developer has to search through the packages.

The search has to start somewhere, and it seems logical to start again at the root package: select the correct sub-package to investigate from there, and keep going down the right path to the package that contains the right classes. This can only work, however, if the root package is the most abstract one and sub-packages only introduce more details. So the two rules from the previous section help in this case too!

There is a finer point here also. We can help the reader with selecting the correct path by giving her more clues in the package names. If the package names contain technical terms like: “entity”, “service”, “manager”, “usecase”, “database”, “resource”, “view”, etc., we effectively assume that the reader will know our architecture. Maybe you think that is a reasonable assumption to make, and perhaps it is in most cases. It is however unnecessary to make this assumption.

We do know that the reader is here to make a change. This change most likely has a business context with which the reader will be somewhat familiar. So the following package names would probably help more and assume less prior knowledge: “shoppingcart”, “checkout”, “search”, “favorites”, etc.

Let’s see an example from the Spring-Framework Petclinic example project. These are the first sub-packages under the root package:

model/
repository/
service/
util/
web/

Which package would you pick for the following changes:

  • Change the date format for new visits
  • Mark pets as deceased instead of removing them
  • If several owners live in one household, list all pets for that household

Where would you look for the same changes in this hierarchy:

pet/
owner/
visit/

While this hierarchy might look unfamiliar to some and would take perhaps more effort to write, it does seem to be much easier to navigate.

Summary

Almost everyone starts by using packages as a tool to organize classes basically for themselves, applying a structure which seems logical at the time they write the code.

Packaging can be much more powerful, however. It can carry more knowledge and more help for the reader, increasing the maintainability of the software. This article highlights three rules to take your packaging to this next level:

  1. Packages should never depend on sub-packages.
  2. Sub-packages should not introduce new concepts, just more details.
  3. Packages should reflect business concepts, not technical ones.

14 thoughts on “Happy Packaging!”

    1. Thank you for your continued interest, much appreciated! I have been busy preparing talks for some conferences in my region over the last few months, but a new article is definitely coming soon. 🙂

  1. My thoughts keep returning to the proposed packaging strategy. I find it attractive, but I have not yet implemented it fully in any non-trivial project.

    I usually end up with a structure similar to this (let’s take the billing domain as an example):

    ```
    root/order/SingleProductOrder.java
    root/payment/ImmediatePaymentStrategy.java
    root/payment/EndOfMonthPaymentStrategy.java
    root/payment/CompositePaymentStrategy.java
    root/Order.java (interface)
    root/PaymentStrategy.java (interface)
    ```

    This works well:
    – interfaces in `root` package have no dependencies on other packages
    – implementation classes in `order` and `payment` packages only depend on ancestor packages (`root` in this case)

    Such a tree-like structure where each package depends *only* on its ancestors has many benefits (as argued in the article).

    However, every time I tried to use this strategy, I ran into at least one lower-level concept that was needed in multiple different branches of the tree (i.e. some low-level detail shared between implementations of two different concepts). Sometimes it was a domain concept, sometimes some technical implementation detail.

    In order to not break the “package should depend only on its ancestors” rule, I’d be forced to move that common concept up to the first common ancestor (typically `root`), and it would dilute the nice set of core interfaces there.

    Does this problem sound familiar? Have you found any approaches that help mitigate it?

    1. Well, the rule is “don’t depend on sub-packages”. You can also depend on classes in the same package, in ancestor packages or in *sibling* packages. By sibling packages I mean anything that is in a different branch from your class. This might be what you’re looking for.

      For example you’re quite free to depend on “org.apache.log4j.Logger”. This is probably not an ancestor to your class, but is still completely fine. Similarly you can move your object that is needed in multiple packages to a different branch altogether that both packages now can depend on.

      Note that this is not as easy as it sounds, because of the other rules. Your new package should still describe something valuable to the business. You can’t just lift any object into some package; it has to make sense there, and it should not introduce a new concept to that tree. Also, you now have to watch out for circular dependencies between siblings.

      You can make some exceptions, of course. For example, if this object is of a technical nature, you can create a technical package, but only if that package could be lifted out into some library, at least theoretically, i.e. it doesn’t contain any business-specific things. Then that package could be considered external to the application itself and therefore not in the “domain” of the application.

      1. This is exactly the problem I was having for some time, and basically how I managed (although it might be too early to say so) to solve it.

        I saw that once you have a clean package tree structure, a concept which needs to be shared across the tree branches appears, like Gediminas described. Usually a technical one, but not necessarily. Seems like when you want to organize a domain in a tree structure, there are multiple dimensions in the domain, along which you can structure the tree. Say, you have a payments domain with certain payment instruments. You could then have a tree structure with `CreditTransfer` and `DirectDebit` interfaces in root package, and `ct` and `dd` packages as subtrees. But they both might need settlement.

        So you could have a `Settlement` interface in root package with subtree packages of `rtgs` for Real Time Gross Settlement and `ns` for Net Settlement. But then you would need both Credit Transfer and Direct Debit in both `rtgs` and `ns` (assuming both instruments need both settlement types).

        So what we have here is two possible trees, both needing to reuse a component across branches. What we can do is pick the main tree and assign to it the root package. Create other packages in the root package and put other trees in them. The main tree can depend on the secondary trees, but not vice versa. So if the application is really more about the instruments and not settlement types, the structure could be like this:

        /CreditTransfer
        /DirectDebit
        /ct/CreditTransferInitiation
        /dd/DirectDebitInitiation
        /settlement/Settlement
        /settlement/rtgs/…
        /settlement/ns/…

        The idea is for these subtrees to be independent in a way similar to a separate library, as far as package structuring goes.

        What do you think?

        1. Yes, that sounds right.

          The main thing is to never depend on sub-packages, and to help the reader understand what is going on right at the root at every scale. With the option to dig into sub-packages for more details (and only details, not new concepts). That’s it.

    1. Thanks for commenting.

      Your first solution is actually something that I often do. Push the conditional delegation into an implementation into the sub-package. With this you are actually using an already existing abstraction to implement some feature, which is always a good thing. Makes the abstraction stronger, validates the design somewhat.

      Note also that with this, your top-level implementation does not have to refer to (instantiate) a concrete layout specification. It should not depend on anything specific anymore. You can have a constructor parameter for the layout specification interface. Basically, push the actual instantiation further “up”. This is exactly what we want, so when I read your top-level package, I know I can understand what happens without reading the sub-packages at the same time.

      If you just want to provide some default plugging together code, move it somewhere else. That doesn’t need to be in this hierarchy at all. Or you can have it in the existing sub-package, maybe as factory methods in these implementations. Depends obviously on your exact case.

      All in all, I don’t think this is a valid exception and your first option would actually correctly solve this issue.

      1. Ah, yes, the changes you described make it possible to have a structure that does not violate the packaging rules. Thank you. (PR showing the changes: https://github.com/grimsa/practice–packaging/pull/3)

        One consequence though seems to be that you can not just push the instantiation “up”, it must be “up and then sideways a little” 🙂

        I mean, if we have all the classes in a single package tree that follows the packaging rules (e.g. `com.company.app` in my example) then we *must* also have another package outside of this package (e.g. `com.company.appconfig`) which would be responsible for plugging everything together (otherwise the higher-level package would still have to instantiate objects from a lower-level package).

        I’ll experiment with this some more, but I suspect this will likely lead to having two top-level packages:
        – One for the object hierarchy that models the solution to the domain problem, i.e. the essence of the app, or an entrypoint to the domain (equivalent to `com.company.app` in my example)
        – Another for the technical entrypoint to the app (Main class), the “plugging everything together code” (e.g. Spring configuration), and possibly code that addresses non-domain concerns (e.g. a health-check endpoint).

        Maybe `domain` and `app` respectively would be good names for these.

        > Or you can have it in the existing sub-package, maybe as factory methods in these implementations.

        I’m not sure I understood what this would look like. Could you elaborate a little bit?

        1. In some projects I do have a separate package for “Main”, like “app” or something similar. In libraries this is mostly not necessary, although in some I do have “builders” that can plug together things based on some parameters set in the builder. Sometimes these things make sense to have, sometimes they are necessary technical “evil”.

          Regarding factory methods. You can build relatively simple things in factory methods. Like having a static method `createLayout()` or something in the tilt-aware layout specification. Basically invert the relationship. Instead of instantiating the lower-level stuff from higher-level, instantiate the higher-level thing from the lower-level class with the lower-level object already configured.
