I remember approximately 15 years ago when I started learning Java. I read a lot about this ‘package’ thing and ‘namespaces’ and I totally didn’t get it.
The sad thing is: while some aspects of packages are understood by pretty much everybody in the industry, others aren’t. So let’s have a look at what packages are good for.
Namespaces: By prefixing all your packages with a domain you control, you make sure that your class names are unique. This is essential for the success of an unbelievable number of open source projects. Every project can (and probably does at some stage) define a ‘Filter’ class without having that class interfere with all the other classes of that same name (apart from the poor developer that copied some code without import statements from the web and now has to figure out which Filter class was actually referenced). This one is pretty well understood and I haven’t seen any relevant usage of the root package in ages.
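As a small illustration of the namespace point, here is a sketch using two classes from the JDK that share a simple name. It uses Date rather than Filter only because the JDK happens to ship two top-level Date classes; the class name NamespaceDemo is made up for this example.

```java
import java.util.Date;

class NamespaceDemo {

    // Resolved through the import at the top of the file.
    static String utilDateName() {
        return Date.class.getName();
    }

    // Same simple name, so this one has to be written with its full package.
    static String sqlDateName() {
        return java.sql.Date.class.getName();
    }

    public static void main(String[] args) {
        System.out.println(utilDateName()); // java.util.Date
        System.out.println(sqlDateName());  // java.sql.Date

        // The simple names collide; only the package prefix tells them apart.
        boolean sameSimpleName =
                Date.class.getSimpleName().equals(java.sql.Date.class.getSimpleName());
        System.out.println(sameSimpleName); // true
    }
}
```

This is also what bites the developer who pastes code without its import statements: the simple name alone no longer says which class was meant.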
Organization: My son has a huge box of Lego bricks. Probably multiple thousands, maybe tens of thousands of them. When he looks for a simple 2×4 brick, this is not a problem. But when he is searching for that special brick that exists only four times in the collection, or even just once? It might take a loooooong time to find it. Compare that to an apothecary cabinet. Hundreds of drugs, and it normally takes only seconds to find the right one. And they don’t even use Google for that! They just have a strict ordering principle for where each drug belongs, including a rule for how the right box for a new drug is determined. Since everybody involved knows that principle, it is easy to determine the box where a drug is to be found. Such an ordering principle is tremendously helpful when established early in a project.
When defining such a principle, one criterion isn’t sufficient most of the time. But if you use more than one, make sure they don’t interfere, by making the rules orthogonal. This means: don’t have a rule saying “All database access code has to go in package x” and another rule stating “All code related to customers has to go into package y”. Otherwise you won’t know where to put the CustomerDAO. Instead, apply orthogonal rules at different depths of the package tree. My default package structure looks like this:
<organisational-prefix>.<application>.<deployment-unit>.<module>.<layer>.<optional further substructure if needed>
This results in package names like
If you look at a package structure like this, it becomes pretty obvious where a new class belongs, or where to look to see whether something like it already exists. It gets even better when you avoid names like util or misc, which can hide more or less everything. Also, you can look at these packages and immediately learn something about the architecture. As soon as you see a level of packages named client, webserver and batchserver, you’ll form a model in your head of how the application is structured, and if the names are picked well it is probably close to the real thing. Since the same rules for layers apply in each module, you can find out more about the structure of the application in the lower packages as well.
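To make the naming scheme concrete, here is a minimal sketch that assembles a package name from its levels. All concrete values (com.example, shop, customer, persistence, dao) and the class PackageNames are hypothetical and chosen only for illustration; only webserver is one of the deployment unit names mentioned above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PackageNames {

    // Joins the levels of the scheme in their fixed order:
    // prefix, application, deployment unit, module, layer, optional substructure.
    static String packageName(String orgPrefix, String application, String deploymentUnit,
                              String module, String layer, String... substructure) {
        List<String> parts = new ArrayList<>(
                Arrays.asList(orgPrefix, application, deploymentUnit, module, layer));
        parts.addAll(Arrays.asList(substructure));
        return String.join(".", parts);
    }

    public static void main(String[] args) {
        // Hypothetical names, not taken from a real project.
        System.out.println(packageName("com.example", "shop", "webserver", "customer", "persistence"));
        // com.example.shop.webserver.customer.persistence
        System.out.println(packageName("com.example", "shop", "webserver", "customer", "persistence", "dao"));
        // com.example.shop.webserver.customer.persistence.dao
    }
}
```

Because module and layer sit on different levels, a class like the CustomerDAO has exactly one home: the customer module first, then its persistence layer.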
The modules in between communicate the kind of domain the application deals with. Quite naturally the important concepts get their own package and thereby make a statement to everyone inspecting the code: this is an important concept in this application.
I also like adding the rule “A package should contain a-b classes, but must not contain c or more” with appropriate values for a, b and c. This forces the creation of new packages as the application grows, keeping each package to a manageable size.
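The size rule can be expressed as a tiny check. This is only a sketch: the values 3, 15 and 30 for a, b and c are hypothetical placeholders, and treating everything outside a-b but below c as a warning is one possible reading of the rule.

```java
class PackageSizeRule {

    enum Verdict { OK, WARNING, VIOLATION }

    // "should contain a-b classes" is the soft range,
    // "must not contain c or more" is the hard limit.
    static Verdict check(int classCount, int a, int b, int c) {
        if (classCount >= c) {
            return Verdict.VIOLATION;
        }
        if (classCount >= a && classCount <= b) {
            return Verdict.OK;
        }
        return Verdict.WARNING;
    }

    public static void main(String[] args) {
        // Hypothetical thresholds: a = 3, b = 15, c = 30.
        System.out.println(check(10, 3, 15, 30)); // OK
        System.out.println(check(20, 3, 15, 30)); // WARNING
        System.out.println(check(30, 3, 15, 30)); // VIOLATION
    }
}
```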
Of course on smaller applications the structure might get scaled down. For example if there is just one deployment unit there is no need for a separate package level for that classification.
The last use for packages is the most ignored one: the intermediate modeling block. Joe Average Developer concerns herself mostly with classes, methods and single lines of code, while trying to come up with a code structure on that level that fits the needs of the application. Often there is some kind of architect who figures out how to deploy the application and thereby determines the necessary deployment units (think separate jars). If you look at the scale of these artifacts, something interesting might catch your eye:
1 method consists of approximately 10 lines of code.
1 class consists of approximately 10 methods.
1 jar consists of approximately 100 – 1000 classes.
If nobody takes care of packages, there is at least one, often two levels of structure missing! This gap can and should be filled with packages. This doesn’t only mean the packages should exist and be of reasonable size, it also means they should adhere to common design guidelines, especially the Single Responsibility Principle and proper handling of dependencies:
Single Responsibility Principle: With the naming scheme proposed above, a lot of the work toward honoring the SRP is already done. If the contents of a package do what its name says, everything is fine on that front.
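The “one, often two levels” estimate can be checked with a quick calculation. The sketch below assumes that every level of structure should group roughly 10 units of the level below, as methods and classes do in the list above; the class and method names are made up.

```java
class StructureGap {

    // Counts how many grouping levels are needed between classes and the jar
    // so that no level holds more than fanOut units of the level below it.
    static int missingLevels(int classesPerJar, int fanOut) {
        if (fanOut < 2) {
            throw new IllegalArgumentException("fanOut must be at least 2");
        }
        int levels = 0;
        int units = classesPerJar;
        while (units > fanOut) {
            units = (units + fanOut - 1) / fanOut; // group into chunks, rounding up
            levels++;
        }
        return levels;
    }

    public static void main(String[] args) {
        System.out.println(missingLevels(100, 10));  // 1: about 10 packages in the jar
        System.out.println(missingLevels(1000, 10)); // 2: packages plus super packages
    }
}
```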
Managing dependencies is a tougher beast. Java currently doesn’t offer a proper system to control dependencies between packages and especially super packages, i.e. packages that contain multiple other packages. There is OSGi, but I found it a pain in the neck to work with, especially since I never needed all the dynamic loading stuff but suffered from the resulting class loader issues. There is also Jigsaw, but it is not there yet. Therefore I prefer homegrown tests for defining and verifying the package structure of the applications I work with. My tool of choice is JDepend. It gives you lists of dependencies between packages, which you can compare against rules you define. Somebody creates a dependency from package A to package B that should not exist? Boom, the test turns red.
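The core of such a homegrown test is a comparison between the dependencies that exist and the ones that are allowed. The sketch below shows only that comparison on plain maps and deliberately does not use JDepend’s API; in a real test the actual dependencies would come from a tool such as JDepend. The package names and the class DependencyRules are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class DependencyRules {

    // Returns every actual dependency "from -> to" that the rules do not allow,
    // sorted so that the result is stable between runs.
    static List<String> violations(Map<String, Set<String>> actual,
                                   Map<String, Set<String>> allowed) {
        List<String> found = new ArrayList<>();
        for (Map.Entry<String, Set<String>> entry : new TreeMap<>(actual).entrySet()) {
            Set<String> permitted =
                    allowed.getOrDefault(entry.getKey(), Collections.<String>emptySet());
            for (String target : new TreeSet<>(entry.getValue())) {
                if (!permitted.contains(target)) {
                    found.add(entry.getKey() + " -> " + target);
                }
            }
        }
        return found;
    }

    static Set<String> setOf(String... names) {
        return new HashSet<>(Arrays.asList(names));
    }

    public static void main(String[] args) {
        // Hypothetical rules: the ui layer may use the domain layer, not the reverse.
        Map<String, Set<String>> allowed = new HashMap<>();
        allowed.put("shop.order.ui", setOf("shop.order.domain"));
        allowed.put("shop.order.domain", setOf());

        Map<String, Set<String>> actual = new HashMap<>();
        actual.put("shop.order.ui", setOf("shop.order.domain"));
        actual.put("shop.order.domain", setOf("shop.order.ui")); // should not exist

        System.out.println(violations(actual, allowed));
        // [shop.order.domain -> shop.order.ui]
    }
}
```

A unit test would simply assert that the returned list is empty, which is the point where the test turns red.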
So what are useful rules for package dependencies? First: no cycles. Not on the package level, and not on the layer or module level as used above either. Second: modules and layers have a strict order in which they can depend on each other; everything else is forbidden.
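Both rules can be checked on a simple dependency graph. The following is a minimal sketch under assumptions: dependencies are given as a map from a unit to the units it uses, a depth-first search stands in for whatever cycle detection a real tool provides, and the strict order is a list from the top unit down to the bottom one. The same methods work for layers or modules once each package is mapped to its layer or module name.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class PackageGraphRules {

    // Rule 1: no cycles. Depth-first search that reports a cycle
    // when it reaches a unit that is already on the current path.
    static boolean hasCycle(Map<String, Set<String>> deps) {
        Set<String> done = new HashSet<>();
        Set<String> onPath = new HashSet<>();
        for (String unit : deps.keySet()) {
            if (visit(unit, deps, done, onPath)) {
                return true;
            }
        }
        return false;
    }

    private static boolean visit(String unit, Map<String, Set<String>> deps,
                                 Set<String> done, Set<String> onPath) {
        if (onPath.contains(unit)) {
            return true;
        }
        if (done.contains(unit)) {
            return false;
        }
        onPath.add(unit);
        for (String next : deps.getOrDefault(unit, Collections.<String>emptySet())) {
            if (visit(next, deps, done, onPath)) {
                return true;
            }
        }
        onPath.remove(unit);
        done.add(unit);
        return false;
    }

    // Rule 2: strict order. A unit may only depend on units that come later
    // in the list; dependencies involving unknown units count as violations.
    static boolean respectsOrder(Map<String, Set<String>> deps, List<String> order) {
        for (Map.Entry<String, Set<String>> entry : deps.entrySet()) {
            int from = order.indexOf(entry.getKey());
            for (String target : entry.getValue()) {
                int to = order.indexOf(target);
                if (from < 0 || to < 0 || to <= from) {
                    return false;
                }
            }
        }
        return true;
    }
}
```

With a hypothetical layer order of ui, domain, persistence, a dependency from domain to persistence passes, while one from persistence to ui fails the second rule even if it does not close a cycle.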
These rules considerably limit the degrees of freedom one has as a developer. But in my experience they smoke out violations of the Single Responsibility Principle, which often surface as cyclic dependencies. For example, if you have an Order module and a Customer module, it feels like these two need to know each other. If you have an Order, you want to know the Customer it belongs to. If you have a Customer, you must be able to tell her the Orders she placed. Right? Yes, probably. But do you need the full-blown objects and functionality on both sides? Probably not. You can introduce an interface package that contains only the very core of the customer functionality needed by the Order module, plus a separate full Customer module that holds the references to the orders. That breaks the cycle AND achieves a stronger separation of concerns in your package structure.
This in turn helps when you try to evolve your application. What is a package today might grow into a deployment unit someday, and if you have circular dependencies between deployment units you’ll have some serious problems. Or maybe your team grows into multiple teams. With a clean package structure as described above, you have obvious boundaries where you can split, and also an obvious criterion for when the teams have to sit together to discuss changes to a package used by multiple teams.
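Here is a minimal sketch of the Order and Customer split. Everything sits in one file so it can be run as is; the comments name the package each type would live in. All type names (CustomerRef, Order, Customer) and package names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class InterfacePackageDemo {
    public static void main(String[] args) {
        Customer customer = new Customer("c-1", "Ada");
        Order order = customer.placeOrder("o-1");
        System.out.println(order.customer().name()); // Ada
        System.out.println(customer.orders().size()); // 1
    }
}

// Intended package: customer.core (the interface package). Depends on nothing.
interface CustomerRef {
    String customerId();
    String name();
}

// Intended package: order. Depends only on customer.core.
class Order {
    private final String orderId;
    private final CustomerRef customer;

    Order(String orderId, CustomerRef customer) {
        this.orderId = orderId;
        this.customer = customer;
    }

    String orderId() { return orderId; }
    CustomerRef customer() { return customer; }
}

// Intended package: customer (the full module). Depends on customer.core and order.
class Customer implements CustomerRef {
    private final String id;
    private final String name;
    private final List<Order> orders = new ArrayList<>();

    Customer(String id, String name) {
        this.id = id;
        this.name = name;
    }

    @Override public String customerId() { return id; }
    @Override public String name() { return name; }

    Order placeOrder(String orderId) {
        Order order = new Order(orderId, this);
        orders.add(order);
        return order;
    }

    List<Order> orders() { return Collections.unmodifiableList(orders); }
}
```

The dependencies now run in one direction only: customer uses order, and both use customer.core. An Order can still name its customer, and a Customer can still list her orders.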
Take care (of your package structure).