Developing for Maintainability

Just like supportability, maintainability is one of those non-functional requirements that is, or should be, demanded of every software development project. So what does that mean? Wikipedia defines it as

the ease with which a software product can be modified in order to:

  • correct defects
  • meet new requirements
  • make future maintenance easier, or
  • cope with a changed environment;

Wow, that’s great, because it lends itself to an ‘easy’ test for maintainability: take the completed product, make up some new requirements and measure how long it takes to implement them. If you do this with different products built for the same task, you can compare the time needed and thus compare their maintainability. Unfortunately, most of the time the requirement is stated before any software is written, and only one piece of software gets written, so there is nothing to compare it to. So what to do?

Since one can’t realistically measure maintainability, we should once more concentrate on the characteristics that make software easier to maintain, i.e. easier to change. There are plenty of measures around that are thought to be linked to maintainability, Cyclomatic Complexity being possibly the best known. I personally prefer toxicity, which is based on Cyclomatic Complexity and a couple of other measures. Each measure contributes to the toxicity by the amount it exceeds a certain threshold:

Metric                            Level        Threshold
File Length                       file         500
Class Fan-Out Complexity          class        30
Class Data Abstraction Coupling   class        10
Anon Inner Length                 inner class  35
Method Length                     method       30
Parameter Number                  method       6
Cyclomatic Complexity             method       10
Nested If Depth                   statement    3
Nested Try Depth                  statement    2
Boolean Expression Complexity     statement    3
Missing Switch Default            statement    1

Compared to a single simple measure, this has the benefit of being less prone to optimization for the measurement, although of course this is still possible.
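
To make the idea concrete, here is a minimal sketch of how such a score could be computed. It rests on an assumption that is not spelled out above: a metric that exceeds its threshold contributes value / threshold, and a metric at or below its threshold contributes nothing. The class and method names are made up for this example, and only a few of the thresholds from the table are included.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of a toxicity score. Assumption: a metric above its threshold
// contributes value / threshold; metrics at or below it contribute 0.
class Toxicity {
    // A few of the thresholds from the table, keyed by metric name.
    static final Map<String, Integer> THRESHOLDS = new LinkedHashMap<>();
    static {
        THRESHOLDS.put("Method Length", 30);
        THRESHOLDS.put("Parameter Number", 6);
        THRESHOLDS.put("Cyclomatic Complexity", 10);
        THRESHOLDS.put("Nested If Depth", 3);
    }

    // Sums the contributions of all measured values that exceed their threshold.
    static double score(Map<String, Integer> measured) {
        double total = 0.0;
        for (Map.Entry<String, Integer> e : measured.entrySet()) {
            Integer threshold = THRESHOLDS.get(e.getKey());
            if (threshold != null && e.getValue() > threshold) {
                total += e.getValue().doubleValue() / threshold;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Integer> method = new LinkedHashMap<>();
        method.put("Method Length", 45);         // above 30, contributes 45 / 30
        method.put("Cyclomatic Complexity", 8);  // below 10, contributes nothing
        System.out.println(score(method));       // prints 1.5
    }
}
```

Because several metrics feed into one number, tuning the code for a single metric moves the score less than it would with a single measure.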

So does a low, even zero toxicity guarantee maintainable code? Hell no. For starters there are a couple of different things to consider:

  • Is the code well covered with tests?
  • Are the current requirements specified in a suitable manner (e.g. using tests)?
  • Is the code written in a consistent style?
  • Is all the code including the documentation under control of a version control system?
  • Is the structure of the application, its architecture defined and documented?
  • Is there an automatic process for building new versions of the software (e.g. Ant, Maven, make scripts or similar)?
  • Is the code written in a language that is well known and understood and which has a large user community?

Everything but the last bullet point is pretty much identical to what I consider base practices for any serious software developer.

So I propose: the next time you encounter the vague requirement of maintainability, replace it with useful and easily testable requirements based on specific practices and metrics. It still won’t guarantee maintainability, but it will increase the chance of it.

Developing for Supportability

Support beams and wires of a bridge

When reading the specification of a piece of software to be written, you are bound to find some non-functional requirements. Among these there will be, or at least should be, supportability. But what the heck does that mean? How do you install supportability? Let me present some ideas on what you can do to improve it.

Let your application log in a well-defined, reliable way to a location that is easily accessible. Flat files on a server qualify. If you must log on a client, consider implementing a way to automatically transfer the log file to a support person. If you log into a database, make sure the support person can access it easily.

Make your application easy to shut down and start. This sounds trivial, but it is easy to break this ability. Consider the following checklist:

  • What happens when you start two different versions of your application against the same database? A nicely supportable application should notice this and react accordingly.
  • What happens when you stop your application while it is processing a request?
  • If you have batches or batch-like processes, what will happen to them when you try to stop your application while a batch runs? Do you have to wait the 5 hours until the batch finishes? Or will the batch stop and roll back, so you have to wait an hour for the rollback to happen? Or will it stop nicely within a minute and pick up its work automatically after restart?
  • Are you putting data into a database or a queue? What happens when the queue or the database gets started after the application?
  • Do you receive messages from a queue? What happens when the first message arrives before your application is available? What happens when you receive a message that you already processed? What happens when you receive a message for which you already processed a later message?
  • How long does it take to shut down and restart an instance of your application when it is under full load? If this takes more than a few minutes, is it possible to stop and restart only parts of your application?
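
The questions about duplicate and out-of-order messages can be answered in code. The following is only a sketch under an assumption that the checklist does not make: every message carries an entity key and a sequence number that the sender increases with every message for that entity. The class name MessageGuard is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical guard that decides whether an incoming message should be applied.
// Assumption: each message has an entity key and a sequence number that the
// sender increases for every new message about that entity.
class MessageGuard {
    private final Map<String, Long> lastApplied = new HashMap<>();

    // Returns true if the message is new. Duplicates and messages older than
    // one that was already processed are rejected.
    synchronized boolean shouldApply(String entityKey, long sequence) {
        Long last = lastApplied.get(entityKey);
        if (last != null && sequence <= last) {
            return false;
        }
        lastApplied.put(entityKey, sequence);
        return true;
    }
}
```

Note that this sketch keeps its state in memory only. To survive the restarts the checklist is about, the last applied sequence numbers would have to be stored together with the data they protect.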

Make the state of your application visible for support personnel. Most applications just report arbitrary errors when a component of the system is down. It’s up to the supporters to guess if it is the database, a queue, a web service or the application itself which is causing the problem. Identify the resources your application depends on. Write a check which tests all these resources, and make this check available, for example as a special health check web page.
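
A minimal sketch of such a check might look like the following. The class name HealthCheck and the idea of registering one probe per resource are assumptions made for this example. A real probe would, for instance, run a trivial query against the database, and the report would be rendered on the health check page.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BooleanSupplier;

// Sketch of a health check that probes every resource the application depends on.
// Resource names and probes are placeholders supplied by the caller.
class HealthCheck {
    private final Map<String, BooleanSupplier> probes = new LinkedHashMap<>();

    void register(String resource, BooleanSupplier probe) {
        probes.put(resource, probe);
    }

    // Runs all probes in registration order. A probe that returns false or
    // throws marks its resource as DOWN; the other probes still run.
    Map<String, String> run() {
        Map<String, String> report = new LinkedHashMap<>();
        for (Map.Entry<String, BooleanSupplier> e : probes.entrySet()) {
            String status;
            try {
                status = e.getValue().getAsBoolean() ? "UP" : "DOWN";
            } catch (RuntimeException ex) {
                status = "DOWN (" + ex.getMessage() + ")";
            }
            report.put(e.getKey(), status);
        }
        return report;
    }
}
```

The point of the design is that one failing resource never hides the state of the others, so the supporter sees at a glance which component is the problem.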

Put components that are likely to fail behind some kind of buffer. Your database might be so important for the application that this doesn’t work for it. But if you are posting stuff to a queue (or webservice or …), consider using a local queue as a buffer, so your application can work as usual even when the target queue isn’t available.
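
The buffering idea can be sketched as follows. BufferedSender is a made-up name, and the target is modelled as a function that returns true when it accepted a message and returns false or throws when it is unavailable; both are assumptions for the sake of the example.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Predicate;

// Sketch of a local buffer in front of an unreliable target (queue, web service, ...).
// Assumption: the target returns true if it accepted the message and
// returns false or throws if it is currently unavailable.
class BufferedSender {
    private final Deque<String> buffer = new ArrayDeque<>();
    private final Predicate<String> target;

    BufferedSender(Predicate<String> target) {
        this.target = target;
    }

    // Never fails for the caller: the message is buffered, then a flush is attempted.
    synchronized void send(String message) {
        buffer.addLast(message);
        flush();
    }

    // Delivers buffered messages in order and stops at the first failure.
    synchronized void flush() {
        while (!buffer.isEmpty()) {
            boolean delivered;
            try {
                delivered = target.test(buffer.peekFirst());
            } catch (RuntimeException e) {
                delivered = false;
            }
            if (!delivered) {
                return; // target is down, keep the rest for a later flush
            }
            buffer.removeFirst();
        }
    }

    synchronized int pending() {
        return buffer.size();
    }
}
```

This in-memory version loses its content when the application stops. A buffer that has to survive restarts would need to be persistent, for example a local queue as mentioned above.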

Last but not least: document your application. The Agile Manifesto says that working software is more important than documentation. It doesn’t say you don’t need documentation. And I’d say the documentation for servicing your application might be the most important one. The normal user who uses your application every day will figure out a way to get along. If not, he will call you or your boss. But the poor support person has to support dozens of applications, and since your application just works, he’ll encounter it only a few times a year. He will know nothing about it except the stuff documented in the manual. So make sure there are instructions on how to interpret the logs, how to shut down and restart the application, how to analyze the internal state of the application and what happens when some connected component fails.

Have you noticed something? The vague non-functional requirement ‘supportability’ turned into a nice set of very functional requirements. You can attach a price to that, decide what pieces of it you’ll really need and measure if it really works. And I claim this works with all the much hated non-functional requirements.

Social Media and Agile

Flip up Shades

For about a decade now everybody in IT has been talking about Agile, but hardly anybody else does. There is a somewhat similar concept of ‘Lean’ in other industries, but hardly anybody in a ‘real’ industry considers things like ‘release early’ a viable strategy. And that’s a good thing. I don’t want a second-iteration car. “Ooops sorry, the story about braking is still in the backlog”. I am confident that if those kinds of companies learned agile from the books, they’d need quite some time until they realized the fine difference between ‘ship’ and ‘shippable’.

But maybe these companies won’t learn it from the books, but from their tools. As said before, companies will learn about social media, and to some extent it will invade daily life and work at these companies. And if you start doing your documentation in a wiki for everybody to see, or blurt out your idea on an internal Twitter account, then you are presenting stuff as potentially shippable. Not the car you will eventually build, but maybe the concept for a new feature.

Everybody will be able to judge the results, comment, provide feedback. And without realizing it big old companies will become just a little bit agile.

Posted in: Agil by Jens Schauder 2 Comments ,

Versioned Data

Time Flies

In about one out of two projects the customer comes up with the requirement of ‘historization’ of data. And more often than not this leads to an unholy back and forth of discussions, prototypes and complaining. The reason, as far as I can tell, is that this is not a well-defined requirement. It can mean many things, and depending on what is meant a different implementation is called for. So here is my list of possible implementation approaches, along with the circumstances in which they might be useful. Note: back in the times when this blog was mostly German I wrote an article on the options for versioning or ‘historization’. The article you are reading is basically a translated rerun of that article.

Audit Trail

The audit trail is the simplest of all approaches. Whenever an object (or row in the database) changes, a copy of that object (row) gets stored in a separate table, together with a time stamp. This is so easy that it can be implemented with database tools alone, namely using triggers. With this approach it is fairly easy to identify the state of a given row at a given time. But it gets complicated to do the same for a whole object graph, since the objects that are part of that graph change at independent times, so you can’t just write a join. Also, if you are not so much interested in the state of a row at a certain time as in the changes that happened, the next approach might be more useful to you.
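
The lookup "state of a given row at a given time" can be illustrated with a small in-memory sketch. In a database this would be a history table filled by a trigger. The class AuditTrail and its method names are purely illustrative, and a row is reduced to a generic value.

```java
import java.time.Instant;
import java.util.Map;
import java.util.TreeMap;

// In-memory sketch of an audit trail for a single row. In a database this
// would be a separate history table filled by a trigger.
class AuditTrail<R> {
    // Copies of the row, ordered by the time of the change.
    private final TreeMap<Instant, R> history = new TreeMap<>();

    // Called on every change: stores a copy of the row with a time stamp.
    void record(Instant changedAt, R rowCopy) {
        history.put(changedAt, rowCopy);
    }

    // State of the row at the given time: the latest entry not after `at`,
    // or null if the row did not exist yet.
    R stateAt(Instant at) {
        Map.Entry<Instant, R> entry = history.floorEntry(at);
        return entry == null ? null : entry.getValue();
    }
}
```

For a single row this is one lookup. For an object graph you would need one such lookup per object, which is exactly the complication described above.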

Change Log

The change log is a variation of the audit trail. In both, a record is produced on every change to a database row. But a change log keeps track of the change, not so much of the state the row was in. So it contains, at least for the changed attributes, the old and the new value. This way it gets easy to find out what kinds of changes happen how often, a question often asked in data mining settings. Obviously you can and should fine-tune your implementation between audit trail and change log to your exact needs.
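
As a sketch of what a change log records, the following compares the state of a row before and after a change and keeps the old and the new value of every attribute that differs. Modelling a row as a map of attribute names to values is an assumption made for this example, as are the names ChangeLog and Change.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Sketch of a change log entry generator. Rows are modelled as attribute maps.
class ChangeLog {
    // One changed attribute with its old and new value.
    static final class Change {
        final String attribute;
        final Object oldValue;
        final Object newValue;

        Change(String attribute, Object oldValue, Object newValue) {
            this.attribute = attribute;
            this.oldValue = oldValue;
            this.newValue = newValue;
        }
    }

    // Records old and new value for every attribute that differs.
    // Assumes both maps contain the same attribute names.
    static List<Change> diff(Map<String, Object> before, Map<String, Object> after) {
        List<Change> changes = new ArrayList<>();
        for (String attribute : before.keySet()) {
            Object oldValue = before.get(attribute);
            Object newValue = after.get(attribute);
            if (!Objects.equals(oldValue, newValue)) {
                changes.add(new Change(attribute, oldValue, newValue));
            }
        }
        return changes;
    }
}
```

Counting such entries per attribute answers the "what kind of changes happen how often" question directly, without comparing consecutive row copies.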

Snapshot

In the snapshot approach you take a snapshot of a complete object graph when an element of that graph changes. Note that now we are operating on objects and no longer on tables and rows, so there is no easy way to do this kind of thing in the database. Since you are creating copies of object graphs, you can navigate and query these graphs just as you can navigate and query the original graph. On the other hand, the copies contain objects that didn’t change at all, so it is easy to end up with huge amounts of data fast. Since these tables contain large amounts of duplicated data, compression might be really effective. And since you are creating the complete object graph for your history objects, nothing forces you to use the same model as for the live data. So your history schema might look very different.

All approaches so far are concerned with the past state of objects. But sometimes this isn’t enough. Sometimes what starts as ‘historization’ ends up being full blown …

Versioned Objects

This is the most powerful approach: You model the versions i.e. the changing state of your objects as part of your domain model. Like so:

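import java.util.Date;
import java.util.Set;

public class Thing {
    private Set<ThingVersion> versions;

    // Returns the version that is valid at the given point in time.
    public ThingVersion getVersion(Date timestamp) {
        // …
    }
    // …
}

public class ThingVersion {
    // Point in time from which this version is valid.
    private Date validFrom;
    // …
}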

Depending on the context, you might use references to the ‘Thing’, to a specific version, or to the version that is valid at a specific point in time. Let me give you an example. When dealing with trains, a lot of rules have to be considered. Certain combinations of goods must not travel in wagons right next to each other. Wagons with certain goods have to have certain features. And so on. These rules change. And of course you need proof that the train you sent on a track one year ago obeyed all the rules valid at that point in time. So far this sounds like a case for one of the first three options. But when these rules change, you will get advance notice of that change. So you may have a train planned for the end of the month that has to obey the rules valid at that time, although it might violate rules which are in effect right now. In order to model this, you need rules that exist in versions. The train references the rule and uses its time of departure to request the appropriate rule version, the one valid on that day.
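
The train example could be sketched like this. All names (VersionedRule, Train, versionValidOn) are invented for illustration, a rule version is reduced to a plain string, and java.time.LocalDate is used instead of the Date from the listing above to keep the example short.

```java
import java.time.LocalDate;
import java.util.Map;
import java.util.TreeMap;

// Sketch of a rule that exists in versions. A version is reduced to a
// description string here; names are illustrative only.
class VersionedRule {
    private final TreeMap<LocalDate, String> versionsByValidFrom = new TreeMap<>();

    void addVersion(LocalDate validFrom, String ruleText) {
        versionsByValidFrom.put(validFrom, ruleText);
    }

    // The version valid on the given day: the one with the latest validFrom
    // that is not after that day.
    String versionValidOn(LocalDate day) {
        Map.Entry<LocalDate, String> entry = versionsByValidFrom.floorEntry(day);
        if (entry == null) {
            throw new IllegalArgumentException("No rule version valid on " + day);
        }
        return entry.getValue();
    }
}

class Train {
    private final LocalDate departure;
    private final VersionedRule rule; // references the rule, not a fixed version

    Train(LocalDate departure, VersionedRule rule) {
        this.departure = departure;
        this.rule = rule;
    }

    // The train picks the rule version by its own time of departure.
    String applicableRule() {
        return rule.versionValidOn(departure);
    }
}
```

With two versions of a rule, one valid from January and one from July, a train departing at the end of June resolves to the first and a train planned for the end of July resolves to the second, regardless of which version is in effect today.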

This will definitely make your model much more complex. So don’t use it if you don’t need to. But if you are dealing with planned changes this has proven to be a useful approach for me.

Software Development is like the Evolution of Life

Evolution

Software development has been compared to many things. I’d like to propose another comparison: Evolution.

Why another metaphor?

A metaphor enables you to think about a problem in a different way, thus possibly gaining new insight. It is also useful for explaining something to someone who otherwise wouldn’t understand what you are talking about. Or to use software development wording: It is an abstraction.

Why evolution?

The ideas of evolution are well known and considered valid in the scientific sense. It is also an area of abundant scientific work, which actually might be useful in the context of software development. Compare this to other metaphors like gardening and crafting.

And of course I think it fits.

Let’s start in the beginning. A new software development project starts with ideas. Ideas about what the software will be able to do, ideas about all the features it will have and ideas about in which way the software will be implemented. But on their own these ideas are nothing. The moment you stop thinking about them they cease to exist. If you don’t push the ideas forward, nobody else will.

This is similar to the primordial soup, in which many of the pieces, i.e. molecules, needed for life existed. But these molecules couldn’t yet reproduce themselves. And many of these molecules fell apart before finally some were ‘lucky’ enough to form droplets which met the requirements of the definition of life, although they were so extremely primitive that they can hardly be compared with our ideas of life. Consider the description Wikipedia gives:

These would combine in ever-more complex fashions until they formed coacervate droplets. These droplets would “grow” by fusion with other droplets, and “reproduce” through fission into daughter droplets, and so have a primitive metabolism

This could only happen because they existed in a suitable environment: there was enough energy (from the sun and volcanism), but it was not so hot that everything boiled away immediately. All the needed materials were there, suspended in water, so they could interact easily, often and fast.

Compare that to your ideas for the next software project. Maybe you don’t have the energy to drive your idea forward. Maybe you don’t have the right partners to work with, maybe you don’t have the necessary tools. All these might be reasons a software project dies before its first steps. But if you manage to bring your ideas into a suitable environment, you’ll be able to create a first prototype, a sketch, of what your product will be. This sketch is important. If you make it too complicated you won’t get there within this century, just as the dinosaurs weren’t created directly from the primordial soup.

Once you have this first prototype, things get really interesting. Again you need a suitable environment. But if you have that, features will get added to your application, and it will grow, both in size and in complexity. This in itself will create new ideas for features. Sometimes these features will be added to your piece of software, but sometimes they will form an independent software project. While your project moves along and gathers features and complexity, it will move at different speeds. Sometimes features will get added on a daily basis. Sometimes you will be busy cleaning up stuff and doing small tuning for days in a row, with hardly any new features. And sometimes you (hopefully) will decide that a cluster of features really does not help the application, and throw it out of your code base.

Again the similarities to evolution are abundant. Life sure grew in size and complexity. And many kinds of life could only appear because other forms of life were already there: almost all animals need oxygen, which only exists in the atmosphere because of plants. Carnivores would go extinct pretty fast if there weren’t other animals for food. Similar things are true for parasites. Evolution also doesn’t proceed at a steady pace. There were times when new species evolved extremely fast, and of course we all know about the extinction of the dinosaurs, which is only one of many cases where nature decided to drop a couple of features.

I hope you enjoyed my comparison so far. But I claimed that metaphors like this are actually useful. So let’s see what we can get out of this one:

  1. The environment is important. There is no life on Venus, because it is too hot. There is most probably no life on Pluto, because it is too cold, and there is hardly any life, if any at all, on Mars, because it is too dry (among other things). I’d say the same is true for software development. You need a good (integrated development) environment, office, coworkers and boss to develop great software.
  2. If you want to create something really great, a detailed plan is only useful if you are a god who can control everything, and even that is disputed.
  3. You can’t expect to control a software development project by changing one aspect of the environment. If you have fish and want to evolve them into animals living on land, drying up the oceans will kill the fish, but not create mammals. But making land available for testing, getting plants there for food, and convincing some of the fish that they will be safe from sea-borne predators on land will eventually convince some of them to develop lungs. This will take time and constant pushing.
  4. Simpler is better. Humans seem to think they own the world. The opposite is true. There are many more species, individuals and kilograms of bacteria than of humans. The same is true for ants.
  5. However carefully you engineer the environment, sometimes accidents like the human species just happen. In these cases starting over might be your only option.

The Social Web is a Tool, Fool

Digging prohibited

Lately many companies have come to the conclusion that they have to do something about this social web thingy. But when I listen to the discussions, I feel like I am traveling back in time to the end of the last millennium, when everybody thought the web (release 1.0 at that time) was the way to print money. The news is: it wasn’t back then and it isn’t now either.

For two reasons I think this time it is even worse!

1. For many companies it isn’t at all clear, what they are trying to achieve.
And as we all know from our last motivational training: If you don’t have a goal, it is difficult to reach it. So what are viable goals in the social web world?

  • Better customer service
  • Better internal communication
  • Higher employee satisfaction
  • Higher customer loyalty
  • Higher visibility
  • Utilize the long tail
  • Higher sales

These are attractive goals, and each might get supported by leveraging web2.0 technology. But all the web2.0 stuff is just tooling. Which brings me to the second, more important point:

2. Many companies have to change a lot in order to become web 2.0 compatible:

  • If a customer says it’s broken, it’s broken. Web2.0 is all about communication. One partner of communication is the customer. But if you do not intend to listen to what the customer says you might as well stop right now.
  • If you want your employees to communicate, start by listening to them. Blogs seem to become popular communication tools for some managers, trying to communicate with their employees. But if you want your blog to be read you either need to get involved in the discussion in the comments, or you must write something that really is interesting for the readers. For both you must listen to the other side of the communication channel.
  • If you want motivated employees, don’t give them the feeling that they are walking problems which you will replace with a piece of software as soon as possible. You would never do that? Great. Unfortunately I have seen exactly that just too often: processes requiring 200-page documents to be written, which get reviewed, corrected, approved and ignored; organizational structures where a bright idea has less chance of growing than a snowflake in hell.
  • If you want your customers to be loyal, you’ll have to give them a reason. A great blog is cool. Interesting tweets? Nice! But customers need products and services in order to become customers. And if products and services suck, the customers will notice and take their money somewhere else.
  • Nobody works for free, including the authors at Wikipedia. The impression that the authors of Wikipedia work for free is a widespread illusion. They might not take money for their time. But they take pride in their work. If you are used to considering the employees of your company as little more than slaves, you are heading for trouble with ‘the community’.
  • Great products and great marketing still need a lot of work. Just because you use Twitter for listening to your customers’ complaints, just because you use a blog to tell your customers about the features of your new product, just because the documentation has the form of a wiki, doesn’t mean less work goes into it. Quite the opposite. Office hours don’t exist on Twitter.

If one point or the other matches the situation at your company, what are you gonna do? Forget about web2.0? Hell no. Embrace the change: web2.0 will affect your company, so better be ready for it.

By the way, for the readers wondering why I am addressing them as if they were the managers of their own company: I am convinced that in a company that is ready to embrace web2.0 YOU are just as important as the management.

Organize Tree Structures

Bonsai Tree

Trees are everywhere in software! The file system is a tree structure, menus form a tree structure, the archive widget on the right side of this blog is a tree structure. We use trees all the time. And very often they get completely messed up. I’d say 9 out of 10 project folders are a complete mess. The only people finding anything in these structures are the ones who put it there in the first place. Why is it so difficult to find anything in most of these trees? And how can we make it easier?

Web2.0 and Google fanatics would probably say: forget about the tree structure. Just tag everything and use a search engine. I’d say this is a good approach for really solving the problem. But in many cases this is not an option. So let me present a really easy solution.

But first let’s clarify the problem. Just yesterday I found something like this inside a project folder:

  • meeting minutes
  • status report
  • sales
  • architecture

Question: Why doesn’t this work? Answer: Where do the minutes of an architecture meeting go? We don’t know. The underlying problem is that two criteria are applied for identifying folders: document type (meeting minutes, status report) and job role (sales, architecture).

The simple trick for obtaining an easy-to-manage tree structure is: “Use exactly one criterion for distinguishing the branches on the next level.” In a project folder, you might use criteria like ‘document type’, ‘project phase’, ‘document owner’, ‘date’ or ‘module’, but in every branch use only one criterion for the subfolders. In reality it is actually pretty difficult to find good criteria, but it is worth the effort, because in such a structure it is easy to find the correct spot for a document, it is easy to find a document, and it is easy to decide that a new folder is needed.

Of course the challenge remains to make everybody understand and maintain the structure. Feel free to point them to this blog post.
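
The rule can be expressed as a small sketch: the tree is described by an ordered list of criteria, one per level, and the folder of a document follows mechanically from its attributes. The class FolderStructure is hypothetical, documents are modelled as attribute maps, and as a simplification every branch uses the same criterion on a given level, which the rule itself does not require.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Sketch: each level of the tree is defined by exactly one criterion.
// Simplification: all branches share the same criterion per level.
class FolderStructure {
    private final List<String> criteriaPerLevel;

    FolderStructure(List<String> criteriaPerLevel) {
        this.criteriaPerLevel = new ArrayList<>(criteriaPerLevel);
    }

    // Builds the folder path by applying one criterion per level, in order.
    String folderFor(Map<String, String> document) {
        List<String> segments = new ArrayList<>();
        for (String criterion : criteriaPerLevel) {
            String value = document.get(criterion);
            if (value == null) {
                throw new IllegalArgumentException("Document lacks criterion: " + criterion);
            }
            segments.add(value);
        }
        return String.join("/", segments);
    }
}
```

With the criteria ‘job role’ on the first level and ‘document type’ on the second, the minutes of an architecture meeting have exactly one possible home: architecture/meeting minutes.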

Are You a Software Developer or a Dabbler

Tools

When reading blogs you get the impression that everybody works in high-end environments, using the latest and greatest distributed version control system, writing tons of tests before they even dream about writing actual code. And of course the tests are executed by the continuous integration system after every commit, which happens about 30 times per day and developer. But when I look around in the real world, this is not what I see. Instead, people work on their code the way ancient ‘doctors’ worked on patients: drilling holes in heads in the hope it would reduce the patient’s headache. It probably did. In many cases in a very final way. I urge you: don’t let that happen to your code (or your career). Practice solid software development. To help you with that, I compiled a simple list of things you really, really should do. Those are basic practices. If you don’t even adhere to those, then I have only two possible explanations: You are not involved in software development at all. Go away, this blog isn’t for you. Or … you are a dabbler.

  • Use a Version Control System (VCS). I am not even going to comment on this one.
  • Commit your changes at least once a day. This does not mean you should commit whatever is on your filesystem at 5pm, but you should break your tasks into pieces so small that you finish one or two of them on a normal day.
  • Tag or label everything in the VCS you hand out to somebody outside your team (testers, salespersons and of course customers).
  • Have a complete and working build script. This means you can build everything you hand off to the customer by getting the source code out of the VCS and starting the script. Necessary adjustments are made in a tiny file which contains settings for the local machine. A well commented template for such a file is in the VCS. And NO, hitting the compile button in your IDE is not the same as a build script.
  • You must have the complete environment necessary to run your application under your control. That means if your application needs a database, you have a database available. One per developer that is, or at least one schema per developer. If you need a queue or ten queues, you have those, again one set for every developer. If you have systems that you interface to that can’t be installed once per developer, you have mocks, stubs or similar available. In the year 2009 a single developer database for 5 developers is no longer acceptable.
  • You have an extensive set of automatic unit tests. These tests should cover at the minimum the main execution paths.
  • Let a Continuous Integration system execute your build script, i.e. it compiles your code, executes the automatic tests and builds a setup or jar, or whatever you are deploying.
  • Have a specification of what a piece of software is supposed to do, before you try to write the software. You don’t need a full specification of the complete application upfront. Maybe you have just the first user story for the first feature, or only a test. But you must have something that tells you where to go. And where not to go. Flying blindfolded is not the same as being agile.
  • Adhere to a style guide that describes your naming conventions, indenting and so on. Ideally this gets enforced by the IDE and automatic tests. Fighting about the content of such code conventions is useless. Not having one is dumb. Remember that your tests are code too, so the conventions apply to tests as well.
  • Document how your code is structured. It should at least describe the different layers, and the way two layers may depend on each other, and it should not allow for circular dependencies. These rules as well should be enforced by automatic tests. Even if the language you use doesn’t support packages you should use a similar concept, possibly through naming conventions.
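
The last point, enforcing the documented layer rules by automatic tests, can be sketched with a check for circular dependencies. The class LayerRules and the representation of the rules as a map from a layer to the layers it may depend on are assumptions for this example. Extracting the actual dependencies from the code base is a separate step that is not shown.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Sketch of a check that a documented layer structure has no circular
// dependencies. allowed.get(x) lists the layers that x may depend on.
class LayerRules {
    static boolean hasCycle(Map<String, List<String>> allowed) {
        Set<String> done = new HashSet<>();
        Set<String> path = new HashSet<>();
        for (String layer : allowed.keySet()) {
            if (visit(layer, allowed, path, done)) {
                return true;
            }
        }
        return false;
    }

    // Depth-first search; `path` holds the layers on the current dependency chain.
    private static boolean visit(String layer, Map<String, List<String>> allowed,
                                 Set<String> path, Set<String> done) {
        if (path.contains(layer)) {
            return true;  // we came back to a layer on the current chain
        }
        if (done.contains(layer)) {
            return false; // already explored, no cycle through this layer
        }
        path.add(layer);
        List<String> dependencies =
                allowed.getOrDefault(layer, Collections.<String>emptyList());
        for (String dependency : dependencies) {
            if (visit(dependency, allowed, path, done)) {
                return true;
            }
        }
        path.remove(layer);
        done.add(layer);
        return false;
    }
}
```

A unit test in the build can call hasCycle on the documented rules and fail when it returns true, so a circular rule is caught by the Continuous Integration system rather than in a review.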

These are my points. What did I miss?

Teach

Are you aware that I know a lot about you? You don’t believe me? Here you go: You are smart! You are constantly trying to improve your skills! You are probably better at what you do for a living than most of your colleagues!

Wow, how did I do that? Simple: You are reading my blog. Since my blog isn’t funny or especially well written you are here for the content. Since my blog isn’t that popular (yet) you probably read quite some blogs like this. Everything else follows from that.

Since I know so much about you, let me give you some advice: Use your knowledge and teach others. Write a blog, better yet go to people and teach them stuff. Tell people about interesting things you found on the web. Answer questions in forums. Participate in discussions. Moderate workshops.

Teach your colleagues, teach the kids at the local school and of course your own. Teach the people at a nearby conference.

It is an easy chance to really make a difference without becoming rich or famous first, but possibly later on.

And although the feeling of doing something good is pretty nice, you don’t have to do it for the greater good alone. When you prepare for teaching, you will learn a lot about the topic. Also teaching will introduce you to a lot of people, which in turn may help you in many situations, including in the hunt for the next job or project.

And by the way, somebody telling you that the workshop you held was really nice is almost as awesome as getting told that you are smart :)

Hibernate has Problems, but where is the Alternative?

Scale

In a recent blog post Stephan Schmidt vents his problems with Hibernate and declares “ORMs are a thing of the past”.

I agree to some extent:
- The SQL generated by Hibernate by default is horrible. Huge joins, with hundreds of columns, many unneeded.
- Annotations feel like dirt in your code, and maintaining XML mappings is just painful.
- LazyInitializationExceptions are a pain in the a.. neck.

But does that justify the conclusion that ORMs will go away in the near future? I don’t think so. Of course my perspective is biased, but in the applications I build I typically deal with 200-500 tables. Just typing the basic CRUD statements and wrapping them into usable objects is a pain and a lot of work, which I will gladly hand over to an ORM.

Tracking changes in objects is another task I gladly hand over to an ORM.

Even writing annotations, while far from perfect, is better than most alternatives I know.

Implementing my own caching logic? No thanx. The ORM can do that pretty well.

I think the crucial point lies in Stephan’s last paragraph, where he sketches some ideas for alternatives. ORMs aren’t bad in themselves. They are just difficult to get right. Hibernate did a great job. It is the first one to get widespread usage. And most points mentioned above are weaknesses of Hibernate (or JPA) or possibly Java. So my claim is: ORMs aren’t dead, they aren’t even grown up.

So what properties might a grown up ORM have?

  • Usage of persistence-independent annotations: The problem with annotations is that they don’t belong in the business domain, where your classes live. They are part of the persistence layer and should stay there. But when you look at the JPA annotations you’ll find a lot that is pretty useful in a GUI context as well. For example, the length of a field isn’t only important for the database but for the GUI as well. Same for validations. So I’d think that a couple of years from now we’ll have annotations (or other language features) that let us specify features of properties and references which truly live in the domain world. And the ORMs will just utilize that information for 90% of their needs.
  • References will have more information attached to them, e.g. whether it is just an association or an aggregation, or whether the M in a 1:M relation is ‘big’. And ORMs will use that to optimize their SQL.
  • The management of Sessions will move toward a declarative style, just as transactions did in the last decade.
  • RDBMS vendors will finally realize the power of ORMs, and will provide more efficient protocols for the ORMs to communicate with.
  • ORMs will monitor how they are used for a certain application, and will use that information to improve the SQL used.
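
To illustrate the first point, here is a sketch of what a persistence-independent annotation might look like. @MaxLength is a hypothetical annotation invented for this example; it is not part of JPA or Hibernate. It states a fact about the domain property, and any layer can read it via reflection.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;

// Hypothetical annotation: it describes the domain property, not a column.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface MaxLength {
    int value();
}

// Example domain class, free of persistence-specific annotations.
class Customer {
    @MaxLength(40)
    String name;
}

class DomainMetadata {
    // Both a mapper (column size) and a GUI (input field width) could call this.
    // Returns -1 if the field carries no @MaxLength.
    static int maxLength(Class<?> type, String fieldName) {
        try {
            Field field = type.getDeclaredField(fieldName);
            MaxLength annotation = field.getAnnotation(MaxLength.class);
            return annotation == null ? -1 : annotation.value();
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException("Unknown field: " + fieldName, e);
        }
    }
}
```

The ORM would derive the column size from the same value the GUI uses for the width of the input field, so the information is stated once, in the domain class.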

But as so often, the most important change will be a people change: People will finally let databases drop into the background where they belong, and manage the schema through the ORM. I always wonder “Why?” when I hear people describe how they build a database first and then map classes to the tables. That is the wrong way around: Build a strong domain model first. Let the ORM create a database schema for you from that. And then get someone with strong database knowledge involved to tweak it where necessary. This is when you start seriously gaining something for your struggle with the ORM. Most importantly: you get strong support for database refactorings.

If you want to read more about the whole ORM discussion, you might be interested in this article by Debasish Ghosh.
