More on JUnit Theories

In my last blog post, I described how to use JUnit Theories to create a large number of test runs with very little work, like so:

import static org.junit.Assume.assumeTrue;

import org.junit.experimental.theories.DataPoint;
import org.junit.experimental.theories.Theories;
import org.junit.experimental.theories.Theory;
import org.junit.runner.RunWith;

@RunWith(Theories.class)
public class TheorieTest {

    @DataPoint
    public static String a = "a";

    @DataPoint
    public static String b = "bb";

    @DataPoint
    public static String c = "ccc";

    @Theory
    public void stringTest(String x, String y) {
        // combinations where x is too short are skipped, not failed
        assumeTrue(x.length() > 1);

        System.out.println(x + " " + y);
    }
}

The trick is simply to provide data points for every parameter type of the test method. The JUnit Theories runner will call the test method with every possible combination of data points. If you think a little about it, you will soon realize some of the limitations of this approach:

  • You’ll soon end up with lots of data point fields cluttering your code.
  • Parameters of the same type will receive the same set of data points, even when the usable range of inputs is completely different.

Fortunately the developers of JUnit provided really nice solutions to these problems.

Instead of specifying single data points, you can provide a whole array of data points using the @DataPoints annotation, like so (imports omitted):

@RunWith(Theories.class)
public class TheorieTest {

    @DataPoints
    public static String[] a = { "a", "bb", "ccc" };

    @DataPoints
    public static Integer[] j = { 1, 2, 3 };

    @Theory
    public void someTest(String x, Integer y) {
        assumeTrue(x.length() > 1);

        System.out.println(x + " " + y);
    }
}

This is of course much less verbose. Instead of an array field you may provide a method returning an array, or at least it looks like this should be possible. When I tried it, though, JUnit seemed unable to handle the types correctly and threw IllegalArgumentExceptions. I guess I’ll have to file a bug once I have finished this article …

But we still need to take care of parameters which have the same type, but a very different meaning and therefore different useful values. The clean OO way of doing things would be to get rid of generic types like String and use stronger types like CreditCardNumber or Name instead. But then in a perfect world we wouldn’t need tests, because our programs wouldn’t contain any bugs to begin with. So let’s try this instead (again, imports omitted):

@Retention(RetentionPolicy.RUNTIME)
@ParametersSuppliedBy(CreditCardSupplier.class)
public @interface AllCreditCards {}

// ------------------------------------------------------------

@Retention(RetentionPolicy.RUNTIME)
@ParametersSuppliedBy(NameSupplier.class)
public @interface AllNames {}

// ------------------------------------------------------------

public class CreditCardSupplier extends ParameterSupplier {

    @Override
    public List<PotentialAssignment> getValueSources(
            ParameterSignature signature) {

        ArrayList<PotentialAssignment> result = new ArrayList<PotentialAssignment>();

        result.add(PotentialAssignment.forValue("Amex", "Amex"));
        result.add(PotentialAssignment.forValue("Master", "Master"));
        result.add(PotentialAssignment.forValue("Visa", "Visa"));

        return result;
    }
}

// ------------------------------------------------------------

public class NameSupplier extends ParameterSupplier {

    @Override
    public List<PotentialAssignment> getValueSources(
            ParameterSignature signature) {

        AllNames annotation = signature.getAnnotation(AllNames.class);
        System.out.println("just wanted to show that I can access it "
                + annotation);

        ArrayList<PotentialAssignment> result = new ArrayList<PotentialAssignment>();

        result.add(PotentialAssignment.forValue("Alf", "Alf"));
        result.add(PotentialAssignment.forValue("Willie", "Willie"));
        result.add(PotentialAssignment.forValue("Tanner", "Tanner"));
        result.add(PotentialAssignment.forValue("Cat", "Cat"));

        return result;
    }
}

// ------------------------------------------------------------

@RunWith(Theories.class)
public class SuppliedByTest {

    @Theory
    public void imagineThisIsATest(@AllCreditCards String x, @AllNames String y) {
        System.out.println("consider " + x + " / " + y + " tested.");
    }

    @Theory
    public void testIntegers(@TestedOn(ints = { 2, 3, 4, 7, 13, 23, 42 }) int i) {
        System.out.println(i);
    }
}

Wow, that’s a lot of code. Just look at the last piece, the test class, and see what appears in the console when we run it:

just wanted to show that I can access it @de.schauderhaft.junit.theories.AllNames()
consider Amex / Alf tested.
consider Amex / Willie tested.
consider Amex / Tanner tested.
consider Amex / Cat tested.
just wanted to show that I can access it @de.schauderhaft.junit.theories.AllNames()
consider Master / Alf tested.
consider Master / Willie tested.
consider Master / Tanner tested.
consider Master / Cat tested.
just wanted to show that I can access it @de.schauderhaft.junit.theories.AllNames()
consider Visa / Alf tested.
consider Visa / Willie tested.
consider Visa / Tanner tested.
consider Visa / Cat tested.
2
3
4
7
13
23
42

Have a look at the rows beginning with “consider”. Obviously the theory imagineThisIsATest gets fed with the values from the CreditCardSupplier and the NameSupplier. The parameters and the suppliers are connected by the two annotations @AllNames and @AllCreditCards. So whenever you have a parameter to a theory where the type alone is not sufficient for identifying the kind of values that should be used, you can simply create an annotation, which itself is annotated with a reference to a ParameterSupplier class, and you are all set. You might think this is a lot of code for supplying a handful of parameters. You are right, but remember that you can reuse your suppliers wherever you need names or credit card values in your tests.

Now let’s look at the first line of the output:
just wanted to show that I can access it @de.schauderhaft.junit.theories.AllNames()
It simply shows off that the supplier gets access to the annotation (through the ParameterSignature that is passed to getValueSources). This can be very useful when you want your supplier to behave differently for different theories. Have a look at the NameSupplier above to see how this works.

JUnit actually comes with an example where this is used, and I demonstrated it with the other theory in the demonstration code above. The @TestedOn annotation takes an array of values to be used as data points for the annotated parameter.

That’s it for today. I hope the power of theories became obvious, as well as the power you have as a developer to extend that mechanism. Again, be warned: all this nice stuff is in a package named experimental for good reason. If you use it, you might find bugs, and things will likely change, at least in name, in an upcoming version. Talking about versions, I am using JUnit 4.8.1 for the examples.

For next week the conclusion of the little series about JUnit theories is planned, with a few thoughts on use and danger of this kind of testing.

New Feature of JUnit: Theories

A couple of months ago I blogged about JUnit Rules, one of the new features in JUnit. While fooling around with JUnit Rules, I found a couple more features that you might be interested in. So here it comes: Theories! It turns out Theories are really a piece of cake. Try this:

import static org.junit.Assume.assumeTrue;

import org.junit.experimental.theories.DataPoint;
import org.junit.experimental.theories.Theories;
import org.junit.experimental.theories.Theory;
import org.junit.runner.RunWith;

@RunWith(Theories.class)
public class TheorieTest {

    @DataPoint
    public static String a = "a";

    @DataPoint
    public static String b = "bb";

    @DataPoint
    public static String c = "ccc";

    @Theory
    public void stringTest(String x, String y) {
        assumeTrue(x.length() > 1);

        System.out.println(x + " " + y);
    }
}

When you run this, you’ll get the following output (and one successful test):

bb a
bb bb
bb ccc
ccc a
ccc bb
ccc ccc

So what is going on? The first thing to note is that this test is executed by a specialized runner of type Theories. This runner executes all the public methods annotated with @Theory. Unlike normal tests, theories have parameters. In order to fill these parameters with values, the Theories runner uses all the public fields of matching type that are annotated with @DataPoint. When you take a look at the output, it should be obvious why this is more powerful than parameterized tests: every combination of values is tried, so with e.g. 4 parameters and 4 distinct values each you end up with 4^4 = 256 test runs.

The idea is to specify a theory about the object under test that holds for a large class of states and parameters. Then you provide those as parameters to the test method, which will test the theory.

A probably very common case is that a theory is known not to be valid for certain cases. You can exclude these from a test using the Assume class. If an assumption doesn’t hold, that combination of parameters is silently ignored. This is used in the example above to prevent ‘a’ from being used as the first parameter.
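To make the combination and assumption mechanics concrete, here is a minimal plain-Java sketch that does not use JUnit at all. The class TheoryCombinations and its run method are made up for illustration; they only mimic what the Theories runner does conceptually for the stringTest theory above, they are not how the runner is implemented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the Theories runner, for illustration only.
class TheoryCombinations {

    // Pairs every data point with every data point and skips the pairs
    // that violate the assumption x.length() > 1.
    static List<String> run(List<String> dataPoints) {
        List<String> executed = new ArrayList<>();
        for (String x : dataPoints) {
            for (String y : dataPoints) {
                if (x.length() <= 1) {
                    continue; // assumption failed: skip silently, do not fail
                }
                executed.add(x + " " + y);
            }
        }
        return executed;
    }

    public static void main(String[] args) {
        for (String line : run(Arrays.asList("a", "bb", "ccc"))) {
            System.out.println(line);
        }
    }
}
```

Three data points for two parameters give nine combinations; the three that start with "a" are dropped by the assumption, which leaves the six lines shown in the output above.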

There are some more tweaks to Theories in JUnit, which I will cover in a later post. Until then, enjoy the new feature. But keep in mind the package in which theories reside up to now: org.junit.experimental.theories. So I’d expect some changes in the API and at least a change in package name.

Posted in: Softwaredevelopment by Jens Schauder

Mixins, Inheritance and Delegation

Blender

A week ago I started learning Scala. Among the features I found pretty interesting are mixins and traits. That was just the point in time when I read this little tweet by GeekyL:

“i am still not sure if mixins are super cool or dark magic.”

Of course I was instantly reminded of the time when the dinosaurs dominated the world and I was learning the first little bits of OO and C++. I thought inheritance and polymorphism were great and the solution to every possible programming problem there is. It turned out that was not the case. Actually inheritance can result in pretty ugly code.

So GeekyL’s tweet got me thinking: Do mixins have the same problem? What exactly are the problems with inheritance anyway? Time for a new blog post.

What are the problems of inheritance?

Inheritance for code reuse: It is tempting to have a class inherit from a superclass just because the superclass has some useful feature, like this:

import java.beans.PropertyChangeSupport;

public class PropertySupport {
    // initialized here, so subclasses can fire events without a NullPointerException
    protected final PropertyChangeSupport propSup = new PropertyChangeSupport(this);
}

public class UserEntity extends PropertySupport {

    private String firstName;

    public void setFirstName(String aFirstName) {
        String oldValue = firstName;
        firstName = aFirstName;
        propSup.firePropertyChange("firstName", oldValue, firstName);
    }
}

The problem with such code is that inheritance suggests an is-a relationship, which in this case is just wrong. A UserEntity is not a PropertySupport, it just uses one.

Multiple inheritance: At least in Java, a class can only inherit from one superclass. But many classes are many things. For example, a Cat is a Carnivore, it is a FourLeggedAnimal and a FurryAnimal. Without multiple inheritance there is just no way to model this with class inheritance. With multiple inheritance you can extend all these classes at the same time, but now you inherit the top class Animal three times, which is ugly at the very least.

So the proper way to do this kind of stuff in Java is to use delegation and interfaces:

public class Cat implements Furry, FourLegged {
    FurryDelegate fd;
    FourLeggedDelegate fld;

    @Override
    public void pet() {
        fd.pet();
    }

    @Override
    public void runFast() {
        fld.runFast();
    }
}

This works in the sense that you can express pretty much everything you might be tempted to solve with class inheritance. Of course the drawback is that you have to write all these trivial delegate methods.
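For completeness, here is a self-contained sketch of the delegation approach. The post does not show the interfaces or the delegate classes, so their shape here is an assumption; the methods return Strings instead of printing so that the behaviour is easy to check, and the class is called DelegatingCat to keep it apart from the Cat above.

```java
// Assumed interfaces: the original listing only implies that they exist.
interface Furry {
    String pet();
}

interface FourLegged {
    String runFast();
}

// Assumed delegate implementations holding the actual behaviour.
class FurryDelegate implements Furry {
    @Override
    public String pet() {
        return "purrrrr";
    }
}

class FourLeggedDelegate implements FourLegged {
    @Override
    public String runFast() {
        return "I'm gone";
    }
}

// The cat is-a Furry and is-a FourLegged, but only uses the delegates.
class DelegatingCat implements Furry, FourLegged {
    private final Furry furry;
    private final FourLegged fourLegged;

    DelegatingCat(Furry furry, FourLegged fourLegged) {
        this.furry = furry;
        this.fourLegged = fourLegged;
    }

    @Override
    public String pet() {
        return furry.pet(); // trivial forwarding method
    }

    @Override
    public String runFast() {
        return fourLegged.runFast(); // trivial forwarding method
    }
}
```

Passing the delegates in through the constructor keeps the cat independent of any concrete delegate class, at the price of one forwarding method per interface method.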

In Scala you have the option to use traits, like so:

class Animal

class Cat extends Animal with Furry with FourLegs {

}

trait Furry {
  def pet(): Unit = {
    println("purrrrr")
  }
}

trait FourLegs {
  def runFast(): Unit = {
    println("I'm gone")
  }
}

Obviously this contains much less code duplication than the Java version. Does it have drawbacks? You bet.

Mixins are firmly tied to the traits they use. Imagine a more complex trait, which itself uses many other classes and resources. Once you mix in such a trait, you have a strong dependency on all of them. Therefore I suggest the following pattern for getting both the reduction of code duplication from traits and the flexibility of delegates: use traits which don’t have a real implementation, but use a delegate for doing all the real work. This way you can switch the implementation by touching just a single place:

class Superclass

class Mixin extends Superclass with Trait {
    // … more stuff goes here
}

trait Trait {
  // this should be some kind of DI or lookup in a real system
  val delegate = new TraitImpl

  def aMethod(): Unit = {
    delegate.aMethod()
  }
}

// TraitImpl must not mix in Trait itself: creating it would then create
// another delegate, and so on, until the stack overflows.
class TraitImpl {
  def aMethod(): Unit = {
    println("you just called me")
  }
}

Summary: Traits can be used to solve some of the problems of inheritance, but they might introduce strong coupling, which can easily be avoided.

8 Reasons why the Estimates are too low

Hourglass

One of the most difficult tasks in a software development project is estimating the size of the project. Unfortunately you very often have to do it at the very beginning of a project, when you have the least information. The result is very often a large difference between the original estimate and the actual time and money needed.

If the difference is positive as often as it is negative, this is kind of OK. But in some teams estimates are too low almost all the time! The obvious strategy often employed is to add a certain percentage to the estimate. But of course that is just treating the symptoms, because most of the time nobody knows the reason. So in this article I am going to show a couple of possible reasons for bad estimates, how to identify them, and a possible fix for each.

These are the kinds of reasons I identified so far:

Super Hero Estimates: Often estimates are done by a very experienced developer (the Super Hero). When they estimate a task, they imagine themselves doing it. But while the estimate might be correct for the Super Hero, in the project more average developers work on the task. And since they aren’t Super Heroes, they need longer.

You can identify this kind of situation by letting different people do the estimates, including average developers. If the estimate of the Super Hero is constantly below that of the others, the fix becomes obvious: use the estimate of the other developers. Don’t exclude the Super Hero from estimating though. Experienced developers are good at identifying things others tend to forget.

Wrong Team: This one is similar to the Super Hero Estimate, in that the estimate is correct for a team of good (possibly excellent) developers. It differs though, because in this scenario this is the actual team that is supposed to do the project. But for one reason or another the project gets staffed with a different team. Maybe developers that aren’t that good. Or maybe just a team that doesn’t know the domain, the framework or the programming language that well.

In order to identify this situation, talk to the people doing the estimates. Let them specify the assumptions they had in mind when they made the estimate, and compare them to what actually happened in the project.

If this is actually the problem you have two choices: The first is: stop switching teams. This doesn’t work most of the time, since you don’t switch them just for fun, right? So you are stuck with the second option: Even when you plan to use your top team, make the estimates on the assumption that an average team will do the job. Be honest with yourself about what an average team is.

Guessing in the dark: Often the information available about the project at hand is just not sufficient for a reliable estimate. There is only a vague description available.

In order to identify this as the underlying problem, break the project down into tasks and estimate each of them. For each task, write down the assumptions you are making: what technology are you using, how complex is the logic to implement? During the project, check whether your assumptions hold or whether there are a lot of changes. This is also most of the solution to the problem: make the description of your assumptions part of your estimate. If the assumptions are wrong, the customer may correct them, resulting in a new, more reliable estimate. If he changes his mind later on, you have a clean basis for a change management process, which ensures that changes are possible, but that the one requesting the change also pays for it.

Forgetting stuff: This one looks very similar to the previous one. Stuff doesn’t get included in the estimate. But this time it’s not because the people doing the estimate don’t know about it, but because they forget about it.

Using the same strategy as for Guessing in the dark should make this kind of mistake obvious. As a remedy, let multiple people do estimates and compare the results. You should do this based on the broken-down tasks, so that tasks missing from one of the estimates become obvious.

Ball of Mud Projects: When projects based on a specific code base take too long each and every time, a reason might be that tasks keep turning out more complex than anticipated. A typical reason for that would be an overly complex and convoluted code base, where nobody can really predict the effects a given code change may have.

In such a project I’d expect a lot of 80% syndrome, few tests and many swearing developers. Do a code analysis of the code base, concentrating on architecture and dependency management. If you find a lot of crap, there is just one solution: clean it up and pay your technical debt, i.e. invest time and money to improve the architecture. I also recommend implementing tests that prevent the architecture from starting to rot again.

Multitasking: People are bad at multitasking. So you can’t put your top developer at 20% on five different projects and expect her to work just as efficiently as normal. Once you start thinking about it, it should be really obvious whether you have the problem or not. And the same should be true for the solution.

Mythical Man Month: Conceptually very similar is the Mythical Man Month effect. Check if your estimates get processed like this: “Frank says he would need 6 months. We need to get done within 1 month, so let’s put 6 developers on the project.” If this is the case, you are missing all the overhead that a bigger team causes. In order to get a handle on that kind of problem, read the book The Mythical Man-Month. I do think that book is overrated, but the chapter about the actual mythical man month is right to the point.

This problem is almost guaranteed to hit you when you do a project that is an order of magnitude larger than your normal projects. You will underestimate by a large amount if you do not factor in how quickly communication overhead grows with the size of the team.
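One common way to put a number on that overhead is to count the pairs of people who may need to talk to each other, which is n * (n - 1) / 2 for a team of n. The following sketch only illustrates that formula; the class and method names are made up, and real overhead obviously depends on how a team is organized. Under this simple model the growth is quadratic rather than strictly exponential, which is still far worse than linear.

```java
// Illustration of pairwise communication channels in a team.
class CommunicationPaths {

    // Number of distinct pairs among teamSize people: n * (n - 1) / 2.
    static long channels(int teamSize) {
        return (long) teamSize * (teamSize - 1) / 2;
    }

    public static void main(String[] args) {
        for (int n : new int[] { 1, 6, 10, 60 }) {
            System.out.println(n + " developers: " + channels(n) + " channels");
        }
    }
}
```

Frank alone has no channels to maintain, six developers already have 15, and a team ten times that size has 1770.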

Lazy Developer: Of course there is always the possibility that the developers just aren’t working hard enough. In my experience this isn’t a problem most of the time. But I guess it does happen. Symptoms could be lots of open browser tabs on sites that have no relation to the work at hand, accompanied by hectic window/tab switching when the boss drops by. If this is a problem with one developer, it might be that the developer has the power to change it. But if many developers spend their time procrastinating, it is most probably a management problem. Procrastination is not fun. Getting stuff done is. So if everybody is procrastinating, something in the environment is probably highly demotivating. Find out what it is and get it out of the way.

New Project’s Resolution

Fireworks

In a couple of days I’ll start working on a new project. Actually it is a project that I worked on one or two years ago. I think I did a decent job last time. But there is always lots of room for improvement. So today I want to list a couple of things I want to do better than last time.

More Tests – When working on that project I was the only person with real experience in automated testing. I am proud that we had a test suite and that this test suite still exists, is put to good use and actually grew during the last months. Yet I consider the coverage of the tests insufficient by far. So for all code I write, I will add an extensive set of tests. I’ll also add tests for existing code when refactoring.

Earlier Tests – One reason we didn’t produce as many tests as I would wish today is that we didn’t do TDD. It probably was the correct decision last time, since it would have been just too much. But this time I will practice TDD as well as I can.

Cleaner Code – One of the first tests I wrote back then was a test to ensure that we don’t have circular dependencies. We succeeded in this, but parts of the system are still pretty tangled. We are going to improve that. Apart from the knowledge I took from Clean Code, I plan to use the toxicity metric as an important tool for that.
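The post does not say how the circular dependency test was implemented, so the following is only a sketch of the underlying idea: a depth-first search over a hand-written package graph. The class name and the map-based representation are assumptions; a real test would extract the graph from the code base with a dependency analysis tool.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical cycle check on a graph "package -> packages it depends on".
class DependencyCycleCheck {

    // Returns true if the directed dependency graph contains a cycle.
    static boolean hasCycle(Map<String, List<String>> dependencies) {
        Set<String> done = new HashSet<>();
        Set<String> inProgress = new HashSet<>();
        for (String node : dependencies.keySet()) {
            if (visit(node, dependencies, done, inProgress)) {
                return true;
            }
        }
        return false;
    }

    private static boolean visit(String node, Map<String, List<String>> deps,
            Set<String> done, Set<String> inProgress) {
        if (done.contains(node)) {
            return false; // already fully explored, no cycle through here
        }
        if (!inProgress.add(node)) {
            return true; // node is already on the current path: cycle found
        }
        for (String next : deps.getOrDefault(node, Collections.<String>emptyList())) {
            if (visit(next, deps, done, inProgress)) {
                return true;
            }
        }
        inProgress.remove(node);
        done.add(node);
        return false;
    }
}
```

A unit test would then simply assert that hasCycle returns false for the graph of the current build.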

Dependency Injection – We considered using Spring and decided against it, because we already had about 20 libraries we hardly knew in the project. Again at that time it was a reasonable decision. But in the meantime I learned what we could have gained by using Spring. I’ll propose using it, but even when the team declines again, I’ll be able to use the concept in order to improve the code base.
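Using the concept without the framework can be as simple as handing a collaborator in through the constructor instead of creating it inside the class. The names in this sketch (Clock, InvoiceService) are purely hypothetical and not taken from the project.

```java
// Hypothetical dependency: something the service needs but should not create itself.
interface Clock {
    long now();
}

// The service receives its dependency from the outside (constructor injection).
class InvoiceService {
    private final Clock clock;

    InvoiceService(Clock clock) {
        this.clock = clock;
    }

    // An invoice is overdue once the current time has passed its due time.
    boolean isOverdue(long dueAt) {
        return clock.now() > dueAt;
    }
}
```

In production code the caller would pass something like System::currentTimeMillis, while a test can pass a fixed value and get deterministic results, which is most of the benefit even without a container.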

So these are the things I want to improve on in the next project. What are the things you want to improve on in your next project?

Is ISO 9001 obsolete?

A rusty car

I just finished ‘Here Comes Everybody’, a must-read for anybody trying to understand what is going on with all this social media stuff. One point Clay Shirky makes is that the various Web 2.0 tools make failing cheap.

If in the 80s you had an idea for a group to form, you had a lot of things to do, many of which cost money: Finding a room to meet, printing fliers or ads to promote your idea and so on. If your idea failed, all that money was wasted.

Compare that to the situation today. Creating a webpage and promoting it using Facebook, Twitter, Xing or one of the many other tools is free. All you have to invest is your own time. If it fails, nothing is lost. Actually something that fails today might get picked up tomorrow by somebody else and brought to success. Like a Wikipedia article which is a stub at first, but then evolves into a well-written article through many, mostly small, changes and improvements. If one of the changes is bad, again the cost of this is minimal. Somebody will notice it and revert the change. Cost: a bad article online for a couple of minutes and 5 minutes of work. Compare that to a typo in a printed encyclopedia. Fixing a typo would cost thousands of euros. So it won’t get fixed until the next revision, which might be years away.

So where is the relationship to the title of this post, namely ISO 9001? One of the cornerstones of modern quality management is to prevent errors from happening. This is a good thing under the assumption that fixing errors is way more expensive than preventing them. For building a car, this is probably still true: “Oh the brakes didn’t work? Probably because we didn’t install any in the first place. Let me fix that right away. What do you mean, you don’t want them anymore? Oh you are dead … I see.” But for building software, for creating documentation for the software, for gathering requirements of the software, this assumption is wrong in many cases.

I am trying to write clean code, tested code, working code all the time and I expect my coworkers to do the same. But that doesn’t prevent me from checking less than perfect code into the version control system. Because as soon as it is available for others to see and to use, they might find bugs, or even fix some. They might provide feedback for improvement or further development of the piece. I would not accept a rule that disallows less than perfect code (or documentation) to be seen by others.

So, is ISO 9001 obsolete? I’d say NO for the following reasons:

  • While it is OK to have broken stuff around, it is not OK to leave it that way. An implementation of ISO 9001 may help with making this clear for everybody.
  • The agile / social media way of doing things does lead some people to the impression that it is OK to write crappy software. Although it is hard to understand how they come to that conclusion when reading the agile manifesto or any of the well-known literature on agile software development, it still happens. Fixed rules inside a company about what must be done before releasing software or any development artifact to a customer can counter that.
  • While ISO 9001 is built on the idea of preventing errors, it actually isn’t hard-coded in a form that requires anything like a Waterfall approach.
  • Even in a software development company there are processes that need attention and that don’t need to be, and maybe shouldn’t be, as flexible as a wiki page (consider the process of paying the salaries with correct taxes and all).

But there are a couple of things that pop up around ISO 9001: Lengthy specifications and lengthy review processes of that specification. Many companies like to create those, and many people seem to think they are a requirement of the ISO 9001. This is not true. The norm requires that you know what you need, before you build it. A specification of the complete system certainly fits that requirement. But since you don’t create a complete new CRM in an afternoon, you don’t need the complete specification.

If you agree with a customer on a couple of user stories to be implemented in the next two weeks, and write those down on a whiteboard or in a web-based application or on some sheets of paper, any auditor of ISO 9001 conformance will have a problem criticizing that. If you use a whiteboard, you might want to take a photo of it. And if you write it down as a FitNesse test, the auditor will probably be impressed.

So no, ISO 9001 is not obsolete. What is obsolete are ways of implementing the ISO 9001 that are damaging the reputation of that norm.

Posted in: Quality Management by Jens Schauder

Developing for Maintainability

Just like Supportability, Maintainability is one of those non-functional requirements that are, or should be, demanded of every software development project. So what does that mean? Wikipedia defines it as

the ease with which a software product can be modified in order to:

  • correct defects
  • meet new requirements
  • make future maintenance easier, or
  • cope with a changed environment;

Wow, that’s great. Because it lends itself to an ‘easy’ test for maintainability. Take the completed product, make up some new requirements and measure how long it takes to implement those. If you do this with different products for the same task, you can compare the time needed, thus comparing maintainability. Unfortunately, most of the time the requirement is stated before any software is written. And only one piece of software gets written, so there is nothing to compare to. So what to do?

Since one can’t realistically measure maintainability we once more should concentrate on characteristics making software easier to maintain, i.e. easier to change. There are plenty of measures around that are thought to be linked to maintainability. Cyclomatic Complexity being possibly the best known. I personally prefer toxicity, which is based on Cyclomatic Complexity and a couple of other measures, which contribute to the toxicity with the amount they are above a certain threshold:

Metric                             Level        Threshold
File Length                        file         500
Class Fan-Out Complexity           class        30
Class Data Abstraction Coupling    class        10
Anon Inner Length                  inner class  35
Method Length                      method       30
Parameter Number                   method       6
Cyclomatic Complexity              method       10
Nested If Depth                    statement    3
Nested Try Depth                   statement    2
Boolean Expression Complexity      statement    3
Missing Switch Default             statement    1

Compared to a single simple measure, this has the benefit of being less prone to optimization for the measurement, although of course this is still possible.
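The post does not spell out the exact scoring formula, so the following sketch rests on one assumed reading: a measured value that exceeds its threshold contributes value / threshold to the toxicity, and a value at or below the threshold contributes nothing. The class and method names are made up for illustration.

```java
// Sketch of a toxicity-style score under the assumption stated above.
class ToxicitySketch {

    // Assumed rule: above the threshold the contribution is value / threshold, otherwise 0.
    static double contribution(int value, int threshold) {
        return value > threshold ? (double) value / threshold : 0.0;
    }

    public static void main(String[] args) {
        // Thresholds taken from the table: Method Length 30, Cyclomatic Complexity 10.
        double methodLength = contribution(45, 30);
        double complexity = contribution(8, 10);
        System.out.println("toxicity = " + (methodLength + complexity));
    }
}
```

With these inputs only the overlong method counts: 45 lines against a threshold of 30 contributes 1.5, while a complexity of 8 stays below its threshold and contributes 0.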

So does a low, even zero toxicity guarantee maintainable code? Hell no. For starters there are a couple of different things to consider:

  • Is the code well covered with tests?
  • Are the current requirements specified in a suitable manner (e.g. using tests)?
  • Is the code written in a consistent style?
  • Is all the code including the documentation under control of a version control system?
  • Is the structure of the application, its architecture defined and documented?
  • Is there an automatic process for building new versions of the software (i.e. ant, maven, make scripts or similar)?
  • Is the code written in a language that is well known and understood and which has a large user community?

Everything but the last bullet point is pretty much identical with what I consider base practices for any serious software developer.

So I propose: The next time you encounter the vague requirement of maintainability, replace it by useful and well testable requirements, based on specific practices and metrics. It still won’t guarantee maintainability. But it will increase the chance for it.

Developing for Supportability

Support beams and wires of  a bridge

When reading the specification of a piece of software to be written, you are bound to find some non functional requirements. Among these there will be, or at least should be Supportability. But what the heck does that mean? How do you install supportability? Let me present some ideas, what you can do to improve supportability.

Let your application log in a well-defined, reliable way to a location that is easily accessible. Flat files on a server qualify. If you must log on a client, consider implementing a way to automatically transfer the log file to a support person. If you log into a database, make sure the support person can access it easily.

Make your application easy to shut down and start. This sounds trivial, but it is easy to break this ability. Consider the following checklist:

  • What happens when you start two different versions of your application against the same database? A nicely supportable application should notice this and react accordingly.
  • What happens when you stop your application while it is processing a request?
  • If you have batches or batchlike processes, what will happen with those when you try to stop your application while the batch runs? Do you have the 5hours, until the batch finishes? Or will the batch stop and rollback, so you have to wait an hour for the rollback to happen? Or will it stop nicely within a minute, and pick up its work automatically after restart?
  • Are you stuffing stuff into a database or a queue? What happens when the queue or the database gets started after the application?
  • Do you receive messages from a queue? What happens when the first message arrives before your application is available? What happens when you receive a message that you already processed? What happens when you receive a message for which you already processed a later message?
  • How long does it take to shut down and restart an instance of your application when it is under full load? If this takes more than a few minutes, is it possible to stop and restart only parts of your application?
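The questions about duplicate and out-of-order messages in the list above can be made concrete with a small sketch. It assumes every message carries an entity id and a monotonically increasing sequence number; `Message`, `IdempotentReceiver` and their fields are hypothetical names for this illustration, and simply skipping stale messages is only one possible policy.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical message type: entityId and sequence are assumptions of this sketch.
final class Message {
    final String entityId;
    final long sequence;
    final String payload;

    Message(String entityId, long sequence, String payload) {
        this.entityId = entityId;
        this.sequence = sequence;
        this.payload = payload;
    }
}

// Skips duplicates and messages older than the last one processed for the same entity.
final class IdempotentReceiver {
    // entity id -> highest sequence number processed so far
    private final Map<String, Long> lastProcessed = new HashMap<>();

    // Returns true if the message was processed, false if it was skipped.
    boolean receive(Message m) {
        Long last = lastProcessed.get(m.entityId);
        if (last != null && m.sequence <= last) {
            return false; // duplicate, or a later message was already processed
        }
        lastProcessed.put(m.entityId, m.sequence);
        // ... the real processing of m.payload would happen here ...
        return true;
    }
}
```

In a real application the map of processed sequence numbers would have to survive a restart, otherwise the very first message after a restart is accepted blindly.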

Make the state of your application visible to support personnel. Most applications just report arbitrary errors when a component of the system is down. It’s up to the supporters to guess if it is the database, a queue, a web service or the application itself which is causing the problem. Identify the resources your application depends on. Write a check which tests all these resources, and make this check available, for example as a special health check web page.
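A minimal sketch of such a check might look like the following. `HealthCheck` and its methods are hypothetical names; each probe is assumed to be a cheap call such as a trivial query or a ping, and exposing the report as a web page is left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BooleanSupplier;

// Collects one probe per resource and reports the status of each of them.
final class HealthCheck {
    private final Map<String, BooleanSupplier> probes = new LinkedHashMap<>();

    void register(String resource, BooleanSupplier probe) {
        probes.put(resource, probe);
    }

    // Runs all probes; a probe that throws is reported as DOWN together with its message.
    Map<String, String> run() {
        Map<String, String> report = new LinkedHashMap<>();
        for (Map.Entry<String, BooleanSupplier> e : probes.entrySet()) {
            String status;
            try {
                status = e.getValue().getAsBoolean() ? "UP" : "DOWN";
            } catch (RuntimeException ex) {
                status = "DOWN (" + ex.getMessage() + ")";
            }
            report.put(e.getKey(), status);
        }
        return report;
    }
}
```

The point of the design is that one failing probe never hides the state of the others, so the support person sees at a glance which resource is the culprit.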

Put components that are likely to fail behind some kind of buffer. Your database might be so important for the application that this doesn’t work for it. But if you are posting stuff to a queue (or webservice or …), consider using a local queue as a buffer, so your application can work as usual even when the target queue isn’t available.
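The buffering idea could be sketched roughly like this. `BufferedSender` is a hypothetical name, the target is modelled as a function that returns false when it is unavailable, and the buffer lives in memory only; a real implementation would probably want a persistent local queue so that buffered messages survive a crash.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Predicate;

// Puts a local buffer in front of a target that may be temporarily unavailable.
final class BufferedSender {
    private final Deque<String> buffer = new ArrayDeque<>();
    private final Predicate<String> target; // returns false if delivery failed

    BufferedSender(Predicate<String> target) {
        this.target = target;
    }

    // Always succeeds from the caller's point of view: buffer first, then try to deliver.
    void send(String message) {
        buffer.addLast(message);
        flush();
    }

    // Delivers buffered messages in order and stops at the first failure.
    void flush() {
        while (!buffer.isEmpty()) {
            if (!target.test(buffer.peekFirst())) {
                return; // target is down, keep the rest for later
            }
            buffer.removeFirst();
        }
    }

    int pending() {
        return buffer.size();
    }
}
```

Calling `flush()` periodically, for example from a timer, would drain the buffer once the target is back, and `pending()` is a natural candidate for the health check page.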

Last but not least: Document your application. The Agile Manifesto says that working software is more important than documentation. It doesn’t say you don’t need documentation. And I’d say the documentation for servicing your application might be the most important one. The normal user who uses your application every day will figure out a way to get along. If not, he will call you or your boss. But the poor support person has to support dozens of applications, and since your application just works he’ll encounter it only a few times a year. He will know nothing about it except the stuff documented in the manual. So make sure there are instructions on how to interpret the logs, how to shut down and restart the application, how to analyze the internal state of the application and what happens when some connected component fails.

Have you noticed something? The vague non-functional requirement ‘supportability’ turned into a nice set of very functional requirements. You can attach a price to that, decide what pieces of it you’ll really need and measure if it really works. And I claim this works with all the much hated non-functional requirements.

Social Media and Agile

Flip up Shades

For about a decade now everybody in IT has been talking about Agile, but hardly anybody else does. There is a somewhat similar concept of ‘Lean’ in other industries, but hardly anybody in a ‘real’ industry considers things like ‘release early’ a viable strategy. And that’s a good thing. I don’t want a second iteration car. “Ooops sorry, the story about braking is still in the backlog”. I am confident that if those kinds of companies learned agile from the books, they’d need quite some time until they realized the fine difference between ‘ship’ and ‘shippable’.

But maybe these companies won’t learn it from the books, but from their tools. As said before, companies will learn about social media, and to some extent it will invade daily life and work at these companies. But if you start doing your documentation in a wiki for everybody to see, or spurt out your idea on an internal Twitter account, then you are presenting stuff as potentially shippable. Not the car you will eventually build. But maybe the concept for a new feature.

Everybody will be able to judge the results, comment and provide feedback. And without realizing it, big old companies will become just a little bit agile.

Posted in: Agil by Jens Schauder

Versioned Data

Time Flies

In about one out of two projects the customer comes up with the requirement of ‘historization’ of data. And more often than not this leads to an unholy back and forth of discussions, prototypes and complaining. The reason for this, as far as I can tell, is: This is not a well-defined requirement. It can mean many things, and depending on what is meant a different implementation is due. So here is my list of possible implementation approaches along with the circumstances in which they might be useful. Note: Back in the times when this blog was mostly German I wrote an article on the options of versioning or ‘historization’. The article you are reading now is basically a translated rerun of that article.

Audit Trail

The audit trail is the simplest of all approaches. Whenever an object (or row in the database) changes, a copy of that object (row) gets stored in a separate table, together with a time stamp. This is so easy, it can be implemented with database tools alone, namely using triggers. With this approach it is fairly easy to identify the state of a given row at a given time. But it gets complicated to identify the same for a whole object graph, since the objects that are part of that object graph change at independent times. So you can’t just write a join. Also, if you are not so much interested in the state of a row at a certain time, but in the changes that happen, the next approach might be more useful to you.
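To illustrate the lookup of “the state of a given row at a given time”, here is a small in-memory sketch. It is not the trigger-based implementation mentioned above; `AuditTrail`, `record` and `stateAt` are hypothetical names, and time stamps are plain `long` values for brevity.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// In-memory stand-in for an audit table: one time-stamped copy per change of a row.
final class AuditTrail<T> {
    // row id -> (time of change -> copy of the row as written at that time)
    private final Map<Long, TreeMap<Long, T>> history = new HashMap<>();

    // Called whenever a row changes; 'copy' must not be modified afterwards.
    void record(long rowId, long timestamp, T copy) {
        history.computeIfAbsent(rowId, id -> new TreeMap<>()).put(timestamp, copy);
    }

    // Returns the state of the row at the given time, or null if it did not exist yet.
    T stateAt(long rowId, long timestamp) {
        TreeMap<Long, T> versions = history.get(rowId);
        if (versions == null) {
            return null;
        }
        Map.Entry<Long, T> entry = versions.floorEntry(timestamp);
        return entry == null ? null : entry.getValue();
    }
}
```

The lookup works per row only, which mirrors the limitation described above: reconstructing a whole object graph means repeating it for every row involved.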

Change Log

The Change Log is a variation of the audit trail. In both, a record is produced on every change in a database row. But a change log keeps track of the change, not so much of the state the row was in. So it would contain, at least for the changed attributes, the old and the new value. This way it gets easy to find out what kind of changes happen how often, a question often asked in data mining settings. Obviously you can and should fine-tune your implementation between audit trail and change log to your exact needs.
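A change log entry of this kind could be derived as in the following sketch. It assumes a row is available as a map from attribute name to value before and after the change; `Change` and `ChangeLog.diff` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.TreeSet;

// One changed attribute with its old and its new value.
final class Change {
    final String attribute;
    final Object oldValue;
    final Object newValue;

    Change(String attribute, Object oldValue, Object newValue) {
        this.attribute = attribute;
        this.oldValue = oldValue;
        this.newValue = newValue;
    }
}

final class ChangeLog {
    // Compares two states of a row and returns only the attributes that differ.
    static List<Change> diff(Map<String, Object> before, Map<String, Object> after) {
        List<Change> changes = new ArrayList<>();
        TreeSet<String> attributes = new TreeSet<>(before.keySet());
        attributes.addAll(after.keySet());
        for (String attribute : attributes) {
            Object oldValue = before.get(attribute);
            Object newValue = after.get(attribute);
            if (!Objects.equals(oldValue, newValue)) {
                changes.add(new Change(attribute, oldValue, newValue));
            }
        }
        return changes;
    }
}
```

Storing these entries together with a time stamp and the row id gives the change log; unchanged attributes never show up in it.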

Snapshot

In the Snapshot approach you take a snapshot of a complete object graph when an element of that graph changes. Note that now we are operating on objects and no longer on tables and rows. So there is no easy way to do this kind of thing in the database. Since you are creating copies of object graphs, you can navigate and query these graphs just as you can navigate and query the original graph. On the other hand the copies contain objects that didn’t change at all, so it is easy to end up with huge amounts of data fast. Since these tables contain large amounts of duplicated data, compression might be really effective. Since you are creating the complete object graph for your history objects, nothing forces you to use the same model as with the live data. So your history schema might look very different.
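As a rough sketch of the snapshot idea, assume a tiny object graph of an order with its lines. `Order`, `OrderLine` and `OrderHistory` are hypothetical names, and the snapshot uses the same model as the live data purely to keep the example short.

```java
import java.util.ArrayList;
import java.util.List;

final class OrderLine {
    String product;
    int quantity;

    OrderLine(String product, int quantity) {
        this.product = product;
        this.quantity = quantity;
    }

    OrderLine copy() {
        return new OrderLine(product, quantity);
    }
}

final class Order {
    final List<OrderLine> lines = new ArrayList<>();

    // Deep copy of the whole graph, including the lines that did not change.
    Order snapshot() {
        Order copy = new Order();
        for (OrderLine line : lines) {
            copy.lines.add(line.copy());
        }
        return copy;
    }
}

// Keeps one complete copy of the graph per change.
final class OrderHistory {
    private final List<Order> snapshots = new ArrayList<>();

    // Call this before modifying any element of the order.
    void beforeChange(Order order) {
        snapshots.add(order.snapshot());
    }

    Order get(int index) {
        return snapshots.get(index);
    }

    int size() {
        return snapshots.size();
    }
}
```

Each snapshot can be navigated exactly like the live order, and the copied but unchanged lines show where the data volume comes from.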

All approaches so far are concerned with the past state of objects. But sometimes this isn’t enough. Sometimes what starts as ‘historization’ ends up being full blown …

Versioned Objects

This is the most powerful approach: You model the versions i.e. the changing state of your objects as part of your domain model. Like so:

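import java.util.*;

public class Thing {
    private final Set<ThingVersion> versions = new HashSet<>();
    // Returns the latest version that is already valid at the timestamp, or null if there is none.
    public ThingVersion getVersion(Date timestamp) {
        ThingVersion match = null;
        for (ThingVersion v : versions) {
            if (!v.getValidFrom().after(timestamp)
                    && (match == null || v.getValidFrom().after(match.getValidFrom()))) match = v;
        }
        return match;
    }
    // …
}
public class ThingVersion {
    private Date validFrom;
    public Date getValidFrom() { return validFrom; }
    // …
}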

Depending on the context you might use references to the ‘Thing’, to a specific version, or to the version that is valid at a specific point in time. Let me give you an example. When dealing with trains, a lot of rules have to be considered. Certain combinations of goods must not travel in wagons right next to each other. Wagons with certain goods have to have certain features. And so on. These rules change. And of course you need proof that the train you sent on a track one year ago obeyed all the rules valid at that point in time. So far this sounds like a case for one of the first three options. But when these rules change, you will get advance notice of that change. So you might have a train that you plan for the end of the month, which has to obey the rules valid at that time, although it might violate rules which are in effect right now. In order to model this, you need rules that exist in versions. The train references the rule and uses its time of departure to request the appropriate rule version, the one valid on that day.
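The train example might be sketched as follows. The concrete rule (a maximum number of wagons) is invented purely for illustration, as are the names `Rule`, `Train`, `addVersion`, `maxWagonsOn` and `obeys`.

```java
import java.time.LocalDate;
import java.util.Map;
import java.util.TreeMap;

// A rule that exists in versions; the limit on wagons is a made-up example rule.
final class Rule {
    // validFrom -> maximum number of wagons allowed from that day on
    private final TreeMap<LocalDate, Integer> versions = new TreeMap<>();

    void addVersion(LocalDate validFrom, int maxWagons) {
        versions.put(validFrom, maxWagons);
    }

    // Returns the limit of the rule version that is valid on the given day.
    int maxWagonsOn(LocalDate day) {
        Map.Entry<LocalDate, Integer> version = versions.floorEntry(day);
        if (version == null) {
            throw new IllegalStateException("no rule version valid on " + day);
        }
        return version.getValue();
    }
}

final class Train {
    final LocalDate departure;
    final int wagons;

    Train(LocalDate departure, int wagons) {
        this.departure = departure;
        this.wagons = wagons;
    }

    // The train asks for the rule version valid on its own day of departure.
    boolean obeys(Rule rule) {
        return wagons <= rule.maxWagonsOn(departure);
    }
}
```

A future rule version can be added as soon as the change is announced, so a train planned for after the change is checked against the new version while trains departing earlier are still checked against the old one.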

This will definitely make your model much more complex. So don’t use it if you don’t need to. But if you are dealing with planned changes this has proven to be a useful approach for me.