Post-mortem of a Java project: My developer perspective

TL;DR If you don’t want to read the whole post, these are the main ideas. While it describes a particular project, the lessons learned are applicable to virtually any project. Spoilers ahead!

  • Working with legacy code is hard and ugly… pain ensues.
  • Use Checkstyle and Findbugs.
  • Netbeans Platform is very powerful but has a very steep learning curve.
  • SVN is bad, move on to any DVCS; we chose git.
  • Simplify the build process and add automatic dependency management; we chose Maven.
  • TDD is good… we used JUnit and Mockito.
  • Javadoc is dangerous; prefer self-explaining code and unit tests.
  • Don’t reinvent the wheel; leverage existing standards and well-tested libraries.

I’ve been involved in a Java project for two years, and now that my involvement has come to an end it is time to look back at the development and see what went well, what went wrong and why.

When I arrived at the project there was already a lot of legacy code, and not many people knew how it really worked. It took me two weeks to be able to configure the development environment and produce a build! The software could only be built with Eclipse after having checked out the code from the SVN repository into a very particular directory tree structure. If there is a hell, it cannot be much different from that!

I had worked with Java a few years before working on this project, and Eclipse back then was horrible. No matter how beefy your machine was, Eclipse would manage to bring it to a crawl. Things have improved very much with the latest versions and I’m sure that cheap RAM also plays a role. The fact is that now I can run Eclipse and consider it a functional piece of software and a decent IDE. But I digress. I was worried that using Eclipse would be a problem, but actually it was one of the really useful tools I had at my disposal.

One of the best things we had was a set of files for Checkstyle and Findbugs. If you are working on a project I think it is a must to have the IDE (or the build system if you can) check and enforce consistent style usage by all the members in the team. Using these tools helped us find and fix many potential issues in the software.

One of the first things I did was to create a script that would check out the codebase from the SVN repository into the “correct” directory tree structure. I don’t mind if I have to fight with the code to develop new things or fix bugs, but at least I want to be able to check out the code and have it in a building state without having to spend hours figuring out why a fresh copy from the repository won’t build.

The next step was to automate the build with Ant. To be precise, I should say the step was to fix the Ant build scripts, which were present but had fallen out of sync with the project’s structure and were not building anymore.

With a system that built, and after reading “Working Effectively with Legacy Code”, I started to do meaningful work on the project.

While the initial design was well thought out, the development did not always match the design. The requirements, features and use cases changed frequently. Many times things had been hacked to work as they were supposed to at a certain moment. To make matters more interesting, there were virtually no unit tests, which made any change to the code a game of Russian roulette. You never knew if it would work, and even if it worked, you had no certainty that you had not broken something else.

I would like to say that I followed all the recommendations of the book “Working Effectively with Legacy Code”, but the task was so daunting that I just focused on two things: understanding how the software worked by talking to the people still involved in the project and asking them questions, and trying to make a release version that would be stable enough for further testing and serve as a proof-of-concept validation.

While I was working on that quest (believe me, it was nothing short of an epic endeavour!) another team worked on a new GUI. The software was in dire need of a better looking and user friendly GUI so it was really great that some people wanted to work on it. Nevertheless, we had three problems with this.

  • They were a rather big team and there were not many of us to give them support. If we did not respond fast enough they would move on and make decisions that were not always correct due to their limited knowledge of the system as a whole.
  • They decided to build the new GUI using Netbeans and the Netbeans Platform.
  • Given that the software was a mess (no separation between business logic and GUI) they created an API that matched their new GUI’s needs and stubbed it for testing and demoing. The API was thoroughly documented with Javadoc.

That meant that they came up with a working GUI in a relatively short period of time that showed dummy information but looked really good. Unfortunately, the API they had come up with was not a perfect match for the software’s needs in some cases and completely wrong in others. To add insult to injury, there was no easy way to plug the new GUI into the old code. I was the only one who had even a partial view of how the old code worked, and the GUI guys didn’t want to touch the old code. I was left alone with the old codebase and the new GUI, with the task of integrating them. Awesome!

If you have never used Netbeans and the Netbeans Platform, let me just say that it is really powerful and you can get things done very quickly… if you know how to do it and are willing/able to bend to its needs. The learning curve is steep. Extremely steep when you get code you don’t understand in a platform you have never worked with before.

At this point we had a “stable” version of the software that some users were able to test. It was far from perfect but it did a basic set of the things it was supposed to do. A decision had to be made in order to continue the development. Should we try to use the new GUI or should we forget about it and evolve/revamp/burn-and-rebuild the one we had? Given the characteristics of the project (educational and mostly made by students) I pushed to continue using the new GUI. Those guys had put lots of effort into the GUI and it would have been very demotivating for them to see their hard work trashed. I hoped that by keeping them involved they would help with the integration and eventually would be interested in seeing their work succeed. To this day I still have mixed feelings about this decision. On one hand we took on a very large technical debt by using the Netbeans Platform, which no one in the team knew; on the other hand it forced us to refactor the codebase to be able to integrate it, as I’ll explain later.

After we decided to go for the new GUI, I started refactoring the old codebase. I took advantage of the opportunity to refactor everything and separate the code into different layers. Initially you could find horror stories like GUI button actions that would do network message serialization and deserialization inside the GUI! After the refactoring everything was clearly separated. Everything that came from the network would be handled in one module, all the business logic was in another and there was an API where GUIs could plug in. As a proof of concept and to help the process of separating the GUI from the business logic I built a GUI that was just a console interface. It was extremely useful: it not only allowed us to test quickly and with minimal overhead that the software was working, but it also allowed complex actions to be scripted.
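The separation described above might look roughly like the following minimal sketch. Every name in it (`DeviceService`, `InMemoryDeviceService`, `ConsoleFrontEnd`, the `register` and `status` commands) is hypothetical and not taken from the project; the only point is that a front end depends on a small interface, and that a text-based front end makes scripted sessions trivial.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical API that any front end (console, Netbeans GUI, ...) plugs into.
interface DeviceService {
    void register(String name);
    String status(String name);
}

// Hypothetical business-logic implementation; it knows nothing about any GUI.
class InMemoryDeviceService implements DeviceService {
    private final Map<String, String> devices = new LinkedHashMap<>();

    @Override
    public void register(String name) {
        devices.put(name, "online");
    }

    @Override
    public String status(String name) {
        return devices.getOrDefault(name, "unknown");
    }
}

// Console front end: one command per line in, one response line out.
class ConsoleFrontEnd {
    private final DeviceService service;

    ConsoleFrontEnd(DeviceService service) {
        this.service = service;
    }

    // Executes a single command line and returns the text to print.
    String execute(String commandLine) {
        String[] parts = commandLine.trim().split("\\s+");
        if (parts.length == 2 && parts[0].equals("register")) {
            service.register(parts[1]);
            return "ok";
        }
        if (parts.length == 2 && parts[0].equals("status")) {
            return service.status(parts[1]);
        }
        return "error: unknown command";
    }
}
```

A real console would wrap `execute` in a loop that reads lines from standard input; a scripted session is then just a text file of commands fed to that loop.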

Once an API for the GUI was available it was time to go back to the Netbeans-based GUI and plug it in. Well, it took ages to get that working. Probably because we didn’t do everything “The-Netbeans-way” from the beginning, but eventually after many hours of frustration we managed to get it to do something. Getting it to a state in which we could say that it worked took a lot longer. One of the big problems we had was that the embedded Javadoc documentation slipped out of sync with the code. The method signatures were normally corrected, but all the gratuitous literature that accompanied them was more often than not completely wrong. Many things did not make any sense, and when new people came to the project and wanted to use it they would get completely confused!

Some of the problems were caused by my lack of knowledge of the Netbeans Platform and the cryptic language it uses. It is very difficult to find information online if you don’t know all the Netbeans lingo. Eventually we managed to tame the beast and have something that was doing things and not crashing constantly.

When users began to install the new version of the software we faced another wave of problems. People had high expectations for this new software in terms of new functionality, while most of the effort had been put into refactoring the codebase and plugging in the new GUI. Things that worked before were not working now, and things that people didn’t know had been broken before were fixed in the new version.

In preparation for an open source release a new code repository was created and a Jenkins continuous integration server was deployed. For the code repository we decided to use git and we migrated the old SVN repository to git.

A final development push followed. The goal was to fix the problems that had been identified in that version and add a couple of new features, but the resources allocated were insufficient to cover them all, so it was agreed that the work would be undertaken on a best-effort basis.

As I started to tackle the bugs I realised that many of them originated in the object type conversions that I had introduced during the refactoring. Let me give you some background. To ensure that classes did not cross layer boundaries, I created different types of objects representing roughly the same information in the network layer, the business logic and the GUI. Originally the network objects (XML beans) had been used everywhere in the code, all the way up to the GUI. When I plugged in the Netbeans GUI it brought a new set of API objects that had been defined by the GUI team. Given that none of them were a good match for the model, and since it was not feasible to refactor all those pieces at the same time, I decided to introduce a type of object that would come as close as possible to the data model in the business logic, with converters that would translate between the object types at the boundaries. While it added a lot of overhead, it proved to be a very useful approach to decouple all the parts. Suddenly all those conversion-related bugs were concentrated in just a few classes that I could unit test and mock. For mocking I used the fantastic Mockito, plus PowerMock for the rare cases where I needed to mock static methods. Instead of just adding unit tests to cover the problematic cases that had been reported, I wrote tests to fully cover the converters. Doing this surfaced and fixed some extra bugs that had not been reported.
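To make the converter idea concrete, here is a minimal sketch of one boundary. The classes `SensorXmlBean`, `SensorReading` and `SensorConverter` are invented for illustration and do not come from the project; the real network objects were generated XML beans, for which the plain class below is only a stand-in.

```java
import java.util.Objects;

// Hypothetical network-layer object, standing in for a generated XML bean.
class SensorXmlBean {
    String id;          // may be null if the element was missing
    String valueText;   // numeric value transported as text
}

// Hypothetical business-logic object: immutable and always valid.
final class SensorReading {
    private final String id;
    private final double value;

    SensorReading(String id, double value) {
        this.id = Objects.requireNonNull(id, "id");
        this.value = value;
    }

    String getId() { return id; }
    double getValue() { return value; }
}

// All translation between the two layers lives here, so conversion bugs
// are confined to one small class that is easy to cover with unit tests.
final class SensorConverter {
    private SensorConverter() { }

    static SensorReading fromNetwork(SensorXmlBean bean) {
        if (bean == null || bean.id == null || bean.valueText == null) {
            throw new IllegalArgumentException("incomplete network object");
        }
        try {
            return new SensorReading(bean.id, Double.parseDouble(bean.valueText.trim()));
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("not a number: " + bean.valueText, e);
        }
    }

    static SensorXmlBean toNetwork(SensorReading reading) {
        SensorXmlBean bean = new SensorXmlBean();
        bean.id = reading.getId();
        bean.valueText = Double.toString(reading.getValue());
        return bean;
    }
}
```

The business logic only ever sees `SensorReading`, so malformed network data is rejected at the boundary instead of surfacing as a bug deep inside the GUI.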

The new features were developed following the TDD approach: write the test (red), write the code (green), refactor (green). It made a world of difference knowing that you had a safety net from the start. During the development we had added unit tests to the codebase, but when you add the tests later it always feels as if you are missing something.

Additionally, I realised that I don’t need comments in the code if I have tests. They provide examples of how to use a method and what the expected output is, and best of all, they always stay in sync with the code!
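As a small illustration of tests doubling as documentation, consider the sketch below. In the project these were JUnit tests; to keep the example self-contained it uses plain Java checks instead, and `DurationFormatter` is a made-up class, not project code. The test method names state the behaviour that a comment would otherwise describe.

```java
import java.util.Locale;

// Hypothetical class under test.
final class DurationFormatter {
    private DurationFormatter() { }

    static String format(int seconds) {
        if (seconds < 0) {
            throw new IllegalArgumentException("negative duration: " + seconds);
        }
        return String.format(Locale.ROOT, "%dm %02ds", seconds / 60, seconds % 60);
    }
}

// Each method name documents one behaviour; with JUnit these would carry @Test.
final class DurationFormatterTest {
    private DurationFormatterTest() { }

    static void formatsMinutesAndZeroPaddedSeconds() {
        check("1m 05s".equals(DurationFormatter.format(65)));
    }

    static void formatsZeroAsZeroMinutesZeroSeconds() {
        check("0m 00s".equals(DurationFormatter.format(0)));
    }

    static void rejectsNegativeDurations() {
        boolean rejected = false;
        try {
            DurationFormatter.format(-1);
        } catch (IllegalArgumentException expected) {
            rejected = true;
        }
        check(rejected);
    }

    static void runAll() {
        formatsMinutesAndZeroPaddedSeconds();
        formatsZeroAsZeroMinutesZeroSeconds();
        rejectsNegativeDurations();
    }

    private static void check(boolean condition) {
        if (!condition) {
            throw new AssertionError("test failed");
        }
    }
}
```

If the format ever changes, these tests fail and have to be updated, whereas a comment describing the old format would silently go stale.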

This new stage of the development was done using git as the version control system, and I cannot understand how there can be people still working with SVN! I have had my share of horror stories with branching and merging in SVN. I don’t want to be developing in trunk and be afraid of committing something that breaks the build! Working with others in SVN was certainly an improvement over the days of CVS and even some of the early versions of SourceSafe, but after having used a DVCS I’m not going back. Creating a new branch for each new feature, merging them to master when needed, having the full repository history on your machine… those are things that are too important not to take advantage of.

Still, there were two issues that were nagging me. The first one was that the network layer was overly complex and not really working in the way it should. For a more detailed description of the problems you can read how I ended up migrating it to XMPP. The second one was that the build process was complex. We had the old codebase that was developed in Eclipse, the new GUI was Netbeans, and everything was built using Ant. It kind of worked. You would get a final build in one go, but the moment you had to change a dependency it was hell on earth. You had to find the binaries, replace the binaries in the repository (yes, we had all those third-party jars in the repository), update your Ant build scripts in a non-negligible number of places, … I decided to migrate the build process to Maven for a number of reasons. First, it allows a very easy definition of dependencies and can automatically infer the correct build order of the modules in your project. Dependencies are downloaded from the Internet when needed, and changing a dependency version requires a single change in a single file. No more need to have the binaries in the repository!

As a side benefit it forced a restructuring of the code tree. Maven favors convention over configuration. While in Ant you can place your code more or less in any way you want, in Maven there is a convention that makes sense and allows you to forget about many of the build details as long as you place the files in the expected place (e.g. source files in src/main/java/… ; test files in src/test/java/… ; resources in src/main/resources/…). While it was messy to move everything around, it certainly paid off. The build process is executed with a single command, dependencies are handled automatically by the build system and anyone who knows how Maven structures a project knows immediately where to look for the files they want. It flattens the learning curve of the project structure for new developers.

It’s very difficult to explain the frustration one feels when a project that was meant to be open sourced is still kept behind a walled garden. All the work that has been done on the refactoring, the integration of the new GUI, and the migration to git and Maven is not always visible or understood, but if one day the project is finally open sourced, new developers will not have to suffer as much as I did when I began. It is by no means a work of art and I’m certain that it can be very much improved, but given the circumstances I’m happy to leave the project in a much better shape than it was in when I started to work on it.
