
Strive for re-use

One of the main differences between good and poor software developers: The former strive for re-use; the latter re-invent the wheel. Not only is re-use usually a time saver, but it also typically reduces the complexity of the code, diminishes the number of bugs, allows for central points of control, makes it likelier that others will understand the code from the get-go, and increases the chance that there will be decent documentation.

Unfortunately, all-too-many developers do not understand when re-use is beneficial—in fact, might not even recognize the opportunity, except in the most obvious cases.


Side-note:

The first version of this text was published in 2009, while, e.g., the anecdote immediately below took place in (likely) 2001. Correspondingly, some claims, e.g. relating to Java or performance characteristics, might be dated at the respective time of reading.

The general principles illustrated, however, are largely independent of time.


To take a specific example: My first task at E3, a long time ago, was to provide a basic queue for use in Java. In a first step, I looked through the Java collections classes to check whether a queue was already available. This was not the case (at the time; later java.util.Queue was added). I then had a look at other collections, wrote a brief wrapper class around one of them (probably just adding and removing elements to/from a LinkedList), and had my queue done with little coding of my own and most of the “leg work” done in the collections framework. I could, in particular, rely on the very thoroughly tested standard collections having eliminated most of the potential bugs that a hand-coded solution would have risked. As a bonus, extensions and modifications, e.g. to allow for data types other than Strings (to which the original requirements were limited), would be easy. I even remember being proud of myself for having so neatly solved the problem.
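
The original code is long gone, but a wrapper of the kind described might have looked roughly like the following sketch (class and method names are my ad-hoc choices for illustration), with LinkedList doing the actual leg work:

    import java.util.LinkedList;

    // Sketch of a String-only queue that delegates the actual work to LinkedList.
    public class StringQueue {
        private final LinkedList<String> elements = new LinkedList<>();

        // Add an element at the tail of the queue.
        public void enqueue(String s) {
            elements.addLast(s);
        }

        // Remove and return the element at the head of the queue;
        // null, if the queue is empty.
        public String dequeue() {
            return elements.isEmpty() ? null : elements.removeFirst();
        }

        public boolean isEmpty() {
            return elements.isEmpty();
        }

        public int size() {
            return elements.size();
        }
    }

(Generics, as used above, were not yet available at the time; the original version would have dealt directly in Strings or Objects.)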

My boss took one look at the code, made a statement along the lines of “That is nice, but the task was to write your own queue. Use an array of Strings as the underlying data structure.”, and sent me back to the drawing board—thereby demonstrating a highly disputable judgement.

Now, it had become clear from context that the task was not just intended to produce code for the application being developed, but also had the unstated purpose of testing my abilities. Such a purpose is perfectly valid; however, had he had his priorities straight, he would have had a very positive answer to the test: I am a developer who strives for re-use and low complexity, and who does not unnecessarily re-invent the wheel. In fact, trying to implement one’s own data types for everything is a beginner’s mistake that I committed and grew out of during my undergraduate days... As is, I suspect that he drew entirely wrong conclusions: Michael was uncertain how to handle the task on his own, and resorted to re-use to cover up this fact.


Side-note:

A secondary point is the benefit of being open about the purpose of a task—had he stated upfront that there was an element of testing involved, and what the test aimed at, my approach would have been different and the result would have better reflected what he wanted to test. Knowing the “why”, be it with software requirements (of which this was a special case) or more generally, often allows us to do a better job, to make the counterpart happier, and to avoid misunderstandings.


Being able to handle complexity is good; avoiding complexity is better. Being able to write functionality from scratch is good; re-using functionality is better. (The respective former abilities might, arguably, be prerequisites for being a software developer; however, that an F1 driver can handle skidding does not imply that he should spend most of a race skidding.)

But what about the speed of the solution? Usually not an issue: Firstly, optimization should be done when it is actually needed; secondly, many ready-made components (including Java’s collections) are already sufficiently optimized that a hand-made solution often needs considerable tweaking before it can compete.

To illustrate the same principle with a similar example: How does one take an existing List and provide a counterpart with unique elements? Go through it checking for duplicates? Use a hashing mechanism? No; create a Set based on the original List, and the rest is automatically taken care of. If a List is required as output, just create a List based on the Set... Something like new ArrayList(new HashSet(originalList)) should do the trick. In the unlikely event that this turns out to be too much overhead, consider using e.g. a hand-crafted loop. Possible exceptions include when very long lists are involved, the code is called very often, the lists are sorted, and/or strict real-time responses are needed—calling this code with a list of ten elements a hundred times per second might not even have a measurable effect, let alone a noticeable one, on a typical application on a modern computer.
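
In current, generics-using Java, a minimal sketch (the example data being, of course, made up) could look as follows; note that HashSet does not preserve the original element order (substitute a LinkedHashSet if that order matters):

    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.HashSet;
    import java.util.List;

    public class UniqueElements {
        public static void main(String[] args) {
            List<String> originalList = Arrays.asList("a", "b", "a", "c", "b");

            // The Set constructor discards the duplicates; the List constructor
            // turns the result back into a List.
            List<String> uniqueList = new ArrayList<>(new HashSet<>(originalList));

            System.out.println(uniqueList);
        }
    }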


Addendum:

At the time of original writing, I used a measurability scale based on milliseconds, which through System.currentTimeMillis() was long the standard for measurements in Java—and even this was often optimistic: Windows NT, e.g., refused to do better than hundredths of a second.

As indirectly pointed out by a reader who disagreed with my statement about measurability, System.nanoTime() is now available, and higher standards for what is measurable should be used. (Beware that, as with NT of old, there is no guarantee that any individual system will make full use of nanosecond resolution and accuracy. A brief test on my own computer gives nothing better than microseconds, which would still be more than enough to measure the above, however.)
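
To illustrate the type of brief test intended (not the exact code used, which I have not preserved): a quick-and-dirty probe of the effective resolution of System.nanoTime() could busy-wait until the reported value changes:

    public class TimerResolution {
        public static void main(String[] args) {
            // Busy-wait until System.nanoTime() reports a new value and print
            // the difference. A single step is only a rough indicator; a more
            // serious test would sample many steps and look at the distribution.
            long start = System.nanoTime();
            long next;
            do {
                next = System.nanoTime();
            } while (next == start);
            System.out.println("Smallest observed step: " + (next - start) + " ns");
        }
    }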

Even so, there is a fair chance that the corresponding extra cost would have no noticeable effect in even a typical desktop application, let alone the ones discussed immediately below in the original text.


Certainly, in the applications that I have typically been involved with, it would have been a triviality: JEE-based server-side applications with file and database access, JDBC overhead, dynamic generation of HTML from JSP, latencies for the Internet connection to the user, ...


Side-note:

While I do not advocate being wasteful with CPU resources, it astounds me how many alleged professionals have no concept of how much work a current CPU gets done per second, and how much more e.g. a typical file access costs—or what difference an algorithm with a higher or lower level of (computational) complexity can make. Notably, the above method of creating a list with unique elements is linear in the size of the list, just like a manual filtering would be—and the actual use of the filtered and unfiltered list is likely to be at least linear in the first place...

(In contrast, if we wanted to find the smallest element of a list by first sorting the list and then picking the first element, the complexity would be N log(N) or higher; whereas a direct search for the smallest element could be done in linear time—here an a priori accusation of wastefulness could be justified. OTOH, even here it would take a rather large list before the effects became noticeable.)
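
For concreteness, a small sketch of the two approaches (with made-up data; Collections.min performs the direct linear search):

    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.List;
    import java.util.Random;

    public class SmallestElement {
        public static void main(String[] args) {
            List<Integer> numbers = new ArrayList<>();
            Random random = new Random(42);
            for (int i = 0; i < 1_000_000; i++) {
                numbers.add(random.nextInt());
            }

            // Direct search: linear time, no modification of the list.
            int smallestDirect = Collections.min(numbers);

            // Sort-then-pick: N log(N), and destructive unless (as here) a copy
            // is made first, which adds yet more work.
            List<Integer> sorted = new ArrayList<>(numbers);
            Collections.sort(sorted);
            int smallestViaSort = sorted.get(0);

            System.out.println(smallestDirect + " == " + smallestViaSort);
        }
    }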