Why you should never ever copy code - An example of complete chaos 20188367001

Why you should never ever copy code – An example of complete chaos

It seems innocent, a
few lines of code used twice in the same application. But before you know know
it those lines appear 272 times in your application! You guessed it right and
you’re probably wondering how could that happen. I’ll try to explain what
happened and how to prevent it.

When I started on the
Locatus project my colleague Erik warned me to never copy any code. What’s
wrong with twice the same lines I thought at that time, but luckily I followed
his advise and the Locatus application is still very maintainable thanks to
Erik, JUnit and Spring. When you start copying, it never ends. Three times is
still not much and quite maintainable, but what’s the limit? A few months later
I found out that one occurence is the limit, just don’t copy anything.

Right now I’m working
at a customer doing some PL/SQL. The customer bought a large web application
(not from AMIS by the way) and is making changes to it. I was asked to add a
column to a html-table. The table was rendered by ASP and the data was coming
from a refcursor in PL/SQL. That doesn’t sound too difficult I thought. I still
have to learn some PL/SQL so I thought it was possible to do it in one day. When
I found the procedure that returned the refcursor I was a bit surprised. The
PL/SQL procedure contained 15.000 lines of code. How am I ever going to
understand so many lines of code I thought. Well that was quite easy. The
procedure returns a refcursor, so I searched for OPEN [cursorname] FOR SELECT
in the procedure. 200 hits! That search covered about 10.000 lines.

But how and why you
are thinking. The html-table contains about 8 sortable columns, columns can be
sorted ascending and descending (2 possibilities) and you can choose whether
you want paging in the table or just all the data (another two possibilities).
At the top of the table you can select 6 options from a dropdown box, this
dropdown box filters the data.

8 * 2 * 2 * 6 = 192.
That’s about the same as the 200 hits I found. For every possibilty a new query
is written! The only differences between the queries are the sort order and
column, a where clause for every filter and the query wrapped in a paging
query. I still don’t believe what they’ve done. But the guys at the customer
thought it was quite normal and 200 times wasn’t that much, 1000 copies was
much. The only explanations I have is that the guys that built the original
application are paid per line of code or just don’t understand the
possibilities of PL/SQL.

 

But there are still
5000 lines left. Those lines fill some temporary table (not a real temporary
table, just a normal table that is emptied once in a while). The tables are
filled with 14 queries and after that updated with 7 update statements. At the
moment I found the 200 occurences of the refcursor I was wondering why they did
it, right now I don’t even want to know why you fill a fake temporary table a
few times and then build 200 huge queries (with joins!) to get data from that
table.

 

I took me about 7 days
to understand what was happening and now I’m almost finished building it. I
hope I didn’t break anything, that’s always really dangerous with these kind of
applications. Well, I think I have to check again if everything is still working,
I just don’t trust myself enough when I make changes to code I don’t fully
understand (read the some numbers section why it is impossible to understand
the entire application)

 

How to prevent it

The only way you can
prevent this is to be very careful when you copy code around. It’s better to
make a method of the code you want to copy and use that method twice. This way
your application is much better maintainable and it is very easy and cheap to
add a third call to that method. When you’re about to hit the ctrl-c keys and
selected a few lines of code you have to wonder : “Am I starting what I read in
the blog a read a while ago and will other people hate me for doing this”. Then
try to solve your problem without ctrl-c (and don’t print the code and re-type
it 😉 ). When you’re absolutely there is no other way go ahead, but I warned
you!

You also have to read
the book “Refactoring: Improving the Design of Existing
Code” written by Fowler. You only have to read one chapter and you’re already a
better programmer. This book shows all the traps the programmers of the
application I talked about walked right into. Don’t read the book and throw it
away afterwards. Just read a chapter every week and when you’re finished start
over again, it will save you so much time and you can make so many people happy
😉

 

Conclusion

I added the colum to the html-table the same
way the original programmers did it. Firstly because it’s is company policy to
do it the same way ‘the other guys’ did it. This policy is quite ok if you
think about it, otherwise you get 20 solutions for the same problem and it’s
easier to understand only one solution. Of course it is totally wrong and Erik
is probably screaming WHY! Secondly, when I try to improve chaos I probably
break something. Before I can do some proper refactoring I need reliable unit
tests, a reliable application and time to refactor. With these kind of
applications I’m just too scared to make any changes. Most of the time I was
looking for the implications of my changes.

Finally, it just makes no sense. We have to
change very small parts of the application that will probably never ever be
touched again. Check the ‘some numbers’ section why refactoring this
application is very expensive.

 

To summarize this article:

  • Unit testing
  • Book about refactoring
  • Someone with more experience who isn’t afraid
    to say there is a better way to do it

 

Some numbers

  • 22.500 .asp files
  • 10.000 .js files
  • 60 MB of PL/SQL code

 

4 Comments

  1. Ramin Orujov February 17, 2010
  2. Jeroen van Wilgenburg April 29, 2007
  3. p3t0r April 19, 2007
  4. Patrick Wolf April 18, 2007