Why you should never ever copy code – An example of complete chaos

4

It seems innocent, a
few lines of code used twice in the same application. But before you know know
it those lines appear 272 times in your application! You guessed it right and
you’re probably wondering how could that happen. I’ll try to explain what
happened and how to prevent it.

When I started on the
Locatus project my colleague Erik warned me to never copy any code. What’s
wrong with twice the same lines I thought at that time, but luckily I followed
his advise and the Locatus application is still very maintainable thanks to
Erik, JUnit and Spring. When you start copying, it never ends. Three times is
still not much and quite maintainable, but what’s the limit? A few months later
I found out that one occurence is the limit, just don’t copy anything.

....

Right now I’m working
at a customer doing some PL/SQL. The customer bought a large web application
(not from AMIS by the way) and is making changes to it. I was asked to add a
column to a html-table. The table was rendered by ASP and the data was coming
from a refcursor in PL/SQL. That doesn’t sound too difficult I thought. I still
have to learn some PL/SQL so I thought it was possible to do it in one day. When
I found the procedure that returned the refcursor I was a bit surprised. The
PL/SQL procedure contained 15.000 lines of code. How am I ever going to
understand so many lines of code I thought. Well that was quite easy. The
procedure returns a refcursor, so I searched for OPEN [cursorname]FOR SELECT
in the procedure. 200 hits! That search covered about 10.000 lines.

But how and why you
are thinking. The html-table contains about 8 sortable columns, columns can be
sorted ascending and descending (2 possibilities) and you can choose whether
you want paging in the table or just all the data (another two possibilities).
At the top of the table you can select 6 options from a dropdown box, this
dropdown box filters the data.

8 * 2 * 2 * 6 = 192.
That’s about the same as the 200 hits I found. For every possibilty a new query
is written! The only differences between the queries are the sort order and
column, a where clause for every filter and the query wrapped in a paging
query. I still don’t believe what they’ve done. But the guys at the customer
thought it was quite normal and 200 times wasn’t that much, 1000 copies was
much. The only explanations I have is that the guys that built the original
application are paid per line of code or just don’t understand the
possibilities of PL/SQL.

 

But there are still
5000 lines left. Those lines fill some temporary table (not a real temporary
table, just a normal table that is emptied once in a while). The tables are
filled with 14 queries and after that updated with 7 update statements. At the
moment I found the 200 occurences of the refcursor I was wondering why they did
it, right now I don’t even want to know why you fill a fake temporary table a
few times and then build 200 huge queries (with joins!) to get data from that
table.

 

I took me about 7 days
to understand what was happening and now I’m almost finished building it. I
hope I didn’t break anything, that’s always really dangerous with these kind of
applications. Well, I think I have to check again if everything is still working,
I just don’t trust myself enough when I make changes to code I don’t fully
understand (read the some numbers section why it is impossible to understand
the entire application)

 

How to prevent it

The only way you can
prevent this is to be very careful when you copy code around. It’s better to
make a method of the code you want to copy and use that method twice. This way
your application is much better maintainable and it is very easy and cheap to
add a third call to that method. When you’re about to hit the ctrl-c keys and
selected a few lines of code you have to wonder : “Am I starting what I read in
the blog a read a while ago and will other people hate me for doing this”. Then
try to solve your problem without ctrl-c (and don’t print the code and re-type
it ;-) ). When you’re absolutely there is no other way go ahead, but I warned
you!

You also have to read
the book “Refactoring: Improving the Design of Existing
Code” written by Fowler. You only have to read one chapter and you’re already a
better programmer. This book shows all the traps the programmers of the
application I talked about walked right into. Don’t read the book and throw it
away afterwards. Just read a chapter every week and when you’re finished start
over again, it will save you so much time and you can make so many people happy ;-)

 

Conclusion

I added the colum to the html-table the same
way the original programmers did it. Firstly because it’s is company policy to
do it the same way ‘the other guys’ did it. This policy is quite ok if you
think about it, otherwise you get 20 solutions for the same problem and it’s
easier to understand only one solution. Of course it is totally wrong and Erik
is probably screaming WHY! Secondly, when I try to improve chaos I probably
break something. Before I can do some proper refactoring I need reliable unit
tests, a reliable application and time to refactor. With these kind of
applications I’m just too scared to make any changes. Most of the time I was
looking for the implications of my changes.

Finally, it just makes no sense. We have to
change very small parts of the application that will probably never ever be
touched again. Check the ‘some numbers’ section why refactoring this
application is very expensive.

 

To summarize this article:

  • Unit testing
  • Book about refactoring
  • Someone with more experience who isn’t afraid
    to say there is a better way to do it

 

Some numbers

  • 22.500 .asp files
  • 10.000 .js files
  • 60 MB of PL/SQL code

 

Share.

About Author

4 Comments

  1. Hi, i work in telecom industry. We have a lot of legacy PL/SQL code. I have a package of 2500 lines of PL/SQL that at least 2-3 developers worked on it before me.
    This package works on production server. Now I have to change it, add some new business logic and also refactor it in order to pass QA tests.
    Without reliable unit testing, a lot of time and work it is very hard to refactor. Because it is related to billing system, changes are very risky.

    I have created some plan for me:

    1.Research of PL/SQL unit testing tools
    2.Read “Oracle PLSQL Best Practices, 2nd edition by Steven Feuerstein
    3.Read some books on refactoring (general concepts)
    4.Research on PL/SQL refactoring
    5.Develop unit tests for each procedure/function.

    I can share these resources:

    1. There are some PL/SQL unit testing tools
    CodeTester
    utPLSQL
    Pluto
    PLUnit
    dbFit

    I liked Quest Code Tester but it is not free (595$ per user), our company is going to buy it.
    For open source solutions try
    a. utPLSQL
    b. Oracle SQL Developer – http://www.oracle.com/technology/obe/11gr2_db_prod/appdev/sqldev/sqldev_unit_test/sqldev_unit_test_otn.htm

    2. http://www.amazon.com/Oracle-PL-SQL-Best-Practices/dp/0596514107
    3. Practical Best PL/SQL Video Series produced by Steven Feuerstein
    http://www.toadworld.com/EXPERTS/StevenFeuersteinsPLSQLObsession/PracticalBestPLSQLVideoSeries/tabid/154/Default.aspx

    Best Regards,
    Ramin Orujov.

  2. Wow… those numbers just don’t make any sense… How does anyone find his/her way in such a huge amount of templates (I guess strongly types Java would be more doable). By the way… wikipedia does actually contain some occasions in which it could be OK to break the DRY (don’t repeat yourself) principle (http://en.wikipedia.org/wiki/Don't_repeat_yourself)