Wednesday, July 18, 2012

Comments

Consider the following code:

//Increment data count here
dataCount++;

The developer responsible for this legacy code heard that comments are good. What he was not aware of is DRY (Don't Repeat Yourself) principle. It is obvious what the code does. What is not obvious is why it is incremented here and not at the end of the function. I'd like to know why it is incremented unconditionally. As maintenance developer I'd like to see variable name that clarifies what kind of data it counts. Unfortunately there is no comment to help me.

Sunday, July 15, 2012

Phunction names.

Somewhere in code there was a function named "abtain_list_lock", That's right, with "a". A few years ago a rather large group of developers branched out from main version to work on a another set of special awesome features. The developers working on side branch quickly renamed the function to "obtain_list_lock" and continued working. Over the next three years much work done; lots of new functions that obtain list locks were written. It was time for a merge between branches and resulting version included awesome new features from both branches. It also included both "abtain_list_lock" and "obtain_list_lock" having the same function body. One day this will be fixed, but not today.

Why?
1. Developers do not read function and variable names carefully, IDE allows code navigation without reading function names.
2. Developers on side branch expected that the typo on main branch will be fixed by the next person who uses that function, extrapolating own quality standards on others. It also highlights some bureaucracy in branch ownership and commit policies in larger software corporations.
3. C language allows implicit function declarations. Many users of "abtain" did not include the relevant header file where function is declared. Thus removal of function with typo led to errors while linking, not during compilation. Due to complicated custom build system such errors did not show verbose error messages by default. A simple preprocessor macro in header file wouldn't be enough to make it link.
4. The merge was done in haste. People were under pressure to release product that contains all features from both branches as soon as possible.

Wednesday, June 13, 2012

Software maintenance and selection bias

As maintenance developer you often look at old code. And old code is ugly. If you think that all old code is ugly you are wrong. Good old code simply does not need your attention. Don't let selection bias influence your overall opinion on company code..

Wednesday, March 14, 2012

Rotational matrix as an example of Data Driven Design.

Every self respecting linear algebra teach causes students to hate matrices. If you can't you are not qualified. Only now, after three years without formal study of matrices, I learned to appreciate them.

I made a small computer graphics project at school. I had a separate function for X axis rotation, separate function for Y axis rotation, Z rotation and XYZ scaling were separate function, so was XYZ translation. My teacher told me to use matrices but I simply could not understand why it is better. I mean, matrices are creepy and functions are simple. Rotation around arbitrary axis was a bit cheaty (and buggy); this is fine for a small project though.

Now I understand the beauty of matrices. Now I see what is wrong with transformations stored as list of function objects. 8 years ago I was too young to understand the value of simple matrix that replaces arbitrarily long list of functions with parameters.

My friend recently showed me a piece of code from his hobby project. The code was full of conditional and switch cases. After little thought we realized that confusing branching in code can be replaced with a single matrix. It was the most satisfying change I ever made in code.

Saturday, February 25, 2012

The value of well placed goto.

Some legacy programming languages do not support nor destructors nor exceptions. You must carefully check the returned error codes in case of error you must carefully free all resources. Often this leads to code like this:
error_code =some_func(some_params);
if (error_code!=NOERROR) {
  goto cleanup;
}
error_code =other_func(some_params);
if (error_code!=NOERROR) {
  goto cleanup;
}
error_code =yetanother_func(some_params);
if (error_code!=NOERROR) {
  goto cleanup;
}
//... More code here


There is much code like this. At some point a well-intentioned developer who heard that goto is evil changed the code to look like this:
do {
  error_code =some_func(some_params);
  if (error_code!=NOERROR) {
    break;
  }
  error_code =other_func(some_params);
  if (error_code!=NOERROR) {
    break;
  }
  error_code =yetanother_func(some_params);
  if (error_code!=NOERROR) {
    break;
  }
  //... More code here
} while (0);


In my opinion the do-break-while(0) is much uglier and harder to read than goto.
1. Here break is just a goto in disguise.
2. If you see the beginning but not the end you think there is a loop here.
3. In original version goto is used to control the flow in exceptional cases. Goto does not change the structure of the code under non-exceptional scenario.

What do you think?

Saturday, February 18, 2012

More about globals.

In addition to possibility of an API that silently relies on global variables there are more potential pitfalls.

Suppose that you have a global variable which is modified within ten different if-else blocks. You wrote two tests - in one all ifs are true, in another all ifs are false. Now you believe that you have 100% code coverage.

The real coverage, however is not so good. 10 ifs that manipulate the same globals are semantically similar to a huge switch-case statement with 2^10=1024 possible cases. It will take 1024 tests to give true full coverage.

Tuesday, February 14, 2012

assert(g_everything->isEvil())

I'm in development for 3 months now. I must learn I understood something important. Something that I learned long ago but understood only now. Global vars are evil.

I saw many books about it, but I did not understand. A well placed global variable is very convenient and there is no harm in it. This is what I though when the largest project I experienced was around 2000 lines of code.

If you think what I thought 3 months ago you will hopefully learn from this blog post.

Look at this function signature:
error_code_t loadItemFromDefaultDir(const std::string filename);

As you can see function loads item and returns error code. Where is the item?

if(loadItemFromDefaultDir("it42.item")!=NO_ERR) {
throw load_exception;
}
sub_item_t* mySubItem = g_everything->repository->collections->getSub("it42", MY_SUB_CODE);


You may think there is no problem here, and you are partially right. The problem begins when the item is loaded in one place but the function that uses it is in different module. And the loader does not check the error code. And the item name is not constant, but dynamically created string... You got the idea.

Sometimes you write code like this and it does not work for your argument, but works for any other argument.
convertAndNormilize(&myConvertibleString);

You spend much time looking at the code and finally veteran developer feels pity on you and explains that the string you pass is defined in g_everything->exceptions->conversion_exempt_strings and you must first call
g_everything->exceptions->override_convert_exc_when_norm(myConvertibleString)

Monday, February 6, 2012

C++11 factory

Today I was playing with awesome new features of C++11. In the examples below B is base class and D is derived.
The following code works fine:
std::function<std::shared_ptr<B>()> factory;
factory = [](){return std::make_shared<D>();};

The following code doesn't work:
auto factory = std::make_shared<B>;
factory = std::make_shared<D>;

The first code works because polymorphic wrappers for function objects are smart; the second does not work because function pointers are not smart at all. You can not cast from function that returns derived shared pointer to function that returns base shared pointer, you need function wrappers for it.
I tried to implement parametrized factory method. VC10 does not support initializer so the following did not work yet:
std::map<std::string, std::function<std::shared_ptr<B>()>> factory { 
    std::make_pair("D", [](){return std::make_shared<D>();}),
    std::make_pair("D1", [](){return std::make_shared<D1>();})
};

I'll test this trick with gcc4.6 tomorrow. In VC10 you can initialize the map by hand:
std::map<std::string, std::function<std::shared_ptr<B>()>> factory;
factory.emplace(std::make_pair("D", [](){return std::make_shared<D>();}));
factory.emplace(std::make_pair("D1", [](){return std::make_shared<D1>();}));

Please note the use of emplace method.
You can use factory like this:
auto Bptr = factory["D"]();
//Do something with D object...
Bptr = factory["D1"]();

Unfortunately std::make_unique is not a part of new standard, you can copy the code for make_unique from Sutter's blog. Factories returning unique_ptr are considered better.