Thursday, April 15, 2010

Is this a software bug defect?

Suppose you find a piece of C code like this in something you're supposed to maintain:
11    char dst[MAXDLEN];
..    ...etc...
42    /* String in "buf" mustn't exceed size of "dst"; if it does, ignore it. */
43    if (strlen(buf) < MAXDLEN) { 
44        char *p = strchr(buf, '_');        /* strip any suffix matching '_.*' */
45        if (p != NULL) {
46            strncpy(dst, buf, p-buf);      /* copy up to suffix */
47            dst[p-buf] = '\0';             /* because strncpy may not NUL-terminate */
48        } else {
49            strcpy(dst, buf);              /* copy the whole of "buf" */
50        }
51    } else {
52        dst[0] = '\0';                     /* "buf" has nothing legal. */
53    } 
..    ...etc...
Here's my question: Is the call to strncpy in line 46 a bug or defect? Or is it OK? A few possibilities:
  1. What do you mean, bug? There's no incorrect behavior here! We copy however many bytes come before the underscore character, then line 47 adds the terminating NUL. The code is just fine.
  2. Well, it's not a software defect in terms of behavior, but the code is pretty ugly. The point of line 46 is to copy a known number of bytes, so it should be coded memcpy(dst, buf, p-buf); but I wouldn't go so far as to say it's a bug, really.
  3. There may not be a defect in the computer's behavior, but there's certainly a defect in the programmer's writing. This code is a violation of Strunk & White's rule #13: Omit needless words. What the programmer wants to say is:
    Copy p-buf bytes from buf to dst.
    which may be coded as memcpy(dst, buf, p-buf); but instead they have coded strncpy(dst, buf, p-buf); which being interpreted means:
    Copy up to p-buf bytes from buf to dst, but if a NUL byte is found, stop copying from buf and instead write additional NUL bytes to dst until a total of p-buf bytes are written to dst.
    I've color-coded the needless words with this pretty background color to show that over thirty (30) needless words have been dumped onto the otherwise concise sentence "Copy p-buf bytes from buf to dst." -- more than 4 needless words for every 1 in the original sentence! This indicates that something's surely wrong in how the programmer writes code..
It's probably not too hard to tell where my sympathies lie, but if you take position #1 above, let me ask you whether this is likewise perfectly fine:
    for (i=0; i<5; ) {
             ... code that doesn't read or write i ...
            if (some_condition) {
                break;
            }
             ... more code that doesn't read or write i ...
    }
The "for (i=0; i<5; ) {" should rather be coded as "for (;;) {" or maybe "while (1) {", but coding it as it is -- well, that's just wrong. What was the programmer thinking?

If you think that this "for (i=0; i<5; ) {" is a perfectly OK piece of code, I would love to hear why you think so, because a loop like this baffles me, as does strncpy where memcpy would do.

1 comment:

-ln said...

Nice write-up.

I'm wondering if there is any performance consideration. I'd expect memcpy to be faster than strncpy.

I like the "Omit needless words" principle, and I would connect it with "readability".

The above examples are difficult to read, and can be misleading. Needless words may not be seen as needless, and readers would ponder what is the hidden meaning behind the use of strncpy rather than memcpy, or whether the i variable should be used anywhere. At best it is a waste of time; in the worst case, a maintainer would misunderstand the original intent and introduce or reveal a latent bug.