Spot the problem

Just for once, a vaguely work-related post, but without any work.

What is wrong with this (its in C, of course):

switch (stoat)
{
    bool goat = TRUE;
    case weasel:
        goat = FALSE
        /* Fallthrough */
    case ferret:
        mustelid(goat);
        break;
    ...
}

where you should assume that “mustelid” represents a block of code large enough to be worth not repeating. You should find this not very hard to solve, once I’ve presented it in this format.

For bonus points, what does the compiler say? What does your editor do?

[Update: OK, so the answer is that the initialisation of goat is skipped. Its a statement, but it isn't behind a case label, so it is ignored. But the second part of the answer is that it is ignored silently, which I've got used to the compiler not doing: if you do something stupid which has no effect, it will normally warn you. quokka was the first to get both parts of the answer (those who said "there is a semi-colon missing" were correct, but fell for the "hide the hard answer behind the easy answer" test that I accidentally set. I wouldn't bother write a post about a missing semi-colon, and anyway the compiler easily spots that and fails to compile it).

There is a third part, which is that not only is the non-functional statement not warned about, but it also fails to warn that "goat" may be used uninitialised, which is very naughty of it indeed. That may be gcc4 brokenness.

Someone else who investigated this more thoroughly than I wrote:

"GCC 4 (4.1.2, 4.4.4 and 4.6.1) won't warn you about this even with -Wall and -W (and -O which is necessary for some of the analyses to work). For GCC 4.1.2 and 4.4.4 it takes the specific command line option -Wunreachable-code to tell you that the first assignment won't be executed. However, -Wunreachable-code is not normally recommended. The GCC manual says:

This option is not made part of -Wall because in a debugging version of a program there is often substantial code which checks correct functioning of the program and is, hopefully, unreachable because the program does work. Another common use of unreachable code is to provide behavior which is selectable at compile-time.

GCC 4.6.1 doesn't provide -Wunreachable-code but the new option -Wjump-misses-init finds the problem (this is not part of -Wall or -Wextra)."

Isn't C interesting?]

Comments

  1. #1 Steve
    2011/09/06

    Missing ;.

    [Well spotted. You found the trivial error I accidentally inserted in order to make the real error harder to spot :-) -W]

  2. #2 David B. Benson
    2011/09/06

    What’s wrong? Its in C.

    [Ah, now I sooo nearly said "and don't say its because its in C :-)" -W]

  3. #3 Markk
    2011/09/06

    bool isn’t base C99 anyway. It is from a header file stdbool.h
    That header has has true and false defined, not TRUE and FALSE.

    Also If you are using a C90 level compiler you will have to define the type.

    [Fair point, but not the real one. Yes, bool is typedef'd as a uint -W]

  4. #4 HenkL
    2011/09/06

    The line
    bool goat = TRUE;
    should be before the switch{}

    Inside switch{} only ‘case xx:’ and ‘default:’ are allowed.

    [Very nearly correct. The line should be before the switch; but it is allowed inside -W]

  5. #5 rturpin
    2011/09/06

    You shouldn’t mix goats and weasels. And should put a semicolon on the end of most every C statement. Except after a right curly brace.

  6. #6 Greg Laden
    2011/09/06

    You can pass a weasel as a stoat but when you try to ferret out a goat, that’s an otter matter.

  7. #7 Rambling T. Wreck
    2011/09/06

    I’m thinking goat isn’t going to be initialized (but since I don’t even have a compiler installed I’m just going to have to leave it at that).

    [Yes, that is correct -W]

  8. #8 Arthur Smith
    2011/09/06

    Does that compile? Gotta love C’s generosity, I guess. At least in Java something like
    goat = FALSE case ferret: mustelid(goat);
    would get you a syntax error.

    [You spotted the "deliberately" inserted easy error to put you astray. No, it fails with "error: expected ',' or ';' becore 'case'" -W]

  9. #9 Joel
    2011/09/06

    It’s news to me that a ferret is both a goat and a mustelid. Come to think of it, I wasn’t aware of any goat-mustelid combinations.

    But the real problem is probably the variable declaration in a switch statement, before any case statement. That’s would probably cause some weirdness. I’m not sure what would happen though…

  10. #10 quokka
    2011/09/06

    goat is never initialized. gcc doesn’t care and compiles it without warning. asm output confirms goat not initialized.

    g++ reports an error “jump to case label: crosses initialization of ‘int goat’ which about sums things up.

    [Ah, at last the full answer: yes, the initialisation is skipped, but it is skipped silently. I was doing this with gcc 4-something -W]

  11. #11 andrew
    2011/09/07

    What’s right with it?

    The first thing wrong is that you should have put an apostrophe in “its”. The second thing wrong is your claim that it’s in C. It’s certainly not in proper C, which has no data type called “bool”, so I’ll assume you’re using C99. In which case, you must have included stdbool.h, which defines “true” and “false”, but not “TRUE” and “FALSE”, so I assume you have #defines or enums for this. Obviously weasel and ferret need to be const-exprs, so I assume you have #defines or enums for them as well. In which case, why are they in lower case and the boolean values in upper case?

    Also, stick a semicolon in after “FALSE”.

    This version in proper C works for me:

    #include

    #define TRUE 1
    #define FALSE 0
    typedef int bool;

    enum
    {
    weasel = 0,
    ferret = 1
    };

    void mustelid(int goat)
    {
    printf(“stoaty is a knob\n”);
    }

    void nurdge()
    {
    int stoat = 0;

    switch (stoat)
    {
    bool goat = TRUE;
    case weasel:
    goat = FALSE;
    /* Fallthrough */
    case ferret:
    mustelid(goat);
    break;
    }

    }

    int main ()
    {
    nurdge();
    }

    [Tut tut, you've got carried away with the trivial problem and missed the real one -W]

  12. #12 andrewt
    2011/09/07

    The initialization of goat is never executed. The program is nonconforming if stoat == weasel because an uninitialized value is passed to mustelid.

    I wouldn’t be surprised if a compiler doesn’t assist you with a warning because this as an unusual thing to do.

    [Almost correct, except I don't think the code is non-conforming (depending on what you mean by that). I don't think there is any language requirement that variables have to be initialised -W]

  13. #13 Oxford Kevin
    2011/09/07

    I’d guess goat doesn’t get initialized with TRUE

    Kevin

  14. #14 Steve
    2011/09/07

    > (those who said “there is a semi-colon missing” were
    > correct, but fell for the “hide the hard answer behind
    > the easy answer” test that I accidentally set.

    Or did not want to show off :-)

    > I wouldn’t
    > bother write a post about a missing semi-colon, and
    > anyway the compiler easily spots that and fails to
    > compile it)

    I did wonder why we were compiling code for you…

  15. #15 Anonymous
    2011/09/07

    By nonconforming I mean your program doesn’t conform to the C99 standard which says “The behavior is undefined in the following circumstances: … The value of an object with automatic storage duration is used while it is indeterminate”. I constantly remind students a C implementation can do anything it likes given an undefined program, and show them unexpected behaviour from undefined programs produced by simple errors. You’d be unlucky to get bizarre behaviour in this case, but a useful & conformant C implementation could halt with an error message.

    [Ah, fair enough, I hadn't realised that using an undefined variable fell into that category. I agree that you shouldn't do it, obviously. That the compiler (well, gcc4) doesn't warn you is a bit worrying -W]

  16. #16 andrewt
    2011/09/07

    valgrind would detect the error for you (at runtime).

  17. #17 Nick Barnes
    2011/09/07

    If stoat == weasel, [I assert without actually checking that] the call to mustelid with uninitialized goat is “undefined behaviour”, in the language of the C standard. Which means the standard doesn’t rule out any possible consequence, including dumping core or (in an image frequently invoked on comp.std.c back in the 90s) causing warthogs to come flying out of your nose.
    GCC is right to not warn on unreachable code, even with -Wall, because it breaks a lot of legitimate, useful, and commonplace macro trickery. You need to catch this sort of stuff with coding standards, lint-like tools, and/or inspection.

    [I agree its right not to warn about unreachable code, for the reasons it gives. But what about its failure to warn about may-not-be-init? That doesn't seem excusable -W]

  18. #18 David Jones
    2011/09/07

    Nick passed this to me and I spotted it within seconds, of course I have no proof of that now.

    Declaring a variable in a block in a switch is one of those things I would never do or recommend, but wouldn’t have (until now) thought to prohibit it in a coding standard. I have occasionally experimented with “only declare variables at the top-level of a function” (and this would’ve been a violation of that) but it’s too useful to break that rule. So it would get broken a lot.

    To be honest the whole “I’m going to pretend ‘bool’ is a good honest standard type” thing set alarm bells ringing.

    1 bonus mark for remembering to put “Fallthrough” in the comment though.

    I’d always assumed that gcc’s warnings about “failed to initialise” and “unreached code” were mostly because they were too lazy to do the analysis well, although it’s true that such warnings are often annoying. Mostly when compiling other people’s code.

  19. #19 Nick Barnes
    2011/09/07

    One thing to remember about compilers is that they are generally not built for defect discovery. If a compiler generates a warning, it is usually because whatever analysis the compiler has to perform as part of its compilation job suggests a potential defect which is easy to report. So if GCC, for instance, has an uninitialised use warning, then that is probably generated if, and only if, it falls naturally out of (something such as) the dataflow analysis which the compiler already performs for optimising register/slot assignment.

  20. #20 Arthur Smith
    2011/09/07

    Has C ever cared about uninitialized variables? Most compilers I think set any newly allocated variable to zero, though I’ve run into some that just use whatever random data was in that memory location. So if FALSE == 0 then your code definitely would do something unexpected, but yeah, I wouldn’t expect any compile-time or run-time errors for that.

    [I wouldn't expect initialisation by default of automatic variables in C, except if you had some sort of debug setup turned on, and indeed don't get it -W]

  21. #21 Rattus Norvegicus
    2011/09/07

    The program is perfectly conforming as shown here. The compiler is also behaving correctly, although having a warning for this, umm…, case.

  22. #22 HenkL
    2011/09/07

    Compiled the code in MSVC2010. Result:

    error C2360: initialization of ‘goat’ is skipped by ‘case’ label

    Not a warning, an error.
    I know, MS etc etc. But for making Windows programs, it’s not a bad tool.

    [Wash your mouth out :-) -W]

  23. #23 David B. Benson
    2011/09/07

    I’m sure it enhances job security for weasels.

    Or goats as the case maybe.

  24. #24 David Jones
    2011/09/08

    Hmm. You don’t have some C nargs. Me included of course. Some of them are even right.

    #20: “Most compilers I think set any newly allocated variable to zero” This is dangerous crap. Commonly held belief though. Stop reading Schildt.

    #21: “The program is perfectly conforming” Syntactically, yes. But not conforming semantically.

    #22: MSVC is actually a pretty good compiler. Produces better code than GCC whenever I’ve bothered to inspect it.

  25. #25 Anonymous
    2011/09/08

    [Tut tut, you've got carried away with the trivial problem and missed the real one -W]

    Fair comment, although, since you didn’t trouble yourself to tell us what the program was supposed to do, it still seems a reasonable mistake. In my defence, I can only say that I posted shortly after a very heavy lunch, and am surprised to find that I even managed to string a few words together, let alone use the apostrophe correctly.

    I was surprised to see a variable declaration inside a switch – why would you even think of doing that? – but since my compiler let it by, I assumed that the initialisation was kosher. I’m still a little surprised that the compiler would allow you to declare a variable, and use it later, and yet ignore the initialisation part. But the semantics of the C “switch” statement are murky indeed. Hence Duff’s device.

    Re #22, under MSVC if your file extension is c the source will be treated as C, if it’s cpp (the default when creating a new source file) the source will be treated as C++. In the latter case you get the C2360 error.

    Right, I’m off to wash my mouth out as well. MSVC is a good tool, although the 2008 version is a distinct improvement over its successor.

  26. #26 John Mashey
    2011/09/08

    As David says, auto’s have *never* been set automatically to zero. [that would have been overhead completely foreign to C. Initial values of statics were recorded in the executable, not requiring code or execution thereof.

    the C99 spec, S.6.8.5, p.135 has an example of some similarity, but with no animals.
    "switch (expr)
    {
    int i = 4;
    f(i);
    case 0:
    i = 17;
    /* falls through into default code */
    default:
    printf("%d\n", i);
    }
    the object whose identifier is i exists with automatic storage duration (within the block) but is never
    initialized, and thus if the controlling expression has a nonzero value, the call to the printf function will
    access an indeterminate value. Similarly, the call to the function f cannot be reached."

    [Aha, thanks; I was looking for an example of this elsewhere, and didn't find one -W]

    Now, pursuant to David’s #1:
    a) 1973 C did not allow declarators in inner blocks.
    That got changed around 1977, I think, as per “new things in C”:
    “All compound statements (blocks) can now have a declarator list with initialization.”
    and the 1977 C Reference Manual says:
    “Any initializations of auto or register declarations are performed each time the block is entered at the top.”

    The weirdness of all this is that auto declarations have block scope, and the compiler needs to allocate space for them on block entry (or altogether on function entry) … but the *initializations* are not gathered and done at block entry, but where they are located.
    (Of course, this makes simple code generation easier.)

    But of course, it means switch is *different* from other blocks…

    if (something) {int X = 1; ….} should work fine, and only allocates X for the duration of that block, which seems a good thing, not having to promote to the enclosing block.

    likewise while(something) {int X = 1; ….) seems OK.

    SO, for all blocks except switch, a declarator with initializer at beginning block does exactly what one would expect, because if the block gets executed, that code gets executed. Switch doesn’t work that way, which more or less makes declarations somewhat useless inside switch blocks.
    Of course, for this to work intuitively, the compile would need to move the initializer code back ahead of the switch code to make it appear on all paths.

    b) Put another way switch() {} *looks* like a block, but even in simple cases (with no fallthroughs) it is more like multiple blocks.

    [The reason I ended up putting the code there was to limit the scope of the variable, which seemed good to me. It is a bit of a trap -W]

  27. #27 Collin
    2011/09/09

    I thought that a declaration that specifies a value is equivalent to the declaration by itself followed by a statement assigning the value. At least it always worked that way in Borland. Do other compilers do this differently?

    [In this case, it still is. If you do declaration and assignment, the compiler will still recognise and allow the declaration, and still ignore the assignment -W]

  28. #28 John Mashey
    2011/09/09

    re: 26
    [The reason I ended up putting the code there was to limit the scope of the variable, which seemed good to me. It is a bit of a trap -W]

    Yes … the real problem is that switch {} is a fairly peculiar (albeit very useful) control-flow mechanism, but it means the that unlike the rest of C, the {} is NOT a single-entry block of code, in which one can guarantee that initializers have a clear place.

    One cannot even replicate the code at the beginning of each case, given the use of fall-throughs.

    For example, see difference with Pascal:
    Each case, in effect has an implicit (C) break, so that each case has a single entry. I.e., that lets you do:

    if (a == b || a == c …) action 1
    else if (a == d || a == e … action 2
    else if (a == f || …) action 3
    else action 4.

    But it doesn’t let you do:
    switch (a) {
    b:
    c:
    action 1 (fallthrough)
    d:
    e:
    action 2 (more fallthrough)
    f:
    action 3; break
    default:
    action 4

    As usual in C, switch can be a big help in writing terse code … but people have sometimes omitted breaks :-)

    Anyway: again, the problem is that {} blocks are single-entry chunks of code elsewhere in C, but not in a switch.
    See Switch statement, Wikipedia.

  29. #29 GirlCoder
    2011/09/10

    Perhaps my years away from from C and having programmed in C# I assumed even my own assumptions. :D

    Firstly, I found the compiler error of the missing semi-colon.

    However, I did not find any other error since, in the switch statement, there is the assumption that “goat” was already initialized to be true once the “mustelid(goat);” statement was reached.

    Upon second look, the initialization of goat as a bool should actually be an error since it is inside the switch statement but not within the case statements.

    I’ll read the rest of the responses now. :)

  30. #30 David Jones
    2011/09/12

    As John Mashey (more or less) points out, and I, with the wits of a half-stunned bee, realised a day or two ago…

    An initialiser inside a switch block is _always_ completely useless. So the fact that the compiler doesn’t warn, is just more evidence for Nick’s position that the compiler warnings are just more or less benign side effects of whatever it was the compiler was doing anything. They have very little to do with useful diagnostics for producing robust code.

    John forgets that (ordinary) blocks can be entered by a goto, so are not always entered at their top. More goto madness: if you put a label before your declaration at the top of the switch block, then you _will_ be able to execute them if you goto the label. Just don’t.

  31. #31 John Mashey
    2011/09/16

    re: gotos: well of course, but this was already bad enough,
    and somewhere I recall:
    ‘Go To Statement Considered Harmful’

  32. #32 Nick Barnes
    2011/09/18

    Goto has its place (for instance: to avoid over-deep control-flow nesting, or duplication of clean-up code, especially in languages such as C which lack exception handling), but this is not it. :-)

  33. #33 Matt McIrvin
    2011/09/18

    I think the IDE I use with Java (Eclipse) would flag that line as never-executed code right in the editor.

  34. #34 John Mashey
    2011/09/18

    Well, I suppose I should have put a :-) after the comment about harmful.
    Goto’s indeed have their place.

    Of course, this is one of those problems that arise when language definition gets intertangled with assumptions on compiler technology.
    Logically, any declarations and initializers ought to happen before all paths through the switch (ignoring truly crazed gotos). But that code generation for that would require gathering the initializers and hoisting them into code executed before the actual switch. That would be simple and natural in all-out global optimizers, but not in the original C compiler, and most others, which don’t do sophisticated data flow analysis and aggressive code motion.

  35. #35 Alexander Harvey
    2011/09/20

    I am no longer familiar with C but:

    If a language does not include support for statements of the form:

    goat = datatype mustelid();

    for the particular data type of goat

    passing goat by reference without unnecessary initialisation is an equivalent. Worth a warning if you can turn that warning off when you intended to do this.

    That it should fail to highlight a statement that produces no code or no reachable code seems the greater and more general issue. Sometimes compiler foibles become features perhaps this was once the case here.

    Alex

    Alex

  36. #36 John Mashey
    2011/10/13

    Sadly, Dennis Ritchie died earlier this week.

    But do see this.