Sunday, 20 September 2015

Compiler Warnings Considered Harmful

In this post I would like to make the argument that compiler warnings are bad. This may seem like a crazy thing to say, but bear with me while I make the case.

There are two times when a compiler is being used, the first is when the developer is writing the code, the second is when an end user or packager is building the package from source. The second use case is actually done far more often for popular software, but is seriously under served by compiler writers. For packagers, compiler warnings do a few things:
  • Break the build when -Werror is enabled and different compilers add new diagnostics.
  • Train people to ignore messages from the compilers.
  • Waste CPU time by checking things the person has no ability to fix anyway.
  • Look ugly.
For the programmer, warnings are actually useful, but misplaced. The compiler's job is to turn code into assembly/bytecode/whatever, and nothing else. The compiler should only stop on an invalid program. We actually already have tools designed to warn the programmer of bugs, and those are called linters and static analysers.

My proposal is simple. Shift all warnings from compilers and into code analysis tools that are as easy to run as the compiler itself. That way programmers get good warnings and our compilers can be faster and less annoying for everyone else. The Go programming language designers already realized this, with a compiler that emits no warnings, and excellent tools for catching bugs (https://golang.org/cmd/vet/, https://github.com/golang/lint).

Whatever happened to do one thing, and one thing well. Compiler warnings are bad design.

7 comments:

  1. Rebuttals welcome, I want other opinions on this.

    ReplyDelete
    Replies
    1. We could go further in detail on the technical and social aspects of the warnings, but we must admit that warnings are to a large extent a result of the language design (those undefined, implementation-defined and unspecified behaviors) and not a result of compiler implementation per se.

      You have options to suppress all warnings, enable or disable specific warnings and treat any produced warning as an error. You have that precisely because one size does not fit all. Some don't care, some may care but are incompetent and some are competent and do care. While probably imperfect, these options are available to all groups of people using compilers.

      As for shifting warnings out of compilers and into code analysis tools, you need to understand that there's a great deal of common logic in the two. Perhaps the only practical difference is that the latter does not generate any code. But all parsing, AST building, code/data-flow analysis must be there in both. Compilers are a natural foundation for this work nowadays (clang/llvm is probably even more natural since it can be used as a library, not necessarily for compilation). Long gone are the days when compilers were too simple and couldn't even ensure that parameters are passed correctly (in number and type (I mean the old style of function prototypes and calls to implicitly declared functions)) and for that you had a separate tool (lint and such). And then your code-analyzing tool would need to have simple integration, so that you don't end up having numerous entirely different and huge makefiles or scripts just to adjust your whole project from compilation to analysis. I'm not in favor of moving the warning functionality out.

      Delete
  2. It's not as simple, IMO.

    If a non-programmer (or an unfamiliar or an incompetent (in the language or the software) programmer) is tasked with building the package, then they can't and shouldn't do anything with warnings. If the build fails because of the warnings or because of anything else that has to do with the code, they need to get support from a programmer from the project. If there's a failure in the resultant software, they should do the same thing.

    If there's a warning, but things seem to work otherwise, it may still be a good thing to get someone at least look at it and potentially fix it. The reason is simple. Some warnings are worth attention and they shouldn't be lost in the noise of all other warnings that are not worth attention.

    There may be benign warnings, where the compiler is just trying to be too smart and overly paternalistic, e.g. warnings suggesting extra parentheses around a subexpression or warnings about an unused variable. But there may be warnings about things like signed with unsigned comparison, still legal, but too often done wrong with potentially far-going consequences.

    If you're dead set against dealing with warnings (and improving the code as a result of fixing them) no matter what, your only way to discover real problems is while using the built software. It's not an ideal stand, but it should be explicitly communicated, so the expectations of others (and the actions based on these expectations) are right.

    On the developer side things aren't very simple either, even though the developer can do something about the code, unlike the person who only presses a button or runs a script to make a build.

    First, warnings come and go. They aren't the same or stable across different compilers or even slightly different versions of the same compiler. It is annoying to have to add extra code or rewrite the existing code just because the compiler started thinking that something may be a problem. Enum cases missing in a switch()? Add "default:break;". That sort of stuff. And then some warnings are compiler bugs, they are generated by mistake, not due to lack of analysis or due to being too pedantic or paternalistic.

    Warnings aren't even stable across optimization levels as some facts about the code are only revealed when it's being analyzed for the purpose of optimization. For this one reason you can't make sure no new warnings will appear in the future even if the code stops changing. The compiler will change and you probably have no say in how.

    Second, there are warnings that fall into the portability category. The code isn't necessarily broken, but non-portable. Like right shifts of signed integers (implementation-defined) or the use of plain chars as array subscripts. The compiler may warn about these and do whatever is defined. But it may be not what another compiler will do with this same code.

    The developer may want to make use of implementation-defined behavior. The developer may also want to catch as many non-portable things as possible to improve portability. There are merits in each. What fits you, won't fit another developer. What fits one project, won't fit another.

    ReplyDelete
    Replies
    1. You said it is not that simple, but then everything thing you wrote is a strong argument for removing warnings from the compiler and moving them checker tools.

      Delete
    2. And duplicating rather complex functionality? And making checker tools accept and appropriately honor all compiler options and possibly even produce fake object files, so you don't end up rewriting your makefiles just to run the checker on the entire project?

      Delete
    3. That is true, from a code point of view, it makes sense to just put the warnings where the compiler already has 90 percent of the info, which is probably the reason things turned out how they did. Unfortunately no easy solution.

      Delete