Deconstructing Solid CodeI have criticised Steve Maguire's "Writing Solid Code" both loud and often. Quite rightly, people asked me to justify my criticism so here is an article that explores why I think his book is not as sound as some people claim.
Dynamic testingFirst of all, I have to question his overall approach to writing solid code. He is very much of the "let's test it until no more errors appear" school and, as anyone knows who's tried that approach, it is time-consuming, tedious and doesn't actually find all the bugs anyway. Quite simply, dynamic testing is never complete.
Leading on from this, he argues that you should maintain a separate debug version from the code you ship. This has three basic flaws:
- dynamic debugging relies on exhaustive testing
- it removes 'catch-all' checks in the critical version - production
- it may accidentally remove 'behaviour'
The third issue is that if you erroneously have some side-effects in your assertions that contribute to the correct behaviour of your program, the 'ship' version won't contain that code. Unlikely? Maybe, but certainly possible because any attempt to validate the behaviour is going to rely on dynamic testing again.
How thorough should your testing be? Maguire talks about 'coverage' and explains that you should step through both arms of each if statement to ensure statement coverage - step through with the debugger, by the way! By talking about paths and statements in this way, he confuses some of the myths underlying test coverage: 100% statement coverage is usually possible but 100% path coverage is often near-impossible. If you have 10 conditions, you need to cover 20 sets of statements to achieve 100% but for paths you need to cover 2^10 ==1024 different cases. How many decisions are in your application?
He mentions performance quite often which leads to some apparent contradictions in his thinking. Having said that you shouldn't ship the debug version because it will be too slow, he then talks about wrapping getchar in a function, commenting that it's only a small overhead so we shouldn't worry "because computers are getting faster all the time". If they're getting faster, why worry so much about the overhead of tests for 'impossible' states?
Despite this, he does at least say that you shouldn't rely on a testing group to find your bugs - you should be trying very hard to find them yourself.
Flying in the face of traditionSince the C committee went to such lengths to standardise existing practice and attempt to bring in some safety features, you'd expect Maguire to advocate taking advantage of this work.
Unfortunately, he insists on defining his own version of assert that deliberately doesn't work like the ISO C version: his is a statement instead of an expression and omits the string created from the expression due to "space limitations".
Assert isn't the only piece of ISO C safety that he chooses to ignore. He seems not to have heard of either const or volatile and shallowly discusses many problems that simply don't occur if you use function prototypes. His disregard for const leads to him labouring the issue of implicit contracts between caller and callee where the callee shouldn't modify data passed by the caller. He causes himself similar extra work by not pushing function prototypes hard enough.
The final kick in the teeth for accepted wisdom in software engineering is his treatment of version control and code inspection. Both heavy contributors to high-quality code, Maguire gives them only the briefest mention as if they were unimportant. Code inspection, particularly auto-mated code inspection, is an order of magnitude more efficient at finding bugs according to a recent Dutch study into software engineering.
'Clever' source codeThis is where I have my greatest disagreements with Maguire. He presents many code fragments which sit on the clever/obscure border depending on your background. As an example of this, he recommends using the macro NULL to indicate an empty statement:
while (condition) NULL;Am I alone in thinking this to be a very peculiar practice? Apart from which, most compilers I know would give a warning that the statement has no side- effects. Maguire has plenty to say about compiler warnings - as close as he ever gets to talking about automated code inspection - and here I agree with him. Compiler warnings are useful: switch them all on and then change your code to eliminate them!
Occasionally, he presents a code fragment with a subtle bug, such as:
p = realloc(p,n);I have to admit that I didn't spot the bug, but then I never use realloc, knowing it to have pitfalls. What is the bug? If realloc cannot allocate the memory, it returns NULL and the assignment means you lose your original pointer.
What are the pitfalls? Realloc may or may not copy the data to a new, larger, area of memory and return the address of that: many programmers forget this and end up with pointers into the old, deallocated, area. Even those programmers who remember will likely fall into the trap that Maguire shows.
Back to 'clever' code though, he prefers:
size-->0 over: --size>=0because of overflow and boundary conditions (rightly) but fails to separate the value and test from the increment operation. Surprisingly, many optimising compilers can actually do a better job if the increment appears in its own statement and I would argue that such code is clearer anyway.
This is an issue of performance again (remember that computers are "getting faster") and he goes on to discuss, at some length, when to use casts, shifts and divides. His claims that compilers will be able to optimise a division into a shift with suitable casts sounds particularly Microsoft-related. Such concerns distract developers from making sure the code is correct in the first place. You should only optimise when you know you have a performance problem and you have profiled the code to locate the bottleneck!
Problems by exampleAnother case of glossing over issues is when he talks about the problems that can be caused by returning the address of static local data. His example uses a string literal and then modifies it: blatantly undefined behaviour with no mention that it is. Eventually (page 240) he notes that you shouldn't modify string literals but since this is hidden away in the answers to his exercises, you could be forgiven for missing it!
In another example, he claims that reading a variable that has just been written is unsafe but there's no mention of volatile here. Again, it's a case of manufacturing problems through a lack of knowledge of the ISO C standard.
Not my styleThere are also items which are particular bugbears of mine which didn't enamour me to Maguire. OK, so these are almost religious issues and I shouldn't criticise him for these alone but when taken in the context of the book as a whole, I found them symptomatic of what I view as his shortcomings. He explains that identifiers used in the book use the Hungarian naming convention. This is where you encode the type information as an unpronounceable prefix to each identifier name. Why do I hate this so much? I've worked in an environment that uses it and I find it distracting. It also forces the implementation into public view: change the type used to implement a concept and you have to change every use of the identifier. In my opinion, Hungarian notation flies in the face of implementation detail encapsulation.
Another quasi-religious issue is multiple returns versus single-entry, single- exit. Maguire advocates multiple returns instead of state variables or duplicated conditions on the grounds that it is clearer. I'd argue the other way but I can always ignore his 'advice' in this case.