Log files often contain ‘error’ information. Chris Oldwood suggests they rarely contain anything that could be considered helpful.
Users are not the only people who have to deal with cryptic error messages; sadly support staff and developers can have an equally bad time too. Users have every right to complain as they are rarely able to do anything about it, whereas we programmers have it within our power to write better diagnostic messages to aid our colleagues, and in turn better serve our users by helping resolve their issues more quickly.
 This article looks at some of the techniques we can apply when writing code that is going to throw an exception in response to some unexpected condition. Whilst exceptions are clearly not the only style of error reporting available, I have found that due to the ‘distance’ that often exists between the
 
  throw
 
 and
 
  catch
 
 sites it is all the more important that thought be put into the message generated. I also have my suspicions that their very name ‘exceptions’ lures the unwary programmer into believing they will only rarely occur and so will demand far less attention than other diagnostic messages.
A rude awakening
One ‘very early’ Saturday morning, during my week on the support rota, I got a phone call from the 1st line support team to say that one of the weekend tasks had failed. They had kindly sent me the relevant portion of the service log file and the error said something along these lines:
ERROR: Failed to calibrate currency pair!
The task was a fairly lengthy one (taking over an hour) and there were no obvious clues as to which particular currency pair (of the non-trivial number being processed) that might be the source of the problem. I decided my only recourse was to set a breakpoint on the site of the thrown error and then wait for it to fail. When I located the line of code where the error was generated I was more than slightly annoyed to see it was written something like this:
  void calibrate(string ccy1, string ccy2, . . .)
  {
    . . .
    if (. . .)
      throw Error("Failed to calibrate currency
      pair!");
  }
That’s right, the throw site knew what the exact combination of currency pairs were, but had neglected to include them in the error message. Needless to say once I got back in the office on Monday morning, the first thing I did was to fix the error message to save someone else wasting time unnecessarily in the future, and to allow the support team to do a bit more diagnosis themselves.
Clearer exception messages
The examples that provided the impetus for this article are equally terse:
  throw new Exception(String.Empty);
  throw new NotFoundException("Invalid ID");
One hopes that the first example is simply an unfortunate mistake where the programmer truly intended to come back and write something useful, but clearly forgot. The second is better, but I would contend that it’s only marginally better because there is so much more than could be said by the message.
If we imagine the message in the context of a busy log file, rather than surrounded by its related code, it will likely appear with only the usual thread context information to back it up, e.g.
[date & time, thread, etc] ERROR: Invalid ID
If you also have a stack trace to work from that might at least give you a clue as to what type the ‘ID’ refers to. You may also be able to hunt around and find an earlier diagnostic message from the same thread that could give you the value too, but you’re probably going to have to work to find it.
Consequently at a bare minimum I would also include the ‘logical’ type and, if possible (and it usually is), then the offending value too. As such what I would have thrown is something more like this:
  throw new NotFoundException
    ("Invalid customer ID '{0}'", id);
This message now contains the two key pieces of information that I know are already available at the throw site and so would generate a more support-friendly message such as this:
[...] ERROR: Invalid customer ID '1234-5678'
Discerning empty values
You’ll notice that the value, which in this case was a string value, is surrounded by quotes. This is important in a message where it might not be obvious that the value is empty. For example if I had not added the quotes and the customer ID was blank (a good indication of what the problem might be) then it would have looked thus:
ERROR: Invalid customer ID
Unless you know intimately what all your log messages look like, you could easily be mistaken for believing that the error message contains little more than our first example, when in fact the value is in fact there too, albeit invisibly. In the improved case you would see this:
ERROR: Invalid customer ID ''
For strings there is another possible source of problems that can be hard to spot when there are no delimiters around the value, and that is leading and trailing whitespace. Even when using a fixed width font for viewing log files it can be tricky to spot a single erroneous extra space – adding the quotes makes that a little easier to see:
ERROR: Invalid customer ID '1234-5678 '
As an aside my personal preference for using single quotes around the value probably stems from many years working with SQL. I’ve lifted values out of log files only to paste them into a variety of SQL queries countless times and so automatically including the right sort of quotes seemed a natural thing to do.
Discerning incorrect values
It’s probably obvious but another reason to include the value is to discern the difference between a value that is badly formed, and could therefore never work (i.e. a logic error), and one that is only temporarily unusable (i.e. a runtime error). For example, if I saw the following error I would initially assume that the client has passed the wrong value in the wrong argument or that there is an internal logic bug which has caused the value to become ‘corrupted’:
ERROR: Invalid customer ID '10p off baked beans'
If I wanted to get a little more nit-picky about this error message I would stay clear of the word ‘invalid’ as it is a somewhat overloaded term, much like ‘error’. Unless you have no validation at all for a value, there are usually two kinds of errors you could generate – one during parsing and another during range validation. This is most commonly seen with datetimes where you might only accept ISO-8601 formatted values which, once parsed, could be range-checked, for example, to ensure that an end date does not precede a start date.
To me the kind of error below would still be a terse message, better than some of our earlier ones, but could be even better:
ERROR: Invalid datetime '01/01/2001 10:12 am'
My preferred term for values that fail ‘structural validation’ was adopted from XML parsing – malformed. A value that parses correctly is considered ‘well formed’ whereas the converse would be one that is ‘malformed’. Hence I would have thrown:
ERROR: Malformed datetime '01/01/2001 10:12 am'
For types like a datetime where there are many conventions I would also try and include the common name for the format or perhaps a template if the format is dynamically configurable so the caller doesn’t have to root around in the specs to find out what it should have been:
  ERROR: Malformed datetime '01/01/2001 10:12 am'
    (Only ISO-8601 supported)
  ERROR: Malformed datetime '01/01/2001 10:12 am'
    (Must conform to 'YYYY-MM-DD HH:MM:SS')
Once a value has passed structural validation, if it then fails ‘semantic validation’ I would report that using an appropriate term, such as ‘out of range’ or better yet include the failed comparison. With semantic validation you often know more about the role the value is playing (e.g. it’s a ‘start’ or an ‘end’ date) and so you can be more explicit in your message about what the problem is:
ERROR: The end date '2010-01-01' precedes the start date '2014-02-03'
Terse framework messages
Framework exception messages are notoriously terse. The reason is probably that, like all general purpose library functions, there is a balance between performance, security, etc. and so including the value in the exception may well be highly undesirable in some types of application.
 One of the best examples of brevity in the .Net framework is probably the one below which is thrown from the LINQ extensions, such as
 
  Single()
 
 :
Sequence contains no matching element
 The prevalent use of LINQ in modern C# code is to be applauded over traditional
 
  for
 
 -
 
  next
 
 loops, but as their use grows, so does the possibility of this kind of message popping up when things go awry. Whilst it’s a little extra code to write for the caller, one obvious choice is to use the non-throwing version
 
  SingleOrDefault()
 
 , so that you can detect the failure manually and throw a more suitable exception, e.g.
  var customer = 
    customers.SingleOrDefault(c =>c.Id == id);
  if (customer == null)
  {
    throw new NotFoundException("Failed to find
      customer with ID '{0}'", id);
  }
As always there is a trade-off here. On the one hand I could write less code by directly using the LINQ methods provided, but at the cost of poorer diagnostics. However, as we shall see next, in C# there is nothing to stop me wrapping this extra code in my own extension method to keep the call site more readable.
Custom parsing methods
 Whilst it’s perhaps understandable why the more generic methods are suitably terse, sadly the same also goes for the more focused string parsing methods too, such as
 
  int.Parse()
 
 . On the kinds of systems I work with, I’d happily trade-off much better diagnostics for whatever little extra garbage this technique might incur, especially given that it’s only (hopefully) in exceptional code paths.
 My solution to this problem is the proverbial ‘extra level of indirection’, hence I wrap the underlying non-exception-throwing
 
  TryParse()
 
 style framework methods with something that produces a more satisfying diagnostic. As suggested a moment ago, extension methods in C# are a wonderful mechanism for doing that.
 For example I tend to wrap the standard configuration mechanism with little methods that read a string value, try to parse it, and if that fails then include the setting name and value in the message. Consequently instead of the usual terse
 
  FormatException
 
 affair you get from the .Net framework when parsing an integer value with code like this:
int.Parse(settings["TimeoutInMs"]);
You invoke a helper method like this instead:
  settings.ReadInt("TimeoutInMs");
And if this fails you’ll get a much more useful error message:
ERROR: The 'TimeoutInMs' configuration setting value '3.1415' was not a well formed integer value
This handles the structural validation side of things. For the semantic validation it can either be done with traditional conditional statements or by chaining on a validation method, a là fluent style:
  settings.ReadInt("TimeoutInMs")
    .EnsuringValueIsPositive();
For common ‘types’ of settings this pair can itself be wrapped up further to once again simplify the call site:
  settings.ReadTimeout("SendRequestTimeoutInMs");
Formatting exceptions in C#
 Whenever I create a custom exception type in C#, I generally add a variety of overloads so you can create one directly with a message format and selection of arguments without forcing the caller to manually use
 
  String.Format()
 
 . This is how the example near the beginning worked:
  throw new NotFoundException("Invalid customer ID
   '{0}'", id);
 All the class needs for this kind of use is a signature akin to
 
  String.Format
 
 ’s:
  public NotFoundException(string format, 
    params object[] args)
    : base(String.Format(format, args)) 
  { }
 However this is not enough by itself – it’s dangerous. If we pass a raw message that happens to contain formatting instructions but no arguments (e.g.
 
  "Invalid list {0, 1, 2}"
 
 ) it will throw during the internal call to
 
  String.Format()
 
 , so we need to add a simple string overload as well:
  public NotFoundException(string message)
    : base(message)
  { }
As an aside, I never add a public default constructor by default because I’m not convinced you should be allowed to throw an exception without providing some further information.
 Due to the framework exception classes not having a variable arguments (var-args) style constructor you might already be happy putting formatting calls into the callee, either directly with
 
  String.Format()
 
 or via a simple extension method like
 
  FormatWith()
 
 [
 
  Newton-King
 
 ], e.g.
  throw new NotFoundException(String.Format
    ("Invalid customer ID '{0}'", id));
  throw new NotFoundException
    ("Invalid customer ID '{0}'".FormatWith(id));
One scenario where the var-args style constructor falls down in C# is when you start adding overloads to capture any inner exception. You can’t just add it on the end of the signature due to the final params-based parameter, so you have to add it to the beginning instead:
  public NotFoundException(Exception inner,
    string format, params object[] args)
    : base(String.Format(format, args), inner)
  { }
 Sadly, as you can see from the
 
  base()
 
 constructor call, this is the exact opposite of what the framework does; it expects them on the end because the preceding argument is always a simple string.
In my experience little effort is put in by developers to simulate faults and validate that exception handling is behaving correctly, i.e. verifying ‘what’ is reported and ‘how’. Consequently this makes the var-args overload potentially dangerous as adding an exception argument on the end (by following the .Net framework convention) would cause it to be silently swallowed as no string format placeholder would reference it.
Whilst this has never been a problem for me in practice because I virtually always throw custom exception types and nearly always end up formatting a non-trivial diagnostic message, that’s easy to say when you’re the one who discovers and instigates a pattern – others are more likely to follow the one they’re used to, which is almost certainly the way the .Net framework does it.
Testing exception messages
At the tail end of the previous section I touched briefly on what I think is at the heart of why (server-side) error messages can often be poor – a lack of testing to make sure that they satisfy the main ‘requirement’, which is to be helpful in diagnosing a problem. When unit tests are written for error scenarios it’s all too common to see something simplistic like this:
Assert.That(() => o.DoIt(. . .), Throws.Exception);
 This documents and verifies very little about what is expected to happen. In fact if the
 
  o
 
 variable is a null reference, the test will still pass and not even invoke the behaviour we want to exercise! Our expectations should be higher; but of course at the same time we don’t want to over-specify the behaviour and make the test brittle instead.
If a caller is going to attempt recovery of the failure then they at least need to know what the type of the exception is, lest they be forced to resort to ‘message scraping’ to infer its type. Consequentially it’s beneficial to verify that the exception is of at least a certain type (i.e. it may also be a derived type). Where an obvious value, such as an input argument, is a factor in the error we can include that as a real property if we believe it may be useful for recovery. If the exception type is likely to remain vague we can still loosely test that the value is contained somewhere within the message instead:
  const string id = "malformed-value";
  Assert.That(() => o.DoIt(id),
    Throws.InstanceOf<ValidationException>()
    .And.Message.Contains(id));
If we’re writing tests for a method where there are many arguments to be validated, it’s more important that we try and match on the message too (or some richer-typed property) to discern which of the many arguments caused the exception to be thrown. It’s all too easy when making a change to a method that you end up finding one or more of your tests are passing because they are detecting a different error to the one they were intended to, as the null reference example above highlights.
Whilst that takes care of our code-based contractual obligations we also have an obligation to those who will be consuming these messages to make sure they form a useful narrative within any diagnostic output. This is an article in its own right [ Oldwood ], but using a text editor that performs spell checking of string literals certainly goes a little way towards helping avoid silly mistakes.
Summary
If you’re going to rely on exception messages as a major source of your diagnostic output, such as through the use of Big Outer Try Blocks [ Longshaw ] and classic log files, then it pays to make sure that what you’re going to end up writing contains the information you really need. This article has provided a number of suggestions about how the content of these messages can be improved, along with some ideas about how you can help ensure the code doesn’t get swamped with these tangential concerns so that the underlying logic becomes obscured. The final section looked at what we can do to leverage automated testing as a means of helping to formalise the errors that we generate as an aid to throwing ‘better quality’ exceptions.
Acknowledgements
Thanks as always to the Overload review team for sparing my blushes.
References
[Longshaw] ‘The Generation, Management and Handling of Errors’, Andy Longshaw, Overload #93, http://accu.org/index.php/journals/1586
[Newton-King] ‘FormatWith Extension Method’, James Newton-King, http://james.newtonking.com/archive/2008/03/27/formatwith-string-format-extension-method
[Oldwood] ‘Diagnostic & Support User Interfaces’, Chris Oldwood, http://chrisoldwood.blogspot.co.uk/2011/12/diagnostic-support-user-interfaces.html









 
                     
                    