An Overview of C#.NET

.NET rests on the Common Language Infrastructure (CLI). Microsoft, Intel, and Hewlett-Packard have jointly submitted the CLI as an ECMA standard. The CLI is designed for strongly typed languages and the CLI proposal has 5 partitions. Part 1 specifies the CLI foundation: the CTS, the VES, and the CLS. The Common Type System (CTS) specifies two CLI fundamental types: value types and reference types. Compiling a C# program does not create a regular executable file. Instead it creates a program in Common Intermediate Language (CIL, specified in partition 3). A compiled C# program also contains a block of metadata (data about the program itself) called a manifest (specified in partition 2). This metadata allows reflection and effectively eliminates the need for the registry. The job of the Virtual Execution System (VES) is to translate the CIL into native executable code (which can be done just-in-time or at installation). The Common Language Specification (CLS) is a set of rules designed to allow language interoperability. For example, unsigned integer types are not in the CLS so your C# modules must not expose unsigned integers if you want them to be fully interoperable.

Hello World

The obligatory console Hello World in C# looks like this.

class HelloWorld {
  static void Main() {
    System.Console.WriteLine(
                        "Hello, world!");
  }
}

C# has a sensibly limited preprocessor. There are no macro functions. What you see is what you get. A C# source file is not required to have the same name as the class it contains. Identifiers should follow the camelCasing or PascalCasing notation depending on whether they are private or non-private respectively. Hungarian notation is officially not recommended. C# is a case sensitive language so Main must be spelled with a capital M. A C# program exposing two identifiers differing only in case is not CLS compliant. The CLS supports exception handling and C# accesses these features using the try/catch/finally keywords. Exceptions are used extensively in the .NET framework classes. C# also supports C++ like namespaces as a purely logical scoping/naming mechanism. You can write using directives to bring the typenames in a namespace into scope.

using System; // System.Exception

class HelloWorld {
  static void Main() {
    try {
      NotMain();
    }
    catch (Exception caught) {
      ...
    }
  }
  ...
}

C# Fundamentals

Numeric Types

C# supports 8 integer types (not all of which are CLS compliant) and three floating point types. The floating point literal suffixes for these three types are F/f, D/d, and M/m (think m for money).

Figure 1. C# Integer Types
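
As a rough sketch of the types described above, the following program prints the size in bytes of each of the eight integer types and declares one literal of each floating point type using its suffix. The class name IntegerTypes is only an illustration. The comments mark the four types that are not CLS compliant.

```csharp
using System;

class IntegerTypes {
  static void Main() {
    // Signed and unsigned pairs; sizes are in bytes.
    Console.WriteLine("sbyte  {0}", sizeof(sbyte));  // 1, not CLS compliant
    Console.WriteLine("byte   {0}", sizeof(byte));   // 1
    Console.WriteLine("short  {0}", sizeof(short));  // 2
    Console.WriteLine("ushort {0}", sizeof(ushort)); // 2, not CLS compliant
    Console.WriteLine("int    {0}", sizeof(int));    // 4
    Console.WriteLine("uint   {0}", sizeof(uint));   // 4, not CLS compliant
    Console.WriteLine("long   {0}", sizeof(long));   // 8
    Console.WriteLine("ulong  {0}", sizeof(ulong));  // 8, not CLS compliant

    // One literal per floating point type, using its suffix.
    float single = 1.5F;
    double twice = 1.5D;
    decimal money = 1.5M; // m for money
    Console.WriteLine(single == 1.5F && twice == 1.5D && money == 1.5M);
  }
}
```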

C# expressions follow the standard C/C++/Java rules of precedence and associativity. As in Java, the order of operand evaluation is left to right (in C/C++ it's unspecified), an expression statement must have a side effect (in C/C++ it needn't) and a variable can only be used once it has definitely been assigned (not true in C/C++).

Checked Arithmetic

The CLS allows expressions or statements that contain integer arithmetic to be checked to detect integer overflow. C# uses the checked and unchecked keywords to access this feature. An integer overflow throws an OverflowException when checked. (Integer division by zero always throws a DivideByZeroException.) Floating point expressions never throw exceptions (except when being cast to integers). For example:

class Overflow {
  static void Main() {
    try {
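      // use a variable: a constant overflow such
      // as int.MaxValue + 1 will not compile
      int max = int.MaxValue;
      int x = unchecked(max + 1);
      // wraps to int.MinValue
      int y = checked(max + 1);
      // throws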
    }
    catch (System.OverflowException
                                caught) {
      System.Console.WriteLine(caught);
    }
  }
}
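
The floating point rules mentioned above can be sketched in the same way. In this hypothetical example (the class and method names are made up for illustration), dividing by a floating point zero yields infinity instead of throwing, whereas a checked cast from an out-of-range double to int throws an OverflowException.

```csharp
using System;

class FloatingPoint {
  // Returns false, rather than throwing, when value does not fit in an int.
  public static bool TryNarrow(double value, out int result) {
    try {
      result = checked((int)value);
      return true;
    }
    catch (OverflowException) {
      result = 0;
      return false;
    }
  }

  static void Main() {
    double zero = 0.0;
    double quotient = 1.0 / zero; // no exception is thrown
    Console.WriteLine(double.IsPositiveInfinity(quotient)); // True

    int narrowed;
    Console.WriteLine(TryNarrow(42.9, out narrowed)); // True
    Console.WriteLine(narrowed);                      // 42
    Console.WriteLine(TryNarrow(1e20, out narrowed)); // False
  }
}
```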

Control Flow

C# supports the if/while/for/do statements familiar to C/C++/Java programmers. As in Java, a C# boolean expression must be a genuine boolean expression. There are never any conversions from a built in type to true/false. A variable introduced in a for statement initialization is scoped to that for statement. C# supports a foreach statement, which you can use to effortlessly iterate through an array (or any type that supports the correct interface).

class Foreach {
  static void Main(string[] args) {
    foreach (string arg in args) {
      System.Console.WriteLine(arg);
    }
  }
}

The C# switch statement does not allow fall-through behavior. Every case section (including the optional default section) must end in a break statement, a return statement, a throw statement, or a goto statement. You are only allowed to switch on integral types, bools, chars, strings and enums (these types all have a literal syntax).
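
Here is a small sketch of these rules, switching on a string. The method DaysIn and its month abbreviations are invented for the example, and leap years are ignored. Several case labels may share one section, and a goto case statement stands in for the fall-through that C# forbids.

```csharp
using System;

class SwitchDemo {
  public static int DaysIn(string month) {
    switch (month) {
      case "feb":
        return 28;
      case "apr":
      case "jun":
      case "sep":
      case "nov":
        return 30; // stacked labels share one section
      case "sept":
        goto case "sep"; // explicit jump, not fall-through
      default:
        return 31;
    }
  }

  static void Main() {
    Console.WriteLine(DaysIn("feb"));  // 28
    Console.WriteLine(DaysIn("sept")); // 30
    Console.WriteLine(DaysIn("jan"));  // 31
  }
}
```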

Methods and Parameters

C# does not allow global methods; all methods must be declared within a struct or a class. C# does not have a C/C++ header/source file separation; all methods must be declared inline. Arguments can be passed to methods in three different ways:

copy. The parameter is a copy of the argument. The argument must be definitely assigned. The method cannot modify the argument.
out. The parameter is an alias for the argument. The argument need not be definitely assigned. The method must definitely assign the parameter/argument.
ref. The parameter is again an alias for the argument. The argument must be definitely assigned. The method is not required to assign the parameter/argument.

The ref/out keywords must appear on the method declaration and the method call. For example:

class Calling {
  static void Copies(int param) { ... }
  static void Modifies(out int param)
                                { ... }
  static void Accesses(ref int param)
                                { ... }
  static void Main() {
    int arg = 42;
    Copies(arg); // arg won't change
    Modifies(out arg); // arg will change
    Accesses(ref arg);
                 // arg might change
  }
}
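
To see the three modes in action, here is a runnable variant of the class above with the method bodies filled in. The bodies and the values they assign are assumptions chosen purely for illustration.

```csharp
using System;

class Passing {
  public static void Copies(int param) { param = 0; }        // changes only the copy
  public static void Modifies(out int param) { param = 7; }  // must assign
  public static void Accesses(ref int param) { param += 1; } // may assign

  static void Main() {
    int arg = 42;
    Copies(arg);
    Console.WriteLine(arg); // 42
    Modifies(out arg);
    Console.WriteLine(arg); // 7
    Accesses(ref arg);
    Console.WriteLine(arg); // 8

    int unassigned;           // never given a value here
    Modifies(out unassigned); // allowed for out, not for ref or copy
    Console.WriteLine(unassigned); // 7
  }
}
```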

C# supports method overloading but not return type covariance. Unlike Java, C# does not support method throw specifications (all exceptions are effectively unchecked).

Value Types

C# makes a clear distinction between value types and reference types. Value type instances (values) live on the stack and are used directly whereas reference type instances (objects) live on the heap and are used indirectly. C# has excellent language support for declaring user-defined value types (unlike Java which has none).

Enums and Structs

You can declare enum types in C#. For example:

enum Suit {Hearts, Clubs, Diamonds, Spades}

You can also declare a user-defined value type using the struct keyword. For example:

struct CoOrdinate {
  int x, y;
}

Unlike C++, the default accessibility of struct fields is private. You control the initialization of struct values using constructors. You use the static keyword to declare shared methods and shared fields. The readonly keyword is used for fields that can't be modified and are initialised at runtime. The const keyword is used for fields (and local variables) that can't be modified and are initialised at compile time (and is therefore restricted to enums and built in types). As in Java, each declaration must repeat its access specifier.

struct CoOrdinate {
  public CoOrdinate(int initialX,
                    int initialY) {
    x = rangeCheckedX(initialX);
    y = rangeCheckedY(initialY);
  }
  public const int MaxX = 600;
  public static readonly CoOrdinate
            Empty = new CoOrdinate(0, 0);
  ...
  private int x, y;
}

The built in value type keywords are in fact just a notational convenience. The keyword int (for example) is an alias for System.Int32 , a struct called Int32 that lives in the System namespace. Whether you use int or System.Int32 in a C# program makes no difference.

Operator Overloading

C# supports operator overloading. Enum types automatically support most operators but struct types do not. For example, to allow struct values to be compared for equality/inequality you must write == and != operators:

struct CoOrdinate {
  public static bool operator==(
        CoOrdinate lhs, CoOrdinate rhs) {
    return lhs.x == rhs.x &&
           lhs.y == rhs.y;
  }
  public static bool operator!=(
        CoOrdinate lhs, CoOrdinate rhs) {
    return !(lhs == rhs);
  }
  ...
  private int x, y;
}

Operators must be public static methods. Operator parameters can only be passed by copy (no ref or out parameters). One or more of the operator parameter types must be of the containing type so you can't change the meaning of the built in operators. The increment (and decrement) operator can be overloaded and works correctly when used in either prefix or postfix form. C# also supports conversion operators which must be declared using the implicit or explicit keyword. Some operators (such as simple assignment) cannot be overloaded.
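
The increment and conversion operators can be sketched with a small struct. Celsius is a made-up type for this example and is not part of the framework. A single operator++ serves both the prefix and postfix forms, and the two conversions show the implicit/explicit distinction.

```csharp
using System;

struct Celsius {
  public Celsius(double degrees) { this.degrees = degrees; }

  // One declaration covers both ++c and c++.
  public static Celsius operator ++(Celsius c) {
    return new Celsius(c.degrees + 1);
  }
  // No cast is needed to read a Celsius as a double.
  public static implicit operator double(Celsius c) {
    return c.degrees;
  }
  // A cast is required to turn a double into a Celsius.
  public static explicit operator Celsius(double d) {
    return new Celsius(d);
  }
  private double degrees;
}

class OperatorDemo {
  static void Main() {
    Celsius temp = (Celsius)20.0;
    Celsius before = temp++; // postfix: before is 20, temp is 21
    Celsius after = ++temp;  // prefix: after and temp are both 22
    double b = before, a = after, t = temp;
    Console.WriteLine(b); // 20
    Console.WriteLine(a); // 22
    Console.WriteLine(t); // 22
  }
}
```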

Properties

Rather than using a Java Bean like naming convention, C# uses properties to declare read/write access to a logical field without breaking encapsulation. Properties contain only get and set accessors. The get accessor is automatically called in a read context and the set accessor is automatically called in a write context. For example (note the x and X case difference):

struct CoOrdinate {
  ...
  public int X {
    get { return x; }
    set { x = rangeCheckedX(value); }
  }
  ...
  private static int
            rangeCheckedX(int argument) {
    if(argument < 0 || argument > MaxX) {
      throw new
        ArgumentOutOfRangeException("X");
    }
    return argument;
  }
  ...
  private int x, y;
}

Indexers

An indexer is an operator-like way to allow a user-defined type to be used as an array. An indexer, like a property, can contain only get/set accessors. For example:

struct Matrix {
  ...
  public double this [ int x, int y ] {
    get { ... }
    set { ... }
  }
  public Row this [ int x ] {
    get { ... }
    set { ... }
  }
  ...
}
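
A complete, if minimal, indexer might look like the following sketch. It uses a class called Grid (a hypothetical stand-in for the Matrix above) that forwards a two-argument indexer to a rectangular array.

```csharp
using System;

class Grid {
  public Grid(int rows, int columns) {
    cells = new double[rows, columns];
  }
  public double this[int row, int column] {
    get { return cells[row, column]; }
    set { cells[row, column] = value; }
  }
  private readonly double[,] cells;
}

class IndexerDemo {
  static void Main() {
    Grid grid = new Grid(2, 3);
    grid[1, 2] = 4.5;                      // calls the set accessor
    Console.WriteLine(grid[1, 2] == 4.5);  // calls the get accessor: True
    Console.WriteLine(grid[0, 0] == 0.0);  // untouched cells are zero: True
  }
}
```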

Reference Types

Classes

Classes allow you to create user-defined reference types. One or more reference type variables can easily refer to the same object. A variable whose declared type is a class can be assigned to null to signify that the reference does not refer to an object (struct variables cannot be assigned to null). Assignment to null counts as a definite assignment. Classes can declare constructors, destructors, fields, properties, indexers, and operators. Despite identical syntax, classes and structs have subtly different rules and semantics. For example, you can declare a parameterless constructor in a class but not in a struct. You can initialise fields declared in a class at their point of declaration, but struct fields can only be initialised inside a constructor. Here is a class called MyForm that implements the GUI equivalent of Hello World in C#.NET.

using System.Windows.Forms;
class Launch {
  static void Main() {
    Application.Run(new MyForm());
  }
}
class MyForm : Form {
  public MyForm() { Text = captionText; }
  private string captionText
                       = "Hello, world!";
}

Variables whose declared type is a class can be passed by copy, by ref, and by out exactly as before.

class WrappedInt {
  public WrappedInt(int initialValue)
  { value = initialValue; }
  ...
  private int value;
}
class Calling {
  static void Copies(WrappedInt param)
    { ... }
  static void Modifies(out
               WrappedInt param) { ... }
  static void Accesses(ref
               WrappedInt param) { ... }
  static void Main() {
    WrappedInt arg = new WrappedInt(42);
    Copies(arg); // arg won't change
    Modifies(out arg); // arg will change
    Accesses(ref arg); // arg might change
  }
}

Strings

C# string literals are double quote delimited (char literals are single quote delimited). Strings are reference types so it is easy for two or more string variables to refer to the same string object. The keyword string is an alias for the System.String class in exactly the same way that int is an alias for the System.Int32 struct.

namespace System {
  public sealed class String : ... {
    ...
    public static bool operator==(
          String lhs, String rhs) { ... }
    public static bool operator!=(
          String lhs, String rhs) { ... }
    ...
    public int Length { get { ... } }
    public char this[int index]
      { get { ... } }
    ...
    public CharEnumerator GetEnumerator()
      { ... }
    ...
  }
}

The String class supports a readonly indexer (it contains a get accessor but no set accessor). The C# string type is an immutable type (just like in Java). The string equality and inequality operators are overloaded but the relational operators (< <= > >=) are not. The StringBuilder class is the mutable companion to string and lives in the System.Text namespace. You can iterate through a string expression using a foreach statement.
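
The following sketch pulls these string features together. The Reverse helper is invented for the example. It iterates a string with foreach, builds the result in a StringBuilder, and relies on the overloaded == to compare contents.

```csharp
using System;
using System.Text;

class StringDemo {
  public static string Reverse(string text) {
    StringBuilder builder = new StringBuilder();
    foreach (char c in text) {
      builder.Insert(0, c); // mutates the builder, never the string
    }
    return builder.ToString();
  }

  static void Main() {
    string greeting = "Hello";
    string upper = greeting.ToUpper(); // returns a new string
    Console.WriteLine(greeting);       // Hello (unchanged)
    Console.WriteLine(upper);          // HELLO
    Console.WriteLine(greeting[0]);    // H (readonly indexer)
    Console.WriteLine(greeting.Length); // 5
    Console.WriteLine(Reverse(greeting) == "olleH"); // True
  }
}
```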

Arrays

C# arrays are reference types. The size of the array is not part of the array type. You can declare rectangular arrays of any rank (Java supports only one dimensional rectangular arrays).

int[] row;
int[,] grid;

Array instances are created using the new keyword. Array elements are default initialised to zero (enums and numeric types), false (bool), or null (reference types).

row = new int[42];
grid = new int[9,6];

Array instances can be initialised. A useful initialisation shorthand does not work for assignment.

int[] row = new int[4]{1, 2, 3, 4};
                              // longhand
int[] row = { 1, 2, 3, 4 };  // shorthand
row = new int[4]{ 1, 2, 3, 4 };  // okay
row = {1, 2, 3, 4};  // compile time error

Array indexes start at zero and all array accesses are bounds checked (IndexOutOfRangeException). All arrays implicitly inherit from the System.Array class. This class brings array types into the CLR (Common Language Runtime) and provides some handy properties and methods:

namespace System {
  public abstract class Array : ... {
    ...
    public int Length { get { ... } }
    public int Rank { get { ... } }
    public int GetLength(int rank) { ... }
    public virtual IEnumerator
    GetEnumerator() { ... }
    ...
  }
}
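
As an illustration of these members, the sketch below queries a 9 by 6 rectangular array and then deliberately indexes past its end. The class name and the helper InRange are hypothetical.

```csharp
using System;

class ArrayInfo {
  // Returns true if grid[row, column] can be read without an exception.
  public static bool InRange(int[,] grid, int row, int column) {
    try {
      int ignored = grid[row, column];
      return ignored == ignored;
    }
    catch (IndexOutOfRangeException) {
      return false;
    }
  }

  static void Main() {
    int[,] grid = new int[9, 6];
    Console.WriteLine(grid.Rank);         // 2
    Console.WriteLine(grid.Length);       // 54 (total element count)
    Console.WriteLine(grid.GetLength(0)); // 9
    Console.WriteLine(grid.GetLength(1)); // 6
    Console.WriteLine(grid[8, 5]);        // 0 (default initialised)
    Console.WriteLine(InRange(grid, 8, 5)); // True
    Console.WriteLine(InRange(grid, 9, 0)); // False
  }
}
```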

The element type of an array can itself be an array creating a so called "ragged" array. Ragged arrays are not CLS compliant. You can use a foreach statement to iterate through a ragged array or through a rectangular array of any rank:

class ArrayIteration {
  static void Main() {
    int[] row = { 1, 2, 3, 4 };
    foreach (int number in row) { ... }
    int[,] grid = { { 1, 2 }, { 3, 4 } };
    foreach (int number in grid) { ... }
    int[][] ragged = { new int[2]{1,2},
                   new int[4]{3,4,5,6} };
    foreach (int[] array in ragged) {
      foreach (int number in array) { ... }
    }
  }
}

Boxing

An object reference can be initialised with a value. This does not create a reference referring into the stack (which is just as well!). Instead the CLR makes a copy of the value on the heap and the reference refers to this copy. The copy is created using a plain bitwise copy (guaranteed to never throw an exception). This is called boxing. Extracting a boxed value back into a local value is called unboxing and requires an explicit cast. When unboxing the CLR checks if the boxed value has the exact type specified in the cast (conversions are not considered). If it doesn't the CLR throws an InvalidCastException. C# uses boxing as part of the params mechanism to create typesafe variadic methods (methods that can accept a variable number of arguments of any type).

struct CoOrdinate {
  ...
  private int x, y;
}
class Boxing {
  static void Main() {
    // a struct local must be definitely
    // assigned before its members are used
    CoOrdinate pos = new CoOrdinate(1, 2);
    object o = pos; // boxes
    ...
    CoOrdinate copy = (CoOrdinate)o;
                        // cast to unbox
  }
}

Figure 2. Boxing
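
The exact-type rule and the params mechanism can both be seen in a short sketch. The Sum method is made up for illustration and simply ignores any argument that is not a boxed int.

```csharp
using System;

class BoxingDemo {
  // Accepts any number of arguments of any type; values are boxed.
  public static int Sum(params object[] values) {
    int total = 0;
    foreach (object value in values) {
      if (value is int) {
        total += (int)value; // unbox
      }
    }
    return total;
  }

  static void Main() {
    int number = 42;
    object boxed = number; // boxes a copy
    number = 0;            // the boxed copy is unaffected
    Console.WriteLine((int)boxed); // 42

    try {
      // int converts to long, but the box holds an int, so this throws
      long wrong = (long)boxed;
      Console.WriteLine(wrong);
    }
    catch (InvalidCastException) {
      Console.WriteLine("not a boxed long");
    }

    Console.WriteLine(Sum(1, "two", 3)); // 4
  }
}
```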

Type Relationships

Inheritance

C# supports the same single inheritance model as Java; a class can extend at most one other class (in fact a class always extends exactly one class since all classes implicitly extend System.Object). A struct cannot act as a base type or be derived from. A derived class can access non-private members of its immediate base class using the base keyword. Unlike Java (and like C++) by default C# methods, indexers, properties, and events are not virtual. The virtual keyword specifies the first implementation. The override keyword specifies another implementation. The sealed override combination specifies the last implementation.

class Token {
  ...
  public virtual CoOrdinate Location {
    get {
      ...
    }
  }
}
class LiteralToken : Token {
  ...
  public LiteralToken(string symbol) {
    ...
  }
  public override CoOrdinate Location {
    get {
      ...
    }
  }
}
class StringLiteralToken : LiteralToken {
  ...
  public StringLiteralToken(string
                 symbol) : base(symbol) {
    ...
  }
  public sealed override
                    CoOrdinate Location {
    get {
      ...
    }
  }
}
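
The difference between virtual and non-virtual members shows up when an object is used through a base class reference. The sketch below reuses the class names above, but the Describe and Kind methods are assumptions added only for this example.

```csharp
using System;

class Token {
  public virtual string Describe() { return "token"; }
  public string Kind() { return "Token"; } // not virtual
}
class LiteralToken : Token {
  public override string Describe() { return "literal"; }
  public new string Kind() { return "LiteralToken"; } // hides, does not override
}
class StringLiteralToken : LiteralToken {
  public sealed override string Describe() {
    return "string " + base.Describe(); // base reaches LiteralToken
  }
}

class Dispatch {
  static void Main() {
    Token token = new StringLiteralToken();
    Console.WriteLine(token.Describe()); // string literal
    Console.WriteLine(token.Kind());     // Token
  }
}
```

The virtual call is resolved using the runtime type of the object, whereas the non-virtual call is resolved using the declared type of the variable.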

Interfaces

C# interfaces contain only method signatures. Method bodies are not allowed. Access modifiers are not allowed (all methods are implicitly public). Fields are not allowed (not even static ones). Static methods are not allowed (so no operators). Nested types are not allowed. Properties, indexers, and events (again with no bodies) are allowed though. An interface, struct, or class can have as many base interfaces as it likes.

interface IToken {
  ...
  CoOrdinate Location { get; }
}

A struct or class must implement all its inherited interface methods. Interface methods can be implemented implicitly or explicitly.

class LiteralToken : IToken {
  ...
  public CoOrdinate Location {
    // implicit implementation
    get {
      ...
    }
  }
}
class LiteralToken : IToken {
  ...
  CoOrdinate IToken.Location {
    // explicit implementation
    get {
      ...
    }
  }
}
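
The practical difference between the two styles is in how callers reach the member. In this sketch the interface is cut down to a single hypothetical Text property, and the two implementing classes are given distinct names so that they can live in one program.

```csharp
using System;

interface IToken {
  string Text { get; }
}
class ImplicitToken : IToken {
  public string Text { get { return "implicit"; } }
}
class ExplicitToken : IToken {
  string IToken.Text { get { return "explicit"; } }
}

class InterfaceDemo {
  static void Main() {
    ImplicitToken first = new ImplicitToken();
    Console.WriteLine(first.Text); // implicit

    ExplicitToken second = new ExplicitToken();
    // second.Text would not compile: the member is
    // only reachable through the interface
    IToken viaInterface = second;
    Console.WriteLine(viaInterface.Text); // explicit
  }
}
```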

You use the abstract keyword to declare an abstract class or an abstract method (only abstract classes can declare abstract methods). You use the sealed keyword to declare a class that cannot be derived from. The inheritance notation is positional; base class first, followed by base interfaces.

interface IToken {
  ...
  CoOrdinate Location {
    get;
  }
}

abstract class DefaultToken {
  ...
  protected DefaultToken(CoOrdinate
                               where) {
    location = where;
  }
  public CoOrdinate Location {
    get {
      return location;
    }
  }
  private readonly CoOrdinate location;
}

sealed class StringLiteralToken
                : DefaultToken, IToken {
  ...
}

Runtime type information is available via the is, as, and typeof keywords as well as the object.GetType() method.
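
A brief sketch of all four in use follows. The Shape and Circle classes and the Classify method are invented for the example, and Classify assumes a non-null argument.

```csharp
using System;

class Shape { }
class Circle : Shape { }

class TypeInfoDemo {
  public static string Classify(object o) {
    if (o is Circle) {          // is: tests for type compatibility
      return "circle";
    }
    Shape shape = o as Shape;   // as: converts, or yields null
    if (shape != null) {
      return "shape";
    }
    // GetType/typeof: compares the exact runtime type
    return o.GetType() == typeof(string) ? "string" : "other";
  }

  static void Main() {
    Console.WriteLine(Classify(new Circle())); // circle
    Console.WriteLine(Classify(new Shape()));  // shape
    Console.WriteLine(Classify("text"));       // string
    Console.WriteLine(Classify(42));           // other
  }
}
```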

Resource Management

You can declare a destructor in a class. A C# destructor has the same name as its class, prefixed with a tilde (~). A destructor is not allowed an access modifier or any parameters. The compiler converts your destructor into an override of the object.Finalize method. For example, this:

public class StreamWriter : TextWriter {
  ...
  ~StreamWriter() {
    Close();
  }
  public override void Close() {
    ...
  }
}

is converted into this: (You can use the ILDASM tool to see this transformation in CIL.)

public class StreamWriter : TextWriter {
  ...
  protected override void Finalize() {
    try {
      Close();
    }
    finally {
      base.Finalize();
    }
  }
  public override void Close() {
    ...
  }
}

You are not allowed to call a destructor or the Finalize method in code. Instead, the generational garbage collector (which is part of the CLR) calls Finalize on objects sometime after they become unreachable but definitely before the program ends. You can force a garbage collection using the System.GC.Collect() method. C# does not support struct destructors (although CIL does). However, C# does have a using statement which you can use to scope a resource to a local block in an exception safe way. For example, this:

class Example {
  void Method(string path) {
    using (LocalStreamWriter exSafe =
                new StreamWriter(path)) {
      StreamWriter writer =
                exSafe.StreamWriter;
      ...
    }
  }
}

is automatically translated into this:

class Example {
  void Method(string path) {
    {
      LocalStreamWriter exSafe =
                new StreamWriter(path);
      try {
        StreamWriter writer =
                exSafe.StreamWriter;
        ...
      }
      finally {
        exSafe.Dispose();
      }
    }
  }
}

which relies on LocalStreamWriter implementing the System.IDisposable interface:

public struct LocalStreamWriter
                          : IDisposable {
  public LocalStreamWriter(StreamWriter
                             decorated) {
    local = decorated;
  }

  public static implicit operator
       LocalStreamWriter(StreamWriter
                             decorated) {
    return new
            LocalStreamWriter(decorated);
  }

  public StreamWriter StreamWriter {
    get {
      return local;
    }
  }

  public void Dispose() {
    local.Close();
  }

  private readonly StreamWriter local;
}
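
The exception safety of the using statement can be demonstrated without any file I/O. In the sketch below Resource is a hypothetical class that merely counts how many times it has been disposed. Dispose still runs even though the body of the using statement throws.

```csharp
using System;

class Resource : IDisposable {
  public static int DisposeCount = 0;
  public void Dispose() {
    DisposeCount += 1;
    Console.WriteLine("disposed");
  }
}

class UsingDemo {
  public static void Fail() {
    using (Resource resource = new Resource()) {
      throw new InvalidOperationException("failed");
    } // resource.Dispose() is called here, on the way out
  }

  static void Main() {
    try {
      Fail();
    }
    catch (InvalidOperationException caught) {
      Console.WriteLine("caught " + caught.Message);
    }
    Console.WriteLine(Resource.DisposeCount); // 1
  }
}
```

The program prints "disposed" before "caught failed", because the finally block generated by the using statement runs before the exception reaches the catch clause in Main.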

Program Relationships

Delegates and Events

The delegate is the last C# type. A delegate is a named method signature (similar to a function pointer in C/C++). For example, the System namespace declares a delegate called EventHandler that's used extensively in the Windows.Forms classes:

namespace System {
  public delegate void EventHandler(
        object sender, EventArgs sent);
  ...
}

EventHandler is now a reference type you can use as a field, a parameter, or a local variable. Calling a delegate calls all the delegate instances attached to it.

namespace Not.System.Windows.Forms {
  public class Button {
    ...
    public EventHandler Click;
    ...
    protected void OnClick(EventArgs
                                 sent) {
      if (Click != null) {
        Click(this, sent); // call here
      }
    }
  }
}

All delegate types implicitly derive from the System.Delegate class. You use the event keyword to modify the declaration of a delegate field. Event delegates can only be used in restricted, safe ways (for example, you can't call the delegate from outside its class):

namespace System.Windows.Forms {
  public class Button {
    ...
    public event EventHandler Click;
  }
}

You create an instance of a delegate type by naming a method with a matching signature and you attach a delegate instance to a matching field using the += operator.

using System; // EventHandler, EventArgs
using System.Windows.Forms;

class MyForm : Form {
  ...
  private void initializeComponent() {
    ...
    okButton = new Button();
    okButton.Text = "OK";
    okButton.Click += new
    EventHandler(this.okClick);
    // create + attach
  }
  private void okClick(object sender,
                       EventArgs sent) {
    ...
  }
  ...
  private Button okButton;
}
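
The same mechanism can be tried without Windows Forms. This sketch declares its own minimal Button class (a stand-in, not the real System.Windows.Forms.Button) with a SimulateClick method added purely so that the event can be raised from a console program.

```csharp
using System;

class Button {
  public event EventHandler Click;
  public void SimulateClick() {
    if (Click != null) {
      Click(this, EventArgs.Empty); // calls every attached handler
    }
  }
}

class EventDemo {
  public static int Clicks = 0;
  public static void First(object sender, EventArgs sent) { Clicks += 1; }
  public static void Second(object sender, EventArgs sent) { Clicks += 10; }

  static void Main() {
    Button ok = new Button();
    ok.Click += new EventHandler(First);
    ok.Click += new EventHandler(Second);
    ok.SimulateClick();   // both handlers run: Clicks is 11
    ok.Click -= new EventHandler(First);
    ok.SimulateClick();   // only Second runs: Clicks is 21
    Console.WriteLine(Clicks); // 21
    // ok.Click(ok, EventArgs.Empty); would not compile here:
    // an event can only be raised from inside its own class
  }
}
```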

Assemblies

You can compile a working set of source files (all written in the same supported language) into a .NET module. For example, using the C# command line compiler:

csc /target:module /out:ratio.netmodule *.cs

The default file extension for a .NET module is .netmodule . A .NET module contains types and CIL instructions directly and forms the smallest unit of dynamic download. However, a .NET module cannot be run. The only thing you can do with a .NET module is add it to an assembly. An assembly contains a manifest (a module does not). The manifest is metadata that describes the contents of the assembly and makes the assembly self describing. An assembly knows:

the assembly identity
any referenced assemblies
any referenced modules
types and CIL code held directly
security permissions
resources (eg bitmaps, icons)

You create a .NET DLL (an assembly) using the /target:library option from the command line compiler (there are various other options for adding modules and referencing other assemblies):

csc /target:library /out:ratio.dll *.cs

You create a .NET EXE (an executable assembly) using the /target:exe option on the command line compiler (one of the structs/classes must contain a Main method).

csc /target:exe /out:ratio.exe *.cs

Assemblies come in two forms. A private assembly is not versioned, and is used only by a single application. A shared assembly is versioned, and lives in a special shared directory called the Global Assembly Cache (GAC). Shared assembly version numbers are created using an IP-like numbering scheme:

<major> . <minor> . <build> . <revision>

Shared assemblies that differ only by version number can coexist in the GAC (this is called side-by-side execution). The particular version of an assembly that an individual application uses when running can be controlled from an XML file. For example:

...
<BindingPolicy>
  <BindingRedir
    Name="ratio" ...
    Version="*"
    VersionNew="6.1.1212.14"
    UseLatestBuildRevision="no"/>
</BindingPolicy>
...

You can edit this config file to choose your binding policy. For example:

Safe: exactly as built
Default: major.minor as built
Specific: major.minor as specified.

Attributes

You use attributes to tag code elements with declarative information. This information is added to the metadata, and can be queried and acted upon at translation/run time using reflection. For example, you use the [Conditional] attribute to tag methods you want removed from the release build (calls to conditional methods are also removed):

using System.Diagnostics;

class Trace {
  [Conditional("DEBUG")]
  public static void Write(string
                               message) {
    ...
  }
}

You use the [CLSCompliant] attribute to declare (or check) that a source file conforms to the Common Language Specification:

using System;

[assembly:CLSCompliant(true)]
...

You can use the [MethodImpl] attribute to synchronize a method:

using System.Runtime.CompilerServices;

class Example {
  [MethodImpl(MethodImplOptions.
                           Synchronized)]
  void SynchronizedMethod() {
    ...
  }
}

The attribute mechanism is extensible; you can easily create and use your own attribute types:

public sealed class DeveloperAttribute
                 : Attribute {
  public DeveloperAttribute(string name) {
    ...
  }
}
...
[Developer("Jon Jagger")]
public struct LocalStreamWriter
                          : IDisposable {
  ...
}
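
Reading a custom attribute back at run time might look like the following sketch. The Name property, the AttributeUsage restriction, the Tagged struct, and the DeveloperOf helper are all assumptions added for the example.

```csharp
using System;

[AttributeUsage(AttributeTargets.Class | AttributeTargets.Struct)]
public sealed class DeveloperAttribute : Attribute {
  public DeveloperAttribute(string name) { this.name = name; }
  public string Name { get { return name; } }
  private readonly string name;
}

[Developer("Jon Jagger")]
public struct Tagged { }

class AttributeDemo {
  // Returns the developer recorded on a type, or null if there is none.
  public static string DeveloperOf(Type type) {
    object[] found = type.GetCustomAttributes(
                         typeof(DeveloperAttribute), false);
    if (found.Length == 0) {
      return null;
    }
    return ((DeveloperAttribute)found[0]).Name;
  }

  static void Main() {
    Console.WriteLine(DeveloperOf(typeof(Tagged)));      // Jon Jagger
    Console.WriteLine(DeveloperOf(typeof(int)) == null); // True
  }
}
```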

Summary

C# programs compile into Common Intermediate Language (CIL). CIL types that conform to the CLS (Common Language Specification) can be used by any .NET language. For example, the types in the System namespace are implemented in the mscorlib.dll assembly. Programs written in C#, in VB.NET, or in managed C++, can all use this assembly (there isn't one version of the assembly for each language).

CIL programs are translated into executable programs either at installation time or just-in-time as they are executed by the VES (Virtual Execution System). The CLI (Common Language Infrastructure - the CTS, the VES, the CLS, and the metadata specification) is an ECMA standard and efforts are already underway to implement the CLI on non-Windows platforms (e.g. http://www.go-mono.com).

C# is a modern general purpose programming language. It has clear similarities to Java (reference types, inheritance model, garbage collection) and to C++ (value types, operator overloading, logical namespaces, by default methods are not virtual). It has no backward compatibility constraints (as C++ does to C) and avoids/resolves known problems in Java. The CTS (Common Type System) makes a clear distinction between value types and reference types. The more I use C# the more I like it and the more I appreciate the careful and consistent decisions taken during its design. C# is my language of choice for .NET development. In roughly keeping to the allotted word count I have necessarily omitted numerous important aspects of C#. Nevertheless I hope this article has given you a flavour of C# and its relationship to .NET.