Minimising Stack Use and Other Uses of new

Last issue I promised you an article on the use of new to help accommodate deeply recursive functions. Before I do so I want to bring you all up to speed on a technique sometimes called the 'Cheshire Cat' method. Experienced programmers will already be familiar with it either by that name or by some other.

Leaving Only a Smile (pointer)

This technique was originally developed for two other purposes that had little to do with minimising stack use; minimise recompilation and really hide data structures and implementation details from the client programmer. It has now become popular as a method for providing implementations for different platforms as transparently as possible. In the second part of this article I am going to provide an extra twist that uses one of its side effects, namely that data is dynamically assigned to the heap.

First let me clear the air for the less experienced reader. Despite what it says in far too many doorstops that masquerade as books on C++ and OOP, the class is NOT the vehicle to encapsulate an Abstract Data Type. The C++ class is part of your tool-box for implementing a variety of programming methods which include a full representation of an ADT.

All programmers that have passed the C++ kindergarten stage know that it is necessary to use global friends to handle some desirable operators for many ADTs (for example the insert and extract operators for I/O). Those who have progressed rather further will know about handle classes and a variety of other techniques that are desirable to produce a complete ADT. (There may even be some among you who have read James Coplien's Advanced Programming Styles and Idioms to the end who know that classes can be used to support the more traditional, pre-OOP, block structured methodology).

In simple terms an ADT is represented in C++ by a tightly bound group of functions and classes with controlled access between them. This grouping is usually transparent to the client programmer but should be clearly understood by a library designer or anyone else seeking to implement an ADT. Unfortunately those who try to stuff too much into a single client accessible class damage both themselves and their clients.

Whatever you put into an accessible class must be revealed in its declaration (i.e. in a header file). This means that client programmers can inspect your implementation (and with the shocking standard of much documentation, often need to in order to decide the efficiency of your implementation in the context of their program. For example, it is not enough to know that a class can sort data, I need to know what algorithm has been used to determine its suitability for the profile of the data I have.)

There are not a few times when you do not want your CPs to inspect your implementation. Sometimes it contains a currently commercially secret algorithm, at other times you do not want CPs making assumptions based on implementation details which may not remain valid in future releases of your library.

Another circumstance is where you wish to provide alternative implementations for different platforms or for other commercial reasons (such as using a relatively poor implementation in a PD or Shareware version while reserving a much more powerful version for the commercial release).

Now what about the CP's problems or those of a development team working on a large project? Every time you change a general header file all files using that header must be recompiled. Those working on projects that are only a few thousand lines long may be quite happy to have an excuse for a tea break. If you were working on a large complex project you don't want to do a rebuild until you are ready to go home (I recently heard of a 300 000 line project that took 24 hours for a complete rebuild).

What we need is complete separation between interface and implementation. Any competent designer should be able to get the interface stable early on. Indeed I would expect the interface to be stable from the end of the prototype stage of program development. What is not stable for a wide range of reasons is the implementation.

Publicly available header files should only declare the stable interface. The only files that need access to implementation details are those that actually provide those details and the specific source code files that provide the code that puts the interface into effect.

The Smile (solution)

Once you have identified the problem the solution is relatively easy to spot, and even if you do not see it yourself you can easily appreciate it.

Create two classes. For example:

class cat {
    // the implementation details
};

class cheshire {
    cat * smile; 
public:
    // public interface specification 
};

All that needs to be made Publicly available via a header file is the second of these with a declaration of the class name cat.

The source code for cheshire and cat are hived off into their own file or files. When you change the implementation you recompile the relevant files and relink (but not recompile) everything else.

You distribute only the public header file together with the compiled implementation. Now if you change your implementation for whatever reason your CPs only need to relink with your new object code files. They have no access and (if you write proper documentation) no need for access to implementation details. Now you really have hidden the implementation details which is something that cannot be done with a single class.

Cleaning Up

I believe that where it would be wrong for a class to be instantiated globally that construction should be controlled. Only interface classes have any right of access to implementation classes so I suggest that you should use the following skeleton:

class cat {
    friend class cheshire        // provide restricted access
    cat ( ){ .... }              // private constructor
    cat (const cat & c) { .... } // and copy constructor
    // the rest of the implementation

};

class cheshire {
    cat * smile; 
public:
    cheshire (    ){ smile = new cat;  ... }
    ~cheshire (    ){  ...    delete smile;}
    // the rest of the interface 
};

You will need to do a little work to ensure that items are declared and defined in the right places. You will also have to be particularly careful where you put stuff inline as it is quite easy to destroy the main objectives with careless inlining. You may prefer to eliminate inline from development code to ensure minimal rebuilds while the implementation is still fluid. Distribution code may use inline functions if you are not intending to provide upgrades that only require relinking or where you are certain that later versions will not alter the inline function.

Something More to Smile About (stack saving)

One of the side-effects of this program technique is that all data gets transferred from the stack to an allocation from the heap. On many platforms your stack is a precious resource not to be squandered lightly. This is even more the case where you are using recursion.

I can remember my first confrontation with this problem. I was writing a polygon fill function for a graphics package for an implementation of FORTH. I came across what seemed to be an excellent algorithm, not least because I could understand exactly how it worked. However it materialised that it had a fatal flaw, it stacked points to be filled and called itself recursively.

Even a stack based language such as FORTH could not cope with this kind of stack explosion.

I hope you can now see why I took this diversion to explain the 'Cheshire Cat'. Clearly the first improvement to any recursive function is to provide a local data structure to hold the functions internal data in dynamic memory from a heap. For reasons that will become apparent I am not going to use a true local class but provide the necessary functionality via a restricted access global class.

class rdata {
    friend T recurse ( /* parameter list */ );
    rdata ();  // inhibit default constructor
    rdata (const rdata & r) {}; // provide restricted copy constructor
    rdata ( /* initialiser parameters */){...} // provide constructor for 
                                               // normal use by recurse()
    ~rdata (){ ... } // the data structure 
};

T recurse ( /* parameter list */ ){
    rdata r=new rdata( /* initialiser parameters */ }
    // the rest of the function
    delete r; 
}

This certainly reduces the amount of stack use in many cases but in many cases by less than you would like. There is still a parameter list to consider. If you think that passing parameters by reference will solve this problem it is time you stopped to consider how reference passing is probably implemented. Yes, you're right, a hidden pointer, so each parameter probably costs you storage for a pointer on the stack. You will also need to consider the return value.

Smile at Everyone

One way round this is to include space for the parameters in the rdata structure. The problem with this is that you have to make rdata generally accessible so that functions wanting to call the recursive function can construct an rdata object to pass.

Your code now becomes:

class rdata {
    friend void recurse ( rdata & r);
    rdata ();     // inhibit default constructor
    rdata (const rdata & r) { }; // the data structure 
public:
    rdata ( /* initialiser parameters */){...} // provide constructor for 
                                               // normal use by recurse()
    ~rdata (){ ... } 
};

void recurse ( rdata & r){
// the rest of the function }

Why the continued declaration of friendship and the private copy constructor? Well think about the nature of recurse(). It needs a mechanism for relaying data to the next call of itself.

The problem with this version is that the responsibility for constructing (and destructing) an rdata object is passed back to the calling function which is not something to be encouraged.

Only Smile at Friends

A better solution involves using a helper function (which is the only part that needs to know the details.

class rdata {
    friend T recurse ( /* parameter list */ );
    rdata ();     // inhibit default constructor
    rdata (const rdata & r) { }; // provide restricted copy constructor
    rdata ( /* initialiser parameters */ ) { ... }
    // provide constructor for normal use by recurse()
    ~rdata (){ ... }
    void recursif ( ); // the data structure
};

T recurse ( /* parameter  list */  ){
    rdata r=new rdata( /*  initialiser parameters */ }
    r.recursif();
    delete r; 
}

Now we are making progress. Its not quite as good as it looks because we still have at least a return address and a representation of 'this' sitting on the stack for each recursive call.

Too improve this any further we need to try something more drastic. It is time to get serious and dig out your notes on data structures. If you are not familiar with writing push down stacks you will have to go and do some homework because you will need one for the next version.

Don't Call your Friends, Invite Them to Form a Queue

Let me concentrate on rdata::recursif(). Note that I have not specified how the recursion is implemented but have assumed that it is via the call mechanism but it does not have to be that way.

Have a look at this:

void recursif(rdata * r) { 
    int depth =0; 
    rdata * link= r; 
    do {
        depth++;
        link=new rdata(link);
        // code up to recursive call
    } while ( /* test to end recursion */);
    do {
        // tail of code executed after return from recursive call
        delete link;
    } while ( --depth);
}

This would be nice if we could get it to work. To do so we need a number of things. First we must write a that creates a new copy of the rdata structure and then pushes it onto the front of a push-down stack managed through 'link'. We also need to write a destructor that pops items off the stack. I leave the reader to flesh out the following (try a simple recursive function till you get the hang of it).

class rdata {
    friend T recurs( /* parameter list */);
    rdata ();     // no default construction
    rdata (const rdata & r) ; // inhibit
    rdata (rdata * ln);    // define as required above
    rdata ( /* parameter list */) ; // define appropriately
    // the actual data structure
    rdata * link; // implement stack as linked list 
}

T recurs ( /* parameter list */){
    rdata * link = new rdata( /* parameter list */); 
    int depth = 0; 
    do {
        depth++;
        link=new rdata(link);
        // code up to recursive call
    } while ( /* test to end recursion */);
    do {
        // tail of code executed after return from recursive call
        delete link;
    } while ( --depth);
    T t= ...;      // grab a copy of the return value
    delete link;   // get rid of initial data set
    return t; 
}

Notice that we have come the full circle and we are back with a single function and a class that might be declared as a local class. I didn't because it was required to be otherwise in intermediate code and at one time I had the a static pointer to rdata as a member of rdata so that the copy constructor could be hijacked to push new data sets onto the push down stack. I decided that solution was unnecessarily fancy.

It is worth looking at the final cost in speed and size compared with a naive implementation. The link pointer is the only extra piece of data (apart from any data such as depth that is only used once). This is balanced by no call overhead on the recursive function. Hence the total data space will be no more than the naive version and probably less.

On the speed side there is the overhead for using new. If this gets too large you can provide your own version of new for the class which gets memory from the heap in substantial multiples of that for a single data set. You would then need to overload delete so that this memory was returned as each block was freed up. Fancier memory management methods are available to programmers with more experience.

Next time I will look at overloading new and delete and discuss the revisions introduced in the recent ANSI/ISO meetings to provide facilities for providing user defined memory management for arrays.

And finally, please provide any feedback that results from your reading of this article. This is only my first cut at the topic and I expect that there is plenty of room for improvement.