25 Sep 2011, 01:56

C++ Object Initialization and Finalization Protocol

Note: this article first appeared as a Tip Of The Day in Flipcode.

This is a coding procedure we have adopted as standard in all of Pyro Studios projects. The roots of it probably come from the fact that the first Object Oriented language I learned was Borland’s Turbo Pascal 5; it didn’t have automatic construction or destruction calls, so you had to initialize the instances explicitly. Ok, no big deal, the real benefits were extensible classes and virtual methods.

Then I arrived in the C++ world, reading probably the best writer on C++: Bruce Eckel. All those magic things happening under the hood! Constructors! Virtual multiple inheritance! Templates and Exceptions! WOW. At first I thought every new feature was cooler than the previous one. At some point, however, I came to the conclusion that I just had too many features in my hands, and not all of them were good all the time. But how to structure this? I can handle it just by intuition when I’m coding at home, but when it comes to a programming team of more than ten programmers, intuition is not enough. So I evolved a set of coding standards that would bring more coherence to the code written by a team of people. This is just one of those standards, which I have discussed and refined over time with teammates like Dwight Luetscher, Scott Corley, Terry Wellman, Unai Landa and Jon Beltran de Heredia.

Any class in a software system will conform to this structure:

Constructor: Does not receive parameters, and its body simply assigns a specific value to one member field.

Destructor: Simply calls the End() member function.

It will have at least the following member functions:

bool Init(...);     // Whatever initialization parameters actually needed.
void End();         // No parameters and no return value for finalization.

bool IsOk() const;  // Typically will be inlined.

The Init() function may have parameters, and the return value is some sort of error code indicating whether the object was properly initialized (error codes are a whole separate topic worth its own tip of the day).

The End() function never takes parameters, which means that the object must remember every piece of data it needs to free its resources.

The function IsOk() returns true if the object is correctly initialized and ready for use. It will always return false before the first call to Init(), after a call to End(), or after a call to Init() that returns an error.

It must be safe to call any of these three functions at any point in time. This means you may call Init() several times in a row without leaking memory or resources. You may also call End() repeatedly without the code attempting any invalid operation.

Any other member function may check the IsOk() state before performing its operations. In many cases you will want to use assert() to flag as a programming error any attempt to use an object which failed to initialize properly.

This is a typical implementation:

---- File InitEndTest.h ----

class CInitEndTest
{
  public:
    CInitEndTest    (): m_bOk(false)   { }
    ~CInitEndTest   ()                 { End(); }

    bool    Init    (parameters);
    void    End     ();
    bool    IsOk    ()           const { return m_bOk; }

    void    DoStuff ();

  private:
    bool    m_bOk;
};

---- File InitEndTest.cpp ----

#include "InitEndTest.h"

bool CInitEndTest::Init(parameters)
{
  End();
  // ... perform initializations
  // ... set m_bOk to true if everything goes right.
  return m_bOk;
}

void CInitEndTest::End()
{
  if (m_bOk)
  {
    // Free resources.
    m_bOk = false;
  }
}

void CInitEndTest::DoStuff()
{
  ASSERT(IsOk());
  // Actually do the stuff
}
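
To make the guarantees above concrete, here is a minimal sketch of the protocol applied to a small buffer class. CByteBuffer, GetSize() and ByteBufferProtocolHolds() are names invented for this illustration, and treating a zero size as an initialization failure is an assumption of the example, not part of the protocol itself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical example class: owns a heap buffer of m_size bytes.
class CByteBuffer
{
  public:
    CByteBuffer     (): m_bOk(false)   { }   // Only the flag is set here.
    ~CByteBuffer    ()                 { End(); }

    bool        Init    (std::size_t size);
    void        End     ();
    bool        IsOk    ()       const { return m_bOk; }

    std::size_t GetSize ()       const { assert(IsOk()); return m_size; }

  private:
    bool        m_bOk;
    void       *m_pData;   // Only meaningful while m_bOk is true.
    std::size_t m_size;    // Only meaningful while m_bOk is true.
};

bool CByteBuffer::Init(std::size_t size)
{
  End();                   // Release whatever a previous Init() acquired.
  if (size == 0)
    return false;          // Assumption: an empty buffer counts as an error.
  m_pData = std::malloc(size);
  if (m_pData != NULL)
  {
    m_size = size;
    m_bOk  = true;
  }
  return m_bOk;
}

void CByteBuffer::End()
{
  if (m_bOk)
  {
    std::free(m_pData);
    m_bOk = false;
  }
}

// Walks through the guarantees; returns true if every one of them holds.
bool ByteBufferProtocolHolds()
{
  CByteBuffer buf;
  if (buf.IsOk())          return false;  // Not ok before the first Init().
  if (!buf.Init(16))       return false;
  if (!buf.Init(32))       return false;  // Second Init() frees the first buffer.
  if (buf.GetSize() != 32) return false;
  if (buf.Init(0))         return false;  // A failed Init() reports the error...
  if (buf.IsOk())          return false;  // ...and leaves the object not ok.
  buf.End();
  buf.End();                              // Repeated End() calls are harmless.
  return !buf.IsOk();
}
```

Note that this sketch does not address copying: a class that owns a raw pointer like this one would also need its copy operations disabled or implemented.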

When you derive a new class from one that uses this protocol, you should call the base’s Init() and End() functions at the right places, instead of messing around with the m_bOk member. If you use virtual functions and polymorphic behaviour, then both the destructor and the End() function must be virtual as well (in the base and the derived classes). Supposing the previous example had them properly virtualized, here’s an example of derivation:

---- File VirtualTest.h ----

#include "InitEndTest.h"

class CVirtualTest: public CInitEndTest
{
    typedef CInitEndTest inherited;  // To avoid the actual base's name.

  public:
            CVirtualTest    ()                 { }
    virtual ~CVirtualTest   ()                 { End(); }

            bool    Init    (parameters);
    virtual void    End     ();

  private:
};

---- File VirtualTest.cpp ----

#include "VirtualTest.h"

bool CVirtualTest::Init(parameters)
{
  End();

  // Initialize the base first, passing along whatever parameters it needs.
  if (inherited::Init(/* base parameters */))
  {
    // ... perform my own initializations
    // ... call inherited::End() if they fail.
  }
  return IsOk();
}

/*virtual*/ void CVirtualTest::End()
{
  if (IsOk())
  {
    // Free my own resources
    // Free the base's at the end.
    inherited::End();
  }
}
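
For readers who want something they can compile, the following is a self-contained sketch of the same derivation with the base class properly virtualized. The class names CBase and CDerived, the integer parameters, the rule that a negative value means "initialization failed", and the g_log string used to record the order in which resources are freed are all assumptions made up for this illustration.

```cpp
#include <cassert>
#include <string>

// Records which End() bodies actually freed something, in order.
static std::string g_log;

class CBase
{
  public:
            CBase   (): m_bOk(false)   { }
    virtual ~CBase  ()                 { End(); }

            bool    Init    (int baseValue);
    virtual void    End     ();
            bool    IsOk    ()   const { return m_bOk; }

  private:
    bool    m_bOk;
};

bool CBase::Init(int baseValue)
{
  End();
  m_bOk = (baseValue >= 0);   // Assumption: negative values simulate a failure.
  return m_bOk;
}

void CBase::End()
{
  if (m_bOk)
  {
    g_log += 'B';             // Stand-in for freeing the base's resources.
    m_bOk = false;
  }
}

class CDerived: public CBase
{
    typedef CBase inherited;

  public:
            CDerived  ()               { }
    virtual ~CDerived ()               { End(); }

            bool    Init    (int baseValue, int ownValue);
    virtual void    End     ();
};

bool CDerived::Init(int baseValue, int ownValue)
{
  End();
  if (inherited::Init(baseValue))
  {
    if (ownValue < 0)         // My own initialization failed:
      inherited::End();       // undo the base so IsOk() reports false.
  }
  return IsOk();
}

/*virtual*/ void CDerived::End()
{
  if (IsOk())
  {
    g_log += 'D';             // Stand-in for freeing my own resources.
    inherited::End();         // Free the base's at the end.
  }
}

// Ends a derived object through a base pointer and returns the free order.
std::string DerivedTeardownOrder()
{
  g_log.clear();
  {
    CDerived obj;
    obj.Init(1, 1);
    assert(obj.IsOk());
    CBase *pBase = &obj;
    pBase->End();             // Virtual call reaches CDerived::End().
  }                           // Both destructors call End() again, harmlessly.
  return g_log;               // "DB": derived resources first, then the base's.
}
```

Because End() checks IsOk() before doing anything, the extra calls made by the two destructors find the object already finalized and free nothing twice.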

So what’s the advantage of limiting the use of the powerful C++ constructor / destructor syntax? Simply, that the initialization and lifespan of an instance are not tied together. This can be useful in many situations when you want to reinitialize a set of objects that are pointed to from many different places in the system, for example textures and lost surfaces in a 3D engine after the user Alt-TABs out and back into your app.
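
As a rough sketch of that scenario, the code below ends and re-initializes a texture in place while other parts of the program keep pointing at it. CTexture, its width and height parameters, the std::vector standing in for a real surface, and SurvivesDeviceLoss() are hypothetical names chosen for this example; a real engine would talk to its graphics API instead.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical texture: the vector stands in for a real video surface.
class CTexture
{
  public:
    CTexture    (): m_bOk(false)   { }
    ~CTexture   ()                 { End(); }

    bool    Init     (int width, int height);
    void    End      ();
    bool    IsOk     ()      const { return m_bOk; }

    int     GetWidth ()      const { assert(IsOk()); return m_width; }

  private:
    bool                       m_bOk;
    int                        m_width;
    int                        m_height;
    std::vector<unsigned char> m_pixels;
};

bool CTexture::Init(int width, int height)
{
  End();
  if (width > 0 && height > 0)
  {
    // Four bytes per pixel, cleared to zero.
    m_pixels.assign(static_cast<std::size_t>(width) * height * 4, 0);
    m_width  = width;
    m_height = height;
    m_bOk    = true;
  }
  return m_bOk;
}

void CTexture::End()
{
  if (m_bOk)
  {
    std::vector<unsigned char>().swap(m_pixels);  // Actually release the memory.
    m_bOk = false;
  }
}

// Simulates losing and restoring a surface while two holders point at it.
bool SurvivesDeviceLoss()
{
  CTexture texture;
  if (!texture.Init(64, 32))
    return false;

  const CTexture *pUserA = &texture;   // Two unrelated places in the system
  const CTexture *pUserB = &texture;   // that refer to the same instance.

  texture.End();                       // The surface is lost.
  const bool bSeenAsLost = !pUserA->IsOk() && !pUserB->IsOk();

  const bool bRestored = texture.Init(64, 32);  // Same object, same address.
  return bSeenAsLost && bRestored
      && pUserA->IsOk() && pUserB->GetWidth() == 64;
}
```

The pointers held by the rest of the system never change; only the state behind IsOk() does, which is exactly what constructor-based initialization cannot offer without destroying and recreating the instance.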

Thanks to the ASSERT statements, you quickly catch any use of an object whose initialization return code was not properly checked. With correct inlining, there is no speed penalty for IsOk() calls. And since the constructor doesn’t fill in any unnecessary fields, array allocations are faster to execute.

In some situations where you want to keep the memory or code tight, you might use an existing data member to flag initialization, instead of a separate m_bOk member. For example, in a File class wrapper, if you have a FILE *m_pFile member to point to the actual stdio file, then your constructor would set it to NULL and your IsOk() function would compare it to NULL; an array class might use “m_numElements == 0”, a DirectX wrapper might use “m_pDD == NULL” and so on.
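
The File wrapper case could look roughly like this. It is only a sketch: the class name CFile and the Init() parameters (a file name and an fopen-style mode string) are assumptions for the example, and a real wrapper would of course add read and write functions.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical stdio wrapper: m_pFile doubles as the "initialized" flag.
class CFile
{
  public:
    CFile   (): m_pFile(NULL)      { }
    ~CFile  ()                     { End(); }

    bool    Init    (const char *pszName, const char *pszMode);
    void    End     ();
    bool    IsOk    ()       const { return m_pFile != NULL; }

  private:
    std::FILE *m_pFile;
};

bool CFile::Init(const char *pszName, const char *pszMode)
{
  End();                                   // Close any file left open.
  m_pFile = std::fopen(pszName, pszMode);  // NULL on failure, so IsOk() is false.
  return IsOk();
}

void CFile::End()
{
  if (IsOk())
  {
    std::fclose(m_pFile);
    m_pFile = NULL;
  }
}
```

The protocol is unchanged from the caller's point of view; the only difference is that the class saves one bool per instance.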

“So what’s the big deal? I have done this already when I figured I needed it.” The real advantage lies in the fact that, when you apply this protocol throughout a big system, you always have a standard way of checking the state of an object before you use it. Different programmers don’t invent their own names and code paths, and you can catch missing error checks quickly. As usual, the downside is that even a simple class requires writing a few lines of boring “bureaucratic” code. As time passed, I found myself using this protocol for even the simplest of classes in my own personal projects, and it has now become about as standard a part of development as a good Source Control system.

Note: you may want to dispense with this protocol for classes that are simple data structures with no resource allocation, like math vectors, matrices and such, where initialization is a simple matter of assigning values.