A Different Kind of
Polymorphism
Templates give us a different kind of reuse
mechanism than inheritance. Inheritance means that we'd like to
reuse the implementation and signature of a base class and extend
it in a derived class. Inheritance is type-centric. The derived
class is type compatible with the
base class. Type compatibility means that an instance of the
derived class can be used where an instance of the base class is
required.
On the other hand, templates enable you to reuse
the behavior of a class, but to divorce it from the types involved.
If the type can fit where it's being used, the compiler doesn't
care about the type. For example, all that was required of the type
being passed to the Array template was that it had a
default constructor so that it could be created in arrays. If the
type couldn't live up to that requirement, the compiler would
complain.
Because of the way the compiler generates code
using template arguments, templates give us a different kind of
polymorphism. Instead of polymorphism based on type compatibility,
we've got polymorphism based on signature
compatibility. If the type has the appropriate
function signatures available, the compiler is
perfectly happy. This kind of polymorphism has some interesting
properties, of which ATL makes heavy use.
Using Behavior
Classes
Imagine modifying the Array class so
that it does bounds checking only if you so desire:
template <typename T, size_t MAX_ELEMS = 8,
bool bCheckBounds = true>
class Array {
public:
T& operator[](size_t n) {
if( bCheckBounds && (n < 0 || n >= MAX_ELEMS) )
throw "out of bounds!";
return m_rg[n];
}
protected:
T m_rg[MAX_ELEMS];
};
This allows a client to turn bounds checking on
or off, based on its own requirements:
void main(int argc, char* argv[]) {
#ifdef _DEBUG
bool bCheckBounds = true;
#else
bool bCheckBounds = false;
#endif
Array<long, 256, bCheckBounds> array;
array[256] = 1;
}
The intent here is that we skip the bounds
checks in release mode because we have hopefully caught the
out-of-bounds errors during development. However, we're still doing
a check against bCheckBounds every time through the
operator[] member function. We could hope for a good
compiler that would optimize away the line because
bCheckBounds is known at compile time, or we could remove
all doubt and make use of a behavior
class.
Behavior Classes
Can Group Functions
A behavior class (also known as a trait class or a
trait) is a class that contains some static member functions or
some type definitions. Behavior classes are never meant to be
created. Instead, by putting the functions and type definitions
into a class, we've grouped them and can refer to them as a
unit.
To solve our bounds-checking problem, imagine
two behavior classes, one that checks the bounds and one that does
not:
struct DebugBoundsChecker {
static void CheckBounds(size_t n, size_t nMax) {
if( n < 0 || n >= nMax ) throw "out of bounds!";
}
};
struct ReleaseBoundsChecker {
static void CheckBounds(size_t n, size_t nMax) {}
};
Notice that both the behavior classes have the
same function with the same signature. The debug version does the
bounds checking, and the release version doesn't do anything. And
because both implementations are inline functions, the compiler can
optimize easily. Given only the knowledge of the signature of the
bounds-checking function, I can rewrite the Array class to
use the CheckBounds function scoped by a type name passed
as a template parameter:
template <typename T, size_t MAX_ELEMS = 8,
typename BoundsChecker = DebugBoundsChecker>
class Array {
public:
T& operator[](size_t n) {
BoundsChecker::CheckBounds(n, MAX_ELEMS);
return m_rg[n];
}
protected:
T m_rg[MAX_ELEMS];
};
The client can now take advantage of the bounds
checkingor not, as decided at compile time:
void main(int argc, char* argv[]) {
#ifdef _DEBUG
typedef DebugBoundsChecker BoundsChecker;
#else
typedef ReleaseBoundsChecker BoundsChecker;
#endif
Array<long, 256, BoundsChecker> array;
array[256] = 1;
}
In this case, I've used
signature compatibility to make my decisions at compile time,
resulting in potentially more efficient code. And if I want to add
another kind of bounds-checking behavior class in the future, I
can, if it has a CheckBounds function that lives up to the
signature requirements of the caller.
Simulating Dynamic
Binding
One of the benefits of inheritance is dynamic
binding. A virtual function in the base class can be overridden in
the derived class to extend or replace base class functionality.
One especially powerful expression of this idea is the pure virtual member function. A pure virtual
member function is a virtual member function that must be overridden in the deriving class. In
fact, any class with a pure virtual member function is thought of
as an abstract base class (ABC),
and no instance of that class can be created. To use the
functionality of an ABC, it must be used as a base class, and the
deriving class must implement all the pure virtual member
functions. This is useful because it allows the base class to
define functionality required by the deriving class:
template <typename T>
class Array {
public:
...
virtual int Compare(const Array<T>& rhs) =0;
bool operator< (const Array<T>& rhs)
{ return this->Compare(rhs) < 0; }
bool operator> (const Array<T>& rhs)
{ return this->Compare(rhs) > 0; }
bool operator== (const Array<T>& rhs)
{ return this->Compare(rhs) == 0; }
T m_rg[1024];
};
By defining the Compare function as
pure virtual, the Array class designer has decreed that
Array can be used only as a base class and that the
deriving class must provide some way of comparing two arrays of the
same type so that the comparison operators can be implemented. For
example, a String class must implement the
Compare function:
class String : public Array<char> {
public:
// the compiler requires this method:
int Compare(const Array<char>& rhs)
{ return strcmp(m_rg, rhs.m_rg); }
};
The compiler uses the implementation of the pure
virtual member function provided in the derived class to fill in
the virtual function table entry for that member function (as it
does with all virtual member function pointers). Using function
pointers to invoke member functions is how C++ provides dynamic
bindingthat is, binding to the specific implementation at runtime
based on the type of the variable.
However, we're paying the price of dynamic
binding for our String class. The price of dynamic binding
is a virtual function table, shared among all instances, an extra
4-bye virtual function pointer per object, and at least two extra
pointer indirections to invoke the appropriate implementation. This
seems a high price to pay, given perfect knowledge at compile time.
We know how to compare two String objects. Why not make
the compiler do the work and save us the overhead of dynamic
binding? We can do that by using simulated dynamic binding.
Simulated dynamic binding enables us to provide
the name of the deriving class to the base class. The base class
casts itself to the deriving class to call the required function.
Revising the Array to use simulated dynamic binding looks
like this:
template <typename T, typename Deriving>
class Array {
public:
...
bool operator< (const Array<T, Deriving>& rhs)
{ return static_cast<Deriving*>(this)->Compare(rhs) < 0; }
bool operator> (const Array<T, Deriving>& rhs)
{ return static_cast<Deriving*>(this)->Compare(rhs) > 0; }
bool operator== (const Array<T, Deriving>& rhs)
{ return static_cast<Deriving*>(this)->Compare(rhs) == 0; }
T m_rg[1024];
};
Notice that the Array template takes an
additional parameter: the name of the deriving class. It uses that
class name to perform a static cast on itself. Because the compiler
expands the code of the base class while instantiating the deriving
class, the static cast performs a perfectly safe downcast (normally
a contradiction in terms). The compiler uses the member functions
of the deriving cast to resolve the address of the required
function at compile time, saving us the cost of dynamic binding.
The deriving class must implement the appropriate member function
and pass its own name as a parameter to the base class
template:
class String : public Array<char, String> {
public:
// the compiler requires this method:
int Compare(const Array<char, String>& rhs)
{ return strcmp(m_rg, rhs.m_rg); }
};
This technique gives us the look and feel of
dynamic binding without using virtual member functions. The base
class can require functionality of the deriving class. The base
class cannot be instantiated by itself. The major difference is
that because no virtual functions are required, we don't have the
overhead of dynamic binding.
The Discovery of
Simulated Dynamic Binding
It's my understanding that the ATL team members
discovered simulated dynamic binding accidentally. When they did,
they went immediately to the compiler team members, who claimed
that the C++ Standard does not mandate or prohibit such behavior,
but they promised to keep it working for ATL. Does that mean you
should use simulated dynamic binding when you're writing your most
portable code? Probably not. Should you still understand it because
ATL is rife with it? Absolutely. Should you sneak it into your own
bag of tricks? It's in mine.
|