I l@ve RuBoard

Solution

This idiom^[3] is frequently recommended, and it appears as an example in the C++ standard (see the discussion in the accompanying box). It is also exceedingly poor form and causes no end of problems. Don't do it.

^[3] I'm ignoring pathological cases (for example, overloading T::operator&() to do something other than return this). See also Item 38.

Fortunately, as we'll see, there is a right way to get the intended effect.

A Warning Example

In the C++ standard, this example was intended to demonstrate the object lifetime rules, not to recommend a good practice (it isn't). For those interested, here it is, slightly edited for space, from clause 3.8, paragraph 7:

 [Example: 
  struct C {
    int i;
    void f();
    const C& operator=( const C& );
  };
  const C& C::operator=( const C& other)
  {
    if ( this != &other )
    {
      this->~C();     // lifetime of *this ends
      new (this) C(other);
                      // new object of type C created
      f();            // well-defined
    }
    return *this;
  }
  C c1;
  C c2;
  c1 = c2; // well-defined
  c1.f();  // well-defined; c1 refers to a new object of type C
--end example]

As further proof that this is not intended to recommend good practice, note that here C::operator=() returns a const C& rather than a plain C&, which needlessly prevents portable use of these objects in standard library containers.

From the PeerDirect coding standards:

declare copy assignment as T& T::operator=(const T&)
don't return const T&. Though this would be nice, because it prevents usage such as "(a=b)=c", it would mean that, for example, you couldn't portably put T objects into standard library containers, since these require that assignment returns a plain T& (see also Cline95: 212; Murray93: 32-33)

Now let's look at the questions again.

What legitimate goal does it try to achieve?

This idiom expresses copy assignment in terms of copy construction. That is, it's trying to make sure that T's copy assignment and its copy constructor do the same thing, which keeps the programmer from having to needlessly repeat the same code in two places.

This is a noble goal. After all, it makes programming easier when you don't have to write the same thing twice, and if T changes (for example, gets a new member variable) you can't forget to update one of the functions when you update the other.

This idiom could be particularly useful when there are virtual base classes that have data members, which would otherwise be assigned incorrectly at worst or multiple times at best. While this sounds good, it's not really much of a benefit in reality, because virtual base classes shouldn't have data members anyway (see Meyers98, Meyers99, and Barton94). Also, if there are virtual base classes, it means that this class is designed for inheritance, which (as we're about to see) means we can't use this idiom—it is too dangerous.

The rest of the first question was: Correct any coding flaws in the version above.

The above code has one flaw that can be corrected, and several others that cannot.

Problem 1: It Can Slice Objects

The code "this->~T(); new (this) T(other);" does the wrong thing if T is a base class with a virtual destructor. When called on an object of a derived class, it will destroy the derived object and replace it with a T object. This will almost certainly break any subsequent code that tries to use the object. See Item 40 for further discussion of slicing.

In particular, this makes life a royal pain for authors of derived classes (and there are other potential traps for derived classes; see below). Recall that derived assignment operators are usually written in terms of the base's assignment, as follows:

Derived& 
Derived::operator=( const Derived& other )
{
  Base::operator=( other );
  // ...now assign Derived members here...
  return *this;
}

In this case, we get:

class U : T { /* ... */ }; 
U& U::operator=( const U& other )
{
  T::operator=( other );
  // ...now assign U members here...
  //    ...oops, but we're not a U any more!
  return *this;  // likewise oops
}

As written, the call to T::operator=() silently breaks all the code after it (both the U member assignments and the return statement). This often manifests as a mysterious and hard-to-debug run-time error if the U destructor doesn't reset its data members to invalid values.

To correct this problem, we could try a couple of alternatives.

Should U's copy assignment operator call "this->T::~T();" instead, perhaps followed by a placement new of the T base subobject? Well, that would ensure that for a derived object, only the T base subobject will be replaced (rather than have the whole derived object sliced and wrongly transformed into a T). It turns out that there are pitfalls with this approach, but let's keep this idea active for now.
So should we go a step further, and have U's copy assignment operator follow T's idiom of in-place destruction (of the whole U object) and placement reconstruction? This is a better alternative than the previous one, but it well illustrates a secondary weakness of the idiom. If one class uses this idiom, then all its derived classes must use the idiom, too. (This is a bad thing for many reasons, not the least of which is that the default compiler-generated copy assignment operator doesn't use this idiom, and many programmers who derive without adding data members won't even think of providing their own version of operator=(), thinking that the compiler-generated one ought to be okay.)

Alas, both of these patchwork solutions only replace an obvious danger with a subtler one that can still affect authors of derived classes (see below).

Guideline

Avoid the "dusty corners" of a language; use the simplest techniques that are effective.

Guideline

Avoid unnecessarily terse or clever code, even if it's perfectly clear to you when you first write it.

Now for the second question: Even with any flaws corrected, is this idiom safe? Explain.

No. Note that none of the following problems can be fixed without giving up on the entire idiom.

Problem 2: It's Not Exception-Safe

The new statement will invoke the T copy constructor. If that constructor can throw (and many classes do report constructor errors by throwing an exception), then the function is not exception-safe, because it will end up destroying the old object without replacing it with anything.

Like slicing, this flaw will break any subsequent code that tries to use the object. Worse, it will probably cause the program to attempt to destroy the same object twice, because the outside code has no way of knowing that the destructor for this object has already been run. (See Item 40 for further discussion of double destruction.)

Guideline

Always endeavor to write exception-safe code. Always structure code so that resources are correctly freed and data is in a consistent state even in the presence of exceptions.

Problem 3: It Changes Normal Object Lifetimes

This idiom breaks any code that relies on normal object lifetimes. In particular, it breaks or interferes with all classes that use the common "resource acquisition is initialization" idiom. In general, it breaks or interferes with any class whose constructor or destructor has side effects.

For example, what if T (or any base class of T) acquires a mutex lock or starts a database transaction in its constructor and frees the lock or transaction in its destructor? Then the lock/transaction will be incorrectly released and reacquired during an assignment—typically breaking both client code and the class itself. Besides T and T's base classes, this can also break T's derived classes if they rely on T's normal lifetime semantics.

Some will say, "But I'd never do this in a class that acquires/releases a mutex in its constructor/destructor!" The short answer is, "Really? And how do you know that none of your (direct or indirect) base classes does so?" Frankly, you often have no way of knowing this, and you should never rely on your base classes working properly in the face of playing unusual games with object lifetimes.

The fundamental problem is that this idiom subverts the meaning of construction and destruction. Construction and destruction correspond exactly to the beginning/end of an object's lifetime, at which times the object typically acquires/releases resources. Construction and destruction are not meant to be used to change an object's value (and in fact they do not; they actually destroy the old object and replace it with a lookalike that happens to carry the new value, which is not the same thing at all).

Problem 4: It Can Still Break Derived Classes

With Problem 1 solved using the first approach, namely calling "this->T::~T();" instead, this idiom replaces only the "T part" (or T base subobject) within a derived object. For one thing, this should make us nervous because it subverts C++'s usual guarantee that base subobject lifetimes embrace the complete derived object's lifetime—that base subobjects are always constructed before, and destroyed after, the complete derived object. In practice, many derived classes may react well to having their objects' base parts swapped out and in like this, but some may not.

In particular, any derived class that takes responsibility for its base class's state could fail if its base parts are modified without its knowledge (and invisibly destroying and reconstructing an object certainly counts as modification). This danger can be mitigated as long as the assignment doesn't do anything more extraordinary or unexpected than a "normally written" assignment operator would do.

Problem 5: `this != &other`

The idiom relies completely on the this != &other test. (If you doubt that, consider what happens in the case of self-assignment if the test isn't performed.)

The problem is the same as the one we covered in Item 38: Any copy assignment that is written in such a way that it must check for self-assignment is probably not strongly exception-safe.^[4]

^[4] There's nothing wrong with using the this != &other test to optimize away known self-assignments. If it works, you've saved yourself an assignment. If it doesn't, of course, your assignment operator should still be written in such a way that it's safe for self-assignment. There are arguments both for and against using this test as an optimization, but that's beyond the scope of this Item.

This idiom harbors other potential hazards that can affect client code and/or derived classes (such as behavior in the presence of virtual assignment operators, which are always a bit tricky at the best of times), but this should be enough to demonstrate that the idiom has some serious problems.

And for the last part of the second question: If not, how else should the programmer achieve the intended results?

Expressing copy assignment in terms of copy construction is a good idea, but the right way to do it is to use the canonical (not to mention elegant) form of strongly exception-safe copy assignment. Provide a Swap() member that is guaranteed not to throw and that simply swaps the guts of this object with another one as an atomic operation, then write the copy assignment operator as follows:

T& T::operator=( const T& other ) 
{
  T temp( other );  // do all the work off to the side
  Swap( temp );     // then "commit" the work using only
  return *this;     //    nonthrowing operations
}

This method still implements copy assignment in terms of copy construction, but it does it the right, safe, low-calorie, and wholesome way. It doesn't slice objects; it's strongly exception-safe; it doesn't play games with normal object lifetimes; and it doesn't rely on checks for self-assignment.

For more about this canonical form and the different levels of exception-safety, see also Items 8 through 17, and 40.

So, in conclusion, the original idiom is full of pitfalls; it's often wrong; and it makes life a royal pain for the authors of derived classes. I'm sometimes tempted to post the problem code in the office kitchen with the caption: "Here be dragons."^[5]

^[5] True, none of the solutions presented here works for classes with reference members—but that's not a limitation of the solutions, it's a direct result of the fact that classes with reference members shouldn't be assigned. If one needs to change the object that a reference refers to, one really ought to be using a pointer instead.

Guideline

Prefer providing a nonthrowing Swap() and implement copy assignment in terms of copy construction as follows:

// GOOD 
T& T::operator=( const T& other )
{
  T temp( other );
  Swap( temp );
  return *this;
}

Guideline

Never use the trick of implementing copy assignment in terms of copy construction by using an explicit destructor followed by placement new, even though this trick crops up every three months on the newsgroups—that is, never write:

// BAD 
T& T::operator=( const T& other )
{
  if( this != &other)
  {
    this->~T();             // evil
    new (this) T( other );  // evil
  }
  return *this;
}

Some Advice in Favor of Simplicity

My best advice? Don't get fancy. Treat a new C++ feature just as you would treat a loaded automatic weapon in a crowded room. Never use it just because it looks nifty. Wait until you understand the consequences, don't get cute, write what you know, and know what you write.

Cuteness hurts, for at least two reasons. First, cute code hurts portability. The cute programming trick you used to good effect under one compiler may not work at all on another. Worse, the second compiler might accept the code, but produce unintended results. For a simple example, many compilers don't support default template arguments or template specialization, so any code that relies on those features is effectively nonportable today. For a subtler example, many compilers disagree on how to handle multiple active exceptions. And if you just did a double-take—thinking that multiple active exceptions aren't allowed (a popular misconception)—just imagine how the poor compiler manufacturers must lose sleep over it, since there can indeed be any number of simultaneously active exceptions during stack unwinding.

Second, cute code hurts maintainability. Always remember that someone will have to maintain what you write. What you consider a neat programming trick today will be cursed as an incomprehensible rat's nest tomorrow by the person maintaining your code—and that maintainer might be you. For example, a certain graphics library once made cute use of what was then a new language feature: operator overloading. Using this library, your client code to add a control widget to an existing window would look something like this:

Window  w( /*...*/ ); 
Control c( /*...*/ );
w + c;

Instead of being cute, the library writers should have followed Scott Meyers' excellent advice to "do as the ints do." That is, when using operator overloading or any other language feature for your own classes, when in doubt always make your class follow the same semantics as the builtin and standard library types.

Write what you know when writing code. Consciously try to restrict 90% of your coding to the language features you understand well enough to write correctly in your sleep (you do dream about C++ code, don't you?), and use the other 10% to gain experience with other features. For those newer to C++, this advice may at first mean using C++ as just a better C, and there's absolutely nothing wrong with that.

Next, know what you write. Be aware of the consequences of what your code actually does. Among other things, keep up your reading in books like this one; relevant Internet newsgroups, such as comp.lang.c++.moderated and comp.std.c++; and trade magazines, such as C/C++ User's Journal, and Dr. Dobb's Journal. You'll need that continuous self-improvement to stay current with and deepen your understanding of your chosen programming language.

To illustrate the importance of knowing what you write, consider the dangerous idiom that we just discussed in this Item—namely, the trick of trying to implement copy assignment in terms of explicit destruction followed by in-place new. This trick crops up regularly every few months on the C++ newsgroups, and those who promote it generally mean well, intending to avoid code duplication—but do these programmers really know what they write? Probably not, because there are subtle pitfalls here, some of them quite serious. This Item has covered most of them. There wasn't anything wrong with the original goal of improving consistency, but in this case, knowing what you write would lead the programmer to achieve consistency in a better way—for example, by having a common private member function that does the work and is called by both copy construction and copy assignment.

In short, avoid the dusty corners of the language, and remember that cuteness hurts in many ways. Above all, write what you know, and know what you write, and you're likely to do well.

I l@ve RuBoard

Solution

A Warning Example

Problem 1: It Can Slice Objects

Problem 2: It's Not Exception-Safe

Problem 3: It Changes Normal Object Lifetimes

Problem 4: It Can Still Break Derived Classes

Problem 5: this != &other

Some Advice in Favor of Simplicity

Problem 5: `this != &other`