C++ Pitfalls

This is an attempt to provide an overview of many of the C++ pitfalls that beginning to moderately experienced C++ programmers often fail to understand. Specifically, this addresses mistakes that I've seen from newer KDE contributors over the last couple of years. Please note that this is not an attempt to replace a good C++ reference, but simply to provide an introduction to some often misunderstood concepts and to point out their usefulness.

References

References are a way of assigning a "handle" to a variable. There are two places where this is used in C++. We'll discuss both of them briefly.

Assigning References

This is the less often used variety of references, but still worth noting as an introduction to the use of references in function arguments. Here we create a reference that looks and acts like a standard C++ variable except that it operates on the same data as the variable that it references.

int foo = 3;     // foo == 3
int &bar = foo;  // foo == 3
bar = 5;         // foo == 5
	

Here, because we've made bar a reference to foo, changing the value of bar also changes the value of foo.

Passing Function Arguments With References

The same concept of references is used when passing variables. For example:

void foo( int &i )
{
    i++;
}

int main()
{
    int bar = 5;   // bar == 5
    foo( bar );    // bar == 6
    foo( bar );    // bar == 7

    return 0;
}
	

Here we display one of the two common uses of references in function arguments — they allow us to use the conventional syntax of passing an argument by value but manipulate the value in the caller.

Note: While sometimes useful, using this style of references can sometimes lead to counter-intuitive code. It is not clear to the caller of foo() above that bar will be modified without consulting an API reference.

However there is a more common use of references in function arguments — they can also be used to pass a handle to a large data structure without making multiple copies of it in the process. Consider the following:

void foo( const std::string &s )
{
    std::cout << s << std::endl;
}

void bar( std::string s )
{
    std::cout << s << std::endl;
}

int main()
{
    std::string text = "This is a test.";

    foo( text ); // doesn't make a copy of "text"
    bar( text ); // makes a copy of "text"

    return 0;
}
	

In this simple example we're able to see the difference between pass by value and pass by reference. In this case pass by value just expends a few additional bytes, but imagine, for instance, that text contained the text of an entire book.

The ability to pass it by reference keeps us from needing to make a copy of the string and avoids the ugliness of using a pointer.
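
One way to convince yourself of the difference is to count the copies. The following is only a sketch: the Tracked class, its static copies counter and the three functions are hypothetical names made up for this illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper class that counts its own copies.
class Tracked
{
public:
    explicit Tracked( const std::string &text ) : m_text( text ) {}

    // The copy constructor records every copy that is made.
    Tracked( const Tracked &other ) : m_text( other.m_text ) { ++copies; }

    std::string::size_type length() const { return m_text.length(); }

    static int copies;

private:
    std::string m_text;
};

int Tracked::copies = 0;

// Pass by const reference: works on the caller's object directly.
std::string::size_type byReference( const Tracked &t )
{
    return t.length();
}

// Pass by value: the argument is copied for every call.
std::string::size_type byValue( Tracked t )
{
    return t.length();
}

// Returns how many copies one call of each kind has caused.
int copiesMade()
{
    Tracked text( "This is a test." );
    const int before = Tracked::copies;

    byReference( text );
    assert( Tracked::copies == before ); // no copy so far

    byValue( text );
    return Tracked::copies - before;     // only byValue() copied
}
```

Calling copiesMade() should report a single copy, the one made for byValue().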

It should also be noted that this only makes sense for complex types, that is, classes and structs. In the case of built-in types (int, float, bool, etc.) there are no savings in using a reference instead of simply using pass by value.

Note: We'll discuss the const-ness of variables passed by reference in the next section.

The const Keyword

There is a lot of confusion about the use of const. I'll try to cover the basic syntax and the places where it is appropriate.

const Variables

Either local or class-level variables may be declared const indicating that you don't intend to change their value after they're initialized. const local variables are often used in conjunction with const return values, which will be discussed below.

For example:

int main()
{
    const int i = 10;

    i = 3;            // ERROR - we can't change "i"

    int &j = i;       // ERROR - we promised not to
                      // change "i" so we can't
                      // create a non-const reference
                      // to it

    const int &x = i; // fine - "x" is a reference
                      // to "i"

    return 0;
}
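
The example above only covers a local variable. For a class-level const variable, a minimal sketch might look like the following. The Ticket class and the firstTicketNumber() function are hypothetical and exist only for this illustration.

```cpp
#include <cassert>

// Hypothetical class with a const member variable.
class Ticket
{
public:
    // A const member has to be set in the initializer list;
    // assigning to it in the constructor body would not compile.
    explicit Ticket( int number ) : m_number( number ) {}

    int number() const { return m_number; }

    // void setNumber( int n ) { m_number = n; } // ERROR if uncommented -
    //                                           // "m_number" is const

private:
    const int m_number; // fixed for the lifetime of the object
};

int firstTicketNumber()
{
    const Ticket t( 1 );
    assert( t.number() == 1 );
    return t.number();
}
```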
	

const Methods

Const methods are a way for us to say that a method does not modify the member variables of a class. It's a hint both to the programmer and the compiler that a given method doesn't change the internal state of a class.

Take for example:

class Foo
{
public:
    int value() const
    {
        return m_value;
    }

    void setValue( int i )
    {
        m_value = i;
    }

private:
    int m_value;
};
	

Here value() clearly does not change m_value and as such can and should be const. However setValue() does modify m_value and as such cannot be const.

Another subtlety that is often missed is that, by extension, a const method cannot call a non-const method of the same object (and the compiler will complain if you try). Since we assume that a non-const method does modify the member variables of our class, we can't call it from a method where we've promised not to do just that.
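
As a small sketch of that rule, consider a hypothetical Counter class (all names here are invented for the illustration). The offending call is left commented out so that the rest still compiles.

```cpp
#include <cassert>

// Hypothetical class used only to illustrate the rule.
class Counter
{
public:
    Counter() : m_count( 0 ) {}

    void increment() { ++m_count; }       // non-const: modifies state

    int count() const { return m_count; } // const: read-only

    int doubled() const
    {
        // increment();     // ERROR if uncommented - a const method
        //                  // cannot call a non-const method
        return 2 * count(); // fine - count() is const
    }

private:
    int m_count;
};

int doubledAfterOneIncrement()
{
    Counter c;
    c.increment();
    assert( c.count() == 1 );
    return c.doubled();
}
```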

Also note the placement of the keyword in the method declaration: it must follow the method's parameter list.

Specifically all of the following cases are different:

class Foo
{
public:
    /*
     * May modify m_widget and the user
     * may modify the returned widget.
     */
    Widget *widget();

    /*
     * Does not modify m_widget but the
     * user may modify the returned widget.
     */
    Widget *widget() const;

    /*
     * May modify m_widget, but the user
     * may not modify the returned widget.
     */
    const Widget *cWidget();

    /*
     * Does not modify m_widget and the user
     * may not modify the returned widget.
     */
    const Widget *cWidget() const;

private:
    Widget *m_widget;
};
	

Note: It is also interesting to note from the above that while the compiler differentiates between const and non-const methods and allows overloading based on a method's const-ness, it does not allow overloading on the return type alone. If the last two methods had also been named widget() the compiler would have given us an error, because they would differ from the first two only in their return types.
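
Which of two such overloads gets called depends on whether the object is accessed as const. The following sketch uses a hypothetical Document class and two helper functions, all invented for this illustration, to make the selection visible.

```cpp
#include <cassert>
#include <string>

// Hypothetical class with a method overloaded on const-ness.
class Document
{
public:
    std::string which()       { return "non-const overload"; }
    std::string which() const { return "const overload"; }

    // const std::string which(); // ERROR if uncommented - differs from
    //                            // the first overload only in its
    //                            // return type
};

std::string throughReference( Document &d )
{
    return d.which(); // "d" is not const: the non-const overload is used
}

std::string throughConstReference( const Document &d )
{
    return d.which(); // "d" is const: only the const overload is callable
}

void checkOverloads()
{
    Document d;
    assert( throughReference( d ) == "non-const overload" );
    assert( throughConstReference( d ) == "const overload" );
}
```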

const Return Values

We saw in the above example the methods named cWidget(), where the return type had const prepended. In that position it says that the returned value (here, the Widget that the returned pointer refers to) may not be modified.

Consider (using the above class):

int main()
{
    Foo f;

    Widget *w1 = f.widget();  // fine

    Widget *w2 = f.cWidget(); // ERROR - "cWidget()"
                              // returns a const value
                              // and "w2" is not const

    const Widget *w3 = f.cWidget(); // fine

    return 0;
}
	

So, if we are using a method with a const return value we must assign the value to a const local variable.

If such a const return value is a pointer or a reference to a class then we cannot call non-const methods on that pointer or reference since that would break our agreement not to change it.
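
Here is a rough sketch of that restriction. The Widget and Foo classes below are simplified stand-ins for the ones used above, and the rotate() and angle() methods are assumptions made for the sake of the example.

```cpp
#include <cassert>

// Simplified stand-in for the Widget type used above.
class Widget
{
public:
    Widget() : m_angle( 0 ) {}

    void rotate() { m_angle += 90; }      // non-const: changes the widget
    int angle() const { return m_angle; } // const: only reads

private:
    int m_angle;
};

// Simplified stand-in for Foo; it holds its widget directly.
class Foo
{
public:
    Widget *widget() { return &m_widget; }
    const Widget *cWidget() const { return &m_widget; }

private:
    Widget m_widget;
};

int angleAfterOneRotation()
{
    Foo f;
    const Widget *w = f.cWidget();

    // w->rotate();            // ERROR if uncommented - rotate() is
    //                         // not const
    assert( w->angle() == 0 ); // fine - angle() is const

    f.widget()->rotate();      // fine - widget() returns a pointer to
                               // a non-const Widget
    return w->angle();         // "w" refers to the same widget
}
```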

Note: As a general rule methods should be const except when it's not possible to make them such. While getting used to the semantics you can use the compiler to inform you when a method may not be const — it will give an error if you declare a method const that needs to be non-const.

const Function Arguments

The keyword const can also be used as a guarantee that a function will not modify a value that is passed in. This is really only useful for references and pointers (and not things passed by value), though there's nothing syntactically to prevent the use of const for arguments passed by value.

Take for example the following functions:

void foo( const std::string &s )
{
    s.append("blah"); // ERROR - we can't modify the string

    std::cout << s.length() << std::endl; // fine
}

void bar( const Widget *w )
{
    w->rotate(); // ERROR - rotate wouldn't be const

    std::cout << w->name() << std::endl; // fine
}
	

In the first example we tried to call a non-const method, append(), on an argument passed as a const reference. That would break our agreement with the caller not to modify it, so the compiler gives us an error.

The same is true of rotate() in the second example, but with a pointer to a const Widget.
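
For completeness, here is a sketch of what const means on an argument passed by value: it only restricts the function's own copy and makes no difference to the caller. The function names are hypothetical.

```cpp
#include <cassert>

// const on a by-value argument: the function works on its own copy
// and merely promises itself not to change that copy.
int twice( const int n )
{
    // n *= 2;    // ERROR if uncommented - "n" is const here
    return n * 2;
}

// Without const the copy may be modified, but the caller's
// variable is untouched either way.
int twiceInPlace( int n )
{
    n *= 2;
    return n;
}

int callerValueAfterCalls()
{
    int x = 4;
    assert( twice( x ) == twiceInPlace( x ) );
    return x; // neither call has changed "x"
}
```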

Public, Protected and Private Labels

C++ supports three labels that can be used in classes (or structs) to define the permissions for the members in that section of the class. These labels can be used multiple times in a class declaration for cases where it's logical to have multiple groups of these types.

These keywords affect the permissions of the members — whether functions or variables.

public

This label is used to say that the methods and variables may be accessed from any portion of the code that knows the class type. This should usually only be used for member functions (and not variables) and should not expose implementation details.

protected

Only the class itself and its subclasses may access these functions or variables. Many people prefer to also keep this restricted to functions (as opposed to variables) and to use accessor methods for getting to the underlying data.

private

This is used for members that cannot be used from subclasses or from anywhere else outside the class. This is usually the domain of member variables and helper functions. It's often useful to start off by putting functions here and to only move them to the higher levels of access as they are needed.
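
To put the three labels side by side, here is a small sketch. The Account and ResettableAccount classes and all of their members are hypothetical names chosen for this illustration.

```cpp
#include <cassert>

// Hypothetical class using all three labels.
class Account
{
public:
    Account() : m_balance( 0 ) {}

    int balance() const { return m_balance; }

    void deposit( int amount )
    {
        if( isValid( amount ) ) // fine - private helper used inside the class
            m_balance += amount;
    }

protected:
    // Available to subclasses, but not to outside code.
    void setBalance( int balance ) { m_balance = balance; }

private:
    bool isValid( int amount ) const { return amount > 0; }

    int m_balance;
};

class ResettableAccount : public Account
{
public:
    void reset()
    {
        setBalance( 0 );  // fine - protected member of the base class
        // m_balance = 0; // ERROR if uncommented - "m_balance" is private
    }
};

int balanceAfterReset()
{
    ResettableAccount a;
    a.deposit( 5 );
    assert( a.balance() == 5 );
    // a.setBalance( 9 ); // ERROR if uncommented - protected
    a.reset();
    return a.balance();
}
```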

Note: It is often overlooked that different instances of the same class may access each other's private or protected members. A common case for this is in copy constructors.

class Foo
{
public:
    Foo( const Foo &f )
    {
        m_value = f.m_value; // perfectly legal
    }

private:
    int m_value;
};
	

(It should however be mentioned that the above is not needed as the default copy constructor will do the same thing.)

virtual Methods

virtual methods are an essential part of designing a class hierarchy and subclassing classes from a toolkit. The concept is relatively simple, but often misunderstood. Specifically it determines the behavior of overridden methods in certain contexts.

Placing the keyword virtual before a method declaration says that when an instance of a subclass is referred to by a pointer or reference to a base class, the correct implementation should be resolved at run time, and the implementation from the most derived class (the actual type of the object) should be used.

Again, this should be more clear with an example:

class Foo
{
public:
    void f()
    {
        std::cout << "Foo::f()" << std::endl;
    }
    virtual void g()
    {
        std::cout << "Foo::g()" << std::endl;
    }
};

class Bar : public Foo
{
public:
    void f()
    {
        std::cout << "Bar::f()" << std::endl;
    }
    virtual void g()
    {
        std::cout << "Bar::g()" << std::endl;
    }
};

int main()
{
    Foo foo;
    Bar bar;

    Foo *baz = &bar;
    Bar *quux = &bar;

    foo.f(); // "Foo::f()"
    foo.g(); // "Foo::g()"

    bar.f(); // "Bar::f()"
    bar.g(); // "Bar::g()"

    // So far everything we would expect...

    baz->f();  // "Foo::f()"
    baz->g();  // "Bar::g()"

    quux->f(); // "Bar::f()"
    quux->g(); // "Bar::g()"

    return 0;
}
	

Our first calls to f() and g() on the two objects are straightforward. However things get interesting with our baz pointer which is a pointer to the Foo type.

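f() is not virtual and as such a call to f() will always invoke the implementation associated with the pointer type, in this case the implementation from Foo. g(), however, is virtual, so the call is resolved at run time according to the type of the object that baz actually points to, and Bar::g() is invoked.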
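
The same rules apply when a reference is used instead of a pointer. The sketch below reuses the Foo and Bar names from above, but changes the methods to return their text rather than print it (an assumption made so that the result can be checked); callF() and callG() are hypothetical helpers.

```cpp
#include <cassert>
#include <string>

class Foo
{
public:
    std::string f() const { return "Foo::f()"; }
    virtual std::string g() const { return "Foo::g()"; }
};

class Bar : public Foo
{
public:
    std::string f() const { return "Bar::f()"; }
    virtual std::string g() const { return "Bar::g()"; }
};

// Both functions only know that they have been handed "some Foo".
std::string callF( const Foo &foo ) { return foo.f(); }
std::string callG( const Foo &foo ) { return foo.g(); }

void checkDispatch()
{
    Bar bar;
    assert( callF( bar ) == "Foo::f()" ); // chosen by the reference type
    assert( callG( bar ) == "Bar::g()" ); // chosen by the object's real type
}
```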

"Pure" Virtual Methods

There is one additional interesting possibility: sometimes we don't want to provide an implementation of our function at all, but want to require people subclassing our class to provide an implementation of their own. This is the case for "pure" virtuals.

To indicate a "pure" virtual method instead of an implementation we simply add an "= 0" after the function declaration.

Again — an example:

class Widget
{
public:
    virtual void paint() = 0;
};

class Button : public Widget
{
public:
    virtual void paint()
    {
        // do some stuff to draw a button
    }
};
	

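Because paint() is a pure virtual method in the Widget class, Widget itself cannot be instantiated, and we are required to provide an implementation in every subclass that we want to create instances of. If we don't, the compiler will give us an error at build time when we try to create such an object.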
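
Since code that fails to build can't be run, the sketch below leaves the offending lines commented out and instead asks the compiler which classes it considers abstract. It assumes a C++11 compiler for std::is_abstract from the type_traits header; the Label class is a hypothetical subclass that "forgets" to implement paint().

```cpp
#include <cassert>
#include <type_traits>

class Widget
{
public:
    virtual void paint() = 0;
};

class Button : public Widget
{
public:
    virtual void paint()
    {
        // do some stuff to draw a button
    }
};

// Hypothetical subclass that does not implement paint().
class Label : public Widget
{
};

// Widget w; // ERROR if uncommented - Widget is abstract
// Label l;  // ERROR if uncommented - paint() is still pure virtual

void checkAbstractness()
{
    Button b; // fine - Button implements paint()
    b.paint();

    assert( std::is_abstract<Widget>::value );
    assert( std::is_abstract<Label>::value );
    assert( !std::is_abstract<Button>::value );
}
```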

This is helpful for providing interfaces — things that we expect from all of the objects based on a certain hierarchy, but when we want to ignore the implementation details.

So why is this useful?

Let's take our example from above where we had a pure virtual for painting. There are a lot of cases where we want to be able to do things with widgets without worrying about what kind of widget it is. Painting is an easy example.

Imagine that we have something in our application that repaints widgets when they become active. It would just work with pointers to widgets — i.e. Widget *activeWidget() const might be a possible function signature. So we might do something like:

Widget *w = window->activeWidget();
w->paint();
	

We want to actually call the appropriate paint method for the "real" widget type, not Widget::paint() (which, being a "pure" virtual, normally has no implementation to call). By using a virtual method we ensure that the method implementation for our subclass, Button::paint() in this case, will be called.
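
The same idea extends to a whole collection of widgets. In the sketch below paint() returns a description of what was painted instead of drawing anything; that change, the Checkbox class and the paintAll() function are assumptions made purely so that the example is self-contained and its result can be inspected.

```cpp
#include <cassert>
#include <string>
#include <vector>

class Widget
{
public:
    // Returns a description instead of drawing (assumption of this sketch).
    virtual std::string paint() const = 0;
};

class Button : public Widget
{
public:
    virtual std::string paint() const { return "Button"; }
};

// Hypothetical second widget type.
class Checkbox : public Widget
{
public:
    virtual std::string paint() const { return "Checkbox"; }
};

// Paints every widget without knowing any of their real types.
std::string paintAll( const std::vector<Widget *> &widgets )
{
    std::string log;

    for( std::vector<Widget *>::size_type i = 0; i < widgets.size(); ++i )
        log += widgets[i]->paint() + "\n";

    return log;
}

void checkPaintAll()
{
    Button button;
    Checkbox checkbox;

    std::vector<Widget *> widgets;
    widgets.push_back( &button );
    widgets.push_back( &checkbox );

    assert( paintAll( widgets ) == "Button\nCheckbox\n" );
}
```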

Note: While it is not required to use the virtual keyword in our subclass implementations (since if the base class implementation is virtual all subclass implementations will be virtual) it is still good style to do so.
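
A short sketch of this point, using three hypothetical classes: Middle overrides name() without repeating the virtual keyword, yet a call through a Middle reference still reaches the implementation in Derived.

```cpp
#include <cassert>
#include <string>

class Base
{
public:
    virtual std::string name() const { return "Base"; }
};

class Middle : public Base
{
public:
    // No virtual keyword here, but the method is still virtual
    // because Base declared it so.
    std::string name() const { return "Middle"; }
};

class Derived : public Middle
{
public:
    std::string name() const { return "Derived"; }
};

std::string nameOf( const Middle &m )
{
    return m.name(); // resolved at run time
}

void checkImplicitVirtual()
{
    Middle middle;
    Derived derived;

    assert( nameOf( middle ) == "Middle" );
    assert( nameOf( derived ) == "Derived" );
}
```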

mutable Class Members

We discussed const members earlier and like many things there are exceptions to the rule — in this case the mutable keyword is the key to making exceptions to const-ness.

There are a number of possible uses of the mutable keyword, but most of them are obscure and rarely used. The most common use of mutable is to specifically declare that a class member variable may be changed by a const method.

This is most often used in cases such as delayed initialization or similar. Take for example:

class Foo
{
public:
    Foo() : m_object( 0 ) // initialize to null
    {

    }
    ~Foo()
    {
        delete m_object;
    }

    const Object *object() const
    {
        if( !m_object )
            m_object = new Object;

        return m_object;
    }

private:
    mutable Object *m_object;
};
	

In the example above we have an accessor which convention says should be const. However in this case we're assuming that we never want to return an uninitialized value, so because the member variable has been marked as being mutable we're able to modify it even in a const method.

If m_object were not marked as mutable then the compiler would give an error indicating that we are not allowed to change the value of a class member variable in a const method.
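
A closely related case is caching the result of an expensive calculation. The following is a minimal sketch under that assumption; the Text class and its members are hypothetical, and the m_computations counter exists only so that the example can show that the work is done once.

```cpp
#include <cassert>
#include <string>

// Hypothetical class that caches its word count on first use.
class Text
{
public:
    explicit Text( const std::string &text )
        : m_text( text ), m_cached( false ), m_words( 0 ), m_computations( 0 )
    {
    }

    int wordCount() const
    {
        if( !m_cached ) {
            m_words = countWords(); // allowed: m_words is mutable
            m_cached = true;        // allowed: m_cached is mutable
        }

        return m_words;
    }

    int computations() const { return m_computations; }

private:
    // Counts runs of non-space characters.
    int countWords() const
    {
        ++m_computations;

        int count = 0;
        bool inWord = false;

        for( std::string::size_type i = 0; i < m_text.size(); ++i ) {
            if( m_text[i] == ' ' ) {
                inWord = false;
            }
            else if( !inWord ) {
                inWord = true;
                ++count;
            }
        }

        return count;
    }

    std::string m_text;
    mutable bool m_cached;
    mutable int m_words;
    mutable int m_computations; // for illustration only
};

void checkCaching()
{
    const Text text( "This is a test." );

    assert( text.wordCount() == 4 );
    assert( text.wordCount() == 4 );    // served from the cache
    assert( text.computations() == 1 ); // counted only once
}
```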

Note: Like exceptions to most rules, the mutable keyword exists for a reason, but should not be overused. If you find that you have marked a significant number of the member variables in your class as mutable you should probably consider whether or not the design really makes sense.

Copyright © 2004 - 2005, Scott Wheeler

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.