C++ Language Guide [II]

Polymorphism

Before getting into this section, it is recommended that you have a proper understanding of pointers and class inheritance. If any of the following statements seem strange to you, you should review the indicated sections:

Statement: Explained in:
int a::b(int c) { } Classes
a->b Data Structures
class a: public b { }; Friendship and inheritance

 

Pointers to base class

One of the key features of derived classes is that a pointer to a derived class is type-compatible with a pointer to its base class. Polymorphism is the art of taking advantage of this simple but powerful and versatile feature, that brings Object Oriented Methodologies to its full potential.

We are going to start by rewriting our program about the rectangle and the triangle of the previous section taking into consideration this pointer compatibility property:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
// pointers to base class #include <iostream> using namespace std; class CPolygon { protected: int width, height; public: void set_values (int a, int b) { width=a; height=b; } }; class CRectangle: public CPolygon { public: int area () { return (width * height); } }; class CTriangle: public CPolygon { public: int area () { return (width * height / 2); } }; int main () { CRectangle rect; CTriangle trgl; CPolygon * ppoly1 = &rect; CPolygon * ppoly2 = &trgl; ppoly1->set_values (4,5); ppoly2->set_values (4,5); cout << rect.area() << endl; cout << trgl.area() << endl; return 0; }
20 10

In function main, we create two pointers that point to objects of class CPolygon (ppoly1 and ppoly2). Then we assign references torect and trgl to these pointers, and because both are objects of classes derived from CPolygon, both are valid assignment operations.

The only limitation in using *ppoly1 and *ppoly2 instead of rect and trgl is that both *ppoly1 and *ppoly2 are of type CPolygon*and therefore we can only use these pointers to refer to the members that CRectangle and CTriangle inherit from CPolygon. For that reason when we call the area() members at the end of the program we have had to use directly the objects rect and trglinstead of the pointers *ppoly1 and *ppoly2.

In order to use area() with the pointers to class CPolygon, this member should also have been declared in the class CPolygon, and not only in its derived classes, but the problem is that CRectangle and CTriangle implement different versions of area, therefore we cannot implement it in the base class. This is when virtual members become handy:

Virtual members

A member of a class that can be redefined in its derived classes is known as a virtual member. In order to declare a member of a class as virtual, we must precede its declaration with the keyword virtual:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
// virtual members #include <iostream> using namespace std; class CPolygon { protected: int width, height; public: void set_values (int a, int b) { width=a; height=b; } virtual int area () { return (0); } }; class CRectangle: public CPolygon { public: int area () { return (width * height); } }; class CTriangle: public CPolygon { public: int area () { return (width * height / 2); } }; int main () { CRectangle rect; CTriangle trgl; CPolygon poly; CPolygon * ppoly1 = &rect; CPolygon * ppoly2 = &trgl; CPolygon * ppoly3 = &poly; ppoly1->set_values (4,5); ppoly2->set_values (4,5); ppoly3->set_values (4,5); cout << ppoly1->area() << endl; cout << ppoly2->area() << endl; cout << ppoly3->area() << endl; return 0; }
20 10 0

Now the three classes (CPolygonCRectangle and CTriangle) have all the same members: widthheightset_values() and area().

The member function area() has been declared as virtual in the base class because it is later redefined in each derived class. You can verify if you want that if you remove this virtual keyword from the declaration of area() within CPolygon, and then you run the program the result will be 0 for the three polygons instead of 2010 and 0. That is because instead of calling the correspondingarea() function for each object (CRectangle::area()CTriangle::area() and CPolygon::area(), respectively), CPolygon::area()will be called in all cases since the calls are via a pointer whose type is CPolygon*.

Therefore, what the virtual keyword does is to allow a member of a derived class with the same name as one in the base class to be appropriately called from a pointer, and more precisely when the type of the pointer is a pointer to the base class but is pointing to an object of the derived class, as in the above example.

A class that declares or inherits a virtual function is called a polymorphic class.

Note that despite of its virtuality, we have also been able to declare an object of type CPolygon and to call its own area() function, which always returns 0.

Abstract base classes

Abstract base classes are something very similar to our CPolygon class of our previous example. The only difference is that in our previous example we have defined a valid area() function with a minimal functionality for objects that were of class CPolygon (like the object poly), whereas in an abstract base classes we could leave that area() member function without implementation at all. This is done by appending =0 (equal to zero) to the function declaration.

An abstract base CPolygon class could look like this:

1
2
3
4
5
6
7
8
9
// abstract class CPolygon class CPolygon { protected: int width, height; public: void set_values (int a, int b) { width=a; height=b; } virtual int area () =0; };

Notice how we appended =0 to virtual int area () instead of specifying an implementation for the function. This type of function is called a pure virtual function, and all classes that contain at least one pure virtual function are abstract base classes.

The main difference between an abstract base class and a regular polymorphic class is that because in abstract base classes at least one of its members lacks implementation we cannot create instances (objects) of it.

But a class that cannot instantiate objects is not totally useless. We can create pointers to it and take advantage of all its polymorphic abilities. Therefore a declaration like:

 
CPolygon poly;

would not be valid for the abstract base class we have just declared, because tries to instantiate an object. Nevertheless, the following pointers:

1
2
CPolygon * ppoly1; CPolygon * ppoly2;

would be perfectly valid.

This is so for as long as CPolygon includes a pure virtual function and therefore it’s an abstract base class. However, pointers to this abstract base class can be used to point to objects of derived classes.

Here you have the complete example:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
// abstract base class #include <iostream> using namespace std; class CPolygon { protected: int width, height; public: void set_values (int a, int b) { width=a; height=b; } virtual int area (void) =0; }; class CRectangle: public CPolygon { public: int area (void) { return (width * height); } }; class CTriangle: public CPolygon { public: int area (void) { return (width * height / 2); } }; int main () { CRectangle rect; CTriangle trgl; CPolygon * ppoly1 = &rect; CPolygon * ppoly2 = &trgl; ppoly1->set_values (4,5); ppoly2->set_values (4,5); cout << ppoly1->area() << endl; cout << ppoly2->area() << endl; return 0; }
20 10

If you review the program you will notice that we refer to objects of different but related classes using a unique type of pointer (CPolygon*). This can be tremendously useful. For example, now we can create a function member of the abstract base classCPolygon that is able to print on screen the result of the area() function even though CPolygon itself has no implementation for this function:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
// pure virtual members can be called // from the abstract base class #include <iostream> using namespace std; class CPolygon { protected: int width, height; public: void set_values (int a, int b) { width=a; height=b; } virtual int area (void) =0; void printarea (void) { cout << this->area() << endl; } }; class CRectangle: public CPolygon { public: int area (void) { return (width * height); } }; class CTriangle: public CPolygon { public: int area (void) { return (width * height / 2); } }; int main () { CRectangle rect; CTriangle trgl; CPolygon * ppoly1 = &rect; CPolygon * ppoly2 = &trgl; ppoly1->set_values (4,5); ppoly2->set_values (4,5); ppoly1->printarea(); ppoly2->printarea(); return 0; }
20 10

Virtual members and abstract classes grant C++ the polymorphic characteristics that make object-oriented programming such a useful instrument in big projects. Of course, we have seen very simple uses of these features, but these features can be applied to arrays of objects or dynamically allocated objects.

Let’s end with the same example again, but this time with objects that are dynamically allocated:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
// dynamic allocation and polymorphism #include <iostream> using namespace std; class CPolygon { protected: int width, height; public: void set_values (int a, int b) { width=a; height=b; } virtual int area (void) =0; void printarea (void) { cout << this->area() << endl; } }; class CRectangle: public CPolygon { public: int area (void) { return (width * height); } }; class CTriangle: public CPolygon { public: int area (void) { return (width * height / 2); } }; int main () { CPolygon * ppoly1 = new CRectangle; CPolygon * ppoly2 = new CTriangle; ppoly1->set_values (4,5); ppoly2->set_values (4,5); ppoly1->printarea(); ppoly2->printarea(); delete ppoly1; delete ppoly2; return 0; }
20 10

Notice that the ppoly pointers:

1
2
CPolygon * ppoly1 = new CRectangle; CPolygon * ppoly2 = new CTriangle;

are declared being of type pointer to CPolygon but the objects dynamically allocated have been declared having the derived class type directly.

Templates

Function templates

Function templates are special functions that can operate with generic types. This allows us to create a function template whose functionality can be adapted to more than one type or class without repeating the entire code for each type.

In C++ this can be achieved using template parameters. A template parameter is a special kind of parameter that can be used to pass a type as argument: just like regular function parameters can be used to pass values to a function, template parameters allow to pass also types to a function. These function templates can use these parameters as if they were any other regular type.

The format for declaring function templates with type parameters is:

template <class identifier> function_declaration;
template <typename identifier> function_declaration;

The only difference between both prototypes is the use of either the keyword class or the keyword typename. Its use is indistinct, since both expressions have exactly the same meaning and behave exactly the same way.

For example, to create a template function that returns the greater one of two objects we could use:

1
2
3
4
template <class myType> myType GetMax (myType a, myType b) { return (a>b?a:b); }

Here we have created a template function with myType as its template parameter. This template parameter represents a type that has not yet been specified, but that can be used in the template function as if it were a regular type. As you can see, the function template GetMax returns the greater of two parameters of this still-undefined type.

To use this function template we use the following format for the function call:

function_name <type> (parameters);

For example, to call GetMax to compare two integer values of type int we can write:

1
2
int x,y; GetMax <int> (x,y);

When the compiler encounters this call to a template function, it uses the template to automatically generate a function replacing each appearance of myType by the type passed as the actual template parameter (int in this case) and then calls it. This process is automatically performed by the compiler and is invisible to the programmer.

Here is the entire example:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
// function template #include <iostream> using namespace std; template <class T> T GetMax (T a, T b) { T result; result = (a>b)? a : b; return (result); } int main () { int i=5, j=6, k; long l=10, m=5, n; k=GetMax<int>(i,j); n=GetMax<long>(l,m); cout << k << endl; cout << n << endl; return 0; }
6 10

In this case, we have used T as the template parameter name instead of myType because it is shorter and in fact is a very common template parameter name. But you can use any identifier you like.

In the example above we used the function template GetMax() twice. The first time with arguments of type int and the second one with arguments of type long. The compiler has instantiated and then called each time the appropriate version of the function.

As you can see, the type T is used within the GetMax() template function even to declare new objects of that type:

 
T result;

Therefore, result will be an object of the same type as the parameters a and b when the function template is instantiated with a specific type.

In this specific case where the generic type T is used as a parameter for GetMax the compiler can find out automatically which data type has to instantiate without having to explicitly specify it within angle brackets (like we have done before specifying <int> and<long>). So we could have written instead:

1
2
int i,j; GetMax (i,j);

Since both i and j are of type int, and the compiler can automatically find out that the template parameter can only be int. This implicit method produces exactly the same result:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
// function template II #include <iostream> using namespace std; template <class T> T GetMax (T a, T b) { return (a>b?a:b); } int main () { int i=5, j=6, k; long l=10, m=5, n; k=GetMax(i,j); n=GetMax(l,m); cout << k << endl; cout << n << endl; return 0; }
6 10

Notice how in this case, we called our function template GetMax() without explicitly specifying the type between angle-brackets <>. The compiler automatically determines what type is needed on each call.

Because our template function includes only one template parameter (class T) and the function template itself accepts two parameters, both of this T type, we cannot call our function template with two objects of different types as arguments:

1
2
3
int i; long l; k = GetMax (i,l); 

This would not be correct, since our GetMax function template expects two arguments of the same type, and in this call to it we use objects of two different types.

We can also define function templates that accept more than one type parameter, simply by specifying more template parameters between the angle brackets. For example:

1
2
3
4
template <class T, class U> T GetMin (T a, U b) { return (a<b?a:b); }

In this case, our function template GetMin() accepts two parameters of different types and returns an object of the same type as the first parameter (T) that is passed. For example, after that declaration we could call GetMin() with:

1
2
3
int i,j; long l; i = GetMin<int,long> (j,l);

or simply:

 
i = GetMin (j,l);

even though j and l have different types, since the compiler can determine the appropriate instantiation anyway.

Class templates

We also have the possibility to write class templates, so that a class can have members that use template parameters as types. For example:

1
2
3
4
5
6
7
8
9
template <class T> class mypair { T values [2]; public: mypair (T first, T second) { values[0]=first; values[1]=second; } };

The class that we have just defined serves to store two elements of any valid type. For example, if we wanted to declare an object of this class to store two integer values of type int with the values 115 and 36 we would write:

 
mypair<int> myobject (115, 36); 

this same class would also be used to create an object to store any other type:

 
mypair<double> myfloats (3.0, 2.18); 

The only member function in the previous class template has been defined inline within the class declaration itself. In case that we define a function member outside the declaration of the class template, we must always precede that definition with the template <...> prefix:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
// class templates #include <iostream> using namespace std; template <class T> class mypair { T a, b; public: mypair (T first, T second) {a=first; b=second;} T getmax (); }; template <class T> T mypair<T>::getmax () { T retval; retval = a>b? a : b; return retval; } int main () { mypair <int> myobject (100, 75); cout << myobject.getmax(); return 0; }
100

Notice the syntax of the definition of member function getmax:

1
2
template <class T> T mypair<T>::getmax () 

Confused by so many T‘s? There are three T‘s in this declaration: The first one is the template parameter. The second T refers to the type returned by the function. And the third T (the one between angle brackets) is also a requirement: It specifies that this function’s template parameter is also the class template parameter.

Template specialization

If we want to define a different implementation for a template when a specific type is passed as template parameter, we can declare a specialization of that template.

For example, let’s suppose that we have a very simple class called mycontainer that can store one element of any type and that it has just one member function called increase, which increases its value. But we find that when it stores an element of type char it would be more convenient to have a completely different implementation with a function member uppercase, so we decide to declare a class template specialization for that type:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
// template specialization #include <iostream> using namespace std; // class template: template <class T> class mycontainer { T element; public: mycontainer (T arg) {element=arg;} T increase () {return ++element;} }; // class template specialization: template <> class mycontainer <char> { char element; public: mycontainer (char arg) {element=arg;} char uppercase () { if ((element>='a')&&(element<='z')) element+='A'-'a'; return element; } }; int main () { mycontainer<int> myint (7); mycontainer<char> mychar ('j'); cout << myint.increase() << endl; cout << mychar.uppercase() << endl; return 0; }
8 J

This is the syntax used in the class template specialization:

 
template <> class mycontainer <char> { ... };

First of all, notice that we precede the class template name with an emptytemplate<> parameter list. This is to explicitly declare it as a template specialization.

But more important than this prefix, is the <char> specialization parameter after the class template name. This specialization parameter itself identifies the type for which we are going to declare a template class specialization (char). Notice the differences between the generic class template and the specialization:

1
2
template <class T> class mycontainer { ... }; template <> class mycontainer <char> { ... };

The first line is the generic template, and the second one is the specialization.

When we declare specializations for a template class, we must also define all its members, even those exactly equal to the generic template class, because there is no “inheritance” of members from the generic template to the specialization.

Non-type parameters for templates

Besides the template arguments that are preceded by the class or typename keywords , which represent types, templates can also have regular typed parameters, similar to those found in functions. As an example, have a look at this class template that is used to contain sequences of elements:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
// sequence template #include <iostream> using namespace std; template <class T, int N> class mysequence { T memblock [N]; public: void setmember (int x, T value); T getmember (int x); }; template <class T, int N> void mysequence<T,N>::setmember (int x, T value) { memblock[x]=value; } template <class T, int N> T mysequence<T,N>::getmember (int x) { return memblock[x]; } int main () { mysequence <int,5> myints; mysequence <double,5> myfloats; myints.setmember (0,100); myfloats.setmember (3,3.1416); cout << myints.getmember(0) << '\n'; cout << myfloats.getmember(3) << '\n'; return 0; }
100 3.1416

It is also possible to set default values or types for class template parameters. For example, if the previous class template definition had been:

 
template <class T=char, int N=10> class mysequence {..};

We could create objects using the default template parameters by declaring:

 
mysequence<> myseq;

Which would be equivalent to:

 
mysequence<char,10> myseq;

Templates and multiple-file projects

From the point of view of the compiler, templates are not normal functions or classes. They are compiled on demand, meaning that the code of a template function is not compiled until an instantiation with specific template arguments is required. At that moment, when an instantiation is required, the compiler generates a function specifically for those arguments from the template.

When projects grow it is usual to split the code of a program in different source code files. In these cases, the interface and implementation are generally separated. Taking a library of functions as example, the interface generally consists of declarations of the prototypes of all the functions that can be called. These are generally declared in a “header file” with a .h extension, and the implementation (the definition of these functions) is in an independent file with c++ code.

Because templates are compiled when required, this forces a restriction for multi-file projects: the implementation (definition) of a template class or function must be in the same file as its declaration. That means that we cannot separate the interface in a separate header file, and that we must include both interface and implementation in any file that uses the templates.

Since no code is generated until a template is instantiated when required, compilers are prepared to allow the inclusion more than once of the same template file with both declarations and definitions in a project without generating linkage errors.

Namespaces

Namespaces allow to group entities like classes, objects and functions under a name. This way the global scope can be divided in “sub-scopes”, each one with its own name.

The format of namespaces is:

namespace identifier
{
entities
}

Where identifier is any valid identifier and entities is the set of classes, objects and functions that are included within the namespace. For example:

1
2
3
4
namespace myNamespace { int a, b; }

In this case, the variables a and b are normal variables declared within a namespace called myNamespace. In order to access these variables from outside the myNamespace namespace we have to use the scope operator ::. For example, to access the previous variables from outside myNamespace we can write:

1
2
myNamespace::a myNamespace::b 

The functionality of namespaces is especially useful in the case that there is a possibility that a global object or function uses the same identifier as another one, causing redefinition errors. For example:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
// namespaces #include <iostream> using namespace std; namespace first { int var = 5; } namespace second { double var = 3.1416; } int main () { cout << first::var << endl; cout << second::var << endl; return 0; }
5 3.1416

In this case, there are two global variables with the same name: var. One is defined within the namespace first and the other one in second. No redefinition errors happen thanks to namespaces.

using

The keyword using is used to introduce a name from a namespace into the current declarative region. For example:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
// using #include <iostream> using namespace std; namespace first { int x = 5; int y = 10; } namespace second { double x = 3.1416; double y = 2.7183; } int main () { using first::x; using second::y; cout << x << endl; cout << y << endl; cout << first::y << endl; cout << second::x << endl; return 0; }
5 2.7183 10 3.1416

Notice how in this code, x (without any name qualifier) refers to first::x whereas y refers to second::y, exactly as our usingdeclarations have specified. We still have access to first::y and second::x using their fully qualified names.

The keyword using can also be used as a directive to introduce an entire namespace:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
// using #include <iostream> using namespace std; namespace first { int x = 5; int y = 10; } namespace second { double x = 3.1416; double y = 2.7183; } int main () { using namespace first; cout << x << endl; cout << y << endl; cout << second::x << endl; cout << second::y << endl; return 0; }
5 10 3.1416 2.7183

In this case, since we have declared that we were using namespace first, all direct uses of x and y without name qualifiers were referring to their declarations in namespace first.

using and using namespace have validity only in the same block in which they are stated or in the entire code if they are used directly in the global scope. For example, if we had the intention to first use the objects of one namespace and then those of another one, we could do something like:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
// using namespace example #include <iostream> using namespace std; namespace first { int x = 5; } namespace second { double x = 3.1416; } int main () { { using namespace first; cout << x << endl; } { using namespace second; cout << x << endl; } return 0; }
5 3.1416

Namespace alias

We can declare alternate names for existing namespaces according to the following format:

namespace new_name = current_name;

Namespace std

All the files in the C++ standard library declare all of its entities within the std namespace. That is why we have generally included the using namespace std; statement in all programs that used any entity defined in iostream.

Exceptions

Exceptions provide a way to react to exceptional circumstances (like runtime errors) in our program by transferring control to special functions called handlers.

To catch exceptions we must place a portion of code under exception inspection. This is done by enclosing that portion of code in atry block. When an exceptional circumstance arises within that block, an exception is thrown that transfers the control to the exception handler. If no exception is thrown, the code continues normally and all handlers are ignored.

An exception is thrown by using the throw keyword from inside the try block. Exception handlers are declared with the keywordcatch, which must be placed immediately after the try block:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
// exceptions #include <iostream> using namespace std; int main () { try { throw 20; } catch (int e) { cout << "An exception occurred. Exception Nr. " << e << endl; } return 0; }
An exception occurred. Exception Nr. 20

The code under exception handling is enclosed in a try block. In this example this code simply throws an exception:

 
throw 20;

A throw expression accepts one parameter (in this case the integer value 20), which is passed as an argument to the exception handler.

The exception handler is declared with the catch keyword. As you can see, it follows immediately the closing brace of the try block. The catch format is similar to a regular function that always has at least one parameter. The type of this parameter is very important, since the type of the argument passed by the throw expression is checked against it, and only in the case they match, the exception is caught.

We can chain multiple handlers (catch expressions), each one with a different parameter type. Only the handler that matches its type with the argument specified in the throw statement is executed.

If we use an ellipsis (...) as the parameter of catch, that handler will catch any exception no matter what the type of the throwexception is. This can be used as a default handler that catches all exceptions not caught by other handlers if it is specified at last:

1
2
3
4
5
6
try { // code here } catch (int param) { cout << "int exception"; } catch (char param) { cout << "char exception"; } catch (...) { cout << "default exception"; }

In this case the last handler would catch any exception thrown with any parameter that is neither an int nor a char.

After an exception has been handled the program execution resumes after the try-catch block, not after the throw statement!.

It is also possible to nest try-catch blocks within more external try blocks. In these cases, we have the possibility that an internalcatch block forwards the exception to its external level. This is done with the expression throw; with no arguments. For example:

1
2
3
4
5
6
7
8
9
10
11
try { try { // code here } catch (int n) { throw; } } catch (...) { cout << "Exception occurred"; }

Exception specifications

When declaring a function we can limit the exception type it might directly or indirectly throw by appending a throw suffix to the function declaration:

 
float myfunction (char param) throw (int);

This declares a function called myfunction which takes one agument of type char and returns an element of type float. The only exception that this function might throw is an exception of type int. If it throws an exception with a different type, either directly or indirectly, it cannot be caught by a regular int-type handler.

If this throw specifier is left empty with no type, this means the function is not allowed to throw exceptions. Functions with nothrow specifier (regular functions) are allowed to throw exceptions with any type:

1
2
int myfunction (int param) throw(); // no exceptions allowed int myfunction (int param); // all exceptions allowed 

Standard exceptions

The C++ Standard library provides a base class specifically designed to declare objects to be thrown as exceptions. It is calledexception and is defined in the <exception> header file under the namespace std. This class has the usual default and copy constructors, operators and destructors, plus an additional virtual member function called what that returns a null-terminated character sequence (char *) and that can be overwritten in derived classes to contain some sort of description of the exception.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
// standard exceptions #include <iostream> #include <exception> using namespace std; class myexception: public exception { virtual const char* what() const throw() { return "My exception happened"; } } myex; int main () { try { throw myex; } catch (exception& e) { cout << e.what() << endl; } return 0; }
My exception happened.

We have placed a handler that catches exception objects by reference (notice the ampersand & after the type), therefore this catches also classes derived from exception, like our myex object of class myexception.

All exceptions thrown by components of the C++ Standard library throw exceptions derived from this std::exception class. These are:

exception description
bad_alloc thrown by new on allocation failure
bad_cast thrown by dynamic_cast when fails with a referenced type
bad_exception thrown when an exception type doesn’t match any catch
bad_typeid thrown by typeid
ios_base::failure thrown by functions in the iostream library

For example, if we use the operator new and the memory cannot be allocated, an exception of type bad_alloc is thrown:

1
2
3
4
5
6
7
8
try { int * myarray= new int[1000]; } catch (bad_alloc&) { cout << "Error allocating memory." << endl; }

It is recommended to include all dynamic memory allocations within a try block that catches this type of exception to perform a clean action instead of an abnormal program termination, which is what happens when this type of exception is thrown and not caught. If you want to force a bad_alloc exception to see it in action, you can try to allocate a huge array; On my system, trying to allocate 1 billion ints threw a bad_alloc exception.

Because bad_alloc is derived from the standard base class exception, we can handle that same exception by catching references to the exception class:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
// bad_alloc standard exception #include <iostream> #include <exception> using namespace std; int main () { try { int* myarray= new int[1000]; } catch (exception& e) { cout << "Standard exception: " << e.what() << endl; } return 0; }

Type Casting

Converting an expression of a given type into another type is known as type-casting. We have already seen some ways to type cast:

Implicit conversion

Implicit conversions do not require any operator. They are automatically performed when a value is copied to a compatible type. For example:

1
2
3
short a=2000; int b; b=a;

Here, the value of a has been promoted from short to int and we have not had to specify any type-casting operator. This is known as a standard conversion. Standard conversions affect fundamental data types, and allow conversions such as the conversions between numerical types (short to intint to floatdouble to int…), to or from bool, and some pointer conversions. Some of these conversions may imply a loss of precision, which the compiler can signal with a warning. This can be avoided with an explicit conversion.

Implicit conversions also include constructor or operator conversions, which affect classes that include specific constructors or operator functions to perform conversions. For example:

1
2
3
4
5
class A {}; class B { public: B (A a) {} }; A a; B b=a;

Here, a implicit conversion happened between objects of class A and class B, because B has a constructor that takes an object of class A as parameter. Therefore implicit conversions from A to B are allowed.

Explicit conversion

C++ is a strong-typed language. Many conversions, specially those that imply a different interpretation of the value, require an explicit conversion. We have already seen two notations for explicit type conversion: functional and c-like casting:

1
2
3
4
short a=2000; int b; b = (int) a; // c-like cast notation b = int (a); // functional notation 

The functionality of these explicit conversion operators is enough for most needs with fundamental data types. However, these operators can be applied indiscriminately on classes and pointers to classes, which can lead to code that while being syntactically correct can cause runtime errors. For example, the following code is syntactically correct:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
// class type-casting #include <iostream> using namespace std; class CDummy { float i,j; }; class CAddition { int x,y; public: CAddition (int a, int b) { x=a; y=b; } int result() { return x+y;} }; int main () { CDummy d; CAddition * padd; padd = (CAddition*) &d; cout << padd->result(); return 0; }

The program declares a pointer to CAddition, but then it assigns to it a reference to an object of another incompatible type using explicit type-casting:

 
padd = (CAddition*) &d;

Traditional explicit type-casting allows to convert any pointer into any other pointer type, independently of the types they point to. The subsequent call to member result will produce either a run-time error or a unexpected result.

In order to control these types of conversions between classes, we have four specific casting operators: dynamic_cast,reinterpret_caststatic_cast and const_cast. Their format is to follow the new type enclosed between angle-brackets (<>) and immediately after, the expression to be converted between parentheses.

dynamic_cast <new_type> (expression)
reinterpret_cast <new_type> (expression)
static_cast <new_type> (expression)
const_cast <new_type> (expression)

The traditional type-casting equivalents to these expressions would be:

(new_type) expression
new_type (expression)

but each one with its own special characteristics:

dynamic_cast

dynamic_cast can be used only with pointers and references to objects. Its purpose is to ensure that the result of the type conversion is a valid complete object of the requested class.

Therefore, dynamic_cast is always successful when we cast a class to one of its base classes:

1
2
3
4
5
6
7
8
class CBase { }; class CDerived: public CBase { }; CBase b; CBase* pb; CDerived d; CDerived* pd; pb = dynamic_cast<CBase*>(&d); // ok: derived-to-base pd = dynamic_cast<CDerived*>(&b); // wrong: base-to-derived 

The second conversion in this piece of code would produce a compilation error since base-to-derived conversions are not allowed with dynamic_cast unless the base class is polymorphic.

When a class is polymorphic, dynamic_cast performs a special checking during runtime to ensure that the expression yields a valid complete object of the requested class:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
// dynamic_cast #include <iostream> #include <exception> using namespace std; class CBase { virtual void dummy() {} }; class CDerived: public CBase { int a; }; int main () { try { CBase * pba = new CDerived; CBase * pbb = new CBase; CDerived * pd; pd = dynamic_cast<CDerived*>(pba); if (pd==0) cout << "Null pointer on first type-cast" << endl; pd = dynamic_cast<CDerived*>(pbb); if (pd==0) cout << "Null pointer on second type-cast" << endl; } catch (exception& e) {cout << "Exception: " << e.what();} return 0; }
Null pointer on second type-cast
Compatibility note: dynamic_cast requires the Run-Time Type Information (RTTI) to keep track of dynamic types. Some compilers support this feature as an option which is disabled by default. This must be enabled for runtime type checking using dynamic_castto work properly.

The code tries to perform two dynamic casts from pointer objects of type CBase* (pba and pbb) to a pointer object of typeCDerived*, but only the first one is successful. Notice their respective initializations:

1
2
CBase * pba = new CDerived; CBase * pbb = new CBase;

Even though both are pointers of type CBase*pba points to an object of type CDerived, while pbb points to an object of type CBase. Thus, when their respective type-castings are performed using dynamic_castpba is pointing to a full object of class CDerived, whereas pbb is pointing to an object of class CBase, which is an incomplete object of class CDerived.

When dynamic_cast cannot cast a pointer because it is not a complete object of the required class -as in the second conversion in the previous example- it returns a null pointer to indicate the failure. If dynamic_cast is used to convert to a reference type and the conversion is not possible, an exception of type bad_cast is thrown instead.

dynamic_cast can also cast null pointers even between pointers to unrelated classes, and can also cast pointers of any type to void pointers (void*).

static_cast

static_cast can perform conversions between pointers to related classes, not only from the derived class to its base, but also from a base class to its derived. This ensures that at least the classes are compatible if the proper object is converted, but no safety check is performed during runtime to check if the object being converted is in fact a full object of the destination type. Therefore, it is up to the programmer to ensure that the conversion is safe. On the other side, the overhead of the type-safety checks of dynamic_cast is avoided.

1
2
3
4
class CBase {}; class CDerived: public CBase {}; CBase * a = new CBase; CDerived * b = static_cast<CDerived*>(a);

This would be valid, although b would point to an incomplete object of the class and could lead to runtime errors if dereferenced.

static_cast can also be used to perform any other non-pointer conversion that could also be performed implicitly, like for example standard conversion between fundamental types:

1
2
double d=3.14159265; int i = static_cast<int>(d); 

Or any conversion between classes with explicit constructors or operator functions as described in “implicit conversions” above.

reinterpret_cast

reinterpret_cast converts any pointer type to any other pointer type, even of unrelated classes. The operation result is a simple binary copy of the value from one pointer to the other. All pointer conversions are allowed: neither the content pointed nor the pointer type itself is checked.

It can also cast pointers to or from integer types. The format in which this integer value represents a pointer is platform-specific. The only guarantee is that a pointer cast to an integer type large enough to fully contain it, is granted to be able to be cast back to a valid pointer.

The conversions that can be performed by reinterpret_cast but not by static_cast have no specific uses in C++ are low-level operations, whose interpretation results in code which is generally system-specific, and thus non-portable. For example:

1
2
3
4
class A {}; class B {}; A * a = new A; B * b = reinterpret_cast<B*>(a);

This is valid C++ code, although it does not make much sense, since now we have a pointer that points to an object of an incompatible class, and thus dereferencing it is unsafe.

const_cast

This type of casting manipulates the constness of an object, either to be set or to be removed. For example, in order to pass a const argument to a function that expects a non-constant parameter:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
// const_cast #include <iostream> using namespace std; void print (char * str) { cout << str << endl; } int main () { const char * c = "sample text"; print ( const_cast<char *> (c) ); return 0; }
sample text

typeid

typeid allows to check the type of an expression:

typeid (expression)

This operator returns a reference to a constant object of type type_info that is defined in the standard header file <typeinfo>. This returned value can be compared with another one using operators == and != or can serve to obtain a null-terminated character sequence representing the data type or class name by using its name() member.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
// typeid #include <iostream> #include <typeinfo> using namespace std; int main () { int * a,b; a=0; b=0; if (typeid(a) != typeid(b)) { cout << "a and b are of different types:\n"; cout << "a is: " << typeid(a).name() << '\n'; cout << "b is: " << typeid(b).name() << '\n'; } return 0; }
a and b are of different types: a is: int * b is: int

When typeid is applied to classes typeid uses the RTTI to keep track of the type of dynamic objects. When typeid is applied to an expression whose type is a polymorphic class, the result is the type of the most derived complete object:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
// typeid, polymorphic class #include <iostream> #include <typeinfo> #include <exception> using namespace std; class CBase { virtual void f(){} }; class CDerived : public CBase {}; int main () { try { CBase* a = new CBase; CBase* b = new CDerived; cout << "a is: " << typeid(a).name() << '\n'; cout << "b is: " << typeid(b).name() << '\n'; cout << "*a is: " << typeid(*a).name() << '\n'; cout << "*b is: " << typeid(*b).name() << '\n'; } catch (exception& e) { cout << "Exception: " << e.what() << endl; } return 0; }
a is: class CBase * b is: class CBase * *a is: class CBase *b is: class CDerived

Notice how the type that typeid considers for pointers is the pointer type itself (both a and b are of type class CBase *). However, when typeid is applied to objects (like *a and *btypeid yields their dynamic type (i.e. the type of their most derived complete object).

If the type typeid evaluates is a pointer preceded by the dereference operator (*), and this pointer has a null value, typeid throws a bad_typeid exception.

Preprocessor directives

Preprocessor directives are lines included in the code of our programs that are not program statements but directives for the preprocessor. These lines are always preceded by a hash sign (#). The preprocessor is executed before the actual compilation of code begins, therefore the preprocessor digests all these directives before any code is generated by the statements.

These preprocessor directives extend only across a single line of code. As soon as a newline character is found, the preprocessor directive is considered to end. No semicolon (;) is expected at the end of a preprocessor directive. The only way a preprocessor directive can extend through more than one line is by preceding the newline character at the end of the line by a backslash (\).

macro definitions (#define, #undef)

To define preprocessor macros we can use #define. Its format is:

#define identifier replacement

When the preprocessor encounters this directive, it replaces any occurrence of identifier in the rest of the code by replacement. This replacement can be an expression, a statement, a block or simply anything. The preprocessor does not understand C++, it simply replaces any occurrence of identifierby replacement.

1
2
3
#define TABLE_SIZE 100 int table1[TABLE_SIZE]; int table2[TABLE_SIZE]; 

After the preprocessor has replaced TABLE_SIZE, the code becomes equivalent to:

1
2
int table1[100]; int table2[100]; 

This use of #define as constant definer is already known by us from previous tutorials, but #define can work also with parameters to define function macros:

 
#define getmax(a,b) a>b?a:b 

This would replace any occurrence of getmax followed by two arguments by the replacement expression, but also replacing each argument by its identifier, exactly as you would expect if it was a function:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
// function macro #include <iostream> using namespace std; #define getmax(a,b) ((a)>(b)?(a):(b)) int main() { int x=5, y; y= getmax(x,2); cout << y << endl; cout << getmax(7,x) << endl; return 0; }
5 7

Defined macros are not affected by block structure. A macro lasts until it is undefined with the #undef preprocessor directive:

1
2
3
4
5
#define TABLE_SIZE 100 int table1[TABLE_SIZE]; #undef TABLE_SIZE #define TABLE_SIZE 200 int table2[TABLE_SIZE];

This would generate the same code as:

1
2
int table1[100]; int table2[200];

Function macro definitions accept two special operators (# and ##) in the replacement sequence:
If the operator # is used before a parameter is used in the replacement sequence, that parameter is replaced by a string literal (as if it were enclosed between double quotes)

1
2
#define str(x) #x cout << str(test);

This would be translated into:

 
cout << "test";

The operator ## concatenates two arguments leaving no blank spaces between them:

1
2
#define glue(a,b) a ## b glue(c,out) << "test";

This would also be translated into:

 
cout << "test";

Because preprocessor replacements happen before any C++ syntax check, macro definitions can be a tricky feature, but be careful: code that relies heavily on complicated macros may seem obscure to other programmers, since the syntax they expect is on many occasions different from the regular expressions programmers expect in C++.

Conditional inclusions (#ifdef, #ifndef, #if, #endif, #else and #elif)

These directives allow to include or discard part of the code of a program if a certain condition is met.

#ifdef allows a section of a program to be compiled only if the macro that is specified as the parameter has been defined, no matter which its value is. For example:

1
2
3
#ifdef TABLE_SIZE int table[TABLE_SIZE]; #endif 

In this case, the line of code int table[TABLE_SIZE]; is only compiled if TABLE_SIZE was previously defined with #define, independently of its value. If it was not defined, that line will not be included in the program compilation.

#ifndef serves for the exact opposite: the code between #ifndef and #endif directives is only compiled if the specified identifier has not been previously defined. For example:

1
2
3
4
#ifndef TABLE_SIZE #define TABLE_SIZE 100 #endif int table[TABLE_SIZE];

In this case, if when arriving at this piece of code, the TABLE_SIZE macro has not been defined yet, it would be defined to a value of 100. If it already existed it would keep its previous value since the #define directive would not be executed.

The #if#else and #elif (i.e., “else if”) directives serve to specify some condition to be met in order for the portion of code they surround to be compiled. The condition that follows #if or #elif can only evaluate constant expressions, including macro expressions. For example:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
#if TABLE_SIZE>200 #undef TABLE_SIZE #define TABLE_SIZE 200 #elif TABLE_SIZE<50 #undef TABLE_SIZE #define TABLE_SIZE 50 #else #undef TABLE_SIZE #define TABLE_SIZE 100 #endif int table[TABLE_SIZE]; 

Notice how the whole structure of #if#elif and #else chained directives ends with #endif.

The behavior of #ifdef and #ifndef can also be achieved by using the special operators defined and !defined respectively in any #if or #elif directive:

1
2
3
4
5
#if !defined TABLE_SIZE #define TABLE_SIZE 100 #elif defined ARRAY_SIZE #define TABLE_SIZE ARRAY_SIZE int table[TABLE_SIZE];

Line control (#line)

When we compile a program and some error happens during the compiling process, the compiler shows an error message with references to the name of the file where the error happened and a line number, so it is easier to find the code generating the error.

The #line directive allows us to control both things, the line numbers within the code files as well as the file name that we want that appears when an error takes place. Its format is:

#line number "filename"

Where number is the new line number that will be assigned to the next code line. The line numbers of successive lines will be increased one by one from this point on.

"filename" is an optional parameter that allows to redefine the file name that will be shown. For example:

1
2
#line 20 "assigning variable" int a?; 

This code will generate an error that will be shown as error in file "assigning variable", line 20.

Error directive (#error)

This directive aborts the compilation process when it is found, generating a compilation the error that can be specified as its parameter:

1
2
3
#ifndef __cplusplus #error A C++ compiler is required! #endif 

This example aborts the compilation process if the macro name __cplusplus is not defined (this macro name is defined by default in all C++ compilers).

Source file inclusion (#include)

This directive has also been used assiduously in other sections of this tutorial. When the preprocessor finds an #include directive it replaces it by the entire content of the specified file. There are two ways to specify a file to be included:

1
2
#include "file" #include <file> 

The only difference between both expressions is the places (directories) where the compiler is going to look for the file. In the first case where the file name is specified between double-quotes, the file is searched first in the same directory that includes the file containing the directive. In case that it is not there, the compiler searches the file in the default directories where it is configured to look for the standard header files.
If the file name is enclosed between angle-brackets <> the file is searched directly where the compiler is configured to look for the standard header files. Therefore, standard header files are usually included in angle-brackets, while other specific header files are included using quotes.

Pragma directive (#pragma)

This directive is used to specify diverse options to the compiler. These options are specific for the platform and the compiler you use. Consult the manual or the reference of your compiler for more information on the possible parameters that you can define with #pragma.

If the compiler does not support a specific argument for #pragma, it is ignored – no error is generated.

Predefined macro names

The following macro names are defined at any time:

macro value
__LINE__ Integer value representing the current line in the source code file being compiled.
__FILE__ A string literal containing the presumed name of the source file being compiled.
__DATE__ A string literal in the form “Mmm dd yyyy” containing the date in which the compilation process began.
__TIME__ A string literal in the form “hh:mm:ss” containing the time at which the compilation process began.
__cplusplus An integer value. All C++ compilers have this constant defined to some value. If the compiler is fully compliant with the C++ standard its value is equal or greater than 199711L depending on the version of the standard they comply.

For example:

1
2
3
4
5
6
7
8
9
10
11
12
13
// standard macro names #include <iostream> using namespace std; int main() { cout << "This is the line number " << __LINE__; cout << " of file " << __FILE__ << ".\n"; cout << "Its compilation began " << __DATE__; cout << " at " << __TIME__ << ".\n"; cout << "The compiler gives a __cplusplus value of " << __cplusplus; return 0; }
This is the line number 7 of file /home/jay/stdmacronames.cpp. Its compilation began Nov 1 2005 at 10:12:29. The compiler gives a __cplusplus value of 1

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s