C++ Templates
From a Types Perspective

Pablo Nogueira

Last updated: 7 June 2006

 

These notes provide an overview of C++ Templates from a "types" perspective. The aim is to introduce the features in relation to well-known concepts in the theory of typed programming languages. The focus is on breadth rather than depth of coverage. Further references can be found here.

Contents

Introduction

Most sophisticated strongly and statically type-checked languages usually have a "programming" language at the type level whose "execution" takes place at compile time. By this we not only mean that type checking or type reconstruction may require substantial computational effort, but that some form of compile-time "execution" of type-level terms is also taking place (e.g. application of type-level functions to type-level arguments). How involved this execution is depends on the complexity of the type-level language. This feature has been exploited in unexpected ways in many richly typed languages.

For instance, in the Simply (Explicitly) Typed Lambda Calculus, the type-level language is rather small (base types and arrow type operator). Type-level computation is just type checking: making sure "abstractions" (functions) and "applications" (function calls) are well-typed. In the Simply (Implicitly) Typed Lambda Calculus, the type checker has to do some more work: it needs to infer the type of formal parameter variables from their use in the abstraction's body.

When programmer-defined parameterised type operators are added to these languages in the form of data type definitions, type-level execution (reduction) takes place in the form of type application, where type operators are instantiated with type arguments. Another form of type application takes place in the various lambda calculi modelling polymorphism like the predicative, impredicative (or System F), and type:type polymorphic lambda calculi of which the predicative, ML variant with let-polymorphism is the most popular and widely known. In these languages, type abstraction and type application are expressions, but since they play no computational role (no computational decision is made in terms of types; this is the so-called parametricity property of parametric polymorphism) they can be erased by the compiler after type checking.

Modern programming languages are convenient for real programming because of their many features and derived forms. The type-level language of C is approximately a monomorphic language of products with restricted higher-order features (pointers to functions). The pure, non-strict functional programming Haskell has some sort of prolog-ish language (which includes the simply typed lambda calculus) at the type level due to its "type class" mechanism.

What is the type-level language of C++? This question seems to have been of little concern to C++ designers or implementors, who have tackled the design and compilation of the language from a practical perspective, offering extremely clever but arguably ad-hoc solutions to looming problems.

Writing a formal type system for C++ would have been quite laborious indeed but it would have shed light on the type-level language and would have improved the language's design, its compiler, and helped determine the real power of C++ which does not seem to be widely known. New possibilities keep emerging, a recently striking one being the Turing-compleness of templates — which is perhaps a partial answer to our question about the type-level language.

What are templates? In short, C++'s mechanism for static universal polymorphism in its dependent, unbounded parametric, and bounded parametric variants, albeit with some restrictions and peculiarities. It is not the "static" (e.g. compile-time) bit what makes templates different to other notions of polymorphism, but the mechanics of instantiation, which turns them into type-parameterised macros for what after instantiation are a set of overloaded functions and classes, with "late" overloading resolution performed at link time if necessary.

This generative, macro-expansion approach to their implementation still has the advantage of reducing the number of source-code definitions but it has the disadvantage of producing many concrete instantiations, a problem known as code bloating. However, templates are not implemented in this fashion out of theoretical necessity, but for reasons of instrumentation: other approaches may require significant changes to existing compilers; notice that proposed new features to the language are carefully examined by the C++ designers' committee with implementation, smooth integration and backwards compatibility in mind. The interaction between templates and other language features (e.g. inheritance, automatic conversion) can have surprising and powerful consequences.

Let's explain this in more detail.

Note on terminology

Template Fundamentals

Templates allow classes (types) and functions to be parameterised by classes (types), by integral values (integer, enumeration, pointers or references to objects with external linkage), and other parameterised types, called template-templates. More precisely, the template keyword instructs the compiler to treat the entity that follows as a parameterised entity. Entities can be classes and functions. The keyword is followed by a list of type, value, or template-template parameters enclosed in angle brackets.

Within the parameter list, the keyword typename followed by an identifier signals to the compiler that the identifier must be treated within the template's scope as a type identifier rather than a value identifier. For example, the template function

  template<typename T>
  T id(T x) { return x; }
  

defines an identity function parameterised by a type T and a value of type T (we leave references and temporaries aside for the moment). Here template plays a role similar to big lambda in Girard and Reynold's System F. The template definition is a sort of type abstraction.

The application id(3) actually means id<int>(3), where id<int> is a sort of type application. An expression of the form id<T>, where T does not stand for a concrete type but for a type variable, is a legal type application when it is embedded in a type abstraction:

  template<typename T>
  T another_id(T x) { return id<T>(x); }
  

Specialisation takes place when an actual value for T is given.

There are some important differences with System F in the range (bounded, unbounded) and kind (typenames, values, template-templates) of parameters allowed, and the mechanics of instantiation:

  1. The parametric polymorphism is static, that is, template instantiation takes place at compile time. This need not be the case in an implementation of System F.
  2. For template functions, there is a process of type reconstruction or type inference (called template argument deduction in C++ jargon) that takes place during instantiation, where the actual type argument is inferred from the types of the actual value arguments passed to the function. For example, the compiler infers that id(3) is precisely id<int>(3) because 3 has type int, and so it generates the definition
      int id(int x) { return x; }
      
  3. Most C++ compilers treat templates as macros and generate specific, overloaded code for each specialisation. This leads to code bloating. For example, given expressions id(3) and id(4.0), the compiler would generate and compile code for id<int> and id<double> (making id an overloaded function) even if the bodies only differ on the types of the arguments and return value, e.g.:
      int    id(int x)    { return x; }
      double id(double x) { return x; }
      

    Generated definitions are conceptually overloaded: they share the same name and structure but differ only in type annotations. However, the compiler can give each generated function a unique name that does not conflict with any other function name, for instance:

      int    id_INTEGER(int x) { ... }
      double id_DOUBLE(double x) { ... }
      

    These new names are also introduced in applications, so that id(3) is replaced by id_INTEGER(3) and id(4.0) is replaced by id_DOUBLE(4.0).

  4. Internal renaming is performed automatically for instantiated template classes as classes cannot be overloaded.
  5. Polymorphism can be unbounded parametric, like in System F, or bounded parametric, where the range of types is not "forall", but only those for which certain operations are defined. Template parameters can also be expressions of integral or pointer type whose value can be determined at compile-time; in this case a template introduces a restricted form of dependently typed function or class. Finally, template parameters can be parameterised classes, called template-template parameters in C++ jargon. Template classes thus correspond to parametrically polymorphic type operators that can take other type operators as parameters.

Templates are very powerful because ad-hoc polymorphism (overloading) resolved in terms of scopes and type annotations is very powerful, if perhaps not very elgant or optimal in terms of code size. One of the key insights of the C++ designers was to realise that many forms of parametric polymorphim can be instrumented in this fashion, in terms of type-parameterised macros that can be recursively defined and instantiated, and which are more reliable than mere textual macros as they are an integral part of the type system.

Template Functions

Template Classes

Just like functions, classes can be parameterised by typenames, values, and other parameterised classes.

Further Reading

  1. C++ Templates. The Complete Guide. David Vandervoorde and Nicolai Josuttis. Addison-Wesley, 2003. Pretty much the template bible.
  2. The Annotated C++ Reference Manual. Margaret Ellis and Bjarne Stroustrup. Addison-Wesley, 1990. The standard C++ reference, but short on detailed discussions about applications and compilation techniques of templates.
  3. Generative Programming. Methods, Tools and Applications. Krzysztof Czarnecki and Ulrich Eisenecker. Addison-Wesley, 2000. Part II overviews templates thoroughly and discusses some of their applications in generative programming and meta-programming.
  4. Types and Programming Languages. Benjamin Pierce. MIT Press, 2002. Excellent introduction to the field of type theory and type systems from a programming perspective, with special focus on the various popular typed lambda calculi.
  5. Foundations for Programming Languages. John C. Mitchell, MIT Press, 1996.
    A massive textbook on the theoretical foundations of programming languages. The chapters on the simply typed lambda calculus and on polymorphism are thorough and systematic.