Testing If std::tuple Is Being Applied To Functors Efficiently

Overview

In computer science it is never good enough to make a claim when the claim can be verified through testing. Fortunately C++ is a rich language which significantly and often aids the writing of test code.

In my previous blog article, Applying std::tuple To Functors Efficiently and Bearded Code Warrior's (BCW) most recent blog entry, Applying a function to a tuple – once again, claims are made that the C++ compiler will pass the tuple arguments efficiently to the desired function.

In my previous post, I had a long preamble explaining how the solution would work before presenting the solution near the end. In this post, I will start with the end results and explain how the code was written to obtain those results. This is done to emphasize why verifying claims/properties of code you've written is important.

Please note that BCW's apply() code is called apply_tuple() within my code. To avoid confusion, within this post all names of any BCW apply() functions are renamed to apply_tuple().

Update Feb. 3, 2012: I have changed Operand's operator ()(T…) function overloads to return Operand*. This ensures no copies/move are made with the returned value from the functor (as such is not relevant to this article series). All code in this series has been updated accordingly.

Article Source Code: cpp_code-apply_tuple-v3.zip.


What Is The Optimal Solution?

Before seeing any test results or writing any test code, first consider what is the optimal solution given a function applied to some arguments:

apply(foo, arg1, arg2, ..., argN);
apply_tuple(foo, make_tuple(arg1, arg2, ..., argN);

Assuming arg1, arg2, …, argN (ARGS) are all C++ class types; assuming all arguments were created prior to the call; and all arguments are valid rvalue references, then the optimal solutions are:

Table 1: Theoretical Optimal Apply Function Results
Call foo Function All N arguments
apply 0 misc
0 copies
0 moves
0 assign
0 misc
0 copies
0 moves
0 assign
apply_tuple 0 misc
0 copies
0 moves
0 assign
0 misc
0 copies
0 moves
0 assign

where:

Table 2: Definitions Of Terms In Table 1
Term Definition
misc Refers to the invocation of a default or any other constructor excluding any copy or move constructors.
copies Refers to the invocation of any copy constructor.
moves Refers to the invocation of any move constructor.
assign Refers to the invocation of any copy/move assignment operator.

For all tests all arguments (except foo) will be created as unnamed or compiler-generated temporary variables. Using only such temporary variables implies that the compiler should be able to reuse them to avoid doing any unnecessary copies/moves, etc. Provided the code being tested is properly written, optimal code should be generated as a result. If such is seen, then this test ought to also be a nice way to verify and test the claim that C++11 allows perfect forwarding!

Table 1 needs to be adjusted since temporary variables will be used throughout for all arguments except the function call argument. Specifically, the following items need to be added to Table 1:

  • the creation of the required (unnamed) initial variables, and,
  • any necessary copies, moves, and/or assignments that must be done for those variables.

These adjustments are reflected in Table 3:

Table 3: Theoretical Optimal Apply Function Results Adjusted For Required Initial Rvalue Reference Variables
Call foo Function All N arguments
apply 1 misc
0 copies
0 moves
0 assign
N misc
0 copies
0 moves
0 assign
apply_tuple 1 misc
0 copies
0 moves
0 assign
N misc
0 copies
N moves
0 assign

Clearly to populate a std::tuple with N rvalue references, a total of N moves, not copies, must be done. Additionally the actual original variables/functor must be created and this accounts for the misc constructor invocations listed in Table 3. Finally, in the case when the foo operation has no arguments, since zero-argument functions often do return a value, a value will be returned in those tests. Thus, there will be "1 misc" foo constructor invocation for any zero-argument cases in the results.

Finally note that within all tests performed within this article, all calls to apply_tuple will use std::make_tuple. Now that the theory looks good, let's check out reality!


The Test Results

Results For My Version Of apply() And apply_tuple()

Table 4 summarizes the results of using my previous blog entry's apply() and apply_tuple():

Table 4: Test Results Using Paul Preney's Apply Code
Call Functor Overhead N Argument Overhead
0 1 2 3
apply 1 misc
0 copies
0 moves
0 assign
1 misc
0 copies
0 moves
0 assign
1 misc
0 copies
0 moves
0 assign
2 misc
0 copies
0 moves
0 assign
3 misc
0 copies
0 moves
0 assign
apply_tuple 1 misc
0 copies
0 moves
0 assign
1 misc
0 copies
0 moves
0 assign
1 misc
0 copies
1 moves
0 assign
2 misc
0 copies
2 moves
0 assign
3 misc
0 copies
3 moves
0 assign

which matches the optimal results in Table 3. (Recall the zero-argument case creates one Operand as its return value hence the "1 misc" in those cases.) The actual test program output for apply() was:

Op() = 1.

Op LOG ============
default_constructor[this=0x7fff7c331c48]
destructor[this=0x7fff7c331c48]
Op HIST ===========
default_constructor 1
destructor 1
Op END ============
Operand LOG ============
other_constructor[this=0x7fff7c331c50]
destructor[this=0x7fff7c331c50]
Operand HIST ===========
other_constructor 1
destructor 1
Operand END ============

Op()(Operand(-3)) = -3.

Op LOG ============
default_constructor[this=0x7fff7c331c49]
destructor[this=0x7fff7c331c49]
Op HIST ===========
default_constructor 1
destructor 1
Op END ============
Operand LOG ============
other_constructor[this=0x7fff7c331c60]
destructor[this=0x7fff7c331c60]
Operand HIST ===========
other_constructor 1
destructor 1
Operand END ============

Op()(Operand(-3),Operand(-10)) = -13.

Op LOG ============
default_constructor[this=0x7fff7c331c4a]
destructor[this=0x7fff7c331c4a]
Op HIST ===========
default_constructor 1
destructor 1
Op END ============
Operand LOG ============
other_constructor[this=0x7fff7c331c80]
other_constructor[this=0x7fff7c331c70]
destructor[this=0x7fff7c331c70]
destructor[this=0x7fff7c331c80]
Operand HIST ===========
other_constructor 2
destructor 2
Operand END ============

Op()(Operand(-3),Operand(-10),Operand(-100)) = -113.

Op LOG ============
default_constructor[this=0x7fff7c331c4b]
destructor[this=0x7fff7c331c4b]
Op HIST ===========
default_constructor 1
destructor 1
Op END ============
Operand LOG ============
other_constructor[this=0x7fff7c331cb0]
other_constructor[this=0x7fff7c331ca0]
other_constructor[this=0x7fff7c331c90]
destructor[this=0x7fff7c331c90]
destructor[this=0x7fff7c331ca0]
destructor[this=0x7fff7c331cb0]
Operand HIST ===========
other_constructor 3
destructor 3
Operand END ============

and the output for apply_tuple() was:

Op() = 1.

Op LOG ============
default_constructor[this=0x7fff7c331c4c]
destructor[this=0x7fff7c331c4c]
Op HIST ===========
default_constructor 1
destructor 1
Op END ============
Operand LOG ============
other_constructor[this=0x7fff7c331cc0]
destructor[this=0x7fff7c331cc0]
Operand HIST ===========
other_constructor 1
destructor 1
Operand END ============

Op()(Operand(-3)) = -3.

Op LOG ============
default_constructor[this=0x7fff7c331c4d]
destructor[this=0x7fff7c331c4d]
Op HIST ===========
default_constructor 1
destructor 1
Op END ============
Operand LOG ============
other_constructor[this=0x7fff7c331cd0]
move_constructor[this=0x7fff7c331ce0,src=0x7fff7c331cd0]
destructor[this=0x7fff7c331ce0]
destructor[this=0x7fff7c331cd0]
Operand HIST ===========
move_constructor 1
other_constructor 1
destructor 2
Operand END ============

Op()(Operand(-3),Operand(-10)) = -13.

Op LOG ============
default_constructor[this=0x7fff7c331c4e]
destructor[this=0x7fff7c331c4e]
Op HIST ===========
default_constructor 1
destructor 1
Op END ============
Operand LOG ============
other_constructor[this=0x7fff7c331d00]
other_constructor[this=0x7fff7c331cf0]
move_constructor[this=0x7fff7c331d40,src=0x7fff7c331d00]
move_constructor[this=0x7fff7c331d44,src=0x7fff7c331cf0]
destructor[this=0x7fff7c331d44]
destructor[this=0x7fff7c331d40]
destructor[this=0x7fff7c331cf0]
destructor[this=0x7fff7c331d00]
Operand HIST ===========
move_constructor 2
other_constructor 2
destructor 4
Operand END ============

Op()(Operand(-3),Operand(-10),Operand(-100)) = -113.

Op LOG ============
default_constructor[this=0x7fff7c331c4f]
destructor[this=0x7fff7c331c4f]
Op HIST ===========
default_constructor 1
destructor 1
Op END ============
Operand LOG ============
other_constructor[this=0x7fff7c331d30]
other_constructor[this=0x7fff7c331d20]
other_constructor[this=0x7fff7c331d10]
move_constructor[this=0x7fff7c331d50,src=0x7fff7c331d30]
move_constructor[this=0x7fff7c331d54,src=0x7fff7c331d20]
move_constructor[this=0x7fff7c331d58,src=0x7fff7c331d10]
destructor[this=0x7fff7c331d58]
destructor[this=0x7fff7c331d54]
destructor[this=0x7fff7c331d50]
destructor[this=0x7fff7c331d10]
destructor[this=0x7fff7c331d20]
destructor[this=0x7fff7c331d30]
Operand HIST ===========
move_constructor 3
other_constructor 3
destructor 6
Operand END ============

where:

  • Op os foo,
  • Op() is the creation of a functor,
  • Op()(...) is the invocation of the functor with the … arguments passed in,
  • Operand is the type representing all arguments and return values,
  • Operand(value) is the creation of an argument with the value of value,
  • Op_LOG and Operand_LOG are the constructor, assignment, and destructor actions performed during the test in chronological order for all Op and Operand types respectively, and,
  • Op_HIST and Operand_HIST are the frequency histograms of the actions performed during the test for all Op and Operand types respectively.

These outputs were produced by the following code:

// TEST apply()...
APPLY_FULL0(Op);
APPLY_FULL1(Op, -3);
APPLY_FULL2(Op, -3, -10);
APPLY_FULL3(Op, -3, -10, -100);

// TEST apply_tuple()...
APPLY_TUPLE_FULL0(Op);
APPLY_TUPLE_FULL1(Op, -3);
APPLY_TUPLE_FULL2(Op, -3, -10);
APPLY_TUPLE_FULL3(Op, -3, -10, -100);

where APPLY_FULLn and APPLY_TUPLE_FULLn are macros that will be defined below. The reason macros were used was to ensure no side-effects were incurred through the use of function calls.

This test demonstrates:

  • my apply() and apply_tuple() implementations are optimal, and,
  • perfect forwarding of function arguments is a real achievement of C++11. 🙂

The code enabling these tests to be performed will be discussed after discussing the results of BCW's apply_tuple().


Results For BCW's Version Of apply_tuple()

BCW's apply_tuple() does not use any rvalue references and instead relies on the compiler optimizing the use of all recursive calls. Consequently, my hypothesis was that there would be more overhead in the general case using objects as the compiler would need to make copies of class instances. This is understandable since C++ compilers cannot use the multitude of assumptions it uses for any primitive types such as int, double, etc. That said, I was shocked to see just how bad the results were using BCW's code (after adding appropriate Operand operations supporting lvalues so BCW's code would compile):

Table 5: Test Results Using BCW's Apply Code's Function Call Overhead
Call Functor Overhead With N Arguments
0 1 2 3
apply N/A
apply_tuple 1 misc
0 copies
0 moves
0 assign
1 misc
1 copies
0 moves
0 assign
1 misc
2 copies
0 moves
0 assign
1 misc
3 copies
0 moves
0 assign

which shows how BCW's code unnecessarily copies the function object with each recursive call and:

Table 6: Test Results Using BCW's Apply Code's Argument Overhead
Call N Argument Overhead
0 1 2 3
apply N/A
apply_tuple 1 misc
0 copies
0 moves
0 assign
1 misc
3 copies
2 moves
0 assign
2 misc
8 copies
3 moves
0 assign
3 misc
16 copies
4 moves
0 assign

which shows how quickly the number of copies explodes as the argument size increases as well that the number of moves is equal to the number of arguments + 1, except in the zero-argument case as a special function overload is used to handle tuples with no elements.

Thus, it appears that BCW's code requires:

  • N copies of its function object, and,
  • some 2^(N+1) copies with N+1 moves of its arguments

for a single N-argument function call instead of the optimal:

  • zero copies of its function object, and,
  • zero copies with N moves of the function arguments.

So for N arguments:

  • N extra copies of the function argument are created,
  • it appears some O(2^(N+1)) copies of the arguments are created, and,
  • N extra moves of the arguments are performed.

all with GCC v4.7 (snapshot 2011-10-22) compiled using -std=c++0x and -O3 options! Wow!

This clearly shows just how relevant and practically useful rvalue references are in C++11 or any language that supports objects. Moreover, given how common functions with zero, one, two, and three arguments are, certainly BCW's code shouldn't be used unmodified with class/struct/union types. Thus, although BCW's code may compile and optimize well with the built-in C++ types, it fails to be anywhere near optimal for general class types.

Interestingly, one should note that BCW's solution, as posted in the blog article, performs, for N arguments, N recursive calls each passing increasing numbers of arguments by value. Such is clearly reflected in the values summarized in Table 5 and Table 6 whose values which were obtained from this test program output (whose computed output is also not completely correct due to some serious compiler warnings about references to temporaries in the recursive step of BCW's solution):

Op() = 1.

Op LOG ============
default_constructor[this=0x7fffede986cc]
destructor[this=0x7fffede986cc]
Op HIST ===========
default_constructor 1
destructor 1
Op END ============
Operand LOG ============
other_constructor[this=0x7fffede986d0]
destructor[this=0x7fffede986d0]
Operand HIST ===========
other_constructor 1
destructor 1
Operand END ============

Op()(Operand(-3)) = 20.

Op LOG ============
default_constructor[this=0x7fffede986cd]
copy_constructor[this=0x7fffede985bf,src=0x7fffede986cd]
destructor[this=0x7fffede985bf]
destructor[this=0x7fffede986cd]
Op HIST ===========
default_constructor 1
copy_constructor 1
destructor 2
Op END ============
Operand LOG ============
other_constructor[this=0x7fffede986e0]
move_constructor[this=0x7fffede986f0,src=0x7fffede986e0]
copy_constructor[this=0x7fffede985d0,src=0x7fffede986f0]
copy_constructor[this=0x7fffede985c0,src=0x7fffede986f0]
copy_constructor[this=0x7fffede985f0,src=0x7fffede985d0]
move_constructor[this=0x7fffede985e0,src=0x7fffede985f0]
destructor[this=0x7fffede985f0]
destructor[this=0x7fffede985e0]
destructor[this=0x7fffede985c0]
destructor[this=0x7fffede985d0]
destructor[this=0x7fffede986f0]
destructor[this=0x7fffede986e0]
Operand HIST ===========
copy_constructor 3
move_constructor 2
other_constructor 1
destructor 6
Operand END ============

Op()(Operand(-3),Operand(-10)) = -13.

Op LOG ============
default_constructor[this=0x7fffede986ce]
copy_constructor[this=0x7fffede984ce,src=0x7fffede986ce]
copy_constructor[this=0x7fffede984cf,src=0x7fffede984ce]
destructor[this=0x7fffede984cf]
destructor[this=0x7fffede984ce]
destructor[this=0x7fffede986ce]
Op HIST ===========
default_constructor 1
copy_constructor 2
destructor 3
Op END ============
Operand LOG ============
other_constructor[this=0x7fffede98710]
other_constructor[this=0x7fffede98700]
move_constructor[this=0x7fffede98750,src=0x7fffede98710]
move_constructor[this=0x7fffede98754,src=0x7fffede98700]
copy_constructor[this=0x7fffede984d0,src=0x7fffede98754]
copy_constructor[this=0x7fffede98520,src=0x7fffede98750]
copy_constructor[this=0x7fffede98524,src=0x7fffede98754]
copy_constructor[this=0x7fffede984f0,src=0x7fffede98520]
copy_constructor[this=0x7fffede984e0,src=0x7fffede984d0]
copy_constructor[this=0x7fffede98530,src=0x7fffede98520]
copy_constructor[this=0x7fffede98534,src=0x7fffede98524]
copy_constructor[this=0x7fffede98510,src=0x7fffede984e0]
move_constructor[this=0x7fffede98500,src=0x7fffede98510]
destructor[this=0x7fffede98510]
destructor[this=0x7fffede98500]
destructor[this=0x7fffede98534]
destructor[this=0x7fffede98530]
destructor[this=0x7fffede984e0]
destructor[this=0x7fffede984f0]
destructor[this=0x7fffede98524]
destructor[this=0x7fffede98520]
destructor[this=0x7fffede984d0]
destructor[this=0x7fffede98754]
destructor[this=0x7fffede98750]
destructor[this=0x7fffede98700]
destructor[this=0x7fffede98710]
Operand HIST ===========
copy_constructor 8
move_constructor 3
other_constructor 2
destructor 13
Operand END ============

Op()(Operand(-3),Operand(-10),Operand(-100)) = -113.

Op LOG ============
default_constructor[this=0x7fffede986cf]
copy_constructor[this=0x7fffede984de,src=0x7fffede986cf]
copy_constructor[this=0x7fffede984df,src=0x7fffede984de]
copy_constructor[this=0x7fffede9831f,src=0x7fffede984df]
destructor[this=0x7fffede9831f]
destructor[this=0x7fffede984df]
destructor[this=0x7fffede984de]
destructor[this=0x7fffede986cf]
Op HIST ===========
default_constructor 1
copy_constructor 3
destructor 4
Op END ============
Operand LOG ============
other_constructor[this=0x7fffede98740]
other_constructor[this=0x7fffede98730]
other_constructor[this=0x7fffede98720]
move_constructor[this=0x7fffede98760,src=0x7fffede98740]
move_constructor[this=0x7fffede98764,src=0x7fffede98730]
move_constructor[this=0x7fffede98768,src=0x7fffede98720]
copy_constructor[this=0x7fffede984e0,src=0x7fffede98768]
copy_constructor[this=0x7fffede98510,src=0x7fffede98760]
copy_constructor[this=0x7fffede98514,src=0x7fffede98764]
copy_constructor[this=0x7fffede98518,src=0x7fffede98768]
copy_constructor[this=0x7fffede98500,src=0x7fffede98514]
copy_constructor[this=0x7fffede984f0,src=0x7fffede984e0]
copy_constructor[this=0x7fffede98520,src=0x7fffede98510]
copy_constructor[this=0x7fffede98524,src=0x7fffede98514]
copy_constructor[this=0x7fffede98528,src=0x7fffede98518]
copy_constructor[this=0x7fffede98340,src=0x7fffede98520]
copy_constructor[this=0x7fffede98330,src=0x7fffede98500]
copy_constructor[this=0x7fffede98320,src=0x7fffede984f0]
copy_constructor[this=0x7fffede98370,src=0x7fffede98520]
copy_constructor[this=0x7fffede98374,src=0x7fffede98524]
copy_constructor[this=0x7fffede98378,src=0x7fffede98528]
copy_constructor[this=0x7fffede98360,src=0x7fffede98320]
move_constructor[this=0x7fffede98350,src=0x7fffede98360]
destructor[this=0x7fffede98360]
destructor[this=0x7fffede98350]
destructor[this=0x7fffede98378]
destructor[this=0x7fffede98374]
destructor[this=0x7fffede98370]
destructor[this=0x7fffede98320]
destructor[this=0x7fffede98330]
destructor[this=0x7fffede98340]
destructor[this=0x7fffede98528]
destructor[this=0x7fffede98524]
destructor[this=0x7fffede98520]
destructor[this=0x7fffede984f0]
destructor[this=0x7fffede98500]
destructor[this=0x7fffede98518]
destructor[this=0x7fffede98514]
destructor[this=0x7fffede98510]
destructor[this=0x7fffede984e0]
destructor[this=0x7fffede98768]
destructor[this=0x7fffede98764]
destructor[this=0x7fffede98760]
destructor[this=0x7fffede98720]
destructor[this=0x7fffede98730]
destructor[this=0x7fffede98740]
Operand HIST ===========
copy_constructor 16
move_constructor 4
other_constructor 3
destructor 23
Operand END ============

using the same macro code as the previous section's apply_tuple with my apply and apply_tuple code was deleted and replaced with BCW's code instead.

BCW's work is typically thorough but this instance shows that optimizations aren't magical, the programmer must use rvalue references where they ought to be used, and one should always double-check efficiency claims for code written. While in this instance BCW's code was found to have issues, we're all human and all coders all been guilty on various occassions of the assuming code efficiencies that weren't true!

I tried tweaking BCW's code while remaining faithful to the style of his solution by adding rvalue references. Unfortuantely, I ran into a snag: the snapshot of GCC v4.7 I am using gives "Sorry, unimplemented" messages when one tries to expand the TS… parameter pack inside a template argument. The latter is needed to obtain the type of each tuple argument using std::tuple_element so it can be passed to std::forward. Without such, one must use a class template as per my solution.


Tracking All Constructors, Assignments, and Destructors

Defining The Tracking Class

So how would one go about easily tracking and logging all constructor invocations, copy/move assignments, and when destructors are called? First one starts by writing a class template that does exactly this (say in a file called tracker.hxx):

#ifndef tracker_hxx_
#define tracker_hxx_

#include <list>
#include <map>
#include <tuple>
#include <ostream>

template <typename T>
struct tracker
{
  public:
    // Handle default constructor tracking...
    tracker()
    {
      update(DefaultConstr, this);
    }

    // Invoke for tracking any other constructors...
    explicit tracker(int)
    {
      update(OtherConstr, this);
    }

    // Handle copy constructor tracking...
    tracker(tracker const& src)
    {
      update(
        CopyConstr,
        this,
        &const_cast<tracker&>(src)
      );
    }

    // Handle move constructor tracking...
    tracker(tracker&& src)
    {
      update(MoveConstr, this, &src);
    }

    // Handle copy-assignment operator tracking...
    tracker& operator =(tracker const& src)
    {
      update(
        CopyAssign,
        this,
        &const_cast<tracker&>(src)
      );
      return *this;
    }

    // Handle move-assignment operator tracking...
    tracker& operator =(tracker&& src)
    {
      update(MoveAssign, this, &src);
      return *this;
    }

    // Handle destructor tracking...
    ~tracker()
    {
      update(Destruct, this);
    }

    // Clear out all log and histogram data...
    static void clear()
    {
      log_.clear();
      hist_.clear();
    }

    // Output the log to a stream...
    static std::ostream& output_log(std::ostream& os)
    {
      for (log_entry_type& le : log_)
      {
        output_log_entry(os, le);
        os << '\n';
      }
      return os;
    }

    // Output the histogram to a stream...
    static std::ostream& output_hist(std::ostream& os)
    {
      for (typename hist_type::value_type& he : hist_)
      {
        output_hist_entry(os, he);
        os << '\n';
      }
      return os;
    }

  private:
    // Enumeration to identify the various operations being logged
    // and tracked in the frequency histogram...
    enum op
    {
      DefaultConstr, CopyConstr, MoveConstr, OtherConstr,
      CopyAssign, MoveAssign,
      Destruct
    };

    // Define log_entry_type as a std::tuple such that:
    //   <0> is an instance of op
    //   <1> indicates the this object
    //   <2> indicates the source object (e.g., for copies/moves)
    typedef std::tuple<op, tracker*, tracker*> log_entry_type;

    // Define log_type as a list of log_entry_types...
    typedef std::list<log_entry_type> log_type;

    // Define hist_type as a map from op to size_t...
    typedef std::map<op, std::size_t> hist_type;

    // Update tracked information in the log and histogram...
    static void update(op o, tracker* this_v, tracker* src = nullptr)
    {
      auto log_entry = log_entry_type(o, this_v, src);

      // Append to log...
      log_.push_back(log_entry);

      // Update histogram...
      ++hist_[o];
    }

    // Output a log entry...
    static std::ostream& output_log_entry(std::ostream& os,
      log_entry_type const& le)
    {
      switch (std::get<0>(le))
      {
        case DefaultConstr:
          os
            << "default_constructor[this="
            << std::get<1>(le)
            << ']'
          ;
          break;

        case CopyConstr:
          os
            << "copy_constructor[this="
            << std::get<1>(le)
            << ",src="
            << std::get<2>(le)
            << ']'
          ;
          break;

        case MoveConstr:
          os
            << "move_constructor[this="
            << std::get<1>(le)
            << ",src="
            << std::get<2>(le)
            << ']'
          ;
          break;

        case OtherConstr:
          os
            << "other_constructor[this="
            << std::get<1>(le)
            << ']'
          ;
          break;

        case CopyAssign:
          os
            << "copy_assignment[this="
            << std::get<1>(le)
            << ",src="
            << std::get<2>(le)
            << ']'
          ;
          break;

        case MoveAssign:
          os
            << "move_assignment[this="
            << std::get<1>(le)
            << ",src="
            << std::get<2>(le)
            << ']'
          ;
          break;

        case Destruct:
          os
            << "destructor[this="
            << std::get<1>(le)
            << ']'
          ;
          break;
      }
      return os;
    }

    // Output a hist entry...
    static std::ostream& output_hist_entry(std::ostream& os,
      typename hist_type::value_type const& he)
    {
      switch (he.first)
      {
        case DefaultConstr:
          os << "default_constructor " << he.second;
          break;

        case CopyConstr:
          os << "copy_constructor " << he.second;
          break;

        case MoveConstr:
          os << "move_constructor " << he.second;
          break;

        case OtherConstr:
          os << "other_constructor " << he.second;
          break;

        case CopyAssign:
          os << "copy_assignment " << he.second;
          break;

        case MoveAssign:
          os << "move_assignment " << he.second;
          break;

        case Destruct:
          os << "destructor " << he.second;
          break;
      }
      return os;
    }

    // Define the log_ and hist_ variables statically...
    static log_type log_;
    static hist_type hist_;
};

// All class static variables must be instantiated outside the class...
template <typename T>
typename tracker<T>::log_type tracker<T>::log_;

template <typename T>
typename tracker<T>::hist_type tracker<T>::hist_;

#endif // #ifndef tracker_hxx_

Noteworthy is how almost everything is defined as private in order to not pollute the derived class with symbols. Even the static functions could have been defined outside the tracker class as friends to further avoid any pollution.

The idea behind tracker<T> is to:

  • log all constructor, copy/move assignment, and destructor invocations as they occur for some type T,
  • to log all object names (i.e., their addresses) upon which these invocations occur,
  • to log any source objects for all copying and move operations, and,
  • to keep an up-to-date frequency histogram of how many times each type of invocation occurred.

Using The Tracker Class With The Curiously Recurring Template Pattern (CRTP)

With such, one need only inherit from tracker passing in the name of the class doing the inheriting:

class Op : tracker<Op>
{
  public:
    using tracker<Op>::tracker<Op>;
};

Unfortunately, it appears the version of GCC I am using does not implement using syntax with inheriting constructors yet. As a work-around, one needs to define each constructor, assignment operator, and destructor in the derived class and invoke the same in the parent:

struct Op : tracker<Op>
{
  Op()
    : tracker()
  {
  }

  Op(Op const& a)
    : tracker(a)
  {
  }

  Op(Op&& a)
    : tracker(std::move(a))
  {
  }

  Op& operator =(Op const& a)
  {
    tracker::operator =(a);
    return *this;
  }

  Op& operator =(Op&& a)
  {
    tracker::operator =(std::move(a));
    return *this;
  }

  ~Op() = default;
};

Passing the class one is defining as a template argument to its base class is permitted and extremely useful in C++. It is a C++ programming pattern is called the Curiously Recurring Template Pattern (CRTP) and it works because all template instantiations are lazy (i.e., they only occur when they are required) and C++ permits forward declarations of types. When the compiler encounters class Op it now knows that Op is a class, so when it is told to inherit from tracker<Op> it is not an error provided that the use of Op inside of tracker does not need to know anything inside Op's definition outside of any member function's definition (not prototype)! If one tries to access members of Op from within tracker (but not within its member functions) the C++ compiler will give an error message about Op being an "incomplete type". Since tracker does no such thing, the compiler inherits from tracker<Op> and then processes the definition of Op!

The CRTP design pattern is extremely useful in C++ programs especially to perform the following:

  • to automatically perform tracking/logging functions including automatically adding/removing all instances of a type to data structures, and,
  • to eliminate the overhead of virtual function calls when run-time object polymorphism is not needed.

Since we want to track the function object instances and the function argument instances, two classes will need to be defined using CRTP:

  • Op representing the function object type, and,
  • Operand representing the function arguments' type.

Such would be defined as follows (say in a file called op_operand.hxx):

#ifndef op_operand_hxx_
#define op_operand_hxx_

#include "tracker.hxx"
#include <ostream>
#include <iostream>

//====================================================================

// Define the function argument type. It will hold an int value...
struct Operand : tracker<Operand>
{
  Operand()
    : tracker(), value(0) { }
  Operand(int v)
    : tracker(0), value(v) { }
  Operand(Operand const& a)
    : tracker(a), value(a.value) { }
  Operand(Operand&& a)
    : tracker(std::move(a)), value(a.value) { a.value = 0; }
  Operand& operator =(Operand const& a)
    { tracker::operator =(a); value = a.value; return *this; }
  Operand& operator =(Operand&& a)
  {
    tracker::operator =(std::move(a));
    value = a.value;
    a.value = 0;
    return *this;
  }
  ~Operand() = default;

  int value;
};

// Define the ability for Operands to be written to a stream...
std::ostream& operator <<(std::ostream& os, Operand const& o)
{
  os << o.value;
  return os;
}

//====================================================================

// Define the functor type...
struct Op : tracker<Op>
{
  Op() : tracker() { }
  Op(Op const& a) : tracker(a) { }
  Op(Op&& a) : tracker(std::move(a)) { }
  Op& operator =(Op const& a)
    { tracker::operator =(a); return *this; }
  Op& operator =(Op&& a)
    { tracker::operator =(std::move(a)); return *this; }
  ~Op() = default;

  // Define calling a function with no arguments...
  Operand* operator ()() const
    { return nullptr; }

  // Define calling a function with one argument...
  Operand* operator ()(Operand&& x) const
    { return nullptr; }
#ifdef OPERAND_LVALUE_REF_SUPPORT
  Operand* operator ()(Operand const& x) const
    { return nullptr; }
#endif

  // Define calling a function with two arguments...
  Operand* operator ()(Operand&& x, Operand&& y) const
    { x.value += y.value; return std::move(x); }
#ifdef OPERAND_LVALUE_REF_SUPPORT
  Operand* operator ()(Operand const& x, Operand const& y) const
  {
    Operand retval(x);
    retval.value += y.value;
    return nullptr;
  }
#endif

  // Define calling a function with three arguments...
  Operand* operator ()(Operand&& x, Operand&& y,
    Operand&& z) const
    { x.value += y.value + z.value; return nullptr; }
#ifdef OPERAND_LVALUE_REF_SUPPORT
  Operand* operator ()(Operand const& x, Operand const& y,
    Operand const& z) const
  {
    Operand retval(x);
    retval.value += y.value + z.value;
    return nullptr;
  }
#endif
};

//====================================================================

// Define all testing macros...

#define APPLY_TEST0(i) \
  std::cout << #i " = " << apply(i) << "." << std::endl

#define APPLY_TEST1(i,a) \
  std::cout << #i "(" #a ") = " << apply(i,a) << "." << std::endl

#define APPLY_TEST2(i,a,b) \
  std::cout << #i "(" #a "," #b ") = " << apply(i,a,b) \
  << "." << std::endl

#define APPLY_TEST3(i,a,b,c) \
  std::cout << #i "(" #a "," #b "," #c ") = " \
    << apply(i,a,b,c) << "." << std::endl


#define APPLY_TUPLE_TEST0(i) \
  std::cout << #i " = " << apply_tuple(i,std::tuple<>()) \
  << "." << std::endl

#define APPLY_TUPLE_TEST1(i,a) \
  std::cout << #i "(" #a ") = " \
    << apply_tuple(i,std::make_tuple(a)) << "." << std::endl

#define APPLY_TUPLE_TEST2(i,a,b) \
  std::cout << #i "(" #a "," #b ") = " \
    << apply_tuple(i,std::make_tuple(a,b)) << "." << std::endl

#define APPLY_TUPLE_TEST3(i,a,b,c) \
  std::cout << #i "(" #a "," #b "," #c ") = " \
    << apply_tuple(i,std::make_tuple(a,b,c)) << "." << std::endl


#define OUTPUT(type) \
  std::cout \
    << #type " LOG ============\n" \
    << tracker<type>::output_log \
    << #type " HIST ===========\n" \
    << tracker<type>::output_hist \
    << #type " END ============\n" \
  ;


#define CLEAR(type) tracker<type>::clear();

#define APPLY_FULL0(type) \
  CLEAR(type); \
  CLEAR(Operand); \
  { APPLY_TEST0(type()); } \
  std::endl(std::cout); \
  OUTPUT(type); \
  OUTPUT(Operand); \
  std::endl(std::cout);


#define APPLY_FULL1(type,a) \
  CLEAR(type); \
  CLEAR(Operand); \
  { APPLY_TEST1(type(),Operand(a)); } \
  std::endl(std::cout); \
  OUTPUT(type); \
  OUTPUT(Operand); \
  std::endl(std::cout);


#define APPLY_FULL2(type,a,b) \
  CLEAR(type); \
  CLEAR(Operand); \
  { APPLY_TEST2(type(),Operand(a),Operand(b)); } \
  std::endl(std::cout); \
  OUTPUT(type); \
  OUTPUT(Operand); \
  std::endl(std::cout);


#define APPLY_FULL3(type,a,b,c) \
  CLEAR(type); \
  CLEAR(Operand); \
  { APPLY_TEST3(type(),Operand(a),Operand(b),Operand(c)); } \
  std::endl(std::cout); \
  OUTPUT(type); \
  OUTPUT(Operand); \
  std::endl(std::cout);


#define APPLY_TUPLE_FULL0(type) \
  CLEAR(type); \
  CLEAR(Operand); \
  { APPLY_TUPLE_TEST0(type()); } \
  std::endl(std::cout); \
  OUTPUT(type); \
  OUTPUT(Operand); \
  std::endl(std::cout);


#define APPLY_TUPLE_FULL1(type,a) \
  CLEAR(type); \
  CLEAR(Operand); \
  { APPLY_TUPLE_TEST1(type(),Operand(a)); } \
  std::endl(std::cout); \
  OUTPUT(type); \
  OUTPUT(Operand); \
  std::endl(std::cout);


#define APPLY_TUPLE_FULL2(type,a,b) \
  CLEAR(type); \
  CLEAR(Operand); \
  { APPLY_TUPLE_TEST2(type(),Operand(a),Operand(b)); } \
  std::endl(std::cout); \
  OUTPUT(type); \
  OUTPUT(Operand); \
  std::endl(std::cout);


#define APPLY_TUPLE_FULL3(type,a,b,c) \
  CLEAR(type); \
  CLEAR(Operand); \
  { APPLY_TUPLE_TEST3(type(),Operand(a),Operand(b),Operand(c)); } \
  std::endl(std::cout); \
  OUTPUT(type); \
  OUTPUT(Operand); \
  std::endl(std::cout);


//====================================================================

#endif // #ifndef op_operand_hxx_


Running The Tests

The C++ code to run tests on my code is:

#include <cstddef>

//====================================================================

template <std::size_t...>
struct indices;

template <std::size_t N, typename Indices, typename... Types>
struct make_indices_impl;

template <
  std::size_t N,
  std::size_t... Indices,
  typename Type, typename... Types
>
struct make_indices_impl<N, indices<Indices...>, Type, Types...>
{
  typedef
    typename make_indices_impl<
      N+1,
      indices<Indices...,N>,
      Types...>::type
    type
  ;
};

template <std::size_t N, std::size_t... Indices>
struct make_indices_impl<N, indices<Indices...>>
{
  typedef indices<Indices...> type;
};

template <std::size_t N, typename... Types>
struct make_indices
{
  typedef
    typename make_indices_impl<
      0,
      indices<>,
      Types...
    >::type
    type
  ;
};

//====================================================================

#include <tuple>
#include <functional>

//====================================================================

template <typename Op, typename... Args>
auto apply(Op&& op, Args&&... args)
  -> typename std::result_of<Op(Args...)>::type
{
  return op( std::forward<Args>(args)... );
}

//====================================================================

template <
  typename Indices
>
struct apply_tuple_impl;

template <
  template <std::size_t...> class I,
  std::size_t... Indices
>
struct apply_tuple_impl<I<Indices...>>
{
  template <
    typename Op,
    typename... OpArgs,
    template <typename...> class T = std::tuple
  >
  static auto apply_tuple(Op&& op, T<OpArgs...>&& t)
    -> typename std::result_of<Op(OpArgs...)>::type
  {
    return op( std::forward<OpArgs>(std::get<Indices>(t))... );
  }
};

template <
  typename Op,
  typename... OpArgs,
  typename Indices = typename make_indices<0, OpArgs...>::type,
  template <typename...> class T = std::tuple
>
auto apply_tuple(Op&& op, T<OpArgs...>&& t)
  -> typename std::result_of<Op(OpArgs...)>::type
{
  return apply_tuple_impl<Indices>::apply_tuple(
    std::forward<Op>(op),
    std::forward<T<OpArgs...>>(t)
  );
}

//====================================================================

#include "tracker.hxx"
#include "op_operand.hxx"

int main()
{
  // TEST apply()...
  APPLY_FULL0(Op);
  APPLY_FULL1(Op, -3);
  APPLY_FULL2(Op, -3, -10);
  APPLY_FULL3(Op, -3, -10, -100);

  std::cout
    << "====================================="
    << std::endl
    << std::endl
  ;

  // TEST apply_tuple()...
  APPLY_TUPLE_FULL0(Op);
  APPLY_TUPLE_FULL1(Op, -3);
  APPLY_TUPLE_FULL2(Op, -3, -10);
  APPLY_TUPLE_FULL3(Op, -3, -10, -100);

  return 0;
}

//====================================================================

and the C++ code to run tests on BCW's code is:

#include <cstddef>
#include <tuple>
#include <functional>

//====================================================================

// BCW's apply() renamed to apply_tuple()...

template <typename Func, typename... TupleTypes,
          typename... UnwrappedParams>
inline typename std::result_of<Func(TupleTypes...)>::type
  apply_tuple(Func f, std::tuple<TupleTypes...> ts,
      UnwrappedParams... params)
{
    return apply_tuple(f, ts, params...,
        std::get<sizeof...(UnwrappedParams)>(ts));
}

template <typename Func, typename... TupleTypes>
inline typename std::result_of<Func(TupleTypes...)>::type
  apply_tuple(Func f, std::tuple<TupleTypes...> ts,
      TupleTypes... params)
{
    return f(params...);
}

template <typename Func>
inline auto apply_tuple(Func f, std::tuple<> ts) -> decltype(f())
{
    return f();
}

//====================================================================

#include "tracker.hxx"

// BCW's code requires Op::operator ()() to support lvalues to compile.
// Thus, it is necessary to define the next line to have BCW's code
// to compile...
#define OPERAND_LVALUE_REF_SUPPORT
#include "op_operand.hxx"

int main()
{
  // TEST apply_tuple()...
  APPLY_TUPLE_FULL0(Op);
  APPLY_TUPLE_FULL1(Op, -3);
  APPLY_TUPLE_FULL2(Op, -3, -10);
  APPLY_TUPLE_FULL3(Op, -3, -10, -100);

  return 0;
}

//====================================================================


Closing Comments

Success has been achieved and tested! The apply functions were each tested to see whether or not they would perform optimally using C++11's rvalue references and ability to perfectly forward function arguments. Also demonstrated was how recursively defined code can have very significant overheads but such can be elided if code is written to exploit perfect argument forwarding with only a small amount of C++ template metaprogramming code. Once done, one can use such codes without being concerned about their overheads!

Another important lesson for myself and everyone that reads this post is just how important testing code for efficiency concerns –not just correctness– is before adding such code to a real program or into a library that is used in real programs. Ideally code once tested, debugged, etc. can be used as a blackbox with its documented guarantees. For this ideal to be reached, the properties and claims made in the documentation about the code need to be confirmed through testing –not just code study.

Should any any errors or omissions be found within this post, kindly advise me of such!

Happy Coding!

One Reply to “Testing If std::tuple Is Being Applied To Functors Efficiently”

Leave a Reply

Your email address will not be published. Required fields are marked *