Applying std::tuple To Functors Efficiently

Overview

The new C++11 standard nicely adds tuples (i.e., std::tuple found in <tuple>) and while it is easy to invoke std::make_tuple and pass a number of arguments to create a new std::tuple there does not appear to be any function that efficiently does the reverse (i.e., apply all tuple elements efficiently to a function or a functor). This post details how to do such.

Update (Dec. 18, 2011): I have added some comments concerning the use of std::move in some return statements. When returning values from a function, one should simply write return somevalue; to enable the compiler to choose between moving and return value optimization (RVO). Because this article focuses on discussing move operations, I did not write the code to exploit RVO –it explicitly moves everythng– in order to avoid any confusion, but, please be aware that normally one would not use std::move on a variable being returned in a return statement.

Article Source Code: See the next article.


Using std::make_tuple

Using std::make_tuple to construct a tuple is very straight-forward:

#include <string>
#include <tuple>

int main()
{
  auto x = std::make_tuple(1, 4.3, std::string{"Hello!"});
}

would induce the variable x to have the type:

std::tuple<int, double, std::string>

The std::make_tuple function is a convenience function that allows one to avoid having to specify both the types and the values for a specific tuple. For example, the above code could have been written:

#include <string>
#include <tuple>

int main()
{
  std::tuple<int,double,std::string> x{1, 4.3, std::string{"Hello!"}};
}

It is also particularly handy when one needs to create and pass a tuple to a function as a temporary variable. By simply invoking std::make_tuple in a way that ensures its returned tuple is passed to the function, avoids having to formally declare tuple variables.

Accessing std::tuple's Elements

Given some std::tuple instance, t, the C++11 standard allows us to retrieve a specific element of a std::tuple by using get<N>(t) where N is the zero-based index of the desired element. It is also possible to retrieve the number of elements as well as their types, but, these will not be used in the solution presented in this article.

Requirements Of The Solution

The final solution should concern using a function called apply which should take these parameters:

  • the first argument must be a function (pointer) or a functor (i.e., a struct/class with an overloaded function call operator), and,
  • the second parameter must be a tuple whose sequence of types and values are the same as the first argument's function/functor arguments.

so when apply is called all of the std::tuple's elements are efficiently passed to the function/functor. For example this code's cout lines:

struct add1
{
  int operator ()(int x, int y) const
  {
    return x + y;
  }
};

int add2(int x, int y)
{
  return x + y;
}

// ...

std::cout << apply(add1(), std::make_tuple(3,4)) << std::endl;
std::cout << apply(add2, std::make_tuple(8,2)) << std::endl;

are equivalent to:

std::cout << add1()(3,4) << std::endl;
std::cout << add2(8,2) << std::endl;

and would output:

$ ./a.out
7
10
$

Before considering how to manipulate std::tuple's one should consider how to write the code that does the same without std::tuples:

std::cout << apply(add1(), 3, 4) << std::endl;
std::cout << apply(add2, 8, 2) << std::endl;

Ideally such should be a simpler version of a solution that works with std::tuple and, in this instance, this is what occurs.

The other requirement of the solution is maximum possible efficiency. To achieve this will require careful attention to ensure that no unnecessary copying or moving of any data is done. Ideally, the compiler should be able to use the get function to retrieve all tuple elements and place them in the function arguments all at compile time.

NOTE: This means that std::bind and std::function should not be used in the solution.

Rvalue References And Perfect Forwarding

What Is An Rvalue?

To meet the above requirements two new C++11 features will need to be exploited: r-value references and perfect forwarding of function arguments. Prior to C++11 only these special operations, with or without using templates, could be defined with user-defined types:

  • the ability to create a default-initialized object by defining a constructor that takes no arguments,
  • the ability to create a copy of another object of the same type by defining a constructor that has a single argument that receives the other object,
  • the ability to define how one object is assigned to another (of the same type) by defining a copy assignment operator, and,
  • the ability to define how an object is destroyed by writing a destructor.

In addition to the above C++11 adds the following:

  • the ability to move the contents of an object that is an rvalue into a newly created object because the former object will no longer be used (e.g., it will be destroyed), and,
  • the ability to move the contents of an object that is an rvalue into an existing object via assignment because the former object will no longer be used (e.g., it will be destroyed).

The first ability is called the move constructor and the second is called move assignment. Both operations take rvalue references for their arguments. Know that rvalue references are references and cannot be const. Since they are references, the following code is erroneous:

struct Example2 { };

Example2&& test()
{
  Example2 e;
  // ...
  return e;
}

It is illegal for the same reasons that this code is erroneous:

struct Example2 { };

Example2& test()
{
  Example2 e;
  // ...
  return e;
}

and that this code is erroneous:

struct Example3 { };

Example3* test()
{
  Example3 e;
  // ...
  return &e;
}

as all three return the address (implicitly for Example1 and Example2 and explicitly for Example3) of a local variable that will be destroyed once the test() returns.

What exactly is an rvalue? The name rvalue means right-hand-side-value and stands in constrast to an lvalue (which means left-hand-side-value). The usage of right and left refers to where it is allowed relative to an assignment operator: an rvalue can only appear on the right-hand side whereas an lvalue can appear on the left or right hand sides.

In terms of C++11 coding what is an rvalue and how does it differ from an lvalue? Loosely, an rvalue is:

  • a compiler-created temporary variable, or
  • any unnamed variable.

Hence an rvalue is a value that one cannot explicitly obtain its address via the address-of operator as there is no name to refer to it. (Because of nuances, the C++11 standard redefines lvalues and rvalues more precisely than this using a taxonomy. This level of detail will not be discussed here.)

Once a variable somehow acquires a name then it can be explicitly referenced and manipulated as an lvalue and so C++ no longer treats as an rvalue (with some caveats for explicitly named rvalue-reference variables) unless std::move(variable_name) is invoked on it.

Warning: Once std::move(variable_name) is invoked on a variable and its owned contents has been moved, that variable should only used as as a "zeroed-out" or "empty" object (whatever that means for that type). Effectively, when you use std::move() know that the ownership of that variable's state may be transferred (i.e., moved) to another variable since std::move() makes that variable become an rvalue.

One of the requirements of this article is to be able to have maximum efficiency: one does not want to copy the contents of a tuple into function arguments. In general, for proper program semantics with arbitrary functions copies cannot be made. Consider:

double test(int& i, double const& y)
{
  double retval = y * 3.141592;
  i = retval;
  return retval;
}

where i is passed by reference. If a copy of a variable was passed as an argument then the original variable wouldn't be modified. Clearly, one needs a way to perfectly forward the types in a tuple into each function argument in a compatible way that preserves the semantics of the program code. Prior to C++11 this was not possible. In C++11, it is provided rvalue references with their associated semantics and templates are used.

Lvalue and Rvalue References

Rvalue references are defined similarly to C++98 references. C++98 references are lvalues that are declared using a single ampersand and must be assigned to an lvalue reference at that point (with one exception). The ampersand is used because it is the address-of operator and a C++98 reference implicitly takes the address-of of the expression it is initialized with. For example, this C/C++ code:

int i = 4;
int * const ref = &i;
// ...
*ref = 46; // i will now be set to 46

is semantically equivalent (in terms of its underlying implementation) to this C++98 code:

int i = 4;
int& ref = i;
// ...
ref = 46; // i will now be set to 46

i.e., any C++98 reference is equivalent (in terms of its underlying implementation) to a constant pointer to whatever its type is. This implies that once a reference's (address) value is set, it cannot be changed –as it is a constant pointer internally.

C++98 references allowed the ugliness of C's pointers to disappear as it allowed one to implicitly pass variables by reference:

void set(int &i, int value)
{
  i = value;
}
// ...
int myvar;
set(myvar, 9);

instead of writing this C equivalent:

void set(int *i, int value)
{
  *i = value;
}
// ...
int myvar;
set(&myvar, 9);

which also required a lot more attention to always correctly using pointers, etc. to avoid misuing them –the likely result of which was a program crash. Thus C++98 references allowed the programmer to:

  • avoid having to explicitly use the address-of operator for any function call arguments that were to be implicitly passed by reference, and,
  • avoid having to explicitly dereference pointer variables in functions being passed as references –as they are typically only manipulated as values within those functions.

Thus in C++98 it is no longer a burden to efficiently and easily use pass-by-reference. No doubt this encourages programmers to use pass-by-reference when it is more efficient and appropriate to do so.

C++98 references are lvalues and can only refer to rvalues with one exception: constant references referring to compiler-created temporary variables (e.g., passed as function arguments). Consider converting the value function parameter in the above example to be a constant lvalue reference:

void set(int &i, int const&value)
{
  i = value;
}
// ...
int myvar;
set(myvar, 9); // NOTE: 9 must become a temporary --an rvalue!!

Perhaps to one's horror the above code correctly compiles and runs! For the compiler to call set(myvar, 9) it needs to create a temporary variable of type int so its address can be taken and passed to the function! The issue here is the temporary variable is an rvalue –not an lvalue. The C++98 ISO committee reviewed this specific issue carefully and resolved that since the reference refers to constant data that such should be legal. Why? In part: it is constant, i.e., its value won't be changing, and, casting away const is bad and can easily have implementation-defined results. Thus, this "violation" is a non-issue. The result of permitting such:

  • it allows more efficient variable passing to functions,
  • it enables compiler to better optimize code, and,
  • it completely removes the burden of writing overloaded versions of functions just so that literals and other rvalues can be passed to arguments taking constant references.

So all was well and good: plenty of cleaner and efficient C++ code was subsequently written and used successfully. However, being a language of absolute efficiency and robustness with high- and low-level styles of coding, there were still problems that caused programmers to twist their code to be efficient. Consider this example code that critics of object-oriented programming (OOP) often cite as an argument against OOP:

class Matrix
{
  public:
    // NOTE: All code below assumes that nrows_ and ncols_
    //       is hard-coded to be 1000 by 1000 for simplicity
    //       here since how to properly write a general matrix
    //       class is not the purpose of this blog post! :-)

    Matrix()
      : nrows_(1000), ncols_(1000), data_(nrows_*ncols_) { }

    // ...

    // Permit access to the sizes...
    std::size_t nrows() const;
    std::size_t ncols() const;

    // Permit access using the function call operator...
    double& operator ()(std::size_t row, std::size_t col);
    double const& operator ()(std::size_t row, std::size_t col) const;

  private:
    std::size_t const nrows_;   // Hard-coded to be 1000
    std::size_t const ncols_;   // Hard-coded to be 1000
    std::vector<double> data_;
};

// Define addition...
Matrix& operator +=(Matrix& a, Matrix const& b)
{
  for (std::size_t i=0; i != b.nrows(); ++i)
    for (std::size_t j=0; j != b.ncols(); ++j)
      a(i,j) += b(i,j);   // No copies: just += each element.

  return a;   // Return a by reference as it was passed!
}

Matrix operator +(Matrix const& a, Matrix const& b)
{
  Matrix retval(a);   // Copy a into retval
  retval += b;        // Add b to retval
  return retval;      // Return retval by value
}

// ...
Matrix a, b, c;
// ...
Matrix d = a + b + c; // Can be very inefficient!

i.e., given three matrices add them (non-destructively) together and place their result in a new matrix. The addition code be written written using two for loops in a procedural manner (assuming the memory was previously allocated, all four matrices are the same size, and we are not trying to optimize anything else):

std::size_t const nrows = a.nrows();
std::size_t ncols = a.ncols();

for (std::size_t i=0; i != nrows; ++i)
  for (std::size_t j=0; j != ncols; ++j)
    d(i,j) = a(i,j) + b(i,j) + c(i,j);

Unfortunately, in OOP languages (except in C++ when using a library that optimizes such using the expression templates technique) this is not the code that is generated by the compiler. Instead the compiler generates code based on the operations being applied to the objects reflect in the expression:

Matrix d = a + b + c;

which does the following (assuming the expression templates code is not written induce optimization):

  1. First a + b is evaluated. This requires calling operator +(a,b) and returns a compiler-created temporary Matrix value. Let's call this value t1.
  2. Now t1 + c is evaluated. This also requires calling operator +(t1,b) again and it too returns a compiler-created temporary Matrix value. Let's call this value t2.
  3. Having evaluated the right-hand size of the initialization (as this is not the assignment operator!) the constructor will invoke the copy constructor for d passing in t2 so its value can be copied.
  4. Finally the compiler destroys all temporary values: t1 and t2.

Clearly the generated code, although it will produce the correct answer, allocates more RAM and will take more time to execute (as there will be more cache misses than hits especially with large matrices). Clearly this is unacceptable. On the other hand, one does want the code to execute according to the functions defined for the classes being used. What should one do? The expression templates technique is a form of C++ template metaprogramming that is considered to be advanced, is there an easier way? The short answer is with C++11, "Yes!" Before doing that, with C++98, many programmers would write:

Matrix d = a;
d += b;
d += c;

instead of Matrix d = a * b * c;. The former is more efficient, the latter is easier to read and understand –not to mention commonly written in actual code. Ideally a programming language should not force the programmer to make such trade-offs.

Important: Using rvalue references does not eliminate the need for the expression templates technique –there are still many things that won't be optimized. The expression templates technique allows one to write code that recognizes how types are used and transform them (to presumably equivalent, more efficient forms). With matrices, such can be used to determine at compile-time an optimal way to perform matrix multiplication (in expressions with varying matrix sizes) such as:

Matrix d = a * b * c;

Moving Stuff Around

In the above example the temporary variables, t1 and t2, were rvalues and the compiler should have somehow reused t1 instead of also making t2. But, the compiler is not human and does not know the programmer's intention or anything about matrices! This means the compiler cannot simply reuse t1 and call operator +=(t1,c) instead: it was told to only call operator +!!.

So how can rvalue references help with expressions like Matrix d = a + b + c;? Through the definition of move constructors and move assignment operators, they allow the programmer to implicitly tell the compiler how to efficient move the contents/state of one rvalue variable to another. This is particularly handy since the compiler actively does create temporary variables when evaluating expressions that the programmer otherwise has no control over! Re-examining Matrix d = a + b + c;:

  1. First the compiler must make a temporary to perform a + b. Let's again call this temporary t1. So there are no changes here.
  2. Next the compiler will perform t1 + c this too will return a copy. Hey, there are no improvements!! Let's stop.

Yes, there are no improvements. To get them, the Matrix class (and the types it internally uses) need to define move constructors and assignment operators. Nicely a C++11 compiler will define default ones –which are of benefit if the underlying types define such (e.g., Matrix' use of std::vector). So, I lied: there were improvements, but, unless one understands the move constructor and move assignment operator, it is hard-to-comprehend what benefits are obtained through the use of rvalues.

Consider this simple array-like class:

#include <algorithm>

class MyArray
{
  public:
    // Default constructor...
    MyArray()
      : size_(s), data_(nullptr)
    {
    }

    // Allow one to specify a size...
    MyArray(std::size_t s)
      : size_(s), data_(new int[s])
    {
    }

    // Define copy constructor...
    // Notice that it accepts an lvalue reference to a constant
    //   of the same type as this class (i.e., MyArray).
    MyArray(MyArray const& a)
      : size_(a.size_), data_(new int[a.size_])
    {
      std::copy(a.data_, a.data_+a.size_, data_);
    }

    // Define move constructor...
    // Notice that it accepts an rvalue reference (&&) to
    //   an instance of the same type as this class (i.e., MyArray).
    // Notice that the data_ in a is simply assigned to this object
    //   and then a.data_ is set to be null. Why? The data is being
    //   MOVED from a to this. a will need to still be properly
    //   destroyed so a.data_ needs to be set to null.
    MyArray(MyArray&& a)
      : size_(a.size_), data_(a.data_)
    {
      a.size = 0;           // Reset size
      a.data_ = nullptr;    // Reset pointer
    }

    // Define copy assignment operator...
    // Notice that exception-safe code is presented here.
    // Notice that making a copy is necessary.
    MyArray& operator =(MyArray const& a)
    {
      MyArray temp(a);               // Make copy of a
      std::swap(size_, temp.size_);  // Swap sizes
      std::swap(data_, temp.data_);  // Swap data pointers

      // Destroy temp as *this is now a copy of a! :-)
      return *this;
    }

    // Define move assignment operator...
    // One could write this without temp to be slightly
    // more efficient. This code reuses the above definitions
    // and keeps the actual movement code in the move
    // constructor only.
    MyArray&& operator =(MyArray&& a)
    {
      MyArray temp(std::move(a));    // Move a into temp
      std::swap(size_, temp.size_);  // Swap sizes
      std::swap(data_, temp.data_);  // Swap data pointers
      return *this;
    }

    // ...

  private:
    std::size_t size_;
    int* data_;
};

Perhaps it is much easier to see that moving is actually moving the data from one object to another! It can only be done with rvalues as rvalues are not to be the receivers of assignment operations: they are intended to only be used as values even though we can also acquire their addresses as they are located somewhere in the computer's RAM.

Notice that std::move(a) was used. Isn't a declared to be an rvalue reference? Yes it is, but, once it acquires a name it has to be treated as an lvalue. The std::move function is C++11's way for a programmer to explicitly make a lvalue an rvalue. Once this is done the novice programmer should not use the variable whose content was moved –as its state has been reset so that its destructor can be executed without error and a minimum of overhead.

Note: The purpose of a named rvalue reference declaration is to ensure that it can only be initialized with an rvalue. It does not mean it is itself an rvalue –although it refers to one!

Let's re-examine:

Matrix d = a + b + c;

which clearly computes a rvalued result which is used to initialize d. Provided the std::vector class has a move constructor (it does) and one is in Matrix then the compiler need not use the copy constructor: it can use the move constructor instead. This implies the data in t2 will be moved/transferred to d directly –not copied into d. This implies one fewer Matrix has to be dynamically allocated in the expression. Awesome!

Concerning the expression:

a + b + c;

all we've told the compiler is how to handle this function:

Matrix operator +(Matrix const& a, Matrix const& b);

which states given two Matrix instances return a new one. Returning any value by value from a function implies that its value can be treated as an rvalue. Thus, instead of making a copy of the Matrix returned, copying the contents over, and then destroying the local variable, the compiler is free to simply move it from the local variable to another one that is available –since the local variable will be destroyed.

One can do even better by telling the compiler how to add values that are rvalues. Only looking at operator + and making simple changes:

Matrix operator +(Matrix const& a, Matrix const& b); // as before

// [2]
inline Matrix operator +(Matrix&& a, Matrix const& b)
{
  Matrix retval(std::move(a));   // Move a into retval
  retval += b;                   // Add b to retval
  return retval;                 // Returns as an rvalue
  // Some may prefer: return std::move(a += b);

  // In general don't use std::move() in a return statement
  // that returns a variable. Simply write: return retval;
  // This will allow compilers to choose between RVO or moving.
}

// [3]
inline Matrix operator +(Matrix const& a, Matrix&& b)
{
  // This works since matrix addition is associative...
  return operator +(std::move(b), a); // call [2] above
}

// [4]
inline Matrix operator +(Matrix&& a, Matrix&& b)
{
  return operator +(std::move(a), b); // call [2] above
}

With these defined when rvalues are being added the cost of creating a new temporary matrix state (i.e., the RAM to hold the 1000 by 1000 matrix!) is now completely avoided. The compiler does still create the temporary variables and needs to (as they are allocated on the stack), but, the overheads have been sliced to a bare minimum.

Revisiting what happens when Matrix d = a + b + c; is run with the above additions and changes (assuming std::vector behaves like MyArray: i.e., moves null out the internal pointers):

  1. First a + b is calculated and a full temporary matrix is returned as temporary t1. No change here.
  2. Now t1 + c is performed. Since t1 is an rvalue as it is a compiler-created temporary, it invokes:
    Matrix operator +(Matrix&& a, Matrix const& b)

    which avoids creating a new temporary with its own separate storage. Instead it creates a new temporary, t2, by move constructing it from the stored result in retval which was moved from t1 (given the above code). The savings here are huge: no unnecessary additional RAM was allocated on the heap!

  3. Finally, t2 is moved into d's constructor without allocating any additional RAM on the heap!

So, if you're counting temporaries one might argue that there is not much improvement. But if one is counting the amount of RAM allocated and copying there are major improvements. Essentially, only one Matrix is dynamically allocated in RAM (to hold the initial temporary) and that winds up ultimately being moved into d.

Note: C++ can elide constructors and perform other operations to further optimize the above code. The programmer can do additional things too. If one will not be using the a, b, and c variables again, you can help the compiler out by having all variables become rvalues (which would move everything –no copies):

Matrix d = std::move(a) + std::move(b) + std::move(c);

Perfect Forwarding

If the type of a function parameter is an rvalue reference to a template type parameter for a function template then the type parameter is deduced to be: an lvalue reference if an lvalue is passed, or, a value type (which is an rvalue and, thus, can be treated as an rvalue reference by the compiler) assuming the overall call is valid.

Important: Recall that only with function templates, the compiler will deduce the (template) types of its arguments. This specific type of deduction mentioned here only works for function templates when an rvalue reference to a type that is a template parameter is used. If the argument is an rvalue to a non-template type this specific deduction will not occur.

Consider:

struct Silly { };

template <typename T>
void func(T&& t);

int main()
{
  Silly s;
  func(s);             //    Lvalue. Argument type: Silly&
  func(Silly());       // Temporary. Argument type: Silly&&
  func(std::move(s));  //  Rvalue forced. Arg type: Silly&&
}

By using this deductive property, C++ can preserve the argument types while forwarding them to another function. To do this, one needs to use std::forward<arg_type>(arg_variable_name) which takes care of lvalue/rvalue issues when doing such. All that is needed for one to exploit such is to simply use it.

Enough sidebar issues! Let's figure out how to efficiently pass a std::tuple to regular functions and then functors!


Writing apply() To Not Use std::tuple

Using variadic templates solving this problem is straight-forward:

#include <tuple>
#include <functional>
  // NOTE: g++ v4.5 needs <functional> for std::result_of
  //       g++ v4.7 does not

template <typename Op, typename... Args>
auto apply(Op&& op, Args&&... args)
  -> typename std::result_of<Op(Args...)>::type
{
  return op(args...);
}

It is straight-forward for this reasson: every argument from the second passed to apply is in the Args parameter pack in the order and form it needs to be. This allowed the arguments to be expanded in the return statement as-is. The only remaining item to address here is to use std::forward on those arguments:

#include <tuple>
#include <functional>
  // NOTE: g++ v4.5 needs <functional> for std::result_of
  //       g++ v4.7 does not

template <typename Op, typename... Args>
auto apply(Op&& op, Args&&... args)
  -> typename std::result_of<Op(Args...)>::type
{
  return op( std::forward<Args>(args)... );
}

to ensure perfect forwarding.


Writing apply_tuple() To Use std::tuple

A Problem

Unfortunately, the above solution cannot be trivially applied to std::tuple. When template parameter packs are expanded they don't have the ability to perform the arithmetic (e.g., N+1) needed to unroll the std::tuple elements for each function argument. The latter is needed since the std::get template function requires each index as a compile-time constant template argument, if parameter pack expansion is to be used then a way must be found to increment the value passed to std::get as each tuple element is extracted.

Although parameter pack expansions cannot do arithmetic, one can pre-compute the indices it will need. Why is this useful? If parameter pack expansions are done on expressions involving parameter pack types and variables, then they will all be expanded together as a sequence. For example, given the tuple:

std::tuple{1,2.2,'a'}

whose type is std::tuple<int,double,char> and this:

template <std::size_t...>
struct indices;
// ...
indices<0,1,2>

then it will be possible to expand both (i.e., indices and std::tuple types's template arguments, i.e., the type of each element in the tuple) at the same time using parameter pack expansion. Hopefully by solving this problem a solution similar to the non-tuple version will be possible!

Generating A List Of Indices

By defining a compile-time only container to hold the list of indices:

template <std::size_t...>
struct indices;

struct templates make_indices, make_indices, and make_indices_impl can be defined. What should they contain? For each type, a corresponding index needs to be generated that is zero-based. This implies that when processing each tuple element type from left to right the next highest index needs to be appended to the indices instance's template argument std::size_t constants. This will need to be defined recursively using make_indices as an easy-to-use type to start the indices' generation and hold the result in a typedef called type:

#include <cstddef> // for std::size_t

// Define holder for indices...
template <std::size_t...>
struct indices;

// Define adding the Nth index...
// Notice one argument, Type, is discarded each recursive call.
// Notice N is added to the end of Indices... .
// Notice N is incremented with the recursion.
template <std::size_t N, typename Indices, typename... Types>
struct make_indices_impl;

template <
  std::size_t N,
  std::size_t... Indices,
  typename Type,
  typename... Types
>
struct make_indices_impl<N, indices<Indices...>, Type, Types...>
{
  typedef
    typename make_indices_impl<
      N+1,
      indices<Indices...,N>,
      Types...
    >::type
    type
  ;
};

// Define adding the last index...
// Notice no Type or Types... are left.
// Notice the full solution is emitted into the container.
template <std::size_t N, std::size_t... Indices>
struct make_indices_impl<N, indices<Indices...>>
{
  typedef indices<Indices...> type;
};

// Compute the indices starting from zero...
// Notice indices container is empty.
// Notice Types... will be all of the tuple element types.
// Notice this refers to the full solution (i.e., via ::type).
template <std::size_t N, typename... Types>
struct make_indices
{
  typedef
    typename make_indices_impl<0, indices<>, Types...>::type
    type
  ;
};

Define apply_tuple To Work With std::tuple

Finally apply can be written but we need a helper function. Start by writing apply to do everything it can do:

template <
  typename Op,
  typename... OpArgs,
  typename Indices = typename make_indices<0, OpArgs...>::type,
  template <typename...> class T = std::tuple
>
auto apply_tuple(Op&& op, T<OpArgs...>&& t)
  -> typename std::result_of<Op(OpArgs...)>::type
{
  return apply_tuple_impl<Indices>::apply_tuple(
    std::forward<Op>(op),
    std::forward<T<OpArgs...>>(t)
  );
}

Unfortunately, Indices is not a parameter pack and the indices inside of Indices and OpArgs must both be parameter packs so that they can be expanded together:

[snip]
{
  return op( std::forward<OpArgs>(std::get<Indices>(t))... );
}

It is straight-forward to use partial template specialization with a helper class to lift the std::size_t… parameter pack out of its container and this is why apply_tuple_impl has been used above. Notice that all function arguments are simply forwarded. The "prototype" case for apply_tuple_impl must be mentioned before using the partial template specialization technique:

template <typename Indices>
struct apply_tuple_impl;

Now Indices can be lifted out of the template template parameter I<Indices...> which will be inferred to be indices –the container type:

template <
  template <std::size_t...> class I,
  std::size_t... Indices
>
struct apply_tuple_impl<I<Indices...>>
{
  template <
    typename Op,
    typename... OpArgs,
    template <typename...> class T = std::tuple
  >
  static auto apply_tuple(Op&& op, T<OpArgs...>&& t)
    -> typename std::result_of<Op(OpArgs...)>::type
  {
    return op( std::forward<OpArgs>(std::get<Indices>(t))... );
  }
};

which is the desired solution.


Testing The Solution

The following code will nicely (partially) test the solution:

//=================================================================

#include <cstddef>
#include <tuple>
#include <functional> // Only if needed

// Definition of indices
// Definition of make_indices_impl
// Definition of make_indices
// Definition of apply
// Definition of apply_tuple_impl
// Definition of apply_tuple

//=================================================================

#include <iostream>

struct add { int operator ()(int x, int y) { return x + y; } };
int add2(int x, int y) { return x + y; }

struct A1 { int operator ()() { return -1; } };
int A2() { return -2; }

struct B1 { int operator ()(int x) const { return x; } };
int B2(int x) { return x; }

int main()
{
  using namespace std;

  // Test apply() with no arguments...
  cout
    << "A1()() = "
    << apply(A1())
    << "."
    << endl
  ;
  cout
    << "A2() = "
    << apply(A2)
    << "."
    << endl
  ;

  // Test apply() with one argument...
  cout
    << "B1()(-3) = "
    << apply(B1(),-3)
    << "."
    << endl
  ;
  cout
    << "B2(-4) = "
    << apply(B2,-4)
    << "."
    << endl
  ;

/*
  // Define C1 and C2 elsewhere for two arguments and
  // test apply() with such...
  cout
    << "C1()(-3,-10) = "
    << apply(C1(),-3,-10)
    << "."
    << endl
  ;
  cout
    << "C2(-4,-10) = "
    << apply(C2,-4,-10)
    << "."
    << endl
  ;

  // etc.
*/



  // Test apply_tuple() with no arguments...
  cout
    << "A1()() = "
    << apply_tuple(A1(),std::tuple<>())
    << "."
    << endl
  ;
  cout
    << "A2() = "
    << apply_tuple(A2,std::tuple<>())
    << "."
    << endl
  ;

  // Test apply_tuple() with one argument...
  cout
    << "B1()(-3) = "
    << apply_tuple(B1(),make_tuple(-3))
    << "."
    << endl
  ;
  cout << "B2(-3) = "
    << apply_tuple(B2,make_tuple(-4))
    << "."
    << endl
  ;

/*
  // Define C1 and C2 elsewhere for two arguments and
  // test apply_tuple() with such...
  cout
    << "C1()(-3,-10) = "
    << apply_tuple(C1(),make_tuple(-3,-10))
    << "."
    << endl
  ;
  cout
    << "C2(-4,-10) = "
    << apply_tuple(C2,make_tuple(-4,-10))
    << "."
    << endl
  ;

  // etc.
*/


  return 0;
}

It is wise to use different names for apply and apply_tuple so that tuples themselves can be directly applied to functions/functors.


Closing Comments

As far as I can tell the C++11 standard did not include a make_indices utility. Fortunately, it is not hard to define such to provide a straight-forward way to extract tuple values while performing a parameter pack expansion. That said, the solution is not obvious given the norms of coding and therefore can be hard to figure out. For those of you thinking, "There just has to be a way," I wrote this post! πŸ™‚

Should you notice any errors or issues, kindly let me know.

18 Replies to “Applying std::tuple To Functors Efficiently”

  1. Fantastic! though I see people using part of techniques mentioned in this article here and there, this is the first one that clearly covers the whole picture. Thanks!

  2. Isn't the convenience function for making tuples std::make_tuple, and not std::make_pair? I thought std::make_pair was only for making std::pair objects.

  3. I think the tuple apply implementation so far has a problem because it only accepts rvalues. The reference type contraction apparently only works if the type is a template of the function, so

    template
    void foo(T&& t)
    {
    }

    can be called with an lvalue, but
    template<typename U, template class T>
    void foo(T&& t)
    {
    }

    cannot. Do you have a good solution for this? Fallback to overloading on this argument?

    Best,
    Malte

    1. I am not sure what you are asking about since "template class T" in a template argument is illegal syntax (i.e., from in your second comment since the <> text was eaten).

      If you meant to use a template template argument called T, template template arguments are not types and therefore T cannot be passed to foo() as a normal function argument (without providing all template arguments for T which then turns it into a type –which my solution covers).

      Hmmm, if you are referring to the ability to rely on template argument deduction when using non-type template parameters (e.g., say foo uses non-type template parameters and it is passed something which would otherwise be deduced when not using apply()) that is beyond the scope of what I was aiming to achieve or test. Specifically, if one writes:

      template <int i>
      struct S { };

      template <int i>
      void bar(S<i> s)
      {
      }

      struct bar2
      {
      template <int i>
      void operator ()(S<i> s) const
      {
      }
      };

      // …
      S<4> s;
      bar(s); // works via template arg. deduction
      apply(bar, s); // fails since deduction must occur before apply is invoked
      apply(bar2(), s); // works since the deduction is deferred

      So for my apply() and apply_tuple() to work with non-type template parameters, one would need to write an overload of apply()/apply_tuple() for those non-type template parameters, or, wrap the code in a function object.

    2. I think WordPress cropped quite a lot from the code I posted, also when I made the second comment, sorry for not making clear what I meant.

      The problem I'm having is, that I can't call your version of apply_tuple on lvalues.

      So

      std::tuple t = make_tuple(1,3);
      int a = apply_tuple(add,t);

      does not work, gcc complains that t is not an rvalue reference ( cannot bind β€˜std::tuple’ lvalue to β€˜std::tuple&&’). I think this is because in your code you are having the following signature

      auto apply_tuple(Op&& op, T&& t) -> [….]

      normally a T&& t would collapse to to a tuple reference if called with an lvalue type (so a foo(T&& t) called with the type T=A& collapses to an instanciation foo(A& t)), but T is an template so that this does not work. The only solution I see to that is overloading the apply_tuple function to also have:

      auto apply_tuple(Op&& op, T& t) -> [….]

      I wanted to know whether you see a better solution, i.e somehow avoiding to make T a template?

    3. Ok talking about templates is really cumbersome over WordPress, in the comment above,

      auto apply_tuple(Op&& op, T&& t) -> [….]

      should read (where { is denoting
      auto apply_tuple(Op&& op, T{OpArgs…}&& t) -> [….]

  4. It seems that my second example got garbled up by wordpress its supposed to read

    template< typename U, template class T >
    void foo(T && t)
    {
    }

    1. To save WordPress comment hassles my reply is in the form of a new post here. πŸ™‚

      That said, the post does not solve the (separate) automatic type deduction issue noted in my previous comment, i.e., using apply() with the bar() function.

  5. A very nice article. I think I've spotted a few typos which might help a great reference post. (Apologies if I am mistaken.)

    C++98 references are lvalues and can only refer to lvalues with one exception
    should be
    C++98 references are lvalues and can only refer to rvalues with one exception

    By using this deductive properly C++ can preserve the
    should be
    By using this deductive property C++ can preserve the

    For each type type,
    should be
    For each type,

    Are the 2 code snippets following…
    "Finally apply can be written but we need a helper function. Start by writing apply to do everything it can do:"
    in the wrong order?

    1. Ack typos! Thanks for the corrections! πŸ™‚

      First edit: Fixed ("rvalues" not "lvalues).
      Second edit: Fixed ("property," not "properly")
      Third edit: Redundant "type" removed.

      Last item: I don't see how they are in the wrong order. The second snippet shows what is desired, but, it cannot be obtained until Indices and OpArg can be expanded together. The first snippet is needed code. If you are talking about the code implementation, the second will appear before the first. You can see this clearly in my follow-up article, Testing If std::tuple Is Being Applied To Functors Efficiently, where the code is given.

  6. [Editor's Note: The code in this post was fixed due to WordPress stripping out template code and a simple error. — Paul Preney]

    Can polymorphic functions also be applied? I tried the following, but get a "no matching function call…" error:

    template <typename F, typename S>
    auto addt(std::pair<F,S> p) -> decltype(p.first + p.second) {
    return p.first + p.second;
    }

    void test() {
    auto tup3 = std::make_tuple(std::make_pair(1,2));
    auto res = apply_tuple(addt, tup3);
    }

Leave a Reply

Your email address will not be published. Required fields are marked *