Reduce

Nick Athanasiou

Copyright © 2016 Nick Athanasiou

Distributed under the Apache License Version 2.0 (See accompanying file LICENSE or here)

Table of contents

  1. Introduction
    • The functional genome of reduce
    • C++ fold expressions
      • Syntax
      • Identity elements
      • Examples
  2. Rationale
    • What
    • How
  3. User Manual
    • Syntax
    • Single operant folding
    • Defining identity elements
  4. Examples

1. Introduction

Reduce is a header only library that provides fold expressions for arbitrary callables. Properties of the resulting expressions include:

  • Compile time evaluation (if the callables involved have such an ability)
  • Lazy evaluation (we can step through intermediate computations in a generator fashion)
  • Stateful callables

1.1 The functional genome of reduce

The concept of reducing is central to the "programming experience" in functional languages. A formal definition would go like this:

In functional programming, fold – also known variously as reduce, accumulate, aggregate, compress, or inject – refers to a family of higher-order functions that analyze a recursive data structure and through use of a given combining operation, recombine the results of recursively processing its constituent parts, building up a return value.

To understand this definition, here's some info on the terminology:

  • higher order function: functions that can take functions as parameters and/or return functions as return values
  • recursive data structure: also known as recursively-defined, it's a data type for values that may contain other values of the same type. For an example a list in Haskell is (either an empty list or) a head element of some type followed by a tail element that's a list of the same type.
  • combining operation: we'll be calling it a reducing function or reducers; it's any function which takes a partial result and a new piece of information to produce a new result.
  • accumulator: this is included in the definition as partial result; essentially it "contains" the type of the value we reduce our recursive data structure to.

So for example if we were to reduce the list [1, 2, 3, 4, 5] using the operator (+) we'd get the sum of the list i.e. 15. Depending on where we "fold from" we get left or right folds. Let's look at some formal definitions for the types of these operations:

-- type of the left fold operation
foldl :: (b -> a -> b) -> b -> [a] -> b

The above informs us that foldl is a (higher order) function that accepts:

  1. a function (b -> a -> b) i.e. a function that takes an accumulator of type b and a value of type a and returns a partial result of type b
  2. an accumulator of type b
  3. a list of a

and returns a reduced value of type b. Let's make a visualization of the left fold operation for this expression

-- add numbers 1 to 5 with a zero accumulator
foldl (+) 0 [1..5]
main = do
    putStrLn $ foldl (\x y -> concat ["(",x,"+",y,")"]) "0" (map show [1..5])

Output: ['(((((0+1)+2)+3)+4)+5)']

Symmetric to the above, is the type definition for the right fold operation:

foldr :: (a -> b -> b) -> b -> [a] -> b

Notice however the type of the reducer. We say that the accumulator b "consumes" the operants from the right, so the order of parameters changes (current value on the left, accumulator on the right). This is important when we define custom accumulators that may not be commutative, result-wise or type-wise. Visualizing the expression

foldr (+) 0 [1..5]

we get:

main = do
    putStrLn $ foldr (\x y -> concat ["(",x,"+",y,")"]) "0" (map show [1..5])

Output: ['(1+(2+(3+(4+(5+0)))))']

1.2 C++ fold expressions

Up until the "1z" version, C++ expressed the reduction logic through the standard library function std::accumulate. Common knowledge indicates that accumulate was highly underestimated and it's use decayed to the mandane task of summing, with a choice for a custom binary operator, the contents of a container. While this is just a fragment of its "abilities", the truth is that accumulate cannot express folding in compile time contexts, the pure expression logic underlying reductions and handle variadicness of input.

C++1z ammends this by adding support for fold expressions. According to the related proposal:

A fold expression performs a fold of a template parameter pack ([temp.variadic]) over a binary operator.

Syntax

Let "e" $= e_1, e_2, \dotso, e_n$ be an expression that contains an unexpanded parameter pack and $\otimes$ the fold operator, then fold expressions have the form:

  • Unary left folds

    $(\dotso\; \otimes\; e)$

    which expands to $ (((e_1 \otimes e_2) \dotso ) \otimes e_n)$


  • Unary right folds

    $(e\; \otimes\; \dotso)$

    which expands to $(e_1 \otimes ( \dotso (e_{n-1} \otimes e_n)))$


If we add a non pack argument on the dots' side of each of the above we get their binary versions that have identical expansion behavior:

  • Binary left folds

    $(a \otimes\; \dotso\; \otimes\; e)$

    which expands to $ (((a \otimes e_1) \dotso ) \otimes e_n)$


  • Binary right folds

    $(e\; \otimes\; \dotso\; \otimes\; a)$

    which expands to $(e_1 \otimes ( \dotso (e_n \otimes a)))$


The "" operator can be one of:

+  -  *  /  %  ^  &  |  ~  =  <  >  <<  >>
+=  -=  *=  /=  %=  ^=  &=  |=  <<=  >>=
==  !=  <=  >=  &&  ||  ,  .*  ->*

Identity elements

The fold of an empty parameter pack evaluates to a specific value. The choice of value depends on the operator. The set of operators and their empty expansions are in the table below.

Operator Value when parameter pack is empty
$*$ 1
$+$ 0
$&$ -1
$\mid$ 0
$&&$ true
$\parallel$ false
$,$ void()

If a fold of an empty parameter pack is produced for any other operator, the program is ill-formed

Examples

_

/// Summing the contents of an array at compile time

#include <array>
#include <utility>
#include <iostream>

using namespace std; 

namespace detail
{
    template <class... Ts>
    constexpr auto sum_(Ts&&... args)
    {
        return (args + ...);
    }

    template <typename T, size_t N, size_t... Is>
    constexpr T sum_impl(array<T, N> const &arr, index_sequence<Is...>)
    {
        return sum_(get<Is>(arr)...);
    }
}

template <typename T, size_t N>
constexpr T sum(array<T, N> const &arr)
{
    return detail::sum_impl(arr, make_index_sequence<N>{});
} 

int main()
{
    constexpr array<int, 4> arr1{ { 1, 1, 2, 3 } };
    constexpr array<int, 0> arr2{ };

    cout << integral_constant<int, sum(arr1)>{} << endl;
    cout << integral_constant<int, sum(arr2)>{} << endl;
}

Output : ['7', '0']


// iterating over different types

#include <iostream>

struct Window {
    void show() { std::cout << "showing Window\n"; }
} win;

struct Widget {
    void show(){ std::cout << "showing Widget\n"; }
} wid;

struct Toolbar {
    void show(){ std::cout << "showing Toolbar\n"; }
} tlb;

int main()
{
    auto printer = [](auto&&... args) { (args.show(), ...); };

    printer(win, wid, tlb);
    printer(); // remember void() ? 
}

Output: ['showing Window', 'showing Widget', 'showing Toolbar']


// a for_each lambda

#include <iostream>

struct Printer 
{
    template <class T> 
    void operator()(T&& arg) { std::cout << arg; }
};

int main()
{
    auto ForEach = [](auto&& fun, auto&&... args) 
    { 
        (..., std::forward<decltype(fun)>(fun)(std::forward<decltype(args)>(args)));
    };

    ForEach(Printer{}, 0.5, " a loaf is better than ", 0, " bread", '\n');
}

Output: ['0.5 a loaf is better than 0 bread']


// a modern summing style - showcasing the unfolding properties of std::apply

#include <array>
#include <tuple>
#include <utility>
#include <iostream>
#include <type_traits>

namespace cpp17
{
    template< class F, class... ArgTypes>
    std::result_of_t<F&&(ArgTypes&&...)> invoke(F&& f, ArgTypes&&... args);

    namespace detail 
    {
        template <class F, class Tuple, std::size_t... I>
        constexpr decltype(auto) apply_impl(F&& f, Tuple&& t, std::index_sequence<I...>) 
        {
#if 1
            return (std::forward<F>(f))(std::get<I>(std::forward<Tuple>(t))...);
#else
            return invoke(std::forward<F>(f), std::get<I>(std::forward<Tuple>(t))...);
#endif // TODO: Elaborate on the inconsistency of invoke not being constexpr
        }
    }  

    template <class F, class Tuple>
    constexpr decltype(auto) apply(F&& f, Tuple&& t)
    {
        return detail::apply_impl(std::forward<F>(f), std::forward<Tuple>(t),
            std::make_index_sequence<std::tuple_size<std::decay_t<Tuple>>{}>{});
    }
}

struct Summer 
{
    template <class... Ts> constexpr auto operator()(Ts&&... args) { return (args + ...); }
};

int main()
{
    constexpr std::array<int, 4>        arr{ { 1, 2, 3 } };
    constexpr std::tuple<int, int, int> tup{   1, 2, 3   }; 

    std::cout << "Array sum : " << cpp17::apply(Summer{}, arr) << std::endl;
    std::cout << "Tuple sum : " << cpp17::apply(Summer{}, tup) << std::endl;
}

Output: ['Array sum : 6', 'Tuple sum : 6']


// test whether a type H exists in Ts (compare with a "traditional" implementation)

template <class H, class... Ts>
constexpr bool h_in_ts = (... || std::is_same<H, Ts>::value);

// count results of type queries

#include <type_traits>
#include <iostream>

using namespace std; 

// count the times a predicate P is satisfied in a typelist L
template <template<class> class P, class... L>
constexpr size_t count_if = (P<L>::value + ...); 

// count the occurences of a type V in a typelist L
template <class V, class... L>
constexpr size_t count = (is_same<V, L>::value + ...); 

int main()
{
    cout << count_if <is_integral, float, unsigned, int, double, long> << endl;
    cout << count<float, unsigned, int, double, float> << endl;
}

Output: ['3', '1']


// assign to a given std::array index and beyond, from a list of arguments

#include <array>
#include <utility>
#include <iostream>

using namespace std; 

template <class T, size_t N>
struct A
{
    array<T, N> arr;

    template <class... Ts, size_t... Is>
    void set_from_list(size_t offset, Ts&&... vals)
    {
        return set_from_list(
            offset, index_sequence_for<Ts...>{}, forward<Ts>(vals)...); 
    }

private:
    template <class... Ts, size_t... Is>
    void set_from_list(size_t offset, index_sequence<Is...>&&, Ts&&... vals)
    {
        ((arr[offset + Is] = vals), ...); 
        //             ^^    ^^^^
        // two or more pack aruments can exist in the expression
        // provided they are of the same **cardinality**
    }
};

int main()
{
    A<int, 5> a{{{1, 2, 3, 4, 5}}};

    // the length of the list must equal the length of remaining array elements
    a.set_from_list(2, 6, 6, 6); 

    printf("{%d, %d, %d, %d, %d}", 
        get<0>(a.arr), get<1>(a.arr), get<2>(a.arr), get<3>(a.arr), get<4>(a.arr)); 
}

Output: ['{1, 2, 6, 6, 6}']


2. Rationale

A somewhat limiting feature of c++ fold expressions, is that they’re only available for certain operators, so doing

$(\dotso\; \otimes\; e)$

is not allowed for arbitrary $\otimes$. Reduce works around this by creating add-hoc expression templates for arbitrary operators using thin wrapper types that can be parametrized by the operant/operator pair. An immediate side-effect of this method is that the expression's result can be accessed eagerly, belatedly or lazily (in a generator fashion).

For more elaboration on the way this is done check these resources:

Main design goals in developing reduce are to :

  • Maintain (or at least "approximate") the logic and terminology found in functional programming.
  • Allow compile time computations.

3. User Manual

3.1 Syntax

The code is in the fld namespace, which will be omitted here for the sake of brevity. The focal points of the library are the functions

  • foldl
  • foldr

that behave similarly to their Haskell counterparts. A syntactic difference is that the accumulator is not explicit, even though it can be provided. If you don't provide one the left or rightmost element (for foldl and foldr respectively) becomes it. To form a fold epxression we use the following syntax

$foldr\;(Op,\; a_1,\; \dotso,\; a_n,\; acc)$

$foldl\;(Op,\; acc\;, a_1,\; \dotso,\; a_n)$

where:

  • $[a_1,\; \dotso\;, a_n]$ are the elements we fold
  • acc is the accumulator
  • Op is the reducer. It can be any callable object, temporary or not, provided it can be called as
    • $Op\;(acc,\; a_i)$, for left folds
    • $Op\;(a_i,\; acc)$, for right folds

Calling these functions has no side effects but the formation of the expression. Given that expression we can either:

  • call yield() to evaluate the expression
  • use the expression's iterators to step through the computation process

3.2 Single operant folding

When a single element is passed to the fold functions it is returned as given (we assume the accumulator, hence the partial result as well, equals the final result since no more computation steps can be defined).

3.3 Defining identity elements

To define custom identity elements one has to simply provide a no argument version of the reducer:

  • $Op\;(\;)$

this way the user can customize the behavior in corner cases, e.g. reducing the max of an empty sequence (the user could define std::numeric_limits<int>::min() to be the desired value) or cases where the types involved don't mandate an identity element for their algebra. Defining identity elements is optional (or even unnecessary when there are no "empty folds" instantiations).

4. Examples

#include <string>
#include <sstream>
#include <iostream>
#include "../include/boost/reduce.h"


template <class T>
std::string Stringify(T const& value)
{
    std::stringstream ss;
    ss << value;
    return ss.str();
}

struct JoinAc
{
    template <class T1, class T2>
    std::string& operator()(T1 const& lhs, T2& rhs)
    {
        return (rhs += Stringify(lhs));
    }

    template <class T1, class T2>
    std::string operator()(T1 const& lhs, T2&& rhs)
    {
        return (rhs + Stringify(lhs));
    }

    std::string operator()() { return std::string("identity element"); }
};

int main()
{
    using namespace std::rel_ops;

    std::cout << "\nSTRING TEST\n===============\n";

    // a. Multiple operants
    std::cout << "\t multiple operants\n";
    std::string acuml;
    JoinAc joiner;

    auto expr1 =
        fld::foldr(joiner, std::string(" < in the bush >"), 10,
                   std::string(" < bird in the hand, is worth > "), 1, acuml);

    std::cout << "Yielded result\n------------\n" << expr1.yield() << std::endl;

    auto expr2 = fld::foldr(JoinAc{}, std::string(" < in the bush >"), 10,
                            std::string(" < bird in the hand, is worth > "), 1,
                            std::string{});

    std::cout << "Lazy result\n------------\n";
    for (auto&& elem : expr2)
    {
        std::cout << elem << std::endl;
    }

    // b. One operant
    std::cout << "\t one operant\n";
    std::string acum("default string acum");
    JoinAc joine;

    auto exp1 = fld::foldr(joine, 10);

    std::cout << "Yielded result\n------------\n" << exp1.yield() << std::endl;

    auto exp2 = fld::foldr(JoinAc{}, acum);

    std::cout << "Lazy result\n------------\n";
    for (auto&& elem : exp2)
    {
        std::cout << elem << std::endl;
    }

    // c. Zero operants
    std::cout << "\t zero operants\n";
    JoinAc join;

    auto ex1 = fld::foldr(join);

    std::cout << "Yielded result\n------------\n" << ex1.yield() << std::endl;

    auto ex2 = fld::foldr(JoinAc{});

    std::cout << "Lazy result\n------------\n";
    for (auto&& elem : ex2)
    {
        std::cout << elem << std::endl;
    }
    std::cout << std::endl;
}

#include <iostream>
#include "../include/boost/reduce.h"

struct Max
{
    template <class T1, class T2>
    constexpr auto operator()(T1&& lhs, T2&& rhs)
    {
        return lhs > rhs ? std::forward<T1>(lhs) : std::forward<T2>(rhs);
    }

    constexpr auto operator()()
    {
        return 0;
    }
};

int main()
{
    using namespace std::rel_ops;

    std::cout << "NUMBER TEST\n===============\n";

    // a. Multiple operants
    std::cout << "\t multiple operants\n";
    int num(0);
    Max mx;

    auto xpr1 = fld::foldr(mx, 2, 5, 7, 5, 7, num);

    std::cout << "Yielded result\n------------\n" << xpr1.yield() << std::endl;

    auto xpr2 = fld::foldr(Max{}, 2, 5, 7, 5, 7);

    std::cout << "Lazy result\n------------\n";
    for (auto&& elem : xpr2)
    {
        std::cout << elem << ", ";
    }
    std::cout << std::endl;

    // b. One operant
    std::cout << "\t one operant\n";
    int numa(99);
    Max mxe;

    auto xp1 = fld::foldr(mxe, 10);

    std::cout << "Yielded result\n------------\n" << xp1.yield() << std::endl;

    auto xp2 = fld::foldr(Max{}, numa);

    std::cout << "Lazy result\n------------\n";
    for (auto&& elem : xp2)
    {
        std::cout << elem << std::endl;
    }

    // c. Zero operants
    std::cout << "\t zero operants\n";
    Max mix;

    auto x1 = fld::foldr(mix);

    std::cout << "Yielded result\n------------\n" << x1.yield() << std::endl;

    auto x2 = fld::foldr(Max{});

    std::cout << "Lazy result\n------------\n";
    for (auto&& elem : x2)
    {
        std::cout << elem << std::endl;
    }
}