Fold (higher-order function)
From Wikipedia, the free encyclopedia
In functional programming, fold, also known variously as reduce, accumulate, compress or inject, is a family of higher-order functions that process a data structure in some order and build up a return value. Typically, a fold deals with two things: a combining function, and a data structure, typically a list of elements. The fold then proceeds to combine elements of the data structure using the function in some systematic way.
Folds are in a sense dual to unfolds, which take a starting value and apply a function to it repeatedly to generate a data structure, whereas a fold applies a function repeatedly to a data structure and reduces it to a single result value. An unfold is an anamorphism, whereas a fold is a catamorphism.
Folds on lists
The folding of the list [1,2,3,4,5] with the addition operator would result in 15, the sum of the elements of the list [1,2,3,4,5]. To a rough approximation, one can think of the fold as replacing the commas in the list with the + operation, giving 1+2+3+4+5.
In the example above, + is an associative operation, so it is irrelevant how the addition is parenthesized. In the general case of non-associative binary functions the order in which the elements are combined matters. On lists, there are two obvious ways to carry this out: either by recursively combining the first element with the results of combining the rest (called a right fold) or by recursively combining the results of combining all but the last element with the last one (called a left fold). With a right fold, the sum would be parenthesized as 1 + (2 + (3 + (4 + 5))), whereas with a left fold it would be parenthesized as (((1 + 2) + 3) + 4) + 5.
In practice, it is convenient and natural to have an initial value which in the case of a right fold is used when one reaches the end of the list, and in the case of a left fold is what is initially combined with the first element of the list. In the example above, the value 0 (the additive identity) would be chosen as an initial value, giving 1 + (2 + (3 + (4 + (5 + 0)))) for the right fold, and ((((0 + 1) + 2) + 3) + 4) + 5 for the left fold.
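As a small illustration in Haskell, assuming only the standard Prelude functions foldr and foldl, a non-associative operator such as subtraction makes the two parenthesizations visibly different. The names rightDiff, leftDiff, rightSum and leftSum are illustrative only.

```haskell
-- Right fold: 1 - (2 - (3 - (4 - (5 - 0))))
rightDiff :: Int
rightDiff = foldr (-) 0 [1, 2, 3, 4, 5]   -- evaluates to 3

-- Left fold: ((((0 - 1) - 2) - 3) - 4) - 5
leftDiff :: Int
leftDiff = foldl (-) 0 [1, 2, 3, 4, 5]    -- evaluates to -15

-- With an associative operator such as (+) and its identity 0,
-- both folds give the same result.
rightSum, leftSum :: Int
rightSum = foldr (+) 0 [1, 2, 3, 4, 5]    -- evaluates to 15
leftSum  = foldl (+) 0 [1, 2, 3, 4, 5]    -- evaluates to 15
```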
List folds as structural transformations
Folds can be viewed as a mechanism for replacing the structural components of a data structure with functions and values in a regular way. In many languages, lists are built up from two primitives: any list is either the empty list, commonly called nil, or it is a list constructed by prepending an element to the start of some other list, which we call a cons. The empty list and the cons operation are written as [] and (:) (colon) in Haskell. One can view a right fold as substituting the nil at the end of the list with a specific value, and each cons with a specific other function. For example, the list [1,2,3], which is 1 : (2 : (3 : [])), is transformed by foldr f z into f 1 (f 2 (f 3 z)).
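In the case of a left fold, the structural transformation being performed is somewhat less natural, but is still quite regular: foldl f z transforms the list 1 : (2 : (3 : [])) into f (f (f z 1) 2) 3, so the nesting leans to the left and the initial value z appears at the innermost position.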
These transformations illustrate the names left and right fold. They also highlight the fact that foldr (:) [] is the identity function on lists, as replacing cons with cons and nil with nil will not change the result. The shape of the left fold suggests an easy way to reverse a list, foldl (flip (:)) []. Note that the parameters to cons must be flipped, because the element to add is now the right-hand parameter of the combining function. Another easy result to see from this vantage point is that the higher-order map function can be written in terms of foldr, by composing the function to act on the elements with cons, as:
map f = foldr ((:) . f) []
where the period (.) is an operator denoting function composition.
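The three observations above can be written out as a short Haskell sketch. The names mapViaFoldr, identityViaFoldr and reverseViaFoldl are hypothetical and chosen only to avoid clashing with the Prelude.

```haskell
-- map as a right fold: each cons is replaced by ((:) . f),
-- which applies f to the element before consing it on.
mapViaFoldr :: (a -> b) -> [a] -> [b]
mapViaFoldr f = foldr ((:) . f) []

-- Replacing cons with cons and nil with nil rebuilds the same list.
identityViaFoldr :: [a] -> [a]
identityViaFoldr = foldr (:) []

-- A left fold with flipped cons builds the result back to front.
reverseViaFoldl :: [a] -> [a]
reverseViaFoldl = foldl (flip (:)) []
```

For instance, mapViaFoldr (* 2) [1, 2, 3] gives [2,4,6], and reverseViaFoldl "abc" gives "cba".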
This way of looking at things provides a simple route to designing fold-like functions on other algebraic data structures, like various sorts of trees. One writes a function which recursively replaces the constructors of the datatype with provided functions, and any constant values of the type with provided values. Such a function is generally referred to as a catamorphism.
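As a sketch of this recipe, consider a hypothetical binary tree type with values at the leaves. The type Tree, the function foldTree and the helpers built on it are assumptions made for illustration, not definitions from a standard library.

```haskell
-- A hypothetical binary tree with values stored at the leaves.
data Tree a = Leaf a | Node (Tree a) (Tree a)

-- foldTree replaces every Leaf constructor with `leaf`
-- and every Node constructor with `node`.
foldTree :: (a -> b) -> (b -> b -> b) -> Tree a -> b
foldTree leaf _    (Leaf x)   = leaf x
foldTree leaf node (Node l r) =
  node (foldTree leaf node l) (foldTree leaf node r)

-- Sum of all leaf values.
sumTree :: Tree Int -> Int
sumTree = foldTree id (+)

-- Number of levels in the tree, counting a single leaf as 1.
depth :: Tree a -> Int
depth = foldTree (const 1) (\l r -> 1 + max l r)

-- Leaf values from left to right.
leaves :: Tree a -> [a]
leaves = foldTree (\x -> [x]) (++)

exampleTree :: Tree Int
exampleTree = Node (Leaf 1) (Node (Leaf 2) (Leaf 3))
```

With exampleTree as defined, sumTree gives 6, depth gives 3, and leaves gives [1,2,3].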
Implementation
Using Haskell as an example, foldr and foldl can be formulated in a few equations.
foldr f z []     = z                   -- if the list is empty, the result is the initial value z
foldr f z (x:xs) = f x (foldr f z xs)  -- if not, apply f to the first element and the result of folding the rest
foldl f z []     = z                   -- if the list is empty, the result is the initial value
foldl f z (x:xs) = foldl f (f z x) xs  -- if not, we recurse immediately, making the new initial value the result
                                       -- of combining the old initial value with the first element
In Scheme and other Lisps, the empty list and the list construction operator are written as () and cons. Using Scheme, right and left fold can be written as:
(define (foldr f z xs)
  (if (null? xs)
      z
      (f (car xs) (foldr f z (cdr xs)))))

(define (foldl f z xs)
  (if (null? xs)
      z
      (foldl f (f z (car xs)) (cdr xs))))
Here null? denotes a predicate function that returns a true value if given the empty list as argument.
Common Lisp provides a reduce function which performs a left fold by default and a right fold when the :from-end keyword argument is true.
The C++ Standard Template Library implements left fold as the function "accumulate" (in the header <numeric>). In Python, left fold is implemented by the function reduce(), which is a built-in in Python 2 (including Python 2.5) and is available as functools.reduce in Python 3. In Ruby, it is implemented by the Enumerable method #inject.
Evaluation order considerations
In the presence of lazy, or normal-order, evaluation, foldr will immediately return the application of f to the first element and the recursive case of folding over the rest of the list. Thus, if f is able to produce some part of its result without reference to the recursive case, and the rest of the result is never demanded, then the recursion will stop. This allows right folds to operate on infinite lists. By contrast, foldl will immediately call itself with new parameters until it reaches the end of the list. This tail recursion can be efficiently compiled as a loop, but it cannot deal with infinite lists at all: it will recurse forever in an infinite loop.
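The following Haskell sketch illustrates a right fold over an infinite list. The names anyViaFoldr and doubleAll are hypothetical; the example relies only on the fact that (||) does not evaluate its second argument when the first is True, and that a cons cell can be returned before its tail is evaluated.

```haskell
-- The combining function can ignore the recursive result:
-- once p x is True, `rest` is never demanded and the fold stops.
anyViaFoldr :: (a -> Bool) -> [a] -> Bool
anyViaFoldr p = foldr (\x rest -> p x || rest) False

-- The combining function yields a cons cell before `rest` is demanded,
-- so the output list can be consumed lazily.
doubleAll :: [Integer] -> [Integer]
doubleAll = foldr (\x rest -> 2 * x : rest) []

foundOnInfinite :: Bool
foundOnInfinite = anyViaFoldr (> 10) [1 ..]     -- True

firstThreeDoubled :: [Integer]
firstThreeDoubled = take 3 (doubleAll [1 ..])   -- [2,4,6]
```

Replacing foldr with foldl in either definition would make the calls on [1 ..] run forever, since foldl must reach the end of the list before returning anything.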
Reversing a list is also tail-recursive. (It can be implemented using rev = foldl (\ys x -> x : ys) [].) On finite lists, that means that left-fold and reverse can be composed to perform a right fold in a tail-recursive way (with a modification to the function f so it reverses the order of its arguments).
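A minimal sketch of this composition, using the hypothetical name foldrViaFoldl, might look as follows. It is only meaningful for finite lists, because reverse must traverse the whole list first.

```haskell
-- Right fold expressed as a left fold over the reversed list.
-- `flip f` swaps the argument order, because foldl passes the
-- accumulator as the first argument while foldr passes it as the second.
foldrViaFoldl :: (a -> b -> b) -> b -> [a] -> b
foldrViaFoldl f z xs = foldl (flip f) z (reverse xs)
```

For example, foldrViaFoldl (-) 0 [1, 2, 3, 4, 5] gives 3, the same as foldr (-) 0 [1, 2, 3, 4, 5].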
Another technical point to be aware of in the case of left folds using lazy evaluation is that the new initial parameter is not evaluated before the recursive call is made. This can lead to stack overflows when one reaches the end of the list and tries to evaluate the resulting, potentially gigantic, expression. For this reason, such languages often provide a stricter variant of left folding which forces the evaluation of the initial parameter before making the recursive call. In Haskell, this is the foldl' (note the apostrophe, pronounced 'prime') function in the Data.List library. Combined with the speed of tail recursion, such folds are very efficient when lazy evaluation of the final result is impossible or undesirable.
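The difference can be sketched with two sum functions; sumLazy and sumStrict are illustrative names, while foldl' is the function from Data.List mentioned above. Whether the lazy version actually overflows the stack depends on the compiler, its optimization settings and the stack limit, so no failure is claimed here.

```haskell
import Data.List (foldl')

-- Lazy left fold: without optimization, the accumulator can grow into a
-- long chain of unevaluated additions, ((0 + 1) + 2) + ..., that is only
-- evaluated once the end of the list is reached.
sumLazy :: [Integer] -> Integer
sumLazy = foldl (+) 0

-- Strict left fold: the accumulator is forced at every step,
-- so it stays a plain number throughout.
sumStrict :: [Integer] -> Integer
sumStrict = foldl' (+) 0
```

Both functions compute the same result on any finite list; for example, sumStrict [1 .. 1000000] gives 500000500000.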
One often wants to choose the identity element of the operation f as the initial value z. When no initial value seems appropriate, for example, when one wants to fold the function which computes the maximum of its two parameters over a list in order to get the maximum element of the list, there are variants of foldr and foldl which use the last and first element of the list respectively as the initial value. In Haskell and several other languages, these are called foldr1 and foldl1, the 1 making reference to the automatic provision of an initial element, and the fact that the lists they are applied to must have at least one element.
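A brief Haskell sketch of these variants follows; maximumViaFold, leftDiff1 and rightDiff1 are hypothetical names. Because no initial value is supplied, both foldr1 and foldl1 raise a runtime error when given an empty list.

```haskell
-- The maximum of a non-empty list, with no need to invent an initial value.
maximumViaFold :: [Int] -> Int
maximumViaFold = foldr1 max

-- foldl1 starts from the first element: (10 - 2) - 3
leftDiff1 :: Int
leftDiff1 = foldl1 (-) [10, 2, 3]    -- evaluates to 5

-- foldr1 starts from the last element: 10 - (2 - 3)
rightDiff1 :: Int
rightDiff1 = foldr1 (-) [10, 2, 3]   -- evaluates to 11
```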
Examples
Using a Haskell interpreter, we can show the structural transformation which foldr and foldl perform by constructing a string as follows:
Prelude> foldr (\x y -> concat ["(f ",x," ",y,")"]) "z" (map show [1..5])
"(f 1 (f 2 (f 3 (f 4 (f 5 z)))))"
Prelude> foldl (\x y -> concat ["(f ",x," ",y,")"]) "z" (map show [1..5])
"(f (f (f (f (f z 1) 2) 3) 4) 5)"