Monday, May 21, 2007

Foldl and Foldr

The Haskell prelude defines the types for foldl and foldr as follows.

foldl
type: foldl :: (a -> b -> a) -> a -> [b] -> a
foldr
type: foldr :: (a -> b -> b) -> b -> [a] -> b
In Qi, I have followed that convention in the Qi Prelude.


The reason for defining it this way is fairly easy to see. If we have a binary operator that works over one type and we have an element and a list of that type, then we can use one functions as a drop in replacement for the other. The following code randomly selects foldr or foldl and uses it on a list.


(47-) ((if (> (random 2) 0) foldr foldl) / 1 [1 2 3])
3/2
(48-) ((if (> (random 2) 0) foldr foldl) / 1 [1 2 3])
1/6
But for as how humans tend to think about calling functions, these type orders make reasoning about foldr and foldl more difficult than they need to be. Allow me a few examples to illustrate.

(26-) (foldl + 1 [ 1 2 3 4 5])
16
(27-) (foldr + 1 [ 1 2 3 4 5])
16
(28-) (foldr * 1 [ 1 2 3 4 5])
120
(29-) (foldl * 1 [ 1 2 3 4 5])
120
(30-) (foldr / 1 [ 1 2 3 4 5])
15/8
(31-) (foldl / 1 [ 1 2 3 4 5])
1/120
(32-) (foldr - 0 [ 1 2 3 4 5])
3
(33-) (foldl - 0 [ 1 2 3 4 5])
-15
Obviously there is no difference between foldl and foldr for associative operators. But when we have non-associative operators I think the type order ends up harming readability. I will define a foldr* and foldl* with the type order that feels more natural to me, to display my point. Perhaps this will serve as a useful guide to others who are attempting to understand the difference between foldl and foldr. It should also serve as a reminder that when we try to make one type of programming task easy, we may end up making other reasoning tasks about programs more difficult.

(define foldl*
{ A --> ( A --> B --> A ) --> [B] --> A }
Z F Ls -> (foldl F Z Ls))
(define foldr*
{ [A] --> ( A --> B --> B ) --> B --> B }
Ls F Z -> (foldr F Z Ls))

(51-) (foldl* 10 - [1 2 3 4])
0
(52-) (foldr* [1 2 3 4] - 10)
8
I think this quite clearly shows that foldl starts from 10 and keeps subtracting to the right while foldr performs the 4 - 10 and moves to the left. In fact, I think that this shows that even the names of foldl and foldr are backwards from the perspective of the neophyte programmer. Nothing about a quick glance at the following gives away the fact that the first operation will be 4 - 10.

Hugs> foldr (-) 10 [ 1, 2, 3, 4]
8
vs. the re-ordered version, where this is clear.

(52-) (foldr* [1 2 3 4] - 10)
8
I believe that my reordering makes the purpose of foldl and foldr more clear. We can instinctively feel, at a glance, that foldl is taking the list and pushing it left into the first argument by way of the operator. Likewise, we can feel that foldr is "smooshing" the list to the right into the starting value.

I hope that this helps by giving both a little insight into how making code easier to be drop in replacements for other code can actually hurt readability and by helping people to better understand foldl and foldr. As always, let me know what you think.

2 Comments:

I think the ordering it's better. foldr and foldl have always been kinda complicated to me, I often have a hard time understanding what they do, even if I know how they work.

By Blogger niv, at 7:34 PM  

Nice post. A slight change makes a big difference.

By Blogger Shredder, at 10:36 PM  

Post a Comment