[Folds! Bryan O'Sullivan **20070801061423] { hunk ./en/Makefile 17 - $(wildcard ../examples/ch04/*.cpp) + $(wildcard ../examples/ch04/*.cpp) \ + $(wildcard ../examples/ch04/*.java) hunk ./en/Makefile 152 +vpath %.java $(addprefix ../examples/,$(dir $(src-examples))) hunk ./en/Makefile 167 + +x/.stamp-%.java: %.java ../tools/bin/snippets x + ../tools/bin/snippets $(CURDIR)/x $< > $@ hunk ./en/ch04-fp.xml 167 + + + + Computing one answer over a collection + + Another common thing to do with a loop is to fold + it up. A simple example of this is summing the + values of a list. + + &Sum.hs:mySum; + + Our helper function is tail + recursive, and uses an accumulator + parameter, acc, to hold the current partial + sum of the list. This is a natural way to + represent a loop in a pure functional language. + + For something a little more complicated, let's take a look + at the Adler-32 checksum. Here's a Java implementation. + + &Adler32.java:Adler32; + + Although Adler-32 is a simple checksum, this code isn't + particularly easy to read on account of the bit-twiddling + involved. Can we do any better with a Haskell + implementation? + + &Adler32.hs:adler32; + + This isn't exactly easier to follow than the Java code, + but let's look at what's going on. Once again, + helper function is tail recursive. We've + turned the two variables we updated on every loop iteration in + Java into accumulator parameters. When our recursion + terminates on the end of the input list, we compute our + checksum and return it. + + If we take a step back, we can restructure our Haskell + adler32 to more closely resemble our + earlier mySum function. Instead of two + accumulator parameters, we can use a single accumulator that's + a two-tuple. + + &Adler32.hs:adler32_try2; + + Why would we want to make this seemingly meaningless + structural change? Because as we've already seen with + map and filter, we + can extract the common behaviour shared by + mySum and + adler32_try2 into a higher-order + function. We can describe this behaviour as do + something to every element of a list, updating an + accumulator as we go, and returning the accumulator when + we're done. + + This kind of function is called a + fold, because it folds up + a list, and it has two variants, foldl + and foldr. + + + + The left fold + + &Fold.hs:foldl; + + The foldl function takes a + stepper function, an initial value for its + accumulator, and a list. The stepper takes an + accumulator and an element from the list, and returns a new + accumulator value. All foldl does is call + the stepper on the current accumulator and an + element of the list, and passes the new accumulator value to + itself recursively to consume the rest of the list. + + We refer to foldl as a left + fold because it consumes the list from left (the + head) to right. + + Here's a rewrite of mySum using + foldl. + + &Sum.hs:foldlSum; + + Notice how much simpler this code is? We're no longer + using explicit recursion, because foldl + takes care of that for us. We've simplified our problem down + to two things: what the initial value of the accumulator + should be (the second parameter to + foldl), and how to update the accumulator + (the step function). As an added bonus, + our code is now shorter, too, which makes it easier to + understand. + + We can rewrite adler32_try2 in a + similar way, using foldl to let us focus + on the details that are important. + + &Adler32.hs:adler32_foldl; + + Here, our accumulator is a two-tuple, so the result of + foldl will be, too. We pull the final + accumulator apart when foldl returns, and + bit-twiddle it into a proper checksum. + + + + Why use folds, maps, and filters? + + A quick glance reveals that + adler32_foldl isn't really any shorter + than adler32_try2. Why should we use a + fold in this case? The advantage here lies in the fact that + folds are extremely common in Haskell, and they have regular, + predictable behaviour. + + This means that a reader with a little experience will + have an easier time understanding a function that uses a fold + than one that uses explicit recursion. Where a fold isn't + going to produce any surprises, the behaviour of a function + that recurses explicitly isn't immediately obvious. Explicit + recursion requires us to read closely to understand exactly + what's going on. + + This line of reasoning applies to other higher-order + library functions, including those we've already seen, + map and filter. + Because they're library functions with well-defined behaviour, + we only need to learn what they do once, and we'll have an + advantage when we need to understand any code that uses + them. + + + + Folding from the right + + &Fold.hs:foldr; + + The counterpart to foldl is + foldr, which folds from the right of a + list. addfile ./examples/ch04/Adler32.hs hunk ./examples/ch04/Adler32.hs 1 +import Control.Arrow ((***)) +import Control.Monad (join) +{-- snippet adler32 --} +import Data.Char (ord) +import Data.Bits (shiftL, (.&.), (.|.)) + +base = 65521 + +adler32 xs = helper 1 0 xs + where helper a b (x:xs) = let a' = (a + (ord x .&. 0xff)) `mod` base + b' = (a' + b) `mod` base + in helper a' b' xs + helper a b _ = (b `shiftL` 16) .|. a +{-- /snippet adler32 --} + +{-- snippet adler32_try2 --} +adler32_try2 xs = helper (1,0) xs + where helper (a,b) (x:xs) = let a' = (a + (ord x .&. 0xff)) `mod` base + in helper (a', (a' + b) `mod` base) xs + helper (a,b) _ = (b `shiftL` 16) .|. a +{-- /snippet adler32_try2 --} + +{-- snippet adler32_foldl --} +adler32_foldl xs = let (a, b) = foldl step (1, 0) xs + in (b `shiftL` 16) .|. a + where step (a, b) x = let a' = a + (ord x .&. 0xff) + in (a' `mod` base, (a' + b) `mod` base) +{-- /snippet adler32_foldl --} + +adler32_golf = uncurry (flip ((.|.) . (`shiftL` 16))) . foldl f (1,0) + where f (a,b) x = join (***) ((`mod` base) . (a + (ord x .&. 0xff) +)) (0,b) addfile ./examples/ch04/Adler32.java hunk ./examples/ch04/Adler32.java 1 +/** snippet Adler32 */ +public class Adler32 +{ + private static final int base = 65521; + + public static int compute(byte[] data, int offset, int length) + { + int a = 1, b = 0; + + for (int i = offset; i < offset + length; i++) { + a = (a + (data[i] & 0xff)) % base; + b = (a + b) % base; + } + + return (b << 16) | a; + } +} +/** /snippet Adler32 */ addfile ./examples/ch04/Fold.hs hunk ./examples/ch04/Fold.hs 1 +import Prelude hiding (foldl, foldr) + +{-- snippet foldl --} +foldl :: (a -> b -> a) -> a -> [b] -> a + +foldl f z xs = helper z xs + where helper z [] = z + helper z (x:xs) = helper (f z x) xs +{-- /snippet foldl --} + +{-- snippet foldr --} +foldr :: (a -> b -> b) -> b -> [a] -> b + +foldr f z xs = helper xs + where helper [] = z + helper (y:ys) = f y (helper ys) +{-- /snippet foldr --} addfile ./examples/ch04/Sum.hs hunk ./examples/ch04/Sum.hs 1 +{-- snippet mySum --} +mySum xs = helper 0 xs + where helper acc (x:xs) = helper (acc + x) xs + helper acc _ = acc +{-- /snippet mySum --} + +{-- snippet foldlSum --} +foldlSum xs = foldl step 0 xs + where step acc x = acc + x +{-- /snippet foldlSum --} hunk ./tools/Snip.hs 39 + "java" -> (startC, endC) }