[More progress on chapter 10. Bryan O'Sullivan **20071105210000] { hunk ./en/ch10-binary.xml 117 - after it has finished. It deconstructs each result in turn, - either failing if the function failed, or building up a piece of - the result as it continues. The bodies of the functions that it - calls aren't especially interesting. + after it has consumed all it needs from its input string. This + residual string can then be passed along to the next step. It + deconstructs each result in turn, either failing if the function + failed, or building up a piece of the result as it continues. + The bodies of the functions that it calls aren't especially + interesting. hunk ./en/ch10-binary.xml 218 + The Parse type is encoding two concepts in one. + The first is the possibility of failure, reported via an error + message. This we achieve using Either to represent + two possible results of a parse. The second is the update of + the parser state and presentation of an intermediate result, + represented by the type of runParse. + hunk ./en/ch10-binary.xml 406 + + Functors as operators + + Quite often, you'll see fmap called as + an operator. + + &functor.ghci:operator; + + Perhaps strangely, plain old map is + almost never used in this way. + + One possible reason for the stickiness of the + fmap-as-operator meme is that this use + lets us omit parentheses from its second argument. Fewer + parentheses leads to reduced mental juggling while reading a + function. + + &functor.ghci:prefix; + + If you really want to use fmap as an + operator, the Control.Applicative module contains + an operator (<$>) that is an alias + for fmap. The $ in its name + appeals to the similarity between applying a function to its + arguments (using the ($) operator) and + lifting a function into a functor. + + hunk ./en/ch10-binary.xml 514 + + + + Writing a functor instance for Parse + + For the types we have surveyed so far, the behaviour we + ought to expect of fmap has been obvious. + This is a little less clear for Parse, due to its + complexity. A reasonable guess is that the function we're + fmapping should be applied to the current + result of a parse, and leave the parse state untouched. + + &Parse.hs:Functor; + + Since this definition isn't especially easy to read, let's + perform a few quick experiments to see if we're following our + rules for functors. + + First, we'll check that identity is preserved. Let's try + this first on a parse that should fail: trying to parse a byte + from an empty string. + + &parse.ghci:id.fail; + + Good. Now for a parse that should succeed. + + &parse.ghci:id.success; + + By inspecting the results above, we can also see that our + functor instance is obeying our second rule, that of preserving + shape. Failure is preserved as failure, and success as + success. + + Finally, we'll ensure that composability is + preserved. + + &parse.ghci:compose; + + On the basis of this brief inspection, our + Functor instance appears to be well behaved. + + + + Using functors for parsing + + All of this talk about functors had a purpose: they often + let us write tidy, expressive code. Recall the + parseByte function that we introduced + earlier. In recasting our PGM parser to use our new parser + infrastructure, we'll often want to work with ASCII characters + instead of Word8 values. + + While we could write a parseChar + function that has a similar structure to + parseByte, we can now avoid this code + duplication by taking advantage of the functor nature of + Parse. Our functor takes the result of a parse and + applies a function to it, so what we need is a function that + turns a Word8 into a Char. + + &Parse.hs:parseChar; + + We can also use functors to write a compact + peek function. This returns Nothing + if we're at the end of the input string. Otherwise, it returns + the next character without consuming it (i.e. it inspects, but + doesn't disturb, the current parsing state). + + &Parse.hs:peekByte; + + The same lifting trick that let us define + parseChar lets us write a compact + definition for peekChar. + + &Parse.hs:peekChar; + + Notice that peekByte and + peekChar each make two calls to + fmap, one of which is disguised as + (<$>). This is necessary because the + type Parse (Maybe a) is a functor within a functor. + We thus have to lift a function twice to get it + into the inner function. + + Finally, we'll write another generic combinator, which is + the Parse analogue of the familiar + takeWhile: it consumes its input while its + predicate returns True. + + &Parse.hs:parseWhile; + + Once again, we're using functors in several places to reduce + the verbosity of our code. Here's a rewrite of the same + function in a more direct style that does not use functors. + + &Parse.hs:parseWhileVerbose; + + + + Rewriting our PGM parser + + With our new parsing code, what does the raw PGM parsing + function look like now? + + &Parse.hs:parseRawPGM; + + This definition makes use of a few more helper functions + that we present here, following a pattern that should by now be + familiar. + + &Parse.hs:helpers; + + The (==>&) combinator chains + parsers like (==>), but the right hand + side ignores the result from the left. The + assert function lets us check a property, + and abort parsing with a useful error message if the property is + False. + + Notice how few of the functions that we have written make + any reference to the current parsing state. Most notably, where + our old parseP5 function explicitly passed + two-tuples down the chain of dataflow, all of the state + management in parseRawPGM is hidden from + us. + + Of course, we can't completely avoid inspecting and + modifying the parsing state. Here's a case in point, the last + of the helper functions needed by + parseRawPGM. + + &Parse.hs:parseBytes; + hunk ./examples/ch10/Parse.hs 1 +module Parse where + +import Control.Applicative ((<$>)) hunk ./examples/ch10/Parse.hs 6 -import Data.Char (chr, ord, isDigit) +import Data.Char (chr, ord, isDigit, isSpace) hunk ./examples/ch10/Parse.hs 10 +import PNM (Greymap(..)) + hunk ./examples/ch10/Parse.hs 43 -x ==>! f = x ==> \_ -> f - hunk ./examples/ch10/Parse.hs 80 +{-- snippet peekByte --} hunk ./examples/ch10/Parse.hs 82 -peekByte = getState ==> (identity . fmap fst . uncons . string) +peekByte = (fmap fst . uncons . string) <$> getState +{-- /snippet peekByte --} hunk ./examples/ch10/Parse.hs 85 +{-- snippet Functor --} hunk ./examples/ch10/Parse.hs 91 +{-- /snippet Functor --} hunk ./examples/ch10/Parse.hs 93 -parseEnd :: Parse Bool -parseEnd = (L.null . string) `fmap` getState +{-- snippet peekChar --} +peekChar :: Parse (Maybe Char) +peekChar = fmap w2c <$> peekByte +{-- /snippet peekChar --} hunk ./examples/ch10/Parse.hs 98 +{-- snippet parseChar --} +w2c :: Word8 -> Char hunk ./examples/ch10/Parse.hs 102 -peekChar :: Parse (Maybe Char) -peekChar = fmap w2c `fmap` peekByte hunk ./examples/ch10/Parse.hs 103 -parseChar = w2c `fmap` parseByte +parseChar = w2c <$> parseByte +{-- /snippet parseChar --} + +{-- snippet parseWhile --} +parseWhile :: (Word8 -> Bool) -> Parse [Word8] +parseWhile p = (fmap p <$> peekByte) ==> \mp -> + if mp == Just True + then parseByte ==> \b -> + (b:) <$> parseWhile p + else identity [] +{-- /snippet parseWhile --} + +{-- snippet parseWhileVerbose --} +parseWhileVerbose p = + peekByte ==> \mc -> + case mc of + Nothing -> identity [] + Just _ -> parseByte ==> \b -> + if p b + then parseWhileVerbose p ==> \bs -> + identity (b:bs) + else identity [] +{-- /snippet parseWhileVerbose --} hunk ./examples/ch10/Parse.hs 127 -parseWhile :: (Char -> Bool) -> Parse [Char] -parseWhile p = peekChar ==> \mc -> - case mc of - Nothing -> identity [] - Just c | p c -> parseChar ==>! - parseWhile p ==> \cs -> - identity (c:cs) - | otherwise -> identity [] +{-- snippet parseNat --} +parseWhileWith :: (Word8 -> a) -> (a -> Bool) -> Parse [a] +parseWhileWith f p = fmap f <$> parseWhile (p . f) hunk ./examples/ch10/Parse.hs 132 -parseNat = parseWhile isDigit ==> \digits -> +parseNat = parseWhileWith w2c isDigit ==> \digits -> hunk ./examples/ch10/Parse.hs 139 +{-- /snippet parseNat --} + +{-- snippet helpers --} +(==>&) :: Parse a -> Parse b -> Parse b +p ==>& f = p ==> \_ -> f + +skipSpaces :: Parse () +skipSpaces = parseWhileWith w2c isSpace ==>& identity () + +assert :: Bool -> String -> Parse () +assert True _ = identity () +assert False err = bail err +{-- /snippet helpers --} + +{-- snippet parseBytes --} +parseBytes :: Int -> Parse L.ByteString +parseBytes n = + getState ==> \st -> + let n' = fromIntegral n + (h, t) = L.splitAt n' (string st) + st' = st { offset = offset st + L.length h, string = t } + in putState st' ==>& + assert (L.length h == n') "end of input" ==>& + identity h +{-- /snippet parseBytes --} + +{-- snippet parseRawPGM --} +parseRawPGM = + parseWhileWith w2c (/= '\n') ==> \header -> + assert (header == "P5") "invalid raw header" ==>& + parseNat ==> \width -> + skipSpaces ==>& + parseNat ==> \height -> + skipSpaces ==>& + parseNat ==> \maxGrey -> + parseByte ==>& + parseBytes (width * height) ==> \bitmap -> + identity (Greymap width height maxGrey bitmap) +{-- /snippet parseRawPGM --} addfile ./examples/ch10/ParsePNM.hs hunk ./examples/ch10/functor.ghci 45 +--# operator + +(1+) `fmap` [1,2,3] ++ [4,5,6] + +--# prefix + +fmap (1+) ([1,2,3] ++ [4,5,6]) + +--# applicative + +:m +Control.Applicative +(1+) <$> Just 2 + hunk ./examples/ch10/parse.ghci 5 +--# id.fail + +parse parseByte L.empty +parse (id <$> parseByte) L.empty + +--# id.success + +let input = pack "foo" +L.head input +parse parseByte input +parse (id <$> parseByte) input + +--# compose + +parse ((chr . fromIntegral) <$> parseByte) input +parse (chr <$> fromIntegral <$> parseByte) input + }