[Get started on chapter 9. Bryan O'Sullivan **20071011062145] { adddir ./examples/ch09 hunk ./en/book-shortcuts.xml 121 +where"> hunk ./en/ch09-find-dsl.xml 4 - FIXME - FIXME. + I/O case study: a library for searching the + filesystem + + The problem of I know I have this file, but I don't + know where it is has been around for as long as + computers have had hierarchical filesystems. The fifth edition of + Unix introduced the find command in 1974; it + remains indispensable today. The state of the art has come a long + way: modern operating systems ship with advanced document indexing + and search capabilities. + + There's still a valuable place for + find-like capability in the programmer's + toolbox. In this chapter, we'll develop a library that gives us + many of find's capabilities, without leaving + Haskell. + + + The find command + + If you don't use a Unix-like operating system, or you're not + a heavy shell user, it's quite possible you won't have heard of + find. Given a list of directories, it + searches each one recursively and prints the name of every entry + that matches an expression. + + Individual expressions can take such forms as name + matches this glob pattern, entry is a plain + file, last modified before this date, + and many more. They can be stitched together into more complex + expressions using and and or + operators. + + + + Starting simple: recursively listing a directory + + Before we plunge into designing our library, let's solve a + few smaller problems. Our first problem is to get a list of the + contents of a directory at all, and to do so recursively. + + &RecursiveContents.hs:RecursiveContents; + + The filter expression ensures that a + listing for a single directory won't contain the special + directory names . or .., + which refer to the current and parent directory, respectively. + If we were to forget to filter these out, we'd recurse + endlessly. + + The forM function is a + flipped version of + mapM, which we first saw in . + + &recursivecontents.ghci:forM; + + The body of the loop checks to see whether the current entry + is a directory. If it is, it recursively lists that directory. + Otherwise, it returns a single-element list that is the name of + the current entry. Note that in the body of the loop, the + return is returning a value + from the anonymous function that is the + loop body to its caller, + forM. It is not + returning from getRecursiveContents. + + Another thing worth pointing out is the use of the variable + isDirectory. In an imperative language such + as Python, we'd normally write if + os.path.isdir(path). However, the + doesDirectoryExist function is an + action; its return type is IO + Bool, not Bool. Since an &if; expression + requires an expression of type Bool, we have to use + <- to get the Bool result of the + action, and use that in the &if;. + + Finally, each call to the loop body yields a list of names, + so the result of forM is a list of lists. + We use concat to flatten it into a single + list. + + + Why have both mapM and forM? + + It might seem a bit odd that there exist two functions + that are identical but for the order in which they accept + their arguments. However, mapM and + forM are convenient in different + circumstances. + + In the example above, we're using an anonymous function as + the body of our loop. If we were to use + mapM instead of + forM, we'd have to place the variable + properNames after the body of the anonymous + function. We'd have to wrap the entire anonymous function in + parentheses, or replace it with a local function in a &where; + clause, to make the code parse properly. + + By contrast, if the body of the loop was already a named + function, and the list over which we were looping was computed + by a complicated expression, we'd have a good case for using + mapM instead. + + The stylistic rule of thumb to follow here is to use + whichever of mapM or + forM lets you write the tidiest code; you + should decide this on a case by case basis. With just a + little practice, it will be obvious which to use in every + instance. + + addfile ./examples/ch09/RecursiveContents.hs hunk ./examples/ch09/RecursiveContents.hs 1 +{-- snippet RecursiveContents --} +module RecursiveContents (getRecursiveContents) where + +import Control.Monad (forM) +import System.Directory (doesDirectoryExist, getDirectoryContents) +import System.FilePath (()) + +getRecursiveContents :: FilePath -> IO [FilePath] + +getRecursiveContents topdir = do + names <- getDirectoryContents topdir + let properNames = filter (not . (`elem` [".", ".."])) names + paths <- forM properNames $ \name -> do + let path = topdir name + isDirectory <- doesDirectoryExist path + if isDirectory + then getRecursiveContents path + else return [path] + return (concat paths) +{-- /snippet RecursiveContents --} addfile ./examples/ch09/recursivecontents.ghci hunk ./examples/ch09/recursivecontents.ghci 1 +--# forM +:m +Control.Monad + +:type mapM +:type forM + +--# doesDirectoryExist +:m +System.Directory + +:type doesDirectoryExist }