[Split chapter 3, big edits too Bryan O'Sullivan **20080116073938] { addfile ./en/ch03a-defining-types.xml hunk ./en/00book.xml 14 + hunk ./en/00book.xml 121 + &ch03a; hunk ./en/bibliography.xml 55 + + + + Wadler89 + + PhilipWadler + <ulink + url="http://citeseer.ist.psu.edu/wadler89theorems.html">Theorems + for free!</ulink> + + + September, 1989 + International Conference on Functional + Programming and Computer Architecture + 4 +
London, England
+
hunk ./en/ch02-starting.xml 446 - normal syntax in . + normal syntax in . hunk ./en/ch03-funcs-types.xml 6 - Topics: built-in types. Writing functions. Creating new - types. + Topics: built-in types. Writing functions. hunk ./en/ch03-funcs-types.xml 13 - static, and they can usually be - automatically inferred by an - implementation. Let's talk in more detail about each of these - ideas. Because Haskell is quite different from mainstream - programming languages in how it treats types, let's not assume - too much shared understanding as we discuss each idea. + static, and they can be automatically + inferred. Let's talk in more detail about + each of these ideas. When possible, we'll present similarities + between concepts from Haskell's type system and related ideas in + other languages. hunk ./en/ch03-funcs-types.xml 19 - Every value and expression in Haskell has a - type, such as Integer or - [Char]. The type of a value indicates that it + Every expression and function in Haskell has a + type. For example, the value + True has the type Bool, while + the value "foo" has the type + String. The type of a value indicates that it hunk ./en/ch03-funcs-types.xml 26 - these are properties of those types. We say a value or - expression has type X, or is - of type X; the two phrases mean the same - thing. Functions, being values, have types too. + these are properties of those types. We say an expression + has type X, or is of type + X. hunk ./en/ch03-funcs-types.xml 30 - Here are a few examples of values, expressions, - and their types: the value 'a' has type - Char; the expression "a" ++ "b" - has type String (the ++ is the - concatenation operator). We call an expression that obeys the - language's type rules well typed. An - expression that disobeys the type rules, such as 10 + - "hello", is ill typed, and will be - rejected by a Haskell implementation. (Our example 10 + - "hello" is ill typed because the - (+) operator works with numbers.) + Here are a few more examples of values and + expressions, and their types: the value 'a' + has type Char; the expression "a" ++ + "b" has type String (the + (++) is the concatenation operator). + + We call an expression that obeys the language's type rules + well typed. An expression that disobeys + the type rules, such as 10 + "hello", is + ill typed, and will be rejected by a + Haskell implementation. Our example 10 + "hello" + is ill typed because the (+) operator is + only defined to work with numbers. hunk ./en/ch03-funcs-types.xml 45 - strong type system, we mean that every expression - and value has exactly one most general - type, which is a notion we can explain by example. - The expression x - 1 doesn't represent a - specific type of number: it will work equally well if - x is an Integer or a + strong type system, we mean that every + expression has exactly one most general + type. This is a notion we can most easily explain + by example. The expression x - 1 doesn't + represent a specific type of number: it will work equally well + if x is an Integer or a hunk ./en/ch03-funcs-types.xml 53 - numeric type. The expression 1 < 2 is defined - to return a Bool, so its most general type is + numeric type. The expression 1 < 2 is a + comparison that can only evaluate to True or + False, so its most general type is hunk ./en/ch03-funcs-types.xml 70 - So what's a strong type? + Weaker and stronger types hunk ./en/ch03-funcs-types.xml 73 - communities define a strong type differently. - Broadly speaking, an implementation of a weak type system will - not report an error when presented with an expression like - 10 + "foo". A stronger type system is less - permissive in how it allows values of different types to be - mixed, and will report some an error for such an expression. - Haskell is less permissive than C, so it has a stronger type - system. + communities have their own definitions of a strong + type. We won't get into the details here, but it's + still useful to speak broadly about the notion of strength in + type systems. + + A hypothetical weak type system will accept as valid an + expression like 10 + "foo". A stronger type + system will be less permissive in how it allows expressions of + different types to be mixed. Under some strong type systems, + this expression is not valid. + + Our example expression is valid in C, but not in Haskell, + so we say that Haskell has a stronger type system. hunk ./en/ch03-funcs-types.xml 88 - Having a static type system means - that the compiler knows the type of every value and expression - at compile time, before any code is executed. A Haskell - compiler or interpreter will detect when we try to use + Having a static type system + means that the compiler knows the type of every value and + expression at compile time, before any code is executed. A + Haskell compiler or interpreter will detect when we try to use hunk ./en/ch03-funcs-types.xml 97 - This error message is of a kind we've seen before. It's - telling us that the value "false" cannot be treated - as a Bool. + This error message is of a kind we've seen before. + The compiler has inferred that the most general type of the + expression "false" is [Char], + which is a synonym for String. The + (&&) operator requires each of its + operands to be of type Bool, and its left operand + indeed has this type. Since the actual type of + "false" does not match the required type, the + compiler rejects this expression as ill typed. hunk ./en/ch03-funcs-types.xml 109 - this forces us to do more thinking up front, it - also eliminates large numbers of simple errors that can + this means that we need to do a little more thinking up + front, it also eliminates many simple errors that can hunk ./en/ch03-funcs-types.xml 113 - work correctly than in other languages. (Perhaps a more + work correctly than in other languages. (Perhaps a more hunk ./en/ch03-funcs-types.xml 117 - Finally, the compiler can automatically deduce the - types of almost all values in a program. This process is known - as type inference. Haskell allows us to explicitly declare the - type of any value, but the presence of type inference means that - this is almost always optional, not something we must do. + Finally, a Haskell compiler can automatically + deduce the types of (almost) all values in a program. This + process is known as type inference. Haskell allows us to + explicitly declare the type of any value, but the presence of + type inference means that this is almost always optional, not + something we must do. hunk ./en/ch03-funcs-types.xml 127 - languages, and often more expressive than dynamic + languages, and often more expressive than dynamically typed hunk ./en/ch03-funcs-types.xml 141 - are more of the most ubiquitous base types, i.e. those that + are some more of the most common base types, i.e. types that hunk ./en/ch03-funcs-types.xml 193 - A handy rule of thumb for remembering which is - which is that a list has varying size and uniform type, while a - tuple has a uniform size and varying type. We'll see what this - means below. - - We've already seen the list type mentioned in We've already seen the list type mentioned in + [Char]. You can use the two + simply an alias for [Char]. We can use the two hunk ./en/ch03-funcs-types.xml 218 - of values of type [Int], i.e. a list of - lists of Int. + of values of type [Int], i.e. a list of lists of + Int. hunk ./en/ch03-funcs-types.xml 221 - Lists are the bread and butter of Haskell - collections. However, their utility goes beyond merely + Lists are the bread and butter of + Haskell collections. However, their utility goes beyond merely hunk ./en/ch03-funcs-types.xml 231 - lot more time discussing lists in . + lot more time discussing lists in . hunk ./en/ch03-funcs-types.xml 235 - of which can be of any type. Unlike a list (the elements of - which must all have the same type), there's no need for the - elements of a tuple to have related types. In fact, tuples are - important precisely because they're the easiest way to group - multiple values that can have different types. We write a tuple - by enclosing its elements in parentheses and separating them - with commas. We use the same notation for writing its - type. + of which can be of any type. Whereas the elements of a list + must all have the same type, there's no need for the elements of + a tuple to have related types. In fact, tuples are important + precisely because they're often the most convenient way to group + several values of different types. + + A handy rule of thumb for remembering which is + which is that a list has varying size and uniform type, while a + tuple has a uniform size and varying type. + + We write a tuple by enclosing its elements in parentheses + and separating them with commas. We use the same notation for + writing its type. hunk ./en/ch03-funcs-types.xml 256 - Haskell doesn't have a notion of a one-element tuple. Larger - tuples are often spoken of the number of elements as a prefix, - hence 3-tuple (for a tuple of three elements), - 5-tuple, and so on. + Haskell doesn't have a notion of a one-element tuple. Tuples + are often referred to using the number of elements as a prefix, + hence 3-tuple for a tuple of three elements, + 5-tuple for five, and so on. + + In practice, working with tuples that contain more + than a handful of elements quickly leads to unwieldy code, so + tuples of more than half a dozen elements are not often + seen. hunk ./en/ch03-funcs-types.xml 266 - We can construct tuples with fairly large numbers - of elements. In practice, working with more than a handful of - elements quickly leads to unwieldy code, so tuples of more than - half a dozen elements are uncommon. + A tuple's type represents the number, positions, and types + of its elements. This means that tuples containing different + numbers or types of elements have distinct types of their + own. hunk ./en/ch03-funcs-types.xml 271 - A tuple's type represents both the number and types of its - elements. This means that tuples containing different numbers - or types of elements have distinct types of their own. + &tuple.ghci:type1; hunk ./en/ch03-funcs-types.xml 273 - &tuple.ghci:type; + In this example, the expression (False, 'a') + has the type (Bool, Char). If we swapped the two + elements, we'd get a tuple of type (Char, Bool). + Even though the number of elements and their types are the same, + these two types are distinct because the positions of the + element types are different. hunk ./en/ch03-funcs-types.xml 280 - A 2-tuple of Int and - String has a different type than a 2-tuple of - Bool and Bool, for example, and a - 3-tuple has a different type than a 4-tuple. 2-tuples are often - referred to as pairs, and 3-tuples less frequently as - triples. + &tuple.ghci:type2; hunk ./en/ch03-funcs-types.xml 282 - A common use of tuples is to let us return - multiple values from a function. We can also use them in other - places where we want a fixed-size collection of values, but - don't need a custom container type. + This type, (Bool, Char, Char), is distinct from + (Bool, Char) because it contains an added + element. + + + Jargon watch + + 2-tuples are often referred to as + pairs, and 3-tuples are sometimes (though + less frequently) called triples. + + + We often use tuples to return multiple values from + a function. We can also use them any time we need a fixed-size + collection of values, if the circumstances don't require a + custom container type. hunk ./en/ch03-funcs-types.xml 312 - (["foo", "bar"], 'a') + (["foo", "bar"], + 'a') hunk ./en/ch03-funcs-types.xml 316 - [(True, []), (False, [['a']])] + [(True, []), (False, + [['a']])] hunk ./en/ch03-funcs-types.xml 333 - looks quite as it does in other languages. + looks quite similar to arithmetic in other languages. + + Our discussion of lists and tuples mentioned how + we can construct them, but not how we do anything with them + afterwards. Haskell defines a large library of functions, the + Prelude, for working with lists and (to a lesser extent) tuples, + so let's find out how to use a few of those functions. + + To apply a function in Haskell, we write the name + of the function followed by its arguments. hunk ./en/ch03-funcs-types.xml 344 - Our discussion of lists and tuples mentioned how we - can construct them, but not how we do anything with them - afterwards. Haskell defines a large library of functions for - working with lists and (to a lesser extent) tuples, so let's - find out how to use a few of those functions. + + Jargon watch + + When talking about functions, we'll use the terms + application and + calling interchangeably. We'll do the + same with parameter and + argument. + hunk ./en/ch03-funcs-types.xml 354 - To call a function in Haskell, we write the name - of the function followed by its arguments. We don't use - parentheses or commas to group or separate arguments; mere - juxtaposition is enough. As an example, let's call the - head function, which returns the first - element of a list. + We don't use parentheses or commas to group or separate the + arguments to a function; merely writing the name of the + function, followed by each argument in turn, is enough. As an + example, let's apply the head function, + which returns the first element of a list. hunk ./en/ch03-funcs-types.xml 373 - elements of the list. These functions take two arguments; - notice that we simply separate the each function and its - arguments using white space. + elements of the list. As these functions take two arguments, + notice that we still separate each function and its arguments + using white space. hunk ./en/ch03-funcs-types.xml 381 - it's simple and uniform. Here's what we mean by uniform: the - following expression calls snd with one - argument, a pair. + it's simple and uniform. + + Here's something to watch out for, until you gain a little + more familiarity with the language. Under Haskell's convention + for function application, the following expression is an + application of snd to a single argument, + which is a pair. hunk ./en/ch03-funcs-types.xml 391 - In some other languages, the call to - snd above might mean call - snd with two arguments, - 1 and 2. + In a popular imperative language, this would + probably instead mean call snd with + two arguments, 1 and + 2, which is quite different. hunk ./en/ch03-funcs-types.xml 401 + hunk ./en/ch03-funcs-types.xml 408 - you into unfamiliar language territory. + you into unfamiliar linguistic territory. hunk ./en/ch03-funcs-types.xml 413 - with tuples of other sizes. It's tricky to write a - generalised get the second element from any tuple, no - matter how wide function in Haskell, but people do - fine without it. + with tuples of other sizes. Haskell's type system makes it + tricky to write a generalised get the second element + from any tuple, no matter how wide function. hunk ./en/ch03-funcs-types.xml 421 - Haskell parses an expression from left to right. If we - want to use the result of one expression as an argument to + Haskell parses an expression from left to right. + If we want to use the one expression as an argument to hunk ./en/ch03-funcs-types.xml 429 - We can read this as pass the result of - the expression drop 4 "azerty" as the argument - to head. If we were to leave - out the parentheses, Haskell would instead interpret the + We can read this as pass the expression + drop 4 "azerty" as the argument to + head. If we were to leave out + the parentheses, Haskell would instead interpret the hunk ./en/ch03-funcs-types.xml 434 - argument to head . Compilation + argument to head. The + offending expression would actually be parsed as (((head + drop) 4) "azerty"), where we've introduced lots of + extra parentheses to eliminate any ambiguity. Compilation hunk ./en/ch03-funcs-types.xml 440 - drop is a function. (The offending - expression would actually be parsed as (((head drop) 4) - "azerty"), where we've introduced lots of extra - parentheses to eliminate any ambiguity.) - - - - - Understanding a function's type signature - - Let's take a look at a function's type. - - &func.ghci:lines.type; - - We can read the -> above as - returns. The entire signature thus tells us that - lines takes a value of type - String and returns a value with type - [String]: it takes one string, and returns a list - of strings. - - &func.ghci:lines; - - The \n in the input string is an - escaped newline character: lines splits a - string on line boundaries. Notice that its type signature gave - us a strong hint as to what the functions might actually - do. - - This is an incredibly valuable property of types in a - functional language. If a function has the possibility of - causing side effects, this will be represented in its type (as we'll - see in ). If a function's type does not - say that it can cause side effects, we refer to it as - pure: its output can only depend on its - inputs. Understanding what a pure function does can often be a - matter of reading its name and understanding its type signature. - As an example, let's look at not. - - &func.ghci:not.type; - - Even if we didn't know the name of this function, its - signature alone limits the possible behaviours it could have. - (Let's omit crashing.) - - - - Ignore its argument, and always return either - True or False. - - - Return its argument unmodified. - - - Negate its argument. - - - - - - Polymorphism in Haskell - - When we introduced lists, we mentioned that the list type is - polymorphic. We'll talk about polymorphism in more detail here, - because it's a generic term that might cause confusion if we - don't nail it down. - - If we want to fetch the last element of a list, we can call - the last function. The value that it - returns must have the same type as the values in the list, but - last operates in the same way no matter - what that type actually is. - - &func.ghci:last; - - To capture the idea that last needs its - argument and result types to be the same, but doesn't need to - know their details, its type signature contains a type - variable. - - &func.ghci:last.type; - - Here, a is the type - variable. - - - Identifying a type variable - - Type variables always start with a lowercase letter. You - can always tell a type variable from a normal variable by - context, because the languages of types and functions are - separate: type variables live in type signatures, and regular - variables live in normal expressions. - - It's common Haskell practice to keep the names of type - variables very short. One letter is overwhelmingly common; - two is rare; and I don't think I've ever seen a type variable - that was three or more characters long. Type signatures are - usually brief, so we gain more in readability by keeping names - short than we would by making them descriptive. - - - When a function has type variables in its - signature, indicating that some of its arguments can be of any - type, we call the function polymorphic. - - When we want to call last on, say, a - list of Char, we can substitute Char - for a throughout the type - signature, which gives us the type of - last-over-[Char] as being - [Char] -> Char. - - This kind of polymorphism is called - parametric polymorphism. The choice of - naming is easy to understand by analogy: just as a function can - have parameters that we can later bind to real values, a Haskell - type can have parameters that we can later bind to other - types. - - - A little nomenclature - - If the type of a function contains type parameters, we - could call it a polymorphic function, or say that it has a - parameterised type. This is also the case for types. - - - When a function or type has a parameterised type, we've - already said that it doesn't care what the real type is. - However, we can make a stronger statement: it has no - way to find out what the real type is, or to - manipulate a value of that type. It can't create a value; it - can't inspect it. All it can do is treat it as a fully abstract - black box. We'll cover one reason that this is - important soon. - - Parametric polymorphism is the most visible kind of - polymorphism that Haskell supports. Here are a few popular - forms of polymorphism that are not present in Haskell. - - In mainstream object oriented languages, - subtype polymorphism is more widespread - than parametric polymorphism. The subclassing mechanisms of C++ - and Java give them subtype polymorphism. A base class defines a - set of behaviours that its subclasses can modify and - extend. Since Haskell isn't an object oriented language, it - doesn't provide subtype polymorphism. - - Also common is coercion polymorphism, - which allows a value of one type to be implicitly converted into - a value of another type. Many languages provide some form of - coercion polymorphism: one example is automatic conversion - between integers and floating point numbers. Haskell avoids - even this kind of simple automatic coercion. - - This is not the whole story: we'll refine our understanding - of Haskell's support for polymorphism in . - - - Reasoning about polymorphic functions - - In , we talked about - figuring out the behaviour of a function based on its type - signature. We can apply the same kind of reasoning to - polymorphic functions. Let's look again at - fst. - - &func.ghci:fst.type; - - First of all, notice that its argument contains two type - variables, a and b, signifying that the elements of the - tuple can be of different types. - - The result type of fst is a. We've already mentioned that - parametric polymorphism makes the real type inaccessible: - fst doesn't have enough information to - construct a value of type a, - nor can it turn an a into a - b. So the - only possible behaviour it can have is to - return the first element of the pair. + drop is a function. hunk ./en/ch03-funcs-types.xml 444 - - The type of a function of more than one argument - - So far, we haven't seen a signature for a function - that takes more than one argument. We've - already used a few such functions; let's look at the signature - of one, take. - - &func.ghci:take.type; - - It's pretty clear that there's something going on - with an Int and some lists, but why are there two - -> symbols in the signature? Haskell - parses this chain of arrows from right to left; that is, it's - right-associative. If we introduce parentheses, we can make it - clearer how this type signature is interpreted. - - &Take.hs:type; - - From this, it looks like we ought to read the type - signature as a function that takes one argument, an - Int, and returns another function. That other - function also takes one argument, a list, and returns a list of - the same type as its argument. - - This is an intriguing idea, but it's not yet easy - to see just yet what its consequences might be. We'll return to - this topic in , once we've - spent a bit of time writing functions. - - hunk ./en/ch03-funcs-types.xml 447 - Now that we know how to call functions, it's time + Now that we know how to apply functions, it's time hunk ./en/ch03-funcs-types.xml 598 - Next, we've written a type signature for - myDrop. This is not required, as a - Haskell implementation will infer the type of - myDrop if we omit an explicit type, but - it's a good habit to get into. - hunk ./en/ch03-funcs-types.xml 859 - - Reporting errors - - Haskell provides a standard function, - error :: Char -> a, that we can call - when something has gone terribly wrong in our code. We give it - a string parameter, which is the error message to display. Its - type signature looks peculiar; how can it produce a value of any - type a given just a string? The - answer is that it doesn't, because error is - special. - - It has that type so that we can call it anywhere - and it will always have the right type. However, instead of - returning a value like a normal function, it immediately - terminates evaluation, and prints the error message we give it. - Here's an example. The mySecond function - returns the second element of its input list, but fails if its - input list isn't long enough. - - &MySecond.hs:mySecond; - - As usual, we can see how this works in practice in - &ghci;. - - &error.ghci:mySecond; - - Notice that in the third case above, where we're - trying to use the result of the call to - mySecond as the argument to another - function, evaluation still terminates and drops us back to the - &ghci; prompt. This is both the strength and weakness calling - error: it doesn't let us distinguish - between recoverable errors and problems so severe that they - really should terminate a program. We'll see one way around - this problem in - . - - Would this be a good point at which to split this chapter - into two? - - - - Exercises - - - - - Haskell provides a standard function, - last :: [a] -> a, that returns the last - element of a list. Write a function - lastButOne, that returns the element - before the last. - - - - - - Load your lastButOne - function into &ghci;, and try it out on lists of different - lengths. What happens when you pass it a list that's too - short? - - - - - - - - Defining a new data type - - Although lists and tuples are useful, we'll still often want - to construct new data types of our own. We define a new data - type using the data keyword. - - &MyType.hs:MyType; - - The MyType after the data keyword - is the name of our new type. (As we've already mentioned, a - type name must start with a capital letter.) The string - MyConstructor is the name of the - constructor we'll call to create a value of - this type. (As with a type name, a constructor name must start - with a capital letter.) Finally, the Int and - String are the components of - the type. A component serves the same purpose in Haskell as a - field in a structure or class would in another language. - - - Deriving what? - - We'll explain the full meaning of deriving - (Show) later, in - . For now, - it's enough to know that we need to tack this onto a type - declaration so that &ghci; will automatically know how to - print a value of this type. - - - We can create a new value of type MyType by - treating MyConstructor as a function, and - calling it with arguments of types Int and - String. - - &MyType.hs:myValue; - - Once we've defined a type, we can experiment with it in - &ghci;, starting by using the :load command - to load our source file. - - &mytype.ghci:load; - - - Remember the myValue variable we defined? Here - it is. - - &mytype.ghci:myValue; - - A constructor serves as both a function for creating a value - and a tag identifying what type of value we have. - The value that &ghci; prints is telling us that we have created - a value with the tag MyConstructor, with the given - values in each slot. (By the way, this is why we - had to add deriving (Show) to the definition of our - type; without that, &ghci; would print an error message, telling - us that it doesn't know how to print a value of this - type.) - - We can construct new values interactively in &ghci;, - too. - - &mytype.ghci:newValue; - - The &ghci; command :type lets us see what - the type of that expression is. - - &mytype.ghci:valueType; - - To find out more about a type, we can use some of &ghci;'s - browsing capabilities. The :info command - gets &ghci; to tell us everything it knows about a - type. - - &mytype.ghci:info; - - We can also find out why we use - MyConstructor to construct a new value of - type MyType. - - &mytype.ghci:type; - - From Haskell's perspective, then, a constructor is just - another function, one that happens to return a value of the type - we want to construct. - - - - Algebraic data types - - The Bool type that we introduced earlier is the - simplest example of a sort of type called an algebraic - data type. An algebraic data type has a fixed set - of possible values, each of which is identified by a distinct - constructor. - - The use of the word algebraic simply - indicates that the components of an algebraic data type are used - together. - - In the case of Bool, the type has two - constructors, True and False. Each - constructor is separated by a | character, - which we can read as or. These are usually - referred to as alternatives or cases. - - &Bool.hs:Bool; - - Each constructor of an algebraic type can take - zero or more arguments; the numbers and types of the arguments - accepted by each constructor are independent. For example, - here's one way we might represent versions of the Windows - operating system, where old releases were monolithic, and newer - releases have service pack levels denoting major - updates after their initial releases. - - &OsVersion.hs:WindowsVersion; - - The alternatives that represent older releases don't need - arguments, but those for the newer releases need an - Int to represent the patch level. - - - Analogues to algebraic data types in other - languages - - Algebraic data types provide a single structuring - mechanism in instances where other languages have several - different building blocks. Here are some analogues from C and - C++, which might make it clearer what we can do with algebraic - data types. - - With just one constructor, an algebraic data type groups - related values into one, and gives that value an identity - distinct from other types. It corresponds to a - struct in C or C++, and its components correspond to the - fields of a struct. Here's such a struct. - - &types.c:coord; - - And here's an equivalent type in Haskell. - - &SimpleTypes.hs:Coord; - - The only significant difference is that the fields in the - Haskell type are anonymous and positional. In , we'll see how to use names - to access the fields of the Coord type. - - If an algebraic data type has multiple alternatives, we - can think of it as similar to a union in C or - C++. A big difference between the two is that a union doesn't - tell us which alternative is actually present; we have to - record which alternative we're using ourselves, usually in - another field of a struct. This means that unions can - sometimes be sources of bugs, where our notion of which - alternative we should be using is incorrect. - - With an algebraic data type, Haskell stores the - constructor that we use in the value that we create, so we - don't need to manually sock it away somewhere else. - - Algebraic data types also serve where we'd use - an enum in C or C++, to represent a range of - discrete symbolic values. Such algebraic data types are - sometimes referred to as enumeration types. Here's an example - from C. - - &types.c:roygbiv; - - And here's a Haskell equivalent. - - &SimpleTypes.hs:Roygbiv; - - - - - A few final notes - - From reading the preceding sections, it should be clear - that all of the data types that we define - with the data keyword are algebraic data types. - Some may have just one alternative; others have several; but - they're all using the same organising machinery. - - Another useful thing to know is that it's perfectly okay - for the name of an algebraic type to have the same name as one - of its constructors to be the same. It's always obvious from - context whether we're using a name to refer to a type or a - constructor, so this doesn't introduce any ambiguity. - - Giving a type and its constructor the same name is in fact - something that we do frequently when the type has just one - constructor. - - &TypeName.hs:PerfectlyNormal; - - When a type has multiple constructors, it's still legal to - give one of the constructors the same name as the type, but - this is much less common. - - &TypeName.hs:LegalButWeird; - - - - - - Parameterised types - - In our discussion of lists, we mentioned that we can create - a list of values of any type. We can define our own types that - allow this, too. To do this, we introduce variables into a type - declaration. - - &Wrapper.hs:Wrapper; - - Here, the variable a is not a - regular variable; it's called a type - variable, because it indicates that our - Wrapper type takes another type as its parameter - (hence calling it a parameterised type). What this lets us do is - use Wrapper on values of any type. - - &Wrapper.hs:wrappedTypes; - - As usual, we can load our source file into &ghci; and - experiment with it. - - &wrapper.ghci:experiment; - - Wrapper is a generic container - type (albeit a fairly useless one); we can construct a - Wrapper from a value of any type. It is also - strongly typed; the type of whatever it contains is represented in - its own type. - - To once again extend an analogy to more familiar languages, - this gives us a facility that bears some resemblance to - templates in C++, and to generics in Java. (In fact, Java's - generics facility was inspired by several aspects of Haskell's - type system.) - - We can nest uses of parameterised types inside each other, - but when we do, we may need to use parentheses to tell the - Haskell compiler what we mean. - - &Wrapper.hs:parens; - - - - Back to writing functions: local variables - - Let's take a break from writing about types for a few - moments. Within the body of a function, we can introduce new - local variables whenever we need them, using a let - expression. As an example, let's write a function that - calculates the real-valued roots of the quadratic equation - a * (x ** 2) + b * x + c == 0. - - &Roots.hs:realRoots; - - The keywords to look out for here are let, - which starts a block of variable declarations, and - in, which ends it. Each line introduces a new - variable. The name is on the left of the =, - and its value on the right. We can use these variables both - within our block of variable declarations and in the expression - that follows the in keyword. - - There's no problem with a variable earlier in a - let block referring to a later one, or even with - them referring to each other. (In some functional languages, - this sort of flexible let is named - letrec.) - - We can have multiple let blocks within an - expression. There's also another mechanism we can use to - introduce local variables, called a where block. - The definitions in a where block apply to the code - that precedes it. Let's illustrate what we - mean with another example. - - &Roots.hs:roots; - - Here, the roots function returns the - real roots when they're defined, and the complex roots - otherwise. (We left out the divide-by-zero case for simplicity.) - While a where clause initially looks very weird - to non-Haskell programmers, it's a great way to put the - important code early, followed by the auxiliary - definitions that support it. After a while, you'll find - yourself missing where clauses in languages that - lack them! + + Understanding a function's type signature hunk ./en/ch03-funcs-types.xml 862 - The main difference between let and - where is one of scope. The scope of a - let only extends to the expression after the - in keyword, while the variables introduced by a - where clause are visible upwards to the beginning - of the block that it belongs to. Also, - let is always paired with an expression, but - where is paired with a block of equations. + Let's take a look at a function's type. hunk ./en/ch03-funcs-types.xml 864 - We'll be talking more about how to write let - expressions and where clauses in . + &func.ghci:lines.type; hunk ./en/ch03-funcs-types.xml 866 - - A few observations about complex numbers + We can read the -> above as + to, which loosely translates to + returns. The signature as a whole thus reads as + lines has the type + String to list-of-String. + Let's try applying the function. hunk ./en/ch03-funcs-types.xml 873 - It's probably obvious from context above, but - (:+) is the constructor for a complex - number, taking the real part on the left and the imaginary - part on the right. + &func.ghci:lines; hunk ./en/ch03-funcs-types.xml 875 - Also, Complex is parameterised over the type - of complex number it should represent. In practice, it only - makes much sense to use Complex Double, since - &GHC; implements Double more efficiently than - Float. - + The \n in the input string is an + escaped newline (line ending) character: + lines splits a string on line boundaries. + Notice that its type signature gave us a strong hint as to what + the function might actually do. This is an incredibly valuable + property of types in a functional language. hunk ./en/ch03-funcs-types.xml 882 - - Local functions + A side effect modifies the state of the + system, for example by reading or writing a file. Most Haskell + functions can't cause side effects, so we refer to them as + pure. The result of applying a pure + function can only depend on its arguments. We can often get a + strong hint of what a pure function does by simply reading its + name and understanding its type signature. As an example, let's + look at not. hunk ./en/ch03-funcs-types.xml 891 - You'll have noticed that Haskell's syntax for defining a - variable looks very similar to its syntax for defining a - function. This symmetry is preserved in let and - where blocks; we can define local - functions just as easily as local - variables. All of the same syntax - applies as at the top level: we can use multiple equations, - patterns, and guards. + &func.ghci:not.type; hunk ./en/ch03-funcs-types.xml 893 - &LocalFunction.hs:pluralise; + Even if we didn't know the name of this function, its + signature alone limits the possible valid behaviours it could + have. hunk ./en/ch03-funcs-types.xml 897 - In this example, we define and use a local function - plural using several equations. Local - functions can freely use variables from the scopes that - enclose them; here, we use word from the - definition of the outer function - pluralise. In the definition of - pluralise, the map - function (which we'll be revisiting in the next chapter) - applies the local function plural to - every element of the counts list. - + + + Ignore its argument, and always return either + True or False. + + + Return its argument unmodified. + + + Negate its argument. + + hunk ./en/ch03-funcs-types.xml 911 - - How to represent a complicated result - - We sneaked two previously unseen standard types, - Maybe and Either, into our - root-finding examples. We use Maybe when it might - not make sense to return a normal result, for example because a - function's result is undefined for some inputs. We use - Just to say we have a - result, and the argument to Just - is that result. When we can't give a result, we use - Nothing, which takes no arguments. - - &realRoots.ghci:maybe; - - Why do we need Maybe here? The real-valued - roots of a quadratic equation are infinity when - a, the coefficient of x ** - 2, is zero. - - &realRoots.ghci:a0; - - They're also not defined when b ** 2 - 4 * a * - c is negative, because we would need to use complex - numbers to represent a negative square root. - - &realRoots.ghci:complex; - - Otherwise, we can return a normal result, wrapped in - Just. - - &realRoots.ghci:just; - - Compared to error, which we saw in - , Maybe has - the huge advantage that it's a normal value, which we return to - our caller to deal with. Calling error, by - contrast, is more akin to pulling on the program's ejector seat - handle as it disintegrates around us; something terrible has - happened, and we need to give up right now. - - The Either a b type gives us even more - flexibility than Maybe, as it's got two type - parameters. We can wrap a value of any type a with Left, or a value of - an unrelated type b with - Right. Our roots function - uses this to return a Double when the real roots - are defined, and a Complex Double when they're - not. - + + Polymorphism in Haskell hunk ./en/ch03-funcs-types.xml 914 - - The offside rule, and white space in a function - body + When we introduced lists, we mentioned that the list type is + polymorphic. We'll talk about polymorphism in more detail here: + like strong type, it's a generic term that has + distinct meanings in different programming language + communities. hunk ./en/ch03-funcs-types.xml 920 - In our definition of realRoots, the - left margin of our text wandered around quite a bit. This was - not an accident: in Haskell, white space has meaning. + If we want to fetch the last element of a list, we use the + last function. The value that it returns + must have the same type as the elements of the list, but + last operates in the same way no matter + what type those elements actually + are. hunk ./en/ch03-funcs-types.xml 927 - Haskell uses indentation as a cue to parse sections of code. - This use of layout to convey structure is sometimes called the - offside rule. At the top level, the first declaration or - definition can start in any column, and the Haskell compiler or - interpreter remembers that indentation level. Every subsequent - top-level declaration must have the same indentation. + &func.ghci:last; hunk ./en/ch03-funcs-types.xml 929 - Here's an illustration of the top-level indentation rule. - Our first file, GoodIndent.hs, is well - behaved. + To capture the idea that last needs its + argument and result types to be the same, but doesn't need to + know their details, its type signature contains a type + variable. hunk ./en/ch03-funcs-types.xml 934 - &GoodIndent.hs:good; + &func.ghci:last.type; hunk ./en/ch03-funcs-types.xml 936 - Our second, BadIndent.hs, doesn't play - by the rules. + Here, a is the type variable. + We can read the signature as takes a list, all of whose + elements have some type a, and + returns a value of type a. hunk ./en/ch03-funcs-types.xml 942 - &BadIndent.hs:bad; + + Identifying a type variable hunk ./en/ch03-funcs-types.xml 945 - Here's what happens when we try to load the two files into - &ghci;. + Type variables always start with a lowercase letter. You + can always tell a type variable from a normal variable by + context, because the languages of types and functions are + separate: type variables live in type signatures, and regular + variables live in normal expressions. hunk ./en/ch03-funcs-types.xml 951 - &indent.ghci:load; + It's common Haskell practice to keep the names of type + variables very short. One letter is overwhelmingly common; + two is rare; and three or more is practically unheard of. + Type signatures are usually brief, so we gain more in + readability by keeping names short than we would by making + them descriptive. + hunk ./en/ch03-funcs-types.xml 959 - An empty line is treated as a continuation of the current - item, as is a line indented to the right of the current current - item. + When a function has type variables in its + signature, indicating that some of its arguments can be of any + type, we call the function polymorphic. hunk ./en/ch03-funcs-types.xml 963 - The rules for let expressions and - where clauses are similar. After a - let or where keyword, the Haskell - compiler or interpreter remembers the indentation of the next - token it sees. If the next line is empty, or its indentation is - further to the right than the previous line, this counts as - continuing the previous line. On the other hand, if the - indentation is the same as the previous line, this is treated as - beginning a new item in the same block. + When we want to call last on, say, a + list of Char, we substitute Char for + each a throughout the type + signature, which gives us the type of + last-over-[Char] as + [Char] -> Char. hunk ./en/ch03-funcs-types.xml 970 - Here are nested uses of let and - where. + This kind of polymorphism is called + parametric polymorphism. The choice of + naming is easy to understand by analogy: just as a function can + have parameters that we can later bind to real values, a Haskell + type can have parameters that we can later bind to other + types. hunk ./en/ch03-funcs-types.xml 977 - &letwhere.hs:let; + + A little nomenclature hunk ./en/ch03-funcs-types.xml 980 - In the body of bar, the variable - a is only visible within the let - expression that defines it. It's not visible to the - let expression that defines b; - only the result of the inner - let expression is visible. + If the type of a function contains type parameters, we can + call the function polymorphic, or say that it has a + parameterised type. This is also the case for types. + hunk ./en/ch03-funcs-types.xml 985 - &letwhere.hs:where; + When a function or type has a parameterised type, we've + already noted that it doesn't care what the actual type is. + However, we can make a stronger statement: it has no + way to find out what the real type is, or to + manipulate a value of that type. It can't create a value; + neither can it inspect one. All it can do is treat it as a + fully abstract black box. We'll cover one reason + that this is important soon. hunk ./en/ch03-funcs-types.xml 994 - Similarly, the scope of the first where clause - is the definition of foo, but the scope of - the second is just the first where clause. + Parametric polymorphism is the most visible kind of + polymorphism that Haskell supports. Here are a few forms of + polymorphism that are common in other languages, but not present + in Haskell. hunk ./en/ch03-funcs-types.xml 999 - The indentation we use for the let and - where clauses makes our intentions easy to figure - out. + In mainstream object oriented languages, + subtype polymorphism is more widespread + than parametric polymorphism. The subclassing mechanisms of C++ + and Java give them subtype polymorphism. A base class defines a + set of behaviours that its subclasses can modify and + extend. Since Haskell isn't an object oriented language, it + doesn't provide subtype polymorphism. hunk ./en/ch03-funcs-types.xml 1007 - - A note about tabs versus spaces + Also common is coercion polymorphism, + which allows a value of one type to be implicitly converted into + a value of another type. Many languages provide some form of + coercion polymorphism: one example is automatic conversion + between integers and floating point numbers. Haskell avoids + even this kind of simple automatic coercion. hunk ./en/ch03-funcs-types.xml 1014 - If you are using a Haskell-aware text editor (e.g. Emacs), - it is probably already configured to use space characters for - all white space within a line. If your editor is - not Haskell-aware, you should configure - it to only use space characters. + This is not the whole story of polymorphism in Haskell: + we'll return to the subject in . hunk ./en/ch03-funcs-types.xml 1018 - The reason for this is simple portability. In an editor - that uses a fixed-width font, tab stops are by default placed - at different intervals on Unix-like systems (every eight - characters) than on Windows (every four characters). This - means that no matter what your personal beliefs are about - where tabs belong, you can't rely on someone else's editor - honouring your preferences. Any indentation that uses tabs is - going to look broken under someone's - configuration. In fact, this could lead to compilation - problems, as the Haskell language standard requires - implementations to use the Unix tab width convention. - Using space characters avoids these problem - entirely. - + + Reasoning about polymorphic functions hunk ./en/ch03-funcs-types.xml 1021 - - The offside rule is not mandatory + In , we talked about + figuring out the behaviour of a function based on its type + signature. We can apply the same kind of reasoning to + polymorphic functions. Let's look again at + fst. hunk ./en/ch03-funcs-types.xml 1027 - We can use explicit structuring instead of layout to - indicate what we mean. To do so, we start a block of - equations with an opening curly brace; separate each item with - a semicolon; and finish the block with a closing curly brace. - The following two uses of let have the same - meanings. + &func.ghci:fst.type; hunk ./en/ch03-funcs-types.xml 1029 - &Braces.hs:braces; + First of all, notice that its argument contains two type + variables, a and b, signifying that the elements of the + tuple can be of different types. hunk ./en/ch03-funcs-types.xml 1034 - When we use explicit structuring, the normal layout rules - don't apply, which is why we can get away with unusual - indentation in the second let expression. + The result type of fst is a. We've already mentioned that + parametric polymorphism makes the real type inaccessible: + fst doesn't have enough information to + construct a value of type a, + nor can it turn an a into a + b. So the + only possible behaviour it can have is to + return the first element of the pair. hunk ./en/ch03-funcs-types.xml 1044 - We can use explicit structuring anywhere that we'd - normally use layout. It's valid for where - clauses, and even top-level declarations. Just remember that - although the facility exists, explicit structuring is hardly - ever actually used in Haskell - programs. + + Further reading + + We can extend this line of reasoning to more complicated + polymorphic functions. The paper covers this + procedure in depth. + hunk ./en/ch03-funcs-types.xml 1055 - - Recursive types - - Here's a definition of a binary tree type. - - &Tree.hs:Tree; - - We call this a recursive type because - Tree, the type we're defining, appears both on the - left hand side and the right hand side of the definition: we - define the type in terms of itself. - - - - A little more about lists - - Now that we're getting familiar with some of the jargon - around types, we can revisit lists. Haskell's list type is a - parameterised type, because we can make lists of any other type. - It is also an algebraic data type, with two constructors. One - is the empty list, written [] (sometimes pronounced - nil, which is borrowed from Lisp). - - &list.ghci:empty; - - The other is the (:) operator, often - pronounced cons (this is short for - construct, and also borrowed from Lisp). The - (:) operator takes an element and a list, - and constructs a new list. - - &list.ghci:cons; - - We can use (:) repeatedly to add new - elements to the front of a list. - - &list.ghci:cons2; - - The right hand side of (:) must be a - list, and of the correct type. If it's not, we'll get an - error. - - &list.ghci:cons.bad; - - Because (:) constructs a list from - another list, the list type is recursive. So here we have a - built-in type that's parameterised, recursive, and - algebraic. - - One consequence of lists being generic is that lists of - lists, for example, aren't special in any way. - - &list.ghci:listlist; - - This has type [[String]], a list of lists of - strings. But since String is just a synonym for - [Char], it's really a list of - lists of lists of Char. Whew! - - We're not limited to building up lists one element at a - time. Haskell defines an inline function, - (++), that we can use to append one list - onto the end of another. - - &list.ghci:append; - - The concat function takes a list of - lists, and concatenates the whole lot into a single list. - - &list.ghci:concat; - - - - - Unit, the zero-element tuple - - Haskell has a special tuple type with no elements, written - (), and pronounced unit. - - &unit.ghci:unit; - - This type is only really used with parameterised data types, - to indicate that one of the type parameters isn't being used. - Since it doesn't encode any information, it's a rough equivalent - to void in C. - - Here's an example of () in use. We can - generalise our earlier Tree type a little, so that - internal nodes contain values of type a, while leaves contain values of type - b. - - &Tree.hs:ComplexTree; - - If we wanted to create a ComplexTree where we - wanted to store Ints on the leaves, but don't care - about the internal nodes. We would write its type as - ComplexTree () Int. - - - - Pattern matching - - Although we introduced a handful of functions earlier that - can operate on lists, we've yet to see how we might generally - get values out of a constructed algebraic data type. Haskell - has a simple pattern matching facility that we can use to this - end. - - A pattern lets us peer inside a compound value and bind - variables to the values it contains. In fact, when we define a - function, the parameters to that function are really patterns - that bind our variables to an entire value. - - Here's an example of pattern matching in action on a list; - we're going to add all elements of the list together. - - &add.hs:sumList; - - See that (x:xs) on the left of the first line? - The : means match the head of a - list; that's the familiar list constructor, - (:), in action in a new way. The variables - x and xs are given the - values of (bound to) the head and tail of the - list, respectively. The whole pattern is wrapped in parentheses - so Haskell won't parse it as three separate arguments. - - What effect does pattern matching have? Haskell will only - evaluate the right hand side of an equation if it can match all - of the patterns on the left hand side. In the definition of - sumList above, the right hand side of the - first equation won't be evaluated if the input list is empty. - Instead, Haskell will fall through to the - equation on the following line, which does - have a pattern for the empty list, and it will evaluate - that. - - It might initially look like we have two functions - named sumList here, but Haskell lets us - define a function as a series of equations; so in fact these two - clauses are defining the behaviour of one function, over - different inputs. (By the way, there's already a standard - function, sum, that does this - adding-of-a-list for us. This sumList is - purely for illustration.) - - The syntax for pattern matching on a tuple is similar to the - syntax for constructing a tuple. Here's a function that returns - the third element from a three-tuple. - - &Tuple.hs:third; - - There's no limit on how deep within a value a - pattern can look. Here's a definition that looks both inside a - tuple and inside a list within that tuple. - - &Tuple.hs:complicated; - - We can try this out interactively. - - &tuple.ghci:complicated; - - Wherever a literal value is present in a pattern - (True and 5 in the tuple - pattern above), that value must match exactly for the pattern - match to succeed. If every pattern within a series of equations - fails to match, we get a runtime error. - - &tuple.ghci:nomatch; - - We can pattern match on algebraic data types using their - constructors. Remember the Wrapper type we defined - earlier? Here's how we can extract a wrapped value from a - Wrapper. - - &Wrapper.hs:unwrap; - - Let's see it in action. - - &wrapper.ghci:unwrap; - - Notice that Haskell infers the type of the - unwrap function based on the constructor - we're using in our pattern. If we're trying to match a value - whose constructor is Wrapper, then the type - of that parameter must be Wrapper a. - - And for good measure, here's how we can define functions to - access the fields of the Coord type we defined in - . - - &SimpleTypes.hs:accessors; - - - The ordering of patterns is important - - Haskell tests patterns for matches in the order in which - we list them in our code. It goes from top to bottom and - stops at the first match; it does not - check every pattern and use the best match. - - If you're familiar with pattern matching from a logic - programming language like Prolog, Haskell's facility is - simpler and less powerful. It doesn't provide backtracking or - unification. - - - Here are some rules of thumb to help with remembering how - pattern matching works. A constructor in a pattern - checks that the matched value has the right shape. A - literal value ensures that that portion of the value has exactly - the matching contents. And a variable makes no assertions about - either the shape or contents of the matched value; it matches anything, - and gives the variable that value. - - So the pattern (3:xs) first of all is an - assertion that a matching value is a non-empty list, by matching - against the (:) constructor. It also - ensures that the head of the list is the literal value - 3. And whatever the tail of the list is, it - will be bound to the variable xs. - - - The don't-care, or wild card, pattern - - When we're writing a pattern, we can specify that we don't - care what value a particular value within a structure has, - without actually binding that value to a name. The notation - for this is _ (called a wild card or don't - care), and we use it as follows. This function - tells us whether the result of the roots - function we defined earlier is real-valued or not. - - &Roots.hs:isRealValued; - - Here, we don't care about the value of the result, just - about which constructor was used to create it. If it was - Left, the result must be a complex - number, otherwise it must be real. We can use a wild card for - the entire second pattern; there's no need to see if the - constructor is Right, because it - must be; Either only has two - constructors. - - In a pattern, a wild card acts similarly - to a variable, only it doesn't bind the value to a name. - While we can't put the same variable name multiple times in a - single pattern, we can use a wild card as many times as we - need to. - - Another advantage of wild cards is that a Haskell compiler - can warn us if we introduce a variable name in a pattern, but - don't use it in a function's body; defining something but - forgetting to use it can often indicate a bug. Using a wild - card instead of an unused variable makes it explicit that we - really don't care what value is present, and will prevent such - a warning. - - Wild cards also help readability, as they make it easier - to tell which values we're really using. - - &SimpleTypes.hs:niceAccessors; - - - - - The case expression - - We're not limited to using patterns in function - definitions. The case expression lets us match - patterns at any time. Here's what it looks like. - - &Roots.hs:hasRealRoots; - - The case keyword is followed by an arbitrary - expression; the result of this expression is what we're - pattern matching on. The of keyword signifies - the end of the expression and the beginning of the block of - patterns and expressions. - - Each item in the block consists of a pattern, followed by - an arrow ->, followed by an expression to - evaluate if that pattern matches. The result of the - case expression is the result of the expression - associated with the first pattern to match, taken from top to - bottom. + + The type of a function of more than one argument hunk ./en/ch03-funcs-types.xml 1058 - To express here's the expression to evaluate if - none of the other patterns match, we would just use - the wild card pattern _ as the last in our list - of patterns. - - - - A flying visit back to the where clause - - Now that we've seen that we can define a function as a - series of equations, the usefulness of the where - clause should be a bit more clear. Variables that we define - inside a where clause are visible across all of - the equations that precede it in a single block. - - - - Early pattern matching pitfalls - - There are a few ways in which new Haskell programmers can - misunderstand or misuse patterns. Here are a few potential - missteps that you can easily avoid. - - There's no way to write a pattern that compares - a value with a variable. Matching a pattern only lets us - perform exact comparisons against combinations of constructors - and simple values. - - Here's a well-intentioned example of pattern matching gone - awry. This code compiles cleanly, but depending on what you - expect it to do, it might surprise you. - - &BogusPattern.hs:isHead; + So far, we haven't seen a signature for a function + that takes more than one argument. We've + already used a few such functions; let's look at the signature + of one, take. hunk ./en/ch03-funcs-types.xml 1063 - A naive glance suggests that this code is trying - to check the value of f to see if it's - actually the standard function head, but - here's what it is really doing. + &func.ghci:take.type; hunk ./en/ch03-funcs-types.xml 1065 - Because the first pattern in the - case expression is a variable, this branch of the - case will always match, no - matter what the value of f is. The name - head thus acts as a local variable whose - value is the value of f, which hides the - global definition of the well-known head - function. + It's pretty clear that there's something going on + with an Int and some lists, but why are there two + -> symbols in the signature? Haskell + parses this chain of arrows from right to left; that is, + -> is right-associative. If we introduce + parentheses, we can make it clearer how this type signature is + interpreted. hunk ./en/ch03-funcs-types.xml 1073 - - Irrefutable patterns + &Take.hs:type; hunk ./en/ch03-funcs-types.xml 1075 - A pattern that consists only of a variable will always - match, because it's not being compared against any value that - could cause the match to fail. We refer to patterns that - always match as irrefutable. - + From this, it looks like we ought to read the type + signature as a function that takes one argument, an + Int, and returns another function. That other + function also takes one argument, a list, and returns a list of + the same type as its result. hunk ./en/ch03-funcs-types.xml 1081 - The first pattern always matches, because it's - irrefutable. But the second pattern also - always matches, because it uses a wild card. However, because - Haskell attempts to match patterns in the order in which we - write them, the first pattern will always succeed, and the - second pattern will never actually be reached. Because the two - patterns will match the same values, they are said to - overlap. If &GHC; ever complains to you - about overlapping patterns, it's telling you that one of your - patterns is the same as another, and so it will never actually - be matched. + This is an intriguing idea, but it's not yet easy + to see just yet what its consequences might be. We'll return to + this topic in , once we've + spent a bit of time writing functions. hunk ./en/ch03-funcs-types.xml 1086 - Another thing to be aware of is that a variable - can only appear once in a pattern. For example, we can't put - a variable in multiple places within a pattern to express the - notion this value and that should be - identical. + We can now write a type signature for the + myDrop function that we defined + earlier. hunk ./en/ch03-funcs-types.xml 1090 - The way around these restrictions of Haskell's patterns is - to use patterns in combination with a language facility called - guards, which we'll talk about - next. + &myDrop.hs:myDrop.type; hunk ./en/ch03-funcs-types.xml 1092 - hunk ./en/ch03-funcs-types.xml 1094 - - Conditional evaluation with guards - - We can further extend our expressive arsenal using - guards. A guard is an expression of type - Bool; if it evaluates to True, - the equation that follows it is evaluated. Otherwise, the next - guard in the series is evaluated, and so on. A series of guards - is only checked if the patterns that they're associated with - match. Here's an example of guards in action. - - &Roots.hs:guardedRoots; - - Each guard is introduced by a | symbol, - followed by the guard expression, then an = symbol - (or -> if within a case - expression), then the expression to evaluate if the guard - succeeds. A guard expression can use any variables matched in - the pattern that precedes it. - - The otherwise used in the second guard - has an obvious meaning: it's the expression to evaluate if - previous guards all evaluate to False. It's - not a special piece of syntax, though; it's just a predefined variable - whose value is True. - - We can use guards anywhere that we can use - patterns. The advantage of writing a function as a series of - equations using pattern matching and guards is that it often - makes code much clearer. Remember the - myDrop function we defined in ? - - &myDrop.hs:myDrop.noid; - - Here's a reformulation of that function using - patterns and guards. Instead of reasoning about what an - if expression is doing and which branch will be - evaluated, the code uses a series of equations with simple - patterns and guards. Hoisting the control decisions to the - outside of the code, instead of burying it inside - with if expressions, lets us enumerate up front - the cases in which we expect the behaviour of the function to - differ. - - &myDrop.hs:niceDrop; + + Exercises hunk ./en/ch03-funcs-types.xml 1097 - Let's return to one of the limitations of patterns that we - mentioned in the previous section: the fact that we can't - check two variables within a pattern for equality. - We can express this quite easily by following the pattern with a - guard. + + + + Haskell provides a standard function, + last :: [a] -> a, that returns the last + element of a list. Write a function + lastButOne, that returns the element + before the last. + + hunk ./en/ch03-funcs-types.xml 1108 - &Guard.hs:secondEqualsThird; + + + Load your lastButOne + function into &ghci;, and try it out on lists of different + lengths. What happens when you pass it a list that's too + short? + + + hunk ./en/ch03-funcs-types.xml 1118 - Here, for good measure, we've illustrated guard - syntax in a case expression. This guard expression - compares the variables matched in the pattern for equality. hunk ./en/ch03-funcs-types.xml 1120 - - Infix functions - - Usually, when we define or call a function in Haskell, we - write the name of the function, followed by its arguments; this - is called prefix notation, because the name of the function - comes before its arguments. For a function that takes two - arguments, we have the option of using it in - infix form, between its first and second - arguments. This allows us to write expressions using functions - as if they were infix operators. - - The syntax for defining or calling a function in infix form - is to enclose the name of the function in backtick characters - (sometimes known as backquotes). Here's a simple infix - definition. - - &Plus.hs:plus; - - Defining a function in infix form doesn't change anything - about the behaviour of the function. We can call the function - using infix or prefix notation, as we prefer. - - &infix.ghci:plus; - - Infix notation is useful for more than just our own - functions. For example, Haskell's standard - Data.List module defines a function, - isPrefixOf, that indicates whether all - elements of its first argument are equal to the first elements - of its second argument. - - &infix.ghci:type; - - Let's define a few variables in &ghci;. - - &infix.ghci:vars; - - If we call isPrefixOf using prefix - notation, we can have a hard time remembering which argument - we're checking for as a prefix of the other. - - &infix.ghci:prefix; - - But if we use infix notation, the code reads - more naturally; it's now obvious that we're checking the - variable on the left to see if it's a prefix of the variable on - the right. - - &infix.ghci:infix; - - There's no hard-and-fast rule that dictates when you ought - to use infix versus prefix notation, although prefix notation is - far more common. It's best to choose whichever makes your code - more readable in a specific situation. - - - The backtick notation is not a general mechanism: it's a - piece of special syntax that applies only to names. For - example, we can't put backticks around an expression that - returns a function, and then treat that as an infix - function. - - hunk ./en/ch03-funcs-types.xml 1124 - In this chapter, we've had a whirlwind overview of Haskell's - type system and much of its syntax. We've read about basic - types, compound types, and how to write our own algebraic data - types. We've seen how to write functions, and how to declare - local variables within them. We've read about Haskell's offside - rule for laying out functions, and how we can avoid it if we - need to. We've seen conditional evaluation, pattern matching - and guards. We've discussed error handling. + In this chapter, we've had a whirlwind overview of + Haskell's type system and much of its syntax. We've read about + the most common types, and discovered how to write simple + functions. We've been introduced to polymorphism, conditional + expressions, and how to reason about lazy evaluation. hunk ./en/ch03-funcs-types.xml 1130 - This all amounts to a lot of information to absorb. In - , we'll build on this basic knowledge to - understand how we can write, and think about, code in + This all amounts to a lot of information to + absorb. In , we'll build on this + basic knowledge to further enhance our understanding of hunk ./en/ch03a-defining-types.xml 1 + + + + Defining types, and more about functions + + Topics: Creating new types. Ways to write clearer + functions. + + + Defining a new data type + + Although lists and tuples are useful, we'll still often want + to construct new data types of our own. We define a new data + type using the data keyword. + + &MyType.hs:MyType; + + The MyType after the data keyword + is the name of our new type. (As we've already mentioned, a + type name must start with a capital letter.) The string + MyConstructor is the name of the + constructor we'll call to create a value of + this type. (As with a type name, a constructor name must start + with a capital letter.) Finally, the Int and + String are the components of + the type. A component serves the same purpose in Haskell as a + field in a structure or class would in another language. + + + Deriving what? + + We'll explain the full meaning of deriving + (Show) later, in + . For now, + it's enough to know that we need to tack this onto a type + declaration so that &ghci; will automatically know how to + print a value of this type. + + + We can create a new value of type MyType by + treating MyConstructor as a function, and + calling it with arguments of types Int and + String. + + &MyType.hs:myValue; + + Once we've defined a type, we can experiment with it in + &ghci;, starting by using the :load command + to load our source file. + + &mytype.ghci:load; + + + Remember the myValue variable we defined? Here + it is. + + &mytype.ghci:myValue; + + A constructor serves as both a function for creating a value + and a tag identifying what type of value we have. + The value that &ghci; prints is telling us that we have created + a value with the tag MyConstructor, with the given + values in each slot. (By the way, this is why we + had to add deriving (Show) to the definition of our + type; without that, &ghci; would print an error message, telling + us that it doesn't know how to print a value of this + type.) + + We can construct new values interactively in &ghci;, + too. + + &mytype.ghci:newValue; + + The &ghci; command :type lets us see what + the type of that expression is. + + &mytype.ghci:valueType; + + To find out more about a type, we can use some of &ghci;'s + browsing capabilities. The :info command + gets &ghci; to tell us everything it knows about a + type. + + &mytype.ghci:info; + + We can also find out why we use + MyConstructor to construct a new value of + type MyType. + + &mytype.ghci:type; + + From Haskell's perspective, then, a constructor is just + another function, one that happens to return a value of the type + we want to construct. + + + + Algebraic data types + + The Bool type that we introduced earlier is the + simplest example of a sort of type called an algebraic + data type. An algebraic data type has a fixed set + of possible values, each of which is identified by a distinct + constructor. + + The use of the word algebraic simply + indicates that the components of an algebraic data type are used + together. + + In the case of Bool, the type has two + constructors, True and False. Each + constructor is separated by a | character, + which we can read as or. These are usually + referred to as alternatives or cases. + + &Bool.hs:Bool; + + Each constructor of an algebraic type can take + zero or more arguments; the numbers and types of the arguments + accepted by each constructor are independent. For example, + here's one way we might represent versions of the Windows + operating system, where old releases were monolithic, and newer + releases have service pack levels denoting major + updates after their initial releases. + + &OsVersion.hs:WindowsVersion; + + The alternatives that represent older releases don't need + arguments, but those for the newer releases need an + Int to represent the patch level. + + + Analogues to algebraic data types in other + languages + + Algebraic data types provide a single structuring + mechanism in instances where other languages have several + different building blocks. Here are some analogues from C and + C++, which might make it clearer what we can do with algebraic + data types. + + With just one constructor, an algebraic data type groups + related values into one, and gives that value an identity + distinct from other types. It corresponds to a + struct in C or C++, and its components correspond to the + fields of a struct. Here's such a struct. + + &types.c:coord; + + And here's an equivalent type in Haskell. + + &SimpleTypes.hs:Coord; + + The only significant difference is that the fields in the + Haskell type are anonymous and positional. In , we'll see how to use names + to access the fields of the Coord type. + + If an algebraic data type has multiple alternatives, we + can think of it as similar to a union in C or + C++. A big difference between the two is that a union doesn't + tell us which alternative is actually present; we have to + record which alternative we're using ourselves, usually in + another field of a struct. This means that unions can + sometimes be sources of bugs, where our notion of which + alternative we should be using is incorrect. + + With an algebraic data type, Haskell stores the + constructor that we use in the value that we create, so we + don't need to manually sock it away somewhere else. + + Algebraic data types also serve where we'd use + an enum in C or C++, to represent a range of + discrete symbolic values. Such algebraic data types are + sometimes referred to as enumeration types. Here's an example + from C. + + &types.c:roygbiv; + + And here's a Haskell equivalent. + + &SimpleTypes.hs:Roygbiv; + + + + + A few final notes + + From reading the preceding sections, it should be clear + that all of the data types that we define + with the data keyword are algebraic data types. + Some may have just one alternative; others have several; but + they're all using the same machinery. + + Another useful thing to know is that it's perfectly okay + for the name of an algebraic type to have the same name as one + of its constructors to be the same. It's always obvious from + context whether we're using a name to refer to a type or a + constructor, so this doesn't introduce any ambiguity. + + Giving a type and its constructor the same name is in fact + something that we do frequently when the type has just one + constructor. + + &TypeName.hs:PerfectlyNormal; + + When a type has multiple constructors, it's still legal to + give one of the constructors the same name as the type, but + this is much less common. + + &TypeName.hs:LegalButWeird; + + + + + + Parameterised types + + In our discussion of lists, we mentioned that we can create + a list of values of any type. We can define our own types that + allow this, too. To do this, we introduce variables into a type + declaration. + + &Wrapper.hs:Wrapper; + + Here, the variable a is not a + regular variable; it's called a type + variable, because it indicates that our + Wrapper type takes another type as its parameter + (hence calling it a parameterised type). What this lets us do is + use Wrapper on values of any type. + + &Wrapper.hs:wrappedTypes; + + As usual, we can load our source file into &ghci; and + experiment with it. + + &wrapper.ghci:experiment; + + Wrapper is a generic container + type (albeit a fairly useless one); we can construct a + Wrapper from a value of any type. It is also + strongly typed; the type of whatever it contains is represented in + its own type. + + To once again extend an analogy to more familiar languages, + this gives us a facility that bears some resemblance to + templates in C++, and to generics in Java. (In fact, Java's + generics facility was inspired by several aspects of Haskell's + type system.) + + We can nest uses of parameterised types inside each other, + but when we do, we may need to use parentheses to tell the + Haskell compiler what we mean. + + &Wrapper.hs:parens; + + + + Back to writing functions: local variables + + Let's take a break from writing about types for a few + moments. Within the body of a function, we can introduce new + local variables whenever we need them, using a let + expression. As an example, let's write a function that + calculates the real-valued roots of the quadratic equation + a * (x ** 2) + b * x + c == 0. + + &Roots.hs:realRoots; + + The keywords to look out for here are let, + which starts a block of variable declarations, and + in, which ends it. Each line introduces a new + variable. The name is on the left of the =, + and its value on the right. We can use these variables both + within our block of variable declarations and in the expression + that follows the in keyword. + + There's no problem with a variable earlier in a + let block referring to a later one, or even with + them referring to each other. (In some functional languages, + this sort of flexible let is named + letrec.) + + We can have multiple let blocks within an + expression. There's also another mechanism we can use to + introduce local variables, called a where block. + The definitions in a where block apply to the code + that precedes it. Let's illustrate what we + mean with another example. + + &Roots.hs:roots; + + Here, the roots function returns the + real roots when they're defined, and the complex roots + otherwise. (We left out the divide-by-zero case for simplicity.) + While a where clause initially looks very weird + to non-Haskell programmers, it's a great way to put the + important code early, followed by the auxiliary + definitions that support it. After a while, you'll find + yourself missing where clauses in languages that + lack them! + + The main difference between let and + where is one of scope. The scope of a + let only extends to the expression after the + in keyword, while the variables introduced by a + where clause are visible upwards to the beginning + of the block that it belongs to. Also, + let is always paired with an expression, but + where is paired with a block of equations. + + We'll be talking more about how to write let + expressions and where clauses in . + + + A few observations about complex numbers + + It's probably obvious from context above, but + (:+) is the constructor for a complex + number, taking the real part on the left and the imaginary + part on the right. + + Also, Complex is parameterised over the type + of complex number it should represent. In practice, it only + makes much sense to use Complex Double, since + &GHC; implements Double more efficiently than + Float. + + + + Local functions + + You'll have noticed that Haskell's syntax for defining a + variable looks very similar to its syntax for defining a + function. This symmetry is preserved in let and + where blocks; we can define local + functions just as easily as local + variables. All of the same syntax + applies as at the top level: we can use multiple equations, + patterns, and guards. + + &LocalFunction.hs:pluralise; + + In this example, we define and use a local function + plural using several equations. Local + functions can freely use variables from the scopes that + enclose them; here, we use word from the + definition of the outer function + pluralise. In the definition of + pluralise, the map + function (which we'll be revisiting in the next chapter) + applies the local function plural to + every element of the counts list. + + + + + Reporting errors + + Haskell provides a standard function, + error :: Char -> a, that we can call + when something has gone terribly wrong in our code. We give it + a string parameter, which is the error message to display. Its + type signature looks peculiar; how can it produce a value of any + type a given just a string? The + answer is that it doesn't, because error is + special. + + It has that type so that we can call it anywhere + and it will always have the right type. However, instead of + returning a value like a normal function, it immediately + terminates evaluation, and prints the error message we give it. + Here's an example. The mySecond function + returns the second element of its input list, but fails if its + input list isn't long enough. + + &MySecond.hs:mySecond; + + As usual, we can see how this works in practice in + &ghci;. + + &error.ghci:mySecond; + + Notice that in the third case above, where we're + trying to use the result of the call to + mySecond as the argument to another + function, evaluation still terminates and drops us back to the + &ghci; prompt. This is both the strength and weakness calling + error: it doesn't let us distinguish + between recoverable errors and problems so severe that they + really should terminate a program. We'll see one way around + this problem in . + + + + How to represent a complicated result + + We sneaked two previously unseen standard types, + Maybe and Either, into our + root-finding examples. We use Maybe when it might + not make sense to return a normal result, for example because a + function's result is undefined for some inputs. We use + Just to say we have a + result, and the argument to Just + is that result. When we can't give a result, we use + Nothing, which takes no arguments. + + &realRoots.ghci:maybe; + + Why do we need Maybe here? The real-valued + roots of a quadratic equation are infinity when + a, the coefficient of x ** + 2, is zero. + + &realRoots.ghci:a0; + + They're also not defined when b ** 2 - 4 * a * + c is negative, because we would need to use complex + numbers to represent a negative square root. + + &realRoots.ghci:complex; + + Otherwise, we can return a normal result, wrapped in + Just. + + &realRoots.ghci:just; + + Compared to error, which we saw in + , Maybe has + the huge advantage that it's a normal value, which we return to + our caller to deal with. Calling error, by + contrast, is more akin to pulling on the program's ejector seat + handle as it disintegrates around us; something terrible has + happened, and we need to give up right now. + + The Either a b type gives us even more + flexibility than Maybe, as it's got two type + parameters. We can wrap a value of any type a with Left, or a value of + an unrelated type b with + Right. Our roots function + uses this to return a Double when the real roots + are defined, and a Complex Double when they're + not. + + + + The offside rule, and white space in a function + body + + In our definition of realRoots, the + left margin of our text wandered around quite a bit. This was + not an accident: in Haskell, white space has meaning. + + Haskell uses indentation as a cue to parse sections of code. + This use of layout to convey structure is sometimes called the + offside rule. At the top level, the first declaration or + definition can start in any column, and the Haskell compiler or + interpreter remembers that indentation level. Every subsequent + top-level declaration must have the same indentation. + + Here's an illustration of the top-level indentation rule. + Our first file, GoodIndent.hs, is well + behaved. + + &GoodIndent.hs:good; + + Our second, BadIndent.hs, doesn't play + by the rules. + + &BadIndent.hs:bad; + + Here's what happens when we try to load the two files into + &ghci;. + + &indent.ghci:load; + + An empty line is treated as a continuation of the current + item, as is a line indented to the right of the current current + item. + + The rules for let expressions and + where clauses are similar. After a + let or where keyword, the Haskell + compiler or interpreter remembers the indentation of the next + token it sees. If the next line is empty, or its indentation is + further to the right than the previous line, this counts as + continuing the previous line. On the other hand, if the + indentation is the same as the previous line, this is treated as + beginning a new item in the same block. + + Here are nested uses of let and + where. + + &letwhere.hs:let; + + In the body of bar, the variable + a is only visible within the let + expression that defines it. It's not visible to the + let expression that defines b; + only the result of the inner + let expression is visible. + + &letwhere.hs:where; + + Similarly, the scope of the first where clause + is the definition of foo, but the scope of + the second is just the first where clause. + + The indentation we use for the let and + where clauses makes our intentions easy to figure + out. + + + A note about tabs versus spaces + + If you are using a Haskell-aware text editor (e.g. Emacs), + it is probably already configured to use space characters for + all white space within a line. If your editor is + not Haskell-aware, you should configure + it to only use space characters. + + The reason for this is simple portability. In an editor + that uses a fixed-width font, tab stops are by default placed + at different intervals on Unix-like systems (every eight + characters) than on Windows (every four characters). This + means that no matter what your personal beliefs are about + where tabs belong, you can't rely on someone else's editor + honouring your preferences. Any indentation that uses tabs is + going to look broken under someone's + configuration. In fact, this could lead to compilation + problems, as the Haskell language standard requires + implementations to use the Unix tab width convention. + Using space characters avoids these problem + entirely. + + + + The offside rule is not mandatory + + We can use explicit structuring instead of layout to + indicate what we mean. To do so, we start a block of + equations with an opening curly brace; separate each item with + a semicolon; and finish the block with a closing curly brace. + The following two uses of let have the same + meanings. + + &Braces.hs:braces; + + When we use explicit structuring, the normal layout rules + don't apply, which is why we can get away with unusual + indentation in the second let expression. + + We can use explicit structuring anywhere that we'd + normally use layout. It's valid for where + clauses, and even top-level declarations. Just remember that + although the facility exists, explicit structuring is hardly + ever actually used in Haskell + programs. + + + + + Recursive types + + Here's a definition of a binary tree type. + + &Tree.hs:Tree; + + We call this a recursive type because + Tree, the type we're defining, appears both on the + left hand side and the right hand side of the definition: we + define the type in terms of itself. + + + + A little more about lists + + Now that we're getting familiar with some of the jargon + around types, we can revisit lists. Haskell's list type is a + parameterised type, because we can make lists of any other type. + It is also an algebraic data type, with two constructors. One + is the empty list, written [] (sometimes pronounced + nil, which is borrowed from Lisp). + + &list.ghci:empty; + + The other is the (:) operator, often + pronounced cons (this is short for + construct, and also borrowed from Lisp). The + (:) operator takes an element and a list, + and constructs a new list. + + &list.ghci:cons; + + We can use (:) repeatedly to add new + elements to the front of a list. + + &list.ghci:cons2; + + The right hand side of (:) must be a + list, and of the correct type. If it's not, we'll get an + error. + + &list.ghci:cons.bad; + + Because (:) constructs a list from + another list, the list type is recursive. So here we have a + built-in type that's parameterised, recursive, and + algebraic. + + One consequence of lists being generic is that lists of + lists, for example, aren't special in any way. + + &list.ghci:listlist; + + This has type [[String]], a list of lists of + strings. But since String is just a synonym for + [Char], it's really a list of + lists of lists of Char. Whew! + + We're not limited to building up lists one element at a + time. Haskell defines an inline function, + (++), that we can use to append one list + onto the end of another. + + &list.ghci:append; + + The concat function takes a list of + lists, and concatenates the whole lot into a single list. + + &list.ghci:concat; + + + + + Unit, the zero-element tuple + + Haskell has a special tuple type with no elements, written + (), and pronounced unit. + + &unit.ghci:unit; + + This type is only really used with parameterised data types, + to indicate that one of the type parameters isn't being used. + Since it doesn't encode any information, it's a rough equivalent + to void in C. + + Here's an example of () in use. We can + generalise our earlier Tree type a little, so that + internal nodes contain values of type a, while leaves contain values of type + b. + + &Tree.hs:ComplexTree; + + If we wanted to create a ComplexTree where we + wanted to store Ints on the leaves, but don't care + about the internal nodes. We would write its type as + ComplexTree () Int. + + + + Pattern matching + + Although we introduced a handful of functions earlier that + can operate on lists, we've yet to see how we might generally + get values out of a constructed algebraic data type. Haskell + has a simple pattern matching facility that we can use to this + end. + + A pattern lets us peer inside a compound value and bind + variables to the values it contains. In fact, when we define a + function, the parameters to that function are really patterns + that bind our variables to an entire value. + + Here's an example of pattern matching in action on a list; + we're going to add all elements of the list together. + + &add.hs:sumList; + + See that (x:xs) on the left of the first line? + The : means match the head of a + list; that's the familiar list constructor, + (:), in action in a new way. The variables + x and xs are given the + values of (bound to) the head and tail of the + list, respectively. The whole pattern is wrapped in parentheses + so Haskell won't parse it as three separate arguments. + + What effect does pattern matching have? Haskell will only + evaluate the right hand side of an equation if it can match all + of the patterns on the left hand side. In the definition of + sumList above, the right hand side of the + first equation won't be evaluated if the input list is empty. + Instead, Haskell will fall through to the + equation on the following line, which does + have a pattern for the empty list, and it will evaluate + that. + + It might initially look like we have two functions + named sumList here, but Haskell lets us + define a function as a series of equations; so in fact these two + clauses are defining the behaviour of one function, over + different inputs. (By the way, there's already a standard + function, sum, that does this + adding-of-a-list for us. This sumList is + purely for illustration.) + + The syntax for pattern matching on a tuple is similar to the + syntax for constructing a tuple. Here's a function that returns + the third element from a three-tuple. + + &Tuple.hs:third; + + There's no limit on how deep within a value a + pattern can look. Here's a definition that looks both inside a + tuple and inside a list within that tuple. + + &Tuple.hs:complicated; + + We can try this out interactively. + + &tuple.ghci:complicated; + + Wherever a literal value is present in a pattern + (True and 5 in the tuple + pattern above), that value must match exactly for the pattern + match to succeed. If every pattern within a series of equations + fails to match, we get a runtime error. + + &tuple.ghci:nomatch; + + We can pattern match on algebraic data types using their + constructors. Remember the Wrapper type we defined + earlier? Here's how we can extract a wrapped value from a + Wrapper. + + &Wrapper.hs:unwrap; + + Let's see it in action. + + &wrapper.ghci:unwrap; + + Notice that Haskell infers the type of the + unwrap function based on the constructor + we're using in our pattern. If we're trying to match a value + whose constructor is Wrapper, then the type + of that parameter must be Wrapper a. + + And for good measure, here's how we can define functions to + access the fields of the Coord type we defined in + . + + &SimpleTypes.hs:accessors; + + + The ordering of patterns is important + + Haskell tests patterns for matches in the order in which + we list them in our code. It goes from top to bottom and + stops at the first match; it does not + check every pattern and use the best match. + + If you're familiar with pattern matching from a logic + programming language like Prolog, Haskell's facility is + simpler and less powerful. It doesn't provide backtracking or + unification. + + + Here are some rules of thumb to help with remembering how + pattern matching works. A constructor in a pattern + checks that the matched value has the right shape. A + literal value ensures that that portion of the value has exactly + the matching contents. And a variable makes no assertions about + either the shape or contents of the matched value; it matches anything, + and gives the variable that value. + + So the pattern (3:xs) first of all is an + assertion that a matching value is a non-empty list, by matching + against the (:) constructor. It also + ensures that the head of the list is the literal value + 3. And whatever the tail of the list is, it + will be bound to the variable xs. + + + The don't-care, or wild card, pattern + + When we're writing a pattern, we can specify that we don't + care what value a particular value within a structure has, + without actually binding that value to a name. The notation + for this is _ (called a wild card or don't + care), and we use it as follows. This function + tells us whether the result of the roots + function we defined earlier is real-valued or not. + + &Roots.hs:isRealValued; + + Here, we don't care about the value of the result, just + about which constructor was used to create it. If it was + Left, the result must be a complex + number, otherwise it must be real. We can use a wild card for + the entire second pattern; there's no need to see if the + constructor is Right, because it + must be; Either only has two + constructors. + + In a pattern, a wild card acts similarly + to a variable, only it doesn't bind the value to a name. + While we can't put the same variable name multiple times in a + single pattern, we can use a wild card as many times as we + need to. + + Another advantage of wild cards is that a Haskell compiler + can warn us if we introduce a variable name in a pattern, but + don't use it in a function's body; defining something but + forgetting to use it can often indicate a bug. Using a wild + card instead of an unused variable makes it explicit that we + really don't care what value is present, and will prevent such + a warning. + + Wild cards also help readability, as they make it easier + to tell which values we're really using. + + &SimpleTypes.hs:niceAccessors; + + + + + The case expression + + We're not limited to using patterns in function + definitions. The case expression lets us match + patterns at any time. Here's what it looks like. + + &Roots.hs:hasRealRoots; + + The case keyword is followed by an arbitrary + expression; the result of this expression is what we're + pattern matching on. The of keyword signifies + the end of the expression and the beginning of the block of + patterns and expressions. + + Each item in the block consists of a pattern, followed by + an arrow ->, followed by an expression to + evaluate if that pattern matches. The result of the + case expression is the result of the expression + associated with the first pattern to match, taken from top to + bottom. + + To express here's the expression to evaluate if + none of the other patterns match, we would just use + the wild card pattern _ as the last in our list + of patterns. + + + + A flying visit back to the where clause + + Now that we've seen that we can define a function as a + series of equations, the usefulness of the where + clause should be a bit more clear. Variables that we define + inside a where clause are visible across all of + the equations that precede it in a single block. + + + + Early pattern matching pitfalls + + There are a few ways in which new Haskell programmers can + misunderstand or misuse patterns. Here are a few potential + missteps that you can easily avoid. + + There's no way to write a pattern that compares + a value with a variable. Matching a pattern only lets us + perform exact comparisons against combinations of constructors + and simple values. + + Here's a well-intentioned example of pattern matching gone + awry. This code compiles cleanly, but depending on what you + expect it to do, it might surprise you. + + &BogusPattern.hs:isHead; + + A naive glance suggests that this code is trying + to check the value of f to see if it's + actually the standard function head, but + here's what it is really doing. + + Because the first pattern in the + case expression is a variable, this branch of the + case will always match, no + matter what the value of f is. The name + head thus acts as a local variable whose + value is the value of f, which hides the + global definition of the well-known head + function. + + + Irrefutable patterns + + A pattern that consists only of a variable will always + match, because it's not being compared against any value that + could cause the match to fail. We refer to patterns that + always match as irrefutable. + + + The first pattern always matches, because it's + irrefutable. But the second pattern also + always matches, because it uses a wild card. However, because + Haskell attempts to match patterns in the order in which we + write them, the first pattern will always succeed, and the + second pattern will never actually be reached. Because the two + patterns will match the same values, they are said to + overlap. If &GHC; ever complains to you + about overlapping patterns, it's telling you that one of your + patterns is the same as another, and so it will never actually + be matched. + + Another thing to be aware of is that a variable + can only appear once in a pattern. For example, we can't put + a variable in multiple places within a pattern to express the + notion this value and that should be + identical. + + The way around these restrictions of Haskell's patterns is + to use patterns in combination with a language facility called + guards, which we'll talk about + next. + + + + + + Conditional evaluation with guards + + We can further extend our expressive arsenal using + guards. A guard is an expression of type + Bool; if it evaluates to True, + the equation that follows it is evaluated. Otherwise, the next + guard in the series is evaluated, and so on. A series of guards + is only checked if the patterns that they're associated with + match. Here's an example of guards in action. + + &Roots.hs:guardedRoots; + + Each guard is introduced by a | symbol, + followed by the guard expression, then an = symbol + (or -> if within a case + expression), then the expression to evaluate if the guard + succeeds. A guard expression can use any variables matched in + the pattern that precedes it. + + The otherwise used in the second guard + has an obvious meaning: it's the expression to evaluate if + previous guards all evaluate to False. It's + not a special piece of syntax, though; it's just a predefined variable + whose value is True. + + We can use guards anywhere that we can use + patterns. The advantage of writing a function as a series of + equations using pattern matching and guards is that it often + makes code much clearer. Remember the + myDrop function we defined in ? + + &myDrop.hs:myDrop.noid; + + Here's a reformulation of that function using + patterns and guards. Instead of reasoning about what an + if expression is doing and which branch will be + evaluated, the code uses a series of equations with simple + patterns and guards. Hoisting the control decisions to the + outside of the code, instead of burying it inside + with if expressions, lets us enumerate up front + the cases in which we expect the behaviour of the function to + differ. + + &myDrop.hs:niceDrop; + + Let's return to one of the limitations of patterns that we + mentioned in the previous section: the fact that we can't + check two variables within a pattern for equality. + We can express this quite easily by following the pattern with a + guard. + + &Guard.hs:secondEqualsThird; + + Here, for good measure, we've illustrated guard + syntax in a case expression. This guard expression + compares the variables matched in the pattern for equality. + + + + Infix functions + + Usually, when we define or call a function in Haskell, we + write the name of the function, followed by its arguments; this + is called prefix notation, because the name of the function + comes before its arguments. For a function that takes two + arguments, we have the option of using it in + infix form, between its first and second + arguments. This allows us to write expressions using functions + as if they were infix operators. + + The syntax for defining or calling a function in infix form + is to enclose the name of the function in backtick characters + (sometimes known as backquotes). Here's a simple infix + definition. + + &Plus.hs:plus; + + Defining a function in infix form doesn't change anything + about the behaviour of the function. We can call the function + using infix or prefix notation, as we prefer. + + &infix.ghci:plus; + + Infix notation is useful for more than just our own + functions. For example, Haskell's standard + Data.List module defines a function, + isPrefixOf, that indicates whether all + elements of its first argument are equal to the first elements + of its second argument. + + &infix.ghci:type; + + Let's define a few variables in &ghci;. + + &infix.ghci:vars; + + If we call isPrefixOf using prefix + notation, we can have a hard time remembering which argument + we're checking for as a prefix of the other. + + &infix.ghci:prefix; + + But if we use infix notation, the code reads + more naturally; it's now obvious that we're checking the + variable on the left to see if it's a prefix of the variable on + the right. + + &infix.ghci:infix; + + There's no hard-and-fast rule that dictates when you ought + to use infix versus prefix notation, although prefix notation is + far more common. It's best to choose whichever makes your code + more readable in a specific situation. + + + The backtick notation is not a general mechanism: it's a + piece of special syntax that applies only to names. For + example, we can't put backticks around an expression that + returns a function, and then treat that as an infix + function. + + + + + Conclusion + + In this chapter, we've seen how to define our own + algebraic data types. We've read about Haskell's offside rule + for laying out functions, and how we can avoid it if we need to. + We've seen pattern matching and guards. We've discussed error + handling. + + At this point, our basic toolbox is complete. In + , we'll use this knowledge to develop our + functional programming and thinking skills. + + + + + hunk ./en/ch04-fp.xml 277 - data type in , and know + data type in , and know hunk ./en/ch07-io.xml 94 - linkend="hs.funcstypes.unit"/> for details. + linkend="hs.deftypes.unit"/> for details. hunk ./en/ch09-find-dsl.xml 916 - linkend="hs.funcstypes.tabs"/>, never use tab characters + linkend="hs.deftypes.tabs"/>, never use tab characters hunk ./en/ch14-monads.xml 1078 - Back in , we + Back in , we hunk ./examples/ch03/func.ghci 30 -lines "the quick\nbrown fox" +lines "the quick\nbrown fox\njumps" hunk ./examples/ch03/myDrop.hs 1 +{-- snippet myDrop.type --} +myDrop :: Int -> [a] -> [a] +{-- /snippet myDrop.type --} + hunk ./examples/ch03/myDrop.hs 8 --- this is the function's type signature -myDrop :: Int -> [a] -> [a] - hunk ./examples/ch03/tuple.ghci 5 ---# type -:type (False, ['a', 'm'], ("nested", True)) +--# type1 +:type (False, 'a') + +--# type2 +:type (False, 'a', 'b') }