Chapter 3. Types and Functions

Table of Contents

Haskell's type system
Some common basic types
Useful composite data types: lists and tuples
Exercises
Calling functions
Passing an expression to a function
Understanding a function's type signature
Haskell source files, and writing simple functions
Just what is a variable, anyway?
Conditional evaluation
Reporting errors
Exercises
Defining a new data type
Algebraic data types
Analogues to algebraic data types in other languages
A few final notes
Parameterised types
Back to writing functions: local variables
How to represent a complicated result
The offside rule, and white space in a function body
A note about tabs versus spaces
The offside rule is not mandatory
Recursive types
A little more about lists
Unit, the zero-element tuple
Pattern matching
The don't-care, or wild card, pattern
The case expression
A flying visit back to the where clause
Limits to pattern matching
Conditional evaluation with guards
Infix functions
Conclusion

Topics: built-in types. Writing functions. Creating new types.

Haskell's type system

Haskell has a strong, static type system with inference.” For a short sentence, this one gives us a lot for us to pore over. Because Haskell is quite different from mainstream programming languages in how it treats types, let's not assume too much shared understanding as we talk about types.

Every value and expression in Haskell has a type. The type of a value indicates that it shares certain properties with other values that have the same type. (You'll see us refer to a value “having the type X”, or “being of type X”. The two phrases mean the same thing.) All values that have the type Integer have the ability to be added to other values of type Integer, and so on.

Haskell has a “strong” type system, in which every expression and value has exactly one type. Another aspect of Haskell's view of strong typing is that it will not automatically convert values from one type to another, a feature present in some other languages.

[Note]Note

Conversations about type systems can lead to many a misunderstanding among programmers. Some people say that C has a strong type system; others claim that Python does. Haskell programmers think strong typing is still something else.

Since there's no universally agreed upon meaning for the phrase “strong type”, it's best not to assume that someone else shares the same notion of it as you do. (We also suggest that putting forth one definition as better than another doesn't often lead to enlightened discussion.)

Having a “static” type system means that the compiler knows the type of every value and expression at compile time, before any code is ever executed. A Haskell compiler or interpreter will detect when we try to use types inconsistently in our code, and reject the code with an error message.

ghci> 'j' :: Int

<interactive>:1:0:
    Couldn't match expected type `Int' against inferred type `Char'
    In the expression: 'j'
    In the expression: 'j' :: Int
    In the definition of `it': it = 'j' :: Int
ghci> 'j' :: Int

<interactive>:1:0:
    Couldn't match expected type `Int' against inferred type `Char'
    In the expression: 'j'
    In the expression: 'j' :: Int
    In the definition of `it': it = 'j' :: Int

Haskell's combination of strong and static typing also makes it impossible for type errors to occur at runtime.

Finally, “type inference” means what it says: the compiler can automatically figure out the types of most values for us, so that we don't have to explicitly label them.

We've already briefly seen Haskell's notation for types in the section called “First steps with types”. We write expression :: MyType to say that expression has the type MyType.

Some common basic types

We've already encountered a few types in the section called “First steps with types”. Haskell has a number of built-in types that we'll use over and over. Here are some of the language's basic types, types that aren't composed of other types

  • The Char type represents a character. The values of Char are drawn from the Unicode character set, which covers most of the world's written languages.

  • The Bool type represents a value in Boolean logic. The possible values of Bool are True and False.

  • The Int type represents a signed, fixed-width integer. The exact range of values represented by Int depends on the system's longest “native” integer: on a 32-bit machine, an Int is usually 32 bits wide, while on a 64-bit machine, it is usually 64 bits wide.

  • The Integer type represents a signed integer of arbitrary size. Integers are not used as often as Ints, because they're a lot more expensive to work with.

  • The Double type is the type usually used to represent floating point numbers. It is typically 64 bits wide, and uses the machine's native floating point representation. (A narrower type, Float, also exists, but its use is discouraged; Haskell compiler writers concentrate more on making Double efficient.)

Useful composite data types: lists and tuples

The most common composite data types in Haskell are the list and tuple.

We've already seen the list type mentioned in the section called “Text, strings and lists”, where we found that Haskell represents a text string as a list of Char values, and that the type “list of Char” is written [Char].

More generally, we can write the type “list of a” for any type a by enclosing a in square brackets, [a]. Lists are strongly typed: a list of one type has an identity of its own, distinct from a list of another type. The type [Int] is a list that can only contain values of type Int, for example. We also call the list type polymorphic, because it can contain any type of value.

Lists are the “bread and butter” of Haskell collections. Whereas in an imperative language, we might repeat a task over many items by iterating through a loop, this is something that we tend to do in Haskell by recursing over a list. We'll be spending a lot more time discussing lists in Chapter 4, Functional programming.

A tuple is a fixed-size collection of values, each of which can be of any type. Unlike a list (the elements of which must all have the same type), there's no need for the elements of a tuple to have related types. We write a tuple by enclosing its elements in parentheses and separating them with commas. We use the same notation for writing its type.

ghci> (True, "hello") :: (Bool, String)
(True,"hello")
ghci> (4, ['a', 'm'], (16, True))
(4,"am",(16,True))
ghci> (True, "hello") :: (Bool, String)
(True,"hello")
ghci> (4, ['a', 'm'], (16, True))
(4,"am",(16,True))

We can construct a tuple with any number of elements (although in practice, a tuple with more than a handful becomes unwieldy). A tuple's type encodes the number and types of its elements in its own type. This means that tuples containing different numbers or types of elements have distinct types of their own.

ghci> :type (4 :: Int, ['a', 'm'], (16 :: Integer, True))
(4 :: Int, ['a', 'm'], (16 :: Integer, True)) :: (Int, [Char], (Integer, Bool))
ghci> :type (4 :: Int, ['a', 'm'], (16 :: Integer, True))
(4 :: Int, ['a', 'm'], (16 :: Integer, True)) :: (Int, [Char], (Integer, Bool))

A two-tuple of an Int and a String has a different type than a two-tuple of a Bool and a Bool, for example, and a three-tuple has a different type than a four-tuple.

Probably the most common use of tuples is to let us return multiple values from a function. We can also use them in other places where we want a fixed-size collection of values, for which the kind of container we're using isn't of much importance. An example of this might be to represent a row in the result of a a database query.

Exercises

1.

What are the types of the following expressions?

  • False

  • (["foo", "bar"], 'a')

  • [(True, []), (False, [['a']])]

Calling functions

Now that we've had our fill of data types for a while, let's turn our attention to working with some of the types we've seen. We've already seen how to perform arithmetic in the section called “Simple arithmetic”, and it looks quite as it does in other languages.

However, our discussion of lists and tuples mentioned how we can construct them, but not how we do anything with them afterwards. Haskell defines a large library of functions for working with lists and (to a lesser extent) tuples, so let's find out how to use a few of those functions.

To call a function in Haskell, we provide the name of the function followed by its arguments, all separated by white space. As an example, let's call the head function, which returns the first element of a list.

ghci> head [1,2,3]
1
ghci> head [1,2,3]
1

Its counterpart, tail, returns all but the head of a list.

ghci> tail [True, False]
[False]
ghci> tail [True, False]
[False]

A related pair of functions, take and drop, take two arguments: given a number and a list, they return either the first, or all but the first, number of elements of the list. (Here, remember that a String is a list of Char.)

ghci> take 2 "abcdefg"
"ab"
ghci> drop 3 "abcdefg"
"defg"
ghci> take 2 "abcdefg"
"ab"
ghci> drop 3 "abcdefg"
"defg"

If you're used to function call syntax in other languages, this notation can take a little getting used to, but it's undeniably simple and uniform. Here's what we mean by uniform.

ghci> snd (1,2)
2
ghci> snd (1,2)
2

In some other languages, the call to snd above might mean “call snd with two arguments, 1 and 2”, but in Haskell, it's a single-argument call, passing the tuple (1,2).

By the way, snd has a companion function, fst, which returns the first element of a tuple.

ghci> fst ("you", True)
"you"
ghci> fst ("you", True)
"you"

Passing an expression to a function

Haskell evaluates an expression from left to right. If we want to use the result of one expression as an argument to another, we have to keep this in mind and use parentheses to tell the parser what we really mean. Here's an example.

ghci> head (drop 4 "azerty")
't'
ghci> head (drop 4 "azerty")
't'

We can read this as “pass the result of the expression drop 4 "azerty" as the argument to head”. If we were to leave out the parentheses, Haskell would instead interpret the expression as “pass drop as the argument to head”. Compilation would fail with a type error, as drop is a function, not a list.

Understanding a function's type signature

A consequence of Haskell's strong typing is that fst and snd only accept two-tuples as arguments. Let's see what ghci tells us about their types.

ghci> :type fst
fst :: (a, b) -> a
ghci> :type snd
snd :: (a, b) -> b
ghci> :type fst
fst :: (a, b) -> a
ghci> :type snd
snd :: (a, b) -> b

We can read the -> above as “returns”. The entire signature thus tells us that fst takes any two-tuple whose elements are of types a and b (each of which can be of any type), and returns a value with type a. Sure enough, if we try calling fst with a three-tuple, things don't go so well.

ghci> fst (1,True,"foo")

<interactive>:1:4:
    Couldn't match expected type `(a, b)'
	   against inferred type `(a1, b1, c)'
    In the first argument of `fst', namely `(1, True, "foo")'
    In the expression: fst (1, True, "foo")
    In the definition of `it': it = fst (1, True, "foo")
ghci> fst (1,True,"foo")

<interactive>:1:4:
    Couldn't match expected type `(a, b)'
	   against inferred type `(a1, b1, c)'
    In the first argument of `fst', namely `(1, True, "foo")'
    In the expression: fst (1, True, "foo")
    In the definition of `it': it = fst (1, True, "foo")

Notice that the type signatures above give us a strong hint as to what these functions might actually do. This is an incredibly valuable property of types in a functional language. Since there aren't usually any side effects for us to worry about, figuring what a function does can often be a matter of reading its name and understanding its type signature, with no “regular documentation” required.

Just as we called the list type polymorphic because it can contain values of any type, when a function has type variables in its signature, indicating that some of its arguments can be of any type, we call the function polymorphic, too.

So far, we haven't seen a signature for a function that takes more than one argument. We've already encountered a few such functions; let's look at take.

ghci> :type take
take :: Int -> [a] -> [a]
ghci> :type take
take :: Int -> [a] -> [a]

It's pretty clear that there's something going on with an Int and some lists, but why are there two -> symbols in the signature? Haskell parses this chain of arrows from right to left. If we introduce parentheses, it makes it clearer how Haskell is interpreting this type signature.

take :: Int -> ([a] -> [a])take :: Int -> ([a] -> [a])

From this, it looks like we ought to read the type signature as a function that takes one argument, an Int, and returns another function. That other function also takes one argument, a list, and returns a list of the same type as its argument.

This is an intriguing idea, but it's not easy to see just yet what its consequences might be. We'll return to this topic in the section called “Partial function application and currying”, once we've spent a bit of time writing functions.

Haskell source files, and writing simple functions

Now that we know how to call functions, it's time we turned our attention to writing them. While we can write functions in ghci, it's not a good environment for doing so, because it limits any expression or definition to one line in length. So instead, we'll finally break down and create a source file.

Haskell source files are usually identified with a suffix of .hs. Here's a simple function definition: open up a file named add.hs, and add these contents to it.

add a b = a + badd a b = a + b

On the left hand side of the = is the name of the function, followed by the arguments to the function. On the right hand side is the body of the function. With our source file saved, we can load it into ghci, and use our new add function straight away.

ghci> :load add.hs
[1 of 1] Compiling Main             ( add.hs, interpreted )
Ok, modules loaded: Main.
ghci> add 1 2
3
ghci> :load add.hs
[1 of 1] Compiling Main             ( add.hs, interpreted )
Ok, modules loaded: Main.
ghci> add 1 2
3

When we call add, the variables a and b on the left hand side of our definition are given (or “bound to”) the values 1 and 2, then the right hand side is evaluated, and the result returned.

Haskell doesn't need a return statement; the result of a function is the result of evaluating whatever expression is in the function's body.

Just what is a variable, anyway?

If you're used to imperative programming languages, you're likely to think of a variable as a way of identifying a “box”. You can get the value that's currently in a box, and you can change the value in the box using assignment.

In Haskell, a variable is the name of a value. This is critically different from the name of a box: we're used to being able to change what's inside a box from our dealings with other languages, but we can't change the “twoness” of the value 2.

Creating a variable in Haskell just gives a convenient name to a value; we can't change the value associated with a variable. This bears a strong similarity to the way we think about variables in mathematics.

Conditional evaluation

Like other languages, Haskell has an if expression. Let's see it in action, then we'll explain what's going on. As an example, we'll write our own version of the standard take function. Before we begin, let's probe a little bit of how take behaves, so we can replicate its behaviour.

ghci> take 0 "foo"
""
ghci> take 1 "foo"
"f"
ghci> take 4 [1,2]
[1,2]
ghci> take 7 []
[]
ghci> take (-2) "foo"
""
ghci> take 0 "foo"
""
ghci> take 1 "foo"
"f"
ghci> take 4 [1,2]
[1,2]
ghci> take 7 []
[]
ghci> take (-2) "foo"
""

From the above, it seems that take returns an empty list if the number to remove is greater than the number of elements, and that it treats negative numbers as zero. Here's a myTake function that has the same behaviour, and uses Haskell's if expression to decide what to do.

myTake :: Int -> [a] -> [a]

myTake n xs = if n <= 0 || null xs
              then xs
              else myTake (n - 1) (tail xs)myTake :: Int -> [a] -> [a]

myTake n xs = if n <= 0 || null xs
              then xs
              else myTake (n - 1) (tail xs)

Let's save it in a file named myTake.hs, then load it into ghci.

ghci> :load myTake.hs
[1 of 1] Compiling Main             ( myTake.hs, interpreted )
Ok, modules loaded: Main.
ghci> myTake 0 "foo"
"foo"
ghci> myTake 1 "foo"
"oo"
ghci> myTake 4 [1,2]
[]
ghci> myTake 7 []
[]
ghci> myTake (-2) "foo"
"foo"
ghci> :load myTake.hs
[1 of 1] Compiling Main             ( myTake.hs, interpreted )
Ok, modules loaded: Main.
ghci> myTake 0 "foo"
"foo"
ghci> myTake 1 "foo"
"oo"
ghci> myTake 4 [1,2]
[]
ghci> myTake 7 []
[]
ghci> myTake (-2) "foo"
"foo"

Now that we've seen myTake in action, let's return to the source code and look at a few of the novelties we've introduced.

First is the if keyword itself. It takes an expression of type Bool. If that evaluates to True, it evaluates the expression on the then branch. Otherwise, it evaluates the expression on the else branch.

The combination of if, then, else and the expression after each keyword combine to make up a single expression, as far as Haskell is concerned. Whichever branch is evaluated is the result of the if. Because an expression can only have one type, the expressions in the then and else branches must have the same type. If they don't, an if expression won't typecheck.

Whereas it can make sense in an imperative language to omit the else branch from an if, this would be nonsensical in Haskell. An if expression that was missing an else wouldn't have a value if the condition after the if evaluated to False, so it couldn't typecheck.

The second novelty is almost trivial: the null function, which we use in the Boolean portion of the if, indicates whether a list is empty.

ghci> :type null
null :: [a] -> Bool
ghci> :type null
null :: [a] -> Bool

Third is that our function calls itself recursively. This is an early example of how, in Haskell, we use recursion where in an imperative language we'd probably use a loop.

Finally, our if expression spans several lines. We line the then and else branches up under the if for neatness, but this is not mandatory. We could put all of them on a single line, for example, but we'd end up with less readable code.

myTake2 n xs = if n <= 0 || null xs then xs else myTake (n - 1) (tail xs)myTake2 n xs = if n <= 0 || null xs then xs else myTake (n - 1) (tail xs)

Reporting errors

Haskell provides a standard function, error :: Char -> a, that we can call when something has gone terribly wrong in our code. We give it a string parameter, which is the error message to display. Its type signature looks peculiar; how can it produce a value of any type a given just a string? The answer is that it doesn't, because error is special.

It has that type so that we can call it anywhere and it will always have the right type. However, instead of returning a value like a normal function, it immediately terminates evaluation, and prints the error message we give it. Here's an example. The mySecond function returns the second element of its input list, but fails if its input list isn't long enough.

mySecond :: [a] -> a

mySecond xs = if null (tail xs)
              then error "list too short"
              else head (tail xs)mySecond :: [a] -> a

mySecond xs = if null (tail xs)
              then error "list too short"
              else head (tail xs)

As usual, we can see how this works in practice in ghci.

ghci> mySecond "xi"
'i'
ghci> mySecond [2]
*** Exception: list too short
ghci> head (mySecond [[9]])
*** Exception: list too short
ghci> mySecond "xi"
'i'
ghci> mySecond [2]
*** Exception: list too short
ghci> head (mySecond [[9]])
*** Exception: list too short

Notice that in the third case above, where we're trying to use the result of the call to mySecond as the argument to another function, evaluation still terminates and drops us back to the ghci prompt. This is both the strength and weakness calling error: it doesn't let us distinguish between recoverable errors and problems so severe that they really should terminate a program. We'll see one way around this problem in the section called “How to represent a complicated result”.

Exercises

1.

Haskell provides a standard function, last :: [a] -> a, that returns the last element of a list. Write a function lastButOne, that returns the element before the last.

2.

Load your lastButOne function into ghci, and try it out on lists of different lengths. What happens when you pass it a list that's too short?

Defining a new data type

Although lists and tuples are useful, we'll still often want to construct new data types of our own. We define a new data type using the data keyword.

data MyType = MyConstructor Int String
            deriving (Show)data MyType = MyConstructor Int String
            deriving (Show)

The MyType after the data keyword is the name of our new type. (As we've already mentioned, a type name must start with a capital letter.) The string MyConstructor is the name of the constructor we'll call to create a value of this type. (As with a type name, a constructor name must start with a capital letter.) Finally, the Int and String are the components of the type. A component serves the same purpose in Haskell as a field in a structure or class would in another language.

[Note]Note

We'll explain the full meaning of deriving (Show) later, in the section called “Show”. For now, it's enough to know that we need to tack this onto a type declaration so that ghci will automatically know how to print a value of this type.

We can create a new value of type MyType by treating MyConstructor as a function, and calling it with arguments of types Int and String.

myValue = MyConstructor 31337 "Creating a value!"myValue = MyConstructor 31337 "Creating a value!"

Once we've defined a type, we can experiment with it in ghci, starting by using the :load command to load our source file.

ghci> :load MyType.hs
[1 of 1] Compiling Main             ( MyType.hs, interpreted )
Ok, modules loaded: Main.
ghci> :load MyType.hs
[1 of 1] Compiling Main             ( MyType.hs, interpreted )
Ok, modules loaded: Main.

Remember the myValue variable we defined? Here it is.

ghci> myValue
MyConstructor 31337 "Creating a value!"
ghci> :type myValue
myValue :: MyType
ghci> myValue
MyConstructor 31337 "Creating a value!"
ghci> :type myValue
myValue :: MyType

We can construct new values interactively in ghci, too.

ghci> MyConstructor 1 "foo"
MyConstructor 1 "foo"
ghci> MyConstructor 1 "foo"
MyConstructor 1 "foo"

To find out more about a type, we can use some of ghci's browsing capabilities. The :info command gets ghci to tell us everything it knows about a type.

ghci> :info MyType
data MyType = MyConstructor Int String 	-- Defined at MyType.hs:2:5
instance Show MyType -- Defined at MyType.hs:2:5
ghci> :info MyType
data MyType = MyConstructor Int String 	-- Defined at MyType.hs:2:5
instance Show MyType -- Defined at MyType.hs:2:5

We can also find out why we use MyConstructor to construct a new value of type MyType.

ghci> :type MyConstructor
MyConstructor :: Int -> String -> MyType
ghci> :type MyConstructor
MyConstructor :: Int -> String -> MyType

From Haskell's perspective, then, a constructor is just another function, one that happens to return a value of the type we want to construct.

Algebraic data types

The Bool type that we introduced earlier is the simplest example of a sort of type called an algebraic data type. An algebraic data type has a fixed set of possible values, each of which is identified by a distinct constructor.

In the case of Bool, the type has two constructors, True and False. Each constructor is separated by a | character, which we can read as “or”. These are usually referred to as alternatives or cases.

data Bool = False | Truedata Bool = False | True

Each constructor can take zero or more arguments; the numbers and types of the arguments accepted by each constructor are independent. For example, here's one way we might represent versions of the Windows operating system, where old releases were monolithic, and newer releases have “service pack levels” denoting major updates after their initial releases.

data WindowsVersion = Win95
                    | Win98
                    | WinME
                    | WinNT Int
                    | WinXP Int
                    | WinVista Int
                      deriving (Show)data WindowsVersion = Win95
                    | Win98
                    | WinME
                    | WinNT Int
                    | WinXP Int
                    | WinVista Int
                      deriving (Show)

The alternatives that represent older releases don't need arguments, but those for the newer releases need an Int to represent the patch level.

Analogues to algebraic data types in other languages

Algebraic data types provide a single structuring mechanism in instances where other languages have several different building blocks. Here are some analogues from C and C++, which might make it clearer what we can do with algebraic data types.

With just one constructor, an algebraic data type groups related values into one, and gives that value an identity distinct from other types. It corresponds to a struct in C or C++, and its components to the fields of a struct.

If an algebraic data type has multiple alternatives, we can think of it as similar to a union in C or C++. A big difference between the two is that a union doesn't tell us which alternative is actually present; we have to record which alternative we're using ourselves, usually in another field of a struct. This means that unions can sometimes be sources of bugs, where our notion of which alternative we should be using is incorrect.

With an algebraic data type, Haskell stores the constructor that we use in the value that we create, so we don't need to manually sock it away somewhere else.

Algebraic data types also serve where we'd use an enum in C or C++, to represent a range of discrete symbolic values.

data Roygbiv = Red
             | Orange
             | Green
             | Blue
             | Indigo
             | Violet
               deriving (Show)

-- Equivalent in C or C++:
--
--   enum roygbiv { red, orange, green, blue, indigo, violet };
data Roygbiv = Red
             | Orange
             | Green
             | Blue
             | Indigo
             | Violet
               deriving (Show)

-- Equivalent in C or C++:
--
--   enum roygbiv { red, orange, green, blue, indigo, violet };

A few final notes

From reading the preceding sections, it should be clear that all of the data types that we define with the data keyword are algebraic data types. Some may have just one alternative; others have several; but they're all using the same organising machinery.

Another useful thing to know is that it's perfectly okay for the name of an algebraic type to have the same name as one of its constructors to be the same. It's always obvious from context whether we're using a name to refer to a type or a constructor, so this doesn't introduce any ambiguity.

Giving a type and its constructor the same name is in fact something that we do frequently when the type has just one constructor.

data PerfectlyNormal = PerfectlyNormal Intdata PerfectlyNormal = PerfectlyNormal Int

When a type has multiple constructors, it's still legal to give one of the constructors the same name as the type, but this is much less common.

data LegalButWeird = SomethingFishy
                   | LegalButWeird Stringdata LegalButWeird = SomethingFishy
                   | LegalButWeird String

Parameterised types

In our discussion of lists, we mentioned that we can create a list of values of any type. We can define our own types that allow this, too. To do this, we introduce variables into a type declaration.

data Wrapper a = Wrapper a
                 deriving (Show)data Wrapper a = Wrapper a
                 deriving (Show)

Here, the variable a is not a regular variable; it's called a type variable, because it indicates that our Wrapper type takes another type as its parameter (hence calling it a parameterised type). What this lets us do is use Wrapper on values of any type.

wrappedInt :: Wrapper Int

wrappedInt = Wrapper 42

wrappedString = Wrapper "foo"wrappedInt :: Wrapper Int

wrappedInt = Wrapper 42

wrappedString = Wrapper "foo"

As usual, we can load our source file into ghci and experiment with it.

ghci> :load Wrapper.hs
[1 of 1] Compiling Main             ( Wrapper.hs, interpreted )
Ok, modules loaded: Main.
ghci> :type wrappedString
wrappedString :: Wrapper [Char]
ghci> Wrapper [1,2,3]
Wrapper [1,2,3]
ghci> :type Wrapper
Wrapper :: a -> Wrapper a
ghci> :load Wrapper.hs
[1 of 1] Compiling Main             ( Wrapper.hs, interpreted )
Ok, modules loaded: Main.
ghci> :type wrappedString
wrappedString :: Wrapper [Char]
ghci> Wrapper [1,2,3]
Wrapper [1,2,3]
ghci> :type Wrapper
Wrapper :: a -> Wrapper a

Wrapper is a “generic” container type (albeit a fairly useless one); we can construct a Wrapper from a value of any type. It is also strongly typed; the type of whatever it contains is encoded in its own type.

To once again extend an analogy to more familiar languages, this gives us a facility that bears some resemblance to templates in C++, and to generics in Java. (In fact, Java's generics facility was inspired by several aspects of Haskell's type system.)

We can nest uses of parameterised types inside each other, but when we do, we may need to use parentheses to tell the Haskell compiler what we mean.

multiplyWrapped :: Wrapper (Wrapper Int)

multiplyWrapped = Wrapper (Wrapper 7)multiplyWrapped :: Wrapper (Wrapper Int)

multiplyWrapped = Wrapper (Wrapper 7)

Back to writing functions: local variables

Let's take a break from writing about types for a few moments. Within the body of a function, we can introduce new local variables whenever we need them, using a let expression. As an example, let's write a function that calculates the real-valued roots of the quadratic equation a * (x ** 2) + b * x + c == 0.

realRoots :: Double -> Double -> Double -> Maybe (Double, Double)

realRoots a b c = let n  = b**2 - 4 * a * c
                      a2 = 2 * a
                      r1 = (-b + sqrt n) / a2
                      r2 = (-b - sqrt n) / a2
                  in if n >= 0 && a /= 0
                     then Just (r1, r2)
                     else NothingrealRoots :: Double -> Double -> Double -> Maybe (Double, Double)

realRoots a b c = let n  = b**2 - 4 * a * c
                      a2 = 2 * a
                      r1 = (-b + sqrt n) / a2
                      r2 = (-b - sqrt n) / a2
                  in if n >= 0 && a /= 0
                     then Just (r1, r2)
                     else Nothing

The keywords to look out for here are let, which starts a block of variable declarations, and in, which ends it. Each line introduces a new variable. The name is on the left of the =, and its value on the right. We can use these variables both within our block of variable declarations and in the expression that follows the in keyword.

There's no problem with a variable earlier in a let block referring to a later one, or even with them referring to each other. (In some functional languages, this sort of flexible let is named letrec.)

We can have multiple let blocks within an expression. There's also another mechanism we can use to introduce local variables, called a where block. The definitions in a where block apply to the code that precedes it. Let's illustrate what we mean with another example.

import Data.Complex

roots :: Double -> Double -> Double
      -> Either (Complex Double, Complex Double) (Double, Double)

roots a b c = if n >= 0
              then Right ((-b + sqrt n) / a2, (-b - sqrt n) / a2)
              else Left ((-b' + sqrt n') / a2', (-b' - sqrt n') / a2')
    where n   = b**2 - 4 * a * c
          a2  = 2 * a
          n'  = n :+ 0
          b'  = b :+ 0
          a2' = a2 :+ 0import Data.Complex

roots :: Double -> Double -> Double
      -> Either (Complex Double, Complex Double) (Double, Double)

roots a b c = if n >= 0
              then Right ((-b + sqrt n) / a2, (-b - sqrt n) / a2)
              else Left ((-b' + sqrt n') / a2', (-b' - sqrt n') / a2')
    where n   = b**2 - 4 * a * c
          a2  = 2 * a
          n'  = n :+ 0
          b'  = b :+ 0
          a2' = a2 :+ 0

Here, the roots function returns the real roots when they're defined, and the complex roots otherwise. (We left out the divide-by-zero case for simplicity.) While a “where” clause initially looks very weird to non-Haskell programmers, it's a great way to put the “important” code early, followed by the auxiliary definitions that support it. After a while, you'll find yourself missing where clauses in languages that lack them!

The main difference between let and where is one of scope. The scope of a let only extends to the expression after the in keyword, while the variables introduced by a where clause are visible upwards to the beginning of the block that it “belongs” to. Also, let is always paired with an expression, but where is paired with a block of equations.

We'll be talking more about how to write let expressions and where clauses in the section called “The offside rule, and white space in a function body”.

[Note]Note

It's probably obvious from context above, but (:+) is the constructor for a complex number, taking the real part on the left and the imaginary part on the right.

Also, Complex is parameterised over the type of complex number it should represent. In practice, it only makes much sense to use Complex Double, since GHC implements Double more efficiently than Float.

How to represent a complicated result

We sneaked two previously unseen standard types, Maybe and Either, into our root-finding examples. We use Maybe when it might not make sense to return a normal result, for example because a function's result is undefined for some inputs. We use Just to say “we have a result”, and the argument to Just is that result. When we can't give a result, we use Nothing, which takes no arguments.

ghci> Just "answer"
Just "answer"
ghci> Nothing
Nothing
ghci> Just "answer"
Just "answer"
ghci> Nothing
Nothing

Why do we need Maybe here? The real-valued roots of a quadratic equation are infinity when a, the coefficient of x ** 2, is zero.

ghci> :load Roots.hs
[1 of 1] Compiling Main             ( Roots.hs, interpreted )

Roots.hs:56:30: Not in scope: `c'
Failed, modules loaded: none.
ghci> realRoots 0 1 2

<interactive>:1:0: Not in scope: `realRoots'
ghci> :load Roots.hs
[1 of 1] Compiling Main             ( Roots.hs, interpreted )

Roots.hs:56:30: Not in scope: `c'
Failed, modules loaded: none.
ghci> realRoots 0 1 2

<interactive>:1:0: Not in scope: `realRoots'

They're also not defined when b ** 2 - 4 * a * c is negative, because we would need to use complex numbers to represent a negative square root.

ghci> realRoots 1 3 4

<interactive>:1:0: Not in scope: `realRoots'
ghci> realRoots 1 3 4

<interactive>:1:0: Not in scope: `realRoots'

Otherwise, we can return a normal result, wrapped in Just.

ghci> realRoots 1 3 2

<interactive>:1:0: Not in scope: `realRoots'
ghci> realRoots 1 3 2

<interactive>:1:0: Not in scope: `realRoots'

Compared to error, which we saw in the section called “Reporting errors”, Maybe has the huge advantage that it's a normal value, which we return to our caller to deal with. Calling error, by contrast, is more akin to pulling on the program's ejector seat handle as it disintegrates around us; something terrible has happened, and we need to give up right now.

The Either a b type gives us even more flexibility than Maybe, as it's got two type parameters. We can wrap a value of any type a with Left, or a value of an unrelated type b with Right. Our roots function uses this to return a Double when the real roots are defined, and a Complex Double when they're not.

The offside rule, and white space in a function body

In our definition of realRoots, the left margin of our text wandered around quite a bit. This was not an accident: in Haskell, white space has meaning.

Haskell uses indentation as a cue to parse sections of code. This use of layout to convey structure is sometimes called the offside rule. At the top level, the first declaration or definition can start in any column, and the Haskell compiler or interpreter remembers that indentation level. Every subsequent top-level declaration must have the same indentation.

Here's an illustration of the top-level indentation rule. Our first file, GoodIndent.hs, is well behaved.

-- This is the leftmost column.

  -- It's fine for top-level declarations to start in any column...
  firstGoodIndentation = 1

  -- ...provided all subsequent declarations do, too!
  secondGoodIndentation = 2-- This is the leftmost column.

  -- It's fine for top-level declarations to start in any column...
  firstGoodIndentation = 1

  -- ...provided all subsequent declarations do, too!
  secondGoodIndentation = 2

Our second, BadIndent.hs, doesn't play by the rules.

-- This is the leftmost column.

    -- Our first declaration is in column 4.
    firstBadIndentation = 1

  -- Our second is left of the first, which is illegal!
  secondBadIndentation = 2-- This is the leftmost column.

    -- Our first declaration is in column 4.
    firstBadIndentation = 1

  -- Our second is left of the first, which is illegal!
  secondBadIndentation = 2

Here's what happens when we try to load the two files into ghci.

ghci> :load GoodIndent.hs
[1 of 1] Compiling Main             ( GoodIndent.hs, interpreted )
Ok, modules loaded: Main.
ghci> :load BadIndent.hs
[1 of 1] Compiling Main             ( BadIndent.hs, interpreted )

BadIndent.hs:8:2: parse error on input `secondBadIndentation'
Failed, modules loaded: none.
ghci> :load GoodIndent.hs
[1 of 1] Compiling Main             ( GoodIndent.hs, interpreted )
Ok, modules loaded: Main.
ghci> :load BadIndent.hs
[1 of 1] Compiling Main             ( BadIndent.hs, interpreted )

BadIndent.hs:8:2: parse error on input `secondBadIndentation'
Failed, modules loaded: none.

An empty line is treated as a continuation of the current item, as is a line indented to the right of the current current item.

The rules for let expressions and where clauses are similar. After a let or where keyword, the Haskell compiler or interpreter remembers the indentation of the next token it sees. If the next line is empty, or its indentation is further to the right than the previous line, this counts as continuing the previous line. On the other hand, if the indentation is the same as the previous line, this is treated as beginning a new item in the same block.

Here are nested uses of let and where.

bar = let b = 2
      in let a = b
         in abar = let b = 2
      in let a = b
         in a

In the body of bar, the variable a is only visible within the let expression that defines it. It's not visible to the let expression that defines b; only the result of the inner let expression is visible.

foo = a
    where a = b
              where b = 2foo = a
    where a = b
              where b = 2

Similarly, the scope of the first where clause is the definition of foo, but the scope of the second is just the first where clause.

The indentation we use for the let and where clauses makes our intentions easy to figure out.

A note about tabs versus spaces

If you are using a Haskell-aware text editor (e.g. Emacs), it is probably already configured to use space characters for all white space within a line. If your editor is not Haskell-aware, you should configure it to only use space characters.

The reason for this is simple portability. In an editor that uses a fixed-width font, tab stops are by default placed at different intervals on Unix-like systems (every eight characters) than on Windows (every four characters). This means that no matter what your personal beliefs are about where tabs belong, you can't rely on someone else's editor honouring your preferences. Any indentation that uses tabs is going to look broken under someone's configuration; this could lead to compilation problems, too. Using space characters instead avoids this problem entirely.

The offside rule is not mandatory

We can use explicit structuring instead of layout to indicate what we mean. To do so, we start a block of equations with an opening curly brace; separate each item with a semicolon; and finish the block with a closing curly brace. The following two uses of let have the same meanings.

bar = let a = 1
          b = 2
          c = 3
      in a + b + c

foo = let { a = 1;  b = 2;
        c = 3 }
      in a + b + cbar = let a = 1
          b = 2
          c = 3
      in a + b + c

foo = let { a = 1;  b = 2;
        c = 3 }
      in a + b + c

When we use explicit structuring, the normal layout rules don't apply, which is why we can get away with unusual indentation in the second let expression.

We can use explicit structuring anywhere that we'd normally use layout. It's valid for where clauses, and even top-level declarations. Just remember that although the facility exists, explicit structuring is hardly ever actually used in Haskell programs.

Recursive types

Here's a definition of a binary tree type.

data Tree a = Node (Tree a) (Tree a)
            | Leaf a
              deriving (Show)data Tree a = Node (Tree a) (Tree a)
            | Leaf a
              deriving (Show)

We call this a recursive type because Tree, the type we're defining, appears both on the left hand side and the right hand side of the definition: we define the type in terms of itself.

A little more about lists

Now that we're getting familiar with some of the jargon around types, we can revisit lists. Haskell's list type is a parameterised type, because we can make lists of any other type. It is also an algebraic data type, with two constructors. One is the empty list, written [] (sometimes pronounced “nil”, which is borrowed from Lisp).

ghci> []
[]
ghci> []
[]

The other is the (:) operator, often pronounced “cons” (this is short for “construct”, and also borrowed from Lisp). The (:) operator takes an element and a list, and constructs a new list.

ghci> 1 : []
[1]
ghci> 1 : [2]
[1,2]
ghci> 1 : []
[1]
ghci> 1 : [2]
[1,2]

We can use (:) repeatedly to add new elements to the front of a list.

ghci> "alpha" : "beta" : ["gamma", "delta"]
["alpha","beta","gamma","delta"]
ghci> "alpha" : "beta" : ["gamma", "delta"]
["alpha","beta","gamma","delta"]

The right hand side of (:) must be a list, and of the correct type. If it's not, we'll get an error.

ghci> True : False

<interactive>:1:7:
    Couldn't match expected type `[Bool]' against inferred type `Bool'
    In the second argument of `(:)', namely `False'
    In the expression: True : False
    In the definition of `it': it = True : False
ghci> True : ["wrong"]

<interactive>:1:8:
    Couldn't match expected type `Bool' against inferred type `[Char]'
      Expected type: Bool
      Inferred type: [Char]
    In the expression: "wrong"
    In the second argument of `(:)', namely `["wrong"]'
ghci> True : False

<interactive>:1:7:
    Couldn't match expected type `[Bool]' against inferred type `Bool'
    In the second argument of `(:)', namely `False'
    In the expression: True : False
    In the definition of `it': it = True : False
ghci> True : ["wrong"]

<interactive>:1:8:
    Couldn't match expected type `Bool' against inferred type `[Char]'
      Expected type: Bool
      Inferred type: [Char]
    In the expression: "wrong"
    In the second argument of `(:)', namely `["wrong"]'

Because (:) constructs a list from another list, the list type is recursive. So here we have a built-in type that's parameterised, recursive, and algebraic.

One consequence of lists being generic is that lists of lists, for example, aren't special in any way.

ghci> [["foo", "bar"], ["x", "y"]]
[["foo","bar"],["x","y"]]
ghci> [["foo", "bar"], ["x", "y"]]
[["foo","bar"],["x","y"]]

This has type [[String]], a list of lists of strings. But since String is just a synonym for [Char], it's really a list of lists of lists of Char. Whew!

We're not limited to building up lists one element at a time. Haskell defines an inline function, (++), that we can use to append one list onto the end of another.

ghci> :type (++)
(++) :: [a] -> [a] -> [a]
ghci> [2,3] ++ [4,5]
[2,3,4,5]
ghci> "joining " ++ " lists " ++ " together"
"joining  lists  together"
ghci> :type (++)
(++) :: [a] -> [a] -> [a]
ghci> [2,3] ++ [4,5]
[2,3,4,5]
ghci> "joining " ++ " lists " ++ " together"
"joining  lists  together"

The concat function takes a list of lists, and concatenates the whole lot into a single list.

ghci> :type concat
concat :: [[a]] -> [a]
ghci> concat [[1,1,2], [3,5,8], [11]]
[1,1,2,3,5,8,11]
ghci> :type concat
concat :: [[a]] -> [a]
ghci> concat [[1,1,2], [3,5,8], [11]]
[1,1,2,3,5,8,11]

Unit, the zero-element tuple

Haskell has a special tuple type with no elements, written (), and pronounced “unit”.

ghci> ()
()
ghci> :type ()
() :: ()
ghci> ()
()
ghci> :type ()
() :: ()

This type is only really used with parameterised data types, to indicate that one of the type parameters isn't being used. Since it doesn't encode any information, it's a rough equivalent to void in C.

Here's an example of () in use. We can generalise our earlier Tree type a little, so that internal nodes contain values of type a, while leaves contain values of type b.

data ComplexTree a b = ComplexNode a (ComplexTree a b) (ComplexTree a b)
                     | ComplexLeaf b
                       deriving (Show)data ComplexTree a b = ComplexNode a (ComplexTree a b) (ComplexTree a b)
                     | ComplexLeaf b
                       deriving (Show)

If we wanted to create a ComplexTree where we wanted to store Ints on the leaves, but don't care about the internal nodes. We would write its type as ComplexTree () Int.

Pattern matching

Although we introduced a handful of functions earlier that can operate on lists, we've yet to see how we might generally get values out of a constructed algebraic data type. Haskell has a simple pattern matching facility that we can use to this end.

A pattern lets us peer inside a compound value and bind variables to the values it contains. In fact, when we define a function, the parameters to that function are really patterns that bind our variables to an entire value.

Here's an example of pattern matching in action on a list; we're going to add all elements of the list together.

sumList (x:xs) = x + sumList xs
sumList []     = 0sumList (x:xs) = x + sumList xs
sumList []     = 0

See that (x:xs) on the left of the first line? The : means “match the head of a list”; that's the familiar list constructor, (:), in action in a new way. The variables x and xs are given the values of (“bound to”) the head and tail of the list, respectively. The whole pattern is wrapped in parentheses so Haskell won't parse it as three separate arguments.

What effect does pattern matching have? Haskell will only evaluate the right hand side of an equation if it can match all of the patterns on the left hand side. In the definition of sumList above, the right hand side of the first equation won't be evaluated if the input list is empty. Instead, Haskell will “fall through” to the equation on the following line, which does have a pattern for the empty list, and it will evaluate that.

It might initially look like we have two functions named sumList here, but Haskell lets us define a function as a series of equations; so in fact these two clauses are defining the behaviour of one function, for different inputs. (By the way, there's a standard function, sum, that does this for us.)

The syntax for pattern matching on a tuple is similar to the syntax for constructing a tuple. Here's a function that returns the third element from a three-tuple.

third (a, b, c) = cthird (a, b, c) = c

There's no limit on how “deep” within a value a pattern can look. Here's a definition that looks both inside a tuple and inside a list within that tuple.

complicated (True, a, x:xs, 5) = (a, xs)complicated (True, a, x:xs, 5) = (a, xs)

We can try this out interactively.

ghci> :load Tuple.hs
[1 of 1] Compiling Main             ( Tuple.hs, interpreted )
Ok, modules loaded: Main.
ghci> complicated (True, 1, [1,2,3], 5)
(1,[2,3])
ghci> :load Tuple.hs
[1 of 1] Compiling Main             ( Tuple.hs, interpreted )
Ok, modules loaded: Main.
ghci> complicated (True, 1, [1,2,3], 5)
(1,[2,3])

Wherever a literal value is present in a pattern, that value must match exactly for the pattern match to succeed. If every pattern within a series of equations fails to match, we get a runtime error.

ghci> complicated (False, 1, [1,2,3], 5)
*** Exception: Tuple.hs:6:0-39: Non-exhaustive patterns in function complicated

ghci> complicated (False, 1, [1,2,3], 5)
*** Exception: Tuple.hs:6:0-39: Non-exhaustive patterns in function complicated

We can pattern match on algebraic data types using their constructors. Remember the Wrapper type we defined earlier? Here's how we can extract a wrapped value from a Wrapper.

unwrap (Wrapper x) = xunwrap (Wrapper x) = x

Let's see it in action.

ghci> unwrap (Wrapper "foo")
"foo"
ghci> :type unwrap
unwrap :: Wrapper t -> t
ghci> unwrap (Wrapper "foo")
"foo"
ghci> :type unwrap
unwrap :: Wrapper t -> t

Notice that Haskell infers the type of the unwrap function based on the constructor we're using in our pattern. If we're trying to match a value whose constructor is Wrapper, then the type of that parameter must be Wrapper a.

[Note]Note

Haskell tests patterns for matches in the order in which we list them in our code. It goes from top to bottom and stops at the first match; it does not check every pattern and use the best match.

If you're familiar with pattern matching from a logic programming language like Prolog, Haskell's facility is simpler and less powerful. It doesn't provide backtracking or unification.

The don't-care, or wild card, pattern

When we're writing a pattern, we can specify that we don't care what value a particular value within a structure has, without actually binding that value to a name. The notation for this is _ (called a wild card or “don't care”), and we use it as follows. This function tells us whether the result of the roots function we defined earlier is real-valued or not.

isRealValued :: Either (Complex Double, Complex Double) (Double, Double)
             -> Bool
isRealValued (Left _) = False
isRealValued _        = TrueisRealValued :: Either (Complex Double, Complex Double) (Double, Double)
             -> Bool
isRealValued (Left _) = False
isRealValued _        = True

Here, we don't care about the value of the result, just about which constructor was used to create it. If it was Left, the result must be a complex number, otherwise it must be real. We can use a wild card for the entire second pattern; there's no need to see if the constructor is Right, because it must be; Either only has two constructors.

The case expression

We're not limited to using patterns in function definitions. The case expression lets us match patterns at any time. Here's what it looks like.

hasRealRoots a b c = case realRoots a b c of
                       Just _  -> True
                       Nothing -> FalsehasRealRoots a b c = case realRoots a b c of
                       Just _  -> True
                       Nothing -> False

The case keyword is followed by an arbitrary expression; the result of this expression is what we're pattern matching on. The of keyword signifies the end of the expression and the beginning of the block of patterns and expressions.

Each item in the block consists of a pattern, followed by an arrow ->, followed by an expression to evaluate if that pattern matches. The result of the case expression is the result of the expression associated with the first pattern to match, taken from top to bottom.

To express “here's the expression to evaluate if none of the other patterns match”, we would just use the wild card pattern _ as the last in our list of patterns.

A flying visit back to the where clause

Now that we've seen that we can define a function as a series of equations, the usefulness of the where clause should be a bit more clear. Variables that we define inside a where clause are visible across all of the equations that precede it in a single block.

Limits to pattern matching

Every variable that appears within a pattern is a new local variable that will be given a value if the pattern matches, not a reference to some variable on “outside” the pattern. Matching a pattern only lets us do exact comparisons against constructors and simple values. We can't use a pattern to match something like a function. To see what we mean, take a look at the following function.

isHead f = case f of
             head -> True
             _    -> FalseisHead f = case f of
             head -> True
             _    -> False

A naive glance suggests that it's trying to check the value of f to see if it's actually the standard function head, but such an interpretation would be quite wrong. Here's what is really doing.

Because the first pattern in the case expression is just a variable, this pattern will always match, no matter what the value of f is, and the value of f will be given to the local variable head when the right hand side is evaluated.

[Note]Note

A pattern that consists only of a variable will always match, because it's not being compared against anything that might cause the match to fail. We refer to patterns that always match as “irrefutable”.

Finally, because the first pattern always matches, GHC will always evaluate its right hand side and use that as the result of the case expression. The second pattern (the one with the wild card) will never actually be reached. Because the two patterns will match the same values, they are said to overlap. If GHC ever complains to you about overlapping patterns, it's telling you that one of your patterns is the same as another, and so will never actually be matched.

Another limitation of patterns is that a variable can only appear once in a pattern. For example, you can't put a variable in multiple places within a pattern to express the notion “this value and that should be identical”.

The way around these restrictions of Haskell's patterns is to use patterns in combination with a language facility called guards, which we'll talk about next.

Conditional evaluation with guards

We can further extend our expressive arsenal using guards. A guard is an expression of type Bool; if it evaluates to True, the equation that follows it is evaluated. Otherwise, the next guard in the series is evaluated, and so on. Here's an example of guards in action.

guardedRoots a b c
    | n >= 0 && a /= 0 = Just (r1, r2)
    | otherwise        = Nothing
    where n  = b**2 - 4 * a * c
          a2 = 2 * a
          r1 = (-b + sqrt n) / a2
          r2 = (-b - sqrt n) / a2guardedRoots a b c
    | n >= 0 && a /= 0 = Just (r1, r2)
    | otherwise        = Nothing
    where n  = b**2 - 4 * a * c
          a2 = 2 * a
          r1 = (-b + sqrt n) / a2
          r2 = (-b - sqrt n) / a2

Each guard is introduced by a | symbol, followed by the guard expression, then an = symbol (or -> if within a case expression), then the expression to evaluate if the guard succeeds.

The otherwise used in the second guard has an obvious meaning: it's the expression to evaluate if previous gaurds all evaluate to False. It's not a special piece of syntax, though; it's just a variable whose value is True.

We can use guards anywhere that we can use patterns. The combination of writing a function as a series of equations, pattern matching, and guards lets us write code that's clear and easy to understand.

Remember the myTake function we defined in the section called “Conditional evaluation”?

myTake :: Int -> [a] -> [a]

myTake n xs = if n <= 0 || null xs
              then xs
              else myTake (n - 1) (tail xs)myTake :: Int -> [a] -> [a]

myTake n xs = if n <= 0 || null xs
              then xs
              else myTake (n - 1) (tail xs)

Here's a reformulation of that function using patterns and guards. Instead of reasoning about what an if expression is doing and which branch will be evaluated, the code uses a series of equations with simple patterns and guards. This makes it easier to understand the behaviour of the function under different circumstances.

niceTake n _ | n <= 0 = []
niceTake _ []         = []
niceTake n (_:xs)     = niceTake (n - 1) xsniceTake n _ | n <= 0 = []
niceTake _ []         = []
niceTake n (_:xs)     = niceTake (n - 1) xs

Let's return to one of the limitations of patterns that we mentioned in the previous section: the fact that we can't use a pattern to check two variables within the pattern for equality. We can express this quite easily using a guarded pattern.

secondEqualsThird x = case x of
                        (True, a, b) | a == b    -> "foo"
                                     | otherwise -> "bar"secondEqualsThird x = case x of
                        (True, a, b) | a == b    -> "foo"
                                     | otherwise -> "bar"

Here, for good measure, we've illustrated guard syntax in a case expression.

Infix functions

Usually, when we define or call a function in Haskell, we write the name of the function, followed by its arguments; this is called prefix notation, because the name of the function comes before its arguments. For a function that takes two arguments, we have the option of using it in infix form, between its first and second arguments. This allows us to write expressions using functions as if they were infix operators.

The syntax for defining or calling a function in infix form is to enclose the name of the function in backtick characters (sometimes known as backquotes). Here's a simple infix definition.

a `plus` b = a + ba `plus` b = a + b

Defining a function in infix form doesn't change anything about the behaviour of the function. We can call the function using infix or prefix notation, as we prefer.

ghci> 1 `plus` 2
3
ghci> plus 1 2
3
ghci> 1 `plus` 2
3
ghci> plus 1 2
3

Infix notation is useful for more than just our own functions. For example, Haskell's standard Data.List module defines a function, isPrefixOf, that indicates whether all elements of its first argument are equal to the first elements of its second argument.

ghci> :module +Data.List
ghci> :type isPrefixOf
isPrefixOf :: (Eq a) => [a] -> [a] -> Bool
ghci> :module +Data.List
ghci> :type isPrefixOf
isPrefixOf :: (Eq a) => [a] -> [a] -> Bool

Let's define a few variables in ghci.

ghci> let string = "foo"
ghci> let other = "foobar"
ghci> let string = "foo"
ghci> let other = "foobar"

If we call isPrefixOf using prefix notation, we can have a hard time remembering which argument we're checking for as a prefix of the other.

ghci> isPrefixOf string other
True
ghci> isPrefixOf string other
True

But if we use infix notation, the code “reads” more naturally; it's now obvious that we're checking the variable on the left to see if it's a prefix of the variable on the right.

ghci> string `isPrefixOf` other
True
ghci> string `isPrefixOf` other
True

There's no hard-and-fast rule that dictates when you ought to use infix versus prefix notation, although prefix notation is far more common. It's best to choose whichever makes your code more readable in a specific situation.

[Note]Note

The backtick notation is not a general mechanism: it's a piece of special syntax that applies only to names. For example, we can't put backticks around an expression that returns a function, and then treat that as an infix function.

Conclusion

In this chapter, we've had a whirlwind overview of Haskell's type system and much of its syntax. We've read about basic types, compound types, and how to write our own algebraic data types. We've seen how to write functions, and how to declare local variables within them. We've read about Haskell's offside rule for laying out functions, and how we can avoid it if we need to. We've seen conditional evaluation, pattern matching and guards. We've discussed error handling.

This all amounts to a lot of information to absorb. In Chapter 4, Functional programming, we'll build on this basic knowledge to understand how we can write, and think about, code in Haskell.

Want to stay up to date? Subscribe to the comment feed for this chapter, or the entire book.

Copyright 2007 Bryan O'Sullivan, Don Stewart, and John Goerzen. This work is licensed under a Creative Commons Attribution-Noncommercial 3.0 License. Icons by Paul Davey aka Mattahan.