by Carl Burch, Hendrix College, August 2012 Based on a work at www.cburch.com/books/hsfun/.

## Contents

We'll begin our study of Haskell by learning to write functions. For now, we'll equate the terms function and program. In most other languages, the notion of a program is not the same as the notion of a function. But we want to avoid this distinction because Haskell, as a purely functional language, cannot have simple input and output statements. Haskell does support input and output, but it's a more complex concept that we'll reach only much later. For now, we don't want to worry about it: We just want to talk about writing functions that will take a value and produce a value in response. And in this document, in fact, we'll only worry about functions that process numeric values.

To experiment with Haskell programming, I recommend that you get a Haskell interpreter such as Hugs (available at www.haskell.org/hugs/). There are compilers for Haskell that produce reasonably efficient machine code, but using them requires understanding input and output in Haskell, which as I said is a concept we'll tackle much later. With an interpreter, you can write your own functions in a file, load them up into the interpreter, and call them however you like.

## 1. Expressions

We'll begin, though, without defining our own functions. We'll start with the functions already defined in the Haskell standard libraries (called the Haskell Prelude), and we'll learn how we can use these functions to build up more complex expressions. In contrast to imperative languages, where statements are the basic computational construct, in functional language the expression is the basic way of talking about computation. (Of course, imperative languages have expressions, too, but they're subsidiary to the statement. In functional languages, that's all we have.)

First we'll see how to use functions — for example, we might want to use the Prelude's `sqrt` function, which computes the square root of a floating-point value. To compute √5, for instance, we can simply type the following into the interpreter, and it would print back the return value.

 `sqrt 5.0` returns 2.23606797749979

Note that when we call a function in Haskell, we do not place parentheses around the parameter. Of course, we could if we wanted, but they would be redundant. Since Haskell programs use lots of functions, it's convenient to leave them out.

We can use arithmetic operators, as in the following.

 `sqrt (3 * 3 + 4 * 4)` returns 5.0 `sqrt (3 ** 2 + 4 ** 2)` returns 5.0

(The `**` operator represents exponentiation.) The parentheses around the argument in each example are necessary, because function application has a higher order of precedence than the arithmetic operators. Leaving them out would be the same as writing `(sqrt 3) * 3 + 4 * 4`. Notice that Haskell does observe the order of operations among arithmetic operators: Exponentation (`**`) precedes multiplication and division (`*` and `/`), which themselves precede addition and subtraction (`+` and `-`).

Many functions take multiple arguments. Two examples are the `max` function for computing the maximum of two numbers and the `rem` function for computing the remainder of dividing two integers.

 `max 3.0 3.5` returns 3.5 `rem 17 5` returns 2

Notice that we're again not using parentheses to surround the arguments. They would be necessary, however, if we want to use a function to compute the argument to another function.

 `max 3.0 (sqrt 5.0)` returns 3.0

Without the parentheses, the interpreter would parse this as an attempt to pass three arguments into `max`: 3.0, `sqrt`, and 5.0. The interpreter would complain that the expression doesn't type-check: The `max` function expects two numeric arguments, and it's being told to pass it three arguments, and with second argument as a function (`sqrt`), not a number.

## 2. Functions

Now suppose we want to define a function — maybe a `square` function that squares its argument. That's easy enough to do.

`square x = x * x`
A function definition has an equals sign separating its header from the expression saying how to compute the return value. The header in this case is simply a prototype for a function call: In this case, we're saying that we're defining `square` that takes a single parameter `x`.

Now suppose we load this function into our interpreter. (With Hugs, that would involve writing the above into a file with a text editor and then instructing Hugs to load the file.) We would then be able to use the function just as we can use the functions defined in the Prelude.

 `sqrt (square 3.0 + square 4.0)` returns 5.0

You may look at this definition of `square` and object that the parameter `x` looks suspiciously like a variable, which is exactly what we're trying to avoid here. But it is not a variable in one, crucial sense: Its value, determined upon the creation of `x` when entering the function, cannot change. That is, there is no way for us to vary its value, and so it is not a varyable. And in fact, functional programmers would not call `x` a variable — they would call it an identifier. (One could argue, though, that functional languages have a stronger claim to the term variable because they are faithfully following the long-standing term as it has been used by mathematicians for centuries, before imperative languages redefined the term.)

If we want to define a function taking multiple parameters, then we simply list multiple identifiers to identify the different arguments. For example, the following computes the value for x under the normal probability distribution (i.e., a bell curve) with a mean of 0 and a standard deviation of σ (sigma).

`normal sigma x = exp (-0.5 * square (x/sigma)) / (sqrt (2 * pi) * sigma)`

This happens to use the `exp` function (computing ex) and the `pi` identifier (representing π ≈ 3.14) defined in the Prelude.

Haskell allows you to define additional identifiers to represent helper values using a `where` clause. For example, we might instead define `normal` as the following.

```normal sigma x = exp (-0.5 * square devs) / (factor * sigma)    where devs = x / sigma          factor = sqrt (2 * pi) ```

When we do this, and when we have multiple identifiers to define, it's important that the definitions each start in the same column. Otherwise, the Haskell interpreter will not be able to tell where one definition stops and another begins. (Generally, white space is not important to Haskell, but in this case it is.)

## 3. Conditional expressions

Sometimes, we may want the value of the function to be computed different ways in different circumstances. For example, suppose we want a `distance` function to compute the distance a vehicle has gone after t seconds, supposing that the vehicle starts at a standstill, accelerates at a rate of a m/s², to a terminal velocity of vmax m/s. The computation of the distance will depend on whether enough time has elapsed for the vehicle to reach its terminal velocity, which will occur at time vmax / a.

We can make this determination using an `if``then``else` expression.

```distance a vmax t = if t < timeToMax                         then 0.5 * a * square t                         else vmax * (t - 0.5 * timeToMax)     where timeToMax = vmax / a ```

(The line breaks in this definition are present only because otherwise the line would be too long. The Haskell parser ignores them.) In this case, if t is too small for the vehicle to reach the terminal velocity, then we compute the distance based purely on a constant acceleration; but if t is larger than that, then we deduct from t half of the time to terminal velocity (since the average speed during this time is half of vmax) and multiply that by vmax.

The `if``then``else` construct looks similar to the `if``then``else` statements that commonly occur in imperative languages, but don't let the syntax fool you into thinking they are the same. Haskell's `if``then``else` expression is always for computing a value of an expression, and so the expression in the `then` and in the `else` must both contain expressions whose values have the same type. One of the consequences is that in Haskell, the `else` expression is always required. In contrast, `if` statements in imperative languages typically have statements to perform in the `then` and the `else` blocks; neither block has any type. Actually, if you want an analogue to Haskell's `if``then``else` in imperative languages, you should think of the `?``:` ternary operator in C/C++/Java.

In addition to the `if``then``else` expression for testing conditions, Haskell also defines guard notation, where you can place conditions on the parameters before specifying the return value. This idea is inspired by mathematical notation for handling conditions:

d(a, vmax, t) = {
 ½ a t² if t < tmax vmax (t - ½ tmax) otherwise

Haskell has the condition precede the return-value expression, but otherwise it looks quite similar to the above mathematical notation.

```distance a vmax t | t < timeToMax  = 0.5 * a * square t                   | otherwise      = vmax * (t - 0.5 * timeToMax)     where timeToMax = vmax / a ```

On a call to this `distance` function, the function would get the parameter values, then step through the guards to see which is the first to be true for those parameters (with `otherwise` being a catch-all), and then the return value would be the value of the expression following that guard expression.

Throughout the preceding discussion, we have used the boolean type without defining it. All conditions — whether within the `if` of an `if``then``else` expression or as a guard expression — must have a boolean type. Haskell calls this type `Bool`, and its two truth values are represented as `True` and `False`.

The operators dealing with Booleans are the comparison operators: `==`, `/=` (not-equals), `<`, `<=`, `>`, and `>=`. These fall below the arithmetic operators in the precedence hierarchy. Below that is logical AND (`&&`) and below that is logical OR (`||`). There is also the logical NOT, implemented as the regular function not.

To summarize, here are the operators we have seen so far and their precedence levels.

 operators associativity highest function application left `**` (exponentiation) right `*`, `/`, ``rem`` left `+`, `-` left `==`, `/=`, `<`, `<=`, `>`, `>=` none `&&` left lowest `||` left

Note that the comparison operators are not associative. If you write `0 <= x < 1`, the parser will complain that the order of evaluation is ambiguous. (And of course, if you simply added parentheses to get `(0 <= x) < 1`, the type-checker would complain that you can't compare a boolean value (the result of `0 <= x`) to an integer (1). You would need to write this is using logical AND: `0 <= x && x < 1`.)

## 4. Infix and prefix functions

In Haskell, binary operators such as `+` and `*` are functions, too, though they are used differently in that their names are placed between the two arguments (infix notation) rather than before them (prefix notation). Haskell has two categories of functions: functions whose names consist solely of symbols (like `+`), which are normally invoked using infix notation, and functions whose names consist solely of letters (like `sqrt`), which are normally invoked using prefix notation. (Negation (as in −x) is the sole exception to the rule that all symbolic names are infix operators. Haskell uses `-` for both negation and subtraction, and of course the subtraction function is an infix operator.)

For cases when this behavior is not desirable (and such cases occur with fair regularity), Haskell allows the programmer to talk about infix operators as prefix operators and vice versa. Functions with symbolic names are by default infix operators, but placing the name within parentheses changes it to a prefix operator

 `(*) 3.0 4.0` returns 12

Likewise, a function with a textual name is a prefix operator, but placing the name in backticks changes it to an infix operator.

 `3.0 `max` 3.5` returns 3.5 `17 `rem` 5` returns 2

Obviously, this is relevant only for functions that take two arguments. Incidentally, the Prelude defines the infix remainder function (``rem``) to have the same precedence as multiplication and division.

Haskell has another feature for infix operators that is called the section: You can specify one of the arguments within the set of parentheses, getting a function that takes an argument and returns the result of applying the operator to both arguments. The following, for example, define two functions, half that returns the number that is half its argument, and negate that returns the negation of its argument.

```half   = (/2) negate = (0-) ```

The following illustrates these being used.

 `(/2) 13.0` returns 6.5 `(0-) 15` returns −15

## 5. Iteration

One thing that is bound to frustrate imperative programmers is that functional languages don't have loops. They can't: The whole point of a loop is to repeat something until some condition no longer holds, but because functional programs don't have any variables, a condition's value can't change. Thus, whenever we have a repetitive process — and what's the point of programming if we can't mechanize repetitive processes? — the most important technique for iteration in imperative languages isn't available to us.

But there is that other technique — that technique that is denigrated in some circles as confusing, in others as inefficient: recursion. Luckily, it's not as confusing in functional programming, because the side effects cause a large part of the confusion. And we'll see later that there isn't necessarily a substantive efficiency concern. At any rate, recursion is the most important technique for iteration in functional languages.

We can illustrate this with the factorial function, which is typically represented as n!, but which we'll represent as fact(n). Mathematicians traditionally define the function recursively, as they define all iterative processes.

fact(n) = {
 1 if n = 0 n ⋅ fact(n − 1) otherwise

In Haskell, this would translate to the following.

```fact n | n == 0     = 1        | otherwise  = n * fact (n - 1) ```

Haskell provides a different way of defining functions called patterns. The idea here is that we write multiple definitions of the function, but each time using a different pattern for the arguments. For `fact`, we'll have one pattern that will match an argument of 0 only, while the second pattern will match any value to the name `n`.

```fact 0 = 1 fact n = n * fact (n - 1) ```

Given a function defined with patterns, the system will translate this to a function that receives the arguments, goes through the various patterns to find the first pattern that matches the arguments, and then computes the return value based on the return value for that pattern.

Pattern definitions become more interesting for multi-argument functions. The following simple function computes the area of a rectangle given its width and height.

```area 0 _ = 0 area _ 0 = 0 area w h = w * h ```

This definition uses the special underscore pattern, called the wildcard, which will match anything (like an identifier) and throw the value away (unlike an identifier). Given this definition, the function would first check whether the width is 0, in which case it returns 0; and if not, then it would check whether the height is 0, in which case it also returns 0; and if neither of those patterns matched, then it would have to multiply the width and the height.

Of course, the above isn't recursive. However, if for some reason we wanted to avoid multiplication, we could make it recursive by shaving off one unit from the width and the height, computing the area of the remainder, and then adding back the area of what we shaved off.

```area 0 _ = 0 area _ 0 = 0 area w h = area (w - 1) (h - 1) + (w + h - 1) ```

(This isn't exactly equivalent to the preceding function, though. One differences arises when given a negative width or height: the first definition would return a number, while the second would be infinitely recursive.)

Based on the discussion so far in this document, you should now be able to make full sense of the following, relatively sophisticated Haskell program.

```testFactorsFrom i n     = (i * i > n) || (n `rem` i /= 0 && testFactorsFrom (i + 1) n) isPrime n = testFactorsFrom 2 n countPrimesTo n | n < 2     = 0                 | isPrime n = 1 + countPrimesTo (n - 1)                 | otherwise = countPrimesTo(n - 1) ```

## 6. Function types

It's good practice when you write a function to define the type of the function's identifier. If you leave the type definition out, then the Haskell interpreter/compiler will determine it for you using a process called type inference. Given the `square` function above, for example, it would notice that you are multiplying something with x, and so x must be a number; and it would notice that the result is returned, and so `square` must return a number. But giving the types is good practice for two reasons:

• It enhances the readability of your program, as a reader doesn't have to figure out what sort of thing each argument is.

• It can have a dramatic impact on the efficiency of a program. In the `square` example, the inferred type will end up being able to take two integer values and produce an integer value, and it will be able to take two floating-point values and produce a floating-point value, and it will be able to take two complex values and produce a complex value. (We'll see how this works later, when we get to the notion of type classes.) In many compilers, though, `square` will have to end up determining which multiplication to perform at run-time, which will impact slow the computation down considerably.

By declaring the type of the parameters, `square` will know the types of its parameters, and it will be able to go directly into the appropriate multiplication routine without any run-time type accesses.

To illustrate, we can define `square` as a function that takes a floating-point value and returns a floating-point value. To do this, we use the `Double` type, which represents floating-point values. (We could use the `Integer` type instead if we wanted to be able to square integers.)

```square :: Double -> Double square x = x * x ```

The declaration of an identifier's type begins with the identifier name, followed by a double-colon, followed by the type name. In this case, we have a function from `Double`s to `Double`s, and we use the arrow notation to represent this.

Type names in Haskell always begin with a capital letter, while identifier names (including function identifiers) must always begin with a lower-case letter — unless, of course, the identifier happens to be an infix operator, in which case it would consist entirely of symbols.

What do we do for functions like `normal` that take multiple parameters? The answer is a bit interesting:

`normal :: Double -> (Double -> Double)`

What we're saying here is that `normal` is a function that takes a `Double` and returns a function that itself takes a `Double` and produces a `Double`. This is bound to strike a newcomer to Haskell as peculiar: We were thinking of `normal` as a function that takes two `Double`s and computes a `Double` value, but now we're defining it as a function that takes a single `Double` and produces another function.

To make sense of this notion, look at how we would call `normal`.

 `normal 1.0 1.0` returns 0.241970724519143

Before, we thought of this as passing two `Double` values into `normal` at once. But that's not how Haskell actually interprets this. Instead, it interprets the expression as passing 1.0 into `normal`, and then passing 1.0 into the result: That is, Haskell interprets it the same as as `(normal 1.0) 1.0`: First, it computes the function that `normal` produces given 1.0 as its parameter, and then it computes the value that new function produces given 1.0 as its parameter. Fundamentally, Haskell understands function application as a nameless left-associative binary operator that happens to have the highest precedence.

The arrow operator in type definitions is right-associative. (By contrast, arithmetic operators in expressions are left-associative: That is, given 2 − 3 − 4 in traditional mathematics and in Haskell, we process left-to-right, computing (2 − 3) − 4 rather than 2 − (3 − 4).) Because of this, we can omit the parentheses in our type declaration for `normal`, and Haskell will assume we placed parentheses around the rightmost.

`normal :: Double -> Double -> Double`

This concept of functions that produce other functions can be used within Haskell. For example, suppose we want to define `threshold` that takes a number x and returns x if x is positive and 0 otherwise. We can give a simple definition of such a function using the Prelude's `max` function.

`threshold x = max 0 x    -- one definition`

But an even simpler defintion is:

`threshold = max 0        -- alternative definition`

Here, we are defining `threshold` as the function that `max` returns given 0. Of course, this function computed by `max` will take a number and return the maximum of 0 and that number, which is exactly what we want `threshold` to do.

Another example is the following double function that multiplies its argument by 2; this definition encloses the multiplication operator in parentheses so that we can treat it just like regular two-argument functions.

`double = (*) 2`

The transformation of a function from one that takes multiple arguments to produce a value into another that takes a single argument and produces another function to take another argument, is called currying. And a function, such as `normal`, that works this way, is called a curried function. These terms derive from the name of Haskell Curry, who is also the namesake of the Haskell language. Curry didn't invent currying, but he used it intensively enough that the idea became named after him. Curry, who died in 1982, also predates the invention of Haskell, introduced in 1990. But he was instrumental in defining many of the concepts that underlie functional languages in the early 20th century.

## 7. Functions as first-class values

One of the interesting features of Haskell — and with most functional languages — is that it treats functions as regular data, just like integers, or floating-point values, or other types. Language specialists say that functions are first-class data types, meaning that programs can work with them just as well as they can work with other data.

Curried functions — where a function takes a parameter and produces another function — is one facet of this. Another facet is that functions can take other functions as parameters. For example, we might want to estimate the area under a function curve (i.e., the integral) using the trapezoid method, where we divide the range of interest (ab) into equal-sized pieces. We can easily write such a function, taking the function to be integrated as a parameter, along with the width of the trapezoids delta and the range of interest (ab).

```integrate :: Double -> (Double -> Double) -> Double -> Double -> Double integrate delta f a b | a + delta > b  = (b - a) * (f a + f b ) / 2                       | otherwise      = delta   * (f a + f a') / 2                                            + integrate f delta a' b                                          where a' = a + delta ```

To find the area below the function √x and the function ex for x in the range (0, 1), we would use the below expressions.

 `integrate 0.01 sqrt 0 1` returns 0.666462947103147 `integrate 0.01 exp  0 1` returns 1.71829614745042

For those whose integral calculus is rusty or non-existent, let me add that the exact values are 2/3 and e ≈ 1.71828. The values we got were pretty close.

Incidentally, the above definition is somewhat poor because it will end up evaluating the function twice at each point, which will lead it to be slower. Fixing this is just a matter of adding a helper function that takes f(a) for the current a as a parameter.

```integrate :: Double -> (Double -> Double) -> Double -> Double -> Double integrate f delta a b = sub a (f a)   where sub a fa | a + delta > b  = (b - a) * (fa + f b) / 2                  | otherwise      = delta   * (fa + fa') / 2 + sub a' fa'                                     where a' = a + delta                                           fa' = f a' ```

We can also define functions that take a function and produce a function as a result. For example, we might want to define a function `threshold` that takes a value z and a function f and returns a new function g that is identical to f except that it is clipped off wherever f goes above z. Essentially, we want the return function g to be one that takes a value x and returns the minimum of z and f(x). We can easily write exactly that in Haskell.

```threshold :: Double -> (Double -> Double) -> (Double -> Double) threshold z f = g where g x = min z (f x) ```

In writing the type for `threshold`, that the final set of parentheses is unnecessary, since the arrow in type declarations is right-associative anyway. And, indeed, this fact reveals that we could equivalently define `threshold` as follows, though it's not quite as readable as our first try.

```threshold :: Double -> (Double -> Double) -> Double -> Double threshold z f x = min z (f x) ```

The Prelude defines a handful of functions defined to take functions as parameters and produce other functions. One of the most useful of these is the composition operator. Given two functions f and g, the composition of f and g is represented mathematically as f ⋅ g and defined as the function h where h(x) = f(g(x)). In Haskell, the composition operator is written using the period (`.`), but otherwise it's the same.

`f . g = h where h x = f (g x)   -- as defined in the Prelude`

The composition operator is very high in the precedence hierarchy — just below function application, above the arithmetic operators.

We can use the composition operator to shorten our definition of `threshold` yet further. First, we'll add parentheses to our second definition from above to make the currying clear.

`threshold z f x = (min z) (f x)`

Here we have something that looks very similar to the `h` returned by function composition: We have `_ (_ x)`. This is the same as `(_ . _) x`.

`threshold z f x = ((min z) . f) x`

Finally, since the final parameter is simply passed into another function, we can take it off both sides to arrive at:

`threshold z f = (min z) . f`

And this makes sense: Given a value x, we'll first apply f to it to arrive at f(x), and then we'll pass that as the second argument to min to get the thresholded value.

It's an interesting diversion to extend the process further to remove the other parameters to `threshold` — though it's certainly just a diversion, since it certainly doesn't clarify the function's purpose. First, we'll convert the usage of the composition operator to prefix notation.

`threshold z f = (.) (min z) f`

Now because `(.)` is curried, we can observe that this is the same as

`threshold z f = ((.) (min z)) f`

Now we're in the same situation as earlier — the final parameter is simply passed into a function — and so we can strip off the final parameter.

`threshold z = (.) (min z)`

To get rid of the z parameter, we simply observe that this definition itself looks like function composition: We apply min to z, and then we apply the function composition operator to the result. So we can rewrite the above as

`threshold z = ((.) . min) z`

And now the final parameter is simply passed into another function, so we can strip off that parameter too.

`threshold = (.) . min`

Writing `threshold` this way would get you no points for readability, but you could feel pride in being brief!

I hasten to add that there are some important limits on what you can do with functions: You can't test two functions to see if they're equal — i.e., if they work the same. In fact, determining this is mathematically impossible (in the same sense that the halting problem is uncomputable). Also, functions can't be converted into a reasonable string representation, and so any attempts to display them will fail. But Haskell regards these operations as library features, not a feature of the base language, and in terms of the base language, we can do anything with functions that we can with other data.

Many imperative languages, by contrast, do not give functions such a first-class status. If you're very familiar with C, you might object that it has a function pointer feature, allowing a function to take a function pointer as a parameter, to make a call to the function referred by that function pointer within the function, and to return a function pointer. But C's facility, though nice, isn't as strong as Haskell's, for a function pointer can only refer to a preexisting function. In C, you can't write a function like `threshold` that would take one function as input and produce another function as output. (However, there are a reasonable number of exceptions to the rule: Smalltalk, JavaScript, and Python all have flexibility of function handling that nears the expressiveness of Haskell.)

## 8. Polymorphic functions

In our function types thus far, we've worked with specific types. But Haskell also has a feature where functions can work with arbitrary types. Consider the following translate function which takes a function f and a distance d and returns a new function g that is f translated d units to the right. `translate f d = g where g x = f (x - d)`

One way to define the type of this function is to restrict f to be a mapping from `Double`s to `Double`s.

`translate :: (Double -> Double) -> Double -> (Double -> Double)`

But this is not as general as possible. It's true that f must take a number as a parameter — after all, we pass the result of a subtraction into f — but its return value need not be a number. We might also want to translate a function that maps numbers to Booleans — that essentially identifies a subset of the real number line, like the set of numbers x for which sin x < 0.5. But the type declaration as written above would disallow this.

Enter polymorphic functions, where we can have type variables (there's that nasty word again) that represent an abstract type to based on how the function is actually used. In Haskell, each type variable is generally a lower-case letter.

`translate :: (Double -> a) -> Double -> (Double -> a)`

In this case, we're saying to let a be whatever the return type of the first parameter function is; the final return value would be a function from `Double`s to a's. With this type declaration (or with the type that would be inferred were we to omit the type declaration), we could use translate to translate any function taking a `Double` as a parameter — whether the return value would be a `Double`, a `Bool`, or even another function.

The composition function `(.)` is an interesting case to study with polymorphic types. Recall that we defined it as the following.

`f . g = h where h x = f (g x)   -- as defined in the Prelude`

Now what's the most general type we could assign to this function?

We'll start with x, the parameter to h, the return value, and assign it a type variable `a`. Notice that x is passed to g also, so g must also take a parameter of type `a`. Let's suppose that g returns a value of type `b` (which may or may not be the same as `a`). To compute h, the return value of f is passed into f, so f's parameter type must be `b`. Whatever f returns — call it `c` — is likewise returned by h. This leads to the following type declaration.

`(.) :: (b -> c) -> (a -> b) -> (a -> c)`

This is a pretty general type!