Caml offers the following control structures:
if ... then ... else
match ... with ...
;
for and while
raise ... and the corresponding error handling construct try ... with ...
If none of these control structures can fit your needs, you must define your own structures using functions (recursive or not).
Table of contents:
if ... then ... else
if ... then ...
if ... then ... else
For tests we classically use an
if condition then expression1 else expression2 construct,
that returns expression1 or expression2,
according to the boolean value of condition.
Boolean arithmetic comparisons are
=, <, >,
<=, >=, <>,
and boolean operators are
&&, ||, not.
For instance:
let absolute_value x = if x >= 0 then x else - x;;
absolute_value : int -> int = <fun>
Type checking:
the condition must have type bool, and both branches
must have the same type (which is the type of the whole construct).
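As a small illustration of this typing rule, here is a sketch in OCaml syntax (the function names are hypothetical and chosen only for the example). In each function both branches have the same type, which is also the type of the result:

```ocaml
(* Both branches are of type int, so the result is an int. *)
let max_of a b = if a >= b then a else b

(* Both branches are of type string; the condition uses && and <=. *)
let in_range low high x =
  if low <= x && x <= high then "inside" else "outside"

(* The condition itself must be a bool; here it is built with not. *)
let parity n = if not (n mod 2 = 0) then "odd" else "even"
```

Writing a branch of a different type, for instance `if a >= b then a else "b"`, is rejected by the type checker.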
if ... then
An if construct without the corresponding else part
is automatically completed with an extra else ()
clause. This kind of construct is useful inside
sequences.
Type checking: as for a regular alternative,
the condition must have type bool and both branches
must have the same type (unit).
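The following sketch, in OCaml syntax, shows an if without else used inside a sequence. The buffer `trace` and the function `report` are hypothetical names introduced for the example; the buffer only serves to record which effects took place:

```ocaml
(* Records the effects, so that they can be inspected afterwards. *)
let trace = Buffer.create 16

let report x =
  (* No else part: equivalent to
     if x < 0 then Buffer.add_string trace "negative " else () *)
  if x < 0 then Buffer.add_string trace "negative ";
  (* This second expression of the sequence is always evaluated. *)
  Buffer.add_string trace "done "
```

The then branch must have type unit here, since the implicit else branch is ().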
Pattern matching or shape analysis is used to perform complex tests involving the shape of the argument applied to the construct.
Syntactically, a pattern matching is a list of
clauses. Each clause is a construct of the form
| pattern -> expression. The
pattern matching mechanism is an analysis of the shape of a given value
to compare it with each of the patterns or shapes of the list
of clauses. The analysis succeeds as soon as the value can be
considered as an instance of the pattern of the clause.
A value is an instance of a pattern if it is a special case of this
pattern (in other words, if the pattern is a generalized form of the
value or else the pattern is more general than the value).
A pattern is made of constants (basic constants or constructors),
variables, and applications of functional constructors to argument
patterns. Consider for instance the clauses | 0 -> 1
(integer constant pattern 0) or | n -> 1
(the pattern is just a single variable n). The only value
that matches the pattern 0 is the integer
0, whereas the pattern n matches every
value, whatever it is. Finally, a pattern such as C
f where C is a constructor and f
another pattern, matches all the values built using the constructor
C and an argument value v that matches
pattern f.
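To make the three kinds of patterns concrete, here is a sketch in OCaml syntax. The type `shape` and its constructors are assumptions made up for the example; they are not part of Caml:

```ocaml
(* A hypothetical type with a constant constructor and two
   functional constructors. *)
type shape =
  | Point
  | Circle of int            (* radius *)
  | Rectangle of int * int   (* width, height *)

let describe s =
  match s with
  | Point -> "point"                      (* constant constructor *)
  | Circle 0 -> "degenerate circle"       (* constructor applied to a constant pattern *)
  | Circle r -> "circle of radius " ^ string_of_int r   (* argument pattern is a variable *)
  | Rectangle (w, h) ->
    "rectangle " ^ string_of_int w ^ "x" ^ string_of_int h
```

The value Circle 0 is an instance of both Circle 0 and Circle r; the first of these clauses in program order is the one selected.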
Patterns of clauses are tried in turn following the order of the text of the program, until one succeeds.
If none matches the value at hand, a run-time error is raised (the Match_failure exception listed among the predefined exceptions).
The value returned by a pattern matching is the one returned by the
corresponding expression of the clause that succeeded.
The basic syntactic construct to perform pattern matching is
match expression with clauses: it evaluates
expression, then matches the resulting value with each
clause of the list of clauses given in the clauses part
of the matching.
For instance, succ 0 being 1, the
following pattern matching returns the expression corresponding to the
clause | 1 -> "unit":
#match succ 0 with
 | 0 -> "null"
 | 1 -> "unit"
 | _ -> "several";;
- : string = "unit"
Note the last clause
| _ -> "several", that involves an underscore in the
pattern part: the _ pattern means ``anything else''
and matches any value. It is very often used as the last pattern to
catch all the remaining cases.
Pattern matching a value against a variable has another effect:
during the computation of the expression on the right of the
corresponding clause, the variable evaluates to the value (
pattern matching performs bindings of variables to values).
For instance, succ 1 being 2, the following
pattern matching returns the expression of the clause | n ->
string_of_int n, thus string_of_int n
with n evaluating to 2 (we say that
n is bound to 2):
#match succ 1 with
 | 0 -> "null"
 | 1 -> "unit"
 | n -> string_of_int n;;
- : string = "2"
Note that the if ... then ... else ... construct
is just a special case of pattern matching on booleans:
if cond then e1 else e2 corresponds to
match cond with | true -> e1 | _ -> e2
Type checking:
all clauses of a pattern matching must have the same type:
patterns must have the same type and the expressions must have a
common type as well. In the case of the match
e with matching construct, the type that patterns have in
common must be the type of e, and the type of all the
expressions of the clauses of the pattern matching is the type of the
whole construct.
Whenever possible, pattern matchings must be exhaustive (every possible case gets a corresponding clause) and must not contain useless clauses (typically two distinct clauses that cover the same case). The Caml compiler emits a warning when it detects such a defective pattern matching. Whenever the pattern matching mechanism fails at run time because the value is not an instance of any of the patterns, a Match_failure exception is raised, indicating where in the source code the pattern matching failed.
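Here is a sketch, in OCaml syntax, of what happens with a non-exhaustive matching. The function `describe` is a hypothetical example, and the first line is only there to silence the compiler warning for the purpose of the demonstration; in real code the warning should be fixed, not hidden:

```ocaml
[@@@warning "-8"]  (* demo only: disable the non-exhaustive matching warning *)

(* No clause handles integers other than 0 and 1. *)
let describe n =
  match n with
  | 0 -> "null"
  | 1 -> "unit"

(* Applying describe to 2 raises Match_failure, whose argument gives
   the file name and the position of the failing matching. *)
let outcome =
  try describe 2 with
  | Match_failure (_file, _line, _column) -> "match failure"
```

Adding a final clause such as | _ -> "several" makes the matching exhaustive and removes both the warning and the run-time failure.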
For syntactic reasons, a pattern matching that occurs inside another
one must be parenthesized (using parentheses or
begin and end keywords).
The reason is clearly evident, if you consider
the pattern matching of the expression e2, inside the
pattern matching of e1:
match e1 with
| f1 ->
   match e2 with
   | f2 -> ...
   | f3 -> ...
| f4 -> e2
As it is written here, one understands that f2 and f3
belong to the pattern matching of e2, and this is correct.
But, a human being will probably believe
that the indentation is significant, and will consider that the
pattern f4 belongs
to the surrounding pattern matching, the one of e1.
Still f4 simply follows f3, which we agree
belongs to the pattern matching of e2. That's why,
although the indentation is here error prone, f4
syntactically belongs to the internal pattern matching
(the one of e2). Presumably, the intended meaning of the
program was:
match e1 with
| f1 ->
   begin match e2 with
   | f2 -> ...
   | f3 -> ...
   end
| f4 -> e2
To avoid such errors, you must always use parens when writing pattern matchings inside other pattern matchings.
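A complete instance of this rule could look as follows, in OCaml syntax. The function `classify` is a hypothetical example; the begin ... end pair closes the inner matching, so that the last clause belongs to the outer one:

```ocaml
let classify a b =
  match a with
  | 0 ->
    begin match b with
    | 0 -> "both zero"
    | _ -> "only a is zero"
    end
  | _ -> "a is non-zero"   (* clause of the outer matching on a *)
```

Without begin and end, the last clause would be read as a third clause of the matching on b, and the compiler would report it as useless since it follows a catch-all pattern.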
If you have some condition to test during pattern matching, you may
use a guard, that is a boolean condition that will monitor the
selection of the pattern matching clause.
A guard is introduced by when, followed by the expression
to test. If this expression evaluates to true the clause
is selected (when its pattern part matches), otherwise the next
pattern is checked. For instance:
(* power i j computes i ^ j, provided j is positive. *)
let rec power i j =
  match j with
  | 0 when i != 0 -> 1
  | 1 -> i
  | j when j mod 2 = 1 ->
     let p = power i (j / 2) in
     i * p * p
  | j when i = 0 -> invalid_arg "power"
  | _ ->
     let p = power i (j / 2) in
     p * p;;
power : int -> int -> int = <fun>
Another typical example is to test the equality of two variables of the pattern. The following predicate tests if a list has two successive elements that are identical:
let rec two_successive_equal = function
  | [] -> false
  | [_] -> false
  | x :: y :: l when x = y -> true
  | x :: l -> two_successive_equal l;;
two_successive_equal : 'a list -> bool = <fun>
The sequence e1; e2 means successive evaluation of the
two expressions e1 then e2. The construct
returns the value of e2. This construct is used to
perform effects, for instance printing effects:
#print_string "Hello "; print_string "World!";;
Hello World!- : unit = ()
Type checking: a sequence has the type of its last expression.
In Caml, there is no need to parenthesize a sequence appearing in a
pattern matching clause:
... -> e1; e2 is equivalent to ... -> begin
e1; e2 end.
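For instance, in the following sketch (OCaml syntax; the buffer `out` and the function `announce` are hypothetical names used to record effects), each clause contains a sequence of two expressions without any begin ... end:

```ocaml
let out = Buffer.create 16

let announce n =
  match n with
  | 0 -> Buffer.add_string out "zero"; Buffer.add_string out "!"
  | _ -> Buffer.add_string out "non-zero"; Buffer.add_string out "?"
```

Both expressions of a clause are evaluated when that clause is selected, and none of the expressions of the other clause.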
if ... then ... else ...
A sequence appearing in a branch of an alternative must be
parenthesized. The common habit is to use begin and
end. One writes
if cond then begin e1; e2 end else begin e3; e4 end
On the other hand, an alternative inside a sequence does not need any parens. So
if cond1 then e1; if cond2 then e2 else begin e3; e4 end;;
means:
if cond1 then e1 else (); if cond2 then e2 else begin e3; e4 end;;
If you forget to parenthesize a sequence that is meant to be one of the branches of an alternative, only the first expression of the sequence belongs to the branch; the rest belongs to the surrounding expression. For instance, in the expression:
if cond1 then e2 else e3; e4;;
the expression e4 is always evaluated, whatever the value
of the cond1 condition could be. That's because the whole
expression is equivalent to:
(if cond1 then e2 else e3); e4;;
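The difference can be observed with the following sketch in OCaml syntax. The names `trace`, `note`, `unparenthesized` and `parenthesized` are hypothetical; each call to `note` stands for one of the expressions e2, e3, e4 of the text:

```ocaml
let trace = Buffer.create 16
let note s = Buffer.add_string trace s

(* Read as: (if cond then e2 else e3); e4 *)
let unparenthesized cond =
  if cond then note "e2 " else note "e3 ";
  note "e4 "

(* Here e4 really belongs to the else branch. *)
let parenthesized cond =
  if cond then note "e2 " else begin note "e3 "; note "e4 " end
```

With a true condition, the first function evaluates e2 then e4, whereas the second evaluates e2 only.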
There are two kinds of loops, for and
while loops. They are used to perform effects and always
evaluate to the value ()
or ``nothing''.
One writes while condition do body done, to mean
repetitive evaluation of body, while
condition evaluates to true. In particular, if
condition always evaluates to false body is
never evaluated.
Type checking:
the condition must have type bool, and a loop always has
type unit.
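Since a while loop returns (), it is only useful through its effects. The sketch below, in OCaml syntax, assumes the use of references (ref, ! and :=) to hold the state modified by the loop; the function name `count_digits` is hypothetical:

```ocaml
(* Number of decimal digits of the absolute value of n. *)
let count_digits n =
  let remaining = ref (abs n) in
  let digits = ref 1 in
  while !remaining >= 10 do
    remaining := !remaining / 10;
    incr digits
  done;
  !digits
```

For a one-digit argument the condition is false from the start, so the body is never evaluated and the function returns 1.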
One writes for ident = initial to final do body
done, to mean repetitive evaluation of
body, with ident successively bound to
initial, initial + 1, ...,
including final. In particular, if
initial is strictly greater than final,
body is never evaluated.
Note that ident is defined by the
for loop, there is no need to declare or define it in advance.
Notice also that you cannot modify the value of ident
when the loop is running (since
ident is not bound to a mutable
reference).
#for i = 0 to 10 do print_int i; print_char ` ` done;;
0 1 2 3 4 5 6 7 8 9 10 - : unit = ()
Type checking:
loop index has type int, as well as
expressions initial and final.
The loop gets type unit.
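As with while, the result of a for loop has to be accumulated through effects. A minimal sketch in OCaml syntax, assuming a reference as accumulator (the function name `sum_between` is hypothetical):

```ocaml
(* Sum of the integers from first to last, both included. *)
let sum_between first last =
  let total = ref 0 in
  for i = first to last do
    total := !total + i   (* i is bound by the loop; it cannot be assigned *)
  done;
  !total
```

When first is strictly greater than last, the body is never evaluated and the sum is 0.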
To exit from a while or for loop, you
must use the exception mechanism. Escape from the loop by raising the
exception devoted to this case, the Exit predefined
exception (see below).
The raise primitive is used to throw exceptions (signal
errors) when computation cannot go on. These
exceptions ought to be caught by a surrounding ``try ... with''
construct to treat the error (either by aborting the program after
some error message, or going on another way). If there is no handler
the failure propagates until the entire evaluation process finally
aborts. For instance:
#print_string "Hello ";
#raise (Failure "division by zero");
#print_string " world!\n";;
HelloUncaught exception: Failure "division by zero"
Type checking:
if exception has type
exn, then raise exception has type
'a.
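Because raise exception has type 'a, it can stand in a branch of any type. The sketch below, in OCaml syntax, uses it once where an int is expected and once where a string is expected; the function names are hypothetical:

```ocaml
(* The else branch is an int, so raise ... is used here with type int. *)
let checked_div x y =
  if y = 0 then raise (Failure "division by zero") else x / y

(* The other clauses are strings, so raise ... is used here with type string. *)
let name_of_digit d =
  match d with
  | 0 -> "zero"
  | 1 -> "one"
  | _ -> raise (Invalid_argument "name_of_digit")
```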
The try expression with matching construct
is used to handle errors: expression is evaluated while
catching errors that may happen during this evaluation.
Errors that may appear during the evaluation are carried out via
exceptions that tell the kind of errors at hand. These errors
may be as numerous as necessary, so that we list them in the ``pattern
matching'' part of the try ... with: if an exception of
that kind occurs during the evaluation of expression, it
is matched against the clauses of this pattern matching of the
try ... with. If the match succeeds, the exception is
stopped (that is, ``the error is recovered''): the expression
corresponding to the matching clause is then returned, and the
evaluation goes on (that is, the program does not abort).
If no clause matches the exception, this exception is propagated (that is, the exception is thrown again, and the program goes on aborting unless an enclosing handler catches it).
Type checking:
the type checking rule for try expression with
matching is analogous to the type checking rule for match
expression with matching:
the patterns must have the type of exceptional values,
i.e. exn, and all the expressions of the right hand side
of clauses must have the same type as the expression that the handler
protects, expression. This is the type of the whole construct.
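The two behaviours described above, recovery and propagation, can be sketched as follows in OCaml syntax. The function names are hypothetical; `int_of_string` and `List.assoc` are OCaml standard library functions that raise Failure and Not_found respectively:

```ocaml
(* Recovery: the handler has the same type (int) as the protected expression. *)
let parse_or_default default s =
  try int_of_string s with
  | Failure _ -> default

(* Propagation: the inner handler only knows Failure, so a Not_found
   raised by List.assoc goes through it and reaches the outer handler. *)
let find_or_zero key table =
  try
    (try List.assoc key table with Failure _ -> -1)
  with Not_found -> 0
```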
Using exception handling, you can easily escape from any kind of loops
(for or while, or even loops encoded into a
recursive function): this is equivalent to the C break
instruction, or the Ada exit instruction.
Premature end of loop is obtained by raising the predefined exception
Exit from within the body of the loop, while handling
this exception from outside the loop.
For instance, we define a predicate that tests if a vector contains a zero: if we encounter a 0, we escape from the loop and return true; otherwise the loop goes on until its normal end, and we return false.
let has_zero v =
  try
    for i = 0 to vect_length v - 1 do
      if v.(i) = 0 then raise Exit;
    done;
    false
  with Exit -> true;;
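For readers using OCaml rather than Caml Light, the same predicate could be sketched as follows; the only assumption is that `vect_length` is replaced by `Array.length`, its counterpart in the OCaml standard library:

```ocaml
let has_zero v =
  try
    for i = 0 to Array.length v - 1 do
      (* Leave the loop as soon as a zero is found. *)
      if v.(i) = 0 then raise Exit
    done;
    false
  with Exit -> true
```

On an empty vector the loop body is never evaluated, so the function returns false.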
You define your own exceptions using
exception name;; for a constant exception, or
exception name of typ;; for an exception having an
argument. These exceptions are generative, that is to say:
two successive definitions of an exception with the same name lead to
the definition of two distinct exceptions that will not be confused by
Caml programs (handlers for the first exception will not handle
the second exception).
Type checking:
a constant exception has type
exn, an exception defined by exception name of typ;;
has type typ -> exn.
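The generative behaviour can be observed with the following sketch in OCaml syntax. The exception names `Empty_queue` and `Bad_index` are hypothetical. Note that in OCaml an exception constructor must be applied to its argument, so the type typ -> exn appears only through a wrapping function:

```ocaml
exception Empty_queue        (* constant exception: Empty_queue : exn *)
exception Bad_index of int   (* Bad_index 3 : exn *)

let raise_first () = raise Empty_queue

exception Empty_queue        (* same name, but a new, distinct exception *)

(* The inner handler refers to the second Empty_queue, so it does not
   catch the first one raised by raise_first. *)
let outcome =
  try
    (try raise_first () with Empty_queue -> "handled")
  with _ -> "not handled by the second definition"

let bad_index : int -> exn = fun i -> Bad_index i
```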
Some exceptions are predefined in Caml to signal some frequent errors or problems:
Failure of string: report an error that occurred
in some user's code. The reason of the failure or the name of the
function where the error occurred is mentioned in the string argument.
Not_found: signal that a searching function has
failed (e.g. assoc).
Exit: used to jump out of loops or functions.
Break: interruption (the user has sent an explicit
interrupt to the program).
Invalid_argument of string: the argument provided to
the function cannot be handled properly.
Sys_error of string: the operating system host
has raised an error (this exception is defined in the sys
library module).
Match_failure of string * int * int: signal a
pattern matching failure in the file mentioned, at the given location
(in characters).
End_of_file: an end of file occurred while reading
some data (this exception is defined in the io library module).
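A handler can list several of these predefined exceptions at once. The sketch below is written in OCaml syntax, where the exceptions used here have the same names; the function `describe` is hypothetical, and `List.assoc`, `failwith` and `invalid_arg` are OCaml standard library functions:

```ocaml
(* Runs f and reports which predefined exception, if any, it raised. *)
let describe f =
  try f (); "no exception" with
  | Not_found -> "Not_found"
  | Failure msg -> "Failure: " ^ msg
  | Invalid_argument msg -> "Invalid_argument: " ^ msg
  | Exit -> "Exit"
  | End_of_file -> "End_of_file"

let lookup_result = describe (fun () -> ignore (List.assoc 1 []))
let failure_result = describe (fun () -> failwith "boom")
let argument_result = describe (fun () -> invalid_arg "bad")
```

Any exception not listed in the handler, such as a user-defined one, is propagated to the caller of describe.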
Contact the author Pierre.Weis@inria.fr