Sunday, April 23, 2006

Qi Programming Basics: Pattern Matching

The Qi language has the ability to create functions that return different results depending on the inputs. In most languages besides Haskell, ML, and Prolog, you must simply specify inputs and then test each input to determine what you want the function to return. This ends up cluttering the function definition with many “if” or “case” statements. This in turn lowers the readability and writeability of the language in question. Advanced functional languages take advantage of a special syntax that allows the definition of the return value based on the inputs to the function.

A second, and more important, use for pattern matching is for binding the formal parameters of the function. In many languages, if you wish to extract different parts of the parameters in a function, you have to define local variables to hold the parts that you are interested in. In Qi, you use the pattern matching operator to do the special assignment for your cases up front. This simplifies the programming and makes the program more readable (and hence maintainable).

Qi provides 2 special operators for the purposes of parameter pattern matching. The -> operator is equivalent to the = operator in Haskell. It specifies that a pattern matched to the left of the arrow will produce the results specified on the right of the arrow, after rewriting the function on the right to use the values matched on the left. The other operator is _ and it basically tells the rewriter to “throw away” the input. Thus the Qi pattern matching system corresponds to what computer scientists call a priority rewrite system.

Now that we have dispensed with some dense technical verbiage, let us look at what all of this actually means to a Qi programmer. We will start with the first meaning of pattern matching, namely matching basic elements with no bound formal parameters (i.e. none of what you would call “variables”). We will take 3 animals, jane, chuck, and rex and make a function called likes_to_eat that lets us know that jane likes pizza, chuck like cereal, and rex likes dog_food.


(define likes_to_eat
jane -> pizza
chuck -> cereal
rex -> dog_food
)

(11-) (likes_to_eat jane)
pizza

(12-) (likes_to_eat rex)
dog_food


Seems simple enough, no? So now lets see what our function has to say about what bob likes to eat.

(13-) (likes_to_eat bob)
error: partial function likes_to_eat;

It produces an error as expected as we have not defined what bob likes in the above function. The error message could perhaps be more useful, but it does point us to the problem. Qi will also then ask if we want to “track” the function. This is Qi’s way of helping us find where we went wrong in our function call. If you say yes, Qi will display a lot of diagnostic messages when it encounters the likes_to_eat function so you can determine what went wrong.

So let us modify our function to return the symbol “unknown” whenever we don’t know the animal that the function was called with. We use the _ operator to tell the system that we don’t care about the input, we just want to return unknown if we reach the pattern.

(define likes_to_eat
jane -> pizza
chuck -> cereal
rex -> dog_food
_ -> unknown
)

Running this version gives us the following output.

(21-) (likes_to_eat bob)
unknown

Note the difference between this version and the following

(define likes_to_eat
_ -> unknown
jane -> pizza
chuck -> cereal
rex -> dog_food
)

(23-) (likes_to_eat jane)
unknown

The pattern matches from top to bottom. Thus the _ operator matches anything input and returns unknown as a result, even though we have specified jane later on in the function. This is only the beginning of how tricky these sorts of rewrite rules can be, but as long as you remain aware of the order of evaluation, you should have relatively few problems with them.

While returning “unknown” is OK, we may want a more useful error message. We now want to modify our function likes_to_eat to display a helpful error message that tells us WHAT caused the function to fail. We also want to stop the computation at this point. Qi provides a helpful function called “error” that stops the program and returns a string. We will make the string say “likes_to_eat has not heard of bob” if we input bob.

To make the program work as we described, we want to return the following function (error "likes_to_eat has not heard of ~A" bob). Of course, you can immediately see that bob is a variable that we will need to input into the error function. This allows us to introduce the concept of formal parameters. In Qi, formal parameters start with an upper case letter. For simplicities sake, let us call the variable which will bind to bob X. Now our result becomes (error "likes_to_eat has not heard of ~A" X) where X is bound to bob. The Qi syntax makes this rule easy to write, just put X on the left of the arrow. We thus create a new definition of likes_to_eat.

(define likes_to_eat
jane -> pizza
chuck -> cereal
rex -> dog_food
X -> (error "likes_to_eat has not heard of ~A" X)
)

(26-) (likes_to_eat bob)
error: likes_to_eat has not heard of bob

Now let us take a look at a more complicate pattern matching question. In Qi, we can use any of the special constructor forms on the left hand side of the arrow operator. These forms are “list”, “cons”, and “@p”. We can also use the “|” operator inside a list to extract the cons form. I recommend reading the Qi Language book to better understand these forms, but a simple exercise should suffice to show any programmer versed in list processing how they operate.

(43-) (list a b c)
[a b c]

(44-) [ a b c ]
[a b c]

(45-) (cons a (cons b (cons c [])))
[a b c]

(46-) (@p 1 2)
(@p 1 2)

(47-) [ a b | c]
[a b . c]

(48-) [ a | b c]
error: Improper use of |


(48-) [ a | [b c] ]
[a b c]

Let us create an analog of the Haskell take function that takes N arguments from a list and returns the elements extracted. Defining it recursively, we find that if N is 0, then return [] or if the list is [] for any N, return []. Finally, return the application of take to itself by making a list of the head elements and the tail elements being the results of take N – 1.

(define take-alpha
0 _ -> []
_ [] -> []
N [X | XS] -> [ X | (take-alpha (- N 1) XS) ] )

For the first 2 patterns, we can easily see what the patterns match to. For the final pattern, we find that we get a number N and something called [ X | XS ]. This form means that X is the single element head of the list and XS is the remainder of the list. Each call to take-alpha reduces the number of elements in the list by one and reduces the N by one.

(50-) (take-alpha 2 [ a b c])
[a b]

The first 2 patterns work slightly differently than Haskell programmers may be used to. To show this, let us define a function take-beta that reverses the first 2 rules.

(define take-beta
_ [] -> []
0 _ -> []
N [X | XS] -> [ X | (take-beta (- N 1) XS) ] )

(51-) (take-alpha 0 bad_thing)
[]

(52-) (take-beta 0 bad_thing)
[]

(53-) (take-alpha bad_thing [])
[]

(54-) (take-beta bad_thing [])
[]

This differs from Haskell in that Haskell would return the bottom operator for the middle 2 elements because it could not match bad_thing to the list or the number respectively. The typesafe version of Qi would throw an error at compile time if we attempted such a call. In that case, Qi provides a default symbol called #\Escape that works along the lines of the Haskell bottom operator (with significant differences not described here). Regardless, we can see that the notion of “more defined” that we find in Haskell does not apply to the general version of Qi.

You can make the left hand side of the equation as complicated as you wish. We will create a list that has alternating types, the first being a number and the second being a tuple created by the @p operator. For example, we would like the function to make [ 1 (@p 3 4) 5 (@p 8 5)] add up to 26. We want it to be able to do it for any valid list length. The definition follows.

(define complicated-list-add
[] -> 0
[ X | [ (@p Y Z) | XS]] -> (+ (+ X (+ Y Z)) (complicated-list-add XS))
)

Testing this function gives

(58-) (complicated-list-add [ 1 (@p 3 4) ])
8

(59-) (complicated-list-add [ 1 (@p 3 4) 5 (@p 8 5)])
26

If we try to simply input 3 numbers we get the following error.

(60-) (complicated-list-add [ 1 2 3])
error: partial function complicated-list-add;

The pattern matcher could not match 2 to (@p X Y) so it returns the same error we saw when we tried to call likes_to_eat with bob. It doesn’t know how to apply it and it gives us an error. If we wished, we could add error handling in the same manner as likes_to_eat.

One final point remains on the basics of pattern matching in Qi. What happens if we have a variable that we didn’t specify on the left hand side of the arrow? The rewrite rule described above wouldn’t know how to decipher it as we didn’t explicitly define it. Qi has chosen the route of warning the user but defining the function as opposed to generating an error. We can see the behavior below.

(define bad-rewrite
X -> Y)

======> Warning:
the following variables are free in bad-rewrite: Y;

bad-rewrite

(62-) (bad-rewrite 10)
Y

If you are a Lisp programmer, you may worry about variable capture due to this rewrite rule. Fortunately, Qi handles this case for us via the rewrite system and we don’t have to worry about it in this case. This also means we can return proper names from functions in the relatively few cases where we want to do this. But expressive power always means we need to take care of how we code things, thus the reason for the warning.

(63-) ((/. Y (bad-rewrite Y)) 12)
Y

Whew. Congrats to all that made it to the bottom of this pedantic look at the Qi pattern matching system. You should note that this only covers rewrite rules without guards (the Qi “where” special form). If you delve further into Qi, you will find that the combination of the arrow and where can allow native backtracking of the functions that you call. Yes, I said native backtracking of function calls! More on that later.

Enjoy your Qi!

PS: Mark Tarver, the main implementer of Qi, had this to say about the warn vs error issue with Qi variables.

The main motivation for this is to allow users to mix
Lisp code into Qi code. Since Lisp reverts to the
uppercase when case sensitivity is invoked (Qi is
case sensitive), if you want to mix Lisp into Qi you have to violate
the free variable condition.

(0-) (define implode
Chars -> (COERCE Chars STRING))
======> Warning:
the following variables are free in implode: COERCE; STRING;
implode

(1-) (implode [#\a #\b #\c])
"abc"

You see the idea, that is why you get a warning and
not an error. Being able to mix the two is a big plus.

However if you really want to block off functions like
bad-rewrite then (strong-warning +) will do it and you
get an error and not a warning.

(3-) (strong-warning +)
true

(4-) !0

0.
(define implode
Chars -> (COERCE Chars STRING))
error: the following variables are free in implode: COERCE; STRING;

Mark

0 Comments:

Post a Comment