Declarative Languages
Lecture #8
Purpose: Equality, hash-tables and blocks.
8.1 Introduction: equality
We encountered earlier various predicates for comparing specific types
of lisp object:
-
= for numbers (compares any number of objects)
-
char= for characters (compares any number of objects)
-
string= and string-equal for strings (compares precisely
two objects)
and one predicate
-
equal (compares precisely two objects)
for comparing general lisp objects. I have assured you that if two objects
print the same then they are equal. Now let's get closer to the
truth, by introducing two further general equality predicates (each taking
precisely two arguments): the functions eq and
eql.
8.2 Equality and eq
Two objects are eq only if they are in fact the same object.
Quite how this works depends on their types. Obviously, if the objects
have different types they cannot be the same object and so they cannot
be eq.
-
If two symbols print the same then they are defined to be identical
and hence they are guaranteed to be
eq:
(mapcar 'eq '(foo t nil :wibble) '(foo t nil :wibble)) =>
(t t t t)
-
If two integers have the same value and are fixnum then
they are
eq. To be fixnum, an integer has to be between
the constants most-negative-fixnum and most-positive-fixnum
inclusive. The values of these two constants are implementation-dependent.
-
In LispWorks for Windows, these values are -2^23 and +2^23-1.
Most implementations these days go higher than that in the fixnum range
(typically to 2^28 or 2^29).
-
In theory, you cannot guarantee that typing in the same character
twice will result in two eq objects: e.g. (eq #\a #\a)
may turn out to be false. In practice however no implementation worth its
salt will harass you like this and for LispWorks the above expression is
true.
-
With any other objects, looking the same is no guarantee of eq.
Typing in two identical-looking bignums (a bignum is an integer which
is not a fixnum), or floats, or lists, or strings, or vectors, will result
in numbers which are =, or in sequences which have the same members.
Each time, however, lisp has generated a different structure with a different
address in memory, and these will not be eq.
(eq 1.0 1.0) => nil
(eq 8388608 8388608) => nil ; but may be t in other implementations
(eq '(a b c) '(a b c)) => nil
(eq "this takes some thought" "this takes some thought")
=> nil
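The fixnum boundary mentioned above can be inspected directly in whichever implementation you are using. The following is only a sketch: the values of the first two forms differ between implementations, so no results are shown for them.

```lisp
;; Both constants are implementation-dependent, so no values are shown.
(list most-negative-fixnum most-positive-fixnum)

;; Number of bits needed to represent the largest fixnum.
(integer-length most-positive-fixnum)

;; The largest fixnum is a fixnum; one more than that is a bignum.
(typep most-positive-fixnum 'fixnum)        ; => t
(typep (1+ most-positive-fixnum) 'fixnum)   ; => nil
(typep (1+ most-positive-fixnum) 'bignum)   ; => t
```

Comparing the result of integer-length with the figures quoted above tells you how generous your implementation is.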
If you have two pointers to the same thing then they will be eq.
For example, (eq something something) is true no matter what value
something
has. The following function too will always return true (for any argument):
(defun always-true (thing)
  (let* ((my-list (list thing)))
    (eq thing (first my-list))))
Note that if a function generates new objects, then these cannot be eq
to each other:
CL-USER 21 > (let* ((things nil))
               (dotimes (i 2)
                 (push '(t) things)) ; pushing the same object each time
               (eq (first things)
                   (second things)))
T
CL-USER 22 > (let* ((things nil))
               (dotimes (i 2)
                 ;; pushing a new object each time
                 (push (list t) things))
               (eq (first things)
                   (second things)))
NIL
CL-USER 23 >
This includes all functions (e.g. copy-list) which are defined
as returning a fresh copy of some object, for example:
(let* ((foo '(1 2 3 4))) (equal foo (butlast foo 0))) => t
(let* ((foo '(1 2 3 4))) (eq foo (butlast foo 0))) => nil
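The same effect can be seen with copy-list itself. In the sketch below the function name compare-with-copy is made up for illustration. Note that a fresh copy of a list has new cons cells, but the members of the list are shared between original and copy.

```lisp
(defun compare-with-copy (original)
  "Compare the list ORIGINAL with a fresh copy of itself, in three ways."
  (let* ((copy (copy-list original)))
    (list (eq original copy)                    ; different cons cells
          (equal original copy)                 ; same contents
          (eq (first original) (first copy))))) ; the members are shared

(compare-with-copy (list 'a 'b 'c))   ; => (nil t t)
```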
8.3 Equality and eql
Two objects are eql if
-
they are eq or
-
they are both numbers of the same type and the same value or
-
they are both characters that represent the same character (as noted
above, this distinction is not worth bothering with and you can in practice
assume that two identical looking characters are eql simply because
they are eq).
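A few examples may help to fix the three conditions above. This is a sketch rather than an exhaustive list; copy-seq is used in the last line only to guarantee a string which is not eq to the literal.

```lisp
(eql 'foo 'foo)                   ; => t    eq, therefore eql
(eql 3 3)                         ; => t    same type, same value
(eql 3 3.0)                       ; => nil  integer and float: different types
(= 3 3.0)                         ; => t    = compares numerical value only
(eql 3.0 3.0)                     ; => t    although eq may return nil here
(eql (expt 2 100) (expt 2 100))   ; => t    bignums need not be eq, but are eql
(eql #\a #\a)                     ; => t
(eql #\a #\A)                     ; => nil  different characters
(eql "foo" (copy-seq "foo"))      ; => nil  neither number nor character, and not eq
```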
A large number of lisp functions use a predicate for comparing objects;
this tends to be specified as an optional argument and the default value
is typically eql (see section 17.2.1 of the HyperSpec). As an
example, consider the function position which takes an object
and a sequence, and returns the first index into the sequence at which
the object was found (or nil if it was not found):
(position 'wibble '(foo bar wibble baz wombat)) => 2
The objects are compared by eql, unless another predicate is handed
in as the value to the :test argument:
(position "wibble" '("foo" "bar" "wibble" "baz" "wombat")) => nil
(position "wibble" '("foo" "bar" "wibble" "baz" "wombat") :test 'string=) => 2
Related to position is the function position-if which
takes a predicate (of one argument) and a sequence:
(position-if (lambda (x) (and (numberp x) (plusp x) (evenp x)))
             '(digits of pi are 3 1 4 1 5 9))
=> 6
and related to both of these are find and find-if, which
return the item which was found rather than its position.
(let ((bits '("foo" "bar" "wibble" "baz" "wombat")))
  (eq (third bits)
      "wibble")) => nil
(let ((bits '("foo" "bar" "wibble" "baz" "wombat")))
  (eq (third bits)
      (find "wibble" bits :test 'string=)))
=> t
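For comparison with the position examples, here is a brief sketch of find and find-if on similar data. Both return nil when nothing is found.

```lisp
(find 'wibble '(foo bar wibble baz))               ; => wibble
(find 'wombat '(foo bar wibble baz))               ; => nil
(find "baz" '("foo" "bar" "baz"))                  ; => nil  (compared by eql)
(find "baz" '("foo" "bar" "baz") :test 'string=)   ; => "baz"
(find-if 'numberp '(digits of pi are 3 1 4))       ; => 3
```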
8.4 Revisiting equal
Two objects are equal if
-
they are eql or
-
they are strings, of the same length, which match (by eql)
character for character or
-
they are both of type cons and
-
the two cars are equal and
-
the two cdrs are equal
-
(there are a couple of further conditions under which this predicate is
true but we haven't met the objects they apply to yet)
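The conditions listed above can be tried out as follows. This is a sketch covering only the cases described so far.

```lisp
(equal 'foo 'foo)                       ; => t    eq, hence eql, hence equal
(equal 2.5 2.5)                         ; => t    eql numbers
(equal 2 2.0)                           ; => nil  not eql, and neither strings nor conses
(equal "wibble" (copy-seq "wibble"))    ; => t    same length, same characters
(equal "wibble" "Wibble")               ; => nil  #\w and #\W are not eql
(equal (list 1 (list 2 "three"))
       (list 1 (list 2 "three")))       ; => t    cars and cdrs are equal all the way down
```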
8.5 Hash-tables (an excuse for knowing about eql)
We know about the following general (in the sense that they can contain
any lisp objects) data structures:
-
cons: used for building lists and trees; good for flexibility
as you can add or remove cells very easily; slow access over long sequence
-
vector: one-dimensional array; inflexible but fast access no matter
how large the sequence
-
structure: user-defined type; behaves like a vector whose fields
are named rather than numbered
We now introduce the hash-table. This is a data structure
whose indices may be general lisp objects, which offers flexibility similar
to lists and which delivers lookup times intermediate between lists and
vectors.
| Name | Index by | Flexibility | Data ordered into sequence? | Speed | Use |
| cons | first and rest | good | yes | slow access over long sequence | building lists and binary trees |
| vector | numerical index | poor | yes | fast, independent of length of sequence | random lookup and rapid traversal of large data sets |
| structure | field name (not available at run-time) | poor | no | fast, independent of number of fields | user-defined types |
| hash-table | any lisp object | good | no | intermediate | dictionaries, general object maps |
If we weren't bothered about lookup times, we could implement something
like this with lists:
CL-USER 11 > (defun get-from-list (index list)
               (dolist (list-member list)
                 (let* ((maybe-index (first list-member))
                        (maybe-value (second list-member)))
                   (if (equal index maybe-index)
                       (return maybe-value)))))
GET-FROM-LIST
CL-USER 12 > (defparameter phone-numbers
               '(("Nick" 2330) ("Martin R" 2356) ("Bob" 2342)))
PHONE-NUMBERS
CL-USER 13 > (get-from-list "Bob" phone-numbers)
2342
CL-USER 14 > (get-from-list "Ethel the Aardvark" phone-numbers)
NIL
CL-USER 15 >
(and with more, slightly nastier code to add, reset and remove the phone
numbers). Using a hash-table hides the above nastiness, and lookup remains
reasonably fast even when the table gets large.
To make a hash-table, call the function make-hash-table. To
look values up in the table use the function gethash (setfable).
To remove a single entry altogether use remhash, and to empty
a hash-table completely call clrhash. For example:
CL-USER 9 > (defparameter *table* (make-hash-table :test 'equal))
*TABLE*
CL-USER 10 > (dolist (pair '(("Nick" 2330) ("Martin R" 2356) ("Bob" 2342)))
               (let* ((key (first pair))
                      (value (second pair)))
                 (setf (gethash key *table*) value)))
NIL
CL-USER 11 > (gethash "Bob" *table*)
2342
T
CL-USER 12 > (gethash "Ethel the Aardvark" *table*)
NIL
NIL
CL-USER 13 >
Notes:
-
make-hash-table takes a keyword argument :test which
determines how keys (i.e. the indices) will be compared. You do
not have an open choice of any predicate here: you are limited to eq,
eql, equal and equalp (look this last one up if you feel a
burning urge to do so). The default test is, as ever, eql.
-
gethash is like nth (and unlike aref): it takes
the key as first argument and the table comes second.
-
gethash returns two values (like read-line did): the
second value tells you whether anything was found or not. This allows you
to distinguish between finding nil and not finding anything (in
both cases, the primary return value is nil).
-
(setf gethash) can be used both to add new values to the table
(as in the above example) and to reset existing values.
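The notes above can be checked with a short experiment. This is a sketch only: the variable names are made up for illustration, copy-seq is used to guarantee a key which is not eq to the one stored, and hash-table-count (a standard function not otherwise covered here) reports the number of entries.

```lisp
(defparameter *eql-table* (make-hash-table))                ; default test: eql
(defparameter *equal-table* (make-hash-table :test 'equal))

(setf (gethash "Bob" *eql-table*) 2342)
(setf (gethash "Bob" *equal-table*) 2342)

;; A string key is only found again if the test can match it.
(gethash (copy-seq "Bob") *eql-table*)     ; => nil, nil
(gethash (copy-seq "Bob") *equal-table*)   ; => 2342, t

;; The second value distinguishes a stored nil from a missing key.
(setf (gethash "Ethel" *equal-table*) nil)
(gethash "Ethel" *equal-table*)            ; => nil, t
(gethash "Nobody" *equal-table*)           ; => nil, nil

;; Resetting, removing and emptying.
(setf (gethash "Bob" *equal-table*) 9999)  ; resets the existing value
(remhash "Bob" *equal-table*)              ; => t (there was an entry to remove)
(hash-table-count *equal-table*)           ; => 1 ("Ethel" remains)
(clrhash *equal-table*)
(hash-table-count *equal-table*)           ; => 0
```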
Once you've built a hash-table, a useful function for traversing it is
maphash,
which takes a function and hash-table as arguments. The function is invoked
repeatedly for each entry in the table, with two arguments (a key and the
corresponding value). For example:
CL-USER 13 > (maphash (lambda (name number)
                        (format t "~&~a is on extension ~a"
                                name number))
                      *table*)
Nick is on extension 2330
Martin R is on extension 2356
Bob is on extension 2342
NIL
CL-USER 14 >
Note:
-
the order in which the entries are processed by maphash is implementation-dependent
and may not even be the same on two successive calls.
-
maphash always returns nil.
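Because maphash always returns nil, any results have to be accumulated by the function you pass in. A minimal sketch, in which the names table-keys and *extensions* are made up for illustration; the declaration merely tells the compiler that value is deliberately unused:

```lisp
(defun table-keys (table)
  "Return a list of the keys in TABLE, in no particular order."
  (let* ((keys nil))
    (maphash (lambda (key value)
               (declare (ignore value))
               (push key keys))
             table)
    keys))

(defparameter *extensions* (make-hash-table :test 'equal))
(setf (gethash "Nick" *extensions*) 2330
      (gethash "Bob" *extensions*) 2342)

;; Sorting makes the result independent of the traversal order.
(sort (table-keys *extensions*) 'string<)   ; => ("Bob" "Nick")
```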
8.6 Blocks
We have met the macro return which allows "premature" exit
from the various looping macros (dotimes, dolist, loop etc). A generalization
of this is the special operator return-from which in particular
allows early exit from any (named) function.
CL-USER 14 > (defun one-value (table)
               (maphash (lambda (key value)
                          (declare (ignore key))
                          (return-from one-value
                            value))
                        table))
ONE-VALUE
CL-USER 15 > (one-value *table*)
2330
CL-USER 16 >
The above (admittedly somewhat pointless) function returns one value extracted
from the hash-table supplied as its argument.
-
When we're within the body of a (named) function, we are said to also be
inside a block with the same name. So while we are in the body of
one-value,
we are inside a block named one-value.
-
Within a block, you can leave at any time with the special operator return-from.
It takes two arguments: the first (not evaluated) is the name of the block
you want to leave, the second (evaluated; optional and defaulting to nil)
is the value to return.
-
The looping macros (dotimes, dolist, loop etc) establish a block
named nil, so you could exit them by calling (return-from
nil). The macro return is shorthand for this.
-
The above examples of blocks (established by defun or by the looping
macros) are said to be implicit, because they are created behind
your back. You can establish blocks of your own, at any point in your code,
using the special operator block (look it up in the HyperSpec).
-
The special operator return-from is said to be lexical
in scope - it only works within the textual confines of the block it refers
to.
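To illustrate the points above, here is a sketch of an explicit block. The function name first-negative and the block name scan are made up for illustration.

```lisp
(defun first-negative (numbers)
  "Return the first negative member of NUMBERS, or :none if there isn't one."
  (block scan
    (dolist (n numbers)
      (when (minusp n)
        (return-from scan n)))   ; leaves the block named scan at once
    :none))                      ; reached only if the loop runs to completion

(first-negative '(3 1 -4 1 -5))   ; => -4
(first-negative '(3 1 4))         ; => :none

;; Inside a looping macro, (return-from nil n) does the same as (return n).
(dolist (n '(1 2 3))
  (when (evenp n)
    (return-from nil n)))         ; => 2
```

Because the block here is explicit, the value :none belongs to the block rather than to the loop, and return-from scan skips it.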
Also in the above code you should be aware of the following:
-
In the lambda form in one-value, two arguments have to be supplied
(because that's how maphash works) but only one is actually wanted
(or used). The (declare (ignore ...)) form is included immediately
after the parameter list to prevent a compiler warning along the lines of
;;;*** Warning in ONE-VALUE: KEY is bound but not referenced
Declarations can appear after function parameter lists, after the bindings
in let*, and in many other macros and special operators - see
figure 3-23 in the HyperSpec for the full list.
8.7 Practical session / Suggested activity
Convert last week's work to store student records in a hash-table (accessible
by name) rather than in a list. Write functions to add a new student, to
find the record of a student with a given name, and to delete a student.
As before, write functions to name the three students who have the highest
marks, or to spot which lecturer fails most of their students.
Use return-from in a function to return the SID of any student
who hasn't attempted any modules at all.
Comment on which data structure was "best". [Define "best".]
8.8 Further reading & exercises
-
Look up the definitions of the functions eq, eql and
equal
in the HyperSpec. Try out many examples. Make really sure you appreciate
the difference between them.
-
Either: justify carefully the statement that "if two objects print the
same then they are
equal." or give a simple counter-example. Consider
vectors.
-
Define your own version of equal (call it my-equal) in
terms of eql; define eql in terms of eq, =
and char=.
-
Actually, position takes more keyword arguments than the one (:test)
given above. Look them up and try them out.
-
To see multiple-values taken to a mild excess, call the function (get-decoded-time).
If you can't figure out what the result means, try calling it once or twice
more (and then, if necessary, either ask me or look it up). Read section
5.5 of Graham and implement a function which prints today's date, or the
time, or both. (You might attempt to emulate the format of UNIX date(1),
eg "Fri Sep 15 14:12:41 BST 2000".)
-
Write a function of two arguments which simply returns its first argument.
Defining your function should not cause any warnings to be signalled.
-
Redefine function position-three (section 5.4) to use return-from
rather than return. Do this twice: first returning from a block
named nil, then returning instead from a block named position-three.
Does either of these new functions give you anything which the original
didn't?
-
Define a function which doubles every member of a list. If any member of
that list is not a number, simply return nil from your function.
-
Suppose by some ghastly misunderstanding there was a lisp where the implementers
had forgotten to include the type vector or any operations based
on it.
See if you can use hash-tables to plug the gap, implementing enough
of the basics (my-make-array, my-length, my-aref
and my-setf-aref at the very least) to prove that the concept
works. Make sure that my-aref checks that the index is within
bounds (you can store the upper-bound in the hash-table).
-
Jon L White (jonl@ptolemy.arc.nasa.gov) writes:
>
> I showed up at MIT around the summer of 1966 (as a cross-registered
> graduate student from Harvard) and the FOO, of FOO and BAR, was generally
> recognized then as a variant, and a softening, of the oft-used phrases
> from the American Military organizations "Situation Normal: All F***ed Up",
> for "Snafu" and "F***ed Up Beyond All Recognition" for "Fubar"
>
> No one has---to my knowledge---verified this as an accurate origin; merely
> that the story, as told, has it roots probably prior to the Vietnam War,
> and maybe even going back before the second World War. If it is _true_,
> then one must wonder where the ancient "Phooey" came from? Maybe the
> above explanations are merely the first of the rounds of urban legends.
>
> So. The interesting question to ask your students is this:
>
> Let Foo and Bar be the first two metavariables; now, name the third.
>
> -- JonL --
Copyright (C) Nick Levine 1999. All rights reserved.
Last modified 2000-09-14