The Shakespeare Conference: SHK 17.1023 Monday, 20 November 2006
From: Martin Mueller
Date: Saturday, 18 Nov 2006 16:42:08 -0600
Subject: 17.1016 New Shakespeare Search Engine Launches
Comment: Re: SHK 17.1016 New Shakespeare Search Engine Launches
I read with interest the announcement about the new Shakespeare tool
called Shakespeare Searched. Very nice it is. Somewhere, I believe, in
Gilbert & Sullivan somebody says that in this world and age you have to
blow your own trumpet, because nobody else will. So may I suggest that for
some students of Shakespeare, and certainly for students in AP courses or
colleges, Northwestern's WordHoard is a superior tool
(http://wordhoard.northwestern.edu), and like Shakespeare Searched, it is
free. It is superior for several reasons:
1. It is based on a text that, while not perfect, is much better than the
Moby Shakespeare, and--at least in its sequence of words--differs only
trivially from Arden, Bevington, or Riverside
2. It is much more explicit about what it lets you do
3. It lets you do a lot more
Shakespeare Searched apparently uses collocation statistics, and it does
so in an ingenious way. If you look for 'blood' in Macbeth, you get a
standard concordance, but you also get another list of suggested words or
'topics'. You are not told why these words are topics, but if you look a
little more closely you see some algorithm at work. The algorithm
identifies words that by some criterion occur more often around blood than
you would expect. How much more often? You're not told.
In WordHoard, you can look for collocates of 'blood' in Macbeth, and you
can define the collocates quite precisely by specifying a distance of
words before and after. And when you see the results, you see the
likelihoods associated with them. A little more work, to be sure, but a lot
more transparent. You see the evidence for the proposition that in this
context word X is a disproportionately frequent companion of word Y
(Remember J. R. Firth: you shall know a word by the company it keeps).
Frequency and salience are not the same. There can be frequency without
salience and salience without frequency. Having the numbers helps make
that point.
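
To make the idea of a collocate "within a distance of words before and after" concrete, here is a minimal sketch in Python. It is not WordHoard's code; the function name, its parameters, and the toy word list are all invented for illustration, and it only does the counting step, not the statistics.

```python
from collections import Counter

def collocates(tokens, node, before=2, after=2):
    """Count the words found within `before` words to the left and
    `after` words to the right of every occurrence of `node`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        lo = max(0, i - before)                 # clamp window at text start
        hi = min(len(tokens), i + after + 1)    # clamp window at text end
        for j in range(lo, hi):
            if j != i:                          # skip the node word itself
                counts[tokens[j]] += 1
    return counts

# Toy word list (hypothetical, not taken from any edition of the plays).
toy = "red blood and cold blood ran red".split()
result = collocates(toy, "blood", before=1, after=1)
```

Raw counts like these show frequency only. To say that a companion word is disproportionately frequent, each count still has to be compared with what the word's overall frequency would lead you to expect.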
There is a lot more you can do with WordHoard than with Shakespeare
Searched. Take 'blood'. Is it a disproportionately common word in Macbeth
in the context of the other tragedies? Actually it isn't. Its relative
frequency is twice as high, but by a statistical measure that is not
particularly impressive. 'Bloody' ranks higher. We may have salience
without frequency here.
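
Why can a doubled relative frequency be unimpressive? A sketch, assuming the measure in question is a log-likelihood ratio (G2) over a two-by-two table; the message above does not name the statistic, and the counts below are invented, not the real figures for 'blood'.

```python
import math

def log_likelihood(a, b, c, d):
    """G2 for a word seen `a` times in an analysis text of `c` words
    and `b` times in a reference text of `d` words."""
    total = c + d
    word_total = a + b
    other_total = total - word_total
    g2 = 0.0
    # Rows: analysis text, reference text. Columns: the word, all other words.
    for obs_word, size in ((a, c), (b, d)):
        cells = ((obs_word, word_total), (size - obs_word, other_total))
        for obs, col_total in cells:
            expected = size * col_total / total
            if obs > 0:                         # 0 * log(0) is taken as 0
                g2 += obs * math.log(obs / expected)
    return 2 * g2

# Invented counts: in both cases the rate in the analysis text is double
# the rate in the reference text (0.4% against 0.2%).
small = log_likelihood(4, 20, 1_000, 10_000)
large = log_likelihood(40, 200, 10_000, 100_000)
```

With the same two-to-one ratio, `small` falls below 3.84 (the conventional 5% threshold for one degree of freedom) and `large` lies well above it. The ratio alone tells you little; the amount of evidence behind it matters.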
What are the words that are disproportionately frequent in Macbeth when
compared with the other tragedies? An odd but telling list (in descending
order): that, the, knock, tyrant, hail, we, king, fear, wood, sleep,
trouble. Don't ask me about 'that' and 'the' (although there may be
interesting answers), but the remaining words certainly tell a story.
The most powerful feature of WordHoard, however, is almost certainly its
extraordinarily flexible concordance tool. You can look up a word, and if
it is a common word, you can group and sort its occurrences by different
criteria. Looking up 'blood' you see at once that the top eight works
include six histories, Macbeth and Julius Caesar. From which you gather
that in the histories 'blood' is probably both dynastic and gory, but that
in Julius Caesar and Macbeth it is mainly gory. And that's an interesting
pointer to the deep relationship between those two plays. If you look for
'sad' and group results by play and scene, you see that eight of its nine
occurrences in The Merchant of Venice are in 1.1. Very telling.
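
The grouping itself is simple to picture. A minimal sketch, assuming each concordance hit is stored as a (work, scene, line) record; the record layout, the function name, and the data are hypothetical and say nothing about how WordHoard stores its text.

```python
from collections import Counter

# Hypothetical hit records for one search word: (work, act.scene, line).
# Toy data only, not real counts from any play.
hits = [
    ("Play A", "1.1", 1),
    ("Play A", "1.1", 6),
    ("Play A", "3.2", 40),
    ("Play B", "2.1", 12),
]

def group_hits(hits, key):
    """Group hits by `key` and return (group, count) pairs, largest first."""
    return Counter(key(h) for h in hits).most_common()

by_work = group_hits(hits, key=lambda h: h[0])
by_scene = group_hits(hits, key=lambda h: (h[0], h[1]))
```

Changing the key function changes the view: grouping by work shows where a word concentrates across the corpus, and grouping by work and scene shows where it clusters within a play.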
WordHoard expects a little more work from users. But it does a lot more
for them, and wherever it uses statistical procedures to foreground
features, it scrupulously gives you the evidence for why it does what it
does.
_______________________________________________________________
S H A K S P E R: The Global Shakespeare Discussion List
Hardy M. Cook
The S H A K S P E R Web Site <http://www.shaksper.net>
DISCLAIMER: Although SHAKSPER is a moderated discussion list, the opinions
expressed on it are the sole property of the poster, and the editor
assumes no responsibility for them.