New Shakespeare Search Engine Launches
The Shakespeare Conference: SHK 17.1023  Monday, 20 November 2006

From: 		Martin Mueller
Date: 		Saturday, 18 Nov 2006 16:42:08 -0600
Subject: 17.1016 New Shakespeare Search Engine Launches
Comment: 	Re: SHK 17.1016 New Shakespeare Search Engine Launches

I read with interest the announcement about the new Shakespeare tool 
called Shakespeare Searched. Very nice it is. Somewhere, I believe, in 
Gilbert & Sullivan somebody says that in this world and age you have to 
blow your own trumpet, because nobody else will. So may I suggest that for 
some students of Shakespeare, and certainly for students in AP courses or 
colleges, Northwestern's WordHoard is a superior tool 
(http://wordhoard.northwestern.edu), and like Shakespeare Searched, it is 
free.  It is superior for several reasons:

1. It is based on a text that, while not perfect, is much better than the 
Moby Shakespeare, and--at least in its sequence of words--differs only 
trivially from Arden, Bevington, or Riverside.
2. It is much more explicit about what it lets you do.
3. It lets you do a lot more.

Shakespeare Searched apparently uses collocation statistics, and it does 
so in an ingenious way. If you look for 'blood' in Macbeth, you get a 
standard concordance, but you also get another list of suggested words or 
'topics'. You are not told why these words are topics, but if you look a 
little more closely you see some algorithm at work. The algorithm 
identifies words that by some criterion occur more often around blood than 
you would expect. How much more often?  You're not told.

In WordHoard, you can look for collocates of 'blood' in Macbeth, and you 
can define the collocates quite precisely by specifying a distance of 
words before and after. And when you see the results you see the 
likelihoods associated with them. A little more work, to be sure, but a lot 
more transparent. You see the evidence for the proposition that in this 
context word X is a disproportionately frequent companion of word Y 
(remember J. R. Firth: you shall know a word by the company it keeps).
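
For readers who want to see the idea in miniature, here is a rough sketch of windowed collocation scoring. It is not WordHoard's code. I assume a two-term log-likelihood ratio (G2) of the kind commonly used in corpus work, and the names collocate_scores, g2, before, and after are invented for this illustration.

```python
import math
from collections import Counter

def g2(a, b, c, d):
    """Two-term log-likelihood ratio (an assumed statistic, not WordHoard's).
    a: count of the word in sample 1, which has c tokens
    b: count of the word in sample 2, which has d tokens"""
    e1 = c * (a + b) / (c + d)  # expected count in sample 1
    e2 = d * (a + b) / (c + d)  # expected count in sample 2
    total = 0.0
    for observed, expected in ((a, e1), (b, e2)):
        if observed > 0:
            total += observed * math.log(observed / expected)
    return 2 * total

def collocate_scores(tokens, node, before=2, after=2):
    """Score words found within `before`/`after` positions of `node`.
    Returns {word: (count near node, count elsewhere, G2)}."""
    near = set()  # positions inside any window, each counted once
    for i, tok in enumerate(tokens):
        if tok == node:
            lo = max(0, i - before)
            hi = min(len(tokens), i + after + 1)
            near.update(j for j in range(lo, hi) if tokens[j] != node)
    inside = Counter(tokens[j] for j in near)
    overall = Counter(tokens)
    c = len(near)            # tokens near the node word
    d = len(tokens) - c      # all remaining tokens
    scores = {}
    for word, a in inside.items():
        b = overall[word] - a
        scores[word] = (a, b, g2(a, b, c, d))
    return scores
```

Calling collocate_scores(tokens, "blood", before=3, after=3) on a tokenized play would return, for each neighbour, the raw counts alongside the score, which is the kind of evidence I mean. Note that G2 is large for under-represented words too, so one would also compare a/c with b/d before calling a word a companion.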

Frequency and salience are not the same. There can be frequency without 
salience and salience without frequency. Having the numbers helps make 
that point.

There is a lot more you can do with WordHoard than with Shakespeare 
Searched. Take 'blood'. Is it a disproportionately common word in Macbeth 
in the context of the other tragedies? Actually it isn't.  Its relative 
frequency is twice as high, but by statistical measures that is not 
particularly impressive. 'Bloody' ranks higher. We may have salience 
without frequency here.

What are the words that are disproportionately frequent in Macbeth when 
compared with the other tragedies? An odd but telling list (in descending 
order): that, the, knock, tyrant, hail, we, king, fear, wood, sleep, 
trouble. Don't ask me about 'that' and 'the' (although there may be 
interesting answers), but the remaining words certainly tell a story.
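
The two comparisons above (is one word over-represented, and which words are most over-represented) can be sketched along the following lines. Again this is an illustration under assumptions, not WordHoard's implementation: I assume the same log-likelihood ratio (G2), and the function names and any counts you feed in are your own.

```python
import math
from collections import Counter

def g2(a, b, c, d):
    """Two-term log-likelihood ratio (an assumed statistic).
    a: count in the target text of c tokens
    b: count in the reference texts of d tokens"""
    e1 = c * (a + b) / (c + d)
    e2 = d * (a + b) / (c + d)
    total = 0.0
    for observed, expected in ((a, e1), (b, e2)):
        if observed > 0:
            total += observed * math.log(observed / expected)
    return 2 * total

def keyness(target, reference):
    """Rank words whose relative frequency is higher in `target`
    than in `reference` (both are non-empty Counters of word counts)."""
    c = sum(target.values())
    d = sum(reference.values())
    rows = []
    for word, a in target.items():
        b = reference.get(word, 0)
        if a / c > b / d:  # keep only over-represented words
            rows.append((word, g2(a, b, c, d)))
    return sorted(rows, key=lambda row: row[1], reverse=True)
```

The score grows with the raw counts as well as with the ratio: a word that is twice as frequent in relative terms but occurs only a handful of times gets a modest score, while the same doubling over ten times as many occurrences gets a score ten times larger. That is why a doubled relative frequency need not be impressive.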

The most powerful feature of WordHoard, however, is almost certainly its 
extraordinarily flexible concordance tool. You can look up a word, and if 
it is a common word, you can group and sort its occurrences by different 
criteria. Looking up 'blood' you see at once that the top eight works 
include six histories, Macbeth and Julius Caesar. From which you gather 
that in the histories 'blood' is probably both dynastic and gory, but that 
in Julius Caesar and Macbeth it is mainly gory. And that's an interesting 
pointer to the deep relationship between those two plays. If you look for 
'sad' and group results by play and scene, you see that eight of its nine 
occurrences in The Merchant of Venice are in 1.1. Very telling.
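
Grouping concordance hits is simple to sketch. The record layout below (work, act, scene) and the name group_hits are assumptions made for this example and say nothing about how WordHoard stores its data.

```python
from collections import Counter

def group_hits(hits, key):
    """Group concordance hits by `key` and sort groups by size, largest first.
    `hits` is a list of (work, act, scene) tuples, a hypothetical layout."""
    return Counter(key(hit) for hit in hits).most_common()

# Placeholder records, not real counts from any play.
hits = [
    ("Play A", 1, 1),
    ("Play A", 1, 1),
    ("Play A", 3, 2),
    ("Play B", 2, 1),
]

by_work = group_hits(hits, key=lambda h: h[0])
by_scene = group_hits(hits, key=lambda h: (h[0], h[1], h[2]))
```

Grouping by work shows where a word clusters across the canon; grouping by work and scene shows a pile-up like that of 'sad' in a single scene.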

WordHoard expects a little more work from users. But it does a lot more 
for them, and wherever it uses statistical procedures to foreground 
features, it scrupulously gives you the evidence for why it does what it 
does.

_______________________________________________________________
S H A K S P E R: The Global Shakespeare Discussion List
Hardy M. Cook
 
The S H A K S P E R Web Site <http://www.shaksper.net>

DISCLAIMER: Although SHAKSPER is a moderated discussion list, the opinions 
expressed on it are the sole property of the poster, and the editor 
assumes no responsibility for them.
 

©2011 Hardy Cook. All rights reserved.