The Shakespeare Conference: SHK 14.1107 Friday, 6 June 2003
From: Gerald E. Downs
Date: Thursday, 5 Jun 2003 19:58:41 EDT
Subject: 14.1072 Re: King John, Titus, Peele
Since Brian Vickers's critique of Foster's attribution of "Funerall
Elegye" has arisen in the context of methodology, this may be a good
time to examine another of the Elegye arguments.
Foster noted in his book:
'It is nevertheless true that Shakespeare's vocabulary contains an
unusually high percentage of un- words, even at 2.5 percent. The same
is true of W.S. In the Elegy . . . there are 4,445 words. Of 1,420
"different" words (Spevack's definition), 28 are (different) un- words,
or 2.0%. W.S. thus comes closer to Shakespeare's mean than does
Shakespeare himself in Venus and Adonis (0.9%), The Rape of Lucrece
(1.4%), the Sonnets (1.2%), and "A Lover's Complaint" (0.7%).' (96)
Vickers, in his book "'Counterfeiting' Shakespeare," writes:
"Foster climaxed his discussion with that entirely fortuitous
statistical correlation between the Elegye and the global figures for
Shakespeare's usage." (112)
Foster reported Shakespeare's usage (according to Spevack) at 29,066
different words, of which 724 are 'un-'words, 2.5% of the total. He
referred to this figure as "Shakespeare's mean". Vickers referred to a
"statistical correlation between the Elegye and the global figures for
Shakespeare's usage."
The 2.5% figure is not a mean. There is not a statistical correlation
between Elegye and Shakespeare's works, except to show they differ.
Ford uses far more 'un-' words than Shakespeare, who never approaches
the Elegye figures. Vickers could have gone to the heart of the matter.
If "global" refers to total usage in all Shakespeare's works (recently
reduced by an elegy), then canonical percentages are higher than the
average for individual works -- for any canon.
Foster reported Shakespeare's poems as having 1.4% or less 'un-'words.
He reported no play figures, but implied that they would reflect a usage
of 2.5% or greater to bring the works up to a "mean" determined from
the canonical totals.
It doesn't work that way. Total 'un-' words grow at a faster rate than
canonical words. The more works, the higher percentage of 'un-' words.
This fact is not reflected in totals or percentages for individual
works.
Othello has 25,982 words, 3,917 different words, and 50 'un-' words, or
1.3%.
The Winter's Tale has 24,680 words, 4,023 different, and 48 'un-' words:
1.2%.
Merry Wives, 3,327 unique words and 23 'un-' words: 0.69%.
All's Well, 22,585 words, 3,604 different, 37 'un-' words: 1.0%.
Henry IV, Pt. 1, 40 'un-' words at 1.0%.
Mutant HIV, 27 'un-' words of 4,229: 0.64%.
Two Gents 34 of 2,818 unique words, or 1.2%.
Love's Labor's Lost, 26 'un-' words: 0.67%.
Errors, 29, or 1.1%. Speaking of errors, these figures have some, no
doubt.
The Tempest has only 21 'un-' words out of 3,288, or 0.64%. That's less
than 10 per 1,000 lines. Ford's The Lady's Trial has 58 'un-' words in
King John, 47 separate 'un-' words in 2,575 lines.
That's 1.3% of the 3,651 types.
Julius Caesar, 23 'un-' words: 0.78%.
As You Like It, 31 'un-' words: 0.92% of 3,365. That's three more than
Elegye in more than twice as many unique words.
Henry V, 33 'un-' words of 4,674: 0.7%. MV, 0.98%.
Much Ado is a play of 2,600 lines, 20,800 tokens, and 3,049 types, of
which 20 are 'un-' words, or 0.65%.
Ford's Witch of Edmonton has 20 'un-' words in 829 lines.
Troilus, 4,400 types, 43 'un-' words, 0.98%, Twelfth Night, 3,233, 34,
1.05%, Macbeth, 3,448, 41, 1.2%, Cymbeline, 4,467, 41, 0.92%.
Midsummer Night's Dream, 18 'un-' words, 0.59%.
Ford's Fames Memorial has 26 in less than 1200 lines.
Pericles, 0.83% of 3,394 words. Henry Sixth Pt. 1 has 29 'un-' words:
0.73%, Pt. 3, 37 at 1.0%, and Pt. 2, with more words total, has only 34.
Timon, 34, or 1.0%. Taming, 30, or 0.89%. Titus, 30, 0.86%. The Two
Noble Kinsmen has 33 'un-' words in 3,430 lines.
Some of Shakespeare's plays do have higher numbers of 'un-' words. King
Lear has 55, or 1.3% of the types. That's about 17 per 1,000 lines.
Ford's Perkin Warbeck has 57 in a thousand lines less, or 24/1,000.
Richard II also registers 1.3% (50 of 3,743 types, in 2,750 lines).
Ford's Laws of Candy has 50 in 1,989 lines.
Romeo and Juliet, 46 in 3,838 types: 1.2%.
Coriolanus, 49 in 4,216, also 1.2%, in 3,800 lines.
Ford's Love's Sacrifice has 49 in 1,700 lines. Same number, less than
half the lines.
Richard III, 57 for about 1.36%.
Measure for Measure has 50 'un-' words in 3,403 types, or 1.47%. In
2,940 lines that's 17 per 1,000. Ford's Lover's Melancholy, with 46 in
1,824, rates at over 25 per 1,000 lines. Shakespeare never comes close
in his plays to Ford's figures.
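The two measures used throughout this post can be checked with a few lines of Python. This is only a sketch of the arithmetic, using the Measure for Measure and Lover's Melancholy figures quoted above; the function names are made up for illustration, and the counts themselves are taken on trust from the post.

```python
def type_percentage(un_types, total_types):
    """Share of distinct word forms (types) that are 'un-' words, in percent."""
    return 100.0 * un_types / total_types


def per_thousand_lines(un_types, lines):
    """Distinct 'un-' words per 1,000 lines of text."""
    return 1000.0 * un_types / lines


# Figures as quoted in the post (not independently recounted).
mm_pct = type_percentage(50, 3403)       # Measure for Measure, share of types
mm_rate = per_thousand_lines(50, 2940)   # Measure for Measure, per 1,000 lines
lm_rate = per_thousand_lines(46, 1824)   # Ford, The Lover's Melancholy

print(f"Measure for Measure: {mm_pct:.2f}% of types, {mm_rate:.1f} per 1,000 lines")
print(f"Lover's Melancholy:  {lm_rate:.1f} per 1,000 lines")
```

The same two functions apply to any of the other plays listed here, wherever both a type count and a line count are given.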
Hamlet has the most 'un-'words of Shakespeare's plays, 69, in 4,894
types from 29,673 total words, or 1.4%. That's from about 3,800 lines,
or 18 per 1,000.
The number of lines is a direct function of the tokens (almost 30,000
total words in Hamlet) but reflects a logarithmic relationship to the
types, which are six times less in number. Ford's The Broken Heart has
63 'un-' words in only 2,328 lines. That would reflect about 3,000
different words, or 2% 'un-' words, a figure never approached by
Shakespeare.
No matter how they're sliced, Shakespeare's plays never use 'un-' words
with Ford's higher frequencies. The plays do correspond to Shakespeare's
poems, and the works as a whole show consistency. Shakespeare habitually
hovered around the one percent mark. The highs and lows probably reflect
subject matter more than anything else.
Each succeeding work comprises repeats and new uses, some of which are
nonce-words. But for each work itself, the repeats from earlier works
are counted anew to find its percentage of 'un-' words. These
recurrences make the individual percentages irreconcilable with the
canonical figure.
Shakespeare used 724 different 'un-' words. Starting with Hamlet, we get
68 of them. Much Ado adds 20, but those are reduced by repeating
Hamlet's 'unjust,' 'unknown,' 'unworthy,' and 'undo.' To those 86 the
next work will also add less than its own total.
After those two, the remaining works need average only 16 new
Shakespearean usages to get to 724, and those will be supplied mostly by
the first works counted. The individual figures (type-to-type
percentage) have no direct correlation to the canonical figures.
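A toy illustration of the point, in Python. The vocabularies below are invented, not Shakespeare's, and the prefix test is deliberately crude (a real count would need a lexicon to exclude words like 'under' or 'uncle'). The sketch assumes only what the argument above says: works share most of their common words, while each brings some 'un-' words of its own.

```python
def un_share(vocab):
    """Percentage of distinct words in `vocab` that start with 'un'."""
    un_words = {w for w in vocab if w.startswith("un")}
    return 100.0 * len(un_words) / len(vocab)


# Invented vocabularies: eight shared words, plus two words unique to each work.
common = {"the", "and", "of", "to", "king", "love", "death", "night"}
work_a = common | {"sword", "unjust"}
work_b = common | {"crown", "unkind"}

# The "canon" counts every distinct word once, however many works use it.
canon = work_a | work_b

per_work = [un_share(work_a), un_share(work_b)]
mean_of_works = sum(per_work) / len(per_work)
pooled = un_share(canon)

print(f"per-work shares: {per_work}")
print(f"mean of works:   {mean_of_works:.2f}%")
print(f"pooled canon:    {pooled:.2f}%")
```

Each toy work has one 'un-' word in ten types, yet the pooled vocabulary has two in twelve, because the shared words are counted once while the 'un-' words accumulate. That is the sense in which a canon-wide percentage is not the mean of the individual percentages.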
Venus and Adonis is reported by Foster as having 'un-' words at a rate
of 0.9% of the total "different" words. There are 23 'un-' words in
V&A, which gives about 2,555 different words (23 / .009). The poem has
1,180 lines, twice as many as Elegye, or about 9,000 total words.
Expressed as 'un-' words per 1,000 lines, V&A has 19.5, two and one-half
times less than Elegye.
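The back-calculation for Venus and Adonis can be sketched as follows. The inputs are the figures quoted in this post (23 'un-' words, Foster's 0.9%, 1,180 lines, and 48 per 1,000 lines for Elegye); the variable names are illustrative only. Because 0.9% is a rounded figure, the implied type count is an estimate, not a recount.

```python
un_words = 23            # distinct 'un-' words in Venus and Adonis
reported_share = 0.009   # Foster's 0.9%, as a fraction (already rounded)
lines = 1180             # line count used in the post

# Different words implied by the reported percentage.
implied_types = un_words / reported_share

# Distinct 'un-' words per 1,000 lines.
rate = 1000.0 * un_words / lines

# Elegye rate per 1,000 lines, as cited in the post.
elegye_rate = 48.0
ratio = elegye_rate / rate

print(f"implied types:       about {implied_types:.0f}")
print(f"per 1,000 lines:     {rate:.1f}")
print(f"Elegye / V&A ratio:  {ratio:.1f}")
```

A reported 0.9% could stand for anything from 0.85% to 0.95%, so the implied type count could fall anywhere from roughly 2,400 to 2,700.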
Lucrece is reported to have 1.4% 'un-' words, which number 49 of about
3,500 different words in a poem of 1,846 lines. That works out to 26.5
per 1,000 lines. This is Shakespeare's largest usage per 1,000 lines,
but MM has 1.47% and only 17 per 1,000. That's how the total number of
lines changes the figure. That's also why Elegye at 48/1,000 sounds so
high. Per-thousand figures are meaningful only in groups of equal line
counts.
The Sonnets' 'un-' words are 1.2% of 3,167 words, 38 in all. In about
2,140 lines, that's less than 18 per 1,000 lines.
Christ's Bloody Sweat has (by my count) 61 separate 'un-' words. In 1,908
lines, that's 32 / 1,000, considerably more than the shorter Lucrece.
The first 1,000 lines of Ford's poem have 38 'un-' words, as many as the
Sonnets in half the lines.
Comparing poems, Shakespeare's are far outside the figures for Elegye.
The plays are within the same range.
I believe most works of the era would hover around the Shakespeare canon
in frequency of 'un-' words, and such figures would not usually be
useful for attribution studies. If 'un-' words tell us anything, it's
that Shakespeare didn't write Elegye.
Gerald E. Downs
S H A K S P E R: The Global Shakespeare Discussion List
Hardy M. Cook
The S H A K S P E R Web Site <http://www.shaksper.net>