The Shakespeare Conference: SHK 16.1393 Thursday, 25 August 2005
 From: Marcus Dahl
Date: Wednesday, 24 Aug 2005 14:06:32 +0100
Subj: RE: SHK 16.1367 Wager
 From: Jonathan Hope
Date: Wednesday, 24 Aug 2005 16:21:45 +0100
Subj: Re: SHK 16.1367 Woodstock
 From: Lene Petersen
Date: Wednesday, 24 Aug 2005 20:32:02 +0100 (BST)
Subj: Re: SHK 16.1367 Wager/ Evidence
From: Marcus Dahl
Date: Wednesday, 24 Aug 2005 14:06:32 +0100
Subject: 16.1367 Wager
Comment: RE: SHK 16.1367 Wager
As someone who has been looking into the area of Shakespeare attribution
studies for the last few years I would like to offer a few cautionary
notes and issues (and to prod a few obvious wounds).
(1) I have recently re-counted all of the negative use of the un (vn)
prefix in (all) Shakespeare (Folio) and Robert Greene (for comparison)
retracing Michael Egan's recounts of Hart's counts. Egan I think
believes that his own recounts (which differ from those of Hart) show
the inadequacy of numerical tests. The fact that we can now use
computers to test these figures and provide an accurate answer in days
to the question of numbers refutes this claim. For those interested, I
have a complete list of all the 'un' words I counted for comparison
(which alas neither Hart nor Egan can provide). If anyone else would
like to confirm my counts, that would be useful. Here are the counts:
Play                Hart's figures   Egan count   MXD
2 Henry VI                34             44        42
Romeo and Juliet          44             57        48
Richard II                52             61        54
1 Henry IV                39             45        42
Twelfth Night             33             39        38
Hamlet                    71             80        75
Lear                      55             62        60
Coriolanus                48             55        55
The Tempest               20             20        21
AS YOU LIKE  37
JULIUS C  25
RICH III  73
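For anyone wishing to check how far the three sets of counts diverge, a minimal sketch follows. It uses only the nine plays for which all three figures are given above; the variable and function names are illustrative, and the mean absolute difference is simply one convenient summary, not a measure any of the counters used.

```python
# Counts of negative un- words per play, transcribed from the table above.
# Tuple order: (Hart, Egan, MXD). Only plays with all three counts are included.
counts = {
    "2 Henry VI": (34, 44, 42),
    "Romeo and Juliet": (44, 57, 48),
    "Richard II": (52, 61, 54),
    "1 Henry IV": (39, 45, 42),
    "Twelfth Night": (33, 39, 38),
    "Hamlet": (71, 80, 75),
    "Lear": (55, 62, 60),
    "Coriolanus": (48, 55, 55),
    "The Tempest": (20, 20, 21),
}

HART, EGAN, MXD = 0, 1, 2


def mean_abs_diff(col_a, col_b):
    """Average per-play absolute difference between two counters."""
    diffs = [abs(row[col_a] - row[col_b]) for row in counts.values()]
    return sum(diffs) / len(diffs)


hart_vs_mxd = mean_abs_diff(HART, MXD)
egan_vs_mxd = mean_abs_diff(EGAN, MXD)
hart_vs_egan = mean_abs_diff(HART, EGAN)

print(f"Hart vs MXD:  {hart_vs_mxd:.2f} words per play")
print(f"Egan vs MXD:  {egan_vs_mxd:.2f} words per play")
print(f"Hart vs Egan: {hart_vs_egan:.2f} words per play")
```

The sketch only compares totals; reconciling the counts properly would need the word lists themselves.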
(2) Unfortunately these counts prove that the test itself is not
particularly useful. When I compared all of Shakespeare Folio with the
works of Greene, (using percentages of total word counts and
Discriminant Analysis) it seems evident that the variance within the
data makes for unreliable predictive claims. For example, The Tempest
has a %un count of 0.12% and Much Ado About Nothing 0.09% which is far
less than the average Shakespeare Folio count of around 0.2%. If
comparison is made with Greene (who generally has a lower than
Shakespearean %un count), these plays come out as looking more like
Greene than Shakespeare though their authorship is not in question.
Shakespeare clearly varied his use of un- in quite a considerable way.
(3) This does not mean that statistical data is un-useful but it does
mean that it is significant which data we use. For example, if we
exclude certain 'Shakespeare' plays from our 'core' Shakespeare group on
the basis that they are not 'representative' (as I have noted many
attribution scholars have done - not to name any names) we risk skewing
the data. In the above example for instance, if we only tested middle
period plays with counts of around 0.2% and then tested The Tempest or
Much Ado etc your average statistical analysis package would flag up
those plays as 'un-Shakespearean' on the basis of comparison with the
'core' group. The answer in this case would obviously be false. How
then are we to determine what the outer boundaries of our group are to
be? Statistically this line is blurry. If we include certain dubious
plays, our core baseline (to use Elliott's term) changes - therefore it
is important that the criteria for choosing such a boundary are not
only objective, but sure. For instance, what happens if Ward INCLUDES
1HVI, 2HVI, 3HVI and Titus in his group of Shakespeare plays? Would this
change the answers his tests give? How, before we know the answer as to
the authorship of all of Shakespeare's works, are we to determine what
the correct identification of a Shakespearean baseline will be? See
below on Quartos and 'bad' texts. If, rightly, Ward states that we must
have *some* accepted core group before we can test for differences of
authorship, what do we do with those texts which are *not* core and how
do we use the data from these texts to allow us to understand what we
*mean* by core and not core? I.e. if 1HVI is not core, is it therefore
un-Shakespearean? How do we constitute the margins of the canon?
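The baseline problem in (3) can be sketched numerically. The snippet below is a rough illustration only: the six 'core' rates are HYPOTHETICAL values chosen to sit around 0.2%, a simple z-score cut-off stands in for Discriminant Analysis (it is not the method used for the counts above), and only the two rates for The Tempest and Much Ado are taken from this post.

```python
from statistics import mean, stdev

# HYPOTHETICAL %un rates for a narrow 'core' of middle-period plays,
# invented for illustration so that they cluster around 0.2%.
narrow_core = [0.19, 0.20, 0.21, 0.22, 0.20, 0.18]

# Rates quoted in the post for two plays of undisputed authorship.
candidates = {"The Tempest": 0.12, "Much Ado About Nothing": 0.09}


def flagged(baseline, rate, threshold=2.0):
    """True if rate lies more than `threshold` sample SDs from the baseline mean."""
    z = (rate - mean(baseline)) / stdev(baseline)
    return abs(z) > threshold


# Test each play against the narrow core only.
narrow_result = {play: flagged(narrow_core, r) for play, r in candidates.items()}

# Widen the core to include the low-rate plays, then test again.
wide_core = narrow_core + list(candidates.values())
wide_result = {play: flagged(wide_core, r) for play, r in candidates.items()}

print("narrow core:", narrow_result)
print("wide core:  ", wide_result)
```

Under these assumed figures both plays are flagged against the narrow core and neither against the widened one. Testing a play against a baseline that already contains it is of course circular, which is exactly the difficulty raised above.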
(4) It depends also on who the core group (or whole group) is being
compared with. In my work on 1HVI it is obvious that there is not always
enough data on suspected other authors to make our comparisons very
accurate. e.g. we have 36 First Folio plays by Shakespeare but only 1
pageant play and various prose pieces by Nashe. And yet many notable
commentators have seen enough evidence in that one play and pieces of
prose from Nashe's to feel that we have sufficient fingerprint of
Nashe's style to attribute large parts of 1HVI to him. Now the
ascription may or may not be correct, but it is important to realise
that the data sources are highly imbalanced. The same fact goes for
comparisons of Shakespeare's works with ANY of his early contemporaries
who died before 1600 (i.e. the data for comparison with Jonson is quite
large, but is relatively small for Nashe, Peele, Greene and Kyd, whose
works are not only small in size but poor in textual quality (i.e.
deriving largely from 'bad quarto' style texts) and uncertain in origin:
did Greene really write Selimus (which was attributed to T.G., not
R.G.), and how many extant, securely attributed plays do we really have
by Kyd or Nashe?).
(5) Continuing from the above - the authorship of many of Shakespeare's
contemporaries' works is uncertain. For many years people thought that
Greene wrote the 'Groatsworth of Wit'. Now it is thought that it may in
fact be by Chettle. Is this true, if so, how does it change our
knowledge of Greene? How much of The Jew of Malta did Marlowe actually
write? How much of Selimus is by Greene? Did Kyd really write Arden of
Faversham? Did Peele really write Troublesome Reign? If not Shakespeare,
who actually wrote Edmond Ironside, Faire Em, Locrine and the rest of
the Shakespeare Apocrypha? How much of the above works is collaborative?
Ward Elliott's tests examine whole plays, not sections, but most
suspected works of Shakespearean authorship such as Edward III or 1HVI
are collaborative - meaning the Elliott tests are of virtually no use
for issues of collaboration (as he himself admits).
(6) What about textual condition? As far as I am aware no-one has lately
cleared up the issue of Memorial Reconstruction or the 'Bad Quartos'.
Who actually wrote down the text of The Taming of A Shrew or Q1 Hamlet?
Who wrote the text of John of Bordeaux or 1RichardII - I mean the actual
text we have? The issue of orally milled texts is unresolved for early
texts i.e. - do we regard 'The First Part of the Contention' or Q1
Hamlet to be 'by' Shakespeare even if his hand was never involved in the
writing of the published text? If, as I pointed out in a previous post, we
find many examples of oral devices in early 'Shakespearean' texts - such
as A Shrew, Hamlet, Edward III, 1RichII, Titus etc do we hold back our
'Shakespearean' ascriptions or do we say that the 'work' is by
Shakespeare (in part or whole) but the 'text' is corrupt/ written by
ear/ written from memory/'transmitted through performance'? If so, how
does this affect authorship attribution? Personally I regard Q1 Hamlet
as 'by' Shakespeare though I doubt he sat down and wrote the words in
that exact form - this may be the case for many of these early plays and
it may also explain the huge amount of literary parallels between plays
like Edmond Ironside and 1Rich II with more canonical Shakespeare -
these are stage-derived texts from the same theatre as Q2 Hamlet or The
Taming of The Shrew etc. But can orally milled texts be accepted into
the mainstream canon? Can they indeed be attributed to a fixed author at
all? My own tests reveal that the Shakespeare first quartos are
overwhelmingly closer to canonical Shakespeare than any other author of
the period. This is of course no surprise since they often share more
than 50% of the same words to their Folio brothers. However if we admit
Q1 Hamlet into the canon, must we admit Troublesome Reign, Edmond
Ironside, 1RichardII etc on the same basis - as corrupt or oral versions
of 'true Shakespeare originals'? More of this later.
(7) Some tests appear to work well for separating authors but are highly
subject to date. E.g. Jonathan Hope's 'Do' Auxiliary test - on the basis
of his own evidence - is efficient at separating Shakespeare from later
contemporaries but poor at separating his works from texts published or
written earlier than 1595. Therefore the fact that say, 1RichardII comes
out as 'possibly Shakespearean' on this test, indicates little about its
authorship and more about its date. Other tests such as the 'You/ Ye'
test are similarly prone to such problems.
(8) Tests can be either positive (Egan) or negative (Elliott) - but what
to do with the difference? If, as I noted the other day on Shaksper the
interesting spelling 'YOR' for 'YOUR' is ONLY found in the possibly
orally derived texts 1RichardII, Edmond Ironside and John of Bordeaux
out of 256 other early modern texts and NOWHERE in Shakespeare - should
we regard this evidence as an indication that these 3 texts are
un-Shakespearean but unified in some way nonetheless (author? scribe?)?
If Ward Elliott's tests show that 1RICHII NEVER passes all the standard
tests which all of canonical Shakespeare easily pass, does this mean we
should exclude it from the canon? It does seem unlikely, as Ward states,
that a real Shakespeare play would come along and fail all the standard
Shakespeare tests passed by all other canonical Shakespeare. If Michael
Egan's tests show positive parallels with canonical Shakespeare, is this
enough evidence to include the text? Precisely similar evidence to
Michael Egan's linguistic parallels have been used (mainly by the late
Eric Sams) to attribute Edmond Ironside, The Troublesome Reign of King
John, Locrine, Faire Em and Edward III to Shakespeare. There are
hundreds of examples in Sams' works of close parallels of language /
images / ideas etc between canonical Shakespeare and the apocryphal
plays - and yet most mainstream scholarship has been reluctant to accept
Sams' examples as evidence of common authorship. But if they are not
examples of common authorship, then what are they and how did they get
there? This question needs answering too.
(9) If you do not know who the other possible candidates for the
authorship of a play are - how do you test for them? i.e. does a play
which has lots of Shakespearean themes, words, etc have to be by
Shakespeare simply because it sounds a bit like Shakespeare and we can't
think of anyone else to attribute it to? As Michael Egan's own comment
on yesterday's post indicates - plays such as Edward III or 1RichII
might only appear Shakespearean because we can't think of anyone else to
attribute them to. But this in itself is not sufficient reason to
attribute the play to one author over another. For example, there are
hundreds of close parallels of language between 1HVI and Spenser but
since clearly Spenser is not an authorial candidate for the play we
exclude them as merely literary parallels. But what happens if an author
IS a candidate (i.e. he could have written a certain suspected play) and
the same parallels exist - the superficial evidence is the same but now
we think we may have a genuine candidate, we are more likely to take the
same evidence seriously. But the evidence is clearly deficient since it
cannot stand up on its own - close literary parallels exist between
canonical Shakespeare and non-canonical Shakespeare throughout his
works, but we do not assume collaborative authorship every time we find
them, nor the hand of Marlowe in Richard II because of all the parallels
with Edward II. Literary scholars must resolve this issue.
(10) We now have the availability of computers to check large amounts of
vocabulary and cross-reference our data - but how should we decide how
compelling the absence or presence of data is in using these tests? I.e.
if (as I know from checking) John Ford is one of the few writers to use
'All what' as a phrase, what does the presence or absence of this phrase
mean for the authorship of a play? If (as I did) one finds it in The
Noble Soldier (not previously attributed to Ford to my knowledge) must
we suspect the hand of Ford in that play or do we merely attribute the
(very rare) use of the phrase to chance / accident/ plagiarism etc? Also
- if I can get up to 93% discrimination of Shakespeare Folio from 256
author texts by 24 other authors using 87 function word based tests and
discriminant analysis - (which I can) what shall we conclude about the
accuracy of these tests for the determination of authorship -
particularly if such seeming success comes in the face of all the other
questions here related?
(11) I have yet to finish doing linguistic tests on 1RichII and Edward
III etc because it is difficult to test texts which do not belong to a
particular author group - since merely 'apocryphal' texts do not form a
natural linguistic / statistical group. This, as alluded to above, is
perhaps the biggest problem facing authorship studies - how does one
test an isolated work BUT by comparison with a wider canon - our
knowledge of which is itself under scrutiny: i.e. we need to form canons
in order to examine canons, but overwhelmingly our knowledge of the
works of authors other than Shakespeare is very poor. Until such a time
as everyone on this list has actually bothered to read and know ALL of
Greene, Peele, Nashe, Lodge, Marlowe, etc (not to mention the hundreds
of anonymous and apocryphal early texts) the kinds of ascriptions of
authorship and style regularly discussed in this and other academic
groups will remain merely superficial. As they say in the law courts,
ignorance is no excuse.
Answers on a postcard please!
Dahl the Doubter.
From: Jonathan Hope
Date: Wednesday, 24 Aug 2005 16:21:45 +0100
Subject: 16.1367 Woodstock
Comment: Re: SHK 16.1367 Woodstock
Here are the details I promised on auxiliary 'do' use in Woodstock.
Figures fans may remember that my initial findings for overall use of
auxiliary 'do' in the play put it at 83% regulation, within the range
However, I also said that I wanted to check some details as I noticed
what looked like some interesting patterns while doing the counts. So
here they are.
Although the average figures for auxiliary 'do' use don't provide a
distinction between the author of Woodstock and Shakespeare, the
detailed figures for certain sentence types do.
The clearest distinction comes in positive statements (the commonest
sentence type, and therefore the one which provides the most reliable
A positive statement is something like, 'I went to hear a play yesterday'.
In Early Modern English, because of a significant long-term change in
the grammar of English, speakers were able to add the auxiliary verb
'do' to positive statements as an alternative to using the simple verb form.
So the following are direct equivalents:
I went to hear a play yesterday
I did go to hear a play yesterday
(unlike in Modern English, the second version here does not
automatically imply emphasis).
This is a relatively short-lived option in the language, but usefully
for attribution studies is at its height in the second half of the
sixteenth century. On average across Early Modern texts about 9% of
positive statements have auxiliary 'do' added like this.
Shakespeare, probably because of his birthdate and birthplace, is above
average in his use of auxiliary 'do' in positive statements - generally
about 11% of his positive statements have auxiliary 'do'.
The author of Woodstock however is below average in his use of auxiliary
'do' in this sentence type - only 4.5% of positive statements in the
play have auxiliary 'do' (this pattern is consistent across the text).
To put this into raw figures, for Woodstock to have a Shakespearean
pattern of auxiliary 'do' use in positive statements, it would need
about 50 more sentences of the type 'I did go to hear a play yesterday'
(it currently has 36).
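The raw figures above can be reconstructed from the percentages. This is a back-of-envelope sketch, assuming that the total number of positive statements in the play stays fixed (that is, some simple verb forms would become 'do' forms); the variable names are illustrative and the total is implied by the quoted figures, not separately counted.

```python
# Figures quoted in the post.
observed_do = 36          # positive statements with auxiliary 'do' in Woodstock
woodstock_rate = 0.045    # 4.5% of positive statements in Woodstock
shakespeare_rate = 0.11   # roughly 11% in Shakespeare

# Implied total of positive statements in Woodstock (an inference, not a count).
total_positive = observed_do / woodstock_rate

# 'do' forms the play would have at a Shakespearean rate, total held fixed.
expected_do = shakespeare_rate * total_positive

# Additional 'do' sentences needed to reach that rate.
shortfall = expected_do - observed_do

print(f"implied positive statements: {total_positive:.0f}")
print(f"expected 'do' forms at 11%:  {expected_do:.0f}")
print(f"additional 'do' forms needed: {shortfall:.0f}")
```

On these assumptions the shortfall comes out at a little over 50, in line with the "about 50 more" stated above.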
This is a very large difference, and on its own is enough to make me
very dubious about the possibility that Woodstock was written by
Shakespeare. Given that other researchers' linguistic tests also reject
Woodstock as a Shakespeare text, I have to say I'm pretty satisfied that
it's not by him.
Strathclyde University, Glasgow
ps there are also differences between the author of Woodstock and
Shakespeare in the formation of positive questions and negative
questions, though these are much less frequent sentence types, so are
not as robust statistically.
From: Lene Petersen
Date: Wednesday, 24 Aug 2005 20:32:02 +0100 (BST)
Subject: 16.1367 Wager/ Evidence
Comment: Re: SHK 16.1367 Wager/ Evidence
In the most recent posting on the case of the authorship and provenance
of 1, Richard II, Michael Egan states that "whoever wrote 1 Richard II
also wrote Edward III, the Shakespeare scenes in The Two Noble Kinsmen
and the fragment from Sir Thomas More attributed to Shakespeare." He
directs readers to his website, where "the data supporting these claims"
can be consulted in the form of parallel phrases linking 1, Richard II
with Edward III, and with Shakespearean scenes in Kinsmen and Sir Thomas
More. I make no comment here as to how convincing or not the bulk of
this evidence may be; only I would like to disqualify an -admittedly-
small section of Egan's 'parallels' as incapable of indicating specific
authorship in early modern playtexts.
Comparing 1, Richard II with Kinsmen, Egan says: "That we find any
overlaps at all, however, is remarkable since The Two Noble Kinsmen was
almost certainly one of Shakespeare's final plays, 1 Richard II among
his first. The common turns of phrase thus reveal habits of mind
spanning an entire career." Included in these 'turns of phrase' are, for example:
"Come, come, let's leave them [i.e., the court]. (1 Richard II, II.i.171)
Let's leave his court (Two Noble Kinsmen, I.ii.75)
He comes, my lord,- (1 Richard II, II.i.127)
Here she comes (Two Noble Kinsmen, II.i.15)
Look where she comes (Two Noble Kinsmen, IV.iii.9)"
These particular examples may yield 'habits of mind', or formulae
rather, but they do not belong to the careers of any specific
playwrights. They are found in several playtexts of the period (as
simple searches in LION or KEMPE-online will show). I would like to call
them 'oral' formulae (as similar look, see, come and news formulae
appear to increase in number in orally transmitted folk ballads), but
the fact is that on the English renaissance stage, with its intense
repertoire system, they have probably become "strong" variants of
theatrical formulae; that is, they are not only used by players in
repeated (oral) performance, but also by playwrights in the writing of
their (literary) texts. It would appear there are certain formulae for
certain stereotypical situations.
The stock of various "Come, (come) let's go/see...", "Here x/y/z
comes/goes/is", "Look/see where he/she/it comes/goes/is etc" is
substantial in the surviving texts from the early modern English stage,
so are "what's the news...", "leave me alone..." and "how now"
formulae. That such features can be seen to increase in so-called "bad"
texts in relation to equivalent long texts (where these survive, e.g.
Hamlet Q1 vs Q2/F1 or Romeo and Juliet Q1 vs. Q2/F1etc.) is worth
noting. These features are induced by transmission per se. They are used
by players and playwrights alike; in composition by playwrights and in
"de-composition" by players. Thus they cannot and should not be claimed
to belong to the careers of individual playwrights.
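A rough way to see how common such formulae are in any given playtext is to count them with simple patterns. The sketch below is illustrative only: the regular expressions are my own approximations of the formulae listed above, not an established inventory, and the sample lines are the ones quoted from Egan's parallels with the bracketed gloss omitted. Real searches would need to allow for early modern spelling variation.

```python
import re

# Approximate patterns for the stock formulae discussed above (assumptions,
# not a standard list). All matching is case-insensitive.
FORMULAE = {
    "come_lets": re.compile(r"\bcome,(?:\s+come,)?\s+let's\b", re.IGNORECASE),
    "here_x_comes": re.compile(r"\bhere\s+\w+\s+(?:comes|goes|is)\b", re.IGNORECASE),
    "look_where": re.compile(
        r"\b(?:look|see)\s+where\s+\w+\s+(?:comes|goes|is)\b", re.IGNORECASE
    ),
    "whats_the_news": re.compile(r"\bwhat's\s+the\s+news\b", re.IGNORECASE),
    "how_now": re.compile(r"\bhow\s+now\b", re.IGNORECASE),
}


def count_formulae(text):
    """Return the number of matches for each formula pattern in `text`."""
    return {name: len(pattern.findall(text)) for name, pattern in FORMULAE.items()}


# The lines quoted in the post, used here as a tiny test text.
sample = "\n".join([
    "Come, come, let's leave them.",
    "Let's leave his court",
    "He comes, my lord,-",
    "Here she comes",
    "Look where she comes",
])

sample_counts = count_formulae(sample)
print(sample_counts)
```

Run over whole texts, counts of this kind would allow the comparison suggested above between 'bad' texts and their longer equivalents, though the counts would need normalising by text length.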
For the same reasons, I recommend excluding the following features as
part of a case for Shakespeare's authorship of 1, Richard II vis-