The Shakespeare Conference: SHK 15.0938  Friday, 23 April 2004

From:           Marcus Dahl
Date:           Thursday, 22 Apr 2004 17:23:48 +0100
Subject: 15.0904 Stylometrics
Comment:        RE: SHK 15.0904 Stylometrics

Dear All,

I have been meaning to 'answer' this one for ages so apologies for
lateness of my reply.

As I have just finished a PhD on the subject of the authorship of Henry
VI I have some thoughts on Stylometry as a subject and Shakespearean
authorship studies in general.

(1) I have yet to read an edited edition of Shakespeare that does not at
least in part rely upon previous stylometric / attributive work in some
respect in order to justify the editor's basic editorial principles. For
instance, if you believe a work to be by Shakespeare in entirety, you
tend towards an inclusive editorial strategy - straining usually to make
sense of obscure or ostensibly meaningless passages and justify
apparent inconsistencies in the text through editorial amendment. If on
the other hand, you tend to regard the text in hand as collaborative,
pirated or in some way disjoint - the editorial policy tends to
increasingly remove 'Shakespeare' from the responsibility of its
authorship and to cut those passages from the text. Thus one's editorial
opinion of a play's authorship affects the way in which it is edited and
also the way in which the text is therefore constructed in the real
world of Arden, Oxford, Cambridge etc. edited texts. What on SHAKSPER,
or 'down the pub with your mates' can seem an aloof and perhaps banal
subject takes on real significance in publishing terms as
'Shakespearean' texts are rebranded, remarked and re-edited according to
the current orthodoxy of scholarly attribution. Henry VI (my text of
choice) is currently lying on the very cusp of the Shakespearean margins
- to be included in the canon or not? that is the question. I know that
Ward has excluded it from his 'core canon'.

On a personal note - the reason I became interested in the subject is
that the very first Shakespeare text book I owned was the Wells/ Taylor
edited Shakespeare Works - a highly contentious edition still - a work
which I read cover to cover at school and university without ever
realising for some time that it was a highly amended and altered text
with a view to restoring the editors' views of what constituted original
Shakespeare texts. Now, whether or not Wells and Taylor were right or
wrong in their attributive and editorial work is not so important as the
fact that most average readers of Shakespeare get their view of what is
or is not Shakespeare from the cover of the book and what the 'experts'
inside the cover of that book say about the play. Now, if the 'experts'
say that 'Henry VI' is not all by Shakespeare but partly by Nashe (etc)
- that is quite different from the play being all by Shakespeare. It
changes the way you read each part - i.e.  remember how much flurry of
argument there is each time Shakespeare's supposed Catholicism is
brought up to support some reading of his plays (or vice versa) - if
Nashe or Greene or Marlowe etc wrote large or even small chunks of early
'Shakespeare' a whole new biography of 'the writer's life' emerges. (And
a whole bunch of new covers in book shops. How to sell a play that is
not actually 'by' one author but three - two of whom may be unknown?).
If an unknown passage in a book marked 'Marx' turns out to be by
'Groucho' and not 'Karl' it changes rather profoundly the 'import' of
the text. Thus if there is anything to studies of authorial biographies
at all, we should suppose that a passage written by Greene or Marlowe or
Nashe has rather a different significance (even in the context of the
same play) to us as modern readers than if the same passage was written
by the 'bard of avon'. It must be noted in this context that rarely are
'superior' passages taken away from 'Shakespeare' - it is always the
supposed 'inferior' passages which go to poor old Nashe and Greene et al.

(2) If we can agree in principle that certainly at least the modern day
publishers of books care who wrote them - just as most readers do not
stroll blindly around disregarding the names of the authors of their
favourite books / films / music etc, then we might agree that at least
in principle we like to know where things come from and how they got
there. I.e. we like to know the history of things - whether this be
people, places or plays. Rare books are more expensive than ubiquitous

(3) If then we like to know the origin of things, the first question we
ask ourselves is how we might find those origins out? In the case of
Shakespeare the answer is: - from critical/ literary books, from
contemporary sources, from the works themselves, from historical
manuscripts etc etc. All of which is presumably undisputed. When I
started my PhD, I certainly had no particular urge to use computers or
learn statistics or print out graphs of Shakespeare's use of various
verb-forms etc. The problem I came across was that for all the books I
could read on the subject, there was no discipline, no cross-examination
and no agreed opinion as to what was the best, most acceptable, or
reliable method of finding out the authorial history of a text. Now if
one person tests 'do-auxiliary verb forms' and another tests the
occurrence of 'here' as a stage direction and yet another merely throws
up his hands and says 'it cannot be done' what is one to think? It is
not like the publishing industry stops printing works of 'Shakespeare'
or people stop wanting to know who wrote which plays when. Terence
Hawkes and the post-modernists are in the minority I am afraid. Most
people want to know - or at least know that they cannot know (remember
Gödel, everyone?). Thus I started trying to test as many of the theories
about Henry VI as I could.  The problem (as has been pointed out on
Shaksper as well as in some very good articles by Joseph Rudman et al)
is that it is very difficult to test the testers. But - Michael Egan
please heed - this is not reason to dismiss the subject - it is more
reason to solidify the methodologies - to embark on a thorough
investigation of how we might do it properly. The difficulty of testing
collaborative early modern authorship does not go away merely because
some people do not trust Ward Elliott's methodology etc. Rather it
intensifies the need to clarify how one might go about the subject with
agreed parameters, falsifiable conditions, acceptable statistical
margins etc. In my own tests I can get very convincing margins of error
and predictive techniques which work using my tests and figures but you
would have to test them yourselves to see if you agree with my results.
Incidentally I have deliberately kept copies of everything I did so that
people can do precisely this. I believe that Ward still suffers from the
irreproducibility of his results - but I may be wrong.
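
As a rough illustration of the kind of count that tests like those in
point (3) rest on, here is a minimal Python sketch. It is not the method
used in the PhD described above. The marker list DO_FORMS and the
function names are hypothetical, and the count is of surface forms only:
separating auxiliary 'do' from main-verb 'do' would need a parser.

```python
import re

# Hypothetical marker list: surface forms of 'do' only. This is an
# assumption for illustration; a real do-auxiliary test would need a
# parser to tell auxiliary uses from main-verb uses.
DO_FORMS = {"do", "does", "doth", "dost", "did", "didst"}

def tokenize(text):
    """Lower-case the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def rate_per_thousand(text, markers=DO_FORMS):
    """Return occurrences of the marker forms per 1,000 word tokens."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    hits = sum(1 for t in tokens if t in markers)
    return 1000.0 * hits / len(tokens)

# Invented sample sentence, not a quotation from any play.
sample = "I do beseech you, sir; what doth he say? He did not speak."
print(round(rate_per_thousand(sample), 1))
```

Publishing the marker list and the tokenising rule alongside the figures
is what would let someone else rerun the count on the same texts and
check whether they get the same rates.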

(4) The answer of course is the internet. I have said this many times
before and no doubt I shall go on boring people until it is done. All of
my linguistic tests were done across the internet using the VISL
linguistic parser at Odense University in Denmark (and as soon as is
possible will be made available online by my colleague Lene Petersen -
probably summer 2004) but all these things really take is for lots of
people to put their minds to it.  Test the testers - use the same tools
- apply a scrupulous methodology - make the stats programs available and
the data and it can all be done. I believe that with the new amazing
linguistic software and the increased availability of electronic texts
online for testing, we can at least test ALL of the available texts
beside each other and draw some general conclusions from a careful
survey of all current arguments - whether they are literary / linguistic
(as Brian Vickers) or statistical / linguistic (like Ward Elliott / Mac

Some general things that need to be done before a more general consensus
in Shakespearean Stylometry is reached:

(1) All available early modern texts must be digitised and made
available for linguistic / statistical analysis. This has not yet been done.

(2) A reliable early modern linguistic parser must be created to use on
un-modernised electronic texts - this will take years. E.g. for my tests
I used a modern grammatical parser modified through a database of early
modern spellings / forms etc.

(3) The list of linguistic and statistical tests used by stylometers
must be in some way formalised, standardised and organised so that a
generally agreed methodology can be formulated for all students of the
subject. If I learn physics - I have to learn certain principles and
techniques which allow me to study the subject.

(4) The subject of Stylometry - like the study of English itself in the
1890s - needs to be formalised so it can be taught and tested.

(5) All resources of all researchers should be made common. i.e. it is
not enough that I can test Shakespeare's 'canonicity' and tell you what
I think - you should be able to test my interpretation and results
yourself (as you can, say, in Engineering). The testers must be tested -
and here I mean everyone - not just people such as Ward Elliott (who I
know wouldn't mind) but those such as Gary Taylor who regularly make
attributive arguments to the deafening silence or those who do not mind
his techniques because they do not appear 'technical'. Old style literary
criticism is just as likely to lead to false conclusions as modern
computer aided statistical analysis.

(6) Literary scholars have to stop being scared of statistics (they are
everywhere else in the world - from marketing to politics - why not in
literary studies) and stop being so personally involved in their own
results. If Gary Taylor is right that Nashe wrote the first part of 1HVI
then I should shake his hand and commend him on his acumen - not write
scathing ad hominem reviews and become a Trappist monk.
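
Point (2) in the list above mentions a modern parser modified through a
database of early modern spellings. A minimal Python sketch of that
general idea follows, assuming a simple lookup table applied before the
text reaches a modern parser. The table SPELLINGS and the function
normalise are hypothetical and are not the actual database or tool
described above.

```python
# Hypothetical lookup table: early modern spelling -> modern form.
# This is an assumption for illustration; a real database would hold
# many thousands of entries and would have to deal with ambiguous forms.
SPELLINGS = {"loue": "love", "haue": "have", "vpon": "upon", "doe": "do"}

def normalise(tokens, table=SPELLINGS):
    """Replace each token that has a table entry; leave the rest as-is."""
    return [table.get(t.lower(), t) for t in tokens]

# Invented sample line in old spelling, not a quotation.
line = "I haue sworn vpon my loue"
print(normalise(line.split()))
```

A plain table like this cannot resolve forms whose modern equivalent
depends on context, which is one reason a reliable early modern parser
would take years to build.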

Incidentally, I do not think that these above possible (and by no means
final) criteria invalidate the process of trying to test authorship
before they have all been met - right now on the shelf before me sit
various editions of 'Henry VI Part One' - all by different editors and
all with different views on the play's authorship. Certainly, if one is
at all interested in Shakespeare's early canon, early writings, possible
collaborative practice, stagecraft or literary / linguistic technique,
then this play's authorship should be of importance to you - and as yet
- the matter is far from being decided.

Personally I do not believe that the case has been adequately made that
the play is partly by Nashe - but we shall see!

All the best,
Marcus Dahl
Univ. Bristol UK

S H A K S P E R: The Global Shakespeare Discussion List
Hardy M. Cook
The S H A K S P E R Web Site <http://www.shaksper.net>

DISCLAIMER: Although SHAKSPER is a moderated discussion list, the
opinions expressed on it are the sole property of the poster, and the
editor assumes no responsibility for them.
