The Shakespeare Conference: SHK 29.0429 Sunday, 9 December 2018
[1] From: Gabriel Egan <
Date: December 8, 2018 at 10:37:56 AM EST
Subj: Re: SHAKSPER: NOS (specifically Rizvi on 'microattribution')
[2] From: Brian Vickers <
Date: December 9, 2018 at 9:57:13 AM EST
Subj: NOS
[1]-----------------------------------------------------------------
From: Gabriel Egan <
Date: December 8, 2018 at 10:37:56 AM EST
Subject: Re: SHAKSPER: NOS (specifically Rizvi on 'microattribution')
Dear SHAKSPERians
When replicating the work described in someone else’s publication, the guiding principle is that one should be following the same procedures as the original investigator did. One might use different sources for the same data, different software for processing the data, and different methods for presenting it. But a replication is not valid if one goes looking for entirely different data. If one does that, then finding that one’s results differ from the original investigator’s results is not news at all: rather, it would be surprising to find the same results.
Pervez Rizvi claims in “The Problem of Microattribution” (published in Digital Scholarship in the Humanities in advance access) to be replicating the microattribution performed by Gary Taylor in “Empirical Middleton: Macbeth, Adaptation, and Microauthorship” Shakespeare Quarterly 65 (2014): 239-272. Rizvi claims that there is much more data to be had than Taylor found and that when it is all considered the microattribution method does not work: all it is really good for is finding out something we already know, which is who had the largest dramatic canon.
The first sign that Rizvi just isn’t using the same evidence as Taylor is that Rizvi finds as his primary data 759 ‘matches’ between a 63-word passage from Macbeth (4.1.140-148) and the other plays of the period, where Taylor found only 18. Either Taylor failed to notice 98% of the available evidence, or he and Rizvi were in fact looking for different things and Rizvi has failed to replicate Taylor’s method. Even Taylor’s worst enemies should have trouble believing that he is as incompetent as Rizvi’s numbers suggest.
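The 98% figure can be checked with a line of arithmetic. This is a minimal sketch that uses only the two counts quoted above and assumes, generously, that all 18 of Taylor's matches are among Rizvi's 759; the variable names are illustrative.

```python
# Figures quoted in the post: Taylor's 18 matches against Rizvi's 759.
taylor_matches = 18
rizvi_matches = 759

# Share of Rizvi's matches with no counterpart in Taylor's list,
# assuming every one of Taylor's 18 is also among the 759.
unshared = (rizvi_matches - taylor_matches) / rizvi_matches
print(f"{unshared:.1%}")
```

Rounded to the nearest whole number, the share is the 98% cited above.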
Rizvi writes that “The stated aim of Taylor is ‘to analyse every word and every possible combination of words’ in the passage to be attributed . . . In summary, the method looks for these combinations of words in other plays, focusing on the ones found only in the plays of Shakespeare and Middleton, referring to these as unique matches. It then makes the attribution of the passage to Shakespeare or Middleton by comparing how many unique matches come from each dramatist’s plays”.
That isn’t the stated aim of Taylor. He made it clear that “This form of analysis treats a passage of text as a network, consisting of individual lexical nodes and the relationships between them” (Taylor p. 245). He did this, he writes, by searching for n-grams and meaningful combinations of lexical words, allowing for their variant forms. Thus, according to Taylor, “‘stands’ could generate collocations involving three grammatical forms of the same verb (‘stand’, ‘standing’, ‘stood’), and any one of those four forms could link to four additional variants of ‘amazedly’ (‘amazed’, ‘amaze’, ‘amazing’, ‘amazingly’), which might occur before or after the verb, immediately adjacent to it, one word away, two words away, three words away, and so on”. And the nouns “‘sisters’ and ‘sprites’ can be searched in both singular and plural forms” (Taylor p. 245).
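The kind of search described in that quotation can be sketched as follows. This is an illustration only, not Taylor's actual procedure: the variant lists come from the quotation, but the function name `collocates`, the simple lower-case tokenization, and the fixed `window` parameter are assumptions made here (the quotation sets no upper limit on distance).

```python
from itertools import product

# Variant forms as listed in the quotation from Taylor (p. 245).
STAND = {"stands", "stand", "standing", "stood"}
AMAZE = {"amazedly", "amazed", "amaze", "amazing", "amazingly"}

def collocates(tokens, left, right, window):
    """Return (i, j) index pairs where a form from `left` and a form from
    `right` occur within `window` tokens of each other, in either order."""
    hits = []
    for i, word in enumerate(tokens):
        if word not in left:
            continue
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i and tokens[j] in right:
                hits.append((i, j))
    return hits

# Every pairing of a 'stand' form with an 'amaze' form that a search
# would need to try: 4 forms x 5 forms.
pairings = list(product(sorted(STAND), sorted(AMAZE)))
print(len(pairings))

# The line from the Macbeth passage, lower-cased and split on spaces.
line = "but why stands macbeth thus amazedly".split()
print(collocates(line, STAND, AMAZE, window=5))
```

Each hit here joins two lexical words, which is the kind of evidence the quotation describes; no function-word pairings are generated.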
There is nothing in Taylor’s article that justifies Rizvi’s assumption that Taylor intended to look for every collocation of every function word. Taylor did not assume that if we do that we get meaningful evidence for authorship. Perversely, Rizvi assumes that Taylor did mean to look for every collocation of every function word, and naturally by doing so Rizvi found lots of evidence that Taylor did not look for.
The vast majority of the 759 matches that Rizvi finds between the 63-word passage from Macbeth and other dramatists’ works are collocations of function words. Typical are:
#74 [is] [this] [so]? FIRST WITCH [Ay], sir, all [this] [is] [so]. But [why]
(from Macbeth)
which matches with
[This] [was] thy daughter. TITUS [Why], Marcus, [so] she [is]. LUCIUS [Ay] me, [this]
(from Titus Andronicus)
The number I have put after the hash symbol refers to the ordinal position of the evidence within Rizvi’s list of matches provided at
http://shakespearestext.com/micro/collocations-macbeth.htm
and I have put square brackets around the words that are common to the two passages, for which Rizvi uses boldface type.
Such collocations were not the sort of thing that Taylor’s microattribution method was concerned with: he looked for phrases (n-grams) made of lexical and function words and collocations made of lexical words. Thus Rizvi’s hundreds of ‘matches’ of collocating function words are irrelevant. For example:
#81 [so]? FIRST WITCH [Ay], [sir], all this is [so]. But [why]
(from Macbeth)
which matches with
[Ay] me! what mean you [Sir]? Pla. [Why] there, [why] [so]
(from The Rival Friends)
And for example:
#119 [sir], [all] this [is] so. [But] why Stand Macbeth [thus]
(from Macbeth)
which matches with
[all] [sir] [but] I imagine By your [being] here [thus]
(from The London Prodigal)
And for example:
#171 [But] why Stands Macbeth [thus] amazedly? [Come], sisters, cheer [we]
(from Macbeth)
which matches with
[us] [thus], See here [comes] Collatine, [but]
(from Lucrece)
Of course, because Rizvi is working with lemmatized texts, “being” can match with “is”, “we” can match with “us”, and “come” can match with “comes”.
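How lemmatization produces such matches can be shown with a small sketch. The lemma table below is hypothetical and hand-made, covering only the words in the examples above; a lemmatized corpus would supply its own mapping, and the function names are illustrative.

```python
# Hypothetical, hand-made lemma table covering only the example words;
# a real lemmatized corpus would supply this mapping.
LEMMAS = {"is": "be", "being": "be", "we": "we", "us": "we",
          "come": "come", "comes": "come"}

def lemma(word):
    """Lower-case a word, strip punctuation, and map it to its lemma
    (defaulting to the word itself when the table has no entry)."""
    w = word.lower().strip(",.?!;:")
    return LEMMAS.get(w, w)

def shared_lemmas(passage_a, passage_b):
    """Lemmas that occur in both passages, regardless of order."""
    a = {lemma(w) for w in passage_a.split()}
    b = {lemma(w) for w in passage_b.split()}
    return a & b

# The two passages of match #171 as quoted above.
macbeth = "But why Stands Macbeth thus amazedly? Come, sisters, cheer we"
lucrece = "us thus, See here comes Collatine, but"
print(sorted(shared_lemmas(macbeth, lucrece)))
```

With this table the shared lemmas correspond to the four bracketed words in match #171, although "we"/"us" and "come"/"comes" differ on the surface.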
As well as these hundreds of matches of function words collocating with function words, Rizvi shows hundreds more of function words collocating with just one lexical word. Again this is not what Taylor looked for: he sought n-grams and lexical words collocating with lexical words.
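The three kinds of collocation distinguished here can be separated mechanically once a function-word list is fixed. The sketch below is illustrative only: the list `FUNCTION_WORDS` is a short, hand-picked assumption (which words count as function words is itself a judgement), and the second and third example inputs are hypothetical rather than taken from Rizvi's list.

```python
# An illustrative, far-from-complete list of function words; membership
# of this list is an assumption made for the sketch.
FUNCTION_WORDS = {"is", "this", "so", "ay", "why", "but", "all", "thus",
                  "we", "us", "was", "being"}

def lexical_words(matched):
    """Matched words that are not in the function-word list."""
    return [w for w in matched if w.lower() not in FUNCTION_WORDS]

def kind_of_match(matched):
    """Label a collocation by how many lexical words it contains."""
    n = len(lexical_words(matched))
    if n == 0:
        return "function words only"
    if n == 1:
        return "one lexical word"
    return "lexical collocation"

# The words shared by the two passages of match #74.
print(kind_of_match(["is", "this", "so", "Ay", "why"]))
# Hypothetical inputs for the other two categories.
print(kind_of_match(["why", "thus", "sisters"]))
print(kind_of_match(["stands", "amazedly"]))
```

Only the third category corresponds to the lexical-word collocations described in the passage quoted from Taylor.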
On its own terms, Rizvi’s raw data seems to me likely to be correct. But it just isn’t the evidence used by the method he claims to be replicating. In the second half of his article, Rizvi repeats the process in order to try to critique other publications using the same method and the same flaw vitiates the exercise: he isn’t looking for the same things.
That the data Rizvi is concerned with doesn’t seem to be telling us about authorship is interesting in itself, but it does not constitute a critique of other methods that are concerned with n-grams and the collocation of lexical words.
There is one final question of method that Rizvi’s work raises that isn’t settled by a careful reading of his article. He compiles tables showing in rank order the authorial canons that give the most matches in his lists. It’s not clear just which author gets the counts for which hits when the play in question is thought to be co-authored.
Elsewhere on his website, in the materials that describe his dataset of 527 plays, Rizvi provides two documents that might contain the evidence that tells us how he counts such cases:
“Authors-List per Play.xlsx” and
“Authors-List per Play-Author Combination.xlsx”.
These two spreadsheets contain rows for 1, 2, 3 Henry VI, Henry VIII, Pericles, The Two Noble Kinsmen, Timon of Athens, and Titus Andronicus, and for each the entry under the heading “Author” is “Shakespeare, William” and the entry under “Multi-Author Play?” is “N”. If Rizvi were in fact counting matches to any part of each of those plays as a hit for Shakespeare he would be falsely raising the Shakespeare count and lowering the counts for John Fletcher, George Wilkins, Thomas Middleton, George Peele, and some other authors about whom we are not all in agreement.
But in fact I suspect that Rizvi isn’t making that mistake. The spreadsheets in question appear to be concerned with the raw play texts that Rizvi got from Martin Mueller’s moribund project “Shakespeare His Contemporaries” for which all published weblinks now give “Not found” errors. I suspect that Rizvi wisely divided, by author, the parts of plays that are widely agreed to be co-authored. But he doesn’t say in his article that he did that, and it’s an important detail. This last point illustrates why I have been banging on so much about the need for Rizvi to articulate the details of what he did to create his dataset and how he uses it. The details matter.
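The difference between the two ways of counting discussed above can be sketched with invented data. Everything in this example is hypothetical: the play, the scene numbers, the author labels, and the hits are placeholders, and nothing here is drawn from Rizvi's spreadsheets or claims to show how he actually counted.

```python
from collections import Counter

# Hypothetical hits: (play, scene) pairs in which a match was found.
hits = [("Co-authored Play", "1.1"), ("Co-authored Play", "3.1"),
        ("Co-authored Play", "3.2"), ("Sole-authored Play", "1.7")]

# Scheme A: one author recorded per play, as in a single "Author" column.
play_author = {"Co-authored Play": "Author A",
               "Sole-authored Play": "Author A"}

# Scheme B: a hypothetical per-scene division of the co-authored play.
scene_author = {("Co-authored Play", "3.1"): "Author B",
                ("Co-authored Play", "3.2"): "Author B"}

def tally_by_play(hits):
    """Credit every hit to the single author recorded for the play."""
    return Counter(play_author[play] for play, _ in hits)

def tally_by_scene(hits):
    """Credit each hit to the author of the scene where one is recorded,
    falling back to the play-level author otherwise."""
    return Counter(scene_author.get((play, scene), play_author[play])
                   for play, scene in hits)

print(tally_by_play(hits))
print(tally_by_scene(hits))
```

Under the first scheme Author A receives all four hits; under the second, two of them move to Author B, which is why the choice of scheme needs to be stated.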
Regards
Gabriel Egan
[2]-----------------------------------------------------------------
From: Brian Vickers <
Date: December 9, 2018 at 9:57:13 AM EST
Subject: NOS
Gentle SHAKSPERians
I am pleased that Gabriel Egan should regard as “complimentary” this verdict by Lois Potter: “I ended up wishing that Egan was working in law enforcement, which clearly needs him even more than does scholarship” [ December]. But surely he has already combined both offices by cracking down on anyone who disputes the attributions made by his colleagues at the NOS, rather as he tells us that “there are many people for whom being a programmer is a way of being a Professor of English.” (I hope not.) But it seems strange that given his loyalty to that group, he should express diffidence about contacting them:
I’ll see what I can do for David Auerbach in getting the data he wants, if he’d be so good as to drop me a line. (I can’t legitimately do it at someone else’s behest.) [9 December]
Why should Auerbach write to ask Egan to ask Greatley-Hirsch to provide details of the data files that Greatley-Hirsch told Auerbach that he had not kept, when Egan himself has co-authored several papers with Greatley-Hirsch? Whose “behest” is being uttered here?
Warm regards,
Brian
The Shakespeare Conference: SHK 29.0428 Sunday, 9 December 2018
From: Mike Jensen <
Date: December 8, 2018 at 2:36:34 PM EST
Subject: A Laertes Actor’s Name
I am writing about a rather good Canadian Broadcasting Corporation radio broadcast of Hamlet that was produced in 1961 and directed in Vancouver by Gerald Newman. I should mention the name of the actor who played Laertes, but his last name on the broadcast credits is indistinct to my ear.
The actor’s first name is David, but the last may be Allen, Evan, or something else. I have Googled this within an inch of its life. To satisfy my editors, does anybody know the last name of this actor?
All the best,
Mike Jensen
Contributing Editor, Shakespeare Newsletter
Co-General Editor, Recreational Shakespeare
Author, The Battle of the Bard
author site: www.michaelpjensen.com
The Shakespeare Conference: SHK 29.0427 Saturday, 8 December 2018
From: Gabriel Egan <
Date: December 8, 2018 at 8:00:14 AM EST
Subject: Re: SHAKSPER: NOS
Dear SHAKSPERians
I apologize for misspelling the name of Paul Werstine’s collaborator, and esteemed Shakespearian, the late Barbara Mowat. I should have taken the care to look it up.
Brian Vickers asks “If Egan doesn’t suppose that” Pervez Rizvi meant any insult in assuming he is not a computer programmer “why is he making such a fuss?” The answer was in my next sentence: “This assumption about my limitations makes him [Rizvi] misread my serious interest in the technical details of what he did as my attempts to ‘divert attention’ from his achievements”.
Vickers assumes that being a Professor of English and being a programmer are things a person might do one after the other but not together: “It may well be that Egan had been a programmer before taking up the study of English”. In fact, there are many people for whom being a programmer is a way of being a Professor of English.
Alan Galey has written illuminatingly that we must overcome:
“. . . the assumption of an unbridgeable gap between those working with code and those working with texts and ideas, such that a humanities scholar and a programmer cannot be the same person. To draw a distinction between programming and abstract, poetic thinking is to misunderstand both.” (“Mechanick Exercises: The Question of Technical Competence in Digital Scholarly Editing” in _Electronic Publishing: Politics and Pragmatics_ Edited by Gabriel Egan (Tempe AZ: Iter and the Arizona Center for Medieval and Renaissance Studies, 2010): 81-101, p. 94)
Galey goes into considerable detail on just why it is undesirable to divide a project’s labour so that one person provides the expertise on the texts and their contexts and another provides the computational skills. He quotes Matthew Kirschenbaum arguing that “Computers should not be black boxes but rather understood as engines for creating powerful and persuasive models of the world around us” and looking forward to a time when “. . . an appreciation of how complex ideas can be imagined and expressed as a set of formal procedures—rules, models, algorithms—in the virtual space of a computer will be an essential element of a humanities education”.
Although not “essential”, learning the art and practice of computer programming is already a popular option in the Humanities at my institution and I’d be happy to talk to anyone who is interested in why and how I introduced this option to the English and History degrees we offer. A fine introduction to the topic and invaluable practical teaching materials are available at Adam Crymble’s website “The Programming Historian”.
I’ll see what I can do for David Auerbach in getting the data he wants, if he’d be so good as to drop me a line. (I can’t legitimately do it at someone else’s behest.)
Brian Vickers thinks I’d be “doing this forum a favour” if I would “answer Rizvi’s critique of ‘micro-attribution’”. In the hope that it doesn’t in fact strain the forum’s patience, I’ll send such an answer along next.
Regards
Gabriel Egan
The Shakespeare Conference: SHK 29.0427 Friday, 7 December 2018
[1] From: Gabriel Egan <
Date: December 6, 2018 at 3:30:32 PM EST
Subj: Re: SHAKSPER: NOS
[2] From: Paul Werstine <
Date: December 6, 2018 at 8:13:54 PM EST
Subj: NOS
[3] From: Gabriel Egan <
Date: December 7, 2018 at 4:27:06 AM EST
Subj: Re: SHAKSPER: NOS
[4] From: Brian Vickers <
Date: December 7, 2018 at 5:38:14 AM EST
Subj: Re: SHAKSPER: NOS
[1]-----------------------------------------------------------------
From: Gabriel Egan <
Date: December 6, 2018 at 3:30:32 PM EST
Subject: Re: SHAKSPER: NOS
Dear SHAKSPERians
Those following this thread about authorship attribution may also be interested in these two articles:
- Pervez Rizvi “Authorship attribution for early modern plays using function word adjacency networks: A critical view” American Notes and Queries (2018): Advance Access
- Pervez Rizvi “Small samples and the perils of authorship attribution for acts and scenes” American Notes and Queries (2018): Advance Access
The first is a critique of the Word Adjacency Network method that Segarra et al. (including me) published in several articles, including one in Shakespeare Quarterly in 2016. The second is a critique of the essay “A Supplementary Lexical Test for ‘Arden of Faversham’” by MacDonald P. Jackson in the Authorship Companion to the New Oxford Shakespeare.
In both cases, the authors whose work Rizvi critiques do not accept the validity of the criticisms and are writing rebuttals with a view to publication. This is how it should be: the exchange of views generates new knowledge and gets us closer to reliable methods.
If SHAKSPERians hear, perhaps on this list, that Rizvi has now devastated the scholarship on which the New Oxford Shakespeare is based, I’d advise them to read Rizvi’s articles and the forthcoming rebuttals rather than taking that summary on trust.
Regards
Gabriel Egan
[2]-----------------------------------------------------------------
From: Paul Werstine <
Date: December 6, 2018 at 8:13:54 PM EST
Subject: NOS
In a post of December 4 in which my friend Gabriel Egan rightly is concerned that names of Shakespeareans be spelled correctly on SHAKSPER, he refers to the late Barbara A. Mowat as "Mowett."
Best wishes,
Paul Werstine
[3]-----------------------------------------------------------------
From: Gabriel Egan <
Date: December 7, 2018 at 4:27:06 AM EST
Subject: Re: SHAKSPER: NOS
Dear SHAKSPERians
It's a small point, but . . .
Brian Vickers misquotes Lois Potter’s review of the New Oxford Shakespeare Authorship Companion. According to Vickers “. . . Lois Potter described Egan as a ‘law-enforcer’ . . .”.
Actually, Potter wrote the much more complimentary sentence “Egan’s account of reliable and unreliable methodologies includes some horrific accounts of court cases that were decided on totally inadequate ideas of probability; I ended up wishing that Egan was working in law enforcement, which clearly needs him even more than does scholarship” (Cahiers Elisabethains 94, 2017, p. 154).
The epithet that Vickers puts in quotation marks, “law-enforcer”, and attributes to Potter, appears nowhere in her review.
Regards
Gabriel Egan
[4]-----------------------------------------------------------------
From: Brian Vickers <
Date: December 7, 2018 at 5:38:14 AM EST
Subject: Re: SHAKSPER: NOS
Letter to Shaksper 7.12.18
Gentle SHAKSPERians,
In his latest post Gabriel Egan accuses Pervez Rizvi of in effect “telling me that I shouldn’t worry my pretty little Humanities head” – an unwarranted aspersion of misogyny – “about the technical aspects of the problem.” He then proceeds to demonstrate his knowledge of the technicalities of XML conversion. Out of interest I re-read what Rizvi had written:
It took me months of patient editing to turn those XML files into a database fit for N-gram searching. I did it because I have the technical know-how, not because I have some secret software that I haven’t “disclosed” to Gabriel. It may well be that when Gabriel looks at those files, he doesn’t know what to do. There’s no reason why he should, since he is an English scholar, not a programmer like me.
I have bolded the key words, which reflect an innocent assumption, given that Egan is indeed a Professor of English Literature. It may well be that Egan had been a programmer before taking up the study of English, but we have no way of knowing that. Indeed, Egan admits as much later in his posting:
I don’t suppose that Rizvi means to be insulting when he says that I’m not a programmer, but it was an unwarranted and mistaken assumption about me.
If Egan doesn’t suppose that, why is he making such a fuss? He would be doing this forum a favour if he could answer Rizvi’s critique of “micro-attribution” instead of taking offence and off-loading a great deal of technical know-how that, I suspect, is of little interest to members of this forum. In his survey of XML conversion methods Egan includes
a quick-and-dirty way that non-XML-experts sometimes resort to, and it entails dangers that are not obvious when you first try it. It’s awfully easy to get the wrong results when you use a quick-and-dirty approach.
Given that Rizvi is a programmer he might take offence at what could be regarded as a slight on his professional competence, but I think he’s probably got better things to do.
In conclusion Egan makes his only comment on the scholarly substance of Rizvi’s critique:
I’m disappointed to hear that some materials underpinning the New Oxford Shakespeare Authorship Companion were not available to someone wishing to follow up an argument made there … I stand by the principle of total transparency in science and if the reader who was rebuffed would like to get in touch with me I’ll see what I can do to furnish the materials they’re after.
The phrase “I’m disappointed to hear” sounds as if Egan were unfamiliar with the piece in question, but it’s a chapter in the Authorship Companion which he co-edited, and which included the forthright principle enunciated by Anna Pruitt that “giving readers access to the raw data should be a non-negotiable part of the scholarly acceptance of studies of this nature; without access to the data the experiments cannot be replicated to validate the study” (104). Egan dramatizes the issue in terms of a reader being “rebuffed”, but in fact it was the distinguished critic and software engineer David Auerbach, who wanted to replicate the experiments of Jack Elliott and Brett Greatley-Hirsch which led them to give Arden of Faversham “its rightful place in the canon of [Shakespeare’s] works” (AC, 181). Despite email requests over several months they were unable to supply the data (perhaps Egan will be luckier). I know this because I’m guest editor of the winter issue of the Belgian journal Authorship that will include an essay by Auerbach called “A Critique of Quantitative Methods in Shakespearean Authorial Attribution.”
Warm regards,
Brian
The Shakespeare Conference: SHK 29.0426 Friday, 7 December 2018
From: Hannibal Hamlin <
Date: December 6, 2018 at 3:20:59 PM EST
Subject: RE: SHAKSPER: Ardenwatch
Surely, we can expect the first volumes of Arden 4 before they finish series 3? That’s how it seems to work. Of the making of Shakespeare editions there is no end.
Hannibal