Public Domain Shakespeare Project

Shakespeare Electronic Conference, Vol. 2, No. 234. Tuesday, 24 Sep 1991.
 
 
(1)	Date: 	Sun, 22 Sep 1991 17:22:58 EDT
	From: 	Roy Flannagan
	Subj: 	Not too minimal encoding, and the TEI
 
(2)	Date: 	Mon, 23 Sep 1991 11:59:00 -0400
	From: 	David Bank
	Subj: 	Re: SHK 2.0231  Oxford Text Archive vs Public Domain Texts
 
(3)	Date: 	Mon, 23 Sep 1991 13:13:21 -0400
	From: 	"S. W. Reid" <sreid%kentvm.bitnet@utcs>
	Subj:   Public domain texts
 
(4)	From: 	Michael Warren
	Date: 	Mon, 23 Sep 91 10:38:15 -0700
	Subj: 	Shakespeare texts
 
(5)	From: 	Ken Steele
	Subj: 	Public Domain Shakespeare Project
	Date: 	Tue, 24 Sep 91 1:01:22 EDT
 
 
(1)----------------------------------------------------------------------
 
Date: 		Sun, 22 Sep 1991 17:22:58 EDT
From: 		Roy Flannagan
Subject: 	Not too minimal encoding, and the TEI
 
    	 Though at the moment I am teaching *Sir Gawain* and annotating
    *Paradise Lost*, I still would like to get two words in about encoding
    a public domain Shakespeare.
 
    	 First word: more than enough encoding would be better than too
    little, because every bit of encoding that is added later would have to
    be retrofit to the text.  To retrofit the text with more code might add
    errors to the text.  So the more we can do to begin with, the less has
    to be done by hindsight (or oversight).
 
    	 Second word: to avoid the Babel and the babble, could we decide on
    using the TEI standards (not a big deal to do), on the grounds that
    they will be the standards, and that hundreds of people are working on
    questions just like the ones we will be working on in deciding what to
    encode in literature and how to encode it?  Witness the TEXTCRIT
    people, working as I speak on revising the TEI standards to be flexible
    enough to handle Shakespearean compositors and Old Norse
    scribes.
 
    	 Peace be to Tom Horton, who likes COCOA, and to the TACT people,
    with whom I have worked on Milton, but we still should be aiming toward
    a universal standard like TEI that will work independent of
    hardware and software restrictions and encryptions.
 
    	 In hopes that Tower will fall and that the texts will go peaceably
    into the public domain.
 
    Roy Flannagan
    Editor, *Milton Quarterly*
 
(2)-----------------------------------------------------------------------
 
Date: 		Mon, 23 Sep 1991 11:59:00 -0400
From: 		David Bank
Subject: 2.0231  Oxford Text Archive vs Public Domain Texts
Comment: 	Re: SHK 2.0231  Oxford Text Archive vs Public Domain Texts
 
What interest, if any, would there be in a public domain _William
Shakespeare: The Complete Works_ edited by Peter Alexander? This
was first published in 1951, in the U.K., and is still reckoned by
many the best 'conservative text' available. Most of it is now in
machine-readable form; proofing has yet to be carried out.
 
	David Bank
        Dept. of English
        University of Glasgow
 
(3)------------------------------------------------------------------------
 
Date: 		Mon, 23 Sep 1991 13:13:21 -0400
From: 		"S. W. Reid" <sreid%kentvm.bitnet@utcs>
Subject:      	Public domain texts
 
Like Michael Warren I can't keep up with the turnaround rate
in the dialogue over public domain Shakespeare texts, so I've
been sitting on the sidelines observing the `play', as Steve U
calls it. Having argued unsuccessfully a couple of years ago that
input of quartos and folios as well as other important editions
should be undertaken as part of a large, established editorial
project that should have an electronic as well as printed form,
I've been reluctant to put in my two, now well-worn cents.
 
However, lately one or two messages have appealed for
responses from those who have experience with large-scale
scanning and computer-comparison of texts, so I'm now persuaded
to respond, if only to try to unconfuse myself about the state of
the dialogue and the direction the proposal seems to be going.
Much of what follows recapitulates what Ken Steele, Steve
Urkowitz, Mike Post, Hardy Cook, and others have already said at
various times. What I've tried to do is organize the commentary
under some headings and add a few observations. Perpend.
 
AUDIENCE / PURPOSE
This question has sifted down with surprising dispatch. People
want access to computerized quarto and Folio texts for
production, teaching, and scholarship (mainly textual). These
people consider the WordCruncher Riverside and the modern Oxford
unsatisfactory for their particular uses; but these texts are
more than adequate for those who want access to an electronic
modernized Shakespeare (despite some dissatisfaction with the
software), and there is no need to create yet another such. The
stuff available through Dartmouth and on CD should be ignored.
 
SCANNING / KEYBOARDING
The trend of the dialogue is reluctantly towards keyboarding,
rather than scanning the originals or type facsimiles of them.
This seems to me right. No speed-up occurs through editing a
scanned text that is not up to snuff, and the aggravation level
is high. Despite recent advances in scanners, the show-through,
bad inking, paper distortion, and other irregularities associated
with early printed books will drive a scanner bananas. Long s
will be confused with f, with riotous results in certain cases.
These problems can't be overcome with the sophisticated scanners
that can be taught to learn particular typefaces. It might be
possible to save some time scanning the latest Variorum texts,
but there aren't enough of them to repay writing a routine to
strip the files of the editorial matter at page bottom.
 
INPUT (not encoding)
Keyboarding would seem, then, to be the only practical way. Fact
is, a trained someone can input a play in about 20 hours at most.
If a team is to be used, agreement on procedures will be crucial
(this affects proofing plans and especially coding). Input stints
should be assigned by scenes: the process (like hand collation)
can, for a while, seem quite rewarding as a different kind of
close reading, and ten-line segments would deprive everyone of
this simple joy. Copy (not copy-text, please) for everyone should
be the corrected state of the document. Attempts to get fancy by
recording variant states during input will make the process
unduly complex and not lead to uniform results; a record of
press-variants can be generated separately and users can later
flag their own files as they see fit. If double input is done for
proofreading purposes, the second inputting should use another
but similar text (e.g., a derived quarto like Q2 of {i}1 Henry IV{r} to
compare with Q1).
 
PROOFREADING
This should be done by computer-comparison, preferably against
the Howard-Hill/Oxford texts. Then someone will have to take
further responsibility for final cleanup as surviving goofs are
noticed (no person and no single computer-comparison is perfect).
A bookkeeping system for tracking the corrected state of the
files should be set up from the beginning, and the top of a file
should record its state.
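
As a minimal sketch of what such a computer-comparison might look like, the
Python below lines up two independent keyboardings of the same page and
reports only the lines where they disagree, leaving the choice of reading to
a human proofreader. The function name `compare_transcriptions` and the
sample lines are illustrative assumptions, not part of any existing SHAKSPER
or Oxford tooling; `difflib` is in Python's standard library.

```python
import difflib


def compare_transcriptions(first, second):
    """Compare two keyboardings line by line.

    Returns a list of (operation, line_number_in_first, lines_from_first,
    lines_from_second) tuples, one per block of lines that differ.
    Line numbers are 1-based.
    """
    a = first.splitlines()
    b = second.splitlines()
    # autojunk=False: do not let difflib discard frequently repeated lines
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    report = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op == "equal":
            continue  # identical blocks need no proofreading
        report.append((op, i1 + 1, a[i1:i2], b[j1:j2]))
    return report


if __name__ == "__main__":
    # Invented sample lines, not a transcription of any quarto or Folio.
    input_one = "alpha beta\ngamma delta\nepsilon\n"
    input_two = "alpha beta\ngamma de1ta\nepsilon\n"
    for op, line_no, ours, theirs in compare_transcriptions(input_one, input_two):
        print(op, line_no, ours, theirs)
```

A comparison of this kind only flags disagreement; it cannot catch a slip
that both inputters make in the same place, which is one reason final cleanup
by a person remains necessary.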
 
ENCODING
About this there is less agreement than about other matters. Some want
a `very simple' form of SGML or COCOA; others have suggested less
complex approaches. TEI coding seems to have been eschewed
altogether. With good reason, it seems to me, and for the same
reasons (and out of hard experience) I would suggest erring on
the side of minimal coding.
 
First, the people who want quarto and Folio texts are
interested in minimizing mediation; if they wanted more middle
persons than already lie behind the printed texts, they would be
content with the Riverside or Oxford. Who is going to decide
whether `Stand forth Demetrius' is dialogue or stage-direction?
Or how to divide the scenes in Capulet's orchard and before
Gloucester's castle? Do we really need to perpetuate eighteenth-
century notions of act and scene in these texts? And who is going
to keep track of who decided or imposed what? At least with a modern
edition, you know who's responsible for the text. Whereas it's
true that nobody wants to go back through files to differentiate
speech-prefixes and stage-directions, it's equally true that
nobody wants to go back and strip codes out if they're mucking up
the works, or correct their own copies of the files after
discovering they don't agree with someone else's interpretation.
 
Second, practical input. The fancier you get with coding the
more variability there will be in the results, the higher the
frustration level for keyboarders, and the more difficult the
proofreading. Highly encoded texts will so fill the computer
comparison process with variation as to undermine its aim for
accuracy in the text itself. Whoever supervises this effort will
have enough trouble just getting uniform input on the simplest
level, like spacing after full-stops and before carriage returns.
 
The files should be simple diplomatic transcriptions of the
printed page, observing pagination and lineation, distinguishing
roman and italic `fonts', representing horizontal but not
vertical spacing. Distinctions between swash italics and `normal'
and other typographical niceties should be ignored, catchwords
transcribed and bracketed, and pagination based on signatures.
Otherwise, each inputter ought to ignore our modern uncertainty
principle and pay proper respect to the real author of our author
(according to the lady who used to reside in the next town) by
drawing a firm line between observation and inference, the marks
on the printed page and their presumed significance.
 
It seems very likely that inputters will want to use their
favorite word-processors. If the files can be output in DCA
format, I'm told, they ought to be capable of being imported for
PC conversion to the conference's simple style without further
ado. Once the conventions are uniform for italic, 1/M indention,
centered and flush-right text, the macros and selective/global
searches now available on most word-processors can be used to add
codes to a copy of a file for a particular application, including
codes for line numbering and for speech-prefixes and stage
directions. The trick is to have uniformity at the most basic
level, and that is more likely with simple input than with
elaborate in-stream encoding.
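
A minimal sketch of deriving an application copy from a vanilla file by
global search, assuming one-letter codes in braces such as the {i} and {r}
used in this message. The helper names `strip_codes` and `recode`, and the
mapping to asterisks, are hypothetical choices for illustration only.

```python
import re

# Assumption: every code is a single lowercase letter in braces, e.g. {i} or {r}.
CODE = re.compile(r"\{[a-z]\}")


def strip_codes(text):
    """Return the text with all one-letter brace codes removed."""
    return CODE.sub("", text)


def recode(text, mapping):
    """Replace each brace code found in `mapping`; leave unknown codes alone."""
    return CODE.sub(lambda m: mapping.get(m.group(0), m.group(0)), text)


if __name__ == "__main__":
    line = "{i}1 Henry IV{r}"
    print(strip_codes(line))                      # plain copy
    print(recode(line, {"{i}": "*", "{r}": "*"}))  # asterisk-style emphasis
```

Because the master file stays untouched and each application works on its
own converted copy, a disputed interpretation never has to be stripped back
out of the shared text.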
 
My appeal is for rather vanilla files. If someone then wants
their copies full of COCOA, that's fine. It's more trouble
getting the stuff out, once it's been introduced. The sample of
the Sonnets just submitted by Hardy Cook is a good example of the
general approach, though of course dramatic texts are
typographically more complex. I would only recommend additionally
(1) the use of braces ({}) for coding instead of broken or angle
brackets (<>), which have other useful applications that can get
confused with code delimiters, (2) the shortening of the font
codes to one letter, and (3) the use of off-switches in all cases
rather than paired codes. Thus: {i}SONNETS{r} in his head-title,
rather than <it>SONNETS<it>.
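
The third recommendation can be sketched in a few lines of Python, on the
assumption that a paired code such as <it> simply toggles italic on and off
and that {r} means a return to roman. The function `toggles_to_braces` is a
hypothetical helper written for this illustration, not an existing
conference utility.

```python
def toggles_to_braces(text, toggle="<it>", on="{i}", off="{r}"):
    """Rewrite a repeated toggle code as explicit on and off codes.

    Odd occurrences of `toggle` become `on`, even occurrences become `off`.
    Raises ValueError if the toggle appears an odd number of times, since
    the text would then end with italic still switched on.
    """
    parts = text.split(toggle)
    if len(parts) % 2 == 0:
        raise ValueError("unbalanced toggle code: " + toggle)
    out = [parts[0]]
    for k, part in enumerate(parts[1:], start=1):
        out.append(on if k % 2 == 1 else off)
        out.append(part)
    return "".join(out)


if __name__ == "__main__":
    print(toggles_to_braces("<it>SONNETS<it>"))
```

With explicit off-switches a dropped code shows up as a mismatch that a
simple check can find, whereas a missing toggle silently inverts everything
after it.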
 
In short we need to pursue the two ACCs: accuracy and
accessibility. More than that is unnecessary and, as they used to
say in the USSR, counter-productive.
 
Farewell,
 
Sid
 
S. W. Reid
Kent State Univ.
 
(4)-------------------------------------------------------------------
 
From: 		Michael Warren
Date: 		Mon, 23 Sep 91 10:38:15 -0700
Subject: 	Shakespeare texts [Edited correspondence. -- k.s.]
 
Dear Ken,
 
This debate about public domain texts is an education in itself.
Given precise instructions I'd be happy to enter text in my slow
fashion; I'd also gladly proofread.
 
One of the things we would have to consider if we entered quartos,
for instance, is the occasional problem of what actually
constitutes a word. The computer knows that a space is a space;
the reader of the early text is often not so sure how to
interpret space. But that's just one of the joys in store.
I once arranged to count how many words Kent speaks in QLear;
it required a number of prior decisions.
 
Best wishes
 
Michael
 
UC Santa Cruz
 
(5)-------------------------------------------------------------------
 
From: 		Ken Steele
Subject: 	Public Domain Shakespeare Project
Date: 		Tue, 24 Sep 91 1:01:22 EDT
 
First of all, my apologies for the delay in getting this digest out.
The McMaster node was apparently playing dead for most of this
afternoon, and it was not until an indecent hour of the morning that I
was able to perform my editorial duties.  (This was not entirely
McMaster's fault...)
 
Things continue to grow more interesting day by day in this
Prolegomena for a Public Domain Shakespeare.  I appreciate Sid Reid's
synthesis of the discussion, as I'm starting to lose track of it all,
and particularly his pun on Vanilla and Cocoa.  Judging from the notes
posted today, we're not quite sure whether we want SGML, plain
vanilla, or COCOA codes.  I continue to think that retrofitting
*certain* codes is easier than working them out as a text is
keyboarded -- although unquestionably the minimum should be tagged
from the beginning.
 
I would be very interested in obtaining the public domain Alexander
text of which David Bank speaks, and doubtless so would many others.
I think space could be made on the SHAKSPER Fileserver so that the
files could be available to all; what's more, if the proofreading has
not been completed it might be possible for SHAKSPER to assist in that
process too.  I would definitely like to hear more about these texts!
 
In the following digest (SHAKSPER 2.0235), Lou Burnard makes a
surprising offer of the Oxford Text Archive quartos and folios, on
diskette, at a genuinely affordable price.  I am very pleased to see
that this was possible so quickly, and hope that those of you who have
been interested in the texts can take advantage of his offer.
 
The question which arises now is, of course, if we can obtain the
public domain Alexander text, and if the quartos and folios are more
readily available from Oxford, what (if anything) is our priority for
our own project?  One possibility which Lou suggested earlier was that
we could concentrate on additional texts -- for example, Shakespeare's
sources, or plays by his contemporaries.  If the drive to public
domain remains vigorous, we could still aim to produce the quarto and
folio texts, of course.  But, I'll give you a chance to read Lou's
offer, and then I await your responses.
 
						Ken Steele
						University of Toronto