Public Domain Shakespeare
Shakespeare Electronic Conference, Vol. 2, No. 232. Sunday, 22 Sep 1991.
 
 
(1)	Date: 	Fri, 20 Sep 91 14:59:53 EST
	From: 	John T. Aney
	Subj: 	RE: SHK 2.0231  Oxford Text Archive vs Public Domain Texts
 
(2)	Date: 	Fri, 20 Sep 91 13:10:02 EDT
	From: 	Tom Horton
	Subj: 	Mark-up vs. plain ASCII
 
(3)	Date: 	Sun, 22 Sep 91 09:59:53 EST
	From: 	Ken Steele
	Subj: 	What I meant by "minimal encoding"
 
 
(1)--------------------------------------------------------------------------
 
Date: 		Fri, 20 Sep 91 14:59:53 EST
From: 		John T. Aney
Subject: 2.0231  Oxford Text Archive vs Public Domain Texts
Comment: 	RE: SHK 2.0231  Oxford Text Archive vs Public Domain Texts
 
Ken:
 
Bravo on your critique of the OTA's policies regarding electronic
texts of the Quartos and Folios.  You are absolutely correct--it is
high time these texts became more available to the scholar and theatre
practitioner, both casual and dedicated.
 
As an actor and founding member of a brand-new, "text based" Shakespeare
Company in Portland, Oregon (the Tygre's Heart Shakespeare Company), I
recognize the significance that access to electronic texts of Folios
and Quartos has on performance.  Tygre's Heart credo includes the
integration of scholarship and performance, and encourages actors to
do their own scholarship using the Quartos and Folios.  At this point,
this is not always easy, since even the 1623 Folio is difficult to
find (at least in Portland) and is not cheap.  Having access to these
texts on computer would allow actors to easily do their required
textual analysis, as well as allow our directors to print their
own acting texts and do their own "homework."
 
I don't really have much to add to the conversation at this point,
other than to say "count me in!"  As a performer, the possibility
of having these texts readily available is tantalizing to say the
least.  As a scholar, the opportunity to participate in such a
potentially far-reaching project is thrilling.
 
John T. Aney
 
(2)-------------------------------------------------------------------------
 
Date: 		Fri, 20 Sep 91 13:10:02 EDT
From: 		Tom Horton
Subject: 	Mark-up vs. plain ASCII
 
While I understand the reasoning behind the suggestion to prepare simple ASCII
texts and add mark-up later, I feel fairly strongly that this would be a
mistake.   [Long-ish plea for minimum mark-up follows.  Sorry for the
length, but please read if interested -- there are simple compromises!]
 
Yes, there are different systems of mark-up with varying degrees of
complexity, but one must realize that within one system (TEI, TACT, etc.) one
can keep it simple.   It is much preferable and very much easier to add the
*basic* mark-up at the beginning.   I've done it both ways (a lot), so this
conclusion is based on my own experience.
 
Once any kind of mark-up is inserted in a text, at least one can use a text
editor or word-processor to search for it and change it.  If there's no
mark-up, one may have to go line by line to do the most simple things.
(Which is what I had to do with the copyrighted texts on the NeXT, which
have no mark-up, just font changes, to be able to use them with a
concordance program for an English class here.)
 
But do keep it simple.   For example, if you were to adopt COCOA-style
references used by TACT, then something like this might be fine:
 
  <T Romeo and Juliet>        	-- means "title of the play"
  <I 1.1>			-- means "beginning of Act 1, Scene 1"
  <D Enter Romeo.>		-- stage directions set apart from text
      or
  <D 1> Enter Romeo. <D 0>	-- stage directions distinguished from text
  <S Juliet.>  Romeo, Romeo,... -- speech prefix
 
Perhaps nothing else. (Maybe line numbers, but this is edition-dependent.)
 
IMPORTANT: The person typing the text in doesn't even have to type all
this.  He or she could simply surround stage directions with $...$, speech
prefixes with +...+, act/scene identifiers with #...#, etc.  This shouldn't
be that hard, should it?  (I had professional typists do this with 8 plays,
and they had no problems and it made my life much simpler.)  Then some
computer hacker (like me or Ken) could use a good editor to convert this to
the TACT mark-up given above.
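
[Editorial sketch of the conversion step described above.  It assumes
the typist's delimiters map to tags exactly as in the examples ($ for
stage directions, + for speech prefixes, # for act/scene), that each
marked span sits on one line, and that none of the three characters
appears in the text itself.  The names DELIMITER_TAGS and
shorthand_to_cocoa are made up for illustration.]

```python
import re

# Assumed mapping from the typist's delimiters to COCOA-style tag letters,
# taken from the examples in this message.
DELIMITER_TAGS = {"$": "D", "+": "S", "#": "I"}


def shorthand_to_cocoa(text):
    """Rewrite $...$, +...+ and #...# spans as <D ...>, <S ...>, <I ...>."""
    for delim, tag in DELIMITER_TAGS.items():
        d = re.escape(delim)
        # A span is: delimiter, one or more characters that are neither the
        # delimiter nor a newline, then the delimiter again.
        pattern = re.compile(d + r"([^" + d + r"\n]+)" + d)
        text = pattern.sub(
            lambda m, t=tag: "<%s %s>" % (t, m.group(1).strip()), text
        )
    return text


if __name__ == "__main__":
    sample = "#1.1#\n$Enter Romeo.$\n+Juliet.+ Romeo, Romeo,..."
    print(shorthand_to_cocoa(sample))
```

[Keeping the delimiter table in one place means a different typing
convention only needs a change to that table.]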
 
Some of the advantages of having some mark-up include:
 
   It seems to me a lot of people might want to distinguish speech prefixes
   and stage directions from speech.  If you want to do a vocabulary study and
   your counting program ignores all text between [ and ] (square brackets),
   then use your favorite text editor to do a global substitution of < to a
   left bracket and > to a right bracket.  Otherwise, your vocabulary counts
   are going to be wrong because of speech prefixes and non-authorial stage
   directions.
 
   If you wanted to print the text on your nice printer with speech prefixes
   in bold and stage directions in italics, you can use search (maybe even
   search and replace) to find the next thing to change.
 
   If you want no markup at all, the markup described above can be stripped
   out completely with a very short program or a UNIX utility.  (I'll do this
   for the group -- either writing the program or stripping marked-up texts.)
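
[Editorial sketch of the "very short program" mentioned above, plus
the vocabulary count it makes possible.  It assumes the self-contained
tag form (<D Enter Romeo.>, <S Juliet.>) rather than the <D 1> ... <D 0>
toggles, and that angle brackets never occur in the play text.  The
function names and the crude word pattern are illustrative only.]

```python
import re
from collections import Counter

# One COCOA-style tag on a single line, e.g. <S Juliet.> or <D Enter Romeo.>
TAG = re.compile(r"<[^<>\n]*>")


def strip_markup(text):
    """Remove every <...> tag, leaving only the untagged text."""
    return TAG.sub("", text)


def vocabulary_counts(text):
    """Count words in the spoken text, ignoring everything inside tags."""
    spoken = strip_markup(text).lower()
    # Crude word pattern (letters and apostrophes); an assumption, not a
    # claim about how any concordance program tokenizes.
    return Counter(re.findall(r"[a-z']+", spoken))


if __name__ == "__main__":
    sample = "<D Enter Romeo.>\n<S Juliet.> Romeo, Romeo, wherefore art thou Romeo?"
    print(strip_markup(sample))
    print(vocabulary_counts(sample).most_common(3))
```

[Because the speech prefix and the stage direction sit inside tags, the
"Romeo" in "Enter Romeo." and the prefix "Juliet." do not inflate the
counts.]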
 
Of course, someone with sophisticated needs like Ken will probably want to
add mark-up for many more things (line numbers, verse versus prose,
instances of rhetoric, etc.)  This should be added later, but the basics
should go in at the start.  If there's anything I've learned in the 10
years I've been working with machine-readable Shakespeare, it's that doing
it right at the beginning will save one a great deal of time later on.
 
Sorry for the verbiage.  Just a programmer's view....  Trying to help you
not repeat the mistakes of my past, but perhaps you're on a different road
that will miss those potholes!  Again, determining what most of you want to
use these for is crucial.   People like me, with text analysis needs,
probably already have marked up texts. (Therefore, feel free to ignore us!)
 
Tom
 
(3)-------------------------------------------------------------------------
 
Date: 		Sun, 22 Sep 91 09:59:53 EST
From: 		Ken Steele
Subject: 	What I meant by "minimal encoding"
 
Sorry, Tom, if I worried you with my suggestion that the simpler, the
better, when it comes to encoding the Public Domain Shakespeare texts.
Of course this has limits; I thought I mentioned at some point that
there are things much easier to encode the first time through, rather
than trying to add them later.  In this category I would probably NOT
include act, scene, or page/signature, although all of these things
are going to be important to everyone, simply because they can be
added very easily after the text has been entered.
 
The distinction between speech prefix and stage direction is
important, and should be done from the beginning.  Likewise some
distinction of homonyms might be advisable -- Howard-Hill affixed a
"#" to the beginning of "art" when it was a noun rather than a verb,
for example.  And the distinction of French, German, Italian, Latin,
whatever is quite difficult to do after the fact, scanning a text, and
should be done the first time around, if possible.
 
I'd define "minimal encoding" as the following, although some of this
could be added in the editing stages:
 
	Play Title	eg. <T Hamlet Q1>
	Act/Scene  	eg. <A 3.2>
	Line		[this should probably be added mechanically]
	Direction	eg. <D Enter {Hamlet}.>
	Speech Prefix	eg. <S {Ham.}>
	Font		eg. {these words in italic}
	Language	eg. <L Latin> ergo <L English>
 
Title and Act/Scene are easily added later by an editor, and needn't
trouble the keyboardist -- although text which actually appears in the
facsimile should be included.  Although this simple handling of Stage
Directions and Speech Prefixes may not meet everyone's needs, it would
be simple to convert to something more useful.  I've found the curly
brace method of distinguishing italics is useful visually, and is
relatively simple to convert using search-and-replace to something
else should the need arise.
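
[Editorial sketch of the search-and-replace conversion mentioned above.
It assumes curly braces are used only for italics, are never nested,
and always come in matched pairs.  The function name and the default
underscore markers are made up for illustration; any other opening and
closing strings can be passed in.]

```python
import re

# One italic span: an opening brace, any run of non-brace characters,
# and a closing brace.
BRACED = re.compile(r"\{([^{}]*)\}")


def convert_italics(text, opening="_", closing="_"):
    """Replace each {italic span} with opening + span + closing."""
    # A function replacement avoids any special handling of backslashes
    # in the opening/closing strings.
    return BRACED.sub(lambda m: opening + m.group(1) + closing, text)


if __name__ == "__main__":
    print(convert_italics("<D Enter {Hamlet}.>"))
    print(convert_italics("<S {Ham.}>", opening="*", closing="*"))
```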
 
Some other things which might be added by an editor with sufficient
resources and information are the following:
 
	Speaker (not always clear or the same as prefix)
	Verse/Prose (not always the same as the original ed.)
	Signatures/Pages/Formes
	Compositor Stints (usually a little theoretical)
 
I would still consider this minimal encoding; what I meant to suggest
in my attraction to the simplicity of Hardy Cook's approach was that
I'd still like the ASCII text to reflect the appearance of the
original, particularly if the users won't have access to it.
 
 
					Ken Steele
					University of Toronto
 

©2011 Hardy Cook. All rights reserved.