SHAXICON, PART 3, the nature of the database.

Shaxicon is a database, not a computer program.  It indexes the thousands
of "rare words" that appear no more than 12 times in Shakespeare's
canonical plays.  These include words that are not "rare" elsewhere but
are rarely used by Shakespeare (e.g., "family" and "real").  These
"rare" Shakespeare words (i.e., words occurring in the canonical plays
12 times or fewer) are indexed as they appear in the canonical poems,
bad quartos, and other apocryphal or non-Shakespearean texts; also
indexed are words that appear in the canonical and noncanonical poems
and the "bad" quartos but not in the canonical plays.  But the occurrence of "rare"
words in these other texts does not affect the core lexicon, the
boundaries for which are limited to words appearing anywhere in a
canonical text at least once, but never more than 12 times in the
canonical plays.  (The poems, which have a far richer vocabulary than
the plays, introduce chronological skewing if treated as commensurate
with the plays, an error made by the Hieatts in their work on the
Sonnets; hence, occurrences in the poems are indexed but cannot serve to
_exclude_ from Shaxicon those "rare" words that appear 12 or fewer times
in the canonical plays.)
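
The inclusion rule above can be sketched in a few lines of Python.  This
is only an illustration of the stated boundary, not Shaxicon's actual
implementation: the function name, the two-count representation, and the
sample words and counts are all assumptions made for the example.

```python
RARE_MAX = 12  # ceiling on occurrences in the canonical plays, per the text


def in_core_lexicon(play_count: int, poem_count: int) -> bool:
    """Return True if a word falls inside the core lexicon.

    play_count: occurrences in the canonical plays
    poem_count: occurrences in the canonical poems
    (Both counts are hypothetical inputs, not Shaxicon fields.)
    """
    appears_in_canon = (play_count + poem_count) >= 1
    rare_in_plays = play_count <= RARE_MAX
    # poem_count can bring a word into the canon but can never exclude it.
    return appears_in_canon and rare_in_plays


# Hypothetical counts, for illustration only (not Shaxicon data).
samples = {
    "word_a": (12, 40),  # rare in the plays, frequent in the poems: included
    "word_b": (13, 0),   # over the ceiling in the plays: excluded
    "word_c": (0, 2),    # canonical poems only: included
    "word_d": (0, 0),    # found only outside the canon: not in the core
}
core = sorted(w for w, (plays, poems) in samples.items()
              if in_core_lexicon(plays, poems))
print(core)  # ['word_a', 'word_c']
```

Occurrences in bad quartos or other non-canonical texts are left out of
the function on purpose, since the text says they are indexed but do not
affect the core lexicon.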

Also cross-indexed, generously but not yet exhaustively, are an additional
2,000+ STC texts selected from Literature Online ("LION").  Shaxicon was
originally to be cross-indexed with the Vassar Electronic Text Archive;
but Literature Online came along and blew "VETA" off the map, so I have
switched to LION as a more comprehensive text-pool.

Shaxicon charts English literary vocabulary in Shakespeare's lifetime,
with a special focus on Shakespearean texts. Words can be tracked from
their initial appearance in English literature, to Shakespeare, to other
texts; or from a Shakespearean coinage to Shakespearean imitators; and
so on.  Shakespeare's "forgetting" or repetition of his own vocabulary
can likewise be explored.  Shakespeare's reading can be charted:
whenever Shakespeare reads a word-rich text closely and carefully, its
vocabulary explodes into his new writing.  The poet's acquisition of new
vocabulary from other texts has emerged as one of the most reliable
means by which to date Shakespearean texts and revisions.

A principal virtue of Shaxicon is that it can be used to fine-tune its
own chronology.  Another is that it can be used to expose even my own
mistakes as editor of the database. If one imagines the extant
Shakespeare writings (including disputed, and corrupt, and collaborative
texts) as a single linguistic tapestry with a few smudged or damaged
sections, and if one thinks of Shaxicon as a simulacrum of that
tapestry, one can then go about testing for inaccuracies in the
electronic model.  This entails a tedious process of checking and
cross-checking for internal contradiction:  each Shakespearean text must
be evaluated for relative "earliness" or "lateness" against every other
canonical text, as well as against every other text in the STC cross-sample
(most notably those texts that Shakespeare can be shown to have read).

I began, years ago, with the traditional chronology for the plays and
poems. Ever since that first hypothetical sequence was put into place,
it has been tested for inaccuracy or self-contradiction, re-tuned, and
tested again.  This has been a long and arduous process, and one that
has not led to a perfect sequence that is free of all internal
contradiction, but we're working on it.

For example: on the basis of rare-word distributions, TIT-Q1 (the Q1
_Titus_ lexicon) looks to be earlier than ROM-Q2 (the Q2 _Romeo &
Juliet_ lexicon), and ROM-Q2 looks to be earlier than MND-Q1.  One may
thus infer the order: _Tit._ (Q1 version) >--> _Rom._ (Q2 version) >-->
MND (Q1 version).  But my original sequence was structured TIT >--> MND
>--> ROM-Q2.  The correction in subsequent versions represents an
accommodation to statistical rather than historical evidence.  I cannot
prove that Shakespeare wrote these plays in the order Tit., Rom., MND.
I can only show that this order is suggested by the lexical data.
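
As a rough illustration of how pairwise "looks earlier than" judgments
can be combined into a sequence and checked for self-contradiction, here
is a minimal Python sketch.  It takes the pairwise judgments as given
(how Shaxicon derives them from rare-word distributions is not modeled),
and the function name is hypothetical.  Topological sorting is a
technique chosen for the example, not necessarily the method used to
build the Shaxicon chronology.

```python
from graphlib import CycleError, TopologicalSorter


def infer_order(earlier_than):
    """Combine (earlier, later) judgments into one sequence.

    Raises ValueError if the judgments contradict one another,
    e.g. A before B, B before C, and C before A.
    """
    sorter = TopologicalSorter()
    for earlier, later in earlier_than:
        # graphlib expects each node followed by its predecessors.
        sorter.add(later, earlier)
    try:
        return list(sorter.static_order())
    except CycleError as err:
        # err.args[1] holds the cycle of texts that cannot be ordered.
        raise ValueError(f"contradictory judgments: {err.args[1]}") from err


# The two judgments reported in the paragraph above.
judgments = [("TIT-Q1", "ROM-Q2"), ("ROM-Q2", "MND-Q1")]
print(infer_order(judgments))  # ['TIT-Q1', 'ROM-Q2', 'MND-Q1']
```

Adding a third judgment that MND-Q1 looks earlier than TIT-Q1 would make
the function raise an error, which corresponds to the kind of internal
contradiction that calls for re-tuning the sequence.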

When a textual thread is out of place in the simulated tapestry, it
nevertheless remains in the same relative position for ALL lexical
distributions in the Excel data-tables; and because no particular
Shakespeare text has an overwhelmingly lopsided correlation with any
other Shakespeare text, line-item adjustments in the sequence (e.g.,
TIT, ROM, MND, for TIT, MND, ROM) have little effect on the data tables
generated by the corrected sequence.  Presumably, if we were able to
name the precise order in which Shakespeare set down his words from 1590
to 1616, the patterns of his reading and lexical recall would be starkly
clear.  But it will never be possible to place the 900,000+ words
represented by the Quarto and Folio texts in "correct" chronological
order.  I doubt that any of us could do so even with our own writing,
with or without the benefit of dated computer files.

Once a working chronology has been established (and anyone is free to
challenge the one that I and my research assistants have constructed),
one may investigate whether that chronology is borne out by the
intertextual evidence of Shakespeare's borrowing, and of borrowings from
Shakespeare by other writers. For example, Shaxicon indicates that
Shakespeare first encountered Broke's "Romeus and Juliet" while writing
3H6.  Broke's vocabulary explodes into Shakespeare's writing in late
1592, and influences not only ROM but all of the poet's new writing in
the two years following.  If the Shaxicon sequence were to date a text
1593 (after 3H6), one should expect that text to register the lexical
influence of Broke's poem.  If no such influence is apparent, then the
Shaxicon sequence for that text would be suspect.  If a variety of
Shakespeare's source-texts from this period registered similar
anomalies, that would expose a flaw in the chronology and give grounds
for another adjustment in the chronological sequence.  For example: let's
suppose that in a revised and corrected chronology, we were to find that
the Friar Lawrence role appears to be "influencing" texts before Q2 ROM
and not afterward:  then my remarks about Friar Lawrence as one of the
poet's "remembered" texts would, of course, be disproved or at least
rendered highly problematic.
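
The consistency test described above might be sketched as follows.
Everything here is an assumption made for illustration: the function
name, the use of a simple count of shared rare words as the measure of
"influence," the two-year window (taken from the Broke example), the
threshold, and the sample sigla and figures, none of which are Shaxicon
data.

```python
def suspect_texts(sequence, shared_rare_words, encounter_year,
                  window=2, minimum=1):
    """Flag texts whose proposed date predicts a source's influence
    that the lexical evidence does not show.

    sequence: list of (siglum, proposed_year) pairs
    shared_rare_words: siglum -> count of rare words shared with the source
    encounter_year: year the source's vocabulary first appears
    """
    suspects = []
    for siglum, year in sequence:
        # Only texts dated inside the window are expected to show influence.
        in_window = encounter_year < year <= encounter_year + window
        if in_window and shared_rare_words.get(siglum, 0) < minimum:
            suspects.append(siglum)
    return suspects


# Hypothetical sequence and counts, for illustration only.
sequence = [("TEXT-A", 1591), ("TEXT-B", 1593), ("TEXT-C", 1594)]
shared = {"TEXT-A": 0, "TEXT-B": 7, "TEXT-C": 0}
print(suspect_texts(sequence, shared, encounter_year=1592))  # ['TEXT-C']
```

A flagged text is not thereby misdated; the flag only marks a place
where the proposed sequence and the intertextual evidence disagree and
where the chronology deserves a second look.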

I have illustrated something of this research process in D. Foster, "The
Webbing of Romeo and Juliet," _Critical Essays on Shakespeare's <Romeo
and Juliet>_, ed. Joseph Porter (NY: G. K. Hall, 1997), 131-49.  Those
SHAKSPEReans who are interested in Shaxicon will do well to start there;
to facilitate cross-examination of the article by skeptical readers, I
have also posted the "Romeo and Juliet" article from _The Shaxicon
Notebook_ on my Webpage.  ("My" Webpage was constructed by a helpful
student; as Kent Hieatt has observed, the site has been poorly
maintained by yours truly; but I'm pedaling as fast as I can.)

Other commentary by me concerning Shaxicon, none of it perfectly
satisfactory, can be found in the following articles and news-items:

-----. "Forum."   _PMLA_ 112.3 (May 1997): 432-4.

-----. "Shaxicon and Shakespeare's Acting Career."  _SNL_ 46.3 (Fall
1996): 57-8.

-----. "A Funeral Elegy: W[illiam] S[hakespeare]'s 'Best-Speaking
Witnesses.'"  _PMLA_ 111.5 (October 1996): 1080-95; reprinted in
_Shakespeare Studies_ 25 (1997).

-----. "SHAXICON Update."  _SNL_  45.2 (Summer 1995):  1+.

-----. "Reconstructing Shakespeare. Part 3 of 3:  New Directions in
Textual Analysis and Stage History." _SNL_ 41.4 (Winter 1991): 58-59.

-----. "Reconstructing Shakespeare. Part 2 of 3:  The Sonnets." _SNL_
41.3 (Fall 1991): 26-7.

-----. "Reconstructing Shakespeare. Part 1 of 3:  The Roles that
Shakespeare Performed." _SNL_  41.2 (Summer 1991): 16-17.

