Methodology
Although not all potential data sets in the Mapping Digital Technology in Rhetoric and Composition History project are dependent on digital archives, this study's pool of online rhetoric and composition journals certainly is. These works may not currently exist in their original formats or locations. This presents several methodological challenges related to the impermanence of this type of data. As Steven Krause (2007) asserted, digital scholarship can disappear without warning due to changes in a publication’s information architecture that cause hyperlinks to break, a shift in a publication’s intellectual direction, or simple neglect.1 Additionally, the pilot data set collects the institutional affiliations of article authors and journal editors for geographical plotting on the corresponding maps, but not all journals offer this information. In some cases, this omission is a simple inconsistency; however, anecdotal evidence, including personal conversations with founding members of some of the journals in this study, suggests that the decision to omit institutional affiliations was sometimes an attempt to democratize scholarly discourse. This position seems in keeping with the Web’s principles of open access and personal publication, in contrast to the conventions of print. Several authors and editors of online rhetoric and composition journals are also not academics, and given concerted efforts to incorporate the perspectives of artists and private-sector professionals in publications including Kairos and Enculturation, the erasure of institutional affiliation is a rhetorical tactic that identifies individuals primarily through their contributions rather than their institutional attachments.
Although omitting institutional affiliations is perhaps an appropriate and even laudable choice, it makes collecting data about the journals' geographical aspects difficult, and thus, the omission necessitates painstaking research into individuals’ locations at specific times. Such individuated research has the potential to introduce inconsistencies either due to simple error or to the necessity of using different methods to locate individuals. Moreover, omitting location information implicitly casts place as an elitist affectation rather than a valuable shared data point. Disregarding institutional affiliation is also somewhat curious if we consider that a major reason for including affiliation in most publications is to facilitate discourse among authors and readers. Email addresses, forums, and comment functions may encourage conversation in online publications, but these were also included inconsistently. All too often, online journals associated works only with names, effectively erasing their geographical connections and rendering the work a dislocated monologue. This seems to run against both the democratic impulse of online journals and the Web itself. More importantly, this lack of grounding reiterates the fallacious notion of a placeless Web by concealing coherent spatial trends.
Other issues encountered during data collection are characteristic of the journal genre as a whole rather than exclusive to online publications. These included possible discrepancies between a person’s institutional affiliation and the location where he or she lived. (Particularly due to the rise of distance-mediating digital technologies, it is no longer necessary for a person to live where he or she works.) This difference creates a potential ambiguity and also raises a more fundamental question of how to interpret a person’s university affiliation in contrast to his or her location of residence. An additional issue was the potential time lag between a work’s production and its publication. An author may substantially complete a work at one institution and then change institutions before the work is published. This raises a methodological question about how to weigh a work's place of production against its place of publication. Moreover, some works, such as interviews, force the researcher to determine whether the geographical location of the interviewer or the interviewee is of primary importance. Such methodological decisions are consequential, and they require representational choices that underscore research's profound rhetorical investments. This study's guiding principle is to set its grain size at the institutional level whenever possible and to associate people with the institutions listed in the journals themselves. When this information was unavailable, I attempted to identify a person's location at the time of issue publication, either through his or her email domain name or through other biographical research. This protocol does not correct any errors that were recorded in the journals, but it does strive for consistency between the journals and the data.
Determining which online journals would constitute the data set was also a significant, and thoroughly rhetorical, task. Because of the comparatively low barrier to online publication, there are numerous online journals, although many of them existed for only a brief period of time—intentionally or unintentionally—and others are so individuated that they would skew the data set. Because of the ephemerality of online work, some of these publications no longer exist in a readily accessible format. This study's purpose is not to be comprehensive (if such a goal is even possible) but rather to be representative, and as such, its data pool sought to include significant publications. To gauge a source's significance, I cross-referenced three lists of online rhetoric and composition journals: Byron Hawk’s (2008) “Journals in Rhetoric and Composition,” Douglas Eyman's (2005) “Online Journals” page of the Computers and Writing Clearinghouse, and Matthew Levy’s (n.d.) "Journals..." list. Publications were selected from these lists based on whether they released issues frequently and consistently, began publishing at an especially early date, or had authors and editors who published in other journals or served on editorial boards. From these criteria I generated a representative sample of ten online rhetoric and composition journals spanning the years 1994–2008: Academic.Writing; Across the Disciplines; Computers and Composition Online; Currents in Electronic Literacy; Enculturation; Inventio; Kairos; PRE/TEXT: Electra(Lite); The Electronic Journal for Computer Writing, Rhetoric and Literature; and The Writing Instructor.2 This study's data pool seeks to encompass the pioneering, prolific, and durable online publications in the field.
The data's grain size is the institution, and three geographical variables are tracked for each journal issue within the sample period: the location of article and review authors; the location of editors; and the location of associated sponsoring institutions. Each variable was assigned a specific numerical value based upon its relative impact on the journal, and numerical totals were used to compute the scale of corresponding map symbols. The numerical value of an issue’s sponsoring institution was four, the largest value. This coding denotes the marked effect of the sponsoring institution on other variables such as the location of editorial staff, who generally cluster at the sponsoring institution, particularly early in a journal’s existence. It was also the rarest of the three variables, as each issue had only one sponsoring institution but many editors and authors, so this variable was assigned the largest numerical value to give it a measure of parity. The numerical value of each issue editor was three, because editors have a significant impact on the whole of an issue's content by selecting which works appear and by determining the theme an issue might have.3 Authors were assigned a numerical value of two for each article produced in an issue and a numerical value of one for each review. The classification distinction between articles and reviews was predicated upon a work’s amount of content. Works of more than 3,000 words, or approximately seven pages, were considered articles; shorter works were considered reviews. This threshold was an average based upon the lengths of works labeled as articles and reviews in the journals themselves, and the classification was adjusted on a per-case basis according to the amount of multimedia content in a work, such as audio/video clips, images, interactive animations, and other elements.
The numerical values of articles and reviews were divided equally among the authors of multi-authored works, because awarding full value to each author of a multi-authored work would distort its relative weight. Statements in journals reprinted from other sources without significant modification, such as CFPs and conference announcements, were not tracked.
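The valuation scheme above can be sketched as a short routine. The function and data names here are hypothetical, not part of the study's actual tooling; the sketch simply illustrates how the weights (sponsoring institution = 4, editor = 3, article = 2, review = 1) and the equal division among co-authors combine into per-institution totals for one issue.

```python
from collections import defaultdict

# Weights from the study's valuation scheme:
# sponsoring institution = 4, editor = 3, article = 2, review = 1.
WEIGHTS = {"sponsor": 4, "editor": 3, "article": 2, "review": 1}

def issue_location_totals(sponsor, editors, works):
    """Tally weighted values per institution for one journal issue.

    sponsor: institution name (one per issue)
    editors: list of institution names, one entry per editor
    works:   list of (kind, [author_institutions]) tuples, where kind is
             "article" or "review"; a work's value is split equally
             among its co-authors, per the study's multi-author rule.
    """
    totals = defaultdict(float)
    totals[sponsor] += WEIGHTS["sponsor"]
    for inst in editors:
        totals[inst] += WEIGHTS["editor"]
    for kind, author_insts in works:
        share = WEIGHTS[kind] / len(author_insts)  # equal split
        for inst in author_insts:
            totals[inst] += share
    return dict(totals)
```

These per-institution totals correspond to the numbers used to scale map symbols; reprinted material such as CFPs would simply not appear in the `works` list.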
At this point, it is worthwhile to acknowledge the subjective—but not arbitrary—nature of this study’s data weighting. Although I assert that its valuation framework is legitimate, it is the contestable result of grappling with the challenges of tracking this material. One such difficulty was determining the distinction between articles and reviews. Although I deemed this differentiation necessary (and it was present in most journals’ own classifications), using the criterion of word or page count was unsatisfying and possibly imprecise. Formal experimentation has flourished on the Web, and many online rhetoric and composition journals explicitly made such experimentation part of their espoused ethos. As a result, some works that journals identified as articles were too short to qualify for this designation by the study’s parameters. Additionally, some works labeled as reviews—of books, websites, software—were long enough to qualify as articles by this study's standards. Moreover, it was difficult to evaluate works that were not primarily textual. Many works were multimedia texts that incorporated sound files, video clips, and interactive elements. Some works were mostly collections of hyperlinks to external webpages, which raises the question of how to assess the original contributions of such material. My policy was to adhere to the study’s length-based criterion, but throughout I was reminded that such a parameter is not sophisticated enough to handle the experimental work in this medium. However, it would seem that any set of criteria would be imperfect. An alternative might be to flatten quantitative distinctions and count all authored material equivalently. This is a simpler option, but it would negate the nuanced distinction between extended works of original scholarship and shorter works of review or description (a distinction that is still valuable both in my estimation and that of most journals in the data pool).
More importantly, this tactic does not escape the question of valuation; it merely offers a different answer with its own set of caveats. The data valuing process reveals the inescapably rhetorical nature of the researcher’s task, even in something as seemingly algorithmic as quantitative document classification.
A further difficulty of valuation with this data set was the varying disciplinary significance of different journals. The MLA Directory of Periodicals provides detailed data that could be used to evaluate a journal's reputation within the field. Individual works could then be variably weighted according to their publication venue. However, not all of the journals in the study's data set are in this index, and making judgments about journal significance would add another layer of disputable subjectivity to the weighting schema. As such, this variable was not incorporated into the data. Nevertheless, factoring this aspect into project maps may present the data in productive ways and allow valuable patterns to emerge. It is certainly an option for future study.
To collect data, I visited the websites of the online journals themselves and reviewed their archives to obtain location data for article authors, journal editors, and journal-sponsoring institutions.4 Additionally, because this study also examines the geographical movement of disciplinary concepts, yearly concordances of article/review titles and special issue themes were needed to identify prominent concepts so that they could be connected with locations. I managed this task by creating text files (available for download through the "Download data files" link in the "Resources" box at right) that collect all article/review titles and issue themes for all of the journals produced during one calendar year. Each article/review title appears once in the text file; journal issue themes appear four times to give them appropriate proportional weight, corresponding to the ratio of one to four established by the study's data valuation parameters.5 I then generated a concordance of each text file to identify the five most frequently occurring terms in each year.6 I identified the locations of authors and sponsoring institutions that used these terms in article/review titles and journal issue themes and tabulated the frequency of term recurrence in each location. This process was repeated for each of the five most frequently occurring terms for all years between 1996 and 2008 (the years 1994 and 1995 were omitted due to insufficient data). Terms tied in rank in 1998, 2002, 2003, 2004, 2005, and 2007. In 2007, seven terms appeared instead of five because of multiple ties in rank.
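The concordance procedure described above can be approximated in code. The stopword list, the aggregation table, and the function name below are illustrative assumptions rather than the study's actual implementation; the sketch shows how titles counted once and themes counted four times yield a ranked term list, with variant forms folded under one head term (as with "pedagogical"/"pedagogies"/"pedagogy" in note 6).

```python
import re
from collections import Counter

# Hypothetical stopword list; the study's actual filtering is not specified.
STOPWORDS = {"the", "a", "an", "and", "of", "in", "for", "on", "to"}

# Hypothetical aggregation map folding variant forms under one head term,
# mirroring the grouping described in note 6.
AGGREGATE = {"pedagogical": "pedagogy", "pedagogies": "pedagogy"}

def top_terms(titles, themes, n=5):
    """Rank terms for one calendar year of journal output.

    titles: article/review titles (each counted once)
    themes: special-issue themes (each counted four times, matching the
            one-to-four ratio in the study's valuation parameters)
    """
    counts = Counter()
    weighted = [(t, 1) for t in titles] + [(t, 4) for t in themes]
    for text, weight in weighted:
        for word in re.findall(r"[a-z]+", text.lower()):
            if word in STOPWORDS:
                continue
            counts[AGGREGATE.get(word, word)] += weight
    return counts.most_common(n)  # may exceed n-way cuts when ranks tie
```

Ties in rank, as occurred in several study years, would surface here as terms with equal counts; a fuller version would return every term tied at the cutoff frequency.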
I recorded study data in a spreadsheet comprising five worksheets (available for download through the "Download data files" link in the "Resources" box at right). I used a spreadsheet rather than a database for data recording because its interface is familiar and accessible to a broad audience and the file format is flexible enough to be used by many different software applications. These aspects give other researchers easy and effective access to study resources. Spreadsheet applications also enable fairly sophisticated data manipulation, such as sorting and filtering, which was vital to produce this study's maps.
The five study worksheets are divided into three primary worksheets titled “authors,” “editors,” and “sponsoring institutions,” which collect raw data, and two secondary worksheets titled “PPSM results” and “CMM results,” which contain aggregated and processed data ready to be transformed into maps. All worksheets include column headers corresponding to pertinent data fields. This material was converted into multiple .kml files readable by Google Earth (available for download through the "Download Google Earth maps" link in the "Resources" box at right) through the use of Stephen Morse’s online “Batch Conversions of Addresses to Latitude/Longitude” service and Bill Clark’s online “Excel to KML” converter. When opened in Google Earth, these files display the study's interactive PPSM and CMM.
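For readers curious about the target format, the spreadsheet-to-KML step can be illustrated with a short sketch. The study itself used the online converters named above; this hypothetical routine only shows the minimal KML structure Google Earth expects, with each placemark's symbol scale derived from the study's numerical totals.

```python
def rows_to_kml(rows):
    """Render (name, latitude, longitude, scale) rows as a minimal KML
    document of the kind Google Earth reads. Note that KML orders
    coordinates longitude-first."""
    placemarks = []
    for name, lat, lon, scale in rows:
        placemarks.append(
            "    <Placemark>\n"
            f"      <name>{name}</name>\n"
            f"      <Style><IconStyle><scale>{scale}</scale></IconStyle></Style>\n"
            f"      <Point><coordinates>{lon},{lat},0</coordinates></Point>\n"
            "    </Placemark>"
        )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<kml xmlns="http://www.opengis.net/kml/2.2">\n'
        "  <Document>\n" + "\n".join(placemarks) + "\n  </Document>\n</kml>"
    )
```

Because the output is plain XML, any text editor (or any program that parses XML) can read and modify such files, which bears on the accessibility points discussed below.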
I selected Google Earth as this study's map platform for several reasons, including its growing prevalence in academic and scholarly contexts, as discussed in Patricia Cohen's (2011) recent New York Times article "Digital Maps Are Giving Scholars the Historical Lay of the Land." Jeffrey Young (2008) revealed in The Chronicle of Higher Education that Google Earth is a viable platform for academic historical mapping projects similar to this one, such as the University of Richmond’s Voting America map of United States political election data since 1840.
Accessibility is important to the Mapping Digital Technology in Rhetoric and Composition History project because of its collaborative ethos. The Google Earth desktop client has been translated into thirteen languages and is available without cost for multiple computer operating systems. Google Earth Project Manager Chikai Ohazama (2008) reported in “Truly Global” that by February of 2008 the program had been downloaded over 350 million times by people around the world. Additionally, the .kml file format used by Google Earth is controlled by an international standards organization, the Open Geospatial Consortium, rather than by a commercial entity. Any program that can open plain text files can open and edit files in this XML-based format, and because .kml specifications are controlled by an autonomous standards body, they cannot be capriciously leveraged to limit the access of programs or individuals for economic gain. Open formats and free, widespread, familiar software foster a spirit of academic collaboration in general, and they are particularly suitable to this project's orientation.
It must be noted, however, that Google Earth’s accessibility for disabled persons, particularly the visually impaired, is somewhat limited. A fundamental issue of geographical mapping is that it is a highly visual endeavor. Visually impaired photographer Tim O’Brien (2009) wrote in his review of Google Earth that it is a “profoundly visual program,” and that if the software is to be serviceable for the visually impaired it “would need to be rethought completely using tactile and auditory interfaces.” Nevertheless, O’Brien (2009) pointed out that recent versions of the software have taken steps to become more accessible by increasing default font sizes and using text outlines to heighten contrast between symbol labels and their backgrounds. Perhaps more fundamentally, because Google Earth files are stored in .kml format, it would be possible to create programs that present Google Earth files through the sort of tactile and auditory interfaces O’Brien imagined.
Google Earth also has the great advantage of functioning in both online and offline formats. In addition to the full-featured Google Earth desktop client, Google also produces a free, downloadable plug-in for multiple Web browsers and operating systems that allows Google Earth visualizations to be viewed and manipulated online (albeit with reduced functionality). The Mapping Digital Technology in Rhetoric and Composition History project offers both downloadable .kml files for use with the Google Earth desktop client and an online map interface for use with the corresponding browser plug-in. This study's maps may be accessed online in the "Online Map Interface" section.
These features of Google Earth—burgeoning prevalence in academia, compatibility with multiple languages and operating systems, availability at no cost, and use of open file formats—are well in keeping with the project's principles of open access and collaboration. All project map files and raw data are also freely downloadable from this website in open formats including .kml and .txt and released into the public domain. These measures attempt to enact the project's approach to history-making by encouraging broad dissemination, distributed collaboration, and the production of multiple, competing narratives.
1 Ashley Holmes's "The Essence of the Path: A Traveler's Tale of Finding Place" in this special issue also addresses the loss of a digital text.
2 The sample period ends at 2008 because data collection is an ongoing, time-intensive process.
3 Full editorial boards were logged in project data records, but only those persons listed as functional editors were tracked in visualizations. Editorial boards tend to be very large, and the actual involvement of board members in issue production varies greatly between different journals. In some cases, editorial board membership is essentially an honorific title rather than a duty with defined responsibilities. Including whole editorial boards in project visualizations introduced statistical noise without clarifying comprehensible patterns. As such, editorial board member data is included in the project data but not reflected in the study's maps.
4 When whole journals or individual issues were no longer available, I used the Internet Archive’s Wayback Machine, which houses chronological snapshots of Web content. I acknowledge an immense debt to this resource, without which this study would have been impossible.
5 Concordance files do not distinguish between the relative weights of article and review titles, because the purpose of these data was to identify relevant terms, or more broadly, what was being discussed in the journals, whether that discussion occurred in a short or long format.
6 Similar terms were aggregated. For example, “pedagogical,” “pedagogies,” and “pedagogy” were all combined under the same head.