Tools for DH–Humanities in Action

I am teaching a partial-credit course for the Languages and Cultures and Humanities Residential Colleges this year, called Humanities in Action. This is a project-based course that meets once a week over supper to develop ideas for humanities-based projects and to design and build them as a group.

As I have been working with students and faculty on various DH projects, I have wanted a single page where I gather the sites, tools, and tutorials that are helpful to us. I am compiling them here. This is still a work in progress, but it might be useful to colleagues out there too. In DH we tend to learn collaboratively, so many of the tutorials are adapted from colleagues who have pioneered this approach to teaching, such as Miriam Posner at UCLA and Alan Liu at UCSB.

There are some basic tools that can help you with your DH projects, whether you know programming or not.  Here are some of the ones my students and I have found most useful.  Tutorials are also linked.

Text Analysis

Network visualization of terms in Wikipedia’s RPG game descriptions (by Jiayu Huang for HUMN 270 Fall 2015)

Voyant is the best multi-tool text analysis platform to start with. The version online is the earlier release, available as part of a suite of tools here. There is a newer version of Voyant that brings those separate tools into one interface, so you don’t have to switch between them. If you want to use it, ask me; I have a copy on my thumb drive.

Voyant is very good as a concordance and frequency-analysis visualization tool. It can handle large amounts of text in multiple files, and it lets you compare different texts easily: which words come up most frequently in which texts? Which terms are collocated? What are the vocabulary densities of the different texts?
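
For a sense of what a tool like Voyant computes under the hood, the basic word-frequency and vocabulary-density measures can be sketched in a few lines of Python. This is a toy illustration with made-up sample sentences, not Voyant’s actual code:

```python
# Toy illustration (not Voyant's actual code): the basic word-frequency
# and vocabulary-density measures Voyant reports, for two sample texts.
from collections import Counter
import re

def tokenize(text):
    """Lowercase a text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def profile(text):
    """Return the three most frequent words and the vocabulary density."""
    tokens = tokenize(text)
    counts = Counter(tokens)
    density = len(counts) / len(tokens)  # unique words / total words
    return counts.most_common(3), round(density, 2)

text_a = "the river rose and the river fell as the rains came"
text_b = "preachers travelled far along unfamiliar winding paths"

print(profile(text_a))  # text_a: 8 unique words / 11 tokens -> density 0.73
print(profile(text_b))
```

Vocabulary density (unique words divided by total words) is one of the simplest ways to compare how repetitive or varied different texts are.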

Here is a tutorial for Voyant 2.0.

There are also sites and tools for analyzing large amounts of text data from a macro, high-level perspective: for example, the Google Ngram Viewer, which visualizes word frequencies across the corpus of Google’s digitized books (in multiple languages), and Bookworm, which visualizes trends in repositories of digitized texts.

Topic Modeling

Topic modeling is a method by which your text is divided into chunks and a computer works out the most important topics in those chunks. The algorithm is not interested in meaning, just in words that occur together. The best tool for this is MALLET, a Java-based package for statistical natural language processing, document classification, clustering, topic modeling, information extraction, and other machine-learning applications to text. MALLET includes sophisticated tools for document classification: efficient routines for converting text to “features”, a wide variety of algorithms (including Naïve Bayes, Maximum Entropy, and Decision Trees), and code for evaluating classifier performance using several commonly used metrics.

But if you are not comfortable with the command line, there is also an online version that works for smaller amounts of text. It can be found here.

There is also a nice demo tool for identifying topics, themes, sentiment, and concepts at AlchemyAPI.

Miriam Posner has written a great blog post about how to interpret the results of topic modeling.

Mapping

There are lots of online platforms out there for mapping data. It all depends on how fancy you want to get and whether you want to do more than map points.

CartoDB is definitely fast and flexible. If you have a CSV with geo-coordinates, you can upload it and have a map in seconds. It also has a geocoder that can quickly turn your list of places into latitude/longitude pairs.
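
The sort of minimal CSV this works from can be sketched like so (the column names and the approximate coordinates are my own illustration; CartoDB’s geocoder can also derive coordinates from plain place names):

```python
# A sketch of the minimal CSV CartoDB can map directly: one row per
# place with latitude/longitude columns. The column names and the
# (approximate) coordinates are my own illustration.
import csv
import io

places = [
    ("Sunbury, PA", 40.8626, -76.7944),
    ("Lewisburg, PA", 40.9645, -76.8844),
]

buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["name", "latitude", "longitude"])
writer.writerows(places)
print(buf.getvalue())
```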

Another way to go is Google Fusion Tables. Again, this is a super fast and easy way to map data. You can also produce a nice “card view” of entries that turns your Excel spreadsheets into a much more reader-friendly format, and there are multiple other ways to visualize your data in graphs, networks, and pie charts.

ArcGIS Online is another way to go if you want to produce far more sophisticated mapping visualizations, such as Story Maps and Presentations. Bucknell has an institutional account; if you want to use it, let me know.

Palladio is an interesting multi-dimensional tool from Stanford’s Humanities + Design lab. It can produce maps, networks, timelines, and graphs of your data. Here is a tutorial for my HUMN 270 class, written by Miriam Posner.

Building 3D Models

This is a part of DH I have not yet ventured into, but others on campus definitely have! The easiest entry into modeling is SketchUp. We also have Rhino loaded on the machines in Coleman 220, where it is regularly used in Joe Meiser’s Digital Sculpture class.

Timelines

There are various platforms out there for constructing digital timelines that allow for the inclusion of multimedia elements; one of them, Timemapper, also provides a mapping window. The most frequently used are Timeglider and Timeline.js.

Creating a Digital Exhibition

If you’re interested in curating a digital exhibition of artifacts, the best platform to use is Omeka.net. This is a free online version of the more robust and versatile Omeka.org platform, which has to be installed on Bucknell’s servers (which can take a while). Omeka.net allows you to upload digitized images, documents, maps, etc. to a “collection” that can then be arranged and curated as an online exhibit. This is particularly useful if you have found a collection of photographs (maybe your own) that you would like to present on a public-facing platform with a narrative logic.

View of Matthis Hehl’s Itinerant Map of Pennsylvania, annotated using Neatline. http://ssv.omeka.bucknell.edu/omeka/neatline/fullscreen/itinerant-preachers-map-of-pennsylvania

Again, if you want to do this I am happy to show you how.  Here is a link to my own (developing) Omeka site at Bucknell on the Stories of the Susquehanna.  The server-based version has a very nice visualization tool called Neatline, which allows you to link the digital artifacts in your collection to a base image (maybe a map or a painting) and then annotate. This is an example I am developing for the 1750s Itinerant Preachers’ Map of Pennsylvania which I have used a great deal in my research and also in my teaching.  There is also a Timeline widget you can activate.  I have discovered a great tutorial on how to use Omeka and Neatline here, put together from a workshop given at the Michelle Smith Collaboratory at the University of Maryland.

Networks

If you want to create a network visualization in more detail and depth, use Gephi. Gephi is a free, open-source, interactive visualization and exploration platform for all kinds of networks and complex systems, including dynamic and hierarchical graphs. It runs on Windows, Linux, and Mac OS X. One of the most important issues to consider before investing a lot of time in learning Gephi is whether your research question can actually be answered with this kind of visualization; if the answer is yes, prepare your data! Gephi has an excellent set of tutorials on GitHub. Data can be prepared directly in Gephi or imported in CSV format.
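
Gephi’s CSV import expects a node table and an edge table. Here is a minimal sketch of generating both (the “Id”/“Label”/“Source”/“Target” headers follow Gephi’s spreadsheet-import conventions; the names are invented):

```python
# A sketch of the two CSV tables Gephi imports: a node table and an
# edge table. The "Id"/"Label"/"Source"/"Target" headers follow Gephi's
# spreadsheet-import conventions; the names are invented.
import csv
import io

nodes = [("1", "Missionary A"), ("2", "Leader B"), ("3", "Leader C")]
edges = [("1", "2"), ("1", "3")]  # Source, Target pairs by node Id

node_csv, edge_csv = io.StringIO(), io.StringIO()
writer = csv.writer(node_csv)
writer.writerow(["Id", "Label"])
writer.writerows(nodes)
writer = csv.writer(edge_csv)
writer.writerow(["Source", "Target"])
writer.writerows(edges)

print(node_csv.getvalue())
print(edge_csv.getvalue())
```

Keeping nodes and edges in two separate files like this is what lets Gephi attach labels and attributes to the people while computing the network from the pairs.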

Pedagogical Hermeneutics and Teaching DH in a Liberal Arts Context

Diane Jakacki and I gave the following presentation yesterday at DH2015 in Sydney, Australia. We include the slides and the abbreviated form of the talk.  The complete version will be published as an article in the near future.

Thanks to everyone for coming and for your interest!

We take our title from Alan Liu’s challenge to DH educators to develop a distinctive pedagogical hermeneutic of “practice, discovery, and community.” What does this look like? How do we put it into practice?

This paper focuses on our teaching experience at Bucknell University in the academic year 2014-15 to show how the planning, design, and execution of a new project-based course, Humanities 100, introduced undergraduate students to the world of digital humanities through the use of selected digital tools and methods of analysis. The course, taught within the Comparative Humanities program, was designed specifically for first- and second-year students with no background in digital humanities, in order to encourage the development of digital habits of mind at the earliest phases of their liberal arts curricular experience. Developed to encourage examination of and experimentation with a range of digital humanities approaches, the course asks students to work with primary archival materials as core texts to encourage digital modes of inquiry and analysis. The decision to root the course in a multi-faceted analysis of archival materials gave students the rare chance to engage in the research process typical for a humanities scholar: the discovery of artifacts and the formulation of research questions, followed by the analysis and synthesis of findings, culminating in publication in a digital medium. In the process, we introduced students to the basic structure of developing a DH research project.

The Comparative Humanities program, with its explicit learning goals of comparativity (historical period, culture, genre, modality), is an ideal curricular environment for teaching such classes; to these we added course-specific learning goals pertaining to DH. (Slide with goals) The course therefore gave us the opportunity not only to expose students to methodologies of distant and close reading and of network and spatial visualization, but also to require that they think critically about what each of these methods, and the tools they used within the course, reveals in the texts with which they worked.

To date the course has been taught three times. It ran first as twin sections in Fall 2014, in which we both used the same scaffolding method with discrete subject matter and core texts. We participated fully in one another’s sections, which gave us the opportunity to teach our specializations within each other’s classes. Katie Faull taught the course again in Spring 2015, with Diane Jakacki participating. Both of us will teach a section next year.

This approach to teaching is important as we consider how to incorporate DH into the classroom. It required significant commitment from both of us to the actual execution of the course, as well as a recognition that we needed to be transparent, to ourselves and to our students, about how this represented a new model for course design at Bucknell. It is important to note that while other DH-inflected courses are being taught, this is the first Digital Humanities course at Bucknell.

At Bucknell, the focus of digital humanities scholarship and learning to date has been primarily on spatial thinking, until recently rooted in ArcMap-style GIS and in thinking about the humanities in “place”. It was important to both of us to emphasize and extend that objective in the development of the course and its learning outcomes, and so we focused on finding materials that would interest students and let them relate to the historical context more directly.

The first time the course was taught we decided to run it in two sections, anticipating an opportunity to reflect the different perspectives of our expertise with DH methods and tools. Diane’s focus has until now been on text encoding and analysis, while Katie’s has been on mapping and data visualization. We also worked with discrete data sets of archival materials. Katie’s course focused on the Colonial mission diaries of the Moravians from Shamokin, Pennsylvania (today Sunbury), nine miles downstream from the university. Written in English, the selected diary sections dealt with interactions between some of the first Europeans in the area and the Native peoples they met and worked among. Katie has spent the past five years working with this subject matter and is considered an expert in the field of Moravian studies.

Diane’s course considered a subset of the diaries of James Merrill Linn, one of the first graduates of the university and a soldier in the American Civil War. The choice of the Linn material had to do solely with its accessibility: Linn’s family left his papers to the Bucknell Archives. Diane’s research is not in nineteenth-century American history, and so she had to be honest that engaging with Linn’s diaries would be a discovery for her, too. In Katie’s iteration of the course this spring, she selected a different set of Moravian archival materials that took the students slightly further afield, while still keeping them within the Susquehanna watershed and the Chesapeake Bay. (Slide with archival materials)

Both of our choices reflect and extend Bucknell’s interest in digital/spatial thinking in terms of its place in the larger historical and cultural narrative. In all cases, students responded well to the investigation of places familiar to them, with several students having family connections to specific locales mentioned in the archival materials. The pedagogical hermeneutics of Humanities 100 were intentionally designed to encourage student examination, experimentation, and discovery with a range of digital humanities approaches. To this end, the modules were carefully sequenced so that the “product” of each module became the “data” of the next.

In addition to praxis-oriented assignments, we wanted students to understand the broader context of their work within a DH framework. To that end we assigned theoretical readings and analysis of a range of major DH projects, which students then wove into their online reflections. Extensive use was made of online platforms that emphasize important forms of digital engagement, including collaborative online writing environments. Each module ended with a short assignment and also a reflective public-facing blog post that became a shared form of intellectual engagement.

In order to begin any kind of DH archival project, the students had to produce a digital text. In the first iteration of the course we did not have a transcription desk available, so students transcribed the assigned pages of the original into a shared Google Doc. This digital text was then color-coded with “proto” tags to ease the way into close reading with TEI tags in Oxygen. By the time the second semester started we had obtained an institutional subscription to the online platform Juxta Editions, which we were then able to use as the transcription platform and as the introduction to thinking about tagging. From the transcription came a lightly marked-up digital text that was imported into Oxygen for more complex tagging. Students then began tagging in earnest and were introduced to the discoveries of close reading involved in marking up a text. Names, places, and dates were easy (in Juxta Editions they had already been imported). The hermeneutical fun, however, started with working out whether a boat was a place or an object, for example. Or whether God was a person. And just what is balsam: an object? An emotion?
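
To give a flavor of the markup itself, here is a toy, heavily simplified TEI-style fragment and a Python sketch of how tagged names can later be pulled back out. (Real TEI declares the TEI namespace and carries a full header; this diary line is invented.)

```python
# A toy, heavily simplified TEI-style fragment (real TEI declares the
# TEI namespace and a full header; this diary line is invented) and a
# sketch of pulling the tagged names back out with the standard library.
import xml.etree.ElementTree as ET

tei = """<p>On <date when="1745-06-02">2 June</date>,
<persName>Br. Mack</persName> travelled by boat to
<placeName>Shamokin</placeName>.</p>"""

root = ET.fromstring(tei)
people = [el.text for el in root.iter("persName")]
places = [el.text for el in root.iter("placeName")]
print(people, places)  # ['Br. Mack'] ['Shamokin']
```

It is exactly this machine-readability that makes the interpretive decisions matter: whatever the students decided a boat or balsam was, the tag fixes that reading in the data.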

During these classes, the historical remoteness of the texts (in Faull’s class, from the first half of the eighteenth century, focusing on Native Americans in the fall and, in the spring, on preaching to the enslaved peoples of the Tobacco Coast) was lessened by the act of tagging and the lively discussions that surrounded it. Once a reliable text had been established, we introduced students to the concept of “distant reading” through the Voyant platform. At the same time as students were encouraged to “play”, we also pointed out the circular motion of discovery and confirmation that is inherent in any research experience. The students had just read these archival texts very carefully in order to transcribe them, so we asked them the usual kinds of questions one asks when approaching any new text: What is it about? What are the major themes? Who are the most important characters? Then, having read Edward Whitley’s text on distant reading, we asked the students to think about what reading a text distantly does to that hermeneutic. (Slide of distant reading prompt and visualizations)

This data, the TEI tags, crucial to the success of the students’ markup assignment and the production of a final digital document, needed some restructuring as we moved on to the next module. To manage this, we developed a prosopography for each core text: a database of people, places, and connections that grew organically out of the focus of each section, provided the data for entry into Gephi, and was then built out with geospatial data for GIS. So, for example, one group of students wanted to use Gephi to interrogate the assumption that relationships between the missionaries and the Native Americans around the mission remained constant. By using the TEI persName tags and exporting them into Gephi node/edge tables, the students were able to show how relations between the Native leaders and the Moravian missionaries changed over a five-year period of the mission. (Include slide of Jerry and Henna’s work) Students also used the sigma.js plug-in so that the network visualizations were interactive. However successful this team was, it was clear from all iterations of the class that the hermeneutics of social networks was the hardest for the students to analyze and manipulate (which is quite ironic, given how plugged in most of them are to Twitter, Instagram, etc.).
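
The move from persName tags to a Gephi edge table can be sketched as a co-occurrence count: treat two people named in the same diary entry as connected, with the edge weight being the number of shared entries. (The names and entries below are invented stand-ins for the students’ actual data.)

```python
# A sketch of turning persName tags into a weighted Gephi edge table:
# two people named in the same diary entry count as connected, and the
# edge weight is the number of shared entries. Names and entries are
# invented stand-ins for the students' actual data.
from collections import Counter
from itertools import combinations

entries = [
    ["Shikellamy", "Br. Mack"],
    ["Shikellamy", "Br. Mack", "Anna"],
    ["Anna", "Br. Mack"],
]

edge_weights = Counter()
for names in entries:
    for a, b in combinations(sorted(set(names)), 2):
        edge_weights[(a, b)] += 1  # weight = number of co-occurrences

# Print rows in the Source, Target, Weight shape Gephi expects.
for (source, target), weight in sorted(edge_weights.items()):
    print(source, target, weight)
```

Slicing the entries by date before counting is what lets a network like this show change over time rather than one static picture.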

Lastly, students worked in ArcGIS Online to consider the evidence they had discovered within these texts in terms of spatial analysis. The story maps they produced became a new form of critical essay, with thesis, arguments supported by direct evidence, and conclusion all presented within a story map framework. So, for example, one student took Linn’s references to ships running aground during a storm at Hatteras Inlet, found a contemporary document reporting on the damage done to Union ships at this point in the campaign, and overlaid his evidence on a nautical map drawn in 1861 to determine where Linn’s ship had foundered.

Both the composition of the class (in terms of student personalities) and the nature of the material determined to some extent the kind of final project students chose. For example, in Katie’s section there were some natural groupings of students and a variety of final projects (one involving Gephi; two TEI markup; one hybrid ArcMap and TEI; and one story map). In Diane’s class, all but two students chose to work independently. In the second iteration of Katie’s course, students decided to produce one final group project together: a course website that highlighted the best of their DH work. (Slide of Payne Froehlich website)

(Assessment slide: self-explanatory)

Another challenge to the class design was the high number of L2 students who enrolled in it. In Katie’s Fall 2014 section, two of nine students were from mainland China; in her spring section that ratio increased to three of five. In the fall there was also one student from Australia and one from Vietnam (international students, though not L2s); one student in the spring course was from South Africa, and her first language was Afrikaans. Although the students admitted to being challenged by the readings and by the public-facing writing on the blog site, we developed a way of handling errors and corrections that allowed students to post their blog reflections openly, knowing they would have an opportunity to correct their English afterwards.

However, for all the challenges involved in teaching the class, there were moments of glory. Disengaged students became engaged; solitary learners recognized the essential need to collaborate in order to succeed; participants recognized the transformative nature of the course to their own concepts of the humanities. Students were eager to participate in crowdsourced data collection; they were intrigued to visualize ego-networks as they learned the concepts of network theory; they were excited to see their marked up transcriptions published in an online digital edition. Through these discoveries, they realized that they were creating a community of young DHers and expressed eagerness to take part in more of these learning experiences. Thank you!

“Writing a Moravian Memoir: the Intersection of History and Autobiography”

Monday, May 11, 2015, University of Gothenburg, Sweden

I delivered this seminar paper via Skype to a group of European scholars interested in ways of reading and analyzing Moravian memoirs. The two-day seminar, entitled “Life-writing and Lebenslauf: Pillars of an invisible church,” was organized by Dr. Christer Ahlberger of the Faculty of History. In this paper I discuss ways of thinking about autobiography and the Moravian memoir, both as a radical act within the history of the genre and, when the memoirs are analyzed with the methods of DH, as a radical hermeneutic that reveals new voices in the historical record.

The genre of autobiography is a tricky one. Although only recently acknowledged within the scholarly community as an object worthy of critical scrutiny, autobiography has for millennia served the purpose of providing a model of the exemplary life. Whether in the form of saints’ lives, the chronicles of kings and queens, the political autobiography, or Johannes Arndt’s “best seller” the Historie der Wiedergeborenen, all have served the purpose of shaping others’ lives. Through autobiography the author is able to examine memory, shape experience, interrogate the reasons for action, and examine conscience. For the reader, the genre provides an opportunity to view this process within another human subject, to witness the relation of authentic (or inauthentic) experience and emotion.

Teaching with Emerging Technology: the Centrality of the Collaborative Mode

Over the last six months I have been working with the latest instructional technologies and digital tools in my class, Humanities 100. This course, brand new for the 2014-15 academic year, is designed to teach students how to create a digital project from archival materials. The goals of the course are to teach students the importance of creating a digital text; to think about the design of data that stems from that digital text; to make intelligent decisions about presenting that digital text on the web; to learn how to mark up a text in TEI Lite and beyond; to begin to think about how to add geo-spatial elements to the analysis; and to see how that text can be mined to build a database of people and places (at the least) that can then be used to create a network analysis of the text. That is a lot to learn, and from my experience last semester I can say that some students wanted to stop at, say, transcription of the text, or markup.

The Importance of Understanding Visual Rhetoric: thoughts on Johanna Drucker’s Graphesis

I am re-posting on my personal site my blog entry for my class site for The Humanities Now!  These are questions that I have been thinking about a lot, and my reading of Johanna Drucker’s Graphesis has really helped to crystallize my ideas.  I am so happy that she will be coming to Bucknell in April of 2015 as part of our Humanities Institute on the Digital Humanities.

Over the last week or so, we have revisited visualization as a technique for interpretation. In our production of networks using Gephi, the process of creating data, preparing it for input into the software, manipulating it once in the software and then interpreting it once entered has been foremost. As we move on to mapping, we will find parallel processes at work: preparing data, entering it, manipulating it, interpreting it. And as we do so, it behooves us to think critically about what we are doing, and what we are not doing.

Johanna Drucker’s intelligent, broad view of visualization as a form of knowledge production offers us many pointers for taking each step on our path to visualization and interpretation with deliberation. The long chapter “Interpreting Visualization–Visualization Interpretation” from her book, Graphesis: Visual Forms of Knowledge Production (Harvard, 2014) presents us with an overview of forms of visualization primarily since the Renaissance, and it also issues a plea for the development of a greater understanding of the force of visual rhetoric; a plea that is directed especially at humanists, as they enter into a realm of spatialized representation that might appear to belong to the realm of the quantitative over the qualitative.

Visualizations can be either representations or knowledge generators in which the spatialization or arrangement of elements is meaningful. When reading a visualization, Drucker encourages us to use language carefully, employing terms such as “juxtapose”, “hierarchy”, and “proximity”. She claims that visualization exploded onto the intellectual scene between the late Renaissance and the early Enlightenment, when engraving technologies were able to produce epistemologically stunning diagrams that both described and produced knowledge. Now, with the advent of digital means to manipulate and produce data, we can all produce timelines (!) without giving a thought to the revolution in the conceptualization of time and history that (our near neighbor) Joseph Priestley occasioned. So, as we play with Timemapper or Timeglider, Drucker cautions us to become aware of the visual force of such digitally generated forms. “The challenge is to break the literalism of representational strategies and engage with innovations in interpretive and inferential modes that augment human cognition.” (p. 71)

How do we do this? Drucker argues for us to recognize three basic principles of visualization, both as producers and as interpreters: a) the rationalization of a surface; b) the distinction of figure and ground; c) the delimitation of the domain of visual elements so that they function as a relational system.

In her sections on the most prevalent forms of visualization, I find most pertinent to the coming module on mapping her insight that a graphical scheme through which we relate to the phenomenal world structures our experience of it (p. 74). In other words, the mapping of the earth, sky, or sea, or the measurement of time, which are in themselves complex reifications of schematic knowledge, actually become the way in which we experience those things. The week is seven days long and the month is 28-31 days long (because of lunar cycles), and thus astronomical tables become the way we structure time. But time isn’t like that; it isn’t linear, especially in the humanities! It contains flashbacks, memories, foreshadowings, relativities (it speeds up when we are nervous and slows down when we are scared). So we are imposing structures from the social and natural sciences onto human experience. Drucker argues that the shape of temporality is a reflection of beliefs rather than standard metrics, and therefore asks how we might find a graphical means to inscribe the subjective experience of the temporal or the spatial.

For example, digital mapping may give us the ability to georectify a manuscript map onto a coordinate system, but what does this give us? It might show us how accurate a mapmaker was, or was not; it might help us locate an archaeological site with more probability; but it ignores the fact that the manuscript map, drawn perhaps on buckskin, or stone, or vellum, is a representation (and a thin one at that) of a traveler’s or observer’s experience that we are then translating into a system of coordinates. What is absent is the story; way-finding depends upon narratives, travel accounts, diaries. We must be aware that maps produce the illusion of isomorphism, but this illusion rests on an elaborate system of abstract schema laid over concrete reality.

I am most captivated by the section of her chapter that focuses on visualizing uncertainty and interpretive cartography, as this is an area I have thought about a great deal over the last five years of working with GIS. As software, GIS gives us enormous power to produce knowledge as a generator; through the combinatory power of layers, base maps, points, and embedded data tables, GIS has often seduced me with its “deceptive naturalism of maps and bar charts” generated from spreadsheets that my students and I have spent months creating. It strengthens the fiction of observer-independence, the objectivity of the “bird’s eye view”, and, as Drucker so aptly states, “we rush to suspend critical judgment in visualization.” For me, however, and for the students I have worked with, the question of how to represent ambiguity has been consuming, as has the question of how to make ambiguity the very ground of representation. I think here of the brilliant visualizations of Steffany Meredyk ’14 in her interpretive map of the main stem of the Susquehanna River.

Steffany Meredyk’s map of the Susquehanna River

Using the work of Margaret Pearce, Steffany and I talked for long hours about the importance of reinserting the positionality of the observer into visualizations of the river. Taking her “data” from accounts of massacres that occurred on the Susquehanna River in the 1760s-80s, and using the graphical means of Adobe Illustrator to represent ambiguity, uncertainty, and emotion, Steffany produced work that I consider a model for the way we can use digital media and methods as humanists. We can, as Drucker observes, “model phenomenological experience; model discourse fields; model narratives and model interpretation.”

What’s Your Susquehanna Story?

The Principal Investigators of the Stories of the Susquehanna initiative are pleased to announce the launch of the “crowdsourcing” platform for the river. As a public humanities project, the Stories of the Susquehanna initiative invites members of the public to submit their stories for possible inclusion. If you have a story about the cultural, historical, or environmental significance of the place where you live along the Susquehanna River, we’d love to hear from you! What’s your story?

Discussing the Untranslatable and World Literature

In case anyone wonders what academics do during the summer, read on!  This week, the Program in Comparative Humanities is hosting a faculty reading seminar on the topics of Untranslatability and World Literature.  We are nearing the halfway mark.  If you want to take a look at the course outline, readings, discussions, and my thoughts, look here.  This is definitely a work in progress, but the discussions we are having are lively!

Thanks to the Provost’s office at Bucknell University for providing the funding for this event.  We have been holding summer reading seminars in Comparative Humanities since the inception of the program in 2001.  Topics have included Film and Adaptation, Translation, Integrating Islam into Core Courses, the Philosophy of Place, Close Reading, Digital Humanities, and this year, Untranslatability.