Workshop – Text Analysis Methods in Historical Research

Micki Kaufman

http://www.mickikaufman.com/

 

1. (30min) Overview of Kissinger DNSA Text Analysis and Visualizations

2. (1hr:30min) Hands-on Workshop – The Collected Works of Yeats and Cummings

    1. Intro
        a. The Packet
    2. Basic Excel review
        i. Examples: Blogger Behavior
        ii. Packet Intro
    3. Data Collection and Management
        a. DownThemAll http://www.downthemall.net/
        b. Microsoft Excel / OpenOffice
        c. TextWrangler / SublimeText / NotePad http://www.barebones.com/products/textwrangler/
        d. NameChanger http://www.mrrsoftware.com/MRRSoftware/NameChanger.html
    4. Word Frequency and Correlation
        a. AntConc by Laurence Anthony, PhD http://www.antlab.sci.waseda.ac.jp/software.html
    5. Topic Modeling
        a. MALLET by Andrew McCallum and David Mimno http://mallet.cs.umass.edu/index.php
        b. Mimno scripts
    6. Sentiment Analysis
        a. LIWC2007 by James W. Pennebaker, PhD http://www.liwc.net
    7. Visualization
        a. Basic Excel Graphing
        b. Network Graphing: Gephi http://www.gephi.org
        c. Visualization and Statistics: R http://www.r-project.org
        d. Interactive Visualizations with d3

3. Appendix
    a. Basic Resources
    b. Additional Visualization Resources (thanks to Lev Manovich)

 

Basic Resources

 

if you are new to data visualization:

 

How do visualization designers work?

 

How We Visualized 23 Years of Geo Bee Contests
Visualizing The World's Well-Being
How We Visualized America's Food and Drink Spending
Visualizing The Health Care Reform

 

Look at:

http://flowingdata.com/2010/01/07/11-ways-to-visualize-changes-over-time-a-guide/

 

Other tutorials:

http://flowingdata.com/category/tutorials/

 

 

Additional Resources:

 

Graphic Design Principles for Information Visualization

 

Standard techniques for data visualization:

 

Using basic visualization techniques to create effective infographics:

 

Nicholas Felton: Annual Reports

GE Powering the Kitchen (fathom.info - Ben Fry)
   

visualizing one-dimensional data (single variable):

           

            pie chart, bar chart

Excel: van_gogh_summary.xlsx

Mondrian: van_gogh_data.txt

histogram

            Mondrian: van_gogh_data.txt

Explanation of the differences between a bar chart and a histogram
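
As a rough illustration of that difference: a bar chart counts items per category label, while a histogram counts numeric values that fall into equal-width intervals. The sketch below uses only the Python standard library. The sample data, the function names and the bin settings are all made up for illustration and are not taken from van_gogh_data.txt.

```python
from collections import Counter

# Hypothetical sample data (NOT from the workshop files).
genres = ["portrait", "landscape", "still life",
          "landscape", "portrait", "landscape"]
widths_cm = [31.5, 46.0, 54.2, 65.1, 72.9, 73.0, 92.4, 93.0]


def bar_chart_counts(categories):
    """Bar chart data: one count per distinct category label."""
    return dict(Counter(categories))


def histogram_counts(values, bin_width, start):
    """Histogram data: counts per interval [lo, lo + bin_width).

    Assumes every value is >= start; keys are the lower bin edges.
    """
    counts = {}
    for v in values:
        index = int((v - start) // bin_width)  # which bin the value falls in
        lo = start + index * bin_width         # lower edge of that bin
        counts[lo] = counts.get(lo, 0) + 1
    return dict(sorted(counts.items()))


bars = bar_chart_counts(genres)
bins = histogram_counts(widths_cm, bin_width=20, start=20)

print(bars)  # counts keyed by category name; bar order is arbitrary
print(bins)  # counts keyed by lower bin edge; bin order is meaningful
```

Changing bin_width changes the shape of the histogram, whereas the bar chart has no such parameter. That is one practical way to tell the two apart.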

visualizing time series - line graph

Excel: van_gogh_data.txt

   
visualizing two dimensional data (two variables):

 

scatter plot

explain differences between line graph and scatter plot

Excel: van_gogh_summary.xlsx, van_gogh_data.txt

data transformations (log, etc. - Excel graphs axis options; Mondrian: Calc > Transform)
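
A log transformation of the kind mentioned above can also be done outside Excel or Mondrian. The following is a minimal sketch in Python, assuming strictly positive values. The numbers and the function name are hypothetical and are not taken from the workshop files.

```python
import math

# Hypothetical skewed values spanning several orders of magnitude
# (NOT from the workshop files).
values = [1, 10, 100, 1000, 10000]


def log10_transform(xs):
    """Return the base-10 logarithm of each value.

    The logarithm is only defined for positive numbers, so zero or
    negative inputs are rejected rather than silently dropped.
    """
    if any(x <= 0 for x in xs):
        raise ValueError("log transform requires strictly positive values")
    return [math.log10(x) for x in xs]


transformed = log10_transform(values)
print(transformed)  # evenly spaced after the transform
```

On the original scale the largest value dwarfs the rest. After the transform the points are evenly spaced, which is the same effect as switching a chart axis to a logarithmic scale.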

 

            visualizing multi-dimensional data (multiple variables):

 

radar plot

Excel: van_gogh_summary.xlsx

parallel plot

            Mondrian: van_gogh_data.txt

plot matrix

Mondrian - scatterplot matrix: van_gogh_additional_measurements.txt

 

 

Visualizing time:

 

Early time visualizations:

 

Time lines and visual histories

 

Visualizations of singular temporal streams:

One of the most famous visualizations of the last 200 years - Charles Joseph Minard's 1869 representation of Napoleon's 1812 Russian Campaign - offers other innovative solutions for showing time:

Charles Joseph Minard - visualization of Napoleon's Russian Campaign

 

Visualizations of parallel temporal processes:

 

Here are a few examples of well-known, innovative visualization techniques and projects that represent multiple event streams (multiple events taking place at the same time):

 

Last.fm listening history

Flight patterns

The Preservation of Favored Traces

Hans Rosling TED 2006 lecture (video on TED)

Ben Fry: energy use in a kitchen


   

            Visualizations of temporal links (links between events in single or multiple temporal streams):

 

            Map of Science

Citeology: Visualizing the Relationships between Research Publications

 

Visualizations of temporal processes in cultural artifacts:

 

Film Dialog Particles

Movie narrative chart

Cinemetrics project

Novel Views: visualizations of the novel Les Misérables by Victor Hugo

Lotr project

Culture data visualizations by Santiago Ortiz

   

            Time-based data:

3 million time-based open data sets: http://blog.revolutionanalytics.com/2013/02/quandl-a-wikipedia-for-time-series-data.html

 

R functions to use these data sets:

http://blog.revolutionanalytics.com/2013/03/quandl-package-released-to-cran.html

 

Help: http://www.quandl.com/help/r  

 

http://www.simile-widgets.org/timeline/

   

Visualizing space:

 

view as many maps as you can:

http://pinterest.com/janwillemtulp/maps/

view particular examples of recent maps using social media data:

 

atNight
Twitter NYC
Global Twitter Heartbeat
How Obama Won Re-election

Movement In Manhattan

 

What are the common features of recent maps driven by big data and social media data? Which maps stand out from the rest, and why? What is missing?

Why do the techniques for visualizing temporal processes seem more limited than the rich range of techniques and interesting projects in spatial data mapping?

 

Recommended - references/articles about science of cities

 

http://www.nytimes.com/2010/12/19/magazine/19Urban_West-t.html

http://www.nytimes.com/2013/02/24/technology/nyu-center-develops-a-science-of-cities.html?pagewanted=all

http://www.complexcity.info/

 

Recommended - historical timelines


http://www.datavis.ca/gallery/timelines.php

 

 

Additional Resources:

 

   

Resources for Data Visualization (courtesy of Lev Manovich)

 

Inspiration:

http://tulpinspiration.tumblr.com/  

http://infosthetics.com/

Data and visualization blogs worth following

popular websites and blogs about visualization

 

Innovative visualizations of temporal flows

 

Use of visualization in museum websites and online media collections

 

Visualization design patterns:

 

InfoVis wiki list of visualization design patterns

 

Spatial data:

https://openpaths.cc/about

http://en.wikipedia.org/wiki/OpenStreetMap

 

List of visualization and mapping software:

http://selection.datavisualization.ch/

 

Examples of visualization software which can create maps, timelines and all other basic vis techniques:

https://developers.google.com/chart/interactive/docs/gallery

            http://d3js.org/ (currently most popular for web vis)

 

examples of software for creating interactive web maps:

http://mapbox.com/

http://mapbox.com/reinventgreen/

 

http://cartodb.com/

http://cartodb.github.com/torque/examples/uspo.html

 

lists of other visualization tools:

 

datavisualization.ch list of visualization and mapping tools

 

.net list of the top 20 data visualization tools

 

WikiVis list of visualization tools

 

Popular software tools and applications for creating visualizations

 

Sound visualization software

 

Software for analyzing and presenting online digital collections

 

Over 100 Incredible Infographic Tools and Resources (Categorized)