Text Networks

Chris Bail
Duke University

What is a Network?

Two-mode networks

From Words to Networks

State of the Union Addresses

Textnets

The textnets package provides functions for:

1) preparing texts for network analysis
2) creating text networks
3) visualizing text networks
4) detecting themes or “topics” within text networks

Textnets

library(devtools)
install_github("cbail/textnets")

Example: State of the Union Addresses

library(textnets)
data(sotu)

Part of Speech Tagging Takes Time...

library(dplyr)  # provides %>%, group_by(), and slice()
sotu_first_speeches <- sotu %>% group_by(president) %>% slice(1L)
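
Because tagging is slow, the line above keeps only one speech per president. A minimal sketch of the same group_by() and slice() pattern on a toy table follows; the `toy_speeches` data frame and its values are made up for illustration and are not part of the sotu data.

```r
library(dplyr)

# Toy stand-in for the sotu data (hypothetical values).
toy_speeches <- data.frame(
  president = c("Washington", "Washington", "Jefferson"),
  year      = c(1790, 1791, 1801)
)

# Keep the first row that appears for each president.
toy_first <- toy_speeches %>%
  group_by(president) %>%
  slice(1L) %>%
  ungroup()

toy_first
```

slice(1L) takes the first row in the data's existing order, so the table should be sorted by date beforehand if "first" is meant chronologically.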

PrepText

prepped_sotu <- PrepText(sotu_first_speeches, groupvar = "president", textvar = "sotu_text",
                         node_type = "groups", tokenizer = "words", pos = "nouns",
                         remove_stop_words = TRUE, compound_nouns = TRUE)
save(prepped_sotu, file = "prepped_sotu.Rdata")
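
Saving the prepared data means the slow tagging step need not be repeated in a later session. A minimal sketch of the save and load round trip in base R follows; the `toy_prepped` object and the temporary file path are placeholders for illustration only.

```r
# Toy stand-in for a prepared data frame (hypothetical values).
toy_prepped <- data.frame(
  president = c("A", "B"),
  lemma     = c("budget", "war"),
  count     = c(2L, 1L)
)

# Save to a temporary file, remove the object, then restore it.
path <- file.path(tempdir(), "toy_prepped.Rdata")
save(toy_prepped, file = path)
rm(toy_prepped)

load(path)  # restores the object under its original name
toy_prepped
```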

Creating Textnets

sotu_text_network <- CreateTextnet(prepped_sotu)
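
To illustrate the idea of turning a two-mode (group by word) structure into a network of groups, here is a minimal base R sketch of a one-mode projection. It is not the package's implementation: the matrix, its names, and the raw-count weighting are assumptions, and CreateTextnet may weight edges differently (for example with tf-idf).

```r
# Toy group-by-word count matrix (all names and counts are made up).
dtm <- matrix(
  c(2, 1, 0,
    1, 0, 3,
    0, 2, 1),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("author_a", "author_b", "author_c"),
                  c("budget", "war", "economy"))
)

# One-mode projection onto groups: entry [i, j] sums, over all words,
# the product of the counts for group i and group j.
author_net <- dtm %*% t(dtm)
diag(author_net) <- 0  # drop self-ties

author_net
```

Two groups end up connected only if they share at least one word, and the tie grows stronger as the shared words become more frequent in both.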

Visualize

VisTextNet(sotu_text_network, label_degree_cut = 0)

Interactive Visualization

library(htmlwidgets)
vis <- VisTextNetD3(sotu_text_network, 
                      height=300,
                      width=400,
                      bound=FALSE,
                      zoom=FALSE,
                      charge=-30)
saveWidget(vis, "sotu_textnet.html")

Choosing Alpha

VisTextNet(sotu_text_network, alpha=.1, label_degree_cut = 2)
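
The slides do not spell out what alpha does. Assuming it acts as a significance threshold for pruning weak edges, in the spirit of the disparity filter for network backbones, the idea can be sketched in base R as below. `disparity_pvalues` is a hypothetical helper written for this illustration and is not a textnets function.

```r
# Hypothetical helper: disparity-filter p-values for a symmetric,
# weighted adjacency matrix w (rows and columns are nodes).
disparity_pvalues <- function(w) {
  k <- rowSums(w > 0)      # degree of each node
  s <- rowSums(w)          # strength (total edge weight) of each node
  p <- w / s               # share of node i's strength on edge i-j
  pval <- (1 - p)^(k - 1)  # small value = edge is unusually heavy for node i
  pval[w == 0] <- NA       # no edge, nothing to test
  pval
}

# Toy network: one strong tie (1-2) and two weak ties (1-3, 2-3).
w <- matrix(c( 0, 10, 1,
              10,  0, 1,
               1,  1, 0), nrow = 3, byrow = TRUE)

alpha <- 0.1
pval <- disparity_pvalues(w)

# Keep an edge if it is significant from either endpoint's point of view.
keep <- !is.na(pval) & pmin(pval, t(pval)) < alpha
backbone <- w * keep

backbone
```

Under this reading, a smaller alpha keeps fewer edges, which makes a dense plot sparser and easier to read.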

Analyzing Text Networks

sotu_communities <- TextCommunities(sotu_text_network)
head(sotu_communities)
              group modularity_class
1   Abraham Lincoln                1
2    Andrew Jackson                1
3    Andrew Johnson                1
4      Barack Obama                2
5 Benjamin Harrison                1
6   Calvin Coolidge                1
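
The modularity_class column assigns each president to a community. As a rough sketch of how such labels can be produced, the example below runs igraph's Louvain clustering on a toy weighted network. It assumes, without confirmation from these slides, that a modularity-based method of this kind is what TextCommunities relies on; the graph itself is made up.

```r
library(igraph)

# Toy weighted network: two tight triangles joined by one weak tie.
edges <- data.frame(
  from   = c("a", "a", "b", "d", "d", "e", "c"),
  to     = c("b", "c", "c", "e", "f", "f", "d"),
  weight = c(5, 5, 5, 5, 5, 5, 1)
)
g <- graph_from_data_frame(edges, directed = FALSE)

set.seed(1)                 # Louvain processes nodes in random order
comm <- cluster_louvain(g)  # uses the "weight" edge attribute by default

toy_communities <- data.frame(
  group            = V(g)$name,
  modularity_class = as.integer(membership(comm))
)
toy_communities
```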

Analyzing Text Networks

top_words_modularity_classes <- InterpretText(sotu_text_network, prepped_sotu)
head(top_words_modularity_classes, 10)
# A tibble: 10 x 2
# Groups:   modularity_class [2]
   modularity_class lemma            
   <chr>            <chr>            
 1 2                recovery plan    
 2 1                structure        
 3 2                drug             
 4 2                tonight          
 5 2                budget           
 6 2                medicare         
 7 1                carolina         
 8 1                george washington
 9 1                licentiousness   
10 1                novelty          

Centrality Measures

text_centrality <- TextCentrality(sotu_text_network)
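
The slides do not list which measures TextCentrality reports. As an illustration of what centrality captures, here is a minimal igraph sketch computing two common measures, betweenness and closeness, on a toy star network. The choice of measures and the toy graph are assumptions made for this example.

```r
library(igraph)

# Toy star network: vertex 1 is the hub, vertices 2 to 5 are leaves.
g <- make_star(5, mode = "undirected")

bc <- betweenness(g)  # number of shortest paths passing through each vertex
cc <- closeness(g)    # inverse of the summed distances to all other vertices

data.frame(vertex = seq_len(vcount(g)), betweenness = bc, closeness = cc)
```

The hub scores highest on both measures because every path between two leaves passes through it. In a text network, a group in that position bridges otherwise separate vocabularies.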

Next Steps with TextNets
