“Headlines as Networked Language: A study of content and audience across 73 million links on Twitter” (Master's thesis)

A study of the news ecosystem from the standpoint of textual discriminability, modeled as a source prediction task: classifiers are presented with the headline from a news article, stripped of context, and trained to predict which outlet produced it (NYT, Fox, CNN, Breitbart, etc.). Analyzing the geometry of the learned representations makes it possible to survey differences in content and style across media brands at a fine level of granularity, and to track movement in the latent space over time as outlets evolve into new configurations at the level of topic, semantics, and syntax.
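
As a rough illustration of the source prediction setup described above, here is a minimal sketch in Python. It is not the thesis's method: the multinomial Naive Bayes model is a stand-in chosen for brevity, and the headlines, the outlet labels `outlet_a` and `outlet_b`, and all class and function names are invented for the example.

```python
import math
from collections import Counter, defaultdict


def tokenize(headline):
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return headline.lower().split()


class NaiveBayesSourceClassifier:
    """Multinomial Naive Bayes with add-one smoothing over headline tokens.

    Hypothetical stand-in model; the thesis does not specify this classifier.
    """

    def fit(self, headlines, outlets):
        self.word_counts = defaultdict(Counter)  # outlet -> token counts
        self.doc_counts = Counter(outlets)       # outlet -> number of headlines
        self.vocab = set()
        for text, outlet in zip(headlines, outlets):
            tokens = tokenize(text)
            self.word_counts[outlet].update(tokens)
            self.vocab.update(tokens)
        return self

    def log_scores(self, headline):
        """Return an unnormalized log-probability for each outlet."""
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for outlet, n_docs in self.doc_counts.items():
            counts = self.word_counts[outlet]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # class prior
            for token in tokenize(headline):
                if token in self.vocab:  # tokens never seen in training are skipped
                    score += math.log((counts[token] + 1) / denom)
            scores[outlet] = score
        return scores

    def predict(self, headline):
        scores = self.log_scores(headline)
        return max(scores, key=scores.get)


# Invented toy data: (headline, outlet) pairs with made-up outlet labels.
train = [
    ("senate passes budget bill after long debate", "outlet_a"),
    ("court ruling reshapes state election maps", "outlet_a"),
    ("senate committee opens hearing on trade policy", "outlet_a"),
    ("you won't believe this celebrity wedding surprise", "outlet_b"),
    ("ten shocking photos from the celebrity gala", "outlet_b"),
    ("this surprise reunion stunned fans", "outlet_b"),
]
headlines, outlets = zip(*train)
model = NaiveBayesSourceClassifier().fit(headlines, outlets)

# The headline is the only input; the model guesses which outlet wrote it.
print(model.predict("senate debate on election policy"))
print(model.predict("celebrity surprise photos"))
```

The per-outlet scores from `log_scores` indicate how separable a headline is between sources. A bag-of-words model like this one has no learned latent space, so the geometric analysis in the abstract would need a representation-learning model instead.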
