Prediction and Explanation
On computational social science’s epistemological perspectives.
A recurring tension is how different stakeholders view the value of data. Some are interested in models that give insight into future events, while others are interested in models that help us understand the underlying mechanisms of a process. In what ways do prediction and explanation differ? In what cases might we wish to use each approach?
- Hanna Wallach, “Computational Social Science \(\neq\) Computer Science + Social Data,” Communications of the ACM 61, no. 3 (February 2018): 42–44, https://doi.org/10.1145/3132698.
- Matthew J. Salganik, “Observing Behavior,” in Bit by Bit: Social Research in the Digital Age (Princeton: Princeton University Press, 2018), 13–83.
- Jake M. Hofman et al., “Integrating Explanation and Prediction in Computational Social Science,” Nature 595, no. 7866 (July 2021): 181–88, https://doi.org/10.1038/s41586-021-03659-0.
Simulations and Agent-based Models (ABMs)
How can we use computer simulations to study social phenomena from the “bottom up”?
We discuss the role of simulations and when they may be useful in the development and explanation of theories or in forecasting.
- Rosaria Conte and Mario Paolucci, “On Agent-Based Modeling and Computational Social Science,” Frontiers in Psychology 5 (2014), https://doi.org/10.3389/fpsyg.2014.00668.
- Ivan Smirnov, Camelia Oprea, and Markus Strohmaier, “Toxic Comments Are Associated with Reduced Activity of Volunteer Editors on Wikipedia,” PNAS Nexus 2, no. 12 (December 2023): pgad385, https://doi.org/10.1093/pnasnexus/pgad385.
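
To make the bottom-up idea concrete, here is a minimal sketch of a threshold model of collective behavior, in the style of Granovetter's classic example. It is not taken from the readings above; the function name and the threshold values are illustrative assumptions. Each agent joins once the number of current participants reaches its personal threshold, and the point is that a tiny change to one agent can change the aggregate outcome completely.

```python
def run_threshold_model(thresholds):
    """Return how many agents participate once nobody else wants to join.

    Simplifying assumption: an agent participates whenever the current
    number of participants is at least its threshold.
    """
    active = 0
    while True:
        # Count every agent whose threshold is met by the current crowd.
        new_active = sum(1 for t in thresholds if t <= active)
        if new_active == active:
            return active  # No change since the last round: stable state.
        active = new_active


# Hypothetical population: thresholds 0, 1, 2, ..., 99.
uniform = list(range(100))
# Same population, except the agent with threshold 1 now needs 2 others.
perturbed = [0, 2] + list(range(2, 100))

cascade_uniform = run_threshold_model(uniform)
cascade_perturbed = run_threshold_model(perturbed)
print(cascade_uniform, cascade_perturbed)
```

In the first population each new participant tips the next agent, so everyone ends up joining; in the second the chain breaks after the first agent. The two populations have almost identical average thresholds, which is why this kind of outcome is hard to see without simulating individual agents.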
Ethics and Best Practices
What are the pitfalls and potential ethical issues in computational social science research?
We discuss practical challenges for computational social science, such as reidentification, potential effects on privacy, and the fact that more data alone does not solve study design problems.
- “Ethics,” in Bit by Bit: Social Research in the Digital Age (Princeton: Princeton University Press, 2018), 281–354.
- Charlotte Jee, “You’re Very Easy to Track Down, Even When Your Data Has Been Anonymized,” MIT Technology Review, July 2019, https://www.technologyreview.com/2019/07/23/134090/youre-very-easy-to-track-down-even-when-your-data-has-been-anonymized/.
- Matthew Zook et al., “Ten Simple Rules for Responsible Big Data Research,” PLOS Computational Biology 13, no. 3 (March 2017): e1005399, https://doi.org/10.1371/journal.pcbi.1005399.
- David Lazer et al., “The Parable of Google Flu: Traps in Big Data Analysis,” Science 343, no. 6176 (March 2014): 1203–5, https://doi.org/10.1126/science.1248506.
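
As a rough illustration of why reidentification is a concern, the sketch below counts how many records in a dataset are unique on a combination of seemingly harmless attributes. The records, field names, and the helper `unique_fraction` are invented for this example and are not drawn from any of the readings.

```python
from collections import Counter

# Invented "anonymized" records: names removed, a few attributes kept.
records = [
    {"zip": "47401", "birth_year": 1990, "gender": "F"},
    {"zip": "47401", "birth_year": 1990, "gender": "F"},
    {"zip": "47401", "birth_year": 1985, "gender": "M"},
    {"zip": "47408", "birth_year": 1990, "gender": "F"},
    {"zip": "47408", "birth_year": 1972, "gender": "M"},
]


def unique_fraction(rows, keys):
    """Fraction of rows whose combination of `keys` appears exactly once."""
    combos = Counter(tuple(row[k] for k in keys) for row in rows)
    unique = sum(1 for row in rows if combos[tuple(row[k] for k in keys)] == 1)
    return unique / len(rows)


zip_only = unique_fraction(records, ("zip",))
all_three = unique_fraction(records, ("zip", "birth_year", "gender"))
print(zip_only, all_three)
```

No record is unique on ZIP code alone, but three of the five become unique once birth year and gender are added. Anyone holding an outside list with those same three attributes could link those rows back to named individuals.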
Text as Data
Methods for working with text data.
A lot of social data is encoded within unstructured text. This module is more practical than theoretical and focuses on strategies to extract data from text using natural language processing and modern, vector-based approaches.
- Paul DiMaggio, “Adapting Computational Text Analysis to Social Science (and Vice Versa),” Big Data & Society 2, no. 2 (December 2015): 2053951715602908, https://doi.org/10.1177/2053951715602908.
- Jacob Jensen et al., “Political Polarization and the Dynamics of Political Language: Evidence from 130 Years of Partisan Speech [with Comments and Discussion],” Brookings Papers on Economic Activity, 2012, 1–81, https://www.jstor.org/stable/41825364.
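
The simplest vector-based representation of text is a bag of words: each document becomes a vector of word counts, and documents can then be compared with cosine similarity. The sketch below uses only the Python standard library; the example sentences and function names are made up for illustration, and modern approaches would typically replace the count vectors with learned embeddings.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count its word tokens."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # An empty document is treated as similar to nothing.
    return dot / (norm_a * norm_b)


# Invented example "speeches".
speech_a = bag_of_words("We must cut taxes and cut spending")
speech_b = bag_of_words("Cut taxes, cut spending, grow jobs")
speech_c = bag_of_words("Protect healthcare and public schools")

sim_ab = cosine_similarity(speech_a, speech_b)
sim_ac = cosine_similarity(speech_a, speech_c)
print(round(sim_ab, 3), round(sim_ac, 3))
```

The first two speeches share most of their vocabulary and score much higher than the first and third, which overlap only on the word "and". That last point is also a reminder of why stop-word removal or term weighting is usually applied before comparing documents.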
Experiments and Causal Inference
How can we answer cause-and-effect questions using computational social science?
Experiments allow the researcher to manipulate independent variables and observe the effect on dependent variables. However, experiments are not always possible. Causal inference provides a framework to answer causal questions even when experiments are not possible.
- “Running Experiments,” in Bit by Bit: Social Research in the Digital Age (Princeton: Princeton University Press, 2018), 147–229.
- Justin Grimmer, “We Are All Social Scientists Now: How Big Data, Machine Learning, and Causal Inference Work Together,” PS: Political Science & Politics 48, no. 1 (January 2015): 80–83, https://doi.org/10.1017/S1049096514001784.
- Eshwar Chandrasekharan et al., “You Can’t Stay Here: The Efficacy of Reddit’s 2015 Ban Examined Through Hate Speech,” Proceedings of the ACM on Human-Computer Interaction 1, no. CSCW (December 2017): 1–22, https://doi.org/10.1145/3134666.
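
To see why observational comparisons can mislead and how adjustment can help, consider the toy sketch below. The data are invented so that the treatment raises the outcome by exactly 1 for every user, but heavy users are both more likely to be treated and have higher outcomes to begin with. The sketch assumes that user type is the only confounder and that it is observed, which is rarely guaranteed in practice.

```python
from collections import defaultdict
from statistics import mean

# Invented records: (user_type, treated, outcome).
records = [
    ("light", True, 3), ("light", False, 2),
    ("light", False, 2), ("light", False, 2),
    ("heavy", True, 11), ("heavy", True, 11),
    ("heavy", True, 11), ("heavy", False, 10),
]


def difference_in_means(rows):
    """Mean outcome of treated rows minus mean outcome of control rows."""
    treated = [y for _, t, y in rows if t]
    control = [y for _, t, y in rows if not t]
    return mean(treated) - mean(control)


def stratified_difference(rows):
    """Average the within-stratum differences, weighted by stratum size."""
    by_stratum = defaultdict(list)
    for row in rows:
        by_stratum[row[0]].append(row)
    total = len(rows)
    return sum(
        len(group) / total * difference_in_means(group)
        for group in by_stratum.values()
    )


naive_estimate = difference_in_means(records)
adjusted_estimate = stratified_difference(records)
print(naive_estimate, adjusted_estimate)
```

The naive comparison suggests an effect of 5, while comparing like with like within each user type recovers the effect of 1 that was built into the data. A randomized experiment avoids the problem by design, because random assignment makes treatment independent of user type.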
Network Analysis
Much social data is produced in the context of networks of relationships. This section introduces the basic concepts of network analysis and provides a few examples of how it is used in the social sciences.
- Peter Sheridan Dodds, Roby Muhamad, and Duncan J. Watts, “An Experimental Study of Search in Global Social Networks,” Science 301, no. 5634 (August 2003): 827–29, https://doi.org/10.1126/science.1081058.
- Pablo Barberá et al., “The Critical Periphery in the Growth of Social Protests,” PLOS ONE 10, no. 11 (November 2015): e0143611, https://doi.org/10.1371/journal.pone.0143611.
- Christopher A. Bail et al., “Exposure to Opposing Views on Social Media Can Increase Political Polarization,” Proceedings of the National Academy of Sciences 115, no. 37 (September 2018): 9216–21, https://doi.org/10.1073/pnas.1804840115.
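
Two of the most basic network concepts are degree (how many ties a node has) and path length (how many steps separate two nodes). The sketch below computes both on a small invented friendship network using only the standard library; the names and edges are hypothetical, and real analyses would normally rely on a dedicated network library.

```python
from collections import deque

# Invented undirected friendship ties.
edges = [
    ("ana", "ben"), ("ben", "cho"), ("cho", "dev"),
    ("ben", "eli"), ("eli", "dev"),
]


def build_graph(edge_list):
    """Build an adjacency mapping: node -> set of neighbors."""
    graph = {}
    for a, b in edge_list:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def shortest_path_length(graph, source, target):
    """Breadth-first search; returns the number of steps, or None if unreachable."""
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, distance = queue.popleft()
        if node == target:
            return distance
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, distance + 1))
    return None


graph = build_graph(edges)
degree = {node: len(neighbors) for node, neighbors in graph.items()}
steps = shortest_path_length(graph, "ana", "dev")
print(degree, steps)
```

Here "ben" has the highest degree, and "ana" reaches "dev" in three steps through either of two routes. Degree and path length are the building blocks for many of the measures used in the readings for this module.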
Crowds and Communities
A lot of social data is not produced in isolation, but rather in the context of communities with their own norms and practices. We discuss how to think about communities and crowds, and how to study them.
- “Creating Mass Collaboration,” in Bit by Bit: Social Research in the Digital Age (Princeton: Princeton University Press, 2018), 231–80.
- Aaron Shaw and Benjamin Mako Hill, “Laboratories of Oligarchy? How the Iron Law Extends to Peer Production,” Journal of Communication 64, no. 2 (2014): 215–38, https://doi.org/10.1111/jcom.12082.
- Lev Muchnik, Sinan Aral, and Sean J. Taylor, “Social Influence Bias: A Randomized Experiment,” Science 341, no. 6146 (August 2013): 647–51, https://doi.org/10.1126/science.1240466.
Wrapping Up
We synthesize the main themes of the course and discuss the future of computational communication research.
- Wouter van Atteveldt and Tai-Quan Peng, “When Communication Meets Computation: Opportunities, Challenges, and Pitfalls in Computational Communication Science,” Communication Methods and Measures 12, no. 2-3 (April 2018): 81–92, https://doi.org/10.1080/19312458.2018.1458084.
- Alexandra Olteanu et al., “Social Data: Biases, Methodological Pitfalls, and Ethical Boundaries,” Frontiers in Big Data 2 (2019), https://doi.org/10.3389/fdata.2019.00013.