DA 101 - Introduction to Data Analytics

Fall 2020

Your Professor:

Matt Lavin

My Email:


My Office:

Burton D. Morgan Center 411

Office Hours

MTWF 9:30-11 a.m.; MW 3:20-4:40 p.m.

Our Classroom:

Class in William H. Doane Library A07; Lab Component Remote on Zoom

When We Meet:

MWF, 8:00-8:50 a.m.

When the Lab Meets:

Tuesdays, 1:50-4:40 p.m.

Course Description

Many of the most pressing problems in the world can be addressed with data. We are awash in data, and modern citizenship demands that we become literate in how to interpret data, what assumptions and processes are necessary to analyze data, as well as how we might participate in generating our own analyses and presentations of data. Consequently, data analytics is an emerging field with skills applicable to a wide variety of disciplines. This course introduces analysis, computation, and presentation concerns through the investigation of data-driven puzzles in a wide array of fields – political, economic, historical, social, biological, and others. No previous experience is required.

By the end of the course, you should be able to:

  • Identify, describe, and use different formats of data and data sources in class discussion and during lab projects
  • Collect, clean, store, and extract data needed for an analysis during lab projects
  • Write basic computer programs using RStudio for a reproducible data analysis workflow
  • Create data visualizations and extract and interpret meaning from the visual information
  • Perform statistical analysis on a dataset, and interpret the results
  • Reflect on and evaluate ethical, social, and legal issues in data collection, analysis, and security in discussion and in class projects using real datasets
  • Communicate and interpret all aspects of data analysis (data, cleaning, analysis, results) to a diverse, technical or non-technical audience, in oral, visual, and written format
  • Synthesize the above skills to create & present a new, independent data analysis project


Office Hours

Note that I am not planning to do "in-person" office hours this semester. I am using Google Calendar for virtual office hours, by appointment. If you go to my appointment page, you will see a real-time account of when I am available. My standard appointment slots are MTWF 9:30-11 a.m. and MW 3:20-4:40 p.m. Note that these appointment slots will disappear once I've been booked. If I ever need to cancel office hours on a given day (say, for example, if I'm ill), I will update the calendar and email anyone with an appointment. If I find that I have more requests for appointments than I have availability, I may convert some of these slots to virtual hangouts, where anyone would be welcome to drop in, but for the moment I will hold these as opportunities for one-on-one discussion.

If my office hours by appointment do not work for your schedule, you can also email me to request an appointment at another time. When sending me such an email (or really any email), please follow some basic conventions of formality and politeness. There's no need to construct the equivalent of a business letter, but please don't begin your message with "hey," and please take an extra moment to make sure you spelled my name correctly. I promise to show you the same courtesy. I will do my best to reply within 48 hours, barring any emergency circumstances.

Additional Norms and Policies

Here you will find information on required readings, important university policies, and course-specific policies like attendance and cell phone use.

Required Texts

R for Data Science (Wickham and Grolemund), ISBN-13: 978-1491910399
Free online (http://r4ds.had.co.nz), or order the print edition by matching the ISBN
Additional selected readings will be made available as html or pdf, and linked to the course website


All projects in this course will be scripted and analyzed using R, an open source data analysis language and environment. No previous experience with R, statistical software packages, or computer programming is required. Specifically, we will be using RStudio as our programming environment. Instructions for installing R and RStudio will be posted to the calendar below.

Grading and Feedback

Since there are multiple sections of DA 101 every semester, the various instructors work hard to make sure there is approximate parity in terms of content, workload, and expectations. However, it is also true that each professor has their strengths, areas of interest, and priorities, and I'm sure I'm no exception. We'll have to spend some time together for you to get a real sense of what I value, how I grade, etc., but I look forward to that process, and to getting to know you all better more generally. One of the big advantages of a school like Denison is that, if you want to work with me again, you'll probably be able to, whether in another data analytics course, a summer research fellowship, or some other capacity. 

As a general rule, the expectations in this course are high, and I'm confident you can all do great work. The feedback I provide on assignments is designed to help you get there. My goal is to provide specific, relevant, and honest feedback when I grade your work. This will include constructive criticism, strategies for improvement, and guidance on how students can achieve success. I will not do "compliment sandwiches" just to begin and end on a positive remark, but this means that, when I praise your work, it's an honest (and I think more meaningful) act of praise. 

Regarding the major assignment rubric, it is adapted from the standards that the data analytics program uses for all its majors. I don't expect your work to meet the same standards as a graduating senior, but I think using the same categories on our rubric will give you a better idea of what it might mean to major in data analytics, as many of you are hoping to do. If you don't want to be a data analytics major, these criteria are still highly relevant to almost any program of study. 

Primary Assignment Rubric

Item | Description
Assignment Process | All materials are turned in on time and in the right place. Assignment directions are followed. Required components are all present and submitted on time.
Attention to Detail | The project is well organized, flows logically, and follows all the formatting guidelines, including attention to proofreading, proper citations, and language that is appropriate to a well-informed, non-technical reader.
Research Question and Research Design | The project has a focused and well-defined research question that can be addressed with computational, data-driven analysis. The focal data set and method(s) are appropriate for the research question.
Data, Visuals, and Code | The data are fully described, properly sourced, and presented in appropriate ways. Visuals (tables, charts, graphs) are used effectively to describe multiple aspects of the research project (data, methods, or results). The paper provides sufficient details and/or points to supplementary materials that make the research reproducible by a technical reader (e.g., detailed footnotes, appendices, GitHub, code, etc.).
Data Analysis Methods | The method(s) used to test the research question are justified, validated, and applied appropriately; the student appropriately describes the strengths and weaknesses of the methods used; outside sources are used to justify how the methods are used and interpreted.
Reporting and Interpretation of Results | The results are interpreted correctly and clearly address the research question; the project discusses its limitations, the extent to which it can be generalized, and expansion to further research.
Ethical Considerations | The writing thoughtfully engages any ethical considerations of using the data, methods, and implications of communicating the findings.

Grading Scale

Letter Grade | Percentage | Description
A+ | 97-100 | Superior achievement in all aspects
A | 94-96 | Superior achievement in most areas
A- | 90-93 | Superior achievement in at least one area
B+ | 87-89 | Exceeds expectations in all aspects
B | 84-86 | Exceeds expectations in most areas
B- | 80-83 | Exceeds expectations in at least one area
C+ | 77-79 | Meets expectations in all aspects
C | 74-76 | Meets expectations in most areas
C- | 70-73 | Meets expectations in at least one area
D | 65-69 | Does the assigned task but does not meet expectations or work is not appropriate for college level
F | 0-64 | Unexcused late work, does not do the assigned task, not complete, or quality is significantly below expectations

Grade Breakdown

Item | Percentage | Comments
Oral Presentation | 5 | Individual assignment
Quizzes | 10 | Individual assignments
Data and Code Ledger | 10 | Team-based assignments
Lab Work and Lab Reports | 50 | Team-based assignments
Final Project | 25 | Individual assignment. Four separate components (see assignment description)
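
To make the arithmetic concrete, here is a minimal sketch of how the percentages above could combine into a course grade. The weights and letter-grade cutoffs are copied from the two tables; the example scores are invented, and the choice to place a fractional result (say, 89.5) in the lower band until it reaches the next cutoff is an assumption, not a stated policy of this course.

```python
# Category weights (percent of course grade), copied from the Grade Breakdown table.
WEIGHTS = {
    "Oral Presentation": 5,
    "Quizzes": 10,
    "Data and Code Ledger": 10,
    "Lab Work and Lab Reports": 50,
    "Final Project": 25,
}

# Lower bound of each letter grade, copied from the Grading Scale table.
SCALE = [
    (97, "A+"), (94, "A"), (90, "A-"),
    (87, "B+"), (84, "B"), (80, "B-"),
    (77, "C+"), (74, "C"), (70, "C-"),
    (65, "D"), (0, "F"),
]


def course_percentage(scores):
    """Weighted average of per-category scores, each on a 0-100 scale."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


def letter_grade(percentage):
    """Return the first letter whose lower bound the percentage reaches.

    Assumption: no rounding is applied, so 89.5 stays in the B+ band.
    """
    for lower_bound, letter in SCALE:
        if percentage >= lower_bound:
            return letter
    return "F"


# Hypothetical scores, invented for illustration only.
example_scores = {
    "Oral Presentation": 90,
    "Quizzes": 80,
    "Data and Code Ledger": 100,
    "Lab Work and Lab Reports": 88,
    "Final Project": 92,
}
total = course_percentage(example_scores)
print(total, letter_grade(total))
```

Because lab work carries half of the weight, a few points gained or lost there move the total more than the same change in any other category.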

The Quantitative GE Requirement

The goal of the quantitative reasoning requirement is to develop the skills of all students in the descriptive, analytical, and predictive aspects of quantitative reasoning. A course fulfilling this requirement must utilize numerical quantities and employ, as an integral and sustained part of the course, at least one of the following forms of quantitative reasoning.

  1. the application of mathematical models to describe or predict the behavior of systems, and the design, construction, and interpretation of graphical representations of mathematical models.
  2. the utilization, numerical analysis, and interpretation of the significance and limitations of data to answer questions, test hypotheses, or solve problems, and the design, construction, and interpretation of graphical representations of numerical data.

Extensions Policy

Retroactive and last-minute extensions will not be granted. At the same time, life happens. Sometimes something just isn’t going to get done. If you speak to me at least a week ahead of time and I approve an extension, I will consider assigning a new due date and hold you to it. The trade-off is that work turned in this way is probably not going to end up in my hands when I grade everything else, so it’s going to get less feedback. If you miss a deadline entirely without getting an extension, you will automatically receive a 0 for your grade.


Cell phones should be off and put away. Laptops are okay for notes and such, but you should not be messaging, using Facebook, etc. I’ll check screens regularly and give you a verbal warning on your first offense. After that, I reserve the right to ask you to leave class and mark you absent if you are creating a distraction.

Being Prepared for Class

Coming to class prepared means that you have the day's reading in hand (printed or digital) and have come to class with a way to take notes (printed or digital). If you are not prepared for class, I reserve the right to grade as if you were absent for that day. Anything due on a given day is due at the start of class. Any digital submission of material is due by the time class starts on the day the hard copy is due. These policies apply for in-person and remote participation.

Remote and Asynchronous Learning Policy

In these unprecedented times, our section of DA 101 will have student participants in various locations and time zones. We can also expect that one or more students may need to miss class because of illness or quarantine protocol. As a result, there will be a no-permission-needed policy of allowing students to participate remotely or asynchronously, as long as they complete all the work for each day of class. I only ask that you keep me informed and meet with me and/or our TA, as needed, in order to keep pace with the course work. Note, however, that Denison's university-wide attendance policy still applies. This means, among other things, that if a class is missed, for any reason, the student is responsible for determining what occurred in the missed class. Additionally, absence from a class will not be accepted as an excuse for not knowing class material.

If you need to participate in a particular class remotely but synchronously, you can do so by joining the remote feed for our course, which will be password protected but online for every class period. (Our lab will be completely remote and conducted in the same way.) If you need to participate in a particular class remotely and asynchronously, you will be able to access a video recording of the day's Zoom broadcast via Notebowl. You should also look at the daily calendar and complete any readings, quizzes, homework, lab reports, etc. If you are missing a team-based assignment, you should coordinate your participation with your teammates. To get access to any lectures, or to make up a peer review, you should email me about whether to make a virtual appointment with me or our TA.

If all classes, at some point in the term, are forced to switch entirely to remote learning, I will provide detailed instructions on how to complete all the remaining assignments.

Disability Resources

If you are a student who feels you may need an accommodation based on the impact of a disability, you should contact me privately as soon as possible to discuss your specific needs. I rely on the Academic Resource Center in 020 Higley Hall to verify the need for reasonable accommodations based on documentation on file in that office.

Academic Integrity

Proposed and developed by Denison students, passed unanimously by DCGA and Denison’s faculty, the Code of Academic Integrity requires that instructors notify the Associate Provost of cases of academic dishonesty. Cases are typically heard by the Academic Integrity Board, which determines whether a violation has occurred, and, if so, its severity and the sanctions. In some circumstances the case may be handled through an Administrative Resolution Procedure. Further, the code makes students responsible for promoting a culture of integrity on campus and acting in instances in which integrity is violated.

Academic honesty, the cornerstone of teaching and learning, lays the foundation for lifelong integrity. Academic dishonesty is intellectual theft. It includes, but is not limited to, providing or receiving assistance in a manner not authorized by the instructor in the creation of work to be submitted for evaluation. This standard applies to all work ranging from daily homework assignments to major exams. Students must clearly cite any sources consulted--not merely for quoted phrases, but also for ideas and information that are not common knowledge. Neither ignorance nor carelessness is an acceptable defense in cases of plagiarism. It is the student’s responsibility to follow the appropriate format for citations. Students should ask their instructors for assistance in determining what sorts of materials and assistance are appropriate for assignments and for guidance in citing such materials clearly.

Our Commitment to Liberal Arts Education

Denison's mission statement articulates an explicit commitment to liberal arts education. It emphasizes active learning, which defines students as active participants in the learning process, not passive recipients. Denison seeks to foster self-determination and to demonstrate the transformative power of education. A crucial aspect of this approach is what Denison's mission statement refers to as "a concern for the whole person," which is why the university provides a "living-learning environment" based on individual needs and an overriding concern for community. This community is based on "a firm belief in human dignity and compassion unlimited by cultural, racial, sexual, religious or economic barriers, and directed toward an engagement with the central issues of our time."

In this class, we will discuss inequality directly. In many cases, you will be asked to apply quantitative reasoning skills to these subjects, which can be difficult because there is always the potential for the available data to complicate or contradict something you may feel very passionate about. In these cases, you should aspire to adopt an attitude of critical skepticism, i.e., wary of claims that are not supported by evidence but potentially willing to be persuaded by evidence if you find it compelling, and willing to give that evidence a fair hearing.

How we treat one another will be a cornerstone of these conversations. Denison's "Guiding Principles" speak of "a community in which individuals respect one another and their environment." Further, "each member of the community possesses a full range of rights and responsibilities. Foremost among these is a commitment to treat each other and the environment with mutual respect, tolerance, and civility." It's easy to treat someone this way when you like them and agree with their ideas, but the real challenge is treating those who differ from us with the same compassion and respect. However, I consider disruptive, deceitful, or hateful behavior to be breaches of these responsibilities. Bullying, trolling, hate speech, and harassment of any kind will not be tolerated.

Teaching Assistants

Our TA for this section of DA 101 will be Lam Tran (tran_l7[at]denison.edu). Lam is a double major in data analytics and economics, with a minor in computer science, and she will graduate in Spring 2021. She has worked extensively in Python and R, and she has experience with data visualization, data analysis, and database design. She will attend our weekly lab (remotely), and she will be available by appointment to help answer questions about the course or particular assignments. Her TA hours are Wednesdays, 9-10 p.m. and Thursdays, 6-7 p.m. You can connect on Zoom on either day, and in person on Thursdays (in Burton D. Morgan 405). Lam's office hours are also posted on NoteBowl with a Zoom link and a password. Office hours for all other DA 101 TAs will follow. While each TA is scheduled to work with a particular faculty instructor and attend their lab sections, our program encourages students to visit any TA during scheduled office hours, especially if their availability better fits your schedule.


Oral Presentation (5% of grade)

Each of you will be responsible for giving a PechaKucha presentation this semester. A PechaKucha is a type of lightning talk in which a presenter shows 20 slides, with 20 seconds of commentary each (6 minutes and 40 seconds total). Read more about the PechaKucha presentation.

Quizzes (10% of grade)

This course has intermittent quizzes on material from readings and lectures. Quizzes are designed to measure how well you are integrating the material. They will generally consist of 10-20 multiple choice, fill-in-the-blank, and short answer style questions. There are four quizzes for this course, one for each quarter. Normally, I would do these in class, but in the interests of reducing hand-offs of pieces of paper, all quizzes will be online and open book, to be completed outside of class. There are also four take-home assignments, one for each quarter, and each of these counts as a quiz grade. Quizzes and take-home assignments will be conducted through Notebowl or GitHub Classroom.

Data and Code Ledger (10% of grade)

The Data and Code Ledger is a living document that you and your lab teammates will assemble over the course of the semester. You will turn it in periodically in lieu of a lab report for that week. Read more about the Data and Code Ledger.

Lab Work and Lab Reports (50% of grade)

This portion of your grade consists of participating in the lab and completing all lab assignments. Assignment descriptions, datasets, and starter code will be made available through GitHub and GitHub Classroom. Links to individual lab assignments can be found on the Lab Schedule below.

Final Project (25% of course grade)

The final project is divided into four components, each due during a different quarter of the semester. Each component is worth 25% of your Final Project grade. Read more about the components of the Final Project Assignment.

Lab Schedule

Note: The lab component of this course will be conducted remotely using Zoom. Most labs require online submission of a team-based lab report, which is always due by 5 p.m. on the following Friday. Where noted, individual submissions are required but also due on Fridays. I'll add links to each lab as we go. The Zoom link and password are in NoteBowl. Click for general information about labs.  

Lab 1: Tuesday, August 18, 2020

Invasive Species Part 1

Due Thursday: Individual Submission

Lab 2: Tuesday, August 25, 2020

Collecting and Coding Data

Due Thursday: Individual Submission

Lab 3: Tuesday, September 1, 2020

Invasive Species Part 2

Due Thursday: Team Lab Report

Lab 4: Tuesday, September 8, 2020

Book Reviews

Due Thursday: Turn in Data and Code Ledgers (Individual Submission)

Lab 5: Tuesday, September 15, 2020

Political Polarization Part 1

Due Friday: No homework

Lab 6: Tuesday, September 22, 2020

Political Polarization Part 2

Due Friday: Team Lab Report

Lab 7: Tuesday, September 29, 2020

Substance Use

Due Friday: Team Lab Report

Lab 8: Tuesday, October 6, 2020

Simpson's Paradox

Due Next Week: Turn in Data and Code Ledgers (Individual Submission)

Lab 9: Tuesday, October 13, 2020


Due Next Week: Team Lab Report

Lab 10: Tuesday, October 20, 2020


Due Next Week: No homework

Lab 11: Tuesday, October 27, 2020

Reproducible Code

Due Next Week: Turn in Data and Code Ledgers (Individual Submission)

Lab 12: Tuesday, November 3, 2020

Authorship Attribution

Due Next Week: Team Lab Report

Lab 13: Tuesday, November 10, 2020


Due Next Week: Complete IRB Training

Lab 14: Tuesday, November 17, 2020


Completed During Class Time: Course Evaluations

Summary of Due Dates

Due Date | Assignment
Wednesday, August 19, 2020 | Sign up for GitHub, make Google Drive Folder, Complete Course Survey
Thursday, August 20, 2020 | Individual Lab Report
Wednesday, August 26, 2020 | Add details to your GitHub profiles
Thursday, August 27, 2020 | Individual Lab Report
Wednesday, September 2, 2020 | Sign up for oral presentations (Google drive doc); Quiz 1 due before class
Thursday, September 3, 2020 | Team Lab Report
Thursday, September 10, 2020 | Data and Code Ledgers (Individual Submission)
Monday, September 14, 2020 | Complete the Topic Exploration Assignment
Monday, September 21, 2020 | Take-home assignment (counts as quiz grade)
Wednesday, September 23, 2020 | Quiz 2 due before class
Friday, September 25, 2020 | Team Lab Report (by 5 p.m.)
Wednesday, September 30, 2020 | Complete the Project Plan Assignment
Friday, October 2, 2020 | Team Lab Report (by 5 p.m.)
Monday, October 5, 2020 | Mid-term Evaluation and Progress Report
Friday, October 9, 2020 | Quiz 3 due before class
Monday, October 12, 2020 | Take-home assignment (counts as quiz grade)
Tuesday, October 13, 2020 | Data and Code Ledgers (Individual Submission, by 1:50 p.m.)
Tuesday, October 20, 2020 | Team Lab Report (by 1:50 p.m.)
Wednesday, October 21, 2020 | Complete Final Project Visualization Component
Wednesday, October 28, 2020 | Quiz 4 due before class
Tuesday, November 3, 2020 | Data and Code Ledgers (Individual Submission, by 1:50 p.m.)
Wednesday, November 4, 2020 | Take-home assignment (counts as quiz grade)
Tuesday, November 10, 2020 | Team Lab Report (by 1:50 p.m.)
Monday, November 16, 2020 | Complete First Draft of Final Project Written Analysis
Tuesday, November 17, 2020 | Complete IRB Training (by 1:50 p.m.)
Tuesday, November 17, 2020 | Course Evaluations (completed in class)
Wednesday, November 18, 2020 | Take-home assignment (counts as quiz grade)
Thursday, December 3, 2020 | Complete and Submit Final Version of Written Analysis and Reflection

Weekly Calendar

Quarter 1: Introduction to Data Analytics

Week 1: Data Literacy

Monday, August 17, 2020

In Class: Introductions

Homework: Sign up for GitHub, make Google Drive Folder, Complete Course Survey

Wednesday, August 19, 2020

In Class: Discuss survey results

Homework: Watch Three TED Talks ... Stacy Smith, The Data Behind Hollywood's Sexism (15:36); JP Rangaswami, Information is Food (7:48); John Wilbanks, Let's Pool Our Medical Data (16:11)

Friday, August 21, 2020

Discussion: The Promise of Data

Homework: Read R for Data Science "1. Introduction" (to the book) and "2. Introduction" (to the "Explore" section)

Week 2: Data, Metadata and Quantification

Monday, August 24, 2020

Slides/Discussion: Data and Data Analysis

Homework: Read Definitions of Data, from Data Science: An Introduction; fill in some details on your GitHub profiles, including profile photos. (Note: You are not required to share a photo of yourself if you don't want to. The idea here is just to replace the default profile photo with something more personalized.)

Wednesday, August 26, 2020

Slides/Discussion: Data vs. Metadata; The Data Lifecycle

Homework: Read "Chapter 4: Field of Ignorance" from Moneyball (pdf)

Friday, August 28, 2020

Discussion: The Problems of Quantification

Homework: Read R for Data Science "5. Data transformation"

Week 3: Working with Data

Monday, August 31, 2020

Slides/Discussion: Transforming Data

Homework: Sign up for oral presentations (Google drive doc, link in NoteBowl); complete Quiz 1: Data Literacy, Metadata, and the Data Lifecycle

Wednesday, September 2, 2020

Slides/Discussion: Scaling, resampling, bootstrapping

Homework: Read Richard Jean So, "All Models Are Wrong" (pdf)

Friday, September 4, 2020

Discussion: All Models Are Wrong. Some Models Are Useful.

Homework: Read R for Data Science "3. Data visualization"

Week 4: Data Visualization

Monday, September 7, 2020

Slides/Discussion: Reviewing data visualization in R (ggplot2), common visualization types

Homework: Watch Hans Rosling, TED Talk, "The Best Stats You've Ever Seen"; Watch David McCandless, "The Beauty of Data Visualization"

Wednesday, September 9, 2020

PechaKucha: The Five Hat Racks Principle

Slides/Discussion: Storytelling with Data

Homework: Read Cairo, The Truthful Art, 41-65 (pdf)

Friday, September 11, 2020

PechaKucha: Accessible Data Visualizations

Discussion: The Five Qualities of Great Visualizations

Homework: Complete the Topic Exploration Assignment. Submit via Google Drive folder and Bring a Laptop to Next Class

Quarter 2: Data and Communication

Week 5: Descriptive Analytics 1

Monday, September 14, 2020

PechaKucha: Mean, Median, and Mode

In Class: Peer Feedback on Topic Exploration Assignment

Homework: Read R for Data Science "7. Exploratory Data Analysis"

Wednesday, September 16, 2020

PechaKucha: Variance, Standard Deviation & Interquartile Range

Slides/Discussion: Using data to generate questions and pursue insight

Homework: Read "Introduction: the Hidden Side of Everything", from Freakonomics (pdf)

Friday, September 18, 2020

PechaKucha: Z-scores and Z-differences

Discussion: Norms, trends, individuals, and outliers

Homework: Take-home assignment (counts as quiz grade)

Week 6: Descriptive Analytics 2

Monday, September 21, 2020

PechaKucha: Bias in Sampling

Slides/Discussion: Distributions, sampling, central limit theorem

Homework: Complete Quiz 2: Working with Data, Data Visualization, Sampling, and Populations

Wednesday, September 23, 2020

Slides/Discussion: Descriptive Statistics Continued 

Homework: Cathy O'Neil, "Arms Race: Going to College" from Weapons of Math Destruction (pdf)

Friday, September 25, 2020

PechaKucha: Information vs. Entropy

Discussion: Information, Overfitting, and Overconfidence

Homework: Read excerpt from Erich L. Lehmann, Fisher, Neyman, and the Creation of Classical Statistics (Springer, 2011) (pdf)

Week 7: Hypothesis Testing

Monday, September 28, 2020

PechaKucha: ANOVA Analysis

Slides/Discussion: T-Tests, P-Values

Homework: Complete the Project Plan Assignment. Submit via Google Drive folder and Bring a Laptop to Next Class

Wednesday, September 30, 2020

PechaKucha: Familywise Comparison Error

In Class: Peer Review of Project Plan Assignment

Slides/Discussion: Confidence Intervals, Effect Size

Homework: Read R for Data Science "22. Introduction" (to the "Model" section) and "23. Model Basics"

Friday, October 2, 2020

PechaKucha: Chi-Squared Test

In Class: Mid-term Evaluation and Progress Report

Discussion: Frequentism and its Assumptions

Homework: No-work Weekend

Quarter 3: Data Analysis

Week 8: Predictive Analytics 1

Monday, October 5, 2020

In Class: Mid-term Evaluation and Progress Report

Slides/Discussion: Goodness of Fit

Homework: Complete Quiz 3: Hypothesis Testing, Fitting Models, P-Values, Effect Size, Confidence Intervals

Wednesday, October 7, 2020

PechaKucha: Pearson vs. Spearman Correlation

Slides/Discussion: Logistic and Linear Regression

Homework: Read "Scientific Realism," Stanford Encyclopedia of Philosophy, Parts 1.1, 1.2, 1.3, and 4.1

Friday, October 9, 2020

PechaKucha: Cohen's D Test

Discussion: Instrumentalism and its assumptions

Homework: Read Trevor Martin, Dissecting Trump's Most Rabid Online Following, FiveThirtyEight.com, 23 March 2017.

Week 9: Predictive Analytics 2

Monday, October 12, 2020

PechaKucha: Cosine Similarity

Slides/Discussion: Unsupervised learning

Homework: Take-home assignment (counts as quiz grade)

Wednesday, October 14, 2020

Slides/Discussion: Unsupervised learning continued

Homework: Read "Interpretations of Probability," Stanford Encyclopedia of Philosophy

Friday, October 16, 2020

PechaKucha: K-Means Clustering

Discussion: What is Probability?

Homework: Read Nate Silver, "Less and Less Wrong," The Signal and the Noise (Penguin, 2012) 221-247 (pdf)

Week 10: Bayesian Analytics

Monday, October 19, 2020

PechaKucha: Principal Component Analysis

Slides/Discussion: Bayes' Theorem and Bayesian Data Analysis

Homework: Watch Alex Edmans, TEDx Talk, What to Trust in a Post-Truth World (17:47)

Wednesday, October 21, 2020

Slides/Discussion: How to Think Like a Bayesian

Homework: Complete Final Project Visualization Component. Submit via Google Drive folder and Bring a Laptop to Next Class. (Final version due Friday at 5 p.m.)

Friday, October 23, 2020

PechaKucha: Hidden Markov Models

Discussion: Peer Review of Final Project Visualization Component

Homework: Read William Stafford Noble "A Quick Guide to Organizing Computational Biology Projects"

Quarter 4: Case Studies and Student Projects

Week 11: Case Studies 1, Computational Biology 

Monday, October 26, 2020

PechaKucha: Factor Analysis

Slides/Discussion: Organizing Computational Projects

Homework: Complete Quiz 4: Regression, Predictive Analytics, Classification, and Clustering

Wednesday, October 28, 2020

Slides/Discussion: Organizing Computational Projects Continued

Homework: Read D'Ignazio and Klein, "The Numbers Don't Speak for Themselves," Data Feminism (Cambridge: MIT Press, 2020), Open Access Edition Online: https://data-feminism.mitpress.mit.edu/pub/czq9dfs5/release/2

Friday, October 30, 2020

PechaKucha: Decision Tree Classifiers

Discussion: What is Data Feminism?

Homework: Read José Nilo G. Binongo "Who Wrote the 15th Book of Oz? An Application of Multivariate Analysis to Authorship Attribution" (pdf)

Week 12: Case Studies 2, Authorship Attribution

Monday, November 2, 2020

PechaKucha: Support Vector Machine Classifiers

Slides/Discussion: Authorship attribution methods

Homework: Take-home assignment (counts as quiz grade)

Wednesday, November 4, 2020

PechaKucha: Neural Networks / Deep Learning

Slides/Discussion: Author, Date, Gender, and Genre Signals

Homework: Read Andrew Piper, "There Will Be Numbers"

Friday, November 6, 2020

PechaKucha: Random Forest Classifiers

Discussion: Can culture be analyzed quantitatively?

Homework: Read Harris et al., "Two Failures to Replicate High-Performance-Goal Priming Effects" (pdf)

Week 13: Case Studies 3, Priming Effects

Monday, November 9, 2020

Slides/Discussion: Priming Effects and the Replication Crisis

Homework: Read Longino, "Values and Objectivity," Science as Social Knowledge (pdf)

Wednesday, November 11, 2020

Discussion: The curse of dimensionality, the file drawer problem, and other blind spots

Homework: Read Helena Mihaljevic et al., "Reflections on Gender Analyses of Bibliographic Corpora," Frontiers in Big Data 2 (August 28, 2019): 29

Friday, November 13, 2020

In Class: Discussion of Gender Bias

Homework: Complete First Draft of Final Project Written Analysis. Submit via Google Drive folder and Bring a Laptop to Next Class

Week 14: Case Studies 4, Gender Prediction

Monday, November 16, 2020

Slides/Discussion: Peer Review of Final Project Written Analysis

Homework: Take-home assignment (counts as quiz grade)

Wednesday, November 18, 2020

Discussion: Data Analytics at Denison

Homework: Read Daniel Kahneman, "The Illusion of Understanding," "The Illusion of Validity," and "Intuitions vs. Formulas," Thinking, Fast and Slow (pdf)

Friday, November 20, 2020

Discussion: The Utility of Data; The Limits of Quantification

Thanksgiving Break

Monday-Friday, November 23-27, 2020

Week 15: Remote Finals Week

Monday, November 30, 2020

Reading and Study Day

Tuesday, December 1, 2020

Reading and Study Day

Thursday, December 3, 2020

Homework: Complete and Submit Final Version of Written Analysis and Reflection via Google Drive by 11:59 p.m. EST