Your Professor:
Matt Lavin
My Email:
lavinm@denison.edu
My Office:
Burton D. Morgan Center 411
Office Hours
1:30 - 2:30 p.m. Fri by appointment, 1-2:30 p.m. Tue walk-in
Our Classroom:
Burton Morgan 219
When We Meet:
MWF 10:30 a.m. - 11:20 a.m.
When the Lab Meets:
Thursdays, 1:30 p.m. - 4:20 p.m.
Many of the most pressing problems in the world can be addressed with data. We are awash in data, and modern citizenship demands that we become literate in how to interpret data, what assumptions and processes are necessary to analyze data, and how we might participate in generating our own analyses and presentations of data. Consequently, data analytics is an emerging field with skills applicable to a wide variety of disciplines. This course introduces analysis, computation, and presentation concerns through the investigation of data-driven puzzles in a wide array of fields – political, economic, historical, social, biological, and others. No previous experience is required.
By the end of the course, you should be able to:
I am always happy to see students during my office hours, whether it's to discuss this class, majoring in DA, how I can contribute to your learning at Denison, or your plans for life after graduation (career, graduate school, etc.). Like many professors, I offer a mix of in-person appointments (via Google Calendar) and drop-in office hours.
For office hours by appointment, visit my appointment page, where you will see a real-time account of when I am available. You can book the appointment with one or two clicks by selecting any time when I'm listed as available. My standard appointment slots are divided into 15-minute blocks from 1:30 to 2:30 p.m. on Fridays. Note that these appointment slots will disappear from my calendar once I've been booked, and you are required to book appointments at least 24 hours in advance.
Drop-in office hours will be held in my office from 1 to 2:30 p.m. on Tuesdays. For these, you will not need an appointment, and I encourage you to drop by or work in or near the lab space across the hall from me. I see students in the order they arrive, so there is no guarantee that I will have time for everyone on a given day. In other words, if you have a very specific or time-sensitive question or concern, it's best to make an appointment or email me.
If I ever need to cancel by-appointment office hours on a given day (say, for example, if I'm ill), I will update the calendar and email anyone with an appointment. If I ever need to cancel office hours on a given drop-in day, I will post to Canvas or e-mail the entire class.
Here you will find information on required readings, important university policies, and course-specific policies like attendance and cell phone use.
R for Data Science, 2e (Wickham, Çetinkaya-Rundel, and Grolemund), ISBN-13: 978-1492097396. Free online (https://r4ds.hadley.nz/), or order the print edition online by matching the ISBN.
Additional selected readings will be made available as HTML or PDF and linked from the course website or shared via Canvas.
All projects in this course will be scripted and analyzed using R, an open-source data analysis language and environment. No previous experience with R, statistical software packages, or computer programming is required. We will use RStudio as our programming environment. Instructions for installing R and RStudio will be posted to the calendar below.
As a general rule, the expectations in this course are high, and I'm confident you can all do great work. The feedback I provide on assignments is designed to help you get there. My goal is to provide specific, relevant, and honest feedback when I grade your work. This will include constructive criticism, strategies for improvement, and guidance on how students can achieve success. I will not do "compliment sandwiches" just to begin and end on a positive remark, but this means that, when I praise your work, it's an honest (and I think more meaningful) act of praise.
Item | Percentage | Comments |
---|---|---|
Attendance and Participation | 12 | See description below. |
Oral Presentation | 2 | Individual assignment |
Quizzes | 20 | Individual assignments |
Self-Guided Learning Assignment | 11 | Individual assignment |
Lab Work and Lab Reports | 40 | Mix of individual and team-based assignments |
Final Project | 15 | Individual assignment (see assignment description) |
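As a rough illustration of how the percentages in the table combine, here is a minimal Python sketch of a weighted course grade. The weights come from the table; the component scores are made-up placeholders, and scoring each component from 0 to 100 is an assumption for illustration, not stated course policy. The sketch also ignores the attendance and late-work penalties described below.

```python
# Weights copied from the grading table (percent of the final course grade).
WEIGHTS = {
    "Attendance and Participation": 12,
    "Oral Presentation": 2,
    "Quizzes": 20,
    "Self-Guided Learning Assignment": 11,
    "Lab Work and Lab Reports": 40,
    "Final Project": 15,
}


def weighted_grade(scores):
    """Combine per-component scores (assumed to be 0-100) into one grade."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    # Each component contributes its score scaled by its share of 100 percent.
    return sum(WEIGHTS[item] * scores[item] for item in WEIGHTS) / 100


# Hypothetical scores, for illustration only.
example_scores = {
    "Attendance and Participation": 90,
    "Oral Presentation": 100,
    "Quizzes": 80,
    "Self-Guided Learning Assignment": 85,
    "Lab Work and Lab Reports": 88,
    "Final Project": 92,
}

print("Total weight:", sum(WEIGHTS.values()))
print("Weighted grade:", weighted_grade(example_scores))
```

Because lab work carries 40 percent of the weight, a change in that score moves the result far more than the same change in the oral presentation score.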
In a liberal arts context, it is particularly important that students arrive on time and come prepared to engage with the community we will be building in our classroom. I expect you to attend every class meeting, and I expect you to arrive on time. Attendance will be taken every day. If you cannot attend class, it is your responsibility to get (from a classmate) all written notes about what we discussed in class, including in-class announcements.
Note: Late arrival counts as half an absence. If you have a commitment immediately before this class that will force you to come late (say, a class in Mitchell), you should either drop that class or drop this one.
Since regular attendance is a prerequisite to passing this course, missing more than five classes will result in an additional penalty of -5% from your final course grade for each unexcused absence beyond the fifth.
Participation will be assessed using a mix of preparedness, speaking during class discussions, remaining attentive during lectures, and completing in-class assignments. An unexcused absence will result in a participation score of zero for the day, and coming late will result in a maximum participation score of 50 for the day.
Creating distractions in class is especially frowned upon. This includes falling asleep, leaving class excessively, and talking over the professor. Cell phones should be off and put away. Laptops are okay for notes and similar work, but when you are using one, you should not be messaging, using Facebook, etc. In the event of a disruption, I will typically give a verbal warning for your first offense. After that, I reserve the right to ask you to leave class and mark you absent for the day.
If you have a legitimate emergency such as a serious illness, a mental health emergency, or a death in the family, I will grant an appropriate extension with a new due date. The trade-off is that work turned in this way is probably not going to end up in my hands when I grade everything else, so it's going to get very sparse feedback. If you miss a deadline entirely without getting an extension, you will automatically lose 10 points off the top of your grade for each day it is late, in addition to any points you lose for the quality of the work. Retroactive and last-minute extensions will not be granted.
Coming to class prepared means that you have the day's reading in hand (printed or digital) and have come to class with a way to take notes (printed or digital). If you are not prepared for class, I reserve the right to grade as if you were absent for that day. Anything due on a given day is due at the start of class. Any digital submission of material is due by the time class starts on the day the hard copy is due.
If you are a student who feels you may need an accommodation based on the impact of a disability, you should contact me privately as soon as possible to discuss your specific needs. I rely on the Academic Resource Center in 020 Higley Hall to verify the need for reasonable accommodations based on documentation on file in that office.
The Peer Learning Strategists (PLS) program was developed by Denison students and faculty for those in introductory science classes. It is an initiative of a larger program called RAISE (Readiness and Inclusion in Science Education) and is a great resource to learn how to study more efficiently and learn more effectively. The PLS program employs peer-to-peer mentoring focused on teaching overarching learning strategies crucial to success in college science classes. Trained science majors work as PLS mentors to help hone your learning approach, since the skills most helpful in college often differ from the skills that led to high achievement in high school. Students meet one-on-one with a PLS mentor for one hour weekly for at least three weeks, with some students continuing beyond the three-session recommendation. PLS mentors are not tutors, content is not course-specific, and conversations provide space for attaining skills for lifelong learning and success. Contact Science Initiatives Coordinator Jeni Miller or Dr. Melanie Lott with additional questions.
Proposed and developed by Denison students, passed unanimously by DCGA and Denison's faculty, the Code of Academic Integrity requires that instructors notify the Associate Provost of cases of academic dishonesty. Cases are typically heard by the Academic Integrity Board, which determines whether a violation has occurred, and, if so, its severity and the sanctions. In some circumstances the case may be handled through an Administrative Resolution Procedure. Further, the code makes students responsible for promoting a culture of integrity on campus and acting in instances in which integrity is violated.
Academic honesty, the cornerstone of teaching and learning, lays the foundation for lifelong integrity. Academic dishonesty is intellectual theft. It includes but is not limited to providing or receiving assistance in a manner not authorized by the instructor in the creation of work to be submitted for evaluation. This standard applies to all work ranging from daily homework assignments to major exams. Students must clearly cite any sources consulted--not merely for quoted phrases, but also for ideas and information that are not common knowledge. Neither ignorance nor carelessness is an acceptable defense in cases of plagiarism. It is the student's responsibility to follow the appropriate format for citations. Students should ask their instructors for assistance in determining what sorts of materials and assistance are appropriate for assignments and for guidance in citing such materials clearly.
Note on Technology: Unauthorized use of technology (including, but not limited to, artificial intelligence sites and translation programs) in the preparation or submission of academic work can be considered a form of cheating and/or plagiarism. Instructors may at their discretion create assignments that incorporate the use of supporting technologies and will inform students of acceptable uses of technology in their courses. It is the responsibility of the student to ask the instructor for clarification whenever they are unclear about the parameters of a specific assignment and to understand that presenting the work of artificial intelligence as your own constitutes a violation of Denison's Code. Cases of suspected inappropriate use of technology may be submitted to the Academic Integrity Board to initiate an investigation of academic dishonesty. For further information about the Code of Academic Integrity, see https://denison.edu/academics/curriculum/integrity .
Denison's mission statement articulates an explicit commitment to liberal arts education. It emphasizes active learning, which defines students as active participants in the learning process, not passive recipients. Denison seeks to foster self-determination and to demonstrate the transformative power of education. A crucial aspect of this approach is what Denison's mission statement refers to as "a concern for the whole person," which is why the university provides a "living-learning environment" based on individual needs and an overriding concern for community. This community is based on "a firm belief in human dignity and compassion unlimited by cultural, racial, sexual, religious or economic barriers, and directed toward an engagement with the central issues of our time."
In this class, we will discuss inequality directly. In many cases, you will be asked to apply quantitative reasoning skills to these subjects, which can be difficult because there is always the potential for the available data to complicate or contradict something you may feel very passionate about. In these cases, you should aspire to adopt an attitude of critical skepticism, i.e., wary of claims that are not supported by evidence but potentially willing to be persuaded by evidence if you find it compelling, and willing to give that evidence a fair hearing.
How we treat one another will be a cornerstone of these conversations. Denison's "Guiding Principles" speak of "a community in which individuals respect one another and their environment." Further, "each member of the community possesses a full range of rights and responsibilities. Foremost among these is a commitment to treat each other and the environment with mutual respect, tolerance, and civility." It's easy to treat someone this way when you like them and agree with their ideas, but the real challenge is treating those who differ from us with the same compassion and respect. However, I consider disruptive, deceitful, or hateful behavior to be breaches of these responsibilities. Bullying, trolling, hate speech, and harassment of any kind will not be tolerated.
Essays, journals, and other coursework submitted for this class are generally considered confidential pursuant to the University's student record policies. However, students should be aware that University employees are required by University policy to report allegations of discrimination based on sex, gender, gender identity, gender expression, sexual orientation or pregnancy to the Title IX Coordinator or a Deputy Title IX Coordinator. This includes reporting all incidents of sexual misconduct, sexual assault and suspected abuse/neglect of a minor. Further, employees are to report these incidents that occur on campus and/or that involve students at Denison University whenever the employee becomes aware of a possible incident in the course of their employment, including via coursework or advising conversations. There are others on campus to whom you may speak in confidence, including clergy and medical staff and counselors at the Wellness Center. More information on Title IX and the University's Policy prohibiting sex discrimination, including sexual harassment, sexual misconduct, stalking and retaliation, including support resources, how to report, and prevention and education efforts, can be found at: https://denison.edu/campus/title-ix .
This assignment has two main purposes: to help you practice oral presentation and communication skills, and to increase your ability to use data visualizations and other visual design elements effectively in service of a central goal. Working in pairs, students will choose a topic from a list provided by the professor, and will create a digital poster that provides a concise but thorough overview of their topic, including some information about theory and application. Pairs will also give a low-stakes, informal presentation on their poster.
This course has intermittent classroom quizzes on material from readings and lectures, which are designed to measure how well you are keeping up with the readings and integrating the material. They will generally consist of ~10 multiple choice, fill-in-the-blank, and short answer style questions. These quizzes will be taken in person, without access to books or notes.
There are also two code-based quizzes held during lab sessions. For these, you will be allowed to use any resources you like, including the book, Stack Overflow, etc. All quizzes are marked on the schedule below (under "Lab Schedule" and "Weekly Calendar").
For this assignment, you will choose an area of interest and conduct twenty hours of self-guided, deliberate practice in order to learn more about one of the focal areas of the course, with a particular eye on something you plan to use for your final project. This assignment will provide an opportunity to acquire skills in support of your final project, but it's also a chance to think about and pursue your own interests. Reflecting on how you learn will make skills acquisition easier and more rewarding. Note that this assignment is divided into several small-scale submissions, with a full packet submission at the end.
This portion of your grade consists of participating in the lab and completing all lab assignments. Assignment descriptions, datasets, and starter code will be made available through GitHub and GitHub Classroom. Labs will be a mix of individual and team-based assignments. In general, lab assignments will be due one week from the day they are assigned, by the start of our next lab. The lab schedule has more details about our weekly topics and which weeks have lab reports.
The purpose of the final project is to combine the core skills you have gained in the class in an application of the data cycle, producing a short, polished report on a question of your choosing, ideally something that you're passionate about or that is relevant to your life or interests (some suggestions are offered toward the end). This final project will incorporate all of the steps involved in the data cycle as we have studied it. You will be in charge of stating an interesting question that doesn't duplicate previous projects/labs, conducting exploratory analyses using skills from the entire semester, building a model that you will interpret, and then communicating your key findings in a polished, professional narrative.
Lab 0: Thursday, August 28, 2025
Quickstart
Lab 1: Thursday, September 4, 2025
Garlic Mustard pt. 1
Lab 2: Thursday, September 11, 2025
Data Collection and Annotation
Lab 3: Thursday, September 18, 2025
Garlic Mustard pt. 2
Lab 4: Thursday, September 25, 2025
Lab quiz
Lab 5: Thursday, October 2, 2025
Substance Use
Lab 6: Thursday, October 9, 2025
Audiobooks pt. 1
Lab 7: Thursday, October 16, 2025
Fall break--no lab
Lab 8: Thursday, October 23, 2025
Audiobooks pt. 2
Lab 9: Thursday, October 30, 2025
AirBNB
Lab 10: Thursday, November 6, 2025
Lab quiz
Lab 11: Thursday, November 13, 2025
Authorship Attribution
Lab 12: Thursday, November 20, 2025
Self-Guided Learning Student Presentations; no lab-based homework assignment; work on self-guided learning assignment
Lab 13: Thursday, December 4, 2025
Data ethics
Lab 14: Thursday, December 11, 2025
Reproducible code; no lab-based homework assignment; work on final projects
Weekly Rhythm
Monday | Wednesday | Thursday | Friday |
---|---|---|---|
Hands-on activity, live-coding, etc. Complete weekly reference reading by start of Monday's class. | Instructor slide presentations. Classroom quizzes are also held on Wednesdays. | Lab assignment or lab quiz. | Discuss readings or do activity based on reading. |
Week 1: Introducing Data Analytics
(Monday, August 25, 2025 - Friday, August 29, 2025)
Homework for Friday: Sign up for a GitHub account, complete course survey
Reference Reading for Monday's Class: R4DS, "Introduction" (https://r4ds.hadley.nz/intro.html) and "Workflow Basics" (https://r4ds.hadley.nz/workflow-basics.html)
Week 2: Data, Metadata, Quantification
(Monday, September 01, 2025 - Friday, September 05, 2025)
Due before Thursday's Lab: Lab 0
Friday's Reading: "The Dream of Prediction" from Everything is Obvious* (once you know the answer) (pdf on Canvas)
Reference Reading for Monday's Class: R4DS, "Workflow: Code Style" (https://r4ds.hadley.nz/workflow-style.html), "Data Transformation" (https://r4ds.hadley.nz/data-transform.html), and "Data Tidying" (https://r4ds.hadley.nz/data-tidy.html)
Week 3: Working with Data
(Monday, September 08, 2025 - Friday, September 12, 2025)
Due before Thursday's Lab: Lab 1 - Garlic Mustard part 1
Reading for Wednesday's Class: Cathy O'Neil, excerpt from "Weapons of Math Destruction" (pdf on Canvas)
Reminder: No class on Friday
Reference Reading for Monday's Class: R4DS, "Data Visualization" (https://r4ds.hadley.nz/data-visualize.html), "Visualize" (https://r4ds.hadley.nz/visualize.html), and "Layers" (https://r4ds.hadley.nz/layers.html)
Week 4: Data Visualization pt 1
(Monday, September 15, 2025 - Friday, September 19, 2025)
Notes and Reminders: Quiz on Wednesday
Due before Thursday's Lab: Lab 2 - Data Collection
Friday's Reading: Alberto Cairo, "The Five Qualities of Great Visualizations" (pdf on Canvas)
Reference Reading for Monday's Class: R4DS, "Exploratory Data Analysis" (https://r4ds.hadley.nz/EDA.html) and "Communication" (https://r4ds.hadley.nz/communication.html)
Week 5: Descriptive Statistics
(Monday, September 22, 2025 - Friday, September 26, 2025)
Due before Thursday's Lab: Lab 3 - Garlic Mustard pt 2
Reminder: Lab quiz on Thursday
Friday's Reading: No reading this week
Week 6: Statistical Significance
(Monday, September 29, 2025 - Friday, October 03, 2025)
Friday's Reading: D'Ignazio and Klein, "What Gets Counted Counts" (from Data Feminism)
Week 7: Correlation and Association
(Monday, October 06, 2025 - Friday, October 10, 2025)
Due before Thursday's Lab: Lab 4 - Substance Use
Friday's Reading: Kahneman, "The Illusion of Understanding," "The Illusion of Validity," and "Intuitions vs. Formulas" from Thinking, Fast and Slow (pdf on Canvas)
Week 8: Review
(Monday, October 13, 2025 - Friday, October 17, 2025)
Notes and Reminders: Quiz on Wednesday
Friday's Reading: No reading this week
Reminder: No lab this week and no class Friday (Fall Break)
Week 9: Linear Regression pt 1
(Monday, October 20, 2025 - Friday, October 24, 2025)
Due Before Wednesday's Class: Final Project Topic and Self-Guided Learning Focal Area
Due before Thursday's Lab: Lab 5 - Audiobooks pt 1
Friday's Reading: "Bears Will Be Boys," The Pudding (https://pudding.cool/2025/07/kids-books/)
Week 10: Linear Regression pt 2
(Monday, October 27, 2025 - Friday, October 31, 2025)
Due before Thursday's Lab: Lab 6 - Audiobooks pt 2
Friday's Reading: Stephens-Davidowitz, "Big Data, Big Schmata? What it Cannot Do" (from Everybody Lies)
Due Before Monday's Class: Self-Guided Learning Curriculum and Final Project Dataset
Week 11: Classification pt 1
(Monday, November 03, 2025 - Friday, November 07, 2025)
Due before Thursday's Lab: Lab 7 - AirBNB
Reminder: Lab quiz this week
Friday's Reading: No reading this week
Week 12: Classification pt 2
(Monday, November 10, 2025 - Friday, November 14, 2025)
Notes and Reminders: Quiz on Wednesday
Friday's Reading: José Nilo G. Binongo, "Who Wrote the 15th Book of Oz? An Application of Multivariate Analysis to Authorship Attribution" (pdf on Canvas)
Due Before Monday's Class: Self-Guided Learning Log with Reflective Comments
Week 13: Learning DA
(Monday, November 17, 2025 - Friday, November 21, 2025)
Due before Thursday's Lab: Lab 8 - Authorship Attribution
In Lab Session on Thursday: Self-Guided Learning Student Presentations
Friday's Reading: Waitzkin, excerpt from "The Art of Learning" (pdf on Canvas)
By Monday after Break: Self-Guided Learning Assignment Full Packet and Final Reflection
Thanksgiving
(Monday, November 24, 2025 - Friday, November 28, 2025)
Reminder: No class--Thanksgiving Break
Week 14: DA Ethics
(Monday, December 01, 2025 - Friday, December 05, 2025)
Notes and Reminders: Quiz on Wednesday
Friday's Reading: Gebru et al., "Datasheets for Datasets"
Week 15: Reproducible Research
(Monday, December 08, 2025 - Friday, December 12, 2025)
Due before Thursday's Lab: IRB Training
Reminder: No lab assignment--Work on final projects
Friday's Reading: William Stafford Noble, "A Quick Guide to Organizing Computational Biology Projects" (https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1000424)
Week 16: Exam Week
(Sunday, December 14, 2025--Thursday, December 18, 2025)
By the time and date of our exam: Final project