Introduction to Quantitative & Computational Legal Reasoning (LAW:8645)

Revisions for coronavirus shutdown

Spring 2020; Monday + Tuesday 2:00-3:30; Classroom 265.

Professor: Paul Gowder

Assistant: Diana DeWalle


Welcome to Introduction to Quantitative and Computational Legal Reasoning, informally named "Sociological Gobbledygook" in honor of Chief Justice Roberts's slightly math-phobic remark at oral argument in Gill v. Whitford. This course is offered at the University of Iowa College of Law in Spring 2020, by Paul Gowder, and has previously been offered in Spring 2019: you can go look at last year's syllabus if you're curious. For a little bit more about the motivation for the course, you can check out the manifesto.

This is a totally open-source course. There is a GitHub repository which contains all of our materials and discussions. Please feel free to make a suggestion or start a conversation in the issues, or even make a pull request. For unstructured navigation of this website, you can use the [list of tags](/tags.html) that describe the subject matter of our lessons.

Note: If you're accessing this syllabus in PDF form on ICON, some of the links may not work, and it may not be fully updated. The canonical syllabus will always appear in HTML form on the course website. Much of the language below (like "this website") assumes that you are accessing the syllabus from there.

Course Summary

This course will review basic principles of probability, statistics, and computational reasoning (including elementary programming) for law students. Throughout, the emphasis will be on mathematically modest intuition, practical skills, and legal applications. No mathematical background beyond high school algebra will be assumed.

This course is not advised for students with substantial statistical or computational backgrounds---it is designed as a beginner course. Nor will it prepare students to be competent empirical researchers or computer programmers---the goal is to give students the capacity to critically evaluate and understand statistical reasoning, and to use computational methods to do so (as well as in their legal practices more generally). Focus will be on breadth rather than depth, as well as legal applications.

Introduction to Quantitative and Computational Legal Reasoning is experimental. This syllabus is not a contract; I reserve the right to make radical changes in how the course operates throughout the semester, depending on how student learning progresses. (However, as this is version 2 of the course, you can expect the changes to be a bit less radical compared to last year.)

Course Resources

Note that lessons are downloadable directly from this website, both as PDFs and (where lessons are in that format) as Jupyter notebooks that you can execute on the IDAS service (about which more below).



The main readings for this course will be on this website. There may be a little bit of copyrighted stuff that I can't distribute publicly; that material will be posted on ICON.

In addition, there will be supplemental readings drawn from Charles Severance, Python for Everybody, which is available for free online, and Michael Finkelstein & Bruce Levin, Statistics for Lawyers, which should be free to download as PDF through our library's subscription. (You may have to be on the campus network to download it; you might also have to search for it through the library's directory.) We will also be using some excerpts from the Federal Judicial Center's Reference Manual on Scientific Evidence.

We'll make use of some videos and exercises online. We'll talk more about this on the first day, but I will be assigning you mini-online-introductory courses to do in order to get practice with programming, from Dataquest's free tier (maybe) and from Hackerrank and Project Euler. You'll want to grab accounts at those places.

We will also use some free instructional materials from the nonprofit organizations Software Carpentry and Data Carpentry.

For the contents of this website, it's probably easiest to access them using the week-by-week links below or the content-based tags built into this website. There might be formatting glitches with some of the lessons, due to conversion between different file types and html, but every lesson will have a link to a downloadable and printable PDF at the bottom which will ordinarily have cleaner formatting (except for long lines of code, which may be cut off in PDF but should be fine on web). Some lessons should also have downloadable Jupyter notebooks associated with them, which will also be linked at the bottom.

Bonus Reading Suggestions

I am committed to only assigning resources which are free to students. However, the nature of this material is that sometimes one explanation will just "click" where another might not. So in addition to the assigned readings, I offer you this list of additional, non-free, readings which you might consult for a different perspective on the material---or for deeper engagement and exploration.


I really like the Aspen textbook by Lawless, Robbennolt and Ulen, Empirical Methods in Law. It has very good clear explanations of a number of research methods topics, and is not overly math-y. If you want to dig deeper into stats and empirical research in law, I highly recommend it. I also recommend Lee Epstein & Andrew Martin, An Introduction to Empirical Legal Research.

If you want to do serious research on your own, you will need to move to more advanced texts, but the direction you go will depend on the particular kind of research you want to conduct. For experimental research, especially experimental research out in the world (like the kinds of things done by discrimination testers, about which we will talk), a classic text is Gerber & Green, Field Experiments: Design, Analysis and Interpretation; if you are more interested in observational research, I really like Angrist & Pischke, Mastering Metrics. Both of those books are rather more math-y than the others (or our class).

Charles Wheelan, Naked Statistics: Stripping the Dread from the Data, is a well-liked book on the other end of the spectrum---it focuses on intuitive and non-math-y explanations of statistics topics. It's quite good in that respect, but I'm not fully comfortable recommending it for other reasons. The author thinks he's funnier than he actually is, and the book features a number of fairly tasteless, and in some cases offensive, jokes. If you can put up with that, however, the book is good at explaining stats in a clear way.

Some other books that might be of interest to you, though I haven't reviewed them as closely and hence can't clearly endorse, include:

Python Programming

My favorite introductory Python book (not free) is John Guttag, Introduction to Computation and Programming Using Python. This book is also the basis for a wonderful electronic course by almost the same name from MIT on EdX --- and you can go through the course for free, and without buying the book. I really do think that course (and the second course in the same series) is an amazing way to learn Python, and programming in general.

Blessedly, there are a lot of good introductory Python programming books out there which are also available online for free. One of my favorites is Al Sweigart, Automate the Boring Stuff with Python. For more advanced (and non-free) learning, I really love Luciano Ramalho's Fluent Python, although by the time you need that you should be looking at building fairly substantial programs.

It is better to use a Python book that is based on Python 3, not Python 2.

General Learning

I highly recommend Barbara Oakley's book A Mind for Numbers, which is basically a self-help book on the psychology of learning difficult things---which can help you not just in math-y classes but in law school and other classes in general. There's an online course based on her book on Coursera, called Learning How to Learn; I've never looked at that course but everyone who has done so has raved about it.

How Class Will Go

This course is structured as a vaguely "flipped" lab-style process. You will largely consume the talking-ish "content" kind of instruction outside of the classroom, primarily through readings. In classroom time, I will demonstrate the practical usage of the things you've learned about outside of class, maybe do a teeny tiny bit of lecturing, and (more commonly) assign exercises for you to carry out, with the opportunity to work together to figure them out and with me looming over your shoulder to help.

Please bring a computer to every class. Mac, Linux, or Windows computers will work best. Chromebooks and tablets will work less well, though we can get them to work if need be.

The coverage, pace, and workload in this class will be a continuing work in progress. Because this class isn't taught a lot in law schools, there is not much collective wisdom on how to do it successfully, and I expect to have to adapt the assignments and the pace to accommodate how readily the class takes to the material. So don't expect the syllabus to stay stable as we move through the course.

Class technological resources

This class will use the University's Interactive Data Analytics Service (IDAS). You do not need to request access to this; I've made arrangements to get accounts for the whole class, and I'll walk you through getting access to this resource on the first day of class. (You can also see the links on the bottom of this page.)

In addition, every session will be recorded on Panopto, and we'll use ICON to distribute materials which (for copyright reasons, etc.) we're not allowed to distribute outside of the class. Also, you will use the discussion feature of ICON to share information and ask questions out of class.

This class is intended in part to produce resources which will be available to the legal profession at large in order to help your fellow lawyers understand code and stats as well; accordingly, many of the reading assignments will be to lessons posted on this website. Those assignments will also be available on the course GitHub repository; you will find it useful to get the assignments there in order to execute and mess around with the code yourself. I'll explain how GitHub works on the first day of class too.


Evaluation will be primarily based on four problem sets. The first two will be computer programming-based (with the second possibly including a probability problem or two), and will be worth 17.5% of the grade each. The third will be probability and statistics based and will be worth 25% of the grade. The fourth will be comprehensive, with emphasis on the statistics side, and will be worth 30% of the grade.

The weird fractions are to accommodate 10% of the grade which will be based on classroom participation and preparation, and which is meant to enforce the lab-style classroom format. Students who get full credit for that 10% will complete the simple out-of-class tasks which I will periodically assign and participate in good faith in collective problem-solving in the classroom. (This is an effort-based 10%, not a performance-based 10%.)
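To see how those percentages fit together, here is a minimal sketch in Python. The weights come straight from the paragraphs above; the 0-100 scoring scale, the function name, and the example scores are assumptions made up purely for illustration, and are not a statement of how grades will actually be computed.

```python
# Weights as stated in the syllabus text (four problem sets plus participation).
WEIGHTS = {
    "pset1": 0.175,
    "pset2": 0.175,
    "pset3": 0.25,
    "pset4": 0.30,
    "participation": 0.10,
}


def weighted_score(scores):
    """Combine component scores (assumed to be on a 0-100 scale) into one number."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# The weights should add up to 100% of the grade.
total_weight = sum(WEIGHTS.values())
print(round(total_weight, 10))

# Hypothetical scores, invented for this example only.
example = {"pset1": 80, "pset2": 90, "pset3": 85, "pset4": 88, "participation": 100}
print(round(weighted_score(example), 2))
```

The first printed value confirms that the "weird fractions" plus the participation share cover the whole grade.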

Under the policy described in Student Handbook section B.3 ("The curve is not applicable in upper-level seminars and other upper-level classes in which a student's grade is based primarily on the student’s performance on graded skills-oriented tasks (including writing) other than a final exam."), this course will not be curved.

In order to ensure that students in this course aren’t disadvantaged by unfamiliarity with the format or a collection of too-hard problem sets on which everyone struggles, there will be a floor for the distribution of grades for this course: at the lowest, the median grade for this course will be the official law school median of 3.3. In other words, you can’t do worse than the standard curve would otherwise dictate. But you can do better.

Last year's problem sets are available on this website, along with answers to them; looking them over will give you an idea of the approximate challenges that you'll be asked to complete.

All problem sets will be turned in via email to Diana DeWalle. The only place your name should appear in problem sets is in the filename.


Problem sets should be your own work. You are allowed to discuss the general approach to problem sets with one another, but you are not allowed to show one another your math or code.

For example: "I solved that problem by writing a loop over the list of cases" is acceptable. "Look at this code I wrote" is not.

Students will be asked to agree to an honor code.

Collaboration on in-class tasks and on homework assignments that are not one of the official problem sets is highly encouraged and probably necessary. However, you should try to do the homework assignments on your own first before consulting with your classmates, in order to maximize your learning. (Struggle and frustration are normal, expected, and healthy.)

Technology, Bugs, and Accommodations

This course will be technologically driven, obviously. Please let me know ASAP if there are any glitches of any kind.

Also, please contact me or the dean of students as soon as humanly possible if you need accommodations, so that these accommodations can be built into the tech. All course materials will be provided in formats that I believe are accessible (e.g., to screen readers); however, if I'm mistaken about their accessibility, please let me know and the problem will promptly be fixed.

Office Hours, Contacts, etc.

I will maintain office hours (Mon., Tue. 11am-1pm, and by appointment). I'm also happy to make appointments at other times, and you're always free to drop by when my door is open. I'm very good at replying to e-mail and very bad at checking telephone messages.

That being said, I very strongly encourage you to ask substantive questions in a way that will be accessible to your fellow students. This means using the copious time that will be made available in class time for that purpose, as well as making use of the discussion forum on ICON (in which I will very actively participate). If you have a question, it's almost certainly the case that several other people do too.

Some schedule notes

Spring Break is March 14-22. No class then, obviously. (Addendum: also no class the week after, for transition week.)

Problem sets are due on the following Fridays, each at 5pm Central: Feb 14 (week 4), Mar 6 (week 7), and May 8 (combined problem sets 3 and 4, in the middle of finals period).

Learning outcomes

By the end of the course, you should be able to:

embedded image: Justice Gorsuch makes a statistical oops

Coverage by Week

The first few weeks will be spent on computation; subsequent weeks will be spent on data analysis and statistics. As we get further into the future, the below becomes more subject to change, obviously.

Week 1 (Jan 21)

Coverage: Basic ideas of programming, units of computation, functions and loops. Computational logic and legal logic, law as computation. We will front-load the quantity of reading a little heavily to get us started quickly, but it'll ease off as we move on to more conceptually difficult material. (Also, I know the reading seems like a lot, but it goes faster than cases in 1L year!) In particular, don't feel obliged to fully absorb everything from Python for Everybody on the first reading. Just read it quickly so you get a feel for the terrain, and then more carefully read the stuff posted on this site, then dig back into Python for Everybody to fill out the details.


Like I said, I know this is a lot of reading. Don't worry, there won't be nearly so much as we move forward.

On the first day of class, we will get everyone set up with the different services and installation options for the software we need in the course, and, if there's time, demonstrate some basic programming ideas and work through some exercises.

For Week 1 practice homework, do the "Introduction" problems (click the introduction checkbox on the right of the screen) in the HackerRank Python Domain.

I'd also like you to complete the factorial exercise at the end of the first Python lesson; you don't need to turn it in, but let me know if you can't complete it; we'll look at people's solutions next week and use this as our test to make sure everyone is set up and functional.

Here is the in-class notebook that we saw on day 1.

Week 2 (Jan 27, 28)

Coverage: using Python to get access to other people's code, libraries. Accessing the filesystem and the internet from Python. Error handling. Strings.


In class on Tuesday for this week last year we worked through an example of accessing the Openstates API. We'll probably do that exercise again this year, and when it's done you can take a look at the example here.

For Week 2 practice homework, do the "Basic Data Types" problems (click the Basic Data Types checkbox on the right of the screen) in the HackerRank Python Domain, except the "List Comprehensions" and "Lists" problems, which you can skip (we'll try to do those in class).

Here are our in-class notebooks for week 2: Jan 27 (Monday) and Jan 28 (Tuesday). The due dates for the first two problem sets have also changed, and this change is reflected on the page you're looking at. I've also moved around the practice homework a bit.

Week 3 (Feb 3, 4)

Coverage: Regular expressions. Simulation and why you might want to do it. A very light introduction to object-oriented programming. (But we're a little behind and so we'll start this week with the networking and API stuff from last week.)
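As a small taste of what regular expressions are for, here is a hedged sketch that pulls U.S. Reports citations out of a sentence. The pattern and variable names are illustrative choices for this example, not taken from the course lessons, and the pattern only handles the simple "volume U.S. page" form.

```python
import re

# Two real citations, used here only as sample text.
text = (
    "See Marbury v. Madison, 5 U.S. 137 (1803); "
    "Brown v. Board of Education, 347 U.S. 483 (1954)."
)

# A volume number, the reporter abbreviation "U.S.", then the first page.
# The dots are escaped so that they match literal periods.
pattern = re.compile(r"\b(\d+) U\.S\. (\d+)\b")

# findall returns one (volume, page) tuple of strings per match.
citations = pattern.findall(text)
for volume, page in citations:
    print(f"volume {volume}, page {page}")
```

Real citation formats vary a lot (other reporters, pin cites, blank page numbers), so a pattern this simple would miss plenty of them.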


For Week 3 practice homework, go to the "Strings" problems in the HackerRank Python Domain and do:

Here are our in-class notebooks for week 3: Feb 3 (Monday) and Feb 4 (Tuesday).

Week 4 (Feb 10, 11)

Basic probability math. Bayes rule and conditional probability.

Focused legal applications: probabilistic causation in torts, junk science in criminal trials.
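For a sense of why Bayes' rule matters for forensic evidence, here is a minimal sketch. Every number in it is invented for illustration (the prior, the match probability, and the false-match rate), and the function name is just a label for this example.

```python
def posterior(prior, p_evidence_given_h, p_evidence_given_not_h):
    """Bayes' rule: P(H | E) from P(H), P(E | H), and P(E | not H)."""
    numerator = p_evidence_given_h * prior
    denominator = numerator + p_evidence_given_not_h * (1 - prior)
    return numerator / denominator


# Made-up scenario: before the test, a 1-in-1000 chance that the defendant
# is the source; the test always flags the true source, and wrongly flags
# an innocent person 1% of the time.
p = posterior(prior=0.001, p_evidence_given_h=1.0, p_evidence_given_not_h=0.01)
print(round(p, 4))
```

Under these made-up numbers the posterior probability comes out to roughly 9%, far lower than the "99% accurate" framing might suggest, because the prior is so small.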


Problem set 1 due Friday, February 14, at 5pm Central time. Here are my answers to pset 1.

See last year's Problem set 1, which you can do for practice if you'd like. After doing that practice, you can check out my answers to it.

No week 4 practice homework because the problem set is due.

Here are our in-class notebooks for week 4: Feb 10 (Monday); and Feb 11 (Tuesday).

Week 5 (Feb 17, 18)

Initial explorations into data with data visualization in Python. Basic properties of data, measures of central tendency, exploratory data analysis.


For Week 5 practice homework, do the "Errors and Exceptions" problems in the HackerRank Python Domain, then go to the "Regular Expressions and Parsing" section and do:

Here are our in-class notebooks for week 5: Feb 17 (Monday); for Feb 18, instead look in the class exercises for the answers to the data scavenger hunt.

Week 6 (Feb 24, 25)

Probability distributions, central limit theorem, hypothesis testing.
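The central limit theorem can be explored by simulation. The following is a rough sketch using only the Python standard library; the die-rolling setup, the sample sizes, and the seed are arbitrary choices for this illustration.

```python
import random
import statistics

random.seed(0)  # arbitrary seed so that reruns give the same draws


def sample_mean(n):
    """Mean of n rolls of a fair six-sided die."""
    return statistics.mean(random.randint(1, 6) for _ in range(n))


# Repeat the "roll 50 dice and average them" experiment 2000 times.
means = [sample_mean(50) for _ in range(2000)]

# A single die has mean 3.5 and variance 35/12, so the averages should
# cluster near 3.5 with a spread near sqrt(35/12) / sqrt(50), about 0.24.
center = statistics.mean(means)
spread = statistics.stdev(means)
print(center, spread)
```

A single die roll is uniformly distributed, but a histogram of `means` would look approximately bell-shaped, which is the point of the theorem.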


For Week 6 practice homework, do the "Debugging" problems in the HackerRank Python Domain.

Here is our one in-class notebook for week 6: Feb 25 (Tuesday); Feb 24 was a lecture without a notebook.

Week 7 (Mar 2, 3)

Experiments, random assignment. Causation and correlation. Focused legal application: audit tests in discrimination cases.


Problem set 2 due Friday, March 6, at 5pm Central time. Here are my answers to pset 2.

See last year's problem set 2, which, as before, you should do for practice; afterward you can look at my answers.

Here are our in-class notebooks for week 7: Mar 2 (Monday), plus a supplemental tutorial on installing third-party libraries; and Mar 3 (Tuesday).

Week 8 (Mar 9, 10)

Focus week: statistical extrapolation and simulation in the law. Shonubi case.


We'll spend this week catching up further, if necessary, discussing the Shonubi case, and replicating the data analysis used by experts in that case (approximately---we don't have quite the identical dataset). As time permits, I'll introduce the basic concepts of linear regression for next week.

For Week 8 practice homework, do the Multiples of 3 and 5 and Smallest multiple problems on Project Euler. (Note, you don't need to submit code on that site, just write the code to get the correct math answer.)

Week 9 (Mar 23, 24)

Regression analysis. Linear regression. Statistical evidence of discrimination.


For Week 9 practice homework do Even Fibonacci numbers from Project Euler. Also, look at the hypothetical dataset mickel.csv, and use the techniques we've learned in class to come to some conclusion about whether discrimination is occurring in the provision of public benefits in this disability services agency context. We'll go over this assignment in class at an appropriate time.

Week 10 (April 6, 7, revised schedule)

Applications and reinforcement. Slow-down week, solidify our existing knowledge by thinking about a concrete use of statistics in law: determining disparate impact in Title VII cases.


No practice problems this week.

Week 11 (April 13, 14)

P-values, p-hacking, publication bias, the replication crisis in psychology, power and underpoweredness, multiple comparisons, and other terrible pitfalls of scientific research.
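The multiple comparisons problem comes down to simple arithmetic. Here is a minimal sketch, assuming the tests are independent and every null hypothesis is true; the function name and the choice of 20 tests are illustrative.

```python
def familywise_error_rate(alpha, num_tests):
    """Chance of at least one false positive across independent tests,
    when each test uses significance level alpha and no real effect exists."""
    return 1 - (1 - alpha) ** num_tests


# One test at the conventional 0.05 level, versus twenty of them.
one_test = familywise_error_rate(0.05, 1)
twenty_tests = familywise_error_rate(0.05, 20)
print(round(one_test, 4), round(twenty_tests, 4))
```

With twenty independent tests at the 0.05 level, the chance of at least one spurious "significant" result is roughly 64%, which is why running many comparisons and reporting only the hits is so misleading.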


Discussion: what are the legal implications of scientific failures?

For Week 11 practice homework do Power Digit Sum from Project Euler. This will be the last of our practice homeworks, as I know that you're going to start freaking out about exams by now.

Week 12 (April 20, 21)

How regression analysis can go horribly wrong. Assumptions of regression. Failures of regression assumptions. Simpson's paradox.
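Simpson's paradox is easiest to believe once you see the numbers. The sketch below uses entirely hypothetical hiring counts, invented for this example, in which women are hired at a higher rate than men within each department but at a lower rate overall.

```python
# Hypothetical counts: (hired, applicants) by department and group.
data = {
    "Dept A": {"men": (80, 100), "women": (18, 20)},
    "Dept B": {"men": (4, 20), "women": (30, 100)},
}


def rate(hired, applicants):
    """Fraction of applicants who were hired."""
    return hired / applicants


def overall_rate(group):
    """Hiring rate for one group, pooling both departments together."""
    hired = sum(data[dept][group][0] for dept in data)
    applicants = sum(data[dept][group][1] for dept in data)
    return rate(hired, applicants)


for dept, groups in data.items():
    print(dept, {group: rate(*counts) for group, counts in groups.items()})
print("Overall", {group: overall_rate(group) for group in ("men", "women")})
```

The reversal happens because most women in this made-up data applied to the department that hires few people of either group, so pooling the departments hides the within-department pattern.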


Week 14 (April 27, 28)

Look at real-life expert witness reports, how arguments about quantitative methodology are used in court.

From statistics to machine learning: what is it that fancy data science people actually do with their time? Prediction vs inference. Algorithmic accountability. Discrimination by computer, and legal implications of statistical discrimination (intentional racial profiling and unintentional racial profiling).


Optional, bonus (but HIGHLY recommended) reading:

Final exam period

Consolidated Problem sets 3 and 4 due on Friday, May 8, during the exam period, at 5pm Central.

See last year's problem set 3, last year's problem set 4 and a makeup assignment the class did.