Recall that all frequentist statistics are fundamentally based on the idea of the sampling distribution. We have certain beliefs (derived from things like the central limit theorem) about what kinds of data the world will show us under given assumptions; then we look at the data that the world has, in fact, shown us, and ask how much that data embarrasses the assumptions about the world that we started with.
By contrast, the foundational question of Bayesian statistics is "given some prior belief about the world, and some evidence, what do we believe now?" I don't want to dig too deeply into this, as we don't even remotely have the time to cover it. But the basic idea is:
Come up with a prior probability distribution. For example, in a discrimination case, you might ask "what's the probability that men and women are paid differently?" This can be an "informative prior"---one that's based on, for example, prior research. Or it can be an "uninformative prior" that more-or-less doesn't bias the results in one way or another.
Apply Bayes' rule. The likelihood tells us the probability of seeing our data under each hypothesis, and combining that with the prior via Bayes' rule lets us update quite directly to get a posterior distribution.
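To make the update concrete, here's a minimal sketch of a Bayes' rule calculation in Python. The hypotheses and all the numbers (the 50/50 prior and the likelihoods) are made up for illustration; they aren't drawn from any real case.

```python
# Two hypothetical hypotheses in a pay-discrimination setting:
# "no_gap" (men and women paid the same) and "gap" (a pay disparity).
# Suppose the observed pay data would arise with probability 0.05 under
# "no_gap" and 0.40 under "gap" (assumed likelihoods, for illustration).

prior = {"no_gap": 0.5, "gap": 0.5}          # uninformative 50/50 prior
likelihood = {"no_gap": 0.05, "gap": 0.40}   # P(data | hypothesis), assumed

# Bayes' rule: posterior is proportional to prior times likelihood,
# then we normalize so the posterior probabilities sum to 1.
unnormalized = {h: prior[h] * likelihood[h] for h in prior}
total = sum(unnormalized.values())
posterior = {h: p / total for h, p in unnormalized.items()}

print(posterior["gap"])  # about 0.889: the "gap" hypothesis now dominates
```

The posterior is the direct answer to "what do we believe now?"---here, roughly an 8-to-1 belief in the pay-gap hypothesis, starting from even odds.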
There's a good clear explanation of the basic ideas of Bayesian vs frequentist inference in this old MIT lecture notes document. The major issue that lots of people have with Bayesian inference is the business with priors. The choice of an uninformative or an informative prior can matter a lot, and which is appropriate depends a lot on context. For example, a Bayesian statistician trying to learn about whether telekinesis exists would probably be well-advised to use an informative prior, because a freak result in an experiment probably shouldn't budge your beliefs very much. By contrast, a statistician trying to figure out whether a particular employer engaged in discrimination probably ought to start with an uninformative prior, because it would unfairly put a thumb on the scale to begin with a belief that the employer discriminates against women, say, to the same degree as the observed nationwide pay gap.
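The telekinesis example can be sketched numerically with a beta-binomial model (a standard conjugate setup, chosen here for simplicity; the trial counts are invented). Suppose an experiment produces a "freak result" of 8 hits in 10 trials where chance would predict a hit rate of 0.5:

```python
# Made-up experiment: 8 hits out of 10 trials, chance rate 0.5.
hits, misses = 8, 2

def posterior_mean(alpha, beta):
    # Beta(alpha, beta) prior + binomial data -> Beta(alpha+hits, beta+misses)
    # posterior; return that posterior's mean.
    return (alpha + hits) / (alpha + beta + hits + misses)

flat = posterior_mean(1, 1)        # uninformative flat Beta(1, 1) prior
skeptical = posterior_mean(500, 500)  # strong prior centered on chance (0.5)

print(flat)       # 0.75: the flat prior lets the freak result dominate
print(skeptical)  # about 0.503: the informative prior barely budges
```

Same data, very different posteriors---which is exactly why the choice of prior is both the main objection to Bayesian methods and, in contexts like parapsychology, a feature rather than a bug.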
In the past, the major barriers to the widespread adoption of Bayesian statistics have probably been the scarier math and the computational demands; these days computers are lots faster so it's the scarier math and the priors. (The math does get quite scary when you get to complicated questions... although honestly the math in frequentist maximum likelihood estimation, which I haven't even talked about in this class, also gets a little scary-looking and calculus-ey.) But the flip side to the priors issue is the overwhelming advantage of Bayesian statistics, namely that it answers the question we actually want to answer, viz, how likely is the hypothesis given our data, rather than how likely is the data given the hypothesis.
A second advantage of Bayesian statistics is that collecting more data isn't cheating. In the frequentist null hypothesis significance testing approach, collecting data, looking at it, seeing a nonsignificant p-value, and then going out and getting more data is totally p-hacking; you're looking at the data a bunch of times, and so you're radically increasing the risk of type I errors (of falling into that 5% of cases where you get a significant result when the null hypothesis is true). But you don't have to worry about stopping rules in Bayesian hypothesis testing, because more data unproblematically gets integrated into the overall estimate. (This is a slight oversimplification... Bayesian statistics folks have endless debates about stopping rules. But if there are problems, they aren't nearly as severe as the problem of "peeking" at the data in the middle of data collection in a frequentist/null-hypothesis context.)
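The "more data isn't cheating" point can be seen directly in a conjugate model. In the beta-binomial setup below (all counts invented), updating on two batches in sequence yields exactly the same posterior as updating on all the data at once, so there is no analogue of inflating the error rate by peeking between batches:

```python
# Sketch: sequential vs all-at-once Bayesian updating (beta-binomial).

def update(prior_alpha, prior_beta, successes, failures):
    """One Bayes update: Beta prior + binomial data -> Beta posterior.

    With a Beta(a, b) prior, observing the data just adds the counts:
    the posterior is Beta(a + successes, b + failures).
    """
    return prior_alpha + successes, prior_beta + failures

# Sequential: look at batch 1, then go out and collect batch 2.
a, b = update(1, 1, 12, 8)    # batch 1: 12 successes, 8 failures
a, b = update(a, b, 30, 10)   # batch 2: 30 successes, 10 failures

# All at once: pool both batches into a single update.
a_all, b_all = update(1, 1, 42, 18)

print((a, b) == (a_all, b_all))  # True: identical posterior either way
```

Yesterday's posterior simply becomes today's prior, so when you stopped looking doesn't change where you end up.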
This class has focused on classical statistics mainly because I believe that this is what you are most likely to encounter in legal practice.