Course content

0h03m

Crash Course Statistics Preview

Welcome to Crash Course Statistics! In this series we're going to take a closer look at how statistics play a significant role in our everyday lives. Now this is a "math" course, and there will definitely be some math, but we're going to focus on how statistics is useful and valuable to you - someone who performs AND consumes statistics all the time. Statistics are everywhere, from batting averages and insurance rates to weather forecasting and smart assistants, and it's our hope that when you finish this series you'll have a better idea of the role statistics play in helping us understand the world!

Crash Course is on Patreon! You can support us directly by signing up at http://www.patreon.com/crashcourse

Thanks to the following Patrons for their generous monthly contributions that help keep Crash Course free for everyone forever:

Mark Brouwer, Nickie Miskell Jr., Jessica Wode, Eric Prestemon, Kathrin Benoit, Tom Trval, Jason Saslow, Nathan Taylor, Divonne Holmes à Court, Brian

0h13m

What Is Statistics: Crash Course Statistics #1

Welcome to Crash Course Statistics! In this series we're going to take a look at the important role statistics play in our everyday lives, because statistics are everywhere! Statistics help us better understand the world and make decisions, from what you'll wear tomorrow to government policy. But in the wrong hands, statistics can be used to misinform. So we're going to try to do two things in this series: show you the usefulness of statistics, and help you become a more informed consumer of statistics. From probabilities and paradoxes to p-values, there's a lot to cover in this series, and there will be some math, but we promise only when it's most important. But first, we should talk about what statistics actually are, and what we can do with them. Statistics are tools, but they can't give us all the answers.

Episode Notes:

On Tea Tasting:
"The Lady Tasting Tea" by David Salsburg

On Chain Saw Injuries:
https://www.cdc.gov/disasters/chainsaws.html
https://www.ncbi.nlm.nih.g

0h11m

Mathematical Thinking: Crash Course Statistics #2

Today we’re going to talk about numeracy - that is, understanding numbers. Whether the numbers are really, really big or really small, it's difficult to comprehend information at these scales, but these are often the types of numbers we see most in statistics. So understanding how these numbers work, how to best visualize them, and how they affect our world can help us become better decision makers - from deciding if we should really worry about Ebola to helping improve fighter jets during World War II!

Episode Notes:

Tim Urban’s wonderful post about visualizing large numbers:
https://waitbutwhy.com/2014/11/from-1-to-1000000.html

Some of our reading that inspired this episode:
How Not to Be Wrong: The Power of Mathematical Thinking by Jordan Ellenberg
Innumeracy: Mathematical Illiteracy and its Consequences by John Allen Paulos

0h11m

Mean, Median, and Mode: Measures of Central Tendency: Crash Course Statistics #3

Today we’re going to talk about measures of central tendency - those are the numbers that tend to hang out in the middle of our data: the mean, the median, and the mode. All of these numbers can be called “averages” and they’re the numbers we tend to see most often - from politics, when talking about polling or income equality, to batting averages in baseball (and cricket) and Amazon reviews. Averages are everywhere, so today we’re going to discuss how these measures differ, how their relationship with one another can tell us a lot about the underlying data, and how they are sometimes used to mislead.
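
If you'd like to see how the three "averages" can disagree, here is a minimal sketch using Python's built-in statistics module. The income numbers are made up for illustration and are not from the episode.

```python
from statistics import mean, median, mode

# Hypothetical yearly incomes (in thousands); the last value is an extreme outlier.
incomes = [30, 35, 35, 40, 45, 50, 400]

avg = mean(incomes)      # sum of the values divided by how many there are
mid = median(incomes)    # middle value once the data are sorted
common = mode(incomes)   # most frequently occurring value

print(f"mean={avg:.1f}, median={mid}, mode={common}")
```

One very large value drags the mean far above the median, and that kind of gap between the two is a hint that the underlying data are skewed.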


0h11m

Measures of Spread: Crash Course Statistics #4

Today, we're looking at measures of spread, or dispersion, which we use to understand how well medians and means represent the data, and how reliable our conclusions are. They can help us understand test scores and income inequality, spot stock bubbles, and plan gambling junkets. They're pretty useful, and now you're going to know how to calculate them!
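
Here is a small sketch of the common measures of spread (range, interquartile range, variance, and standard deviation) using Python's statistics module. The scores are hypothetical, and we assume the data are a whole population, so the variance divides by n. For a sample you would typically use variance() and stdev() instead, which divide by n - 1.

```python
from statistics import quantiles, pstdev, pvariance

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical test scores

data_range = max(scores) - min(scores)   # largest value minus smallest value
q1, q2, q3 = quantiles(scores, n=4)      # quartiles (default "exclusive" method)
iqr = q3 - q1                            # spread of the middle half of the data
variance = pvariance(scores)             # average squared distance from the mean
std_dev = pstdev(scores)                 # square root of the variance

print(data_range, iqr, variance, std_dev)
```

Different textbooks and tools compute quartiles slightly differently, so the IQR from another calculator may not match this one exactly.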

0h10m

Charts Are Like Pasta - Data Visualization Part 1: Crash Course Statistics #5

Today we're going to start our two-part unit on data visualization. Up to this point we've discussed raw data - which are just numbers - but usually it's much more useful to represent this information with charts and graphs. There are two types of data we encounter, categorical and quantitative data, and they likewise require different types of visualizations. Today we'll focus on bar charts, pie charts, pictographs, and histograms and show you what they can and cannot tell us about their underlying data as well as some of the ways they can be misused to misinform.

0h11m

Plots, Outliers, and Justin Timberlake: Data Visualization Part 2: Crash Course Statistics #6

Today we’re going to finish up our unit on data visualization by taking a closer look at how dot plots, box plots, and stem and leaf plots represent data. We’ll also talk about the rules we can use to identify outliers and apply our new data viz skills by taking a closer look at how Justin Timberlake’s song lyrics have changed since he went solo.

We scraped our Justin Timberlake song data from lyrics.com. If you're interested in how we did it or would like to try out the code on a different artist, check out our code on GitHub: https://github.com/cmparlettpelleriti/CC2018/tree/master/unique_lyrs

DISCLAIMER: Please be respectful to lyrics websites when scraping data. Some sites may have limits for the number of requests you can make each day.
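
One widely used rule for flagging outliers is the 1.5 x IQR rule that box plots are usually drawn with. We're assuming that rule here for illustration. The function name and the word counts below are hypothetical and are not the data we scraped.

```python
from statistics import quantiles

def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR below Q1 or above Q3."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical counts of unique words per song (not the episode's real data).
unique_words = [52, 58, 60, 61, 63, 65, 67, 70, 72, 140]
print(iqr_outliers(unique_words))  # prints [140]
```

The multiplier k=1.5 is a convention rather than a law, so a flagged point deserves a closer look before it gets thrown out.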

0h11m

The Shape of Data: Distributions: Crash Course Statistics #7

When collecting data to make observations about the world it usually just isn't possible to collect ALL THE DATA. So instead of asking every single person about student loan debt, for instance, we take a sample of the population, and then use the shape of our samples to make inferences about the true underlying distribution of our data. It turns out we can learn a lot about how something occurs, even if we don't know the underlying process that causes it. Today, we’ll also introduce the normal (or bell) curve and talk about how we can learn some really useful things from a sample's shape - like if an exam was particularly difficult, how often Old Faithful erupts, or if there are two types of runners that participate in marathons!

0h12m

Correlation Doesn’t Equal Causation: Crash Course Statistics #8

Today we’re going to talk about data relationships and what we can learn from them. We’ll focus on correlation, which is a measure of how two variables move together, and we’ll also introduce some useful statistical terms you’ve probably heard of like regression coefficient, correlation coefficient (r), and r^2. But first, we’ll need to introduce a useful way to represent bivariate continuous data - the scatter plot. The scatter plot has been called “the most useful invention in the history of statistical graphics” but that doesn’t necessarily mean it can tell us everything. Just because two data sets move together doesn’t necessarily mean one CAUSES the other. This gives us one of the most important tenets of statistics: correlation does not imply causation.
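
For the curious, here is one way the correlation coefficient (r) and r^2 could be computed by hand. This is a sketch of the standard Pearson formula with made-up study data, and the names pearson_r, hours_studied, and exam_score are our own choices for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation: co-variation of x and y, scaled by their spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    co_dev = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = sum((x - mean_x) ** 2 for x in xs)
    dev_y = sum((y - mean_y) ** 2 for y in ys)
    return co_dev / sqrt(dev_x * dev_y)

# Hypothetical data: hours studied and exam score for five students.
hours_studied = [1, 2, 3, 4, 5]
exam_score = [52, 60, 61, 70, 77]

r = pearson_r(hours_studied, exam_score)
print(f"r = {r:.3f}, r^2 = {r ** 2:.3f}")
```

A value of r close to 1 says the two variables move together strongly. It says nothing about whether studying causes the higher scores.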

0h12m

Controlled Experiments: Crash Course Statistics #9

We may be living IN a simulation (according to Elon Musk and many others), but that doesn't mean we don't need to perform simulations ourselves. Today, we're going to talk about good experimental design and how we can create controlled experiments to minimize bias when collecting data. We'll also talk about single- and double-blind studies, randomized block design, and how placebos work.

0h11m

Sampling Methods and Bias with Surveys: Crash Course Statistics #10

Participate in our survey! We'll analyze the results in future episodes! (Individual data will be kept anonymous.) https://bit.ly/2J1zimn

Today we’re going to talk about good and bad surveys. From user feedback surveys and telephone polls to those questionnaires at your doctor's office, surveys are everywhere, but because they're so easy to create and distribute, they're also susceptible to bias and error. So today we’re going to talk about how to identify good and bad survey questions, and how groups (or samples) are selected to represent the entire population, since it's often just not feasible to ask everyone.

0h10m

Science Journalism: Crash Course Statistics #11

We’ve talked a lot in this series about how often you see data and statistics in the news and on social media - which is ALL THE TIME! But how do you know who and what you can trust? Today, we’re going to talk about how we, as consumers, can spot flawed studies, sensationalized articles, and just plain poor reporting. And this isn’t to say that all science articles you read on Facebook or in magazines are wrong, but that it's valuable to read those catchy headlines with some skepticism.

0h11m

Henrietta Lacks, the Tuskegee Experiment, & Ethical Data Collection: Crash Course Statistics #12

Today we’re going to talk about ethical data collection. From the Tuskegee syphilis experiments and Henrietta Lacks’ HeLa cells to the horrifying experiments performed at Nazi concentration camps, many strides have been made, from Institutional Review Boards (or IRBs) to the Nuremberg Code, to guarantee voluntariness, informed consent, and beneficence in modern statistical gathering. But as we’ll discuss, with the complexities of research in the digital age, many new ethical questions arise.

0h12m

Probability Part 1: Rules and Patterns: Crash Course Statistics #13

Today we’re going to begin our discussion of probability. We’ll talk about how the addition (OR) rule, the multiplication (AND) rule, and conditional probabilities help us figure out the likelihood of sequences of events happening - from optimizing your chances of having a great night out with friends to seeing Cole Sprouse at IHOP!
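
To check the three rules against a case where we can count every outcome, here is a short sketch that enumerates two fair dice. The dice example and the helper name prob are ours, chosen for illustration, and the multiplication rule is shown in its simple form for independent events.

```python
from fractions import Fraction
from itertools import product

# Sample space: every equally likely outcome of rolling two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event = favorable outcomes / all outcomes."""
    hits = [o for o in outcomes if event(o)]
    return Fraction(len(hits), len(outcomes))

p_a = prob(lambda o: o[0] == 6)                  # A: first die shows a six
p_b = prob(lambda o: o[1] == 6)                  # B: second die shows a six
p_and = prob(lambda o: o[0] == 6 and o[1] == 6)  # A and B
p_or = prob(lambda o: o[0] == 6 or o[1] == 6)    # A or B

# Addition (OR) rule: P(A or B) = P(A) + P(B) - P(A and B)
assert p_or == p_a + p_b - p_and

# Multiplication (AND) rule for independent events: P(A and B) = P(A) * P(B)
assert p_and == p_a * p_b

# Conditional probability: P(B given A) = P(A and B) / P(A)
p_b_given_a = p_and / p_a
print(p_or, p_and, p_b_given_a)  # prints 11/36 1/36 1/6
```

Because the dice are independent, knowing the first roll tells us nothing about the second, so P(B given A) comes out equal to P(B).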

0h12m

Probability Part 2: Updating Your Beliefs with Bayes: Crash Course Statistics #14

Today we're going to introduce Bayesian statistics and discuss how this new approach to statistics has revolutionized the field from artificial intelligence and clinical trials to how your computer filters spam! We'll also discuss the Law of Large Numbers and how we can use simulations to help us better understand the "rules" of our data, even if we don't know the equations that define those rules.

Want to try out the law of large numbers simulation yourself? More details here:
https://github.com/cmparlettpelleriti/CC2018/blob/master/LLN.md
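
To make "updating your beliefs" concrete, here is a minimal sketch of Bayes' theorem applied to a toy spam filter. The function name bayes_update and all of the percentages are assumptions we made up for illustration. Real spam filters are far more involved.

```python
from fractions import Fraction

def bayes_update(prior, p_evidence_given_h, p_evidence_given_not_h):
    """Posterior P(H | E) = P(E | H) * P(H) / P(E), from Bayes' theorem."""
    numerator = p_evidence_given_h * prior
    evidence = numerator + p_evidence_given_not_h * (1 - prior)
    return numerator / evidence

# Hypothetical numbers, not taken from the episode:
prior_spam = Fraction(20, 100)        # 20% of all mail is spam
p_word_given_spam = Fraction(50, 100) # a trigger word appears in 50% of spam
p_word_given_ham = Fraction(5, 100)   # and in 5% of legitimate mail

posterior = bayes_update(prior_spam, p_word_given_spam, p_word_given_ham)
print(posterior)  # prints 5/7

# Yesterday's posterior becomes today's prior when more evidence arrives.
second = bayes_update(posterior, p_word_given_spam, p_word_given_ham)
print(second)     # prints 25/26
```

The second update treats the new piece of evidence as independent of the first, which is a simplifying assumption.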

0h14m

The Binomial Distribution: Crash Course Statistics #15

Today we're going to discuss the Binomial Distribution and a special case of this distribution known as a Bernoulli Distribution. The formulas that define these distributions provide us with shortcuts for calculating the probabilities of all kinds of events that happen in everyday life. They can also be used to help us look at how probabilities are connected! For instance, knowing the chance of getting a flat tire today is useful, but knowing the likelihood of getting one this year, or in the next five years, may be more useful. And heads up, this episode is going to have a lot more equations than normal, but to sweeten the deal, we added zombies!

If you want to try out some of the math from this video here is a great binomial probability calculator: http://vassarstats.net/textbook/ch5apx.html

If you'd like more information on calculating the binomial coefficient (n-choose-k) read this: http://www.statisticshowto.com/binomial-coefficient/
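
If you'd rather compute than use a calculator page, here is a small sketch of the binomial formula in Python. The daily flat-tire probability is an assumption picked purely for illustration, and the function names are our own.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials, each with success chance p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def at_least_one(n, p):
    """P(one or more successes in n trials) = 1 - P(zero successes)."""
    return 1 - binomial_pmf(0, n, p)

# Assumption for illustration only: a 0.1% chance of a flat tire on any given day.
p_flat_per_day = 0.001

for label, days in [("today", 1), ("this year", 365), ("next five years", 5 * 365)]:
    print(f"{label}: {at_least_one(days, p_flat_per_day):.3f}")
```

With n=1 the formula reduces to the Bernoulli case: binomial_pmf(1, 1, p) is just p. The sketch also treats every day as an independent trial with the same probability, which real tires may not respect.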

0h10m

Geometric Distributions & The Birthday Paradox: Crash Course Statistics #16

Geometric probabilities, and probabilities in general, allow us to guess how long we'll have to wait for something to happen. Today, we'll discuss how they can be used to figure out how many Bertie Bott's Every Flavour Beans you could eat before getting the dreaded vomit-flavored bean, and how they can help us make decisions when there is a little uncertainty - like getting a Pikachu in a pack of Pokémon Cards! We'll finish off this unit on probability by taking a closer look at the Birthday Paradox (or birthday problem) which asks the question: how many people do you think need to be in a room for there to likely be a shared birthday? (It's likely much fewer than you would expect!)
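
Here is a sketch of both ideas in Python. For the birthday problem we assume 365 equally likely birthdays and ignore leap years, which is the usual simplification. The function names are ours.

```python
def geometric_pmf(k, p):
    """P(the first success happens on trial k), for k = 1, 2, 3, ..."""
    return (1 - p) ** (k - 1) * p

def p_shared_birthday(people, days=365):
    """P(at least two of `people` share a birthday), assuming equally likely days."""
    p_all_distinct = 1.0
    for i in range(people):
        p_all_distinct *= (days - i) / days
    return 1 - p_all_distinct

# Smallest group size where a shared birthday is more likely than not.
smallest = next(n for n in range(1, 366) if p_shared_birthday(n) > 0.5)
print(smallest)  # prints 23
```

The trick in the birthday calculation is to work out the chance that everyone's birthday is different and subtract it from 1, since that is much easier than counting every way a match could happen.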

0h12m

Randomness: Crash Course Statistics #17

There are a lot of events in life that we just can’t predict, but just because something is random doesn’t mean we don’t know or can’t learn anything about it. Today, we’re going to talk about how we can extract information from seemingly random events starting with the expected value or mean of a distribution and walking through the first four “moments” - the mean, variance, skewness, and kurtosis.

Note: There are many formulas to calculate skewness and kurtosis (https://www.itl.nist.gov/div898/handbook/eda/section3/eda35b.htm); our formulas deal with what they have in common: their moment generating functions.

More on sheep study: http://aiweirdness.com/post/171451900302/do-neural-nets-dream-of-electric-sheep

More on fecal matter study: http://aem.asm.org/content/early/2018/02/05/AEM.00044-18.abstract
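
As a rough companion to the four moments, here is one common way to estimate them from data. It uses plain central moments that divide by n, which is only one of the many formulas mentioned in the note above, so other tools may give slightly different skewness and kurtosis values. The function names are our own.

```python
def central_moment(data, order):
    """Average of (x - mean) raised to the given power."""
    mu = sum(data) / len(data)
    return sum((x - mu) ** order for x in data) / len(data)

def describe(data):
    """Return (mean, variance, skewness, kurtosis) using divide-by-n moments."""
    mean = sum(data) / len(data)
    variance = central_moment(data, 2)
    skewness = central_moment(data, 3) / variance ** 1.5  # third moment, standardized
    kurtosis = central_moment(data, 4) / variance ** 2    # fourth moment, standardized
    return mean, variance, skewness, kurtosis

mean, variance, skewness, kurtosis = describe([1, 2, 3, 4, 5])
print(mean, variance, skewness, kurtosis)
```

For perfectly symmetric data like this toy list, the skewness comes out as zero, because the positive and negative cubed deviations cancel.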

0h10m

Z-Scores and Percentiles: Crash Course Statistics #18

Today we’re going to talk about how we compare things that aren’t exactly the same - or aren’t measured in the same way. For example, you might want to know whether a 1200 on the SAT is better than a 25 on the ACT. For this, we need to standardize our data using z-scores - which allow us to make comparisons between two sets of data as long as they’re normally distributed. We’ll also talk about converting these scores to percentiles and discuss how percentiles, though valuable, don’t actually tell us how “extreme” our data really is.
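
Here is a sketch of the SAT versus ACT comparison in Python. The means and standard deviations below are placeholder values we assumed for illustration, not official test statistics, and the percentile conversion assumes the scores are normally distributed.

```python
from statistics import NormalDist

def z_score(x, mean, sd):
    """How many standard deviations x sits above (+) or below (-) the mean."""
    return (x - mean) / sd

def percentile(z):
    """Percent of a normal distribution falling below a given z-score."""
    return NormalDist().cdf(z) * 100

# Assumed values for illustration only: SAT mean 1000, sd 200; ACT mean 21, sd 5.
sat_z = z_score(1200, mean=1000, sd=200)
act_z = z_score(25, mean=21, sd=5)

print(f"SAT z = {sat_z:.2f}, percentile = {percentile(sat_z):.1f}")
print(f"ACT z = {act_z:.2f}, percentile = {percentile(act_z):.1f}")
```

Under these assumed numbers the SAT score has the larger z-score, so it sits further above its own average than the ACT score does.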

0h11m

The Normal Distribution: Crash Course Statistics #19

Today is the day we finally talk about the normal distribution! The normal distribution is incredibly important in statistics because distributions of sample means are approximately normally distributed, given large enough samples, even if populations aren't. We'll get into why this is so - due to the Central Limit Theorem - but it's useful because it allows us to make comparisons between different groups even if we don't know the underlying distribution of the population being studied.
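
You can watch the Central Limit Theorem at work with a quick simulation. This sketch draws from a skewed exponential population, which is our choice for illustration, with a mean of 1 and a standard deviation of 1. It then checks that the sample means cluster around 1 with a spread close to 1 / sqrt(sample size).

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(0)  # fixed seed so the run is repeatable

def sample_means(draw, sample_size, n_samples):
    """Draw n_samples samples of sample_size values each; return each sample's mean."""
    return [
        sum(draw() for _ in range(sample_size)) / sample_size
        for _ in range(n_samples)
    ]

def draw():
    # A clearly non-normal, right-skewed population (mean 1, standard deviation 1).
    return random.expovariate(1.0)

means = sample_means(draw, sample_size=50, n_samples=2000)

print(f"mean of sample means:   {mean(means):.3f} (population mean is 1)")
print(f"spread of sample means: {stdev(means):.3f} (theory: {1 / sqrt(50):.3f})")
```

Plotting a histogram of means would show a roughly bell-shaped curve even though the population itself is heavily skewed.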
