Model Reliability

Data Science for Studying Language and the Mind

Katie Schuler

2024-11-12

Announcements

  • No pset 05 (due to off week!); pset 06 will have a few extra challenge questions in case you want them
  • pset 04 solutions posted tomorrow morning! (turn it in if you haven’t!)
  • Practice exam 02 posted by Friday at midnight!

You are here

Data science with R
  • Hello, world!
  • R basics
  • Data visualization
  • Data wrangling
Stats & Model building
  • Sampling distribution
  • Hypothesis testing
  • Model specification
  • Model fitting
  • Model accuracy
  • Model reliability
More advanced
  • Classification
  • Inference for regression
  • Mixed-effect models

Model building overview

  • Model specification: what is the form?
  • Model fitting: you have the form, how do you guess the free parameters?
  • Model accuracy: you’ve estimated the parameters, how well does that model describe your data?
  • Model reliability: you’ve estimated the parameters, how certain can you be about those estimates?

Dataset

library(tidyverse)  # provides read_csv()

data_n10 <- read_csv("http://kathrynschuler.com/datasets/model-reliability-sample10.csv")
data_n200 <- read_csv("http://kathrynschuler.com/datasets/model-reliability-sample200.csv")

Explore the data

Specify a model
  • supervised learning | regression | linear
  • y ~ x
  • \(y=w_0+w_1x_1\)
Fit the model
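In base R, the same linear fit can be sketched with lm(); here a small simulated dataset stands in for data_n10 (hypothetical values, not the course data):

```r
# Sketch of fitting y = w0 + w1*x with ordinary least squares in base R.
# The simulated x and y below are hypothetical stand-ins for data_n10.
set.seed(1)
x <- runif(10, 0, 10)
y <- 1.75 + 0.73 * x + rnorm(10, sd = 1)
fit <- lm(y ~ x)
coef(fit)  # estimates for the intercept (w0) and slope (w1)
```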

Specify and fit with infer
data_n10 %>%
  specify(y ~ x) %>%
  fit()
# A tibble: 2 × 2
  term      estimate
  <chr>        <dbl>
1 intercept    1.75 
2 x            0.733

Model reliability asks:

How certain can we be about the parameter estimates we obtained?

observed_fit <- data_n10 %>%
  specify(y ~ x) %>%
  fit()

observed_fit
# A tibble: 2 × 2
  term      estimate
  <chr>        <dbl>
1 intercept    1.75 
2 x            0.733

But… why is there uncertainty around the parameter estimates at all?

Because of sampling error

We are interested in the model parameters that best describe the population from which the sample was drawn (not a given sample)

  • Due to sampling error, we can expect some variability in the model parameters that describe a sample of data.
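A quick simulation makes this concrete: if we repeatedly draw samples from the same (hypothetical) population and refit the model, the slope estimate varies from sample to sample.

```r
# Sampling error in action: draw 1000 samples of n = 10 from a population
# where y = 2 + 0.5*x + noise (hypothetical values), refit each time,
# and look at how much the slope estimate moves around.
set.seed(42)
slopes <- replicate(1000, {
  x <- runif(10, 0, 10)
  y <- 2 + 0.5 * x + rnorm(10, sd = 2)
  coef(lm(y ~ x))[["x"]]
})
mean(slopes)  # centered near the population slope
sd(slopes)    # nonzero spread: that's sampling error
```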

Model reliability

  • We can think of model reliability as the stability of the parameters of a fitted model.
  • The more data we collect, the more reliable the model parameters will be.
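To see the sample-size effect, compare the spread of slope estimates at n = 10 versus n = 200 (a sketch with a hypothetical population, not the course datasets):

```r
# Spread of the fitted slope across repeated samples, as a function of n.
# The population here is hypothetical: y = 2 + 0.5*x + noise.
slope_sd <- function(n, reps = 500) {
  sd(replicate(reps, {
    x <- runif(n, 0, 10)
    y <- 2 + 0.5 * x + rnorm(n, sd = 2)
    coef(lm(y ~ x))[["x"]]
  }))
}
set.seed(123)
s10  <- slope_sd(10)   # wider spread: less reliable parameter estimates
s200 <- slope_sd(200)  # narrower spread: more reliable parameter estimates
```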

Confidence intervals via bootstrapping

We can obtain confidence intervals around parameter estimates for models in the same way we did for point estimates like the mean: bootstrapping

  1. Draw bootstrap samples from the observed data
  2. Fit the model of interest to each bootstrapped sample
  3. Construct the sampling distribution of parameter estimates across bootstraps
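The three steps above can be sketched by hand in base R (with simulated data standing in for the observed sample):

```r
# Manual bootstrap of the slope estimate.
# Hypothetical observed sample (stands in for the course data):
set.seed(7)
dat <- data.frame(x = runif(50, 0, 10))
dat$y <- 2 + 0.5 * dat$x + rnorm(50, sd = 2)

boot_slopes <- replicate(1000, {
  # 1. Draw a bootstrap sample: resample rows with replacement
  idx <- sample(nrow(dat), replace = TRUE)
  # 2. Fit the model of interest to the bootstrap sample
  coef(lm(y ~ x, data = dat[idx, ]))[["x"]]
})
# 3. boot_slopes is the sampling distribution of the slope across bootstraps
quantile(boot_slopes, c(0.025, 0.975))  # percentile 95% CI for the slope
```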

The more data we collect, the more reliable

Confidence intervals with infer

Fit bootstraps

boot_fits <- data_n200 %>%
  specify(y ~ x) %>%
  generate(
    reps = 1000, 
    type = "bootstrap"
  ) %>%
  fit()

head(boot_fits)
# A tibble: 6 × 3
# Groups:   replicate [3]
  replicate term      estimate
      <int> <chr>        <dbl>
1         1 intercept    1.84 
2         1 x            0.485
3         2 intercept    1.95 
4         2 x            0.585
5         3 intercept    1.82 
6         3 x            0.332

Get confidence interval

observed_fit_200 <- data_n200 %>%
  specify(y ~ x) %>%
  fit()

ci <- boot_fits %>%
  get_confidence_interval(
    point_estimate = observed_fit_200, 
    level = 0.95
  )

ci 
# A tibble: 2 × 3
  term      lower_ci upper_ci
  <chr>        <dbl>    <dbl>
1 intercept    1.78     2.06 
2 x            0.362    0.634

Visualize distribution & ci

boot_fits %>%
  visualize() +
  shade_ci(endpoints = ci) 
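A base-R version of the same picture: a histogram of bootstrapped slopes with the CI endpoints marked (simulated stand-in data, so the sketch doesn't assume the infer objects above):

```r
# Histogram of bootstrapped slope estimates with 95% CI endpoints marked.
# The simulated data below are hypothetical stand-ins for data_n200.
set.seed(9)
dat <- data.frame(x = runif(200, 0, 10))
dat$y <- 2 + 0.5 * dat$x + rnorm(200, sd = 2)
boot_slopes <- replicate(1000, {
  idx <- sample(nrow(dat), replace = TRUE)
  coef(lm(y ~ x, data = dat[idx, ]))[["x"]]
})
ci_manual <- quantile(boot_slopes, c(0.025, 0.975))
hist(boot_slopes, main = "Bootstrapped slope estimates", xlab = "slope")
abline(v = ci_manual, col = "red", lty = 2)  # CI endpoints
```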

Accuracy v. Reliability

Model accuracy and model reliability are closely related concepts in model building, but they aren’t the same.

  • Accuracy refers to how close a model’s predictions are to the true values we want to predict.
  • Reliability is about the model’s stability: how consistent the model’s parameters and outputs are when new data is sampled.

Accuracy v. Reliability

  1. Reliable and accurate: The model is both close to the true model and stable across different samples. This is the ideal case, indicating we have enough data to produce both a precise and consistent model fit.

  2. Reliable but inaccurate: The model parameters are stable across samples, meaning it’s reliable, but it’s far from the true model. This could happen if our model is structurally limited or misses some aspect of the data, even if we have plenty of data for stable estimates.

  3. Unreliable but accurate: This situation is unlikely. Without enough data, the model’s predictions will fluctuate widely from sample to sample, making it hard to consistently approximate the true model. So, without reliability, achieving accuracy is improbable.

  4. Unreliable and inaccurate: Here, the model’s estimates are unstable and far from the true model. This could be due to either insufficient data or an inappropriate model choice that doesn’t match the data’s structure. With limited data, it’s hard to tell which factor is to blame.
