Model Reliability

Data Science for Studying Language and the Mind

Katie Schuler

2024-11-12

Announcements

  • No pset 05 (due to off week!); pset 06 will have a few extra challenge questions in case you want them
  • pset 04 solutions posted tomorrow morning! (turn it in if you haven’t!)
  • Practice exam 02 posted by Friday at midnight!

You are here

Data science with R
  • Hello, world!
  • R basics
  • Data visualization
  • Data wrangling
Stats & Model building
  • Sampling distribution
  • Hypothesis testing
  • Model specification
  • Model fitting
  • Model accuracy
  • Model reliability
More advanced
  • Classification
  • Inference for regression
  • Mixed-effect models

Model building overview

  • Model specification: what is the form?
  • Model fitting: you have the form, how do you guess the free parameters?
  • Model accuracy: you’ve estimated the parameters, how well does that model describe your data?
  • Model reliability: you’ve estimated the parameters, how certain can you be about those estimates?

Dataset

library(tidyverse)  # provides read_csv()

data_n10 <- read_csv("http://kathrynschuler.com/datasets/model-reliability-sample10.csv")
data_n200 <- read_csv("http://kathrynschuler.com/datasets/model-reliability-sample200.csv")

Explore the data

Specify a model
  • supervised learning | regression | linear
  • y ~ x
  • \(y=w_0+w_1x_1\)
Fit the model
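In base R, the same linear fit can be sketched with lm(); here a small simulated dataset stands in for data_n10 (hypothetical values, not the course data):

```r
# Sketch of fitting y = w0 + w1*x with ordinary least squares in base R.
# The simulated x and y below are hypothetical stand-ins for data_n10.
set.seed(1)
x <- runif(10, 0, 10)
y <- 1.75 + 0.73 * x + rnorm(10, sd = 1)
fit <- lm(y ~ x)
coef(fit)  # estimates for the intercept (w0) and slope (w1)
```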

Specify and fit with infer
data_n10 %>%
  specify(y ~ x) %>%
  fit()
# A tibble: 2 × 2
  term      estimate
  <chr>        <dbl>
1 intercept    1.75 
2 x            0.733

Model reliability asks:

How certain can we be about the parameter estimates we obtained?

observed_fit <- data_n10 %>%
  specify(y ~ x) %>%
  fit()

observed_fit
# A tibble: 2 × 2
  term      estimate
  <chr>        <dbl>
1 intercept    1.75 
2 x            0.733

But… why is there uncertainty around the parameter estimates at all?

Because of sampling error

We are interested in the model parameters that best describe the population from which the sample was drawn (not a given sample)

  • Due to sampling error, we can expect some variability in the model parameters that describe a sample of data.
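A quick simulation makes this concrete: if we repeatedly draw samples from the same (hypothetical) population and refit the model, the slope estimate varies from sample to sample.

```r
# Sampling error in action: draw 1000 samples of n = 10 from a population
# where y = 2 + 0.5*x + noise (hypothetical values), refit each time,
# and look at how much the slope estimate moves around.
set.seed(42)
slopes <- replicate(1000, {
  x <- runif(10, 0, 10)
  y <- 2 + 0.5 * x + rnorm(10, sd = 2)
  coef(lm(y ~ x))[["x"]]
})
mean(slopes)  # centered near the population slope
sd(slopes)    # nonzero spread: that's sampling error
```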

Model reliability

  • We can think of model reliability as the stability of the parameters of a fitted model.
  • The more data we collect, the more reliable the model parameters will be.
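To see the sample-size effect, compare the spread of slope estimates at n = 10 versus n = 200 (a sketch with a hypothetical population, not the course datasets):

```r
# Spread of the fitted slope across repeated samples, as a function of n.
# The population here is hypothetical: y = 2 + 0.5*x + noise.
slope_sd <- function(n, reps = 500) {
  sd(replicate(reps, {
    x <- runif(n, 0, 10)
    y <- 2 + 0.5 * x + rnorm(n, sd = 2)
    coef(lm(y ~ x))[["x"]]
  }))
}
set.seed(123)
s10  <- slope_sd(10)   # wider spread: less reliable parameter estimates
s200 <- slope_sd(200)  # narrower spread: more reliable parameter estimates
```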

Confidence intervals via bootstrapping

We can obtain confidence intervals around parameter estimates for models in the same way we did for point estimates like the mean: bootstrapping

  1. Draw bootstrap samples from the observed data
  2. Fit the model of interest to each bootstrapped sample
  3. Construct the sampling distribution of parameter estimates across bootstraps
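The three steps above can be sketched by hand in base R (with simulated data standing in for the observed sample):

```r
# Manual bootstrap of the slope estimate.
# Hypothetical observed sample (stands in for the course data):
set.seed(7)
dat <- data.frame(x = runif(50, 0, 10))
dat$y <- 2 + 0.5 * dat$x + rnorm(50, sd = 2)

boot_slopes <- replicate(1000, {
  # 1. Draw a bootstrap sample: resample rows with replacement
  idx <- sample(nrow(dat), replace = TRUE)
  # 2. Fit the model of interest to the bootstrap sample
  coef(lm(y ~ x, data = dat[idx, ]))[["x"]]
})
# 3. boot_slopes is the sampling distribution of the slope across bootstraps
quantile(boot_slopes, c(0.025, 0.975))  # percentile 95% CI for the slope
```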

The more data we collect, the more reliable

Confidence intervals with infer

Fit bootstraps

boot_fits <- data_n200 %>%
  specify(y ~ x) %>%
  generate(
    reps = 1000, 
    type = "bootstrap"
  ) %>%
  fit()

head(boot_fits)
# A tibble: 6 × 3
# Groups:   replicate [3]
  replicate term      estimate
      <int> <chr>        <dbl>
1         1 intercept    1.84 
2         1 x            0.485
3         2 intercept    1.95 
4         2 x            0.585
5         3 intercept    1.82 
6         3 x            0.332

Get confidence interval

observed_fit_200 <- data_n200 %>%
  specify(y ~ x) %>%
  fit()

ci <- boot_fits %>%
  get_confidence_interval(
    point_estimate = observed_fit_200, 
    level = 0.95
  )

ci 
# A tibble: 2 × 3
  term      lower_ci upper_ci
  <chr>        <dbl>    <dbl>
1 intercept    1.78     2.06 
2 x            0.362    0.634

Visualize distribution & ci

boot_fits %>%
  visualize() +
  shade_ci(endpoints = ci) 
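A base-R version of the same picture: a histogram of bootstrapped slopes with the CI endpoints marked (simulated stand-in data, so the sketch doesn't assume the infer objects above):

```r
# Histogram of bootstrapped slope estimates with 95% CI endpoints marked.
# The simulated data below are hypothetical stand-ins for data_n200.
set.seed(9)
dat <- data.frame(x = runif(200, 0, 10))
dat$y <- 2 + 0.5 * dat$x + rnorm(200, sd = 2)
boot_slopes <- replicate(1000, {
  idx <- sample(nrow(dat), replace = TRUE)
  coef(lm(y ~ x, data = dat[idx, ]))[["x"]]
})
ci_manual <- quantile(boot_slopes, c(0.025, 0.975))
hist(boot_slopes, main = "Bootstrapped slope estimates", xlab = "slope")
abline(v = ci_manual, col = "red", lty = 2)  # CI endpoints
```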

Accuracy v. Reliability

Model accuracy and model reliability are closely related concepts in model building, but they aren’t the same.

  • Accuracy refers to how close a model’s predictions are to the true values we want to predict.
  • Reliability is about the model’s stability: how consistent the model’s parameters and outputs are when new data is sampled.

Accuracy v. Reliability

  1. Reliable and accurate: The model is both close to the true model and stable across different samples. This is the ideal case, indicating we have enough data to produce both a precise and consistent model fit.

  2. Reliable but inaccurate: The model parameters are stable across samples, meaning it’s reliable, but it’s far from the true model. This could happen if our model is structurally limited or misses some aspect of the data, even if we have plenty of data for stable estimates.

  3. Unreliable but accurate: This situation is unlikely. Without enough data, the model’s predictions will fluctuate widely from sample to sample, making it hard to consistently approximate the true model. So, without reliability, achieving accuracy is improbable.

  4. Unreliable and inaccurate: Here, the model’s estimates are unstable and far from the true model. This could be due to either insufficient data or an inappropriate model choice that doesn’t match the data’s structure. With limited data, it’s hard to tell which factor is to blame.
