Model Fitting with R (demo)

Data Science for Studying Language and the Mind

Katie Schuler

2023-10-17

You are here

Data science with R
  • Hello, world!
  • R basics
  • Data importing
  • Data visualization
  • Data wrangling
Stats & Model building
  • Sampling distribution
  • Hypothesis testing
  • Model specification
  • Model fitting
  • Model accuracy
  • Model reliability
More advanced
  • Classification
  • Feature engineering (preprocessing)
  • Inference for regression
  • Mixed-effect models

Roadmap

  • Fit a model in R
  • Goodness of fit: quantifying our intuition
  • How do we estimate the free parameters?
    1. Gradient descent - iterative optimization algorithm
    2. Ordinary least squares - analytical solution for linear regression
  • If time: another full example

Fit a model

Model specification
  • R syntax: rt ~ 1 + experience (equivalently rt ~ experience)
  • Equation: \(y = w_0 + w_1x_1\)

Model specification → Estimate free parameters → Fitted model

Fitted model: \(y = 211.271 - 1.695x\)

Fit a model in R

Model specification
  • R syntax: rt ~ 1 + experience (equivalently rt ~ experience)
  • Equation: \(y = w_0 + w_1x_1\)

# With base R's lm()
lm(rt ~ experience, data = data)

Call:
lm(formula = rt ~ experience, data = data)

Coefficients:
(Intercept)   experience  
    211.271       -1.695  
# With the infer package
data %>% 
    specify(rt ~ experience) %>%
    fit()
# A tibble: 2 × 2
  term       estimate
  <chr>         <dbl>
1 intercept    211.  
2 experience    -1.69
# With tidymodels (parsnip)
linear_reg() %>%
    set_engine("lm") %>%
    fit(rt ~ experience, data = data)
parsnip model object


Call:
stats::lm(formula = rt ~ experience, data = data)

Coefficients:
(Intercept)   experience  
    211.271       -1.695  
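
Whichever interface we use, the estimated parameters live in the fitted model object; a quick sketch of extracting them with coef():

fit <- lm(rt ~ experience, data = data)
coef(fit)
(Intercept)  experience 
 211.270690   -1.694828 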

Goodness of fit

Quantifying our intuition with the sum of squared error (SSE)

experience  rt  prediction    error  squared_error
        49 124     128.216    4.216      17.774656
        69  95      94.316   -0.684       0.467856
        89  71      60.416  -10.584     112.021056
        99  45      43.466   -1.534       2.353156
       109  18      26.516    8.516      72.522256

\(SSE=\sum_{i=1}^{n} (d_{i} - m_{i})^2 = 205.139\), where \(d_i\) is the observed data and \(m_i\) is the model's prediction.
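
Adding up the squared_error column confirms this: \(17.774656 + 0.467856 + 112.021056 + 2.353156 + 72.522256 \approx 205.139\).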

Sum of squared error

Given some data:

experience  rt  prediction    error  squared_error
        49 124     128.216    4.216      17.774656
        69  95      94.316   -0.684       0.467856
        89  71      60.416  -10.584     112.021056
        99  45      43.466   -1.534       2.353156
       109  18      26.516    8.516      72.522256

Compute the sum of squared error:

\(SSE=\sum_{i=1}^{n} (d_{i} - m_{i})^2 = 205.139\)

  data %>%
    mutate(prediction = 211.271 - 1.695 * experience) %>%  # model prediction
    mutate(error = prediction - rt) %>%                    # residual error
    mutate(squared_error = error^2) %>%
    with(sum(squared_error))                               # add them up
[1] 205.139

Gradient descent

Gradient descent treats fitting as a search problem: given a parameter space and a cost function, our job is to search through the space for the point that minimizes the cost function.
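
To build intuition, here is a minimal hand-rolled sketch of the algorithm (an illustration only, not the demo's method; the demo uses the optimg package below). Gradient descent converges slowly when a predictor is on a large scale, so the sketch centers experience first:

# hand-rolled gradient descent (sketch; the demo itself uses optimg)
x <- data$experience - mean(data$experience)  # center the predictor
y <- data$rt
par <- c(0, 0)  # start at intercept = 0, slope = 0
lr <- 2e-4      # learning rate (step size), chosen by hand for this sketch

for (step in 1:5000) {
  error <- (par[1] + par[2] * x) - y               # prediction - data
  grad <- c(2 * sum(error), 2 * sum(error * x))    # dSSE/dw0, dSSE/dw1
  par <- par - lr * grad                           # take a small step downhill
}

# undo the centering to recover the intercept on the original scale
c(par[1] - par[2] * mean(data$experience), par[2])
# approximately 211.271 and -1.695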

Gradient descent

Define our cost function:

SSE <- function(data, par) {
  # par[1] is the intercept (w0); par[2] is the slope (w1)
  data %>%
    mutate(prediction = par[1] + par[2] * experience) %>%
    mutate(error = prediction - rt) %>%
    mutate(squared_error = error^2) %>%
    with(sum(squared_error))
}

SSE(data = data, par = c(0, 0))
[1] 31791
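
As a sanity check: with both parameters at 0, every prediction is 0, so the cost is just the sum of the squared rt values:

sum(data$rt^2)
[1] 31791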

Implement the gradient descent algorithm with the optimg package

library(optimg)  # load the optimg package

optimg(
    data = data,  # our data
    par = c(0,0), # our parameters
    fn = SSE,     # our cost function
    method = "STGD") # our iterative optimization algorithm 
$par
[1] 211.267867  -1.694792

$value
[1] 205.1379

$counts
[1] 18

$convergence
[1] 0
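
Plugging $par back into our cost function recovers $value, the minimized SSE:

SSE(data = data, par = c(211.267867, -1.694792))
[1] 205.1379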

Ordinary least squares solution

\(y = w_0 + w_1x_1\)

We have a system of equations, one per data point (three shown here):

  • \(124 = w_0 \cdot 1 + w_1 \cdot 49\)
  • \(95 = w_0 \cdot 1 + w_1 \cdot 69\)
  • \(18 = w_0 \cdot 1 + w_1 \cdot 109\)

We can express them as a matrix: \(Y = Xw + \epsilon\)

And solve with linear algebra: \(w = (X^TX)^{-1}X^TY\)
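
Written out with our five data points, \(Y = Xw + \epsilon\) becomes:

\[ \begin{bmatrix} 124 \\ 95 \\ 71 \\ 45 \\ 18 \end{bmatrix} = \begin{bmatrix} 1 & 49 \\ 1 & 69 \\ 1 & 89 \\ 1 & 99 \\ 1 & 109 \end{bmatrix} \begin{bmatrix} w_0 \\ w_1 \end{bmatrix} + \epsilon \]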

Ordinary least squares solution in R

ols_matrix_way <- function(X, Y){
  # the normal equations: w = (X'X)^{-1} X'Y
  solve(t(X) %*% X) %*% t(X) %*% Y
}

We need to construct X and Y (must be matrices):

# wrapping an assignment in ( ) also prints the result;
# mutate(int = 1) adds the intercept column of 1s
(response_matrix <- data %>% select(rt) %>% as.matrix())
(explanatory_matrix <- data %>% mutate(int = 1) %>% select(int, experience) %>% as.matrix())
      rt
[1,] 124
[2,]  95
[3,]  71
[4,]  45
[5,]  18
     int experience
[1,]   1         49
[2,]   1         69
[3,]   1         89
[4,]   1         99
[5,]   1        109

Then we can use our function to generate the OLS solution:

ols_matrix_way(explanatory_matrix, response_matrix)
                   rt
int        211.270690
experience  -1.694828
This matches the coefficients from lm():

lm(rt ~ experience, data = data)

Call:
lm(formula = rt ~ experience, data = data)

Coefficients:
(Intercept)   experience  
    211.271       -1.695