# Model Fitting with R (demo)

Data Science for Studying Language and the Mind

Katie Schuler

2023-10-17

## You are here

##### Data science with R
• Hello, world!
• R basics
• Data importing
• Data visualization
• Data wrangling
##### Stats & Model building
• Sampling distribution
• Hypothesis testing
• Model specification
• Model fitting
• Model accuracy
• Model reliability
• Classification
• Feature engineering (preprocessing)
• Inference for regression
• Mixed-effect models

• Fit a model in R
• Goodness of fit: quantifying our intuition
• How do we estimate the free parameters?
1. Gradient descent - iterative optimization algorithm
2. Ordinary least squares - analytical solution for linear regression
• If time: another full example

## Fit a model

model specification

• R syntax: `rt ~ 1 + experience`
• R syntax: `rt ~ experience`
• Equation: $y=w_0+w_1x_1$

```mermaid
flowchart TD
  spec(Model specification) --> fit(Estimate free parameters)
  fit --> fitted(Fitted model)
```

$y = 211.271 - 1.695x$
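Plugging a value into the fitted equation gives a predicted reaction time. A quick sketch (the experience value 80 is a hypothetical illustration, not an observation from the data):

```r
# Predict rt from the fitted model y = 211.271 - 1.695x
# (experience = 80 is a made-up example value)
w0 <- 211.271
w1 <- -1.695
rt_hat <- w0 + w1 * 80
rt_hat  # 75.671
```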

## Fit a model in R

model specification

• R syntax: `rt ~ 1 + experience`
• R syntax: `rt ~ experience`
• Equation: $y=w_0+w_1x_1$

```r
lm(rt ~ experience, data = data)
```

```
Call:
lm(formula = rt ~ experience, data = data)

Coefficients:
(Intercept)   experience
    211.271       -1.695
```

```r
data %>%
  specify(rt ~ experience) %>%
  fit()
```

```
# A tibble: 2 × 2
  term       estimate
  <chr>         <dbl>
1 intercept    211.
2 experience    -1.69
```

```r
linear_reg() %>%
  set_engine("lm") %>%
  fit(rt ~ experience, data = data)
```

```
parsnip model object

Call:
stats::lm(formula = rt ~ experience, data = data)

Coefficients:
(Intercept)   experience
    211.271       -1.695
```

## Goodness of fit

Quantifying our intuition with the sum of squared error

| experience | rt  | prediction | error   | squared_error |
|-----------:|----:|-----------:|--------:|--------------:|
| 49         | 124 | 128.216    | 4.216   | 17.774656     |
| 69         | 95  | 94.316     | -0.684  | 0.467856      |
| 89         | 71  | 60.416     | -10.584 | 112.021056    |
| 99         | 45  | 43.466     | -1.534  | 2.353156      |
| 109        | 18  | 26.516     | 8.516   | 72.522256     |

$SSE=\sum_{i=1}^{n} (d_{i} - m_{i})^2 = 205.139$

## Sum of squared error

Given some data:

| experience | rt  | prediction | error   | squared_error |
|-----------:|----:|-----------:|--------:|--------------:|
| 49         | 124 | 128.216    | 4.216   | 17.774656     |
| 69         | 95  | 94.316     | -0.684  | 0.467856      |
| 89         | 71  | 60.416     | -10.584 | 112.021056    |
| 99         | 45  | 43.466     | -1.534  | 2.353156      |
| 109        | 18  | 26.516     | 8.516   | 72.522256     |

Compute the sum of squared error:

$SSE=\sum_{i=1}^{n} (d_{i} - m_{i})^2 = 205.139$

```r
data %>%
  mutate(prediction = 211.271 + -1.695 * experience) %>%
  mutate(error = prediction - rt) %>%
  mutate(squared_error = error^2) %>%
  with(sum(squared_error))
```

```
[1] 205.139
```

A search problem: we have a parameter space, a cost function, and our job is to search through the space to find the point that minimizes the cost function.
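One blunt way to see the search-problem framing: evaluate the cost at many candidate points in parameter space and keep the smallest. A brute-force grid-search sketch, using the five observations from the table above (the grid bounds and step sizes here are arbitrary assumptions):

```r
# Data: the five (experience, rt) pairs from the table above
data <- data.frame(
  experience = c(49, 69, 89, 99, 109),
  rt = c(124, 95, 71, 45, 18)
)

# Candidate points in (w0, w1) parameter space (bounds chosen arbitrarily)
grid <- expand.grid(w0 = seq(150, 250, by = 1),
                    w1 = seq(-3, 0, by = 0.05))

# Cost at each candidate: sum of squared error
grid$sse <- apply(grid[, c("w0", "w1")], 1, function(par) {
  pred <- par["w0"] + par["w1"] * data$experience
  sum((pred - data$rt)^2)
})

# The grid point minimizing the cost, near the lm() estimates
grid[which.min(grid$sse), ]
```

Grid search only works in toy cases like this; with more parameters the grid blows up, which is why iterative methods like gradient descent are used instead.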

Define our cost function:

```r
SSE <- function(data, par) {
  data %>%
    mutate(prediction = par[1] + par[2] * experience) %>%
    mutate(error = prediction - rt) %>%
    mutate(squared_error = error^2) %>%
    with(sum(squared_error))
}

SSE(data = data, par = c(0, 0))
```

```
[1] 31791
```

Implement the gradient descent algorithm with `optimg`:

```r
optimg(
  data = data,    # our data
  par = c(0, 0),  # our parameters
  fn = SSE,       # our cost function
  method = "STGD" # our iterative optimization algorithm
)
```

```
$par
[1] 211.26155  -1.69473

$value
[1] 205.138

$counts
[1] 12

$convergence
[1] 0
```
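As a cross-check, base R's `optim()` can minimize the same cost function without the `optimg` package. Note that `optim()` passes the parameter vector as the *first* argument of `fn`, so the cost function below takes `par` before `data`; this is a sketch of an alternative, not the course's method:

```r
# Data: the five (experience, rt) pairs from the table above
data <- data.frame(
  experience = c(49, 69, 89, 99, 109),
  rt = c(124, 95, 71, 45, 18)
)

# Same SSE cost, but with par first (required by optim())
SSE <- function(par, data) {
  pred <- par[1] + par[2] * data$experience
  sum((pred - data$rt)^2)
}

fit <- optim(par = c(0, 0), fn = SSE, data = data, method = "BFGS")
fit$par  # should land near the lm() estimates: 211.271, -1.695
```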

## Ordinary least squares solution

$y = w_0 + w_1x_1$

We have a system of equations:

• $124 = w_0 \cdot 1 + w_1 \cdot 49$
• $95 = w_0 \cdot 1 + w_1 \cdot 69$
• $18 = w_0 \cdot 1 + w_1 \cdot 109$

We can express them as a matrix: $Y = Xw + \epsilon$

And solve with linear algebra: $w = (X^TX)^{-1}X^TY$

## Ordinary least squares solution in R

```r
ols_matrix_way <- function(X, Y) {
  solve(t(X) %*% X) %*% t(X) %*% Y
}
```

We need to construct X and Y (must be matrices):

```r
(response_matrix <- data %>% select(rt) %>% as.matrix())
```

```
      rt
[1,] 124
[2,]  95
[3,]  71
[4,]  45
[5,]  18
```

```r
(explanatory_matrix <- data %>% mutate(int = 1) %>% select(int, experience) %>% as.matrix())
```

```
     int experience
[1,]   1         49
[2,]   1         69
[3,]   1         89
[4,]   1         99
[5,]   1        109
```

Then we can use our function to generate the OLS solution:

```r
ols_matrix_way(explanatory_matrix, response_matrix)
```

```
                   rt
int        211.270690
experience  -1.694828
```

```r
lm(rt ~ experience, data = data)
```

```
Call:
lm(formula = rt ~ experience, data = data)

Coefficients:
(Intercept)   experience
    211.271       -1.695
```