Model Fitting with R (demo)
Data Science for Studying Language and the Mind
2023-10-17
You are here
Data science with R
Hello, world!
R basics
Data importing
Data visualization
Data wrangling
Stats & Model building
Sampling distribution
Hypothesis testing
Model specification
Model fitting
Model accuracy
Model reliability
More advanced
Classification
Feature engineering (preprocessing)
Inference for regression
Mixed-effect models
Roadmap
Fit a model in R
Goodness of fit: quantifying our intuition
How do we estimate the free parameters?
Gradient descent - iterative optimization algorithm
Ordinary least squares - analytical solution for linear regression
If time: another full example
Fit a model
R syntax
rt ~ 1 + experience
R syntax
rt ~ experience
Equation
\(y=w_0+w_1x_1\)
flowchart TD
spec(Model specification) --> fit(Estimate free parameters)
fit(Estimate free parameters) --> fitted(Fitted model)
\(y = 211.271 - 1.695x\)
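The fitted model can be reproduced in a small self-contained sketch; the five (experience, rt) pairs are the ones tabulated in the goodness-of-fit slides.

```r
# The five observations used throughout this demo
data <- data.frame(
  experience = c(49, 69, 89, 99, 109),
  rt = c(124, 95, 71, 45, 18)
)

# Fit the linear model rt ~ 1 + experience
fit <- lm(rt ~ experience, data = data)
coef(fit)  # (Intercept) ≈ 211.271, experience ≈ -1.695
```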
Fit a model in R
R syntax
rt ~ 1 + experience
R syntax
rt ~ experience
Equation
\(y=w_0+w_1x_1\)
lm(rt ~ experience, data = data)
Call:
lm(formula = rt ~ experience, data = data)
Coefficients:
(Intercept) experience
211.271 -1.695
data %>%
  specify(rt ~ experience) %>%
  fit()
# A tibble: 2 × 2
term estimate
<chr> <dbl>
1 intercept 211.
2 experience -1.69
linear_reg() %>%
  set_engine("lm") %>%
  fit(rt ~ experience, data = data)
parsnip model object
Call:
stats::lm(formula = rt ~ experience, data = data)
Coefficients:
(Intercept) experience
211.271 -1.695
Goodness of fit
Quantifying our intuition with the sum of squared error
| experience | rt (\(d_i\)) | prediction (\(m_i\)) | error (\(m_i - d_i\)) | squared error |
|-----------:|-------------:|---------------------:|----------------------:|--------------:|
| 49 | 124 | 128.216 | 4.216 | 17.774656 |
| 69 | 95 | 94.316 | -0.684 | 0.467856 |
| 89 | 71 | 60.416 | -10.584 | 112.021056 |
| 99 | 45 | 43.466 | -1.534 | 2.353156 |
| 109 | 18 | 26.516 | 8.516 | 72.522256 |

\(SSE=\sum_{i=1}^{n} (d_{i} - m_{i})^2 = 205.139\)
Sum of squared error
Given some data:
| experience | rt (\(d_i\)) | prediction (\(m_i\)) | error (\(m_i - d_i\)) | squared error |
|-----------:|-------------:|---------------------:|----------------------:|--------------:|
| 49 | 124 | 128.216 | 4.216 | 17.774656 |
| 69 | 95 | 94.316 | -0.684 | 0.467856 |
| 89 | 71 | 60.416 | -10.584 | 112.021056 |
| 99 | 45 | 43.466 | -1.534 | 2.353156 |
| 109 | 18 | 26.516 | 8.516 | 72.522256 |
Compute the sum of squared error:
\(SSE=\sum_{i=1}^{n} (d_{i} - m_{i})^2 = 205.139\)
data %>%
  mutate(prediction = 211.271 - 1.695 * experience) %>%
  mutate(error = prediction - rt) %>%
  mutate(squared_error = error^2) %>%
  with(sum(squared_error))
Gradient descent
A search problem: we have a parameter space, a cost function, and our job is to search through the space to find the point that minimizes the cost function.
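The search idea can be sketched directly: start somewhere in parameter space, estimate the gradient of the cost function numerically, and step downhill. This is a toy illustration with a hand-picked learning rate, not what optimg does internally.

```r
# Toy gradient descent on SSE (illustration only; optimg's STGD is more robust)
data <- data.frame(
  experience = c(49, 69, 89, 99, 109),
  rt = c(124, 95, 71, 45, 18)
)

SSE <- function(data, par) {
  prediction <- par[1] + par[2] * data$experience
  sum((prediction - data$rt)^2)
}

# Central-difference estimate of the gradient of fn at par
numeric_grad <- function(fn, data, par, eps = 1e-6) {
  sapply(seq_along(par), function(j) {
    step <- replace(numeric(length(par)), j, eps)
    (fn(data, par + step) - fn(data, par - step)) / (2 * eps)
  })
}

par <- c(0, 0)  # start the search at the origin
lr <- 1e-5      # learning rate, chosen by hand for this problem
for (i in 1:5000) {
  par <- par - lr * numeric_grad(SSE, data, par)
}

SSE(data, par)  # far below SSE(data, c(0, 0)), though not yet at the minimum
```

Plain gradient descent converges slowly here because the intercept and slope live on very different scales; that is one reason to use a tuned routine like optimg, or the closed-form OLS solution.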
Gradient descent
Define our cost function:
SSE <- function(data, par) {
  data %>%
    mutate(prediction = par[1] + par[2] * experience) %>%
    mutate(error = prediction - rt) %>%
    mutate(squared_error = error^2) %>%
    with(sum(squared_error))
}

SSE(data = data, par = c(0, 0))
Implement the gradient descent algorithm with optimg
optimg(
  data = data,     # our data
  par = c(0, 0),   # our parameters
  fn = SSE,        # our cost function
  method = "STGD") # our iterative optimization algorithm
$par
[1] 211.26155 -1.69473
$value
[1] 205.138
$counts
[1] 12
$convergence
[1] 0
Ordinary least squares solution
\(y = w_0 + w_1x_1\)
We have a system of equations:
\(124 = w_0 \cdot 1 + w_1 \cdot 49\)
\(95 = w_0 \cdot 1 + w_1 \cdot 69\)
…
\(18 = w_0 \cdot 1 + w_1 \cdot 109\)
We can express them as a matrix: \(Y = Xw + \epsilon\)
And solve with linear algebra: \(w = (X^TX)^{-1}X^TY\)
Ordinary least squares solution in R
ols_matrix_way <- function(X, Y) {
  solve(t(X) %*% X) %*% t(X) %*% Y
}
We need to construct X and Y (must be matrices):
(response_matrix <- data %>% select(rt) %>% as.matrix())
(explanatory_matrix <- data %>% mutate(int = 1) %>% select(int, experience) %>% as.matrix())
rt
[1,] 124
[2,] 95
[3,] 71
[4,] 45
[5,] 18
int experience
[1,] 1 49
[2,] 1 69
[3,] 1 89
[4,] 1 99
[5,] 1 109
Then we can use our function to generate the OLS solution:
ols_matrix_way(explanatory_matrix, response_matrix)
rt
int 211.270690
experience -1.694828
lm(rt ~ experience, data = data)
Call:
lm(formula = rt ~ experience, data = data)
Coefficients:
(Intercept) experience
211.271 -1.695
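As a wrap-up check, the two exact routes shown in this deck, lm() and the matrix OLS solution, agree numerically. A minimal sketch, assuming the five-row data frame used throughout:

```r
data <- data.frame(
  experience = c(49, 69, 89, 99, 109),
  rt = c(124, 95, 71, 45, 18)
)

# Route 1: lm()
lm_coef <- coef(lm(rt ~ experience, data = data))

# Route 2: ordinary least squares by hand, w = (X'X)^{-1} X'Y
X <- cbind(int = 1, experience = data$experience)
Y <- matrix(data$rt, ncol = 1)
ols_coef <- drop(solve(t(X) %*% X) %*% t(X) %*% Y)

all.equal(unname(lm_coef), unname(ols_coef))  # TRUE
```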