Data Science for Studying Language and the Mind
2023-08-31
R basics
- Expressions: fundamental building blocks of programming
- Objects: allow us to store stuff, created with the assignment operator
- Names: names we give objects must consist of letters, numbers, ., or _
- Attributes: allow us to attach arbitrary metadata to objects
- Functions: take some input, perform some computation, and return some output
- Environment: collection of all objects we defined in the current R session
- Packages: collections of functions, data, and documentation bundled together in R
- Comments: notes you leave for yourself, not evaluated
- Messages: notes R leaves for you (FYI, warning, error)

- str(x): returns summary of object’s structure
- typeof(x): returns object’s data type
- length(x): returns object’s length
- attributes(x): returns list of object’s attributes

- ls(): list all variables in environment
- rm(x): remove x variable from environment
- rm(list = ls()): remove all variables from environment

- install.packages(): install packages
- library(): load package into current R session
- data(): load data from package into environment
- sessionInfo(): version info, packages for current R session

- ?mean or help('mean'): get help with a function
- ??mean or help.search('mean'): search help files for a word or phrase
- help(package='tidyverse'): find help for a package

Vectors are fundamental data structures in R. There are two types: atomic vectors and lists.
Atomic vectors can be one of six data types. The four most common are shown here (the other two, complex and raw, are rarely needed):

typeof(x) | examples |
---|---|
double | 3, 3.32 |
integer | 1L, 144L |
character | "hello", 'hello, world!' |
logical | TRUE, F |
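As a quick check of the table above, this sketch calls typeof() on one value of each type. The variable names are only for illustration.

```r
# typeof() reports the data type of an atomic vector
t_double    <- typeof(3.32)     # "double"
t_plain     <- typeof(3)        # "double": numbers written without L are doubles
t_integer   <- typeof(1L)       # "integer"
t_character <- typeof("hello")  # "character"
t_logical   <- typeof(TRUE)     # "logical"

print(c(t_double, t_plain, t_integer, t_character, t_logical))
```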
Atomic vectors are atomic because they must contain only one type.

Create atomic vectors:
- with c(), for concatenate
- with sequences seq() or repetitions rep()

Check a vector's type:
- with typeof(x), which returns the type of vector x
- with is.*(x), which returns TRUE if x has type *
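A minimal sketch of creating vectors and checking their types, using made-up values:

```r
# Create atomic vectors
x     <- c(1, 2, 3)             # concatenate values
evens <- seq(2, 10, by = 2)     # sequence: 2 4 6 8 10
yes3  <- rep("yes", times = 3)  # repetition: "yes" "yes" "yes"

# Check the type
x_type  <- typeof(x)            # "double"
x_isdbl <- is.double(x)         # TRUE
x_ischr <- is.character(x)      # FALSE
y_ischr <- is.character(yes3)   # TRUE

print(evens)
print(yes3)
```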
If you try to include elements of different types, R will coerce them into the same type without warning (implicit coercion).
You can also use explicit coercion to change a vector to another data type with as.*().
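The two kinds of coercion can be sketched as follows. The values are illustrative only.

```r
# Implicit coercion: mixed types are silently converted to one common type
mixed      <- c(1, "a", TRUE)
mixed_type <- typeof(mixed)        # "character"; mixed is "1" "a" "TRUE"

num_lgl      <- c(1.5, TRUE)       # TRUE becomes 1
num_lgl_type <- typeof(num_lgl)    # "double"

# Explicit coercion with as.*()
as_num <- as.numeric("3.14")       # 3.14
as_int <- as.integer(c("1", "2"))  # 1 2
as_chr <- as.character(c(1, 2))    # "1" "2"
as_lgl <- as.logical(c(0, 1, 2))   # FALSE TRUE TRUE

print(mixed)
print(as_lgl)
```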
Some more complex data structures are built from atomic vectors by adding attributes:
Structure | Description |
---|---|
matrix | vector with dim attribute representing 2 dimensions |
array | vector with dim attribute representing n dimensions |
data.frame | a named list of vectors (of equal length) with attributes for names (column names), row.names, and class="data.frame" |
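To see that these structures are vectors plus attributes, here is a small sketch. The data (word and freq columns) are hypothetical.

```r
# A matrix is a vector with a 2-element dim attribute
m <- 1:6
dim(m) <- c(2, 3)
m_is_matrix <- is.matrix(m)   # TRUE

# An array generalizes this to n dimensions
a <- array(1:24, dim = c(2, 3, 4))
a_dim <- dim(a)               # 2 3 4

# A data frame is a named list of equal-length vectors with extra attributes
df <- data.frame(word = c("cat", "dog"), freq = c(10, 7))
df_type  <- typeof(df)             # "list"
df_attrs <- names(attributes(df))  # includes "names", "row.names", "class"

print(attributes(m))
print(df_attrs)
```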
Operator | Operation |
---|---|
() | Parentheses |
^ | Exponent |
* | Multiply |
/ | Divide |
+ | Add |
- | Subtract |

Arithmetic operators follow the order of operations you expect (PEMDAS).
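A few arithmetic expressions illustrating the order of operations:

```r
r1 <- 2 + 3 * 4    # 14: multiplication before addition
r2 <- (2 + 3) * 4  # 20: parentheses first
r3 <- 2 * 3 ^ 2    # 18: exponent before multiplication
r4 <- -2 ^ 2       # -4: exponent binds tighter than the unary minus
r5 <- 10 / 4 - 1   # 1.5: division before subtraction

print(c(r1, r2, r3, r4, r5))
```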
Operator | Comparison |
---|---|
x < y | less than |
x > y | greater than |
x <= y | less than or equal to |
x >= y | greater than or equal to |
x != y | not equal to |
x == y | equal to |
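The comparison operators return logical values, as in this sketch with made-up x and y:

```r
x <- 3
y <- 5

lt <- x < y    # TRUE
gt <- x > y    # FALSE
le <- x <= 3   # TRUE
ne <- x != y   # TRUE
eq <- x == y   # FALSE

# Comparisons also work element by element on longer vectors
ge_vec <- c(1, 5, 10) >= y   # FALSE TRUE TRUE

print(c(lt, gt, le, ne, eq))
print(ge_vec)
```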
Operator | Operation |
---|---|
x \| y | or |
x & y | and |
!x | not |
any() | true if any element meets condition |
all() | true if all elements meet condition |
%in% | true for each element that is found in the following vector |
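A short sketch of the logical operators applied to an illustrative vector:

```r
x <- c(1, 5, 10)

either  <- x > 8 | x < 2   # TRUE FALSE TRUE
both    <- x > 2 & x < 8   # FALSE TRUE FALSE
negated <- !(x > 2)        # TRUE FALSE FALSE

any_big <- any(x > 8)      # TRUE: at least one element exceeds 8
all_big <- all(x > 8)      # FALSE: not every element exceeds 8

found <- c(1, 4) %in% x    # TRUE FALSE: 1 is in x, 4 is not

print(either)
print(found)
```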
Almost all operations (and many functions) are vectorized
Operators and functions will also coerce values when needed (and without warning)
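Vectorization and silent coercion in operations can be sketched like this (values are illustrative):

```r
x <- c(1, 2, 3)

# Vectorized: the operation is applied to every element
doubled <- x * 2               # 2 4 6
summed  <- x + c(10, 20, 30)   # 11 22 33
roots   <- sqrt(c(4, 9, 16))   # 2 3 4

# Coercion inside operations happens without warning
two     <- TRUE + TRUE               # 2: TRUE is treated as 1
n_true  <- sum(c(TRUE, FALSE, TRUE)) # 2: counts the TRUE values
same    <- 1 == "1"                  # TRUE: 1 is coerced to "1"

print(doubled)
print(n_true)
```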
Subsetting is a natural complement to str(). While str() shows you all the pieces of any object (its structure), subsetting allows you to pull out the pieces that you’re interested in. ~ Hadley Wickham, Advanced R
There are three operators for subsetting objects:
- [ subsets (one or more) elements
- [[ and $ extract a single element

Subsetting with [
Code | Returns |
---|---|
x[c(1,2)] | positive integers select elements at specified indexes |
x[-c(1,2)] | negative integers select all but elements at specified indexes |
x[c("x", "y")] | select elements by name, if elements are named |
x[] | nothing returns the original object |
x[0] | zero returns a zero-length vector |
x[c(TRUE, TRUE)] | select elements where corresponding logical value is TRUE |
[ with an atomic vector
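Each row of the [ table can be tried on a small named vector. This is a sketch with hypothetical names and values:

```r
x <- c(a = 10, b = 20, c = 30, d = 40)

by_pos   <- x[c(1, 2)]                       # a = 10, b = 20
by_neg   <- x[-c(1, 2)]                      # c = 30, d = 40
by_name  <- x[c("a", "d")]                   # a = 10, d = 40
whole    <- x[]                              # the original vector
empty    <- x[0]                             # zero-length vector
by_lgl   <- x[c(TRUE, FALSE, TRUE, FALSE)]   # a = 10, c = 30
by_cond  <- x[x > 15]                        # b = 20, c = 30, d = 40

print(by_pos)
print(by_cond)
```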
Code | Returns |
---|---|
x[[2]] | the element at a single positive integer index |
x[['name']] | the element with the given name (a single string) |
x$name | the same element; the $ operator is a useful shorthand for x[['name']] |
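A minimal sketch of [[ and $ on a list. The list person and its contents are invented for illustration.

```r
person <- list(name = "Ada", langs = c("R", "Python"))

second  <- person[[2]]        # "R" "Python": the element itself
by_name <- person[["name"]]   # "Ada"
dollar  <- person$name        # "Ada": shorthand for person[["name"]]

# For contrast, [ keeps the list wrapper
still_list <- person["name"]  # a list of length 1

print(second)
print(is.list(still_list))
```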
Missing values are represented by NA.
- NA is contagious: expressions including NA usually return NA
- check for NA values with is.na()
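A short sketch of how NA propagates, with one case where it does not (values are illustrative):

```r
x <- c(1, NA, 3)

plus_one <- NA + 1     # NA
compared <- NA > 2     # NA
average  <- mean(x)    # NA: one missing value makes the mean unknown
equal_na <- NA == NA   # NA, so == cannot be used to find missing values

# "Usually", not always: the result is known regardless of the missing value
known <- NA | TRUE     # TRUE

# Detect and handle missing values
missing    <- is.na(x)               # FALSE TRUE FALSE
average_ok <- mean(x, na.rm = TRUE)  # 2

print(missing)
print(average_ok)
```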
Functions are reusable pieces of code that take some input, perform some task or computation, and return an output.
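As an illustration, here is a small user-defined function. The name rescale01 is a hypothetical helper, not something from the slides.

```r
# Hypothetical helper: rescale a numeric vector to the range 0 to 1
rescale01 <- function(x) {
  (x - min(x)) / (max(x) - min(x))  # the last evaluated expression is returned
}

scaled <- rescale01(c(2, 4, 6))     # 0.0 0.5 1.0
print(scaled)
```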
Control flow refers to managing the order in which expressions are executed in a program.
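A minimal sketch of the common control flow constructs (if…else, for, while, break, next), using made-up values:

```r
# if...else: choose between two branches
x <- 7
if (x > 5) {
  size <- "big"
} else {
  size <- "small"
}

# for loop with next and break: add up the odd numbers up to 7
total <- 0
for (i in 1:10) {
  if (i %% 2 == 0) next   # skip even numbers
  if (i > 7) break        # exit the loop early
  total <- total + i
}

# while loop: keep doubling as long as n is below 100
n <- 1
while (n < 100) {
  n <- n * 2
}

print(size)    # "big"
print(total)   # 16
print(n)       # 128
```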
- if…else: if something is true, do this; otherwise do that
- for loops: repeat code a specific number of times
- while loops: repeat code as long as certain conditions are true
- break: exit a loop early
- next: skip to next iteration in a loop

If we have time
- [ with higher dim objects
- [[ and $: both [[ and [ work for vectors; use [[
- $ does partial matching without warning
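These last points can be sketched as follows. The matrix, data frame, and list are hypothetical, and the partial-matching example uses a plain list.

```r
# [ with higher-dimensional objects: one index per dimension
m <- matrix(1:6, nrow = 2)   # filled column by column
cell <- m[1, 2]              # 3: row 1, column 2
col2 <- m[, 2]               # 3 4: all rows of column 2
row2 <- m[2, ]               # 2 4 6: all columns of row 2

df <- data.frame(word = c("cat", "dog"), freq = c(10, 7))
one_value <- df[1, "freq"]   # 10
freq_col  <- df[["freq"]]    # 10 7

# $ does partial matching on lists without warning; [[ requires an exact name
lst <- list(frequency = 42)
partial <- lst$freq          # 42: "freq" silently matches "frequency"
exact   <- lst[["freq"]]     # NULL: no element is named exactly "freq"

print(col2)
print(partial)
```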
Have a great weekend!