# Problem set 1

due Monday, September 9, 2024 at 11:59am

Instructions
Upload your `.ipynb` notebook to Gradescope by 11:59am on the due date. In a text cell at the top of your notebook, please include your name, the title "Problem set 1", and the names of any collaborators you worked with. Please also number your problems and include comments in your code to indicate which part of a problem you are working on.

## Problem 1

Suppose you track the number of hours you spend studying each day for a week. Create a vector with the following values and store it as `study_hours`: `0 4 5 1 0 0 7`. Use R’s built-in functions to compute the total number of hours you studied that week, the average number of hours studied per day, and the maximum number of hours studied on a single day. Then perform a comparison operation on the `study_hours` vector to determine whether each day’s study hours were greater than zero.
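As a warm-up (not the solution), the same kind of built-in functions can be tried on a different vector. The name `cups_of_coffee` and its values are made up purely for illustration.

```r
# Hypothetical data: cups of coffee per day (not the assignment's vector)
cups_of_coffee <- c(2, 0, 3, 1, 1)

total_cups <- sum(cups_of_coffee)   # total across all days
mean_cups  <- mean(cups_of_coffee)  # average per day
max_cups   <- max(cups_of_coffee)   # largest single-day value

# A comparison is applied element-wise and returns a logical vector
had_coffee <- cups_of_coffee > 0
had_coffee
```

Note that the comparison returns one `TRUE`/`FALSE` value per element rather than a single answer.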

## Problem 2

Create the matrix given below. Subtract 6 from every number in the matrix and store the output as a new matrix called `new_matrix`. Then use subsetting to return the value in the first row and third column of `new_matrix`.

```
     [,1] [,2] [,3]
[1,]   -5   -2    1
[2,]   -4   -1    2
[3,]   -3    0    3
```
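For orientation, here is a minimal sketch of matrix arithmetic and `[row, column]` subsetting on a different, made-up matrix. The names `m` and `m_plus_ten` are hypothetical and the values are not those of the problem.

```r
# Hypothetical 2 x 3 matrix, filled column by column (R's default)
m <- matrix(1:6, nrow = 2)

# Arithmetic with a single number applies to every element
m_plus_ten <- m + 10

# Subset with [row, column]: second row, first column
m_plus_ten[2, 1]
```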

## Problem 3

Create a data frame that looks like the one below. Return the structure of the data frame with `str()`. Use subsetting to select the `age` column so that the result is a vector (not a data frame). Then use a comparison operation on the vector to determine whether each individual is over 80 years old.

```
  age height  major score firstgen
1  30     65 cogsci   100     TRUE
2  45     66   ling    75    FALSE
3  81     72  psych    88     TRUE
4  27     59   ling    97    FALSE
```
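The difference between getting a data frame back and getting a vector back can be sketched on a small, made-up data frame. The `pets` data and all of its column names are hypothetical and are not part of the problem.

```r
# Hypothetical data frame (not the one in the problem)
pets <- data.frame(
  name   = c("Rex", "Tom", "Ivy"),
  weight = c(30.5, 4.2, 6.0),
  indoor = c(FALSE, TRUE, TRUE)
)

str(pets)                        # compact summary of columns and their types

weights_df  <- pets["weight"]    # single brackets keep a data frame
weights_vec <- pets[["weight"]]  # double brackets (or pets$weight) give a vector

weights_vec > 5                  # element-wise comparison on the vector
```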

## Problem 4

Read the documentation for the `emojifont` package. Install and load the package. Use the package’s `search_emoji()` function to find all of the emojis with hearts. Then use the `emoji()` function to return all of these emojis as a vector, as shown below. Finally, select your favorite emoji and visualize it with ggplot, using `theme_void()`.

Hint

`emojifont` has its own ggplot geom called `geom_emoji()`! Read the `emojifont` docs to learn more.

An example of a vector of heart emojis (approximately the same is fine!):

```
 [1] "😍"    "😘"    "😻"    "💑"    "💑"    "👩‍❤️‍👩" "👨‍❤️‍👨" "❤️"     "💛"
[10] "💚"    "💙"    "💜"    "🖤"    "💔"    "❣️"     "💕"    "💞"    "💓"
[19] "💗"    "💖"    "💝"    "💟"    "♥️"
```

An example of a favorite emoji visualized with ggplot and `theme_void()`:

## Problem 5

Problems 5-7 make use of the `english` dataset in the `languageR` package. From the documentation:

This data set gives mean visual lexical decision latencies and word naming latencies to 2284 monomorphemic English nouns and verbs, averaged for old and young subjects, with various predictor variables.

Install and load the `languageR` library. Use `str()` to return the structure of the `english` dataset. Use subsetting via the `$` operator and the `typeof()` function to return the type of the `NounFrequency` column.
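To see how `$` and `typeof()` work together, here is a short sketch on the built-in `iris` dataset, which stands in for `english` purely as an illustration.

```r
# iris ships with base R; it stands in for the assignment's dataset here
str(iris)

# `$` extracts a single column as a vector
typeof(iris$Sepal.Length)  # a numeric column is stored as "double"

# A factor is stored as integer codes, so typeof() and class() differ
typeof(iris$Species)
class(iris$Species)
```

`typeof()` reports how R stores the values internally, which is not always the same as what `class()` or `str()` shows.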

## Problem 6

Use the `WrittenFrequency`, `Familiarity`, and `WordCategory` columns in the `english` dataset to recreate (as faithfully as possible) the figure below.

## Problem 7

Compute the mean of `Familiarity` and store it in a variable called `mean_familiarity`. Add a dashed horizontal line on top of the dots in your figure to indicate this value, as shown below. Also add a text annotation above the dashed line to indicate that the line marks the mean familiarity.

Hint

`geom_hline()` adds a horizontal line and `annotate()` adds a text layer! Investigate these geoms and layers to figure out this problem!
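As a rough sketch of how these two layers combine, the example below uses the built-in `mtcars` data rather than `english`. The variable name `mean_mpg`, the label text, and the annotation position are assumptions chosen for illustration, so expect to adjust them for your own figure.

```r
library(ggplot2)

# Illustration on the built-in mtcars data, not the assignment's dataset
mean_mpg <- mean(mtcars$mpg)

p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  # dashed horizontal line at the mean of the y variable
  geom_hline(yintercept = mean_mpg, linetype = "dashed") +
  # text label placed slightly above the line
  annotate("text", x = 5, y = mean_mpg + 1, label = "mean mpg")

p
```

Because layers are drawn in the order they are added, putting `geom_hline()` after `geom_point()` draws the line on top of the points.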