# problem set 1

Make sure you run this chunk before attempting any of the problems:

    library(tidyverse)

## 2 Basics

Calculate $2 + 2$:

    2 + 2
    ## [1] 4

Calculate $2 * 3$:

    # your code here

Calculate $(2 + 2) \times (32 + 5)(6/4)$:

    # your code here

## 3 dplyr

Let’s work with the data set diamonds:

    data(diamonds) # this will load a dataset called "diamonds"

Calculate the average price of a diamond. Use the %>% and summarise() syntax (hint: see lectures).

    # your code here

Calculate the average, median, and standard deviation of the price of a diamond. Use the %>% and summarise() syntax.

    # your code here

Use group_by() to group diamonds by color, then use summarise() to calculate the average price and the standard deviation in price by color:

    # your code here

Use filter() to remove observations with a depth greater than 62, then use group_by() to group diamonds by clarity, then use summarise() to find the maximum price of a diamond by clarity:

    # your code here

Use mutate() and log() to add a new variable called “log_price” to the data. Make sure you add the variable to the dataset diamonds.

    # your code here

(Hint: if I wanted to add a variable called “max_price” that calculates the max price, the code would look like this:)

    diamonds = diamonds %>%
      mutate(max_price = max(price))

## 4 ggplot2

Continue using diamonds.
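The plots in this section all build on the same ggplot2 pattern: a ggplot() call that maps variables to aesthetics, followed by layers and themes added with +. Here is a minimal sketch of that pattern. It assumes the built-in mtcars data rather than diamonds, and uses a boxplot, which is not one of the exercises. The object name p is hypothetical.

```r
library(ggplot2)

# mtcars ships with R; it stands in for diamonds here (an assumption for illustration)
p <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +  # cylinders on x, miles per gallon on y
  geom_boxplot() +                                    # one layer: a boxplot per cylinder group
  theme_minimal()                                     # replace the default theme

p  # printing the object draws the plot
```

Swapping geom_boxplot() for another geom, or theme_minimal() for another theme, changes the plot without touching the ggplot() call.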
Use geom_histogram() to plot a histogram of prices:

    # your code here

Use geom_density() to plot the density of log prices (the variable you added to the data frame):

    # your code here

Use geom_point() to plot carats against log prices (i.e. carats on the x-axis, log prices on the y-axis):

    # your code here

Same as above, but now add a regression line with geom_smooth():

    # your code here

Use stat_summary() to make a bar plot of average log price by cut:

    # your code here

Same as above, but change the theme to theme_classic():

    # your code here

## 5 Inference

Use lm() to estimate the model
$$\log(\text{price}) = \beta_0 + \beta_1 \, \text{carat} + \beta_2 \, \text{table} + \varepsilon$$
and store the output in an object called “m1”:

    # your code here

Use summary() to view the output of “m1”:

    # your code here

Use lm() to estimate the model
$$\log(\text{price}) = \beta_0 + \beta_1 \, \text{carat} + \beta_2 \, \text{table} + \beta_3 \, \text{depth} + \varepsilon$$
and store the output in an object called “m2”:

    # your code here

Use summary() to view the output of “m2”:
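The lm() and summary() steps in this section share one workflow: describe the model with a formula, store the fitted object, then inspect it. Here is a minimal sketch of that workflow. It assumes the built-in mtcars data instead of diamonds. The object name fit and the chosen variables are hypothetical and are not part of the problem set.

```r
# mtcars ships with base R, so this sketch needs no extra packages
# Model: log(mpg) = b0 + b1 * wt + b2 * hp + error
fit <- lm(log(mpg) ~ wt + hp, data = mtcars)

summary(fit)  # coefficient table with standard errors, plus R-squared
coef(fit)     # only the estimated coefficients, as a named vector
```

The left-hand side of the formula can contain a transformation such as log(), so the logged variable does not have to exist in the data frame beforehand.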