1. In this exercise we will build regression models for predicting house prices, using data collected on 91 houses in Gainesville, Florida. The dataset, which can be found on Moodle, contains the selling price of each house and four explanatory variables.
The variables contained in the dataset are:
Y: Price. It is measured in thousands of dollars.
X1: Area. It is the floor area of the house measured in thousands of square feet.
X2: Bed. The number of bedrooms of the house.
X3: Bath. The number of bathrooms of the house.
X4: Pool. Indicates whether the house has a swimming pool (1 if the house has a pool, 0 otherwise).
Simple linear regression.
i) Fit three simple linear regression models, each using one of area, bed, and bath as its only predictor. Report the estimated parameters of the model that you consider most useful for predicting house prices, and explain why you consider it the most useful.
ii) Assuming that the best single-predictor model is the one using area, provide a 99% confidence interval for the mean price of houses with an area of 2500 square feet (that is, X1 = 2.5, since area is recorded in thousands of square feet).
iii) Suppose your neighbors own a house with an area of 2500 square feet (X1 = 2.5). Obtain a 99% prediction interval for the selling price of the house if they decide to sell it.
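
As a rough illustration of the workflow in parts i) to iii), the sketch below fits one simple linear regression per predictor, compares the fits by R-squared (one possible criterion, not necessarily the one expected here), and then computes a 99% confidence interval for the mean response and a 99% prediction interval at area = 2.5. The data are made-up toy values, not the Gainesville dataset, and the helper names fit_simple_ols and intervals are hypothetical. The t critical value is passed in by hand: 4.604 is the 0.995 quantile for the toy data's 4 degrees of freedom, while the real dataset would have 91 - 2 = 89 degrees of freedom (roughly 2.632).

```python
import math
from statistics import mean


def fit_simple_ols(x, y):
    """Least-squares fit of y = b0 + b1 * x; returns summary quantities."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return {
        "n": n, "b0": b0, "b1": b1, "x_bar": x_bar, "sxx": sxx,
        "s": math.sqrt(sse / (n - 2)),   # residual standard error
        "r2": 1 - sse / sst,             # coefficient of determination
    }


def intervals(fit, x0, t_crit):
    """Return (confidence interval for the mean, prediction interval) at x0."""
    y_hat = fit["b0"] + fit["b1"] * x0
    leverage = 1 / fit["n"] + (x0 - fit["x_bar"]) ** 2 / fit["sxx"]
    half_ci = t_crit * fit["s"] * math.sqrt(leverage)
    half_pi = t_crit * fit["s"] * math.sqrt(1 + leverage)  # extra 1: new-observation noise
    return (y_hat - half_ci, y_hat + half_ci), (y_hat - half_pi, y_hat + half_pi)


# Made-up toy data (NOT the Gainesville dataset); same units as the exercise.
price = [95.0, 130.0, 170.0, 200.0, 255.0, 280.0]   # thousands of dollars
predictors = {
    "area": [1.0, 1.5, 2.0, 2.5, 3.0, 3.5],         # thousands of square feet
    "bed": [2, 3, 3, 4, 3, 4],
    "bath": [1, 2, 2, 2, 3, 3],
}

# Part i): one simple regression per predictor, compared by R-squared.
fits = {name: fit_simple_ols(x, price) for name, x in predictors.items()}
best = max(fits, key=lambda name: fits[name]["r2"])
for name, f in fits.items():
    print(f"{name}: b0={f['b0']:.2f}, b1={f['b1']:.2f}, R^2={f['r2']:.3f}")

# Parts ii) and iii): 2500 square feet corresponds to area = 2.5.
T_CRIT = 4.604  # 0.995 quantile of t with 4 df (toy data); use 89 df for 91 houses
ci, pi = intervals(fits["area"], 2.5, T_CRIT)
print("99% CI for the mean price:", ci)
print("99% prediction interval:  ", pi)
```

The prediction interval is always wider than the confidence interval at the same point, because it accounts for the variability of an individual house's price in addition to the uncertainty in the estimated mean. If SciPy is available, the critical value can be computed with scipy.stats.t.ppf(0.995, n - 2) instead of being read from a table.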