Posts

Showing posts from November, 2019

Investigating House Prices

Having now learnt a variety of machine learning models, as well as evaluation methods, it was time for some light practice using an old Kaggle dataset on house prices. Predicting house prices has a variety of applications, from prospective buyers estimating the price of a house based on desired features to homeowners who wish to value their house before selling. While the analysis itself was carried out in R, I also did a little of the cleaning in Python to gain further practice.

Data Exploration

The training dataset is quite small, with only 1460 observations and about 79 variables. These variables cover a variety of salient features of each house, including the neighbourhood, square footage, number of rooms, age, and whether the house has a fireplace, pool or garage. Let's begin by exploring the data. Here is the distribution of house prices. We can see that the distribution is skewed to the right, with most houses around t...
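A quick numerical check can back up what the histogram suggests about right skew. The snippet below is only a rough sketch in Python, not the code used for the post: the `prices` list is made up for illustration and is not taken from the Kaggle data, and `skewness` is a hypothetical helper name. A right-skewed sample should show a mean above the median and a positive skewness coefficient.

```python
from statistics import mean, median, pstdev


def skewness(values):
    """Population skewness: the average of the cubed z-scores."""
    mu = mean(values)
    sigma = pstdev(values)
    return sum(((v - mu) / sigma) ** 3 for v in values) / len(values)


# Hypothetical sale prices, for illustration only (not from the dataset).
prices = [120_000, 135_000, 150_000, 160_000, 175_000, 190_000, 250_000, 480_000]

# A few expensive houses pull the mean above the median.
print("mean above median:", mean(prices) > median(prices))

# A long right tail gives a positive skewness coefficient.
print("positive skewness:", skewness(prices) > 0)
```

With the real data, the same check could be run on the sale price column once it is loaded. A log transform of the price is a common way to reduce this kind of skew before modelling, though whether it helps depends on the model.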