| Summary: | This thesis investigates the joint role of house-specific features, primarily, together with
sociodemographic and macroeconomic features in rental price prediction. The research focuses on
the property rental market in Brazil, using a dataset of over 10,000 properties. By employing
machine learning algorithms such as Linear Regression, the Random Forest Regressor and
Support Vector Regression, we aim to find the set of features that contributes most to the model
quality. The analysis showed that the highest explanatory power and the lowest error come from the
combination of the house-specific features and the city dummies. Optimized versions of these
machine learning algorithms are then used to forecast rental prices from this feature set, in order
to evaluate their performance and extract the feature importances. The model that improved most
was the Tuned Random Forest Regressor, although its performance metrics were quite similar to
those of the Tuned Support Vector
Regression. The results of the analysis show that the most important features in the forecasting
procedure are the number of bathrooms, the size of the rooms and the number of parking spaces. Additionally,
the beta coefficients imply that properties located on the top floors have a considerably higher rental
price, while properties situated in Porto Alegre or in Campinas face a negative impact on rental
prices due to their location.
|