Machine Learning Valuation in Dual Market Dynamics: A Case Study of the Formal and Informal Real Estate Market in Dar es Salaam
Figure 1. Histogram of the dependent variable.
Figure 2. (a–p) The eight ML models’ in-sample and out-of-sample performance (prediction error) on the formal market. Note: The figure shows the prediction error, defined as actual minus predicted values, for in-sample (left, in green) and out-of-sample (right, in blue) data. Each pair of diagrams relates to a specific learner. The vertical axis measures the prediction error in Tanzanian shillings (TZS 1,000,000), and the horizontal axis measures each transaction’s identification number. The figure refers only to valuations made on the formal market. The results are based on the estimates in Table 2.
Figure 3. (a–p) The eight ML models’ in-sample and out-of-sample performance (prediction error) on the formal and informal market. Note: The figure shows the prediction error, defined as actual minus predicted values, for in-sample (left, in green) and out-of-sample (right, in blue) data. Each pair of diagrams relates to a specific learner. The vertical axis measures the prediction error in Tanzanian shillings (TZS 1,000,000), and the horizontal axis measures each transaction’s identification number. The figure refers to valuations made on both the formal and the informal market. The results are based on the estimates in Table 3.
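The figure notes define the plotted quantity as actual minus predicted value, one point per transaction. A minimal sketch of that calculation follows; the function name and the three price pairs are hypothetical illustrations, not values from the study.

```python
def prediction_errors(actual, predicted):
    """Signed prediction error per transaction: actual minus predicted."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return [a - p for a, p in zip(actual, predicted)]


# Hypothetical prices in TZS 1,000,000 (not taken from the paper's data).
actual = [120.0, 95.0, 300.0]
predicted = [110.0, 100.0, 250.0]

errors = prediction_errors(actual, predicted)

# Pair each error with a transaction identification number, as on the
# horizontal axis of the figures.
for transaction_id, error in enumerate(errors, start=1):
    print(transaction_id, error)
```

Under this sign convention a positive error means the learner undervalued the property, and a negative error means it overvalued it.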
Abstract
1. Introduction
2. Literature Review
3. Machine Learning Techniques
3.1. Learner: Regression
3.2. Learner: Elastic Net
3.3. Learner: Tree
3.4. Learner: Forest
3.5. Learner: Boost
3.6. Learner: SVM
3.7. Learner: Neural Network
3.8. Learner: Nearest Neighbour
4. Data Analysis and Evaluation of Models
4.1. The Research Design and Data Used
4.2. Machine Learning Results
5. Discussion
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
Variable | Obs | Mean | Std. Dev. | Min | Max |
---|---|---|---|---|---|
Price (TZS 1,000,000) | 954 | 193.124 | 184.342 | 5 | 1566 |
Kimabu | 954 | 0.481 | 0.5 | 0 | 1 |
Goba | 954 | 0.123 | 0.328 | 0 | 1 |
Tabata | 954 | 0.129 | 0.335 | 0 | 1 |
Kawe | 954 | 0.071 | 0.257 | 0 | 1 |
No storeys | 954 | 1.06 | 0.246 | 1 | 3 |
Roof cas | 954 | 0.158 | 0.365 | 0 | 1 |
Roof asbestos | 954 | 0.021 | 0.143 | 0 | 1 |
Roof clay tiles | 954 | 0.093 | 0.291 | 0 | 1 |
Ceil gypchtng | 954 | 0.773 | 0.419 | 0 | 1 |
Window wood | 954 | 0.637 | 0.481 | 0 | 1 |
Floor cerrtiles | 954 | 0.405 | 0.491 | 0 | 1 |
Floor terrazo | 954 | 0.006 | 0.079 | 0 | 1 |
No. bedrooms | 954 | 3.351 | 0.944 | 1 | 8 |
Plotsize | 954 | 303.32 | 268.455 | 24 | 2000 |
Fence | 954 | 0.51 | 0.5 | 0 | 1 |
2010 | 954 | 0.072 | 0.259 | 0 | 1 |
2011 | 954 | 0.039 | 0.193 | 0 | 1 |
2012 | 954 | 0.048 | 0.214 | 0 | 1 |
2013 | 954 | 0.08 | 0.271 | 0 | 1 |
2014 | 954 | 0.078 | 0.268 | 0 | 1 |
2016 | 954 | 0.146 | 0.353 | 0 | 1 |
2017 | 954 | 0.151 | 0.358 | 0 | 1 |
2018 | 954 | 0.128 | 0.334 | 0 | 1 |
2019 | 954 | 0.101 | 0.301 | 0 | 1 |
Distance road | 954 | 0.154 | 0.191 | 0.005 | 1.64 |
Distance hospital | 954 | 1.612 | 1.479 | 0.045 | 10.544 |
Distance airport | 954 | 14.864 | 7.49 | 3.161 | 33.707 |
Distance food market | 954 | 2.439 | 2.716 | 0.131 | 12.53 |
Y-coordinate | 954 | −6.746 | 0.069 | −7.004 | −6.579 |
X-coordinate | 954 | 39.204 | 0.041 | 39.083 | 39.342 |
Learner | MAPE (Training) | MAPE (Testing) | MSE (Training) | MSE (Testing) | Cross-Validation (Training) | Cross-Validation (Testing) |
---|---|---|---|---|---|---|
Regression | 68.142 | 92.410 | 11,503.529 | 13,731.797 | 0.745 | 0.229 |
Elastic net | 80.727 | 89.890 | 22,128.828 | 20,498.686 | 0.495 | 0.336 |
Regression tree | 74.329 | 94.020 | 17,022.891 | 26,605.547 | 0.599 | 0.045 |
Boost | 83.225 | 101.460 | 16,299.616 | 13,646.772 | 0.604 | 0.188 |
Forest | 35.586 | 56.420 | 7844.813 | 15,963.183 | 0.838 | 0.477 |
Neural network | 88.532 | 108.594 | 23,820.044 | 21,733.466 | 0.562 | 0.392 |
SVM | 9.306 | 52.399 | 9031.117 | 17,819.553 | 0.250 | 0.154 |
Nearest neighbour | 0.000 | 37.611 | 0.000 | 14,385.118 | 1.000 | 0.335 |
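The table reports MAPE and MSE for each learner. The exact formulas used by the authors are not shown in this section, so the sketch below assumes the standard textbook definitions: MAPE as the mean of |actual − predicted| / |actual|, expressed in percent, and MSE as the mean squared error. The function names and sample prices are hypothetical.

```python
def mape(actual, predicted):
    """Mean absolute percentage error in percent (standard definition, assumed)."""
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


def mse(actual, predicted):
    """Mean squared error (standard definition, assumed)."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sum(squared) / len(squared)


# Hypothetical prices in TZS 1,000,000 (not taken from the paper's data).
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]

print(round(mape(actual, predicted), 3))  # mean of 10%, 10% and 0%
print(round(mse(actual, predicted), 3))   # mean of 100, 400 and 0
```

With prices measured in TZS 1,000,000, the MSE is expressed in the square of that unit, which is why its values are several orders of magnitude larger than the MAPE values in the same row.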
Learner | MAPE (Training) | MAPE (Testing) | MSE (Training) | MSE (Testing) | Cross-Validation (Training) | Cross-Validation (Testing) |
---|---|---|---|---|---|---|
Regression | 80.256 | 73.098 | 11,796.093 | 12,532.698 | 0.672 | 0.370 |
Elastic net | 74.886 | 69.075 | 12,586.828 | 13,439.529 | 0.647 | 0.424 |
Regression tree | 152.035 | 137.899 | 24,895.351 | 28,213.442 | 0.425 | 0.040 |
Boost | 37.652 | 47.985 | 4426.025 | 13,670.108 | 0.842 | 0.203 |
Forest | 30.897 | 52.652 | 4709.234 | 13,728.414 | 0.880 | 0.503 |
Neural network | 59.752 | 63.486 | 14,859.287 | 15,296.036 | 0.344 | 0.206 |
SVM | 3.830 | 92.332 | 2937.825 | 15,161.066 | 0.921 | 0.239 |
Nearest neighbour | 0.000 | 75.580 | 0.000 | 13,702.390 | 1.000 | 0.067 |
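One way to read the table above is to rank the learners by testing MAPE and to look at the gap between testing and training MAPE as a rough indicator of overfitting. The sketch below does this with the MAPE columns copied from the table; the helper names and the use of a simple difference as the gap measure are illustrative assumptions, not the authors' procedure.

```python
# (training MAPE, testing MAPE) copied from the table above.
mape_by_learner = {
    "Regression": (80.256, 73.098),
    "Elastic net": (74.886, 69.075),
    "Regression tree": (152.035, 137.899),
    "Boost": (37.652, 47.985),
    "Forest": (30.897, 52.652),
    "Neural network": (59.752, 63.486),
    "SVM": (3.830, 92.332),
    "Nearest neighbour": (0.000, 75.58),
}


def rank_by_testing(scores):
    """Learner names ordered from lowest to highest testing MAPE."""
    return sorted(scores, key=lambda name: scores[name][1])


def train_test_gap(scores):
    """Testing MAPE minus training MAPE for each learner (illustrative measure)."""
    return {name: round(test - train, 3) for name, (train, test) in scores.items()}


ranking = rank_by_testing(mape_by_learner)
gaps = train_test_gap(mape_by_learner)

for name in ranking:
    train, test = mape_by_learner[name]
    print(f"{name:18s} train={train:8.3f} test={test:8.3f} gap={gaps[name]:8.3f}")
```

On these figures, Boost and Forest have the lowest testing MAPE, while SVM and Nearest neighbour combine near-zero training error with large testing error, the pattern usually associated with overfitting.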
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Nyanda, F.; Muyingo, H.; Wilhelmsson, M. Machine Learning Valuation in Dual Market Dynamics: A Case Study of the Formal and Informal Real Estate Market in Dar es Salaam. Buildings 2024, 14, 3172. https://doi.org/10.3390/buildings14103172