research-article

Location-Centered House Price Prediction: A Multi-Task Learning Approach

Published: 05 January 2022

Abstract

Accurate house price prediction is of great significance to real estate stakeholders such as house owners, buyers, and investors. We propose a location-centered prediction framework that differs from existing work in both data profiling and the prediction model. Regarding data profiling, we make an important observation: besides in-house features such as floor area, location plays a critical role in house price prediction. Unfortunately, existing work has either overlooked location or measured it only at a coarse granularity. We therefore define and capture a fine-grained location profile powered by a diverse range of location data sources, comprising a transportation profile, an education profile, a suburb profile based on census data, and a facility profile. Regarding the choice of prediction model, we observe that existing approaches either model the entire data set as a whole, or split the house data into partitions and model each partition independently. However, such modeling ignores the relatedness among partitions, and under the latter approach a partition may not contain enough training samples for every prediction scenario. We address this problem through a careful study of the Multi-Task Learning (MTL) model. Specifically, we map the strategies for splitting the house data to the ways tasks are defined in MTL, and select specific MTL-based methods with different regularization terms to capture and exploit the relatedness among tasks. Based on real-world house transaction data collected in Melbourne, Australia, we design extensive experimental evaluations, and the results indicate that MTL-based methods significantly outperform state-of-the-art approaches. We also conduct an in-depth analysis of how task definitions and method selections in MTL affect prediction performance, and demonstrate that the impact of task definitions far exceeds that of method selections.
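
To make the idea of "tasks defined by data partitions, tied together by a regularization term" concrete, the sketch below fits one linear model per partition while pulling every task's weights toward their shared mean. This is only an illustration under stated assumptions: the abstract does not say which regularizers the article uses, so mean-regularized ridge regression is a stand-in choice, and the function name `fit_mean_regularized_mtl`, the parameter `lam`, the synthetic features, and the idea of partitioning by suburb are all hypothetical.

```python
import numpy as np


def fit_mean_regularized_mtl(tasks, lam=1.0, n_iters=50):
    """Fit one linear model per task, shrinking each toward the mean weights.

    tasks: list of (X_t, y_t) pairs, one per partition (hypothetical: one per suburb).
    Minimizes sum_t ||X_t w_t - y_t||^2 + lam * sum_t ||w_t - w_bar||^2
    by alternating between the per-task weights and their mean w_bar.
    Returns an array of shape (n_tasks, n_features).
    """
    n_features = tasks[0][0].shape[1]
    w_bar = np.zeros(n_features)
    W = np.zeros((len(tasks), n_features))
    for _ in range(n_iters):
        for t, (X, y) in enumerate(tasks):
            # Closed-form ridge-style update for task t with w_bar held fixed.
            A = X.T @ X + lam * np.eye(n_features)
            b = X.T @ y + lam * w_bar
            W[t] = np.linalg.solve(A, b)
        # With the task weights fixed, the optimal shared vector is their mean.
        w_bar = W.mean(axis=0)
    return W


if __name__ == "__main__":
    # Synthetic stand-in data: three related tasks with few samples each.
    rng = np.random.default_rng(0)
    base = np.array([3.0, 1.5, -0.5])  # shared price drivers (made up)
    tasks = []
    for _ in range(3):
        X = rng.normal(size=(8, 3))
        w_task = base + 0.1 * rng.normal(size=3)  # small task-specific deviation
        y = X @ w_task + 0.1 * rng.normal(size=8)
        tasks.append((X, y))
    print(fit_mean_regularized_mtl(tasks, lam=5.0))
```

Setting `lam=0` recovers independent per-partition least squares, while a larger `lam` shares more information across partitions, which is the trade-off the abstract describes for partitions with few training samples.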

Published In

ACM Transactions on Intelligent Systems and Technology, Volume 13, Issue 2
April 2022
392 pages
ISSN:2157-6904
EISSN:2157-6912
DOI:10.1145/3508464
  • Editor: Huan Liu

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 05 January 2022
Accepted: 01 November 2021
Revised: 01 August 2021
Received: 01 November 2020
Published in TIST Volume 13, Issue 2

Author Tags

  1. Price prediction
  2. real estate
  3. multi-task learning
  4. multiple auxiliary information
  5. linear regression

Qualifiers

  • Research-article
  • Refereed

Funding Sources

  • National Natural Science Foundation of China
  • International Innovation Cooperation Province
  • High-Level Introduction of Talent Scientific Research Start-up Fund of Jiangsu Police Institute
