HiPaR: Hierarchical Pattern-Aided Regression

Luis Galárraga, Olivier Pelgrin & Alexandre Termier

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12712)

Included in the following conference series:

Pacific-Asia Conference on Knowledge Discovery and Data Mining

Abstract

We introduce HiPaR, a novel pattern-aided regression method for data with both categorical and numerical attributes. HiPaR mines hybrid rules of the form \(p \Rightarrow y = f(X)\), where p is the characterization of a data region and f(X) is a linear regression model on a variable of interest y. The novelty of the method lies in combining an enumerative approach to explore the space of regions with efficient heuristics that guide the search. This strategy provides more flexibility when selecting a small set of jointly accurate and human-readable hybrid rules that explain the entire dataset. As our experiments show, HiPaR mines fewer rules than existing pattern-based regression methods while still attaining state-of-the-art prediction performance.
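To make the notion of a hybrid rule \(p \Rightarrow y = f(X)\) concrete, the following minimal sketch fits a linear model only on the rows selected by a pattern p and compares it with a single global model. It does not reproduce HiPaR's search or rule selection; the toy data, the attribute names (`property_type`, `rooms`, `price`) and the helper `fit_linear` are hypothetical and chosen purely for illustration.

```python
import numpy as np

# Hypothetical toy data: one categorical and one numerical attribute.
rng = np.random.default_rng(0)
n = 200
property_type = rng.choice(["cottage", "flat"], size=n)
rooms = rng.integers(1, 6, size=n).astype(float)

# The target follows a different linear trend in each region
# (noise-free so that the fitted coefficients are easy to check).
price = np.where(property_type == "cottage",
                 50.0 + 30.0 * rooms,
                 20.0 + 10.0 * rooms)

X = rooms.reshape(-1, 1)


def fit_linear(mask, X, y):
    """Least-squares fit of y = w0 + w . X on the rows selected by mask."""
    A = np.column_stack([np.ones(int(mask.sum())), X[mask]])
    coef, *_ = np.linalg.lstsq(A, y[mask], rcond=None)
    return coef


def rmse(coef, mask, X, y):
    """Root mean squared error of a fitted model on the rows selected by mask."""
    pred = coef[0] + X[mask] @ coef[1:]
    return float(np.sqrt(np.mean((pred - y[mask]) ** 2)))


# Pattern p: property_type = "cottage" (a single-condition region).
pattern = property_type == "cottage"
everything = np.ones(n, dtype=bool)

rule_coef = fit_linear(pattern, X, price)       # f(X) of the hybrid rule
global_coef = fit_linear(everything, X, price)  # one model for all rows

rule_rmse = rmse(rule_coef, pattern, X, price)
global_rmse = rmse(global_coef, pattern, X, price)
print("rule coefficients:", rule_coef)
print("RMSE on the region, rule vs. global:", rule_rmse, global_rmse)
```

In this toy setting the local model recovers the region's own trend, whereas the global model has to compromise between the two regions; that gap is the motivation for attaching a separate regression model to each pattern.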

Notes

1. We set \(\nu\) empirically to the \(k=85\)-th percentile of \(iv_D\) (line 4 in Algorithm 3). Lower percentiles did not yield better performance in our experimental datasets.
2. Our abuse of notation treats p as a set of conditions.
3. attrs(p) returns the set of attributes present in a pattern.
4. Hyper-parameters were tuned using Hyperopt. For CPXR we set \(\theta = 0.02\) as in [3].
5. These are the only publicly available datasets used in [3].
6. This limit is set to the average number of rules found by HiPaR in cross-validation.

References

1. HiPaR: hierarchical pattern-aided regression. Technical report. https://arxiv.org/abs/2102.12370
2. Breiman, L.: Random forests. Mach. Learn. 45(1), 5–32 (2001)
3. Dong, G., Taslimitehrani, V.: Pattern-aided regression modeling and prediction model analysis. IEEE Trans. Knowl. Data Eng. 27(9), 2452–2465 (2015)
4. Duivesteijn, W., Feelders, A., Knobbe, A.: Different slopes for different folks: mining for exceptional regression models with Cook’s distance. In: ACM SIGKDD (2012)
5. Duivesteijn, W., Feelders, A.J., Knobbe, A.: Exceptional model mining. Data Min. Knowl. Disc. 30(1), 47–98 (2015). https://doi.org/10.1007/s10618-015-0403-4
6. Fayyad, U.M., Irani, K.B.: Multi-interval discretization of continuous-valued attributes for classification learning. In: IJCAI (1993)
7. Friedman, J.H.: Greedy function approximation: a gradient boosting machine. Ann. Stat. 29(5), 1189–1232 (2001)
8. Friedman, J.H., Popescu, B.E.: Predictive learning via rule ensembles. Ann. Appl. Stat. 2(3), 916–954 (2008)
9. Grosskreutz, H., Rüping, S.: On subgroup discovery in numerical domains. Data Min. Knowl. Disc. 19(2), 210–226 (2009)
10. Herrera, F., Carmona, C.J., González, P., del Jesus, M.J.: An overview on subgroup discovery: foundations and applications. Knowl. Inf. Syst. 29(3), 495–525 (2011)
11. Kramer, S.: Structural regression trees. In: AAAI (1996)
12. Malerba, D., Esposito, F., Ceci, M., Appice, A.: Top-down induction of model trees with regression and splitting nodes. IEEE Trans. Pattern Anal. Mach. Intell. 26(5), 612–625 (2004)
13. McGee, V.E., Carleton, W.T.: Piecewise regression. J. Am. Stat. Assoc. 65(331), 1109–1124 (1970)
14. Morishita, S., Sese, J.: Traversing itemset lattices with statistical metric pruning. In: SIGMOD/PODS (2000)
15. Uno, T., Asai, T., Uchida, Y., Arimura, H.: LCM: an efficient algorithm for enumerating frequent closed item sets. In: FIMI (2003)
16. Wang, Y., Witten, I.H.: Inducing model trees for continuous classes. In: ECML Poster Papers (1997)

Author information

Authors and Affiliations

Inria, Rennes, France
Luis Galárraga
Aalborg University, Aalborg, Denmark
Olivier Pelgrin
University of Rennes I, Rennes, France
Alexandre Termier

Corresponding author

Correspondence to Luis Galárraga.

Editor information

Editors and Affiliations

IIIT, Hyderabad, Hyderabad, India
Kamal Karlapalem
Chinese University of Hong Kong, Shatin, Hong Kong
Hong Cheng
Virginia Tech, Arlington, VA, USA
Naren Ramakrishnan
Jawaharlal Nehru University, New Delhi, India
R. K. Agrawal
IIIT Hyderabad, Hyderabad, India
P. Krishna Reddy
University of Minnesota, Minneapolis, MN, USA
Jaideep Srivastava
IIIT Delhi, New Delhi, India
Tanmoy Chakraborty

Cite this paper

Galárraga, L., Pelgrin, O., Termier, A. (2021). HiPaR: Hierarchical Pattern-Aided Regression. In: Karlapalem, K., et al. Advances in Knowledge Discovery and Data Mining. PAKDD 2021. Lecture Notes in Computer Science, vol 12712. Springer, Cham. https://doi.org/10.1007/978-3-030-75762-5_26

DOI: https://doi.org/10.1007/978-3-030-75762-5_26
Published: 09 May 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-75761-8
Online ISBN: 978-3-030-75762-5
eBook Packages: Computer Science, Computer Science (R0)
