DOI: 10.1145/3292500.3330679
research-article

AutoCross: Automatic Feature Crossing for Tabular Data in Real-World Applications

Published: 25 July 2019

Abstract

Feature crossing captures interactions among categorical features and is useful for enhancing learning from tabular data in real-world businesses. In this paper, we present AutoCross, an automatic feature crossing tool provided by 4Paradigm to its customers, ranging from banks and hospitals to Internet corporations. By performing beam search in a tree-structured space, AutoCross enables efficient generation of high-order cross features, which existing works have not yet explored. Additionally, we propose successive mini-batch gradient descent and multi-granularity discretization to further improve efficiency and effectiveness, while ensuring simplicity so that no machine learning expertise or tedious hyper-parameter tuning is required. Furthermore, the algorithms are designed to reduce the computational, transmission, and storage costs involved in distributed computing. Experimental results on both benchmark and real-world business datasets demonstrate the effectiveness and efficiency of AutoCross. The results show that AutoCross can significantly enhance the performance of both linear and deep models.
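To make the terminology concrete, the following is a minimal sketch of what a cross feature and a multi-granularity discretization could look like. It is not AutoCross's implementation: the function names `cross`, `discretize`, and `multi_granularity`, the equal-width binning scheme, the chosen granularities, and the toy columns are all assumptions made for illustration. The beam search and successive mini-batch gradient descent mentioned in the abstract are not shown.

```python
from typing import Hashable, Sequence


def cross(*columns: Sequence[Hashable]) -> list[tuple]:
    """Cross feature: each row's value is the tuple of the input features' values."""
    return list(zip(*columns))


def discretize(values: Sequence[float], n_bins: int) -> list[int]:
    """Equal-width binning into n_bins buckets (an assumed scheme, for illustration only)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 when all values are equal
    # Clamp so the maximum value lands in the last bucket instead of overflowing.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def multi_granularity(values: Sequence[float], granularities=(2, 4)) -> dict[int, list[int]]:
    """Discretize one numerical feature at several granularities (hypothetical helper)."""
    return {g: discretize(values, g) for g in granularities}


# Toy tabular data (made up for this sketch).
city = ["NYC", "NYC", "SF", "SF"]
device = ["ios", "android", "ios", "ios"]
age = [18.0, 30.0, 42.0, 66.0]

# Second-order cross of two categorical features.
city_x_device = cross(city, device)

# A numerical feature becomes categorical at each granularity before crossing.
age_bins = multi_granularity(age)

# Higher-order cross: three features combined into one.
high_order = cross(city, device, age_bins[2])
```

Keeping several granularities side by side avoids committing to a single bin count up front; how AutoCross chooses among them is described in the paper itself and is not reproduced here.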


    Information & Contributors

    Information

    Published In

    KDD '19: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
    July 2019
    3305 pages
    ISBN:9781450362016
    DOI:10.1145/3292500
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. automl
    2. feature crossing
    3. tabular data

    Qualifiers

    • Research-article

    Conference

    KDD '19
    Acceptance Rates

    KDD '19 Paper Acceptance Rate 110 of 1,200 submissions, 9%;
    Overall Acceptance Rate 1,133 of 8,635 submissions, 13%

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months): 84
    • Downloads (Last 6 weeks): 4
    Reflects downloads up to 12 Dec 2024

    Citations

    Cited By

    • (2024) Cross Feature Engineering for Anti-Fraud Task in Insurance. Artificial Intelligence and Robotics Research 13:02 (467-477). DOI: 10.12677/AIRR.2024.132048. Online publication date: 2024
    • (2024) Exploring Cross-Site User Modeling without Cross-Site User Identity Linkage: A Case Study of Content Preference Prediction. ACM Transactions on Information Systems 43:1 (1-28). DOI: 10.1145/3697832. Online publication date: 1-Oct-2024
    • (2024) A Tutorial on Feature Interpretation in Recommender Systems. Proceedings of the 18th ACM Conference on Recommender Systems (1281-1282). DOI: 10.1145/3640457.3687094. Online publication date: 8-Oct-2024
    • (2024) OptDist: Learning Optimal Distribution for Customer Lifetime Value Prediction. Proceedings of the 33rd ACM International Conference on Information and Knowledge Management (2523-2533). DOI: 10.1145/3627673.3679712. Online publication date: 21-Oct-2024
    • (2024) Scalable Dynamic Embedding Size Search for Streaming Recommendation. Proceedings of the 33rd ACM International Conference on Information and Knowledge Management (1941-1950). DOI: 10.1145/3627673.3679638. Online publication date: 21-Oct-2024
    • (2024) Budgeted Embedding Table For Recommender Systems. Proceedings of the 17th ACM International Conference on Web Search and Data Mining (557-566). DOI: 10.1145/3616855.3635778. Online publication date: 4-Mar-2024
    • (2024) LightCS: Selecting Quadratic Feature Crosses in Linear Complexity. Companion Proceedings of the ACM Web Conference 2024 (38-46). DOI: 10.1145/3589335.3648300. Online publication date: 13-May-2024
    • (2024) Towards Cross-Table Masked Pretraining for Web Data Mining. Proceedings of the ACM Web Conference 2024 (4449-4459). DOI: 10.1145/3589334.3645707. Online publication date: 13-May-2024
    • (2024) I-Razor: A Differentiable Neural Input Razor for Feature Selection and Dimension Search in DNN-Based Recommender Systems. IEEE Transactions on Knowledge and Data Engineering 36:9 (4736-4749). DOI: 10.1109/TKDE.2023.3332671. Online publication date: Sep-2024
    • (2024) FeatAug: Automatic Feature Augmentation From One-to-Many Relationship Tables. 2024 IEEE 40th International Conference on Data Engineering (ICDE) (1805-1818). DOI: 10.1109/ICDE60146.2024.00146. Online publication date: 13-May-2024
