DOI: 10.1109/CASE49997.2022.9926420
Research article

CIPCaD-Bench: Continuous Industrial Process datasets for benchmarking Causal Discovery methods

Published: 20 August 2022

Abstract

Causal relationships are commonly examined in manufacturing processes to support fault investigations, perform interventions, and make strategic decisions. Industry 4.0 has made available an increasing amount of data that enables data-driven Causal Discovery (CD). Given the growing number of recently proposed CD methods, strict benchmarking procedures on publicly available datasets are necessary, since they are the foundation for a fair comparison and validation of different methods. This work introduces two novel public datasets for CD in continuous manufacturing processes. The first dataset employs the well-known Tennessee Eastman simulator for fault detection and process control. The second dataset is extracted from an ultra-processed food manufacturing plant and includes a description of the plant as well as multiple ground truths. These datasets are used to propose a benchmarking procedure that is based on different metrics and evaluated on a wide selection of CD algorithms. This work allows CD methods to be tested under realistic conditions, enabling the selection of the most suitable method for a specific target application. The datasets are available at the following link: [https://github.com/giovanniMen]
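The abstract does not name the metrics used in the benchmarking procedure. As an illustration only, the sketch below scores an estimated causal graph against a ground-truth graph with edge precision, recall, F1, and structural Hamming distance (SHD), which are common choices in the CD literature and are assumed here rather than taken from the paper. The function names and the three-node example graphs are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[int, int]
Adjacency = List[List[int]]  # adj[i][j] != 0 means a directed edge i -> j


def edges(adj: Adjacency) -> Set[Edge]:
    """Return the set of directed edges (i, j) present in the adjacency matrix."""
    return {(i, j) for i, row in enumerate(adj) for j, v in enumerate(row) if v}


def edge_scores(true_adj: Adjacency, est_adj: Adjacency) -> Dict[str, float]:
    """Precision, recall, and F1 over directed edges."""
    true_e, est_e = edges(true_adj), edges(est_adj)
    tp = len(true_e & est_e)  # edges recovered with the correct direction
    precision = tp / len(est_e) if est_e else 0.0
    recall = tp / len(true_e) if true_e else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


def shd(true_adj: Adjacency, est_adj: Adjacency) -> int:
    """Structural Hamming distance: number of node pairs whose edge status
    differs (missing, extra, or reversed); a reversed edge counts once."""
    n = len(true_adj)
    dist = 0
    for i in range(n):
        for j in range(i + 1, n):
            true_pair = (bool(true_adj[i][j]), bool(true_adj[j][i]))
            est_pair = (bool(est_adj[i][j]), bool(est_adj[j][i]))
            if true_pair != est_pair:
                dist += 1
    return dist


# Hypothetical ground truth: 0 -> 1 -> 2
true_graph = [[0, 1, 0],
              [0, 0, 1],
              [0, 0, 0]]
# Hypothetical estimate: 0 -> 1 (correct), 2 -> 1 (reversed), 0 -> 2 (extra)
est_graph = [[0, 1, 1],
             [0, 0, 0],
             [0, 1, 0]]

scores = edge_scores(true_graph, est_graph)
distance = shd(true_graph, est_graph)
print(scores, distance)
```

When a dataset ships several ground truths, as the second dataset does, the same functions could be applied once per ground-truth graph and the scores reported side by side.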


Cited By

  • (2024) Causal Discovery from Temporal Data: An Overview and New Perspectives. ACM Computing Surveys 57:4 (1-38). DOI: 10.1145/3705297. Online publication date: 23-Nov-2024

Published In

2022 IEEE 18th International Conference on Automation Science and Engineering (CASE)
Aug 2022
1894 pages

Publisher

IEEE Press
