research-article

Designing for Real-Time Groupware Systems to Support Complex Scientific Data Analysis

Published: 13 June 2019

Abstract

Scientific Workflow Management Systems (SWfMSs) have become popular for accelerating the specification, execution, visualization, and monitoring of data-intensive scientific experiments. Unfortunately, to the best of our knowledge, no existing SWfMS directly supports collaboration. Data is increasing in complexity, dimensionality, and volume, and analyzing it efficiently often goes beyond the capacity of an individual and requires collaboration among multiple researchers from varying domains. In this paper, we propose a groupware system architecture for data analysis that, in addition to supporting collaboration, incorporates features from SWfMSs to support modern data analysis processes. As a proof of concept for the proposed architecture, we developed SciWorCS, a groupware system for scientific data analysis. We present two real-world use cases: collaborative software repository analysis and bioinformatics data analysis. The results of the experiments evaluating the proposed system are promising. Our bioinformatics user study demonstrates that SciWorCS can support real-world data analysis tasks by enabling real-time collaboration among users.

    Published In

    Proceedings of the ACM on Human-Computer Interaction, Volume 3, Issue EICS
    June 2019
    553 pages
    EISSN: 2573-0142
    DOI: 10.1145/3340630
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. analysis
    2. collaboration
    3. data
    4. scientific
    5. workflow

    Qualifiers

    • Research-article

    Cited By

    • (2024) Analyzing Collaborative Challenges and Needs of UX Practitioners when Designing with AI/ML. Proceedings of the ACM on Human-Computer Interaction 8:CSCW2 (1-25). DOI: 10.1145/3686986. Online publication date: 8-Nov-2024
    • (2023) Extensibility Challenges of Scientific Workflow Management Systems. Human Interface and the Management of Information (51-70). DOI: 10.1007/978-3-031-35129-7_4. Online publication date: 23-Jul-2023
    • (2022) Facilitating Asynchronous Collaboration in Scientific Workflow Composition Using Provenance. Proceedings of the ACM on Human-Computer Interaction 6:EICS (1-26). DOI: 10.1145/3534520. Online publication date: 17-Jun-2022
    • (2021) Designing for Recommending Intermediate States in A Scientific Workflow Management System. Proceedings of the ACM on Human-Computer Interaction 5:EICS (1-29). DOI: 10.1145/3457145. Online publication date: 29-May-2021
