Abstract
What is data science? The premise is that data itself contains science. This view differs markedly from that of classical mathematics, which fits mathematical models to the data. Today, the task is to find rules and properties within a data set, and even across different data sets. In this chapter, we explain data science and its relationship to BigData, cloud computing, and data mining. We also discuss current research problems in data science and raise concerns relating to a baseline for the data science industry.
Copyright information
© 2015 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Chen, L.M. (2015). Introduction: Data Science and BigData Computing. In: Mathematical Problems in Data Science. Springer, Cham. https://doi.org/10.1007/978-3-319-25127-1_1
DOI: https://doi.org/10.1007/978-3-319-25127-1_1
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-25125-7
Online ISBN: 978-3-319-25127-1
eBook Packages: Computer Science, Computer Science (R0)