Big Data Cogn. Comput., Volume 2, Issue 1 (March 2018) – 8 articles

Cover Story: In the Internet era of information overload, how does the individual filter and process available knowledge? This paper highlights the relationship between worldwide online interest in the term ‘Anti Vaccine’ and the decrease in Measles immunization percentages, indicating the role the Internet plays in spreading false information. This finding supports previous research suggesting that conspiracist ideation is related to the rejection of scientific propositions, which, in this case, could affect public health. Furthermore, significant correlations are observed between ‘Measles’ Google queries and reported cases in most EU countries. Therefore, monitoring variations in online behavior is required for nowcasting interest in Measles, so that health officials can deal with reported cases in a timely manner and take appropriate preventive measures.
  • Issues are regarded as officially published after their release is announced to the table of contents alert mailing list.
  • You may sign up for e-mail alerts to receive table of contents of newly released issues.
  • PDF is the official format for papers published in both HTML and PDF forms.
8 pages, 748 KiB  
Article
A Deep Learning Model of Perception in Color-Letter Synesthesia
by Joel R. Bock
Big Data Cogn. Comput. 2018, 2(1), 8; https://doi.org/10.3390/bdcc2010008 - 13 Mar 2018
Cited by 1 | Viewed by 7856
Abstract
Synesthesia is a psychological phenomenon where sensory signals become mixed. Input to one sensory modality produces an experience in a second, unstimulated modality. In “grapheme-color synesthesia”, viewed letters and numbers evoke mental imagery of colors. The study of this condition has implications for increasing our understanding of brain architecture and function, language, memory and semantics, and the nature of consciousness. In this work, we propose a novel application of deep learning to model perception in grapheme-color synesthesia. Achromatic letter images, taken from a database of handwritten characters, are used to train the model and to induce computational synesthesia. Results show the model learns to accurately create a colored version of the inducing stimulus, according to a statistical distribution from experiments on a sample population of grapheme-color synesthetes. To the author’s knowledge, this work represents the first model that accurately produces spontaneous, creative mental imagery characteristic of the synesthetic perceptual experience. Experiments in cognitive science have contributed to our understanding of some of the observable behavioral effects of synesthesia, and previous models have outlined neural mechanisms that may account for these observations. A model of synesthesia that generates testable predictions on brain activity and behavior is needed to complement large-scale data collection efforts in neuroscience, especially when articulating simple descriptions of cause (stimulus) and effect (behavior). The research and modeling approach reported here provides a framework that begins to address this need.
(This article belongs to the Special Issue Learning with Big Data: Scalable Algorithms and Novel Applications)
Figures
Figure 1. Synesthetic letter colorization by the evolving generative network: (a) generator input; and (b) generator output. Early training: 1st epoch; <3000 iterations. One letter per column. Iteration count progresses from top to bottom.
Figure 2. Synesthetic letter colorization by the trained generative network: (a) generator input; and (b) generator output. Late training: 3rd epoch, ~375,000 iterations. One letter per column. Iteration count progresses from top to bottom.
15 pages, 523 KiB  
Article
A Multi-Modality Deep Network for Cold-Start Recommendation
by Mingxuan Sun, Fei Li and Jian Zhang
Big Data Cogn. Comput. 2018, 2(1), 7; https://doi.org/10.3390/bdcc2010007 - 5 Mar 2018
Cited by 19 | Viewed by 6532
Abstract
Collaborative filtering (CF) approaches, which provide recommendations based on ratings or purchase history, perform well for users and items with sufficient interactions. However, CF approaches suffer from the cold-start problem for users and items with few ratings. Hybrid recommender systems that combine collaborative filtering and content-based approaches have proven effective in alleviating the cold-start issue. Integrating content from multiple heterogeneous data sources such as reviews and product images is challenging for two reasons. First, mapping content in different modalities from the original feature space to a joint lower-dimensional space is difficult since the modalities have intrinsically different characteristics and statistical properties, such as sparse texts and dense images. Second, most algorithms use content features only as prior knowledge to improve the estimation of user and item profiles, but the ratings do not directly provide feedback to guide feature extraction. To tackle these challenges, we propose a tightly-coupled deep network model for fusing heterogeneous modalities, to avoid tedious feature extraction in specific domains, and to enable two-way information propagation from both content and rating information. Experiments on large-scale Amazon product data in book and movie domains demonstrate the effectiveness of the proposed model for cold-start recommendation.
(This article belongs to the Special Issue Learning with Big Data: Scalable Algorithms and Novel Applications)
Figures
Figure 1. Rating prediction with deep fused embedding.
Figure 2. Rating prediction measured by mean squared error (MSE) and mean absolute error (MAE) with respect to latent dimension size on Movie (left) and Book (right).
Figure 3. Case study for a Movie user (top) and a Book user (bottom). Left: user’s top 3 favorite items. Right: top 3 items our model recommends.
19 pages, 797 KiB  
Article
A Rule Extraction Study from SVM on Sentiment Analysis
by Guido Bologna and Yoichi Hayashi
Big Data Cogn. Comput. 2018, 2(1), 6; https://doi.org/10.3390/bdcc2010006 - 2 Mar 2018
Cited by 17 | Viewed by 4762
Abstract
A natural way to determine the knowledge embedded within connectionist models is to generate symbolic rules. Nevertheless, extracting rules from Multi Layer Perceptrons (MLPs) is NP-hard. With the advent of social networks, techniques applied to Sentiment Analysis have attracted growing interest, but rule extraction from connectionist models in this context has rarely been performed because of the very high dimensionality of the input space. To fill the gap, we present a case study on rule extraction from ensembles of Neural Networks and Support Vector Machines (SVMs), the purpose being the characterization of the complexity of the rules on two particular Sentiment Analysis problems. Our rule extraction method is based on a special Multi Layer Perceptron architecture for which axis-parallel hyperplanes are precisely located. Two datasets representing movie reviews are transformed into Bag-of-Words vectors and learned by ensembles of neural networks and SVMs. Generated rules from ensembles of MLPs are less accurate and less complex than those extracted from SVMs. Moreover, a clear trade-off appears between rules’ accuracy, complexity and covering. For instance, if rules are too complex, less complex rules can be re-extracted by sacrificing some of their accuracy. Finally, rules can be viewed as feature detectors in which very often only one word must be present and a longer list of words must be absent.
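The closing observation of this abstract (a rule as a feature detector with one required word and several forbidden words) can be illustrated with a minimal sketch. The class `WordRule`, the function `ruleset_complexity`, and the example words are hypothetical and are not taken from the paper; the complexity measure follows the wording of the Figure 1 caption (number of rules times average antecedents per rule), applied here to a single rule set as an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WordRule:
    """Hypothetical rule over a bag of words: fires when every required
    word is present and every forbidden word is absent."""
    required: frozenset
    forbidden: frozenset
    label: str

    def matches(self, words: set) -> bool:
        # All required words present, and no forbidden word present.
        return self.required <= words and not (self.forbidden & words)

    @property
    def antecedents(self) -> int:
        # Each presence or absence test counts as one antecedent.
        return len(self.required) + len(self.forbidden)


def ruleset_complexity(rules) -> float:
    """Number of rules multiplied by the average antecedents per rule."""
    if not rules:
        return 0.0
    avg_antecedents = sum(r.antecedents for r in rules) / len(rules)
    return len(rules) * avg_antecedents


# Illustrative rule with made-up words: one word present, three absent.
rule = WordRule(
    required=frozenset({"wonderful"}),
    forbidden=frozenset({"boring", "waste", "dull"}),
    label="positive",
)
```

A review is passed in as a set of its words, for example `rule.matches({"a", "wonderful", "film"})`; a review that also contains a forbidden word does not trigger the rule.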
(This article belongs to the Special Issue Big Data Analytic: From Accuracy to Interpretability)
Figures
Figure 1. Plot of average complexity of rules versus average fidelity (RT-2k problem). Average complexity is the product of average number of rules by average number of antecedents per rule.
Figure 2. Plot of average complexity of rules versus average fidelity (RT-s problem).
Figure 3. A DIMLP network creating two discriminative hyperplanes. The activation function of neurons h₁ and h₂ is a step function, while for output neuron y₁ it is a sigmoid.
Figure 4. Transparency of DIMLP ensembles by majority voting, linear combinations and non-linear combinations.
Figure 5. A QSVM network with Gaussian kernel.
15 pages, 595 KiB  
Article
A Machine Learning Approach for Air Quality Prediction: Model Regularization and Optimization
by Dixian Zhu, Changjie Cai, Tianbao Yang and Xun Zhou
Big Data Cogn. Comput. 2018, 2(1), 5; https://doi.org/10.3390/bdcc2010005 - 24 Feb 2018
Cited by 129 | Viewed by 16275
Abstract
In this paper, we tackle air quality forecasting by using machine learning approaches to predict the hourly concentration of air pollutants (e.g., ozone, particulate matter (PM2.5) and sulfur dioxide). Machine learning, as one of the most popular techniques, is able to efficiently train a model on big data by using large-scale optimization algorithms. Although some works have applied machine learning to air quality prediction, most prior studies are restricted to several years of data and simply train standard regression models (linear or nonlinear) to predict the hourly air pollution concentration. In this work, we propose refined models to predict the hourly air pollution concentration on the basis of meteorological data of previous days by formulating the prediction over 24 h as a multi-task learning (MTL) problem. This enables us to select a good model with different regularization techniques. We propose a useful regularization by enforcing the prediction models of consecutive hours to be close to each other and compare it with several typical regularizations for MTL, including standard Frobenius norm regularization, nuclear norm regularization, and ℓ2,1-norm regularization. Our experiments have shown that the proposed parameter-reducing formulations and consecutive-hour-related regularizations achieve better performance than existing standard regression models and existing regularizations.
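For readers unfamiliar with the regularizers named in this abstract, the following block writes them out under stated assumptions. The notation is not taken from the paper: it assumes one linear model w_t with d features per hour, stacked as columns of W. The first three lines are the standard definitions of the Frobenius, nuclear, and ℓ2,1 norms; the last line is only one plausible form of a consecutive-hour penalty (a squared ℓ2 difference), and the paper may use a different norm.

```latex
% Assumed notation: W = [w_1, \dots, w_{24}] \in \mathbb{R}^{d \times 24},
% one weight vector w_t per predicted hour t.
\begin{align*}
  \|W\|_F^2    &= \sum_{t=1}^{24} \|w_t\|_2^2
                && \text{(Frobenius norm, squared)} \\
  \|W\|_*      &= \sum_{i} \sigma_i(W)
                && \text{(nuclear norm: sum of singular values)} \\
  \|W\|_{2,1}  &= \sum_{j=1}^{d} \|W_{j,:}\|_2
                && \text{($\ell_{2,1}$-norm: sum of row norms)} \\
  \Omega(W)    &= \sum_{t=1}^{23} \|w_{t+1} - w_t\|_2^2
                && \text{(assumed consecutive-hour penalty)}
\end{align*}
```

Under this reading, the last term is small when models for neighboring hours have similar weights, which matches the abstract's description of keeping consecutive-hour models close to each other.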
(This article belongs to the Special Issue Learning with Big Data: Scalable Algorithms and Novel Applications)
Figures
Figure 1. Locations of measurement sites. Blue stars denote the two air quality monitoring sites. Red circles denote the two meteorological sites.
Figure 2. Improvement of different methods over the baseline method for Lewis University–Lemont Village (LU–LV) dataset.
Figure 3. Improvement of different methods over the baseline method for Lansing Municipal Airport–Alsip Village (LMA–AV) dataset.
Figure 4. Optimization techniques.
13 pages, 854 KiB  
Article
Reimaging Research Methodology as Data Science
by Ben Kei Daniel
Big Data Cogn. Comput. 2018, 2(1), 4; https://doi.org/10.3390/bdcc2010004 - 12 Feb 2018
Cited by 14 | Viewed by 8986
Abstract
The growing volume of data generated by machines, humans, software applications, sensors and networks, together with the associated complexity of the research environment, requires immediate pedagogical innovations in academic programs on research methodology. This article draws insights from a large-scale research project examining current conceptions and practices of academics (n = 144) involved in the teaching of research methods in research-intensive universities in 17 countries. The data was obtained through an online questionnaire. The main findings reveal that a large number of academics involved in the teaching of research methods courses tend to teach the same classes for many years, in the same way, despite the changing nature of data and the complexity of the environment in which research is conducted. Furthermore, those involved in the teaching of research methods courses are predominantly volunteer academics, who tend to view the subject only as an “add-on” to their other teaching duties. It was also noted that universities mainly approach the teaching of research methods courses as a “service” to students and departments, not as part of the core curriculum. To deal with the growing changes in data structures and the technology-driven research environment, the study recommends that institutions reimage research methodology programs to enable students to develop appropriate competences to deal with the challenges of working with complex and large amounts of data and associated analytics.
(This article belongs to the Special Issue Big Data Analytic: From Accuracy to Interpretability)
Figures
Figure 1. Countries where participants were employed at the time of the research.
Figure 2. Students’ disciplines and subjects.
Figure 3. Proposed Data Science and Research Methodology Curriculum.
15 pages, 1111 KiB  
Article
Big Data Processing and Analytics Platform Architecture for Process Industry Factories
by Martin Sarnovsky, Peter Bednar and Miroslav Smatana
Big Data Cogn. Comput. 2018, 2(1), 3; https://doi.org/10.3390/bdcc2010003 - 26 Jan 2018
Cited by 26 | Viewed by 9368
Abstract
This paper describes the architecture of a cross-sectorial Big Data platform for the process industry domain. The main objective was to design a scalable analytical platform that will support the collection, storage and processing of data from multiple industry domains. Such a platform should be able to connect to the existing environment in the plant and use the data gathered to build predictive functions to optimize the production processes. The analytical platform will contain a development environment with which to build these functions, and a simulation environment to evaluate the models. The platform will be shared among multiple sites from different industry sectors. Cross-sectorial sharing will enable the transfer of knowledge across different domains. During the development, we adopted a user-centered approach to gather requirements from different stakeholders, which were used to design architectural models from different viewpoints, from contextual to deployment. The deployed architecture was tested in two process industry domains, one from aluminium production and the other from the plastic molding industry.
(This article belongs to the Special Issue Big Data Analytic: From Accuracy to Interpretability)
Figures
Figure 1. High level architecture overview. Abbreviations: ERP, Enterprise resource planning; SCADA, Supervisory control and data acquisition; MES, Manufacturing execution system; PLC, Programmable Logic Controller.
Figure 2. The main concepts of the Semantic Modelling Framework. Abbreviation: KPI, Key-Performance Indicator.
Figure 3. Plant Operational Platform architecture. Abbreviations: IoT, Internet of Things; RDBMS, relational database management system; NoSQL, Non Structured Query Language; API, application programming interface.
Figure 4. The architecture of the Cross-Sectorial Data Lab platform.
Figure 5. The internal architecture of the Big Data Storage and Analytics Platform.
Figure 6. Deployment view with the main types of nodes.
18 pages, 9322 KiB  
Article
The Internet and the Anti-Vaccine Movement: Tracking the 2017 EU Measles Outbreak
by Amaryllis Mavragani and Gabriela Ochoa
Big Data Cogn. Comput. 2018, 2(1), 2; https://doi.org/10.3390/bdcc2010002 - 16 Jan 2018
Cited by 35 | Viewed by 15956
Abstract
In the Internet Era of information overload, how does the individual filter and process available knowledge? In addressing this question, this paper examines the behavioral changes in the online interest in terms related to Measles and the Anti-Vaccine Movement from 2004 to 2017, in order to identify any relationships between the decrease in immunization percentages, the Anti-Vaccine Movement, and the increase in reported Measles cases. The results show that statistically significant positive correlations exist between monthly Measles cases and Google queries in the respective translated terms in most EU28 countries from January 2011 to August 2017. Furthermore, a strong negative correlation (p < 0.01) exists between the online interest in the term ‘Anti Vaccine’ and the Worldwide immunization percentages from 2004 to 2016. The latter could be supportive of previous work suggesting that conspiracist ideation is related to the rejection of scientific propositions. As Measles requires the highest immunization percentage of the vaccine-preventable diseases, the 2017 EU outbreak could be the first of several outbreaks or epidemics of other diseases in the near future should immunization percentages continue to decrease. Big Data Analytics in general, and the analysis of Google queries in particular, have been shown to be valuable in addressing health-related topics up to this point. Therefore, analyzing the variations and patterns of available online information could assist health officials with the assessment of reported cases, as well as with taking the required preventive actions.
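As a rough illustration of the kind of correlation analysis this abstract describes, the sketch below computes a Pearson correlation coefficient between two monthly series. The abstract does not state which correlation measure the authors used, so Pearson's r is an assumption here, and the function name `pearson_r` and all numbers are made up for illustration only; they are not data from the paper.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and the two deviation norms.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (norm_x * norm_y)


# Made-up illustrative values, NOT data from the paper:
monthly_cases = [12, 30, 55, 41, 20, 9]     # reported cases per month
query_interest = [18, 35, 70, 52, 25, 14]   # normalized search interest

r = pearson_r(monthly_cases, query_interest)
print(f"Pearson r = {r:.3f}")
```

A value of r close to +1 would correspond to the positive relationship between cases and queries reported in the abstract, and a value close to -1 to the negative relationship reported for ‘Anti Vaccine’ interest and immunization percentages. Assessing significance (such as the reported p < 0.01) requires a separate test that this sketch does not perform.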
(This article belongs to the Special Issue Health Assessment in the Big Data Era)
Figures
Figure 1. Worldwide Interest in ‘Measles’, ‘Mumps’, ‘Rubella’, and ‘MMR’ from 2004 to 2017.
Figure 2. Worldwide Interest by Country in Measles from 2004 to 2017 (gray indicates zero scoring).
Figure 3. Worldwide Interest by Country in Mumps from 2004 to 2017 (gray indicates zero scoring).
Figure 4. Worldwide Interest by Country in Rubella from 2004 to 2017 (gray indicates zero scoring).
Figure 5. Worldwide Interest by Country in MMR from 2004 to 2017 (gray indicates zero scoring).
Figure 6. Worldwide Online Interest in the term ‘Anti Vaccine’ from January 2004 to August 2017.
Figure 7. EU28 Online Interest in the English (blue) and Translated (red) Terms for ‘Measles’ from January 2004 to August 2017.
Figure 8. EU28 Population Coverage (%) of the 1st and 2nd Dose of the Vaccine for Measles from 1980 to 2016.
Figure 9. EU28 Population Coverage (%) for the 1st Dose in 2016.
Figure 10. EU28 Population Coverage (%) for the 2nd Dose in 2016.
1 pages, 139 KiB  
Editorial
Acknowledgement to Reviewers of BDCC in 2017
by BDCC Editorial Office
Big Data Cogn. Comput. 2018, 2(1), 1; https://doi.org/10.3390/bdcc2010001 - 12 Jan 2018
Viewed by 2765
Abstract
Peer review is an essential part of the publication process, ensuring that BDCC maintains high quality standards for its published papers [...]