research-article
Discord-based counterfactual explanations for time series classification
Abstract

The opacity inherent in machine learning models presents a significant hindrance to their widespread incorporation into decision-making processes. To address this challenge and foster trust among stakeholders while ensuring decision fairness, the ...

research-article
Robust explainer recommendation for time series classification
Abstract

Time series classification is a task which deals with temporal sequences, a data type prevalent in domains such as human activity recognition, sports analytics and general sensing. In this area, interest in explainability has been growing as ...

research-article
GeoRF: a geospatial random forest
Abstract

The geospatial domain increasingly relies on data-driven methodologies to extract actionable insights from the growing volume of available data. Despite the effectiveness of tree-based models in capturing complex relationships between features and ...

research-article
Modelling event sequence data by type-wise neural point process
Abstract

Event sequence data is common in real life, where each event is typically represented as a tuple of event type and occurrence time. Recently, neural point process (NPP), a probabilistic model that learns the next event distribution with events ...

research-article
Randomnet: clustering time series using untrained deep neural networks
Abstract

Neural networks are widely used in machine learning and data mining. Typically, these networks need to be trained, implying the adjustment of weights (parameters) within the network based on the input data. In this work, we propose a novel ...

research-article
Towards effective urban region-of-interest demand modeling via graph representation learning
Abstract

Identifying a region’s functionalities and its specific Point-of-Interest (POI) needs is essential for effective urban planning. However, due to the diverse and ambiguous nature of urban regions, there are still some significant ...

research-article
Knowledge graph embedding closed under composition
Abstract

Knowledge Graph Embedding (KGE) has attracted increasing attention. Relation patterns, such as symmetry and inversion, have received considerable focus. Among them, composition patterns are particularly important, as they involve nearly all ...

research-article
On regime changes in text data using hidden Markov model of contaminated vMF distribution
Abstract

This paper presents a novel methodology for analyzing temporal directional data with scatter and heavy tails. A hidden Markov model with contaminated von Mises-Fisher emission distribution is developed. The model is implemented using forward and ...

research-article
Negative-sample-free knowledge graph embedding
Abstract

Recently, knowledge graphs (KGs) have been shown to benefit many machine learning applications in multiple domains (e.g. self-driving, agriculture, bio-medicine, recommender systems, etc.). However, KGs suffer from incompleteness, which motivates ...

research-article
Explainable decomposition of nested dense subgraphs
Abstract

Discovering dense regions in a graph is a popular tool for analyzing graphs. While useful, analyzing such decompositions may be difficult without additional information. Fortunately, many real-world networks have additional information, namely ...

research-article
Bayesian network Motifs for reasoning over heterogeneous unlinked datasets
Abstract

Modern data-oriented applications often require integrating data from multiple heterogeneous sources. When these datasets share attributes, but are otherwise unlinked, there is no way to join them and reason at the individual level explicitly. ...

research-article
Gradient-based explanation for non-linear non-parametric dimensionality reduction
Abstract

Dimensionality reduction (DR) is a popular technique that shows great results in analyzing high-dimensional data. Generally, DR is used to produce visualizations in 2 or 3 dimensions. While it can help in understanding correlations between data, ...

research-article
Evaluating outlier probabilities: assessing sharpness, refinement, and calibration using stratified and weighted measures
Abstract

An outlier probability is the probability that an observation is an outlier. Typically, outlier detection algorithms calculate real-valued outlier scores to identify outliers. Converting outlier scores into outlier probabilities increases the ...

research-article
Sequential query prediction based on multi-armed bandits with ensemble of transformer experts and immediate feedback
Abstract

We study the problem of predicting the next query to be recommended in interactive data exploratory analysis to guide users to correct content. Current query prediction approaches are based on sequence-to-sequence learning, exploiting past ...

research-article
De-confounding representation learning for counterfactual inference on continuous treatment via generative adversarial network
Abstract

Counterfactual inference for continuous rather than binary treatment variables is more common in real-world causal inference tasks. While there are already some sample reweighting methods based on Marginal Structural Model for eliminating the ...

research-article
Enhancing racism classification: an automatic multilingual data annotation system using self-training and CNN
Abstract

Accurate racism classification is crucial on social media, where racist and discriminatory content can harm individuals and society. Automated racism detection requires gathering and annotating a wide range of diverse and representative data as an ...

research-article
Statistical methods utilizing structural properties of time-evolving networks for event detection
Abstract

With the advancement of technology, real-world networks have become vulnerable to many attacks such as cyber-crimes, terrorist attacks, and financial frauds. Accuracy and scalability are the two principal but contrary characteristics for ...

research-article
ArcMatch: high-performance subgraph matching for labeled graphs by exploiting edge domains
Abstract

Consider a large labeled graph (network), denoted the target. Subgraph matching is the problem of finding all instances of a small subgraph, denoted the query, in the target graph. Unlike the majority of existing methods that are restricted to ...

research-article
Detach-ROCKET: sequential feature selection for time series classification with random convolutional kernels
Abstract

Time Series Classification (TSC) is essential in fields like medicine, environmental science, and finance, enabling tasks such as disease diagnosis, anomaly detection, and stock price analysis. While machine learning models like Recurrent Neural ...

research-article
Efficient learning with projected histograms
Abstract

High dimensional learning is a perennial problem due to challenges posed by the “curse of dimensionality”; learning typically demands more computing resources as well as more training data. In differentially private (DP) settings, this is further ...

research-article
Opinion dynamics in social networks incorporating higher-order interactions
Abstract

The issue of opinion sharing and formation has received considerable attention in the academic literature, and a few models have been proposed to study this problem. However, existing models are limited to the interactions among nearest neighbors, ...

research-article
Random walks with variable restarts for negative-example-informed label propagation
Abstract

Label propagation is frequently encountered in machine learning and data mining applications on graphs, either as a standalone problem or as part of node classification. Many label propagation algorithms utilize random walks (or network ...

research-article
Evaluating the disclosure risk of anonymized documents via a machine learning-based re-identification attack
Abstract

The availability of textual data depicting human-centered features and behaviors is crucial for many data mining and machine learning tasks. However, data containing personal information should be anonymized prior to making them available for ...

research-article
Regularization-based methods for ordinal quantification
Abstract

Quantification, i.e., the task of predicting the class prevalence values in bags of unlabeled data items, has received increased attention in recent years. However, most quantification research has concentrated on developing algorithms for binary ...

research-article
FRUITS: feature extraction using iterated sums for time series classification
Abstract

We introduce a pipeline for time series classification that extracts features based on the iterated-sums signature (ISS) and then applies a linear classifier. These features are intrinsically nonlinear, capture chronological information, and, ...

research-article
Bounding the family-wise error rate in local causal discovery using Rademacher averages
Abstract

Many algorithms have been proposed to learn local graphical structures around target variables of interest from observational data, focusing on two sets of variables. The first one, called Parent–Children (PC) set, contains all the variables that ...

research-article
Model-agnostic variable importance for predictive uncertainty: an entropy-based approach
Abstract

In order to trust the predictions of a machine learning algorithm, it is necessary to understand the factors that contribute to those predictions. In the case of probabilistic and uncertainty-aware models, it is necessary to understand not only ...

research-article
FRAPPE: fast rank approximation with explainable features for tensors
Abstract

Tensor decompositions have proven to be effective in analyzing the structure of multidimensional data. However, most of these methods require a key parameter: the number of desired components. In the case of the CANDECOMP/PARAFAC decomposition (...
