DOI: 10.5555/3327757
NIPS'18: Proceedings of the 32nd International Conference on Neural Information Processing Systems
2018 Proceeding
Publisher:
  • Curran Associates Inc., 57 Morehouse Lane, Red Hook, NY, United States
Conference:
Montréal, Canada, December 3–8, 2018
Published:
03 December 2018

Abstract

No abstract available.

Article
Free
Compact generalized non-local network
Pages 6511–6520

The non-local module [27] is designed to capture long-range spatio-temporal dependencies in images and videos. Although it has shown excellent performance, it lacks a mechanism to model interactions between positions across channels, which are ...
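
For readers unfamiliar with the non-local operation this paper generalizes, here is a minimal sketch of the original position-wise version, with identity embeddings standing in for the learned θ/φ/g projections; the cross-channel interactions this paper adds are not modeled here.

```python
import numpy as np

def nonlocal_block(x):
    """Minimal position-wise non-local operation on a flattened feature map.

    x: array of shape (N, C) -- N spatial positions, C channels.
    Identity embeddings replace the learned theta/phi/g projections.
    """
    # Pairwise affinities between all positions (N x N).
    affinity = x @ x.T
    # Softmax-normalize each row so responses are weighted averages.
    affinity -= affinity.max(axis=1, keepdims=True)
    weights = np.exp(affinity)
    weights /= weights.sum(axis=1, keepdims=True)
    # Each output position aggregates features from every position,
    # plus a residual connection as in the original module.
    return x + weights @ x

# Example: 16 positions with 8 channels.
y = nonlocal_block(np.random.randn(16, 8))
```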

Article
Free
On the Local Hessian in back-propagation
Pages 6521–6531

Back-propagation (BP) is the foundation for successfully training deep neural networks. However, BP sometimes has difficulty propagating a learning signal deep into the network effectively, e.g., the vanishing-gradient phenomenon. Meanwhile, BP often works ...

Article
Free
The everlasting database: statistical validity at a fair price
Pages 6532–6541

The problem of handling adaptivity in data analysis, intentional or not, permeates a variety of fields, including test-set overfitting in ML challenges and the accumulation of invalid scientific discoveries. We propose a mechanism for answering an ...

Article
Free
Lipschitz-margin training: scalable certification of perturbation invariance for deep neural networks
Pages 6542–6551

High sensitivity of neural networks against malicious perturbations on inputs causes security concerns. To take a steady step towards robust classifiers, we aim to create neural network models provably defended from perturbations. Prior certification ...
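
A sketch of the kind of certificate such training targets: if the logit map has a global Lipschitz bound L, the margin between the top two logits certifies a perturbation radius. The √2 factor and the spectral-norm product below are standard choices, assumed here rather than taken from the paper.

```python
import numpy as np

def certified_radius(logits, lipschitz_bound):
    """Radius within which the top-class prediction provably cannot change,
    given a global Lipschitz bound L on the logit map (a hedged sketch).

    A perturbation of size eps moves each logit difference by at most
    sqrt(2) * L * eps, so the margin certifies eps < margin / (sqrt(2) * L).
    """
    top2 = np.sort(logits)[-2:]          # two largest logits, ascending
    margin = top2[1] - top2[0]
    return margin / (np.sqrt(2.0) * lipschitz_bound)

# A crude Lipschitz upper bound for a ReLU net: product of layer spectral norms.
weights = [np.random.randn(64, 32), np.random.randn(32, 10)]
L = np.prod([np.linalg.svd(W, compute_uv=False)[0] for W in weights])
print(certified_radius(np.random.randn(10), L))
```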

Article
Free
Proximal SCOPE for distributed sparse learning
Pages 6552–6561

Distributed sparse learning with a cluster of multiple machines has attracted much attention in machine learning, especially for large-scale applications with high-dimensional data. One popular way to implement sparse learning is to use L1 ...
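
The L1 regularizer mentioned above is typically handled through its proximal operator, soft thresholding; a minimal sketch of that building block (not the paper's distributed SCOPE algorithm) follows.

```python
import numpy as np

def soft_threshold(v, lam):
    """Proximal operator of lam * ||.||_1: the standard building block of
    proximal methods for L1-regularized (sparse) learning."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

# One generic proximal-gradient step on f(w) + lam * ||w||_1:
#   w <- prox_{lr*lam}(w - lr * grad_f(w))
w = np.array([0.8, -0.05, 0.3])
grad = np.array([0.1, 0.0, -0.2])
lr, lam = 0.5, 0.1
w = soft_threshold(w - lr * grad, lr * lam)
```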

Article
Free
On coresets for logistic regression
Pages 6562–6571

Coresets are one of the central methods to facilitate the analysis of large data. We continue a recent line of research applying the theory of coresets to logistic regression. First, we show the negative result that no strongly sublinear sized coresets ...
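
As background, a generic sensitivity-style coreset construction looks like the following; the sampling scores here are illustrative placeholders, not the paper's bounds for logistic regression.

```python
import numpy as np

def sample_coreset(X, scores, m, rng=np.random.default_rng(0)):
    """Generic sensitivity-style importance sampling: draw m points with
    probability proportional to `scores` and reweight by 1/(m * p_i) so
    weighted losses stay unbiased. The paper's actual sensitivity bounds
    for logistic regression are more refined than any generic score."""
    p = scores / scores.sum()
    idx = rng.choice(len(X), size=m, replace=True, p=p)
    weights = 1.0 / (m * p[idx])
    return X[idx], weights

X = np.random.randn(10000, 5)
# A naive score: uniform mass plus point norm (an illustrative choice only).
scores = 1.0 + np.linalg.norm(X, axis=1)
coreset, w = sample_coreset(X, scores, m=200)
```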

Article
Free
Neural ordinary differential equations
Pages 6572–6583

We introduce a new family of deep neural network models. Instead of specifying a discrete sequence of hidden layers, we parameterize the derivative of the hidden state using a neural network. The output of the network is computed using a black-box ...
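
A minimal sketch of the idea, with a fixed-step Euler integrator standing in for the black-box adaptive solver: the network f defines the hidden state's derivative, and the output is the integrated state.

```python
import numpy as np

def f(h, t, W):
    """Parameterized derivative of the hidden state: dh/dt = f(h, t; W)."""
    return np.tanh(h @ W)

def odeint_euler(h0, W, t0=0.0, t1=1.0, steps=100):
    """Fixed-step Euler integration standing in for the black-box adaptive
    solver used in the paper; returns the state at time t1."""
    h, dt = h0, (t1 - t0) / steps
    for i in range(steps):
        h = h + dt * f(h, t0 + i * dt, W)
    return h

h0 = np.random.randn(4)
W = 0.1 * np.random.randn(4, 4)
print(odeint_euler(h0, W))
```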

Article
Free
Unsupervised learning of artistic styles with archetypal style analysis
Pages 6584–6593

In this paper, we introduce an unsupervised learning approach to automatically discover, summarize, and manipulate artistic styles from large collections of paintings. Our method is based on archetypal analysis, which is an unsupervised learning ...

Article
Free
Approximating real-time recurrent learning with random Kronecker factors
Pages 6594–6603

Despite all the impressive advances of recurrent neural networks, sequential data is still in need of better modelling. Truncated backpropagation through time (TBPTT), the learning algorithm most widely used in practice, suffers from the truncation bias,...

Article
Free
Contamination attacks and mitigation in multi-party machine learning
Pages 6604–6616

Machine learning is data hungry; the more data a model has access to in training, the more likely it is to perform well at inference time. Distinct parties may want to combine their local data to gain the benefits of a model trained on a large corpus of ...

Article
Free
An improved analysis of alternating minimization for structured multi-response regression
Pages 6617–6628

Multi-response linear models aggregate a set of vanilla linear models by assuming correlated noise across them, which has an unknown covariance structure. To find the coefficient vector, estimators with a joint approximation of the noise covariance are ...

Article
Free
Incorporating context into language encoding models for fMRI
Pages 6629–6638

Language encoding models help explain language processing in the human brain by learning functions that predict brain responses from the language stimuli that elicited them. Current word embedding-based approaches treat each stimulus word independently ...
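
Encoding models of this kind are commonly fit as regularized linear maps from stimulus features to voxel responses; a hedged sketch with synthetic data follows (the feature and response arrays are placeholders, not the paper's contextual representations).

```python
import numpy as np
from sklearn.linear_model import Ridge

# Hypothetical data: stimulus features (e.g., word embeddings, one row per
# time point) and the measured voxel responses they elicited.
n_timepoints, n_features, n_voxels = 500, 64, 1000
features = np.random.randn(n_timepoints, n_features)
responses = np.random.randn(n_timepoints, n_voxels)

# A standard encoding model: ridge regression from features to responses.
model = Ridge(alpha=10.0).fit(features, responses)
predicted = model.predict(features)   # evaluate on held-out data in practice
```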

Article
Free
CatBoost: unbiased boosting with categorical features
Pages 6639–6649

This paper presents the key algorithmic techniques behind CatBoost, a new gradient boosting toolkit. Their combination leads to CatBoost outperforming other publicly available boosting implementations in terms of quality on a variety of datasets. Two ...
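
A minimal usage sketch of the open-source catboost package; the hyperparameters below are illustrative, not the paper's settings.

```python
# pip install catboost
from catboost import CatBoostClassifier

X = [["red", 1.0], ["blue", 2.0], ["red", 0.5], ["green", 3.0]]
y = [1, 0, 1, 0]

model = CatBoostClassifier(iterations=50, verbose=False)
# Column 0 is categorical; CatBoost encodes it internally, using the
# ordered statistics the paper describes to avoid target leakage.
model.fit(X, y, cat_features=[0])
print(model.predict([["blue", 1.5]]))
```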

Article
Free
Query K-means clustering and the double Dixie cup problem
Pages 6650–6659

We consider the problem of approximate K-means clustering with outliers and side information provided by same-cluster queries and possibly noisy answers. Our solution shows that, under some mild assumptions on the smallest cluster size, one can obtain ...

Article
Free
Training neural networks using features replay
Pages 6660–6669

Training a neural network with the backpropagation algorithm requires passing error gradients sequentially through the network. This backward locking prevents us from updating network layers in parallel and fully leveraging the computing resources. Recently,...

Article
Free
Modeling dynamic missingness of implicit feedback for recommendation
Pages 6670–6679

Implicit feedback is widely used in collaborative filtering methods for recommendation. It is well known that implicit feedback contains a large number of values that are missing not at random (MNAR); and the missing data is a mixture of negative and ...

Article
Free
Representation learning of compositional data
Pages 6680–6690

We consider the problem of learning a low dimensional representation for compositional data. Compositional data consists of a collection of nonnegative data that sum to a constant value. Since the parts of the collection are statistically dependent, ...
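
For context, the classical embedding for compositional data is the centered log-ratio transform; a short sketch follows. This is standard background, not necessarily the representation the paper learns.

```python
import numpy as np

def clr(x, eps=1e-9):
    """Centered log-ratio transform, the classical embedding for
    compositional data (nonnegative parts summing to a constant)."""
    logx = np.log(x + eps)              # eps guards against zero parts
    return logx - logx.mean(axis=-1, keepdims=True)

composition = np.array([0.2, 0.3, 0.5])   # parts sum to 1
print(clr(composition))
```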

Article
Free
Model-based targeted dimensionality reduction for neuronal population data
Pages 6691–6700

Summarizing high-dimensional data using a small number of parameters is a ubiquitous first step in the analysis of neuronal population activity. Recently developed methods use "targeted" approaches that work by identifying multiple, distinct low-...

Article
Free
On gradient regularizers for MMD GANs
Pages 6701–6711

We propose a principled method for gradient-based regularization of the critic of GAN-like models trained by adversarially optimizing the kernel of a Maximum Mean Discrepancy (MMD). We show that controlling the gradient of the critic is vital to having ...
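
The quantity at the heart of such models is the kernel MMD; a minimal sketch of the standard biased estimator with an RBF kernel follows (the kernel choice and bandwidth are assumptions, not the paper's adversarially learned kernel).

```python
import numpy as np

def rbf_kernel(a, b, sigma=1.0):
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def mmd2_biased(x, y, sigma=1.0):
    """Biased estimator of squared Maximum Mean Discrepancy between samples
    x ~ P and y ~ Q: E k(x,x') + E k(y,y') - 2 E k(x,y)."""
    return (rbf_kernel(x, x, sigma).mean()
            + rbf_kernel(y, y, sigma).mean()
            - 2 * rbf_kernel(x, y, sigma).mean())

x = np.random.randn(100, 2)
y = np.random.randn(100, 2) + 1.0
print(mmd2_biased(x, y))
```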

Article
Free
Heterogeneous multi-output Gaussian process prediction
Pages 6712–6721

We present a novel extension of multi-output Gaussian processes for handling heterogeneous outputs. We assume that each output has its own likelihood function and use a vector-valued Gaussian process prior to jointly model the parameters in all ...

Article
Free
Large-scale stochastic sampling from the probability simplex
Pages 6722–6732

Stochastic gradient Markov chain Monte Carlo (SGMCMC) has become a popular method for scalable Bayesian inference. These methods are based on sampling a discrete-time approximation to a continuous time process, such as the Langevin diffusion. When ...
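
For reference, the generic stochastic gradient Langevin dynamics update that such discretizations start from looks like this; the paper's simplex-specific sampler addresses boundary issues that this naive update does not.

```python
import numpy as np

def sgld_step(theta, stochastic_grad_log_post, step_size, rng):
    """One stochastic gradient Langevin dynamics step: the generic
    discrete-time approximation of the Langevin diffusion. This naive
    update is not tailored to simplex-constrained parameters."""
    noise = rng.normal(scale=np.sqrt(step_size), size=theta.shape)
    return theta + 0.5 * step_size * stochastic_grad_log_post(theta) + noise

rng = np.random.default_rng(0)
theta = np.zeros(3)
grad = lambda t: -t            # gradient of a standard normal log-density
for _ in range(1000):
    theta = sgld_step(theta, grad, 1e-2, rng)
```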

Article
Free
Policy regret in repeated games
Pages 6733–6742

The notion of policy regret in online learning is a well defined performance measure for the common scenario of adaptive adversaries, which more traditional quantities such as external regret do not take into account. We revisit the notion of policy ...
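
For readers new to the distinction, the two notions can be stated compactly, following the standard definitions in this literature (notation assumed): against an adaptive adversary the loss at time t depends on the learner's past plays, so the comparator must be charged for the counterfactual sequence it would have induced.

```latex
\text{External regret: } R_{\mathrm{ext}}
  = \sum_{t=1}^{T} \ell_t(a_t) - \min_{a} \sum_{t=1}^{T} \ell_t(a)
\qquad
\text{Policy regret: } R_{\mathrm{pol}}
  = \sum_{t=1}^{T} \ell_t(a_1,\dots,a_t)
  - \min_{a} \sum_{t=1}^{T} \ell_t(\underbrace{a,\dots,a}_{t})
```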

Article
Free
A theory-based evaluation of nearest neighbor models put into practice
Pages 6743–6754

In the k-nearest neighborhood model (k-NN), we are given a set of points P, and we shall answer queries q by returning the k nearest neighbors of q in P according to some metric. This concept is crucial in many areas of data analysis and data processing,...
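
The model being analyzed is simple to state in code; a brute-force sketch under the Euclidean metric follows (real systems use index structures, which is part of what the paper's theory-versus-practice evaluation examines).

```python
import numpy as np

def knn_query(P, q, k):
    """Brute-force k-NN: return the k points of P closest to query q
    under the Euclidean metric."""
    dists = np.linalg.norm(P - q, axis=1)
    return P[np.argsort(dists)[:k]]

P = np.random.randn(1000, 3)
print(knn_query(P, np.zeros(3), k=5))
```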

Article
Free
Banach Wasserstein GAN
Pages 6755–6764

Wasserstein Generative Adversarial Networks (WGANs) can be used to generate realistic samples from complicated image distributions. The Wasserstein metric used in WGANs is based on a notion of distance between individual images, which induces a notion ...

Article
Free
Provable Gaussian embedding with one observation
Pages 6765–6775

The success of machine learning methods heavily relies on having an appropriate representation for data at hand. Traditionally, machine learning approaches relied on user-defined heuristics to extract features encoding structural information about data. ...

Article
Free
BRITS: bidirectional recurrent imputation for time series
Pages 6776–6786

Time series are ubiquitous in many classification/regression applications. However, the time series data in real applications may contain many missing values. Hence, given multiple (possibly correlated) time series data, it is important to fill in ...
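
A hedged sketch of the core mechanic shared by recurrent imputation models in this family (not BRITS itself, which is bidirectional and trained end-to-end): where a value is missing, feed the model its own one-step prediction.

```python
import numpy as np

def recurrent_impute(x, mask, Wh, Wx, Wo):
    """Step through the series; wherever a value is missing (mask == 0),
    substitute the model's own one-step prediction. Weights here are
    random, so this illustrates the mechanics, not trained behavior."""
    h = np.zeros(Wh.shape[0])
    filled = x.copy()
    for t in range(len(x)):
        pred = h @ Wo                         # one-step-ahead estimate
        if mask[t] == 0:                      # missing: use the prediction
            filled[t] = pred
        h = np.tanh(Wh @ h + Wx * filled[t])  # update state on filled value
    return filled

rng = np.random.default_rng(0)
x = np.sin(np.linspace(0, 6, 30))
mask = rng.random(30) > 0.3                   # ~70% of values observed
x[mask == 0] = np.nan
Wh, Wx, Wo = 0.5 * rng.standard_normal((8, 8)), rng.standard_normal(8), rng.standard_normal(8)
print(recurrent_impute(np.nan_to_num(x), mask, Wh, Wx, Wo))
```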

Article
Free
M-Walk: learning to walk over graphs using Monte Carlo tree search
Pages 6787–6798

Learning to walk over a graph towards a target node for a given query and a source node is an important problem in applications such as knowledge base completion (KBC). It can be formulated as a reinforcement learning (RL) problem with a known state ...

Article
Free
Extracting relationships by multi-domain matching
Pages 6799–6810

In many biological and medical contexts, we construct a large labeled corpus by aggregating many sources to use in target prediction tasks. Unfortunately, many of the sources may be irrelevant to our target task, so ignoring the structure of the dataset ...

Article
Free
Efficient gradient computation for structured output learning with rational and tropical losses
Pages 6811–6822

Many structured prediction problems admit a natural loss function for evaluation such as the edit-distance or n-gram loss. However, existing learning algorithms are typically designed to optimize alternative objectives such as the cross-entropy. This is ...
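
The edit distance named above is the classic dynamic program; a short sketch follows. The paper's contribution, computing gradients of such losses efficiently via rational and tropical weighted transducers, is beyond this sketch.

```python
def edit_distance(a, b):
    """Standard Levenshtein DP: minimum number of insertions, deletions,
    and substitutions turning string a into string b."""
    m, n = len(a), len(b)
    D = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        D[i][0] = i
    for j in range(n + 1):
        D[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            sub = D[i - 1][j - 1] + (a[i - 1] != b[j - 1])
            D[i][j] = min(sub, D[i - 1][j] + 1, D[i][j - 1] + 1)
    return D[m][n]

assert edit_distance("kitten", "sitting") == 3
```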

Article
Free
Generative probabilistic novelty detection with adversarial autoencoders
Pages 6823–6834

Novelty detection is the problem of identifying whether a new data point is considered to be an inlier or an outlier. We assume that training data is available to describe only the inlier distribution. Recent approaches primarily leverage deep encoder-...
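
A hedged illustration of reconstruction-error novelty scoring, with PCA as a linear stand-in for the deep encoder-decoder setting the abstract describes: fit on inliers only, and flag points the inlier subspace reconstructs poorly.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Training data describes only the inlier distribution.
inliers = rng.standard_normal((500, 10)) @ rng.standard_normal((10, 10))

pca = PCA(n_components=3).fit(inliers)

def novelty_score(x):
    """Distance between a point and its reconstruction from the inlier
    subspace; large scores suggest the point is an outlier."""
    recon = pca.inverse_transform(pca.transform(x))
    return np.linalg.norm(x - recon, axis=1)

print(novelty_score(rng.standard_normal((5, 10))))   # likely outliers
```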
