Nonparametric choice modeling: applications to operations management
Author(s)
Jagabathula, Srikanth
Other Contributors
Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science.
Advisor
Vivek F. Farias and Devavrat Shah.
Abstract
With the recent explosion of choices available in every walk of life, capturing the choice behavior exhibited by individuals has become increasingly important to many businesses. At its core, capturing choice behavior boils down to predicting the probability of choosing a particular alternative from an offer set, given historical choice data about an individual or a group of "similar" individuals. Such predictions rely on a choice model, which models each choice occasion as follows: given an offer set, a preference list over alternatives is sampled according to a certain distribution, and the individual chooses the most preferred alternative according to the sampled preference list. Most existing literature, which dates back to at least the 1920s, considers parametric approaches to choice modeling. The goal of this thesis is to deviate from existing approaches and propose a nonparametric approach to modeling choice. Apart from the usual advantages, the primary strength of a nonparametric model is its ability to scale with the data, which is crucial in applications of interest where choice behavior is highly dynamic. Given this, the main contribution of the thesis is to operationalize the nonparametric approach and demonstrate its success in several important applications. Specifically, we consider two broad setups: (1) solving decision problems using choice models, and (2) learning the choice models. In both setups, the available data corresponds to marginal information about the underlying distribution over rankings, so the problems essentially boil down to designing the 'right' criterion for picking one model from the (typically many) distributions consistent with that marginal information. First, we consider a central decision problem in operations management (OM): find an assortment of products that maximizes revenue subject to a capacity constraint on the size of the assortment.
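The choice-occasion mechanism described above — sample a preference list from a distribution over rankings, then pick the most preferred alternative on offer — can be sketched as follows. This is a minimal illustration; the products, rankings, and weights are hypothetical, not from the thesis.

```python
import random

def simulate_choice(offer_set, rankings, probs, rng=random):
    """One choice occasion: draw a preference list according to the
    distribution, then choose the most preferred offered alternative."""
    ranking = rng.choices(rankings, weights=probs, k=1)[0]
    for alternative in ranking:          # scan in preference order
        if alternative in offer_set:
            return alternative
    return None                          # no offered item appears in the list

# Hypothetical example: 3 products, 2 customer types.
rankings = [(0, 1, 2),   # type A prefers 0 > 1 > 2
            (2, 0, 1)]   # type B prefers 2 > 0 > 1
probs = [0.7, 0.3]

random.seed(0)
picks = [simulate_choice({1, 2}, rankings, probs) for _ in range(10_000)]
# With product 0 removed, type A falls back to product 1 and type B
# still chooses product 2, so product 1's share should be near 0.7.
print(picks.count(1) / len(picks))
```

Note how removing an alternative shifts demand deterministically given the sampled list; all randomness lives in the distribution over rankings, which is exactly the object the nonparametric approach works with.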
Solving this problem requires two components: (a) predicting revenues for assortments and (b) searching over all subsets of a certain size for the optimal assortment. To predict revenues for an assortment, of all the models consistent with the data, we use the choice model that results in the 'worst-case' revenue. We derive theoretical guarantees for these predictions, and show that their accuracy is good when the choice data is generated by several different parametric models. Finally, by applying our approach to real-world sales transaction data from a major US automaker, we demonstrate an improvement in accuracy of around 20% over state-of-the-art parametric approaches. Once we have revenue predictions, we consider the problem of finding the optimal assortment. This problem is provably hard for most of the important families of parametric choice models, with the exception of the multinomial logit (MNL) model. In addition, most of the approximation schemes proposed in the literature are tailored to a specific parametric structure. We deviate from this and propose a general algorithm that finds the optimal assortment assuming access only to a subroutine that gives revenue predictions; this means the algorithm can be applied with any choice model. We prove that when the underlying choice model is the MNL model, our algorithm finds the optimal assortment efficiently. Next, we consider the problem of learning the underlying distribution from the given marginal information. Of all the models consistent with the data, we propose to select the sparsest, or simplest, model, where we measure sparsity as the support size of the distribution. Finding the sparsest distribution is hard in general, so we restrict our search to what we call the 'signature family' to obtain an algorithm that is computationally efficient compared to the brute-force approach.
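The worst-case revenue prediction can be posed as a linear program: among all distributions over rankings consistent with the observed marginals, find the one minimizing the assortment's expected revenue. A small sketch under assumed data (three products, marginals given as first-choice shares over the full offer set; prices and shares are hypothetical):

```python
from itertools import permutations

import numpy as np
from scipy.optimize import linprog

products = [0, 1, 2]
prices = {0: 10.0, 1: 8.0, 2: 5.0}
# Hypothetical marginal data: choice shares when all products are offered.
observed = {0: 0.5, 1: 0.3, 2: 0.2}

rankings = list(permutations(products))  # candidate support of the model

def choice(ranking, assortment):
    """Most preferred offered product under a given preference list."""
    return next(i for i in ranking if i in assortment)

assortment = {1, 2}
# Objective: expected revenue of the assortment under the distribution
# lambda over rankings (one LP variable per ranking); minimizing gives
# the worst-case revenue consistent with the data.
c = np.array([prices[choice(r, assortment)] for r in rankings])
# Consistency constraints: first-choice shares must match the observed
# marginals (these shares sum to 1, so lambda is a valid distribution).
A_eq = np.array([[1.0 if r[0] == i else 0.0 for r in rankings]
                 for i in products])
b_eq = np.array([observed[i] for i in products])

res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=[(0, None)] * len(rankings))
print(round(res.fun, 2))  # worst-case revenue for the assortment {1, 2}
```

The number of rankings grows factorially, so this enumeration is only illustrative; the thesis's contribution is precisely making the worst-case criterion operational without such brute force.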
We show that the price one pays for restricting the search to the signature family is minimal by establishing that, for a large class of models, there exists a "sparse enough" model in the signature family that fits the given marginal information well. We demonstrate the efficacy of learning sparse models on the well-known American Psychological Association (APA) dataset by showing that our sparse approximation captures useful structural properties of the underlying model. Finally, our results suggest that the signature condition can be considered an alternative to the recently popularized Restricted Null Space condition for the efficient recovery of sparse models.
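To make the sparsest-fit criterion concrete, here is the brute-force baseline that the signature-family restriction is designed to avoid: search supports of increasing size until some nonnegative combination of rankings reproduces the marginals exactly. The data is hypothetical, and this exponential-time sketch is the baseline, not the thesis's efficient algorithm.

```python
from itertools import combinations, permutations

import numpy as np
from scipy.optimize import nnls

products = [0, 1, 2]
# Hypothetical marginals: first-choice share of each product.
observed = np.array([0.5, 0.3, 0.2])

rankings = list(permutations(products))
# Column j indicates which product ranking j places first.
A = np.array([[1.0 if r[0] == p else 0.0 for r in rankings]
              for p in products])

def sparsest_fit(A, b, tol=1e-9):
    """Smallest support whose nonnegative weights reproduce the
    marginals b (brute force over supports; illustration only)."""
    n = A.shape[1]
    for k in range(1, n + 1):
        for support in combinations(range(n), k):
            weights, residual = nnls(A[:, list(support)], b)
            if residual < tol:                 # exact fit found
                lam = np.zeros(n)
                lam[list(support)] = weights
                return lam
    return None

lam = sparsest_fit(A, observed)
print(int(np.count_nonzero(lam)))  # support size of the recovered model
```

With richer marginal data (e.g. shares of each product ranked in each position), the same criterion applies but the support search explodes, which is what motivates restricting attention to a structured family.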
Description
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2011. Cataloged from PDF version of thesis. Includes bibliographical references (p. 257-263).
Date issued
2011
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology
Keywords
Electrical Engineering and Computer Science.