Google Scholar

Feature selection methods for text classification

A Dasgupta, P Drineas, B Harb, V Josifovski… - Proceedings of the 13th …, 2007 - dl.acm.org

We consider feature selection for text classification both theoretically and empirically. Our
main result is an unsupervised feature selection strategy for which we give worst-case …

Cited by 270 - All 13 versions

[PDF] arxiv.org

Sampling Algorithms and Coresets for Regression

A Dasgupta, P Drineas, B Harb, R Kumar… - SIAM Journal on …, 2009 - SIAM

The $\ell_p$ regression problem takes as input a matrix $A\in\mathbb{R}^{n\times d}$, a
vector $b\in\mathbb{R}^n$, and a number $p\in[1,\infty)$, and it returns as output a number ${\…

Cited by 222 - All 18 versions

[PDF] cidrdb.org

[PDF] Applying webtables in practice

S Balakrishnan, A Halevy, B Harb, H Lee, J Madhavan… - 2015 - cidrdb.org

We started investigating the collection of HTML tables on the Web and developed the
WebTables system a few years ago [4]. Since then, our work has been motivated by applying …

Cited by 82 - All 3 versions

[PDF] emory.edu

Wavelet synopsis for data streams: minimizing non-euclidean error

S Guha, B Harb - Proceedings of the eleventh ACM SIGKDD …, 2005 - dl.acm.org

We consider the wavelet synopsis construction problem for data streams where given n
numbers we wish to estimate the data by constructing a synopsis, whose size, say B is much …

Cited by 102 - All 6 versions

[PDF] psu.edu

[BOOK] Algorithms for linear and nonlinear approximation of large data

B Harb - 2007 - search.proquest.com

A central problem in approximation theory is the concise representation of functions. Given a
function or signal described as a vector in high-dimensional space, the goal is to represent …

Cited by 5 - All 2 versions

[PDF] arxiv.org

Approximation algorithms for wavelet transform coding of data streams

S Guha, B Harb - IEEE Transactions on Information Theory, 2008 - ieeexplore.ieee.org

This paper addresses the problem of finding a $B$-term wavelet representation of a given
discrete function $f\in\mathbb{R}^n$ whose distance from $f$ is minimized. The problem is well understood …

Cited by 71 - All 13 versions

[PDF] researchgate.net

[PDF] Weighted isotonic regression under the L₁ norm

S Angelov, B Harb, S Kannan… - Proceedings of the …, 2006 - researchgate.net

Isotonic regression, the problem of finding values that best fit given observations and conform
to specific ordering constraints, has found many applications in biomedical research and …

Cited by 41 - All 10 versions

[PDF] psu.edu

Query language modeling for voice search

…, J Schalkwyk, T Brants, V Ha, B Harb… - 2010 IEEE Spoken …, 2010 - ieeexplore.ieee.org

The paper presents an empirical exploration of google.com query stream language modeling.
We describe the normalization of the typed query stream resulting in out-of-vocabulary (…

Cited by 39 - All 14 versions

[PDF] umass.edu

Approximating the Best-Fit Tree Under L_p Norms

B Harb, S Kannan, A McGregor - … 2005 and 9th International Workshop on …, 2005 - Springer

We consider the problem of fitting an $n\times n$ distance matrix $M$ by a tree metric $T$. We give a
factor $O(\min\{n^{1/p},(k\log n)^{1/p}\})$ approximation algorithm for finding the closest ultrametric $T$ …

Cited by 22 - All 14 versions

[PDF] isca-archive.org

[PDF] Back-off language model compression.

B Harb, C Chelba, J Dean, S Ghemawat - INTERSPEECH, 2009 - isca-archive.org

With the availability of large amounts of training data relevant to speech recognition scenarios,
scalability becomes a very productive way to improve language model performance. We …

Cited by 26 - All 6 versions
