Residual speech signal compression: an experiment in the practical application of neural network technology

Authors:

Lorien Pratt,

Kathleen D. Cebulka,

Peter Clitherow

IEA/AIE '90: Proceedings of the 3rd international conference on Industrial and engineering applications of artificial intelligence and expert systems - Volume 2

Pages 1063 - 1072

https://doi.org/10.1145/98894.99124

Published: 01 June 1990


Abstract

Neural networks are a popular area of research today. However, neural network algorithms have only recently proven valuable to application problems. This paper seeks to aid in the process of transferring neural network technology from research to a development environment by describing our experience in applying this technology.

The application studied here is Speaker Identity Verification (SIV), which is the task of verifying a speaker's identity by comparing the speaker's voice pattern to a stored template.

In this paper, we describe the application of the back-propagation neural network algorithm to one aspect of the SIV problem, called Residual Compression (RC). The RC problem is to extract useful features from a part of the speech signal that was not utilized by previous SIV systems. Here, we describe a neural network architecture, pre-processing algorithm, training methodology, and empirical results for this problem. We also present a few guidelines for the use of neural networks in applied settings.



Published In

IEA/AIE '90: Proceedings of the 3rd international conference on Industrial and engineering applications of artificial intelligence and expert systems - Volume 2

June 1990

591 pages

ISBN:0897913728

DOI:10.1145/98894

Chairman:
Manton M. Matthews

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

Article

Conference

IEA/AIE-90

Sponsor:

SIGAI

IEA/AIE-90: 3rd International Conference on Industrial & Engineering Applications of Artificial Intelligence & Expert Systems

Charleston, South Carolina, USA
