SU1394232A1

SU1394232A1 - Method of identifying a talker

Info

Publication number: SU1394232A1
Application number: SU864044386A
Authority: SU
Inventors: Валерий Иванович Галунов; Гурам Соломонович Рамишвили
Original assignee: Институт Систем Управления Ан Гсср
Priority date: 1986-04-02
Filing date: 1986-04-02
Publication date: 1988-05-07

Abstract

The invention relates to speech informatics. The purpose of the invention is to simplify the process of deciding whether the speaker has changed, based on the properties of the received speech messages. The properties of the speech characteristics extracted from the initial fragments of monotonic sections make it possible to identify the speaker by forming sequences of average values of the pitch and of the relative energy of the upper formants, forming current (running) average values of these sequences of average values, and comparing the absolute values of the relative deviations of the latest values from the corresponding current values. If the deviation does not exceed a threshold, which is set within 10-30%, the speaker is verified; if it exceeds this threshold, the speaker is considered to have changed. 1 drawing.

Description

The invention relates to instrumentation for speech informatics and is intended for monitoring a possible change of speaker in communication systems and voice control systems.

The purpose of the invention is to simplify the process of deciding on a change of speaker based on the properties of the received speech messages.

The process is simplified by extracting from the speech signal only those fragments with which the monotonic sections of the speech messages begin. Within these initial monotonic fragments, whose duration is 1-3 s, the pitch of the voice and the relative energy of the third and fourth formants change by no more than 30%; these speech characteristics themselves vary only slightly from fragment to fragment and differ from person to person. This property of the characteristics extracted from the initial fragments of monotonic sections makes it possible to identify the speaker by forming sequences of average values of the pitch and of the relative energy of the upper formants, forming current (running) average values of these sequences of average values, and comparing the absolute values of the relative deviations of the latest average values from the corresponding current average values. If the deviations do not exceed a threshold, which is set within 10-30%, the speaker is verified; if they exceed this threshold, a decision is made that the speaker has changed.
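The decision rule described above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not the patented device: the class name `SpeakerChangeDetector`, the default threshold of 20% (chosen from the stated 10-30% range), the choice to compute the current average over the preceding fragments only, the rule that either parameter exceeding the threshold signals a change, and the reset of the history after a change are all assumptions that the text does not specify.

```python
from dataclasses import dataclass, field
from statistics import fmean


@dataclass
class SpeakerChangeDetector:
    """Hypothetical sketch of the threshold comparison (not from the source)."""

    threshold: float = 0.20  # assumed value inside the stated 10-30% range
    pitch_means: list = field(default_factory=list)
    energy_means: list = field(default_factory=list)

    def update(self, pitch_mean: float, energy_mean: float) -> bool:
        """Feed the average values of the latest fragment.

        Returns True when a speaker change is decided and False when the
        speaker is verified. Both inputs are assumed to be positive.
        """
        if not self.pitch_means:
            # First fragment: nothing to compare against yet.
            self.pitch_means.append(pitch_mean)
            self.energy_means.append(energy_mean)
            return False

        # Current (running) averages of the sequences of average values.
        running_pitch = fmean(self.pitch_means)
        running_energy = fmean(self.energy_means)

        # Absolute values of the relative deviations of the latest averages.
        dev_pitch = abs(pitch_mean - running_pitch) / running_pitch
        dev_energy = abs(energy_mean - running_energy) / running_energy

        # Assumption: a change is decided if either deviation is too large.
        changed = max(dev_pitch, dev_energy) > self.threshold
        if changed:
            # Assumption: start a fresh history for the new speaker.
            self.pitch_means = [pitch_mean]
            self.energy_means = [energy_mean]
        else:
            self.pitch_means.append(pitch_mean)
            self.energy_means.append(energy_mean)
        return changed
```

For example, feeding per-fragment averages of (120 Hz, 0.30), (125 Hz, 0.31) and (118 Hz, 0.29) keeps the speaker verified, while a following fragment with (210 Hz, 0.45) deviates from the running pitch average of 121 Hz by far more than 20% and is reported as a change.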

The drawing shows a block diagram of a device for implementing the proposed method.

The block diagram contains a segmenter 1 connected to an extractor 2 of speech segment parameters, a unit 3 for forming average values, a unit 4 for forming current averages of the sequences of average values, a unit 5 for extracting the absolute values of the relative deviations of the latest average values from the corresponding current averages, and a threshold logic unit 6.

The input of the segmenter is the input of the device, and the output of the threshold logic unit is the output of the device.

Segmenter 1 selects only the initial fragments of the monotonic sections of the speech signal. Extractor 2 then extracts the segment parameters, units 3 and 4 form the average and current average values, unit 5 extracts the absolute values of the relative deviations of the averages from the current averages, and threshold logic unit 6 makes the decision to verify the speaker or to register a change of speaker. Together, these operations make it possible to exclude from the measurement of the speaker's individual speech characteristics the many dynamic characteristics of speech, which can be used only in complex pattern recognition processes that rely on a large set of previously formed reference patterns of the dynamic properties of the speaker's articulation.
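The work of segmenter 1 and units 2 and 3 can be sketched roughly as follows. This is a hypothetical illustration: the function name `initial_fragment_means`, the 10 ms frame step, the 1 s fragment length (taken from the 1-3 s range), and the reading of "monotonic" as "every frame stays within 30% of the first frame of the stretch" are assumptions, and the per-frame pitch and relative upper-formant energy tracks are assumed to be computed elsewhere.

```python
from statistics import fmean


def initial_fragment_means(pitch, energy, frame_step_s=0.01,
                           fragment_s=1.0, max_change=0.30):
    """Return (mean pitch, mean energy) for the initial fragment of each
    monotonic stretch. All parameter names and defaults are assumptions."""
    n = max(1, round(fragment_s / frame_step_s))  # frames per fragment

    def within(value, ref):
        # A non-positive reference (e.g. an unvoiced frame) never matches.
        return ref > 0 and abs(value - ref) <= max_change * ref

    results = []
    start = 0
    while start + n <= len(pitch):
        p0, e0 = pitch[start], energy[start]
        end = start
        # Extend the stretch while both parameters stay close to its start.
        while (end < len(pitch)
               and within(pitch[end], p0)
               and within(energy[end], e0)):
            end += 1
        if end - start >= n:
            # Keep only the initial fragment of the monotonic stretch.
            results.append((fmean(pitch[start:start + n]),
                            fmean(energy[start:start + n])))
            start = end  # skip the remainder of this stretch
        else:
            start = max(end, start + 1)
    return results
```

Each returned pair would then be passed to the averaging and threshold stages; stretches shorter than the fragment length are simply skipped in this sketch.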

Claims

Invention Formula

A method of identifying a speaker by threshold comparison of the relative deviations of the characteristics of speech signal fragments from current average values, characterized in that, in order to simplify the decision on a change of speaker, within the fragments of the speech messages with which their monotonic sections begin, average values of the relative energy of the upper formants of the voice and of its pitch are formed, current average values of the sequences of average values are formed, and the absolute values of the relative deviations of the average values of the parameters of the latest fragments from the corresponding current averages are compared with a threshold, the threshold being set within 10-30% and the duration of the initial fragments of the monotonic sections being set within 1-3 s.
