Failing diarization, debugging tips #1869
mo22 started this conversation in Development
Good day,
I'm trying to learn pyannote and have some (clear) audio recordings where pyannote fails, and I was wondering if you could suggest how to debug / improve this.
I'm using:
In the hook method I store the embeddings (e.g. for an input audio duration of 300 seconds):
slice_embeddings ("Slice"): are these the 10-second by 1-second VoxCeleb embeddings of the input file?
all_embeddings ("Channel 0, 1, 2"): what are these?
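For reference, a minimal sketch of a hook that stores intermediate artifacts is shown here. It assumes the pyannote.audio 3.x convention that the pipeline calls hook(step_name, step_artifact, file=...) and that progress-only calls may pass None as the artifact. The class name ArtifactCollector, the step name "embeddings", and the array layout (chunks, local speakers, dimension) with NaN rows for inactive speakers are assumptions for illustration, not taken from the post. A random array stands in for a real pipeline run.

```python
import numpy as np


class ArtifactCollector:
    """Hook that keeps the last non-empty artifact seen for each step name."""

    def __init__(self):
        self.artifacts = {}

    def __call__(self, step_name, step_artifact, file=None, **kwargs):
        # Progress-only calls may carry no artifact; skip them.
        if step_artifact is not None:
            self.artifacts[step_name] = step_artifact


collector = ArtifactCollector()

# With a real pipeline this would be roughly: pipeline(audio_path, hook=collector)
# Here the calls are simulated with toy data (assumed shape: chunks x speakers x dim).
fake = np.random.default_rng(0).normal(size=(291, 3, 8))
fake[::2, 2] = np.nan  # pretend the third local speaker is inactive in every other chunk
collector("embeddings", None, total=10, completed=1)  # simulated progress update
collector("embeddings", fake, file={"uri": "demo"})   # simulated final artifact

emb = collector.artifacts["embeddings"]
flat = emb.reshape(-1, emb.shape[-1])        # one row per (chunk, local speaker)
flat = flat[~np.isnan(flat).any(axis=1)]     # drop rows with missing embeddings
print(emb.shape, flat.shape)
```

Flattening to one row per (chunk, local speaker) and dropping NaN rows gives a matrix that can be passed directly to a 2D projection for plotting.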
I plot a UMAP scatter plot of each of the channels, as well as a 1D plot of the channels together with the diarization result. The diarization result is stacked: the bottom row is the pyannote result, the middle row is the known correct diarization, and the top row is the result of the pyannote pro API.
For the audio file test_42 it works really well, but for test_47 almost the whole recording is assigned to SPEAKER_1, with only short segments (<1 second) assigned to SPEAKER_2. pyannote pro does not seem to work correctly on it either. Both audio files have two speakers.
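One way to put numbers on this kind of failure is to total the speech time per label and count the segments shorter than one second. The sketch below works on plain (start, end, label) tuples. The helper name summarize and the toy segments are hypothetical and are not measurements from test_47.

```python
from collections import defaultdict


def summarize(segments, short=1.0):
    """Return per-speaker total seconds, share of speech, and count of short segments."""
    total = defaultdict(float)
    n_short = defaultdict(int)
    for start, end, label in segments:
        duration = end - start
        total[label] += duration
        if duration < short:
            n_short[label] += 1
    speech = sum(total.values())
    return {
        label: {
            "seconds": total[label],
            "share": total[label] / speech,
            "short_segments": n_short[label],
        }
        for label in total
    }


# Toy segments for illustration only (hypothetical values).
toy = [
    (0.0, 120.0, "SPEAKER_1"),
    (120.0, 120.5, "SPEAKER_2"),
    (120.5, 299.0, "SPEAKER_1"),
    (299.0, 299.5, "SPEAKER_2"),
]
summary = summarize(toy)
for label, stats in summary.items():
    print(label, stats)
```

Running the same summary on the pyannote output and on the known correct diarization gives a quick side-by-side check of how far the speaker shares diverge. With a pyannote Annotation, the tuples could be built from itertracks(yield_label=True), which yields (segment, track, label).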
This works:
This does not work:
Any suggestions how to proceed here?
Thank you very much
Moritz