Re: split stereo by channel and silence

"Graff, David E" <graff@xxxxxxxxxxxxx> · Thu, 29 Jun 2017 14:20:18 +0000

According to the online docs for Kaldi (http://kaldi-asr.org/doc/tools.html), you should find a utility called "extract-segments", which will take either a 1- or 2-channel wav file as input and will produce as output a listing of speech segments with their time stamps. It looks like using it on single-channel data is easier/better. It also makes sense to do it this way: because the time stamps refer to the original data, "silence" regions are not deleted from the data, so portions of interest in the two separate channels retain their original alignment relative to each other. Each speech segment can be handled independently of the others, and has a unique identifier to keep track of its position in the overall timeline of the original recording.
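
A minimal Python sketch of the bookkeeping described above, assuming you already have start and end times (in seconds) for the speech segments of each channel from some speech detector. The `Segment` type, the "L"/"R" labels, the ID format, and the example times are all illustrative assumptions, not Kaldi's actual output format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    channel: str   # "L" or "R" (hypothetical labels)
    start: float   # seconds from the start of the original recording
    end: float

    @property
    def seg_id(self) -> str:
        # Zero-padded start time in milliseconds, so a plain string sort
        # of the IDs matches the order on the original timeline.
        return f"{round(self.start * 1000):08d}.{self.channel}"


def timeline(segments):
    """Return all segments from both channels, ordered by start time."""
    return sorted(segments, key=lambda s: (s.start, s.channel))


def overlaps(a, b):
    """True if the two segments share any stretch of time."""
    return a.start < b.end and b.start < a.end


# Made-up example times, purely for illustration.
segs = [
    Segment("L", 0.50, 3.20),
    Segment("R", 2.90, 6.10),
    Segment("L", 6.40, 7.00),
    Segment("L", 7.30, 9.80),
]
order = [s.seg_id for s in timeline(segs)]
```

Because nothing is cut out of the recording, two consecutive L segments or a stretch of overlapping speech both fall out of the sort naturally, and `overlaps` can flag the places where both speakers talk at once.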

I haven't used Kaldi at all myself, but this approach to speech detection (using a listing of time offsets, while preserving the full content of the original recording) is a pretty common procedure.
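
One way this could be applied with SoX itself, sketched under assumptions: given a listing of (channel, start, end) offsets, build one `sox` command per segment that selects the channel with the `remix` effect and cuts the span with the `trim` effect, naming each output file by its start time. The input filename, the listing, and the naming scheme are hypothetical, and the commands are only printed here, not run.

```python
import shlex


def sox_command(infile, channel, start, end):
    """Build a sox command that extracts one channel's segment to its own file.

    channel: 1 for left, 2 for right (sox's remix effect counts from 1).
    start, end: offsets in seconds into the original recording.
    """
    side = {1: "L", 2: "R"}[channel]
    # Output name carries the start time in milliseconds (an assumed scheme),
    # zero-padded so that sorting by name gives the play order.
    outfile = f"{round(start * 1000):08d}.{side}.wav"
    return [
        "sox", infile, outfile,
        "remix", str(channel),                         # keep one channel only
        "trim", f"{start:.3f}", f"{end - start:.3f}",  # position, then length
    ]


# Hypothetical listing of (channel, start, end) in seconds.
listing = [(1, 0.5, 3.2), (2, 2.9, 6.1)]
for channel, start, end in listing:
    print(shlex.join(sox_command("conversation.wav", channel, start, end)))
```

The original stereo file is left untouched, so the short clips can be fed to a recognizer while the file names still record where each clip sits in the conversation.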

   Dave Graff

From: Jon Nichols <jonlnichols@xxxxxxxxx>

Sent: Thursday, June 29, 2017 9:50:14 AM

To: sox-users@xxxxxxxxxxxxxxxxxxxxx

Subject: Re:  split stereo by channel and silence

the reason why is i'm trying to use an ASR (Kaldi, to be exact) to transcribe the audio. it seems to work better on short audio clips, which is why i split on silence, and keeping the channels separate makes it easy to know who the speaker is. plus, it was unintelligible to my model when they were speaking over each other in a single mono file.

i'm still very new to figuring out how to use Kaldi, so there easily could be a better way within that tool to handle this.

On Thu, Jun 29, 2017 at 5:15 AM, Jan Stary <hans@xxxxxxxx> wrote:

On Jun 26 17:07:31, jonlnichols@xxxxxxxxx wrote:

> i have stereo wav files, in which each channel is a different speaker in a

> conversation.  trying to figure out how best to split a stereo file by both

> its channel and silence, but still know the order the files should be played

> in to hear the conversation as a whole.

Why do you want to do this?

> i don't want to merge the 2

> channels because often 1 channel has more background noise than the other

> and sometimes speakers will speak over each other and keeping them separate

> will make it easier to understand them.

You can play the one and then play the other, or just the parts

where they speak over each other.

> the problem is, for play back sometimes i should play multiple R.###.wav

> files in a row, or multiple L.###.wav files and i have no way of knowing when i

> should do this with my current setup.

If you play the L and R files in a sequence (whether one-by-one

or with an occasional cluster of L or R as you describe), it will

not be the conversation that happened, exactly in the places

where they spoke over each other.

> instead of just having an increment counter for the name, is there a way to

> have it use the starting time (in seconds or whatever) for that segment

> of the file? that way i'd have the below files and could just sort by the

> number for the play order.

First please describe _why_ you are doing this.

Are the parts when they both speak so unintelligible

that you need to separate them into two mono streams

to actually hear what each is saying?

        Jan

------------------------------------------------------------------------------

Check out the vibrant tech community on one of the world's most

engaging tech sites, Slashdot.org! 
http://sdm.link/slashdot

_______________________________________________

Sox-users mailing list

Sox-users@lists.sourceforge.net

https://lists.sourceforge.net/lists/listinfo/sox-users
