On 2017-06-27 01:07, jlnichols wrote:
> i have stereo wav files, in which each channel is a different speaker
> in a conversation. trying to figure out how best to split a stereo
> file by both its channel and silence, but still know the order the
> files should be played in to hear the conversation as a whole.
If you leave it as a stereo file and split by silence you'll get a
sequence of smaller files in play order, divided up whenever neither
person is talking. So essentially:
both001.wav
both002.wav
both003.wav
(for that your sox command is going to have to contain "both%3n.wav", I
think, though you might need %5n or %7n or something if there's going
to be a huge number of these files created).
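As a rough sketch of that split step, here is how a Python script might
build the sox command. The function name and the silence settings (0.5
seconds, 1%) are my own placeholder assumptions, not tested values; you
would need to tune them against your recordings.

```python
def build_split_command(infile, digits=3, duration="0.5", threshold="1%"):
    """Return a sox argument list that splits infile wherever it goes quiet.

    duration and threshold are placeholder guesses; find real values by
    experiment.
    """
    # sox replaces %3n in the output name with 001, 002, ...
    outfile = f"both%{digits}n.wav"
    return [
        "sox", infile, outfile,
        # skip leading silence, then stop once the audio goes quiet again
        "silence", "1", duration, threshold, "1", duration, threshold,
        # start a new output file and run the effects chain again
        ":", "newfile", ":", "restart",
    ]
```

Passing the returned list to subprocess.run(..., check=True) would run
it. Note that this form of the silence effect drops the quiet gaps
themselves, so the chunks will not add up to the original duration.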
Most of these files should only have one person speaking in them, but
clearly there's going to be some with both.
I'd then run the sox stats effect on each of those files, piping the
output into a script/program. Stats tells you the sound level in each
channel of a file. You'd need (by experiment, probably) to find out for
yourself what the levels are in a file where one or other person is
silent. It should then be possible for your script/program to decide if
that stereo file contains only the left channel person speaking, or
only the right, or both.
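Something like the following Python could do that deciding. It is only
a sketch: the function names are made up, the -50 dB cut-off is a
placeholder you would replace with whatever your experiments suggest,
and the parsing assumes the stereo layout of the stats report (a
"RMS lev dB" line whose last two numbers are the left and right
channels).

```python
import subprocess

def stats_for(path):
    """Run 'sox FILE -n stats' and return the report text."""
    # sox prints the stats report on stderr, not stdout
    result = subprocess.run(["sox", path, "-n", "stats"],
                            capture_output=True, text=True, check=True)
    return result.stderr

def channel_rms_db(stats_text):
    """Pull the (left, right) 'RMS lev dB' values out of a stats report."""
    for line in stats_text.splitlines():
        if line.strip().startswith("RMS lev dB"):
            fields = line.split()
            # assumed stereo layout: label, then overall, left, right
            return float(fields[-2]), float(fields[-1])
    raise ValueError("no 'RMS lev dB' line found")

def classify(left_db, right_db, silent_below=-50.0):
    """Return 'L', 'R' or 'B' for a chunk; silent_below is a guess."""
    left_talking = left_db > silent_below
    right_talking = right_db > silent_below
    if left_talking and not right_talking:
        return "L"
    if right_talking and not left_talking:
        return "R"
    # both talking (or, oddly, neither channel above the cut-off)
    return "B"
```

For one chunk that would be used as
classify(*channel_rms_db(stats_for("both001.wav"))).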
I would use that info to rename each of the 'both' files, so eg
both103.wav could become voice103L.wav or voice103R.wav or
voice103B.wav.
Then you'd have a set of files like
voice001L.wav
voice002L.wav
voice003R.wav
voice004L.wav
voice005L.wav
voice006B.wav
voice007L.wav
voice008R.wav
voice009L.wav
...
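The renaming itself is simple string work. A minimal sketch in Python,
assuming the chunks are named bothNNN.wav as above and that something
else has already decided on the L/R/B label (voice_name is a made-up
helper name):

```python
import re

def voice_name(both_name, label):
    """Map e.g. ('both103.wav', 'L') to 'voice103L.wav'."""
    match = re.fullmatch(r"both(\d+)\.wav", both_name)
    if match is None or label not in ("L", "R", "B"):
        raise ValueError(f"unexpected input: {both_name!r}, {label!r}")
    # keep the digits exactly as they are so the play order is preserved
    return f"voice{match.group(1)}{label}.wav"
```

A script would then call os.rename(old, voice_name(old, label)) for
each chunk.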
To listen to just the lefthand person, you'd want to copy the files
with "L" and "B" in their names elsewhere, to get:
voice001L.wav
voice002L.wav
voice004L.wav
voice005L.wav
voice006B.wav
voice007L.wav
voice009L.wav
...
To listen to just the righthand person, you'd want to copy the "R" and
"B" files:
voice003R.wav
voice006B.wav
voice008R.wav
...
To listen to the whole thing with both people, just listen to the whole
'voice' set of files.
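Picking out the files for one listener could be sketched like this in
Python (files_for is a made-up name, and it assumes the voiceNNNx.wav
naming above with a fixed number of digits, so that a plain sort gives
the play order):

```python
import re

def files_for(listener, names):
    """Return the chunks for 'L' or 'R', plus the shared 'B' chunks."""
    wanted = {listener, "B"}
    picked = []
    for name in names:
        match = re.fullmatch(r"voice\d+([LRB])\.wav", name)
        if match and match.group(1) in wanted:
            picked.append(name)
    # zero-padded numbers of equal width sort into play order
    return sorted(picked)
```

The returned names are the ones you would copy elsewhere, for example
with shutil.copy.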
Now, if you think about the "just lefthand person" COPY of the
voicennnx.wav files, ie:
voice001L.wav
voice002L.wav
voice004L.wav
voice005L.wav
voice006B.wav
voice007L.wav
voice009L.wav
...
obviously although these files contain all of the left person's
contributions to the discussion, the "B" files do also contain the
right person interrupting. If that was annoying then ON THIS SET OF
COPIED FILES ONLY you could run on each of the B files a command like:
sox voice006B.wav voice006L.wav remix 1
to get a left-channel only copy of what was in the B file. So then
you'd have
voice001L.wav
voice002L.wav
voice004L.wav
voice005L.wav
voice006B.wav    (chunk 6, both people)
voice006L.wav    (chunk 6, left person only)
voice007L.wav
voice009L.wav
...
and you could delete this copy of the voice006B file if you were sure
you didn't need it.
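If there are a lot of B files, a script could generate those remix
commands. A small Python sketch (single_channel_command is a made-up
name; "remix 1" keeps the left channel as in the command above, and
"remix 2" would keep the right one for the righthand person's copy):

```python
def single_channel_command(b_name, side="L"):
    """Build the sox command that keeps one channel of a 'B' chunk."""
    if not b_name.endswith("B.wav") or side not in ("L", "R"):
        raise ValueError(f"unexpected input: {b_name!r}, {side!r}")
    # voice006B.wav -> voice006L.wav (or voice006R.wav)
    out_name = b_name[:-len("B.wav")] + side + ".wav"
    channel = "1" if side == "L" else "2"
    return ["sox", b_name, out_name, "remix", channel]
```

Each returned list can be handed to subprocess.run(..., check=True),
run from inside the directory holding the copied files.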
Does that help?
--
Jeremy Nicoll - my opinions are my own