Re: split stereo by channel and silence

I hadn't thought it all the way through like you have, so that does help, thanks. This will end up as a script for multiple files, but currently I'm just playing with one. I did try to split by silence first, but background noise on the left channel (mostly left; the combined noise may have contributed too) meant I only got 3 files. If I increase the noise % it breaks the file up better, but then I start to chop off words from the right side. If instead I split on silence after I split the channels, I get 2 left files (unless I increase the noise %, which doesn't end up hurting this side for this file; not sure about other files) and 15 right files.

Another thing I just thought of: is it possible to split on the silence of just one channel while the file is still stereo? Then, if I turn off the chopping of the silence, I'd still get the 15 right files, plus a lot of little files for the left side in between them. I'd then just combine the files that fall between what gets determined to be good right-side files, and that should be the left side.

Thanks again for the help.

On Tue, Jun 27, 2017 at 4:13 AM, Jeremy Nicoll - ml sox users <jn.ml.sxu.88@xxxxxxxxxxxxxxxxxxxx> wrote:
On 2017-06-27 01:07, jlnichols wrote:
I have stereo wav files, in which each channel is a different speaker in a
conversation.  I'm trying to figure out how best to split a stereo file by both
its channel and silence, but still know the order the files should be played
in to hear the conversation as a whole.

If you leave it as a stereo file and split by silence you'll get a sequence of
smaller files in play order, divided up whenever neither person is talking.  So
essentially

   both001.wav
   both002.wav
   both003.wav

(for that your sox command is going to have to contain: "both%3n.wav" I think,
though you might need %5n or %7n or something if there's going to be a huge
number of these files created).
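
A minimal sketch of the stereo split described above, assuming the commonly used "silence ... : newfile : restart" effects chain. The function name split_by_silence, the 0.5 second duration and the 1% threshold are illustrative assumptions, not values from this thread; the threshold in particular will need tuning for each recording.

```shell
# Split a stereo file into numbered chunks wherever the recording goes quiet.
# $1 = input file, $2 = output prefix (e.g. "both"),
# $3 = minimum duration in seconds (assumed default 0.5),
# $4 = silence threshold (assumed default 1%).
split_by_silence() {
    src=$1
    prefix=$2
    dur=${3:-0.5}
    thr=${4:-1%}
    # %3n in the output name is replaced by a 3-digit chunk number.
    sox "$src" "${prefix}%3n.wav" \
        silence 1 "$dur" "$thr" 1 "$dur" "$thr" : newfile : restart
}

# Example (not run here): split_by_silence conversation.wav both
```

Raising the threshold gives more, shorter chunks, at the risk of clipping quiet speech.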

Most of these files should only have one person speaking in them, but clearly
there's going to be some with both.


I'd then run the sox  stats  effect on each of those files, piping the output
into a script/program.  Stats tells you the sound level in each channel of a
file.  You'd need (by experiment, probably) to find out for yourself what the
levels are in a file where one or other person is silent.  It should then be
possible for your script/program to decide if that stereo file contains only
the left channel person speaking, or only the right, or both.
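
One way the stats step could be scripted is sketched below. It assumes that, for a stereo file, the stats report has an "RMS lev dB" row with Overall, Left and Right columns, and it uses a single made-up threshold (-45 dB in the example) to decide whether a channel counts as speech. The helper names parse_rms, level_or_floor and classify_levels are hypothetical, and the real threshold has to be found by experiment as described above.

```shell
# Read `sox FILE -n stats` output on stdin and print "LEFT RIGHT" RMS levels.
# Assumes a stereo file, where the row is: RMS lev dB  Overall  Left  Right.
parse_rms() {
    awk '/^RMS lev dB/ { print $5, $6 }'
}

# Map a level to a plain number; a digitally silent channel may show as -inf.
level_or_floor() {
    case $1 in
        -inf|-Inf|-INF) printf '%s\n' -999 ;;
        *) printf '%s\n' "$1" ;;
    esac
}

# Print L, R, B (both) or N (neither) for two RMS levels and a threshold in dB.
classify_levels() {
    left=$(level_or_floor "$1")
    right=$(level_or_floor "$2")
    awk -v l="$left" -v r="$right" -v t="$3" 'BEGIN {
        ls = (l + 0 > t + 0)
        rs = (r + 0 > t + 0)
        if (ls && rs) print "B"
        else if (ls)  print "L"
        else if (rs)  print "R"
        else          print "N"
    }'
}

# Example (not run here); stats writes its report to stderr:
#   set -- $(sox both001.wav -n stats 2>&1 | parse_rms)
#   classify_levels "$1" "$2" -45
```

An "N" result (neither channel above the threshold) would suggest the chunk is only background noise.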

I would use that info to rename each of the 'both' files, so eg

   both103.wav

could become

   voice103L.wav  or  voice103R.wav  or  voice103B.wav

Then you'd have a set of files like

   voice001L.wav
   voice002L.wav
   voice003R.wav
   voice004L.wav
   voice005L.wav
   voice006B.wav
   voice007L.wav
   voice008R.wav
   voice009L.wav
   ...
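
The renaming itself is plain string handling. A small sketch, assuming the chunks are named bothNNN.wav as above and that the L/R/B tag has already been decided; tagged_name and rename_chunk are hypothetical helper names.

```shell
# Build the new name for a chunk: both103.wav + L -> voice103L.wav
tagged_name() {
    num=${1#both}      # strip the "both" prefix
    num=${num%.wav}    # strip the extension, leaving the chunk number
    printf 'voice%s%s.wav\n' "$num" "$2"
}

# Rename one chunk in place.
# $1 = file name in the current directory, $2 = L, R or B.
rename_chunk() {
    mv -- "$1" "$(tagged_name "$1" "$2")"
}

# Example (not run here): rename_chunk both103.wav L
```

Because the chunk number is kept, sorting the voice files by name still gives the play order.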

To listen to just the lefthand person, you'd want to copy the files with "L" and "B"
in their names elsewhere, to get:

   voice001L.wav
   voice002L.wav
   voice004L.wav
   voice005L.wav
   voice006B.wav
   voice007L.wav
   voice009L.wav
   ...

To listen to just the righthand person, you'd want to copy the "R" and "B" files:

   voice003R.wav
   voice006B.wav
   voice008R.wav
   ...

To listen to the whole thing with both people, just listen to the whole 'voice' set of
files.
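
The per-person selections above could be made with a shell glob on the last letter of the name. This is only a sketch: copy_side is a hypothetical helper, and it assumes the voiceNNNx.wav naming used in this message.

```shell
# Copy every chunk in which one side speaks: its own files plus the "B" ones.
# $1 = L or R, $2 = directory holding the voiceNNNx.wav files,
# $3 = destination directory (created if missing).
copy_side() {
    case $1 in
        L|R) ;;
        *) echo "side must be L or R" >&2; return 1 ;;
    esac
    mkdir -p "$3"
    for f in "$2"/voice*[${1}B].wav; do
        # Skip the unexpanded pattern when nothing matches.
        if [ -e "$f" ]; then
            cp -- "$f" "$3"/
        fi
    done
}

# Example (not run here): copy_side L . left_only
```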


Now, if you think about the "just lefthand person" COPY of the voicennnx.wav files, ie:

   voice001L.wav
   voice002L.wav
   voice004L.wav
   voice005L.wav
   voice006B.wav
   voice007L.wav
   voice009L.wav
    ...

Obviously, although these files contain all of the left person's contributions to the
discussion, the "B" files also contain the right person interrupting.  If that was
annoying then ON THIS SET OF COPIED FILES ONLY you could run, on each of the B files,
a command like:

  sox voice006B.wav voice006L.wav remix 1

to get a left-channel only copy of what was in the B file.  So then you'd have

   voice001L.wav
   voice002L.wav
   voice004L.wav
   voice005L.wav
   voice006B.wav        chunk 6, both people
   voice006L.wav        chunk 6, left person only
   voice007L.wav
   voice009L.wav
    ...

and you could delete this copy of the voice006B file if you were sure you didn't need
it.
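
To apply that remix command to every B file in a copied set, a loop along these lines might do. extract_side is a hypothetical name, and using "remix 2" for the right-hand person is an assumption by analogy with the "remix 1" command above.

```shell
# For each "B" chunk in a copied set, write a one-channel version next to it.
# $1 = L or R, $2 = directory holding the copied files.
extract_side() {
    case $1 in
        L) chan=1 ;;   # left is channel 1, as in the command above
        R) chan=2 ;;   # right is assumed to be channel 2
        *) echo "side must be L or R" >&2; return 1 ;;
    esac
    for f in "$2"/voice*B.wav; do
        # Skip the unexpanded pattern when there are no B files.
        if [ -e "$f" ]; then
            sox "$f" "${f%B.wav}${1}.wav" remix "$chan"
        fi
    done
}

# Example (not run here): extract_side L left_only
```

The B files are left in place, so deleting them afterwards remains a separate, deliberate step.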


Does that help?


--
Jeremy Nicoll - my opinions are my own


------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Sox-users mailing list
Sox-users@xxxxxxxxxxxxxxxxxxxxt
https://lists.sourceforge.net/lists/listinfo/sox-users

