Re: Semantic Digital Audio Memory: A cognitive aid to boost the capabilities of your memory

"'Rastislav Kish' via blinux-list@xxxxxxxxxx" <blinux-list@xxxxxxxxxx> · Sun, 24 Mar 2024 09:37:52 +0000

Hello Janina,
thanks for the awesome feedback!
Yes, using the soundcard loopback as the audio source is definitely 
something I would like to implement, exactly for the reasons you 
mention. Right now the situation around sound I/O is a bit... interesting, 
since Windows apparently approaches computer audio with a different 
logic (what sense does it make for the microphone to enforce a 
different sampling rate than the speakers?).
But I absolutely want to get into it as soon as I figure out a 
non-messy way to deal with Microsoft's ideas.

By the way, when it comes to controlling SDAM, I got quite inspired by 
the keyboard layout introduced by Orca's flat review.
I.e. you have three controlling triplets of keys: UIO, JKL and M Comma 
Dot. The top one controls the rate, the middle one lets you seek and 
pause/play, and the bottom one lets you control markers.
You can focus either the next/previous marker (relative to the last 
focused one) or the next/previous closest one (i.e. relative to your 
playback position). You can also jump to the last focused marker by 
pressing Ctrl+Comma, move it to the current playback position, or 
assign it a label.
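
As a rough illustration of the marker navigation described above, here is 
a minimal Python sketch. It is not SDAM's actual code: the `MarkerList` 
class, its method names and the use of seconds for positions are all 
assumptions made for the example.

```python
import bisect


class MarkerList:
    """Hypothetical marker store; positions are in seconds."""

    def __init__(self, positions):
        self.positions = sorted(positions)
        self.focused = None  # index of the last focused marker, if any

    def next_closest(self, playback_pos):
        """Focus the first marker strictly after the playback position."""
        i = bisect.bisect_right(self.positions, playback_pos)
        if i == len(self.positions):
            return None
        self.focused = i
        return self.positions[i]

    def previous_closest(self, playback_pos):
        """Focus the last marker strictly before the playback position."""
        i = bisect.bisect_left(self.positions, playback_pos) - 1
        if i < 0:
            return None
        self.focused = i
        return self.positions[i]

    def next_relative(self):
        """Focus the marker right after the last focused one."""
        if self.focused is None or self.focused + 1 >= len(self.positions):
            return None
        self.focused += 1
        return self.positions[self.focused]

    def move_focused_to(self, playback_pos):
        """Move the last focused marker to the playback position."""
        if self.focused is None:
            return
        del self.positions[self.focused]
        # Re-insert at the sorted position and keep the focus on it.
        self.focused = bisect.bisect_left(self.positions, playback_pos)
        self.positions.insert(self.focused, playback_pos)
```

Keeping the positions sorted makes both kinds of navigation cheap: 
"closest" lookups are a binary search from the playback position, while 
"relative" moves just step the focused index.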

Regarding audio editing, this is an interesting topic. I absolutely do 
want to introduce some forms of edits, for example, automatic silence 
removal. That could be very useful, since at least in math classes, 
there are often rather large portions of nothing while the professor is 
writing down formulas, proofs, or while the students calculate an 
assignment on the chalkboard during exercise classes.
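
One simple way such silence detection could work is an energy threshold 
over short frames. The following Python sketch is only an illustration 
under that assumption, not how SDAM does or will do it; the function 
name `find_silences` and the default threshold, frame length and 
minimum duration are made-up values.

```python
import math


def find_silences(samples, sample_rate, threshold=0.01,
                  min_duration=2.0, frame_ms=20):
    """Return (start, end) times in seconds of quiet stretches.

    samples: mono audio as floats in the range [-1.0, 1.0].
    A frame counts as silent when its RMS level is below threshold.
    """
    frame_len = max(1, sample_rate * frame_ms // 1000)
    silences = []
    start = None  # sample offset where the current quiet stretch began
    for offset in range(0, len(samples), frame_len):
        frame = samples[offset:offset + frame_len]
        rms = math.sqrt(sum(s * s for s in frame) / len(frame))
        if rms < threshold:
            if start is None:
                start = offset
        elif start is not None:
            # A loud frame ends the stretch; keep it only if long enough.
            if (offset - start) / sample_rate >= min_duration:
                silences.append((start / sample_rate, offset / sample_rate))
            start = None
    # Handle a quiet stretch that runs to the end of the recording.
    if start is not None and (len(samples) - start) / sample_rate >= min_duration:
        silences.append((start / sample_rate, len(samples) / sample_rate))
    return silences
```

The returned ranges could then be skipped during playback instead of 
being cut from the file, which would leave the original recording 
untouched.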

However, SDAM is, at its core, supposed to work like human memory. 
What does that mean in practice? Right now, when you have a recording 
from which you make a transcript, you have two different resources that 
describe the same thing but are otherwise mostly independent, 
and you treat them as linear resources (that is, if you want to 
find something, you skim through from left to right or from top to bottom).
However, human memory works differently. Human memory is a network 
connecting thoughts, transcripts and audio cues. If you listened to, 
let's say, a podcast about Jupyter notebook accessibility and there was 
a part dealing with Orca, in your mind you don't need to speed-listen 
to the whole event or roll your eyes/ears through the transcript to get 
to the Orca part. You can simply bring yourself in your imagination 
directly to the Orca-related stuff, no detours necessary.

And this is basically my vision for SDAM as well: to have it create a 
network of audio, meanings and texts, where you can easily move 
directly to the relevant parts of whichever resource you're interested in.
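
To give a feel for what linking text and audio could look like, here is 
a very small Python sketch. It is purely illustrative and assumes 
something SDAM may not do at all: that transcript or note text is stored 
in time-stamped segments. The names `Segment`, `build_index` and 
`jump_to` are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Segment:
    start: float  # seconds into the recording
    end: float
    text: str     # transcript or note text for this stretch of audio


def build_index(segments):
    """Map each lowercased word to the segments that mention it."""
    index = defaultdict(list)
    for seg in segments:
        # Strip punctuation first so "Orca." and "orca" count as one word.
        words = {w.strip(".,!?") for w in seg.text.lower().split()}
        for word in words:
            if word:
                index[word].append(seg)
    return index


def jump_to(index, word):
    """Return the start times of all segments mentioning the word."""
    return [seg.start for seg in index.get(word.lower(), [])]
```

With an index like this, asking for "Orca" gives the playback positions 
of the Orca-related parts directly, without skimming the audio or the 
transcript from the beginning.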

Therefore, yes, I do get what you mean by excluding the formal social 
parts, etc., and it is something that may be possible to do eventually. 
However, if I manage to realize SDAM as I imagine it, then 
these operations basically shouldn't be necessary, since you should 
never need to skim through those parts unless you explicitly want 
to. Similar to how you don't need to delete them from your brain and can 
still work with your memories efficiently.

Markers are one piece of the puzzle for achieving this goal, but not the 
only one. There are more that should eventually build up the network. In 
the form I am currently imagining them, most should operate either 
completely automatically or fit into the routine you already follow while 
transcribing.

To be honest, I'm curious myself where this goes and what is possible to 
achieve.

Best regards

Rastislav

On 20. 3. 2024 at 12:55, Janina Sajka wrote:
> Dear Rastislav:
>
> I am delighted by this project outline from you. May I suggest that live
> recording isn't the only use case? One might be listening to a webinar,
> podcast, an audio or ebook, or even a musical composition, and experience the need to drop
> markers in the content for later access.
>
> Another thought about markers ... It seems to me that one would
> frequently realize the importance of what is being said somewhat into
> the discussion, not at the head of the discussion. So, the bookmarks may
> frequently not be at the beginning of the discussion, but part-ways in.
> So, being able to move a marker afterwards might be valuable!
>
> Another use case, imo, would be to replay some portion, e.g. from marker
> a to marker b. I don't know about you, but I find I learn more on
> hearing something a second, third, and even fourth time. In music this
> becomes very, very useful, especially if one can apply time-scale
> modification to slow the music playback.
>
> I suppose a certain amount of editing after the recorded event has occurred
> in real time would be useful, too. One might want to excise all the "Welcome
> everyone, isn't this a nice day, and how lovely of you to come and talk
> with us" kind of social niceties from the recording one is accessing.
> Even with bookmarks to the meatier discussion present, these kinds of
> deletions might be useful.
>
> So, yes please. I'd love to try this tool! I expect I might use it quite
> a lot!
>
> Best,
> Janina
>
> 'Rastislav Kish' via blinux-list@xxxxxxxxxx writes:
>> Hello everyone,
>> I would like to share with you a project I had in mind for a long time during my university studies, and which I finally got to work on in recent months.
>> While attending classes in theoretical mathematics, I'm usually facing 3 problems:
>>
>> - I can't write down notes and pay attention at the same time
>> - Sometimes, I don't get the context of the explained concept right away; I need a few moments to think it through or even look up additional details in my notes or on the Internet. So, I either don't do so and end up just sitting in the class unable to understand anything, because that concept was important for later topics, or I do the lookup asynchronously, which, however, means I get out of sync with the explanation and find myself in the same situation, except now I can't do much about it.
>> - If the class requires active work, my mind gets submerged in the problem and can't track anything in the physical world, resulting in shattered context and missed information.
>>
>> Recording classes can fix all of these issues, however, at the cost of doubling the processing time for each class, since raw recordings don't hold any information about their content and need to be listened through in full to get good-quality notes.
>>
>> Semantic audio
>>
>> SDAM lets you capture recordings with assigned meaning. In the simplest usage, you can just start the recording and add a mark whenever something is said that you will want to write down later. When the class is over, you can return to those marks and quickly create the notes, sure that you have covered everything important without the need to go through the whole thing again. At the same time, those marks can serve as reference points: if you need to return in your memory to the part of the class dealing with a particular topic, because you feel you may have missed something or just want to hear it again, you can get to the relevant part in a few clicks.
>>
>> Time travel
>>
>> However, SDAM also offers a different operation mode. If you have headphones with active noise cancellation technology, you can use it to travel in time during the class. After activating this function, the program will work in augmented reality mode, where you can hear what's happening around you. And if you don't get something, need to research or simply mishear, there's nothing simpler than pausing time or rewinding it. You will get to repeat the past events without missing out on anything that's happening in the meantime, because everything is being recorded for you in the background. So when you're done, you can simply continue listening to the class as it was happening while you were dealing with other things, or even double or triple the speed to get in sync again.
>>
>> The program is also equipped with a built-in notepad, so you can make use of it to do your note-taking stuff, calculations and other textual operations.
>>
>> Saving your memory to a file
>>
>> When the class is over and you save everything, all the recorded audio, taken marks and written notes are put into a single file, which can afterwards be opened again in SDAM and act as an effective capture of your memory from the class.
>>
>> This project is highly experimental. I've got all of the above implemented, and I'm curious to see how my ideas are going to work in practice. Over time, I would also like to add more functionality related to audio processing, like automatic transcription using Whisper (that of course won't work for math, but could give a decent enough starting point for more narrated topics) and automatic silence detection and removal (combined with time travel, that could be a really interesting function), and I have more cool stuff in mind. The idea is basically that SDAM could become my all-in-one solution for working with audio classes, increasing efficiency and saving time for more of the fascinating topics.
>>
>> If you find the idea interesting, you can learn more about the project in its [GitHub repository](https://github.com/RastislavKish/sdam). It's free and open-source, as usual with my projects.
>>
>> Happy memory-hacking!
>>
>> Best regards
>>
>> Rastislav
>>
>> --
>> You received this message because you are subscribed to the Google Groups "blinux-list@xxxxxxxxxx" group.
>> To unsubscribe from this group and stop receiving emails from it, send an email to blinux-list+unsubscribe@xxxxxxxxxx.
> --
>
> Janina Sajka (she/her/hers)
> Accessibility Consultant https://linkedin.com/in/jsajka
>
> The World Wide Web Consortium (W3C), Web Accessibility Initiative (WAI)
> Co-Chair, Accessible Platform Architectures	http://www.w3.org/wai/apa
>
> Linux Foundation Fellow
> https://www.linuxfoundation.org/board-of-directors-2/
>
