Re: Identifying MPEG-4 HE-AAC (LATM, LOAS) audio formats

schorpp <thomas.schorpp@xxxxxxxxx> · Mon, 30 Dec 2024 12:21:18 +0100

On 30.12.24 at 11:26, Marko Mäkelä wrote:
On Tue, Dec 10, 2024 at 04:13:21AM +0100, schorpp wrote:
Get a 4B at no cost. H.265 hardware decoding is supported (at least with 
LibreELEC).

Thanks, I already got a Pi 4, but luckily rpihddevice on the Raspberry 
Pi 2 or 3 is sufficient for my needs (the DVB-T2 here only uses H.264). 
My only "production" VDR setup is on the Pi 2; I have a spare setup with 
a Pi 3, but without an MPEG-2 decoding license that would be necessary 
for watching DVB-T recordings. I have been waiting to see whether the 
requirement to buy a license key for the VideoCore 4 firmware will 
disappear in 2025, when the patents will have expired in all jurisdictions.

What? The software decoders on your Pi 3/2 are too slow for H.262 (MPEG-2) 
but fast enough for H.264, which needs roughly 4x the CPU/FPU power?

I have a huge archive of DVD backups here; an AAC 5.1 sample is attached.

Thank you. I renamed the file to 00001.ts within a directory name that 
VDR recognizes. As expected, there was no audio (or video) output 
whatsoever from VDR+rpihddevice. In Totem, the audio played fine via the 
built-in 2 speakers on my laptop.

Same here with the xineliboutput media player: no audio from AAC 5.1 MP4 
files.

AAC 2.0 files extracted with the latest yt-dlp and LATM DVB-C radio 
broadcasts play fine.

Lyngsat claims HE-AAC; the femon plugin says LATM.

https://www.lyngsat.com/muxes/Astra-1M_Europe_10891-H.html
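
The two labels need not contradict each other: HE-AAC names the codec 
profile, while LATM/LOAS names the transport framing around the AAC 
payload. As a rough illustration, here is a minimal Python sketch that 
guesses the framing from the first bytes of an elementary stream payload. 
The function names are hypothetical, and it is only a heuristic: a real 
parser would confirm the sync word again at the start of the next frame.

```python
# Sketch only: guess the AAC framing from the first bytes of a payload.
# Function names are hypothetical, not taken from VDR, femon or FFmpeg.

ADTS, LOAS, UNKNOWN = "ADTS", "LATM/LOAS", "unknown"


def classify_aac_framing(payload: bytes) -> str:
    """Return a guess for the framing of an AAC elementary stream."""
    if len(payload) < 3:
        return UNKNOWN
    b0, b1 = payload[0], payload[1]
    # LOAS AudioSyncStream starts with the 11-bit sync word 0x2B7.
    if ((b0 << 3) | (b1 >> 5)) == 0x2B7:
        return LOAS
    # ADTS starts with the 12-bit sync word 0xFFF; the 2-bit layer
    # field that follows the ID bit is always 0 for AAC.
    if b0 == 0xFF and (b1 & 0xF6) == 0xF0:
        return ADTS
    return UNKNOWN


def loas_frame_length(payload: bytes) -> int:
    """Return the 13-bit audioMuxLengthBytes that follows the sync word."""
    return ((payload[1] & 0x1F) << 8) | payload[2]


if __name__ == "__main__":
    sample = bytes([0x56, 0xE0, 0x10])
    print(classify_aac_framing(sample), loas_frame_length(sample))
```

Whether the payload is HE-AAC or plain AAC-LC cannot be told from the 
framing alone; that takes the AudioSpecificConfig carried inside the 
LATM StreamMuxConfig, or the decoded frames themselves.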

I finally spent some time debugging mediainfo and ffprobe on the 5.1 
channel audio sample that you provided. The mediainfo/libmediainfo code 
base is written in what I would call stereotypical "bloatware C++". Lots 
of objects (including wchar_t strings) are being created, copied, and 
destroyed, which makes it very hard to follow the data flow.

Try a UML modeling tool with C++ reverse engineering, e.g. Rational Rose 
or Umbrello (Linux), or their successors.

With "rr record ffprobe 00001.ts" and "rr replay", I got closer to 
determining where the audio format is actually being parsed. It took a 
few distinct data watchpoints ("watch -l") and "reverse-continue", 
because the metadata was being copied a few times:

Hardware watchpoint 5: -location ac->oc[1].ch_layout.nb_channels

Old value = 0x6
New value = 0x0
0x00007fe048c2bb30 in av_channel_layout_from_mask () from /lib/x86_64-linux-gnu/libavutil.so.59
(rr) bt
#0  0x00007fe048c2bb30 in av_channel_layout_from_mask () from /lib/x86_64-linux-gnu/libavutil.so.59
#1  0x00007fe049e87c5e in ff_aac_output_configure (ac=ac@entry=0x55841cc02a00, layout_map=layout_map@entry=0x7fff1e8604e0, tags=<optimized out>, oc_type=oc_type@entry=OC_GLOBAL_HDR, get_new_frame=get_new_frame@entry=0x0)
     at src/libavcodec/aac/aacdec.c:508
#2  0x00007fe049f0b115 in decode_ga_specific_config (ac=ac@entry=0x55841cc02a00, avctx=avctx@entry=0x55841cbf2fc0, gb=gb@entry=0x7fff1e860850, get_bit_alignment=get_bit_alignment@entry=0x0, m4ac=m4ac@entry=0x55841cc081d8, channel_config=<optimized out>)
     at src/libavcodec/aac/aacdec.c:890
#3  0x00007fe049f0bce6 in decode_audio_specific_config_gb (ac=0x55841cc02a00, avctx=0x55841cbf2fc0, oc=0x55841cc081d8, gb=0x7fff1e860850, get_bit_alignment=0x0, sync_extension=0x1)
     at src/libavcodec/aac/aacdec.c:1040
#4  decode_audio_specific_config (ac=ac@entry=0x55841cc02a00, avctx=avctx@entry=0x55841cbf2fc0, oc=oc@entry=0x55841cc081d8, data=<optimized out>, bit_size=<optimized out>, sync_extension=0x1)
     at src/libavcodec/aac/aacdec.c:1095
#5  0x00007fe049e87ce1 in ff_aac_decode_init (avctx=0x55841cbf2fc0)
     at src/libavcodec/aac/aacdec.c:1189
#6  0x00007fe049feacc7 in avcodec_open2 (avctx=avctx@entry=0x55841cbf2fc0, codec=codec@entry=0x7fe04ae07ac0 <ff_aac_decoder>, options=options@entry=0x55841cbf7580)
     at src/libavcodec/avcodec.c:336
#7  0x00007fe04b49b982 in avformat_find_stream_info (ic=0x55841cbf21c0, options=0x55841cbf7580)
     at src/libavformat/demux.c:2603
#8  0x00005583e525791f in open_input_file (ifile=0x7fff1e860ef0, filename=0x55841cbeefb0 "00001.ts", print_filename=<optimized out>)
     at src/fftools/ffprobe.c:3901
#9  probe_file (wctx=0x55841cbef000, filename=0x55841cbeefb0 "00001.ts", print_filename=<optimized out>)
     at src/fftools/ffprobe.c:4011
#10 main (argc=<optimized out>, argv=<optimized out>)
     at src/fftools/ffprobe.c:4765

Because this watchpoint was hit in "reverse-continue" and not 
"continue", the "Old value" and "New value" are swapped (the number of 
channels was actually changed from 0 to 6 at that point). I didn't 
install libavutil-dbgsym, but it is clear from reading the ffmpeg source 
code that ff_aac_output_configure() and its callers are the more 
interesting part of the call stack.

Once I have fully understood the parsing logic in libavcodec, I will 
decide whether to improve cRpiAudioDecoder::cParser::Parse() a little, 
or to make it use more of libavcodec, which rpihddevice already 
depends on for the actual decoding.
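
For orientation, the leading fields of an MPEG-4 AudioSpecificConfig (the 
structure that decode_audio_specific_config in the backtrace above reads) 
can be sketched in a few lines of Python. This is a minimal illustration 
under stated assumptions, not the libavcodec or rpihddevice logic: the 
class and function names are hypothetical, only explicit hierarchical 
SBR/PS signalling is handled, and channelConfiguration 0 (layout given by 
a program config element) is reported as an unknown channel count.

```python
# Sketch only: read the leading fields of an MPEG-4 AudioSpecificConfig.
# All names are hypothetical; this is not the libavcodec implementation.

SAMPLE_RATES = [96000, 88200, 64000, 48000, 44100, 32000, 24000,
                22050, 16000, 12000, 11025, 8000, 7350]
# channelConfiguration -> number of channels (6 is 5.1, 7 is 7.1).
CHANNELS = {1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6, 7: 8}


class BitReader:
    """Reads big-endian bit fields from a bytes object."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # position in bits

    def read(self, n: int) -> int:
        value = 0
        for _ in range(n):
            byte = self.data[self.pos >> 3]
            value = (value << 1) | ((byte >> (7 - (self.pos & 7))) & 1)
            self.pos += 1
        return value


def read_object_type(br: BitReader) -> int:
    aot = br.read(5)
    if aot == 31:  # escape value: 6 more bits follow
        aot = 32 + br.read(6)
    return aot


def read_sample_rate(br: BitReader):
    idx = br.read(4)
    if idx == 0xF:  # escape value: explicit 24-bit rate follows
        return br.read(24)
    return SAMPLE_RATES[idx] if idx < len(SAMPLE_RATES) else None


def parse_audio_specific_config(data: bytes) -> dict:
    br = BitReader(data)
    aot = read_object_type(br)
    rate = read_sample_rate(br)
    chan_cfg = br.read(4)
    sbr, ext_rate = False, None
    if aot in (5, 29):  # explicit SBR (5) or SBR+PS (29) signalling
        sbr = True
        ext_rate = read_sample_rate(br)
        aot = read_object_type(br)  # underlying object type, e.g. 2
    return {
        "object_type": aot,
        "sample_rate": rate,
        "channel_config": chan_cfg,
        "channels": CHANNELS.get(chan_cfg),  # None if defined elsewhere
        "sbr": sbr,
        "extension_sample_rate": ext_rate,
    }


if __name__ == "__main__":
    # AAC-LC, 48 kHz, channelConfiguration 6 (5.1)
    print(parse_audio_specific_config(bytes([0x11, 0xB0])))
```

In a LATM stream this structure is embedded in the StreamMuxConfig, which 
the sketch does not parse. Implicitly signalled SBR does not show up in 
these fields at all, so a stream may still be HE-AAC even when "sbr" is 
False here.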

Cool. The FFmpeg guys will surely appreciate it.

But PCM multichannel HDMI drivers are still missing for many devices.

Best regards,

     Marko

y
tom