ASRock boards like my Z790 LiveMixer have 2x low-latency USB ports, the yellow / Lightning USB ports:
https://www.asrock.com/microsite/2021embracethefuture/single-post3.html
https://www.asrock.com/mb/Intel/Z790%20LiveMixer/Specification.asp
https://www.asrock.com/microsite/2022EmbraceTheFuture/single-post8.html

but... a USB Bluetooth 5.0 dongle does not work in some ports:
Bus 005 Device 003: ID 0a12:0001 Cambridge Silicon Radio, Ltd Bluetooth Dongle (HCI mode)

Old 2010 server boards like the Tyan S8232 allow setting USB Sync in the BIOS. A Focusrite USB interface has serious performance problems with USB Sync activated... Sync limits performance.
Async means both device clocks are different, but information is transferred at the max speed allowed by the link.
All USB interfaces and all GPUs are Async:
https://www.youtube.com/watch?v=m9qL7gfNxxs
SCSI scanners are Async... >5 MB/s SCSI HDDs are Sync.

USB interfaces have an AD/DA clock: 44.1 or 48 kHz x 256 = 11.2896 or 12.288 MHz. Some have x512 or x1024.
Some have a DSP or FPGA clock, a MIDI clock, and a USB clock. Sometimes the DSP or FPGA clock is the same as the AD/DA clock, sometimes not.
Different AD/DA clocks have different phase noise characteristics, affecting the sound much like different dither noise-shaping algorithms do.

#2. The lowest latency possible is with DSP sound interfaces: Pro Tools HD PCI-X / PCIe or HDX PCIe, + converter AD/DA latency. Maybe Avid Carbon, but it's Ethernet AVB.
The Avid HD I/O AD/DA has 3 milliseconds, but goes up to 6 ms depending on the plugins inserted: EQ, compressor, etc...
HDX DSP latency increases / changes with plugins in/out in series, approx. <1 ms per DSP plugin. Some have more, because some algorithms try to "see the future" and adapt, like look-ahead true peak limiters; others use oversampling, etc...
HDX DSP has 4 samples of latency with no plugins and no AD/DA, because it's a complex DSP + FPGA. The HD I/O DigiLink interface also has an Altera FPGA in each AD/DA board + the chassis controller board.
Lynx AES16 & RME HDSP 9632 have basic DSP, 2 samples of latency.
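To put the clock multipliers and the sample counts above into numbers, here is a minimal Python sketch. The function names are made up for illustration; the only inputs are the figures already mentioned (fs x 256 / 512 / 1024 for the AD/DA clock, 4 and 2 samples of DSP latency).

```python
# Sketch only: function names are hypothetical, the numbers come from the text.

def mclk_hz(fs_hz: int, multiplier: int = 256) -> int:
    """AD/DA master clock for a given sample rate and multiplier (256, 512, 1024)."""
    return fs_hz * multiplier


def samples_to_us(samples: float, fs_hz: int) -> float:
    """Convert a latency given in samples to microseconds at sample rate fs_hz."""
    return samples * 1_000_000 / fs_hz


if __name__ == "__main__":
    for fs in (44100, 48000):
        print(f"{fs} Hz x 256 = {mclk_hz(fs) / 1e6:.4f} MHz")
    for fs in (44100, 48000, 96000):
        print(f"{fs} Hz: 4 samples = {samples_to_us(4, fs):.1f} us, "
              f"2 samples = {samples_to_us(2, fs):.1f} us")
```

At 48 kHz, 4 samples come to roughly 83 us, so this kind of DSP latency is tiny next to converter and buffer latency.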
Focusrite USB interfaces like Scarlett & Clarett mk2 also have basic DSP.
CPU latency is fixed, and defined by the buffer size / driver.
There are several problems with CPU processing: millions of interrupts, branch prediction algorithms, speculative execution algorithms, prediction, and play / pause will never sound the same.
Some CPU algorithms are very close, designed to minimize C++ prediction with likely() and other methods, like PSPaudioware MasterQ: v1.02 has 64-bit FP DP, the newer MasterQ 2 has 80-bit and sounds different, I don't like it.
But testing the same plugin/algorithm, DSP vs. CPU: DSP is always nicer, like the Avid Focusrite d2/d3 or the older Digidesign Focusrite Forte Suite TDM.
New CPUs have better HPET and better prediction, but still can never be 100%.

Older 48 kHz AD/DA had a lot of latency, and 1st generation 192 kHz also had a lot of latency (2004-2005). Today, most USB interfaces like Focusrite Scarlett or Clarett mk2 have near-latest AKM & Cirrus converters, with near 0-latency AD/DA + DSP + drivers.
Focusrite 18i20 mk2 outs 1 & 2 are Cirrus, 3-10 are AKM.
https://github.com/geoffreybennett/alsa-scarlett-gui
AKM has a bit less latency; Cirrus has a bit more, because Cirrus uses a fast voltage processor circuit to emulate smooth "analog" waveforms after the sample & hold circuit and brickwall filter (transition filter):
https://src.infinitewave.ca/
It is similar to the ARP 2600 Lag voltage processor, but much faster, and calculated for each sample rate:
https://www.manualslib.com/manual/1211473/Arp-2600.html?page=46#manual
Other brands like Pioneer DJ use other DACs. Another is the RME FireFace 800: early FF800s had the 1st gen 192 kHz DA AK4395; since March 2005 they had the AK4396, a 2nd generation 192 kHz IC with lower latency.
https://web.archive.org/web/20091229053519/http://www.rme-audio.de/download/fface800_e.pdf
page 96, DA latency:

Sample frequency (kHz)          44.1 | 48   | 88.2 | 96   | 176.4 | 192
DA (43.5 x 1/fs) ms * AK4395    0.99 | 0.9  | 0.49 | 0.45 | 0.25  | 0.23
DA (28 x 1/fs) ms * AK4396      0.63 | 0.58 | 0.32 | 0.29 | 0.16  | 0.15

-----------------------

The Apogee Rosetta 800 originally was 96 kHz; after 2004-2005, some were upgraded to 192 kHz 1st gen AD/DA.
https://www.soundonsound.com/reviews/apogee-rosetta-800

Old AD/DA + 512 buffer on PCI/PCIe = same latency as new AD/DA + 1024 buffer on USB 2. Likewise, old vs. new:
256 = 512
128 = 256
64 = 128
32 = 64

Latency gets lower when working at 2x or 4x sample rate (96 kHz / 192 kHz), but working at 192 kHz is not good, because it requires a clock with 1 picosecond of jitter = very expensive, and most interfaces don't have one.
https://en.wikipedia.org/wiki/Analog-to-digital_converter#Jitter
Agilent / Keysight Trueform generators have 1 picosecond of jitter.
https://www.youtube.com/watch?v=HLPoSiorh30&t=126s
https://www.youtube.com/watch?v=1hxN3QPL4E4&t=52s

Most USB interfaces have a JetPLL clock. It's OK, but... not as good as a master clock like the Grimm Audio CC1. The difference is small on small speakers, and big on large systems.
True dual quartz XO with ultra-low phase noise circuit design; using a PLL instead of dual XO started around the E-Mu Ultra 6000 samplers in 1999.
https://www.vintagesynth.com/emu/emulator4

When using an external clock, there is another problem: the signal is re-clocked again by a PLL included in the decoder IC. Some PLLs are very strict, some are not... This can be adjusted by replacing the external feedback circuit resistor and capacitor (RC). A strict PLL gives priority to the internal clock.
All inputs have different PLL low-pass filters. The best is S/PDIF: because the voltage is very low (0.5 V), the PLL filter is not aggressive. Word clock and AES/EBU inputs all have a more aggressive PLL.
The cable requires a very good shield, litz wire, good dielectrics, etc...
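Going back to the FF800 DA latency table: the rows are simply the converter delay in samples divided by the sample rate. A minimal Python sketch, assuming only the 43.5 and 28 sample delays quoted from the manual (the function and variable names are made up for illustration):

```python
# Sketch: DA latency in milliseconds = delay in samples / sample rate.
# 43.5 (AK4395) and 28 (AK4396) are the figures quoted from the FF800 manual;
# everything else here is a hypothetical helper, not RME or AKM code.

SAMPLE_RATES_HZ = [44100, 48000, 88200, 96000, 176400, 192000]
DA_DELAY_SAMPLES = {"AK4395": 43.5, "AK4396": 28.0}


def da_latency_ms(delay_samples: float, fs_hz: float) -> float:
    """Return the converter delay in milliseconds at sample rate fs_hz."""
    return delay_samples / fs_hz * 1000.0


if __name__ == "__main__":
    for chip, delay in DA_DELAY_SAMPLES.items():
        row = " | ".join(f"{da_latency_ms(delay, fs):.2f}" for fs in SAMPLE_RATES_HZ)
        print(f"{chip}: {row}")
```

The same function shows why doubling the sample rate halves the converter latency: the delay in samples stays the same while each sample gets shorter.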
RME HDSP 9632 / FireFace 800 have aggressive DDS, but "turned off" by default. Lynx AES16 same... it has SynchroLock, a super long, slow PLL. Most interfaces' PLLs are fixed.
Avid Carbon has 2xFS JetPLL = twice-frequency JetPLL, which promises lower jitter, similar to the Steinberg SSPLL in AXR interfaces.

#4. The Linux Liquorix kernel / lowlatency kernel or the Windows 8.1 kernel allow <8 millisecond latency at 96 kHz over USB, measured with an oscilloscope.
https://github.com/falkTX/Carla/issues/1912
Linux ALSA / JACK allows changing frames & periods independently... which means: 256 x 2 = 128 x 4, but...
256 x 2 works with slower CPUs
128 x 4 requires a faster CPU to have the same latency = pointless.
When CPU audio plugins are used at the lowest latency possible (32 buffer), the CPU takes fewer interrupts per buffer, and the CPU sound is more similar to DSP sound, when 1x plugin eats 100% CPU load.

#5. Renesas has different USB 3 ICs: the 200 series, included in some PCIe USB cards and in the old Asus Rampage 3 Extreme LGA1366 board, and the 201 / 202 series.
The latest Renesas/NEC uPD720202 firmware breaks compatibility with Mac OS X Mavericks:
https://www.station-drivers.com/index.php/en/component/remository/Drivers/Renesas-Nec/USB-3.0/lang,en-gb/
There is also the VIA VL800, which has weird USB drivers for Windows: the "latest" does not work, but the previous one works OK in Win 8.1.

_________________________________
From: Florian Paul Schmidt <mista.tapas@xxxxxxx>
Sent: Thursday, January 16, 2025 4:54 AM
To: linux-audio-dev@xxxxxxxxxxxxxxxxxxxx; linux-audio-user@xxxxxxxxxxxxxxxxxxxx
Subject: [LAD] Not all XHCI (USB-Controllers) are made equal

Hi!

This mail is just a heads-up about a finding discovered with the help of the linux-usb mailing list and which I thought some other people might benefit from. If this is nothing new to you, then please do ignore it.

Background:

The USB audio class 2.0 specification dictates that isochronous transfers (i.e. audio frames to/from an audio interface) happen every "micro-frame". USB micro-frames are 125 microseconds (us) apart.
125 us = 0.125 milliseconds (ms).

The majority of USB audio interfaces (at least those that I have) use synchronous audio-streaming, i.e. the sample clock is derived from the bus clock. There are definitely interfaces that use e.g. adaptive or asynchronous modes and this discussion would have to be altered for these.

Given the case of synchronous mode isochronous transfers at a sampling rate of 48000 Hz (= 48 kHz) this would correspond to 6 audio frames per USB micro-frame.

48000 frames/second * 0.000125 seconds = 6 frames

So in principle a very well behaved audio interface attached to a very well behaved USB controller sitting in a well tuned system _should_ be able to achieve a minimum round-trip latency of 2 * 6 frames = 12 frames or 250 us. This leaves out additional buffering inside the audio interface and additional latency by anti-aliasing and reconstruction filters.

The first caveat to the above in the context of Linux: The snd_usb_audio driver does not seem to support period sizes of just 6 frames. It _does_ support 12 frames though, which is nice.

And here is the other caveat which led to the title of this mail: Some controllers are behaving "worse" than others. There's this little thing in the XHCI spec which amounts to the following:

The XHCI can specify in a register how many micro-frames have to be buffered at all times for outgoing isochronous endpoints.

The Intel XHCI in my ASRock N100DC-ITX main-board for example requests a whole USB frame (which corresponds to 8 micro-frames or 1 ms) to be buffered at all times. Another XHCI that I have (a Renesas controller) only requires one micro-frame.

This has direct consequences on the kind of period sizes and number of periods that are usable on these controllers. These consequences don't explain everything but at least you know that you can't ever get better than this limit.
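The arithmetic up to this point can be sketched in a few lines of Python. This is only an illustration: the function names are made up, and treating the XHCI requirement as a plain count of micro-frames that must stay buffered is a simplifying assumption taken from the description above, not something read from real hardware registers.

```python
# Sketch under stated assumptions; names are hypothetical.
MICROFRAMES_PER_SECOND = 8000  # one micro-frame every 125 us


def frames_per_microframe(rate_hz: int) -> float:
    """Audio frames carried per USB micro-frame in synchronous mode."""
    return rate_hz / MICROFRAMES_PER_SECOND


def frames_to_us(frames: float, rate_hz: int) -> float:
    """Duration of a number of audio frames in microseconds."""
    return frames * 1_000_000 / rate_hz


def controller_min_buffer_frames(rate_hz: int, microframes_required: int) -> float:
    """Audio frames that must stay queued for playback if the controller
    asks for `microframes_required` micro-frames to be buffered at all times."""
    return frames_per_microframe(rate_hz) * microframes_required


if __name__ == "__main__":
    rate = 48000
    per_uframe = frames_per_microframe(rate)
    print("frames per micro-frame:", per_uframe)
    print("ideal round trip (us):", frames_to_us(2 * per_uframe, rate))
    # 8 micro-frames (Intel example) vs. 1 micro-frame (Renesas example)
    for name, uframes in (("Intel", 8), ("Renesas", 1)):
        print(name, "min buffered frames:", controller_min_buffer_frames(rate, uframes))
```

With the values from the mail, the Intel example has to keep 48 frames (1 ms) queued for playback, against 6 frames for the Renesas example.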
For example for a period size of 48 frames I need to use 3 periods on the Intel controller (resulting in 3 ms round-trip latency), but for the Renesas I can use 2 periods at 48 frames (resulting in 2 ms round-trip latency). On the Renesas controller even 2 periods at 24 frames works fine (1 ms round-trip latency).

I can lower the latency on the Intel controller by using a smaller period size but more of them as long as the buffering requirement of the controller is satisfied. One stable setting is for example a period size of 24 and 5 periods which results in a round-trip latency of 2.5 ms.

So what's the take-away here? In some cases, if you are chasing stable low-latency operation using a USB audio class 2.0 device it might just be worth installing a different XHCI in your computer (the above-mentioned Renesas controller is just a PCI-Express card which sits in a slot in my N100DC-ITX board) that is better behaved than the one you currently have.

Another take-away is that the above limitation only applies to the outgoing direction (playback). If all I were interested in was the capture direction then 2 periods at 48 frames would work fine even on the Intel controller.

Kind regards,
FPS
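The period settings quoted in the mail can be checked with a short Python sketch. It follows the mail's own convention of counting round-trip latency as period size x number of periods at 48 kHz; the function name and the list of settings are just for illustration.

```python
# Sketch: latency figures for the period settings mentioned in the mail.
# Uses the mail's convention (period_frames * periods / rate); names are hypothetical.

def round_trip_ms(period_frames: int, periods: int, rate_hz: int = 48000) -> float:
    """Buffer length in milliseconds for a given ALSA-style period setup."""
    return period_frames * periods * 1000 / rate_hz


SETTINGS = [
    ("Intel, 48 x 3", 48, 3),
    ("Renesas, 48 x 2", 48, 2),
    ("Renesas, 24 x 2", 24, 2),
    ("Intel, 24 x 5", 24, 5),
]

if __name__ == "__main__":
    for label, period, n in SETTINGS:
        print(f"{label}: {period * n} frames = {round_trip_ms(period, n)} ms")
```

Note that 24 x 5 still keeps 120 frames in the buffer, more than the 48 frames (8 micro-frames) the Intel controller in the mail wants buffered, while giving a shorter total than 48 x 3.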