Dear all,

I'm a maintainer of the ALSA firewire stack. Linux kernel v5.14 was
released a few days ago[1], and it includes some changes to the ALSA
firewire stack. The changes solve several issues and improve the
usability of the included drivers. I appreciate the users who cooperated
on this work[2].

This message includes two topics about issues solved in the release:

1. get rid of playback noise by recovering media clock
2. allow some applications to run without periodical hardware interrupts

and another topic:

3. device aggregation

Let me describe the two solved issues first.

1. get rid of playback noise by recovering media clock

Many users have reported playback noise since the initial version of
each driver in the ALSA firewire stack. The cause of the issue is
complicated to explain, but it can be roughly summarized in one point:

 * mismatch between the audio sample count in the playback stream and
   the one expected by the hardware

Since the initial stage of the ALSA firewire stack, the included drivers
have transferred exactly as many audio data frames per second as the
sampling frequency configured via the ALSA PCM interface; e.g. 44.1 kHz.
However, it turned out that this is not suitable for many models.

Over recent years, I've measured actual packets from/to various models
with Windows and OS X drivers[3], and observed the phenomena below.
Here, the configured frequency is called 'nominal', and the measured
frequency is called 'effective'.

 * the effective frequency is not the same as the nominal frequency; it
   is less or greater by several audio data frames (<= 10 frames)
 * the effective frequency is not even in successive seconds for some
   models

The phenomena mean that playback can not be achieved properly by
transferring samples at the nominal frequency, nor, for some models, by
computing an even number of samples for every second.

Additionally, in isochronous communication of IEEE 1394, some models
support a time stamp per isochronous packet[4].
When parsing the sequence of time stamps and comparing it to the
frequency of samples included in the packets, I observed the phenomena
below:

 * the phase of samples based on the computed time stamp shifts during
   long packet streaming
 * before and after configuring the source of sampling clock to an
   external signal input such as S/PDIF, neither the effective frequency
   of samples in packets nor the sequence of time stamps in packets
   shows any difference

The phenomena give us some insights. At least, the phase of samples is
not deterministic from the driver's point of view. The driver needs to
recover the timing at which to put audio data frames into packets from
the packets transferred by the hardware. This is called 'media clock
recovery'[5].

In the engineering field, many methods of media clock recovery have been
invented for each type of media. The one which the ALSA firewire stack
uses in v5.14 is the simplest: it replays the sequence found in the
transferred packets[6][7][8].

The result looks better. As far as I tested, it covers all of the models
supported by the ALSA firewire stack.

2. allow applications to run independently of periodical hardware
   interrupts

The ALSA PCM interface has a hardware parameter which lets the runtime
of a PCM substream process audio data frames without scheduling
periodical hardware interrupts[9]. PulseAudio and PipeWire use the
function. All of the drivers[10] in the ALSA firewire stack now support
it.

The Linux FireWire subsystem has a function to flush queued packets up
to the most recent isochronous cycle. The function is available in
process context without waiting for scheduled hardware interrupts, and
it is what allows the drivers to support the parameter. In most cases,
the flush is triggered by calling ioctl(2) with SNDRV_PCM_IOCTL_HWSYNC.
The call is part of the routine of usual ALSA PCM applications, thus
users do not need to take extra care of it.

I think these improvements, together with the configurable size of PCM
buffer added in Linux kernel v5.5, bring a better experience to users.
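
Going back to topic 1, the difference between sending at the nominal
frequency and replaying the device's own sequence can be illustrated
with a small Python sketch. It is only an illustration under stated
assumptions, not the kernel implementation: the function names, the
effective rate of 44103 frames per second and the delay of 4 cycles are
all hypothetical. The one fixed fact it relies on is that IEEE 1394 has
8000 isochronous cycles per second.

```python
from collections import deque

CYCLES_PER_SECOND = 8000  # IEEE 1394 isochronous cycles per second


def blocks_per_cycle(frames_per_second, cycles):
    """Spread frames_per_second evenly over isochronous cycles.

    Yields the number of audio data frames carried in each cycle.
    (Hypothetical helper, used here for both the nominal-rate sender
    and a simulated device.)
    """
    for i in range(cycles):
        before = (i * frames_per_second) // CYCLES_PER_SECOND
        after = ((i + 1) * frames_per_second) // CYCLES_PER_SECOND
        yield after - before


def replay(received_counts, delay):
    """Replay the per-packet frame counts observed from the device.

    The count for each outgoing packet is copied from the packet
    received `delay` cycles earlier, so nothing is produced until
    `delay` packets have been cached. (`delay` is an assumption of
    this sketch.)
    """
    cache = deque()
    for count in received_counts:
        cache.append(count)
        if len(cache) > delay:
            yield cache.popleft()


# One second of packets from a simulated device whose effective rate is
# assumed to be 44103 frames/s, and from a sender using nominal 44.1 kHz.
device = list(blocks_per_cycle(44103, CYCLES_PER_SECOND))
nominal = list(blocks_per_cycle(44100, CYCLES_PER_SECOND))
replayed = list(replay(device, delay=4))

print(sum(device) - sum(nominal))  # frames of mismatch in one second
print(replayed == device[:len(replayed)])  # replay follows the device
```

With the nominal sequence the sender falls behind the simulated device
by a few frames every second, while the replayed sequence carries the
same counts as the device, only delayed by a few cycles.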
The remaining topic comes from a comparison with what the existing
userspace driver, libffado2[11], does.

3. device aggregation

Some users pointed out that the drivers in the ALSA firewire stack can
not aggregate several audio data streams into one stream, which is what
libffado2 does. Let me describe my opinion about it.

First, let me start with the fundamental attribute of the audio data
frame. In my understanding of the ALSA PCM interface, the audio data
frame is the unit of audio data transmission. An audio data frame
includes a specific number of audio data samples depending on the
hardware; e.g. 2 samples for a usual sound device. The fundamental
attribute of the audio data frame is that it includes a set of audio
data sampled at the same time.

The goal of aggregating audio data streams is to construct one audio
data frame from the audio data frames of several devices. It means that
one audio data frame consists of audio data sampled at different times.

As I described for the phenomena about nominal and effective frequency,
each piece of hardware seems to run at its own unique effective
frequency from time to time[12], at least over the IEEE 1394 bus.
Additionally, we have the experience that the hardware is not aware of
a sequence of packets at the nominal frequency for the purpose of sample
synchronization. It seems legitimate to say that we can not pick up an
audio data frame which consists of audio data sampled at the same time,
even if the data are transferred in the same isochronous cycle[13].

To achieve the aggregation, we would need either to loosen the
fundamental attribute of the audio data frame by accepting a range of
sampling times for the audio data in one frame, or to implement one of
the resampling methods to adjust the phase of audio data to the frame.
Although the aggregation itself is superficially useful, it does not
seem to be a requirement for a device driver in the hardware abstraction
layer of a general purpose operating system. It may be over-engineering.
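
To put rough numbers on the drift argument above, the following Python
sketch computes how far two free-running devices move apart. The two
effective rates (44103 and 44098 frames per second) and the function
names are hypothetical; they are only chosen to stay within the
deviation of several frames mentioned in topic 1, read here as a
deviation per second.

```python
def frame_offset(effective_a, effective_b, seconds):
    """Frames by which two free-running devices diverge after `seconds`.

    Both rates are given in audio data frames per second.
    """
    return (effective_a - effective_b) * seconds


def offset_in_microseconds(frames, nominal_rate):
    """Express a frame offset as time at the nominal sampling rate."""
    return frames * 1_000_000 / nominal_rate


# Two hypothetical devices, both configured for nominal 44.1 kHz.
frames = frame_offset(44103, 44098, seconds=60)
print(frames)
print(offset_in_microseconds(frames, 44100))
```

Under these assumed rates the two streams are already hundreds of frames
apart after one minute, so a frame built from both devices can not hold
data sampled at the same time unless the range is accepted or one
stream is resampled.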

[1] Linux 5.14
https://lore.kernel.org/lkml/CAHk-=wh75ELUu99yPkPNt+R166CK=-M4eoV+F62tW3TVgB7=4g@xxxxxxxxxxxxxx/

[2] The cooperation is done in my public repository in github.com:
https://github.com/takaswie/snd-firewire-improve

[3] The method is described in the message:
IEEE 1394 isoc library, libhinoko v0.1.0 release
https://lore.kernel.org/alsa-devel/20190415153053.GA32090@workstation/

[4] The resolution of time stamp is 24.576 MHz, which is the same as
specification of cycle time in IEEE 1394. The method to compute time
stamp of packet for samples in the packet is defined by IEC 61883-6. We
can see integrated document for it published by industry group:
https://web.archive.org/web/20210216003054/http://1394ta.org/wp-content/uploads/2015/07/2009013.pdf

[5] I borrow the expression from IEEE 1722. We can see specific term,
sampling transmission frequency (STF) in IEC 61883-6 to express similar
idea of the media clock.

[6] [PATCH 0/3] ALSA: firewire: media clock recovery for syt-aware devices
https://lore.kernel.org/alsa-devel/20210601081753.9191-1-o-takashi@xxxxxxxxxxxxx/

[7] [PATCH 0/6] ALSA: firewire: media clock recovery for syt-unaware devices
https://lore.kernel.org/alsa-devel/20210531025103.17880-1-o-takashi@xxxxxxxxxxxxx/

[8] [PATCH 0/3] ALSA: firewire-motu: media clock recovery for sph-aware devices
https://lore.kernel.org/alsa-devel/20210602013406.26442-1-o-takashi@xxxxxxxxxxxxx/

[9] SNDRV_PCM_HW_PARAMS_NO_PERIOD_WAKEUP. When the PCM substream has a
flag of SNDRV_PCM_INFO_NO_PERIOD_WAKEUP, it's available.

[10] Precisely except for snd-isight.

[11] http://www.ffado.org/

[12] Precisely the hardware looks to run own unique media clock over
IEEE 1394 bus.

[13] For precise discussion, the knowledge about IEC 61883-6 and vendor
specific method for packetization is required.


Regards

Takashi Sakamoto