On 03/05/2016 06:07 PM, Takashi Iwai wrote: > On Sat, 05 Mar 2016 09:35:51 +0100, > bsiice wrote: >> >> On 03/05/2016 03:36 PM, Takashi Iwai wrote: >>> On Sat, 05 Mar 2016 07:37:34 +0100, >>> bsiice wrote: >>>> >>>> On 03/05/2016 02:08 AM, Takashi Iwai wrote: >>>>> On Fri, 04 Mar 2016 18:36:46 +0100, >>>>> bsiice wrote: >>>>>> >>>>>> On 03/05/2016 12:37 AM, Takashi Iwai wrote: >>>>>>> On Fri, 04 Mar 2016 17:22:58 +0100, >>>>>>> bsiice wrote: >>>>>>>> >>>>>>>> On 03/03/2016 09:53 PM, Takashi Iwai wrote: >>>>>>>>> On Thu, 03 Mar 2016 13:40:42 +0100, >>>>>>>>> IceBsi wrote: >>>>>>>>>> >>>>>>>>>> From: Qing Cai <caiqing@xxxxxxxxxxx> >>>>>>>>>> >>>>>>>>>> As stated in manpage SHMCTL(2), shm_nattch is "No. of current attaches" >>>>>>>>>> (i.e., number of processes attached to the shared memeory). If an >>>>>>>>>> application uses alsa-lib and invokes fork() at some point, there should >>>>>>>>>> be the following execution sequence: >>>>>>>>>> 1. execute the following statement: >>>>>>>>>> pcm_direct.c:110: dmix->shmptr = shmat(dmix->shmid, 0, 0) >>>>>>>>>> (shm_nattch becomes 1) >>>>>>>>>> 2. invoke fork() in some thread. >>>>>>>>>> (shm_nattch becomes 2) >>>>>>>>>> 3. execute the following statement: >>>>>>>>>> pcm_direct.c:122: if (buf.shm_nattch == 1) >>>>>>>>>> 4. execute the following statement: >>>>>>>>>> pcm_direct.c:131: if (dmix->shmptr->magic != SND_PCM_DIRECT_MAGIC) >>>>>>>>>> (As stated in manpage SHMGET(2), "When a new shared memory segment >>>>>>>>>> is created, its contents are initialized to zero values", so >>>>>>>>>> dmix->shmptr->magic is 0) >>>>>>>>>> 5. execute the following statements: >>>>>>>>>> pcm_direct.c:132: snd_pcm_direct_shm_discard(dmix) >>>>>>>>>> pcm_direct.c:133: return -EINVAL >>>>>>>>>> The above execution sequence will cause the following error: >>>>>>>>>> unable to create IPC shm instance >>>>>>>>>> This error causes multimedia application has no sound. This error rarely >>>>>>>>>> occurs, probability is about 1%. >>>>>>>>>> Because the first user of the shared memory will get that >>>>>>>>>> dmix->shmptr->magic is 0, check dmix->shmptr->magic's value to determine >>>>>>>>>> if "we're the first user" is OK. >>>>>>>>>> Tests have been made 400+ times after this fix, and the issue no longer >>>>>>>>>> exists. >>>>>>>>> >>>>>>>>> I think this is still racy. Multiple users can grab the shmem at the >>>>>>>>> very same time. Maybe it looks as if working just because both users >>>>>>>>> behavior as the first user and do clear and initialize. >>>>>>>> I think this won't be a race condition. Since >>>>>>>> snd_pcm_direct_shm_create_or_connect() is protected by >>>>>>>> snd_pcm_direct_semaphore_down() and snd_pcm_direct_semaphore_up(), >>>>>>>> multiple processes won't do clear and initialization concurrently. >>>>>>> >>>>>>> Hrm, OK, now understood the situation. >>>>>>> >>>>>>>>> The check of bus.shm_nattach=1 should be fine, per se. The problem is >>>>>>>>> the magic key check of the secondary. In the current code, as you >>>>>>>>> pointed out, this may happen before the first client finishes the >>>>>>>>> initialization. >>>>>>>> I think the check of dmix->shmptr->magic!=SND_PCM_DIRECT_MAGIC is more >>>>>>>> robust. Assume there are two clients of shmem and shmem is initialized, >>>>>>>> if one of the clients exits, shm_nattach will become 1 and shmem may be >>>>>>>> initialized again. >>>>>>> >>>>>>> Yeah, in the fork from a thread, the shm_nattch check doesn't work >>>>>>> reliably, indeed. >>>>>>> >>>>>>>> In the current code, because there is semaphore, magic key check won't >>>>>>>> happen before initialization of shmem. >>>>>>>> In the scenario that I want to tell, there is only one process, and only >>>>>>>> one thread doning the clear and initialization of shmem, and the fork() >>>>>>>> is invoked in another thread of non alsa-lib context (i.e., the fork() >>>>>>>> code is not in alsa-lib). If the fork() is invoked before shmat(), the >>>>>>>> problem won't happen; If the fork() is invoked after the check of >>>>>>>> bus.shm_nattach=1, the problem won't happen too. Only if the fork() is >>>>>>>> invoked just after shmat() and just before the check of >>>>>>>> bus.shm_nattach=1, the problem happens. >>>>>>> >>>>>>> Well, the reason why the magic key check was introduced was about the >>>>>>> safety. Since the shmid is given explicitly, it might override some >>>>>>> shmem usages other than dmix accidentally. With your patch, it >>>>>>> forcibly clears the region -- this is quite opposite to the intention >>>>>>> of the magic key check. >>>>>> >>>>>> I just think it clears the region only once at the first time. Could you >>>>>> please show me the scenario in detail? >>>>> >>>>> The shmid ipc key is given explicitly by an alsa-lib configuration, >>>>> but there is no guarantee that this key has never been used by some >>>>> other programs for a completely different purpose. So, for non-first >>>>> user, it verifies whether the attached region really belongs to >>>>> alsa-lib dmix. >>>>> >>>>> >>>>> Takashi >>>>> >>>> >>>> Thank you for explanation! Now I understood. >>>> I have tried changing ipc_key many times when I have been investigating >>>> the problem, and changing ipc_key doesn't solve the problem I >>>> encounterd. >>>> >>>> I think there is more benefits to do magic key check instead of >>>> shm_nattch check, because the situation that I encounterd happens more >>>> often than the situation that other program has a same ipc key with >>>> alsa-lib. >>> >>> Yes, possibly. However, which may cause a severe problem? It's the >>> question, too. >> >> By 'severe problem', do you mean that alsa-lib may clear other program's >> shmem? If the key is unique, will the 'severe problem' still happen? > > If it were unique... Actually it's not with SYSV shmem. > >>>> The problem left is to make ipc key as unique as possible. >>>> Every program/library should fulfil the responsibility of generating a >>>> unique key. As stated on http://www.tldp.org/LDP/lpg/node24.html, ftok() >>>> should be used to generate a unique key. >>> >>> This doesn't guarantee the uniqueness completely, obviously :) >>> In addition, we'd need more than 256 variants... >> >> Oh, sorry, I thought it would be unique if all program/library use their >> own full filename as arg to ftok(). Actually this doesn't guarantee >> uniqueness. :) >> >> What if alsa-lib changed to use POSIX shmem (i.e. shm_open()...)? > > This would be one option. I thought it in the past, too, but > postponed by some reason I forgot. > > Another possible way I can think of is to determine the first user by > shmget() call without IPC_CREAT. If this succeeds, we are no first > user. I'm not sure, though, whether this method is more robust. Sounds a good method, I think it can't be impacted by fork() from a thread. Assume there is other program/process uses same ipc key to create shmem, and runs concurrently with alsa-lib, I can think of some methods and situations (I only think of situations with failure). Method 1: call shmget() without IPC_CREAT firstly, if fail then call shmget(IPC_CREAT). Possible situation 1.1: (1) other program creates a shmem with same key (2) alsa-lib calls shmget() without IPC_CREAT (3) if (2) succeeds, alsa-lib connects to other program's shmem (4) alsa-lib is not the first user, so alsa-lib checks dmix->shmptr->magic!=SND_PCM_DIRECT_MAGIC (true) (5) alsa-lib return -EINVAL Possible situation 1.2: (1) alsa-lib calls shmget() without IPC_CREAT, and fails (2) other program creates a shmem with same key (3) alsa-lib calls shmget(IPC_CREAT), and succeeds, but connects to other program's shmem (4) alsa-lib doesn't know if the shmem is created by alsa-lib or not, so method 1 is not applicable Method 2: call shmget() without IPC_CREAT firstly, if fail then call shmget(IPC_CREAT|IPC_EXCL). Possible situation 2.1: (1) alsa-lib calls shmget() without IPC_CREAT, and fails (2) other program creates a shmem with same key (3) alsa-lib calls shmget(IPC_CREAT|IPC_EXCL), and fails Method 3: call shmget(IPC_CREAT|IPC_EXCL) firstly, if fail then call shmget() without IPC_CREAT. Possible situation 3.1: (1) alsa-lib calls shmget(IPC_CREAT|IPC_EXCL), and succeeds (2) other program tries to create a shmem with same key, and maybe fail (3) alsa-lib is the first user, so alsa-lib does clear and initialization Possible situation 3.2: (1) other program creates a shmem with same key, and succeeds (2) alsa-lib calls shmget(IPC_CREAT|IPC_EXCL), and fails (3) alsa-lib calls shmget() without IPC_CREAT, and succeeds, but connects to other program's shmem (4) alsa-lib is not the first user, so alsa-lib checks dmix->shmptr->magic!=SND_PCM_DIRECT_MAGIC (true) (5) alsa-lib return -EINVAL So, which method should be used? Method 2 or Method 3 or something else? thanks, Qing Cai _______________________________________________ Alsa-devel mailing list Alsa-devel@xxxxxxxxxxxxxxxx http://mailman.alsa-project.org/mailman/listinfo/alsa-devel