Hi Jan,
I haven't applied the patch you suggested yet (so that I can catch the issue again). But I have reproduced the issue and found some things that look very strange to me:
1) coroipcc_dispatch_get() was called by our CLM client after it returned from polling the file descriptor of the IPC channel. But I don't think any dispatch request was generated on the server side, because we have only one node in the cluster, so no cluster membership change notification can happen.
2) The poll() in coroipcc_dispatch_get() returns 1 but errno is not zero; it is EINTR! Execution then continues into socket_recv(). Can poll() really behave like this? Or does the errno belong to a previous syscall?
3) In socket_recv(), the call to recvmsg() returns -1 and errno is EBADF. So after returning to coroipcc_dispatch_get(), the assertion is raised.
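On point 2, POSIX only defines errno after a call that has reported failure, so a leftover EINTR alongside a return value of 1 is most likely stale. Separately, poll() counts a descriptor that is not open as "ready" and flags it with POLLNVAL in revents, which would be consistent with the EBADF seen in point 3, though that is only a guess about this case. A minimal sketch of checking both, where wait_readable() and demo_stale_errno() are hypothetical helpers and not part of corosync:

```c
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/*
 * Hypothetical helper: wait until fd is readable.
 * Returns 1 if ready, 0 on timeout, -1 on error (errno is set only then).
 */
int wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    int rc;

    do {
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc == -1 && errno == EINTR);  /* errno is consulted only when rc == -1 */

    if (rc <= 0)
        return rc;                         /* 0 = timeout, -1 = poll() itself failed */

    /* rc > 0: look at revents, never at errno. */
    if (pfd.revents & POLLNVAL) {          /* fd is not open, e.g. closed elsewhere */
        errno = EBADF;
        return -1;
    }
    return 1;                              /* POLLIN, POLLHUP or POLLERR is pending */
}

/* Demo: returns 1 if a stale errno is ignored and a closed fd is detected. */
int demo_stale_errno(void)
{
    int p[2];
    int ready, closed, saved;

    if (pipe(p) != 0)
        return -1;
    if (write(p[1], "x", 1) != 1)
        return -1;

    errno = EINTR;                         /* simulate a leftover from an earlier call */
    ready = wait_readable(p[0], 0);        /* data is pending, stale errno is ignored */

    close(p[0]);
    closed = wait_readable(p[0], 0);       /* poll() reports POLLNVAL for the closed fd */
    saved = errno;

    close(p[1]);
    return ready == 1 && closed == -1 && saved == EBADF;
}
```

If something like this reported POLLNVAL in the failing run, the next thing to look for would be code that closes the IPC descriptor while the dispatch thread is still using it.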
Thanks Jan,
I will give it a try and find out the original reason why this issue appears in my environment.

On Dec 3, 2012 5:12 PM, "Jan Friesse" <jfriesse@xxxxxxxxxx> wrote:
Hi,
honestly, I'm really unsure why this assert is there. Actually, it really
looks like something that shouldn't be there at all. I would suggest a
patch, which simply:
- removes the whole #if defined block (as it doesn't seem to be needed)
- removes the assert
- checks whether error is CS_OK and, if not, does goto error_put
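
The control flow described in the list above could look roughly like the following. This is a self-contained sketch, not the actual corosync source: struct ipc_handle, recv_fn, the fake_recv_* functions and dispatch_get_sketch() are hypothetical stand-ins, and the numeric enum values are placeholders chosen for the sketch.

```c
#include <stddef.h>

/* Names taken from the thread; the numeric values are placeholders. */
typedef enum {
    CS_OK = 1,
    CS_ERR_LIBRARY = 2,
    CS_ERR_TRY_AGAIN = 6
} cs_error_t;

/* Hypothetical handle; refcount stands in for the real handle reference. */
struct ipc_handle {
    int fd;
    int refcount;
};

/* Hypothetical receive callback standing in for socket_recv(). */
typedef cs_error_t (*recv_fn)(int fd, char *buf, size_t len);

/* Fake receiver that succeeds and delivers one byte. */
cs_error_t fake_recv_ok(int fd, char *buf, size_t len)
{
    (void)fd;
    if (len > 0)
        buf[0] = 'x';
    return CS_OK;
}

/* Fake receiver that reports a transient failure. */
cs_error_t fake_recv_try_again(int fd, char *buf, size_t len)
{
    (void)fd; (void)buf; (void)len;
    return CS_ERR_TRY_AGAIN;
}

cs_error_t dispatch_get_sketch(struct ipc_handle *h, recv_fn recv_cb, char *out)
{
    cs_error_t error;

    h->refcount++;                  /* take a reference on the handle */

    error = recv_cb(h->fd, out, 1);
    if (error != CS_OK) {
        goto error_put;             /* previously: assert(error == CS_OK) */
    }

    /* ... normal dispatch processing would go here ... */

error_put:
    h->refcount--;                  /* always drop the reference */
    return error;
}
```

With this shape the caller sees CS_ERR_TRY_AGAIN or CS_ERR_LIBRARY as a return value and can retry or tear down the connection, instead of the process aborting.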
Regards,
Honza
jason wrote:
> Hi All,
> We have hit an assertion at coroipc.c:925, and it seems hard to
> reproduce. According to the corosync-1.4.4 code, it means socket_recv()
> did not return CS_OK as expected by coroipcc_dispatch_get(). But I checked
> socket_recv() and found that it DOES return CS_ERR_TRY_AGAIN or
> CS_ERR_LIBRARY in some cases. So is this assertion really needed, or do
> we need to handle the non-CS_OK cases at coroipc.c:925?
>
> Thank you!
>
>
>
> _______________________________________________
> discuss mailing list
> discuss@xxxxxxxxxxxx
> http://lists.corosync.org/mailman/listinfo/discuss
>