Re: Corosync segmentation fault at liblogsys.so.4.0.0

Sorry, resending. The previous one was wrong.

--- logsys.c.orig 2012-11-29 17:40:25.000000000 +800
+++ logsys.c 2012-11-29 12:31:12.000000000 +800
@@ -446,6 +446,8 @@ static void log_printf_to_logs (
        subsysid = LOGSYS_DECODE_SUBSYSID(rec_ident);
        level = LOGSYS_DECODE_LEVEL(rec_ident);

+        pthread_mutex_lock (&logsys_config_mutex);
+
        while ((c = format_buffer[format_buffer_idx])) {
                cutoff = 0;
                if (c != '%') {
@@ -537,6 +539,8 @@ static void log_printf_to_logs (
                }
        }

+        pthread_mutex_unlock (&logsys_config_mutex);
+
        normal_output_buffer[normal_output_buffer_idx] = '\0';
        syslog_output_buffer[syslog_output_buffer_idx] = '\0';
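
For anyone following the thread, here is a minimal standalone sketch of the pattern the patch relies on. It is not the corosync code: config_mutex, format_set(), format_scan(), writer_thread() and run_demo() are hypothetical stand-ins, and the two format strings are arbitrary. The writer frees and replaces the format string, so the pointer is briefly NULL; the reader scans it the same way the while loop in log_printf_to_logs() does. Both sides take the same mutex, so the reader should never see the intermediate state.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the logsys globals; not the real corosync code. */
static pthread_mutex_t config_mutex = PTHREAD_MUTEX_INITIALIZER;
static char *format_buffer = NULL;

static const char FORMAT_A[] = "[%6s] %b";
static const char FORMAT_B[] = "%t [%6p] %b";

/* Writer side: replace the format string. Between free() and the new
 * assignment the pointer is NULL, which is only safe because readers
 * take the same lock. Returns 0 on success, -1 if allocation fails. */
static int format_set(const char *format)
{
    size_t len;
    char *copy;
    int ret = 0;

    assert(format != NULL);
    len = strlen(format) + 1;

    pthread_mutex_lock(&config_mutex);
    free(format_buffer);
    format_buffer = NULL;          /* the window a reader must never observe */
    copy = malloc(len);
    if (copy != NULL) {
        memcpy(copy, format, len);
        format_buffer = copy;
    } else {
        ret = -1;
    }
    pthread_mutex_unlock(&config_mutex);
    return ret;
}

/* Reader side: walk the format string while holding the lock.
 * Returns the number of characters scanned (0 if no format is set). */
static size_t format_scan(void)
{
    size_t idx = 0;

    pthread_mutex_lock(&config_mutex);
    if (format_buffer != NULL) {
        while (format_buffer[idx] != '\0') {
            idx++;
        }
    }
    pthread_mutex_unlock(&config_mutex);
    return idx;
}

/* Repeatedly swaps between the two formats. */
static void *writer_thread(void *arg)
{
    int rounds = *(int *)arg;
    int i;

    for (i = 0; i < rounds; i++) {
        format_set((i % 2) ? FORMAT_B : FORMAT_A);
    }
    return NULL;
}

/* Runs the writer concurrently with `rounds` scans and returns how many
 * scans saw one of the two complete format strings (-1 on setup failure). */
static int run_demo(int rounds)
{
    pthread_t writer;
    int ok = 0;
    int i;

    if (format_set(FORMAT_A) != 0) {
        return -1;
    }
    if (pthread_create(&writer, NULL, writer_thread, &rounds) != 0) {
        return -1;
    }
    for (i = 0; i < rounds; i++) {
        size_t n = format_scan();
        if (n == strlen(FORMAT_A) || n == strlen(FORMAT_B)) {
            ok++;
        }
    }
    pthread_join(writer, NULL);
    return ok;
}
```

Calling run_demo(1000) from a small main() and building with -pthread should report that every scan saw a complete string, assuming no allocation fails. Dropping the lock and unlock calls from format_scan() recreates the kind of race described in this thread, although whether it actually crashes depends on timing.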

On Nov 29, 2012 5:36 PM, "jason" <huzhijiang@xxxxxxxxx> wrote:

Hi Jan,
Here is my patch.

--- logsys.c.orig       2012-11-29 17:40:25.000000000 +800
+++ logsys.c     2012-11-29 12:31:12.000000000 +800
@@ -446,6 +446,8 @@ static void log_printf_to_logs (
        subsysid = LOGSYS_DECODE_SUBSYSID(rec_ident);
        level = LOGSYS_DECODE_LEVEL(rec_ident);

+       pthread_mutex_lock (&logsys_config_mutex);
+
        while ((c = format_buffer[format_buffer_index])) {
                cutoff = 0;
                if (c != '%') {
@@ -537,6 +539,8 @@ static void log_printf_to_logs (
                }
        }

+       pthread_mutex_unlock (&logsys_config_mutex);
+
        normal_output_buffer[normal_output_buffer_idx] = '\0';
        syslog_output_buffer[syslog_output_buffer_idx] = '\0';

On Nov 29, 2012 4:51 PM, "Jan Friesse" <jfriesse@xxxxxxxxxx> wrote:
Yes,
you are right. Thanks for digging into the problem. Actually, in current
flatiron, corosync-objctl is fixed, so a reread of mainconfig (and thus the
logsys_format_set call) no longer happens after EVERY operation, and this
problem appears only after a change of the logging bits in objdb (that's why
I was not able to reproduce it). Nevertheless, the problem is there; it's
just harder to reproduce.

Can you send me your patch (if you have it)? You spent time finding the real
issue and you deserve credit in the corosync changelog.

Regards,
  Honza

jason wrote:
> Hi All,
>
> I think I have found the cause. corosync-blackbox can trigger two calls to
> logsys_format_set(), so format_buffer may temporarily become NULL, which
> causes log_printf_to_logs() to access an invalid memory address.
>
> After adding the logsys_config_mutex lock around the format_buffer access
> in log_printf_to_logs(), the segmentation fault no longer happens.
> On Nov 28, 2012 11:12 AM, "jason" <huzhijiang@xxxxxxxxx> wrote:
>
>> Hi All,
>> Using corosync-1.4.4 and openais-1.1.4, we encountered a reproducible
>> segmentation fault. Examining it with gdb, we found that it fails in
>> log_printf_to_logs() in liblogsys.so.4.0.0 at the following line:
>>
>> while ((c = format_buffer[format_buffer_idx])) {
>>
>> And the backtrace is:
>> log_printf_to_logs()
>> logsys_worker_thread()
>> start_thread()
>> clone()
>>
>> Only the release build has this issue. If I pass --enable-debug when
>> compiling, it does not happen.
>>
>> It can be reproduced by running corosync-blackbox repeatedly in quick
>> succession while aisexec and its client application are running (so that
>> amf_dump_fn() really has something to print).
>>
>>
>
>
>
> _______________________________________________
> discuss mailing list
> discuss@xxxxxxxxxxxx
> http://lists.corosync.org/mailman/listinfo/discuss
