Re: Problem with function select on kernel 2.6.29.6-rt23

No, I do not think this is intentional. A few lines further down, the same man page says:

"Some code calls *select*() with all three sets empty, /n/ zero, and a non-NULL /timeout/ as a fairly portable way to sleep with subsecond precision."

That would make no sense if I had to call select several times to get the full delay period, and the overhead of calling the function repeatedly is significant. Following your proposal, I modified the test program to run the loop 2000 times with a 10000 us delay (nominally 20 seconds) and got total times between 22 and 24 seconds, depending on the speed of the computer.
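
For reference, a minimal sketch of how such a measurement could be taken against CLOCK_MONOTONIC instead of the wall clock, so that NTP steps of the system time do not distort the result. The helper names timed_select_loop and seconds_between are made up for illustration; they are not part of the original test program.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>
#include <sys/select.h>

/* Seconds elapsed from reading a to reading b of the same clock. */
static double seconds_between(const struct timespec *a,
                              const struct timespec *b)
{
    return (double)(b->tv_sec - a->tv_sec)
         + (double)(b->tv_nsec - a->tv_nsec) / 1e9;
}

/* Hypothetical helper: perform `count` select() sleeps of `usec`
 * microseconds each and return the total duration in seconds as
 * measured on CLOCK_MONOTONIC, or a negative value if select() fails. */
static double timed_select_loop(int count, long usec)
{
    struct timespec start, end;
    int i;

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (i = 0; i < count; i++) {
        struct timeval tv;

        /* Re-arm the timeout every time; select() may modify it. */
        tv.tv_sec  = usec / 1000000;
        tv.tv_usec = usec % 1000000;
        if (select(0, NULL, NULL, NULL, &tv) < 0)
            return -1.0;
    }
    clock_gettime(CLOCK_MONOTONIC, &end);
    return seconds_between(&start, &end);
}
```

A call such as timed_select_loop(2000, 10000) has a nominal duration of 20 seconds; whatever comes on top is per-call overhead plus timer granularity. With totals of 22 to 24 seconds that works out to roughly 1 to 2 ms per call.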

I understand that the timeout argument of select is updated when select returns after one of the monitored file descriptors is ready for the selected operation.

I have now tested this with kernel 2.6.31-rt11 and found a new problem: select no longer returns prematurely, but each second of computer time now takes about three seconds of real time (the computer clock is extremely slow). NTP is running.

Somehow fiddling with NTP causes very strange side effects...

Bye,
          Jürgen

Sujit K M wrote:
This seems to be normal behaviour.

As quoted from

http://linux.die.net/man/2/select

(ii)
select() may update the timeout argument to indicate how much time was
left. pselect() does not change this argument.
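
A small sketch of what this sentence means in practice, assuming a pipe that is already readable so that select() returns before the timeout expires. The helper name select_leftover is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <unistd.h>
#include <sys/select.h>

/* Hypothetical helper: make a pipe readable, wait on it with a 2 s
 * timeout and leave in *left whatever select() stored in the timeout
 * argument. Returns the return value of select(), or -1 on error. */
static int select_leftover(struct timeval *left)
{
    int fds[2];
    fd_set readfds;
    int ret;

    if (pipe(fds) < 0)
        return -1;
    if (write(fds[1], "x", 1) != 1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    FD_ZERO(&readfds);
    FD_SET(fds[0], &readfds);
    left->tv_sec  = 2;
    left->tv_usec = 0;

    /* Data is already waiting, so this should return 1 right away. */
    ret = select(fds[0] + 1, &readfds, NULL, NULL, left);

    close(fds[0]);
    close(fds[1]);
    return ret;
}
```

On Linux the value left behind is expected to be slightly below two seconds; on systems that do not touch the argument it stays at exactly 2 s. Either way, portable code should treat the timeout as undefined after the call and set it again before every select().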



On Sun, Sep 20, 2009 at 3:50 PM, Sujit K M <sjt.kar@xxxxxxxxx> wrote:
Hi,

To begin with, I would like you to check what happens to the program
when the loop count is increased, e.g. to 1,000, 10,000, 100,000,
1 million or 10 million.
Does the elapsed time grow with it?
Try plotting the difference against the actual start time, using a
scripting language such as Tcl/Tk.

There is some information about the select system call that I think
pertains to this problem:
http://linux.die.net/man/2/syscalls. Basically, it is an optimization
that current kernels look into.

Thanks,
Sujit

On Sat, Sep 19, 2009 at 1:10 AM, Jürgen Mell <mell@xxxxxxxxxxxxxxxxxxx> wrote:
Meanwhile I have dug a little deeper into this problem. The problem
occurs under the following conditions:
- the BIOS clock must be slow
- the NTP daemon is used to adjust the system time
The problem can be reproduced on real hardware as well as on a virtual
machine running under VMware. Set the BIOS clock back about ten minutes
against the 'real' time. Then start the NTP daemon and then run the
little test program:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <errno.h>
#include <sys/select.h>

int main(int argc, char *argv[])
{
  time_t t;
  struct timeval timeout;
  int i;
  int ret;

  t = time (NULL);
  printf ("Current time before = %s", ctime (&t));

  for (i = 0; i < 20; i++)
  {
     timeout.tv_sec  = 1;
     timeout.tv_usec = 0;

     if ((ret = select (FD_SETSIZE, NULL, NULL, NULL, &timeout)) < 0)
     {
        printf ("select returned %d, errno = %d\n", ret, errno);
        return EXIT_FAILURE;
     }
  }
  t = time (NULL);
  printf ("Current time after = %s", ctime (&t));
  return EXIT_SUCCESS;
}

On a virtual machine under VMware I got the following result after some
minutes of system run time:

hws@cwc-vmware:/home/hws > /space/software/select_test/debug/src/select_test
Current time before = Fri Sep 18 20:05:51 2009
Current time after = Fri Sep 18 20:06:11 2009
hws@cwc-vmware:/home/hws > /space/software/select_test/debug/src/select_test
Current time before = Fri Sep 18 20:14:29 2009
Current time after = Fri Sep 18 20:14:33 2009
hws@cwc-vmware:/home/hws > /space/software/select_test/debug/src/select_test
Current time before = Fri Sep 18 20:14:57 2009
Current time after = Fri Sep 18 20:14:57 2009
hws@cwc-vmware:/home/hws > /space/software/select_test/debug/src/select_test
Current time before = Fri Sep 18 20:15:20 2009
Current time after = Fri Sep 18 20:15:40 2009
hws@cwc-vmware:/home/hws >

Normally, the time difference between 'before' and 'after' should be 20
seconds, as in the first and last runs of the program. In the second run
the difference is only 4 seconds, and in the third run it is zero.
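
Since the test program reads the wall clock with time(), a short interval could in principle also come from the system time being stepped during the run. A possible way to tell the two cases apart is to read CLOCK_REALTIME and CLOCK_MONOTONIC side by side. This is only a sketch; the names clock_pair, read_clocks and compare_clocks are invented for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>
#include <sys/select.h>

/* One reading of the wall clock and of the monotonic clock. */
struct clock_pair {
    struct timespec wall;
    struct timespec mono;
};

static void read_clocks(struct clock_pair *p)
{
    clock_gettime(CLOCK_REALTIME, &p->wall);
    clock_gettime(CLOCK_MONOTONIC, &p->mono);
}

static double diff_seconds(const struct timespec *a,
                           const struct timespec *b)
{
    return (double)(b->tv_sec - a->tv_sec)
         + (double)(b->tv_nsec - a->tv_nsec) / 1e9;
}

/* Hypothetical helper: sleep `count` times for `usec` microseconds via
 * select() and report how far each clock advanced in the meantime.
 * Returns 0 on success, -1 if select() fails. */
static int compare_clocks(int count, long usec,
                          double *wall_s, double *mono_s)
{
    struct clock_pair before, after;
    int i;

    read_clocks(&before);
    for (i = 0; i < count; i++) {
        struct timeval tv;

        tv.tv_sec  = usec / 1000000;
        tv.tv_usec = usec % 1000000;
        if (select(0, NULL, NULL, NULL, &tv) < 0)
            return -1;
    }
    read_clocks(&after);

    *wall_s = diff_seconds(&before.wall, &after.wall);
    *mono_s = diff_seconds(&before.mono, &after.mono);
    return 0;
}
```

With compare_clocks(20, 1000000, &w, &m), both values should come out near 20. If both are too small, select() itself returned early; if only the wall-clock value is off, the system time was adjusted during the run. Note that CLOCK_MONOTONIC is not stepped but is still subject to NTP rate adjustment, so it is not a fully independent reference.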

On the real hardware I also see some other time-related issues when the
problem occurs. Keyboard input often 'bounces': key presses are
detected two or more times, and some delay times are prolonged (!). I
have not yet been able to reproduce this in the virtual machine.

The problem will not always occur immediately after the system is
started but it may take several minutes until the first effects occur. I
have not tested this issue with other kernels yet but I will do so
during the weekend.

Are there any ideas what to do about this (besides buying a better BIOS
clock)? I would really like to have the NTP daemon running to keep the
system time accurate, but somehow it seems to affect wait queues in the
kernel pretty badly.

Bye,
          Jürgen

Jürgen Mell wrote:
Hi,

I have an application which connects via a network socket to a server
running on the same machine (IP 127.0.0.1). The application uses the
function 'select' to wait for new data from the server or until a
two-second timeout expires. This works well as long as there is no
network traffic on the external network interfaces (eth* or WLAN). When
there is traffic on the external interfaces, select no longer waits
but returns with a return code of zero, indicating that no data is
available on the socket. This happens almost immediately (after 8 to 9
microseconds) rather than after the specified two-second interval. The
timeout parameter of select is updated accordingly (it shows e.g. 1 s
999991 us).
Up to now I could not test this with another kernel but I will try to
do it this afternoon. Are there any known problems with select? Is
there any way to circumvent this?
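
One conceivable userspace workaround, sketched here under the assumption that CLOCK_MONOTONIC itself still advances correctly, is to compute a deadline once and call select() again whenever it returns zero before that deadline. The helper name wait_readable is hypothetical, and this does not address whatever goes wrong in the kernel.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <time.h>
#include <sys/select.h>

/* Hypothetical helper: wait until fd is readable or until timeout_ms
 * milliseconds have passed on CLOCK_MONOTONIC.
 * Returns 1 if readable, 0 on timeout, -1 on error. */
static int wait_readable(int fd, long timeout_ms)
{
    struct timespec now, deadline;

    if (fd < 0 || fd >= FD_SETSIZE) {
        errno = EINVAL;
        return -1;
    }
    if (clock_gettime(CLOCK_MONOTONIC, &deadline) < 0)
        return -1;
    deadline.tv_sec  += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec++;
        deadline.tv_nsec -= 1000000000L;
    }

    for (;;) {
        fd_set readfds;
        struct timeval tv;
        long sec, nsec;
        int ret;

        /* Work out how much time is left until the deadline. */
        if (clock_gettime(CLOCK_MONOTONIC, &now) < 0)
            return -1;
        sec  = (long)(deadline.tv_sec - now.tv_sec);
        nsec = deadline.tv_nsec - now.tv_nsec;
        if (nsec < 0) {
            sec--;
            nsec += 1000000000L;
        }
        if (sec < 0)
            return 0;               /* deadline has passed */

        tv.tv_sec  = sec;
        tv.tv_usec = nsec / 1000;
        FD_ZERO(&readfds);
        FD_SET(fd, &readfds);

        ret = select(fd + 1, &readfds, NULL, NULL, &tv);
        if (ret > 0)
            return 1;
        if (ret < 0 && errno != EINTR)
            return -1;
        /* ret == 0 or EINTR: go round again and re-check the deadline */
    }
}
```

The application would then call wait_readable(sock, 2000) in place of its single select() call. If select() keeps returning after a few microseconds, this loop degenerates into busy polling for the rest of the interval, so it is a stopgap at best.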

Any help would be greatly appreciated!

       Jürgen

--
Jürgen Mell (Software-Entwicklung)       mell@xxxxxxxxxxxxxxxxxxx
Tel.:  +49-511-762-18226                 http://www.hedrich-winding.com
FAX :  +49-511-762-18225
Mobil: +49-160-7428156
----------------------------------------------------------------------------
HEDRICH winding systems GmbH
An der Universität 2 (im PZH)
D-30823 Garbsen (GERMANY)
----------------------------------------------------------------------------
Geschäftsführer: Karsten Adam
Handelsregister: Wetzlar, HRB 4768
Steuernr.: 020/235/20110                 USt-IdNr.: DE 258258279
----------------------------------------------------------------------------

--
To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


--
-- Sujit K M

blog(http://kmsujit.blogspot.com/)






