1 year ago
#266431

Mario Klebsch
clock_gettime(CLOCK_MONOTONIC_RAW, ...) seems to freeze user space for about 5-6 minutes
I have an embedded system running Linux 2.6.30 on a PowerPC CPU (Freescale MPC5125). After writing new code for this device, I suddenly observed user space hangs lasting about 5-6 minutes.
It turned out that the new code calls clock_gettime(CLOCK_MONOTONIC_RAW, ...), and the system freezes after one of these clock_gettime() calls returns an implausible value.
Here is the recorded strace output (clock ID 0x4 is CLOCK_MONOTONIC_RAW, which this strace version does not decode):
14:39:48.769746 clock_gettime(0x4 /* CLOCK_??? */, {14496, 285316209}) = 0
14:39:48.782047 clock_gettime(0x4 /* CLOCK_??? */, {14496, 285627946}) = 0
14:39:48.782354 select(14, [4 5 6 7 9 10 12 13], NULL, NULL, {19, 999689}) = 1 (in [13], left {19, 853317})
14:39:48.928554 read(13, "\0\0\0\257\0\0\0\1", 8) = 8
14:39:48.928917 clock_gettime(0x4 /* CLOCK_??? */, {1266889381, 847702609}) = 0
14:45:15.612681 time(NULL) = 1646750715
14:45:15.613026 ...
14:45:15.818364 clock_gettime(0x4 /* CLOCK_??? */, {14819, 27615047}) = 0
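The implausible reading in the trace (1266889381 s right after 14496 s) could in principle be flagged in user space by comparing each reading with the previous one. The following is only a minimal sketch: the helper names and the idea of a caller-chosen maximum step are assumptions for illustration, not something the original code does, and it only detects the bad value rather than explaining the freeze.

```c
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Signed difference (later - earlier) in nanoseconds. */
static int64_t timespec_diff_ns(const struct timespec *earlier,
                                const struct timespec *later)
{
    return ((int64_t)later->tv_sec - (int64_t)earlier->tv_sec) * 1000000000LL
           + ((int64_t)later->tv_nsec - (int64_t)earlier->tv_nsec);
}

/*
 * Hypothetical helper: returns true if `cur` is not earlier than `prev`
 * and lies at most `max_step_ns` nanoseconds after it. The threshold is
 * an assumption; it must exceed the longest expected gap between reads.
 */
static bool step_is_plausible(const struct timespec *prev,
                              const struct timespec *cur,
                              int64_t max_step_ns)
{
    int64_t delta = timespec_diff_ns(prev, cur);
    return delta >= 0 && delta <= max_step_ns;
}
```

With the values from the trace, the step from {14496, 285316209} to {14496, 285627946} is 311737 ns and passes any reasonable threshold, while the step to {1266889381, 847702609} does not.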
The system continues to respond to ICMP echo requests, accepts new TCP connections, and even ACKs incoming data on these new connections. But all of user space seems to hang; there is not even an echo on the serial console.
After 5-6 minutes, the system recovers and continues to work. TCP sessions do not time out, all ssh sessions continue to work, and input on the serial console is echoed again.
The problem seems to go away when I use CLOCK_MONOTONIC instead of CLOCK_MONOTONIC_RAW. I could make this change in my own software and treat the problem as a known kernel bug on my system that is easy to avoid, but other programs on that system also use CLOCK_MONOTONIC_RAW and I cannot change them. At the very least, I would like to understand what is going wrong inside the kernel.
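For reference, the workaround in my own code amounts to something like the sketch below. The wrapper name monotonic_now_ns is hypothetical and chosen only for illustration; the only real change is the clock ID passed to clock_gettime(). On older glibc versions, clock_gettime() may require linking with -lrt.

```c
#define _POSIX_C_SOURCE 199309L
#include <stdint.h>
#include <time.h>

/*
 * Hypothetical wrapper: returns the monotonic time in nanoseconds,
 * or -1 if clock_gettime() fails. It reads CLOCK_MONOTONIC rather
 * than CLOCK_MONOTONIC_RAW, which is the substitution described above.
 */
static int64_t monotonic_now_ns(void)
{
    struct timespec ts;

    if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0)
        return -1;
    return (int64_t)ts.tv_sec * 1000000000LL + (int64_t)ts.tv_nsec;
}
```

This only sidesteps the problem in code I control; it does nothing for the other programs that call CLOCK_MONOTONIC_RAW directly.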
Unfortunately, the system does not have JTAG pads on its PCB, so I cannot debug the kernel while user space is blocked.
So, here are my questions:
- Has anybody ever observed problems like this?
- What might be the cause of clock_gettime() calls blocking all user space processes?
- What can I do to continue hunting this problem in the kernel?
linux-kernel
powerpc