
From: Mingming Cao <cmm@us.ibm.com>

I have seen a similar bug report from IBM.  That report also mentions
an oops but no kernel panic.
https://bugzilla.linux.ibm.com/show_bug.cgi?id=2050

I suspect there is a race in IPC_RMID handling.  While an IPC semaphore
is being removed, a process waiting on that semaphore might be woken up
by a signal before it is removed from the wait queue.  When it wakes
up, it finds that the ID has been removed but that it is still linked
on the wait queue, and the BUG() check catches this.

I think this can happen, but the race window is very small, so it is
hard to reproduce.  If that is the problem, then maybe we should not
call BUG() when the waiter on a removed ID was interrupted.
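
To make the intended rule easier to check, here is a minimal user-space
sketch of the condition the patch below tests once sem_lock() has
returned NULL.  The struct waiter type and the wakeup_state_is_bug()
name are hypothetical stand-ins for illustration only; just the
status/prev comparison mirrors the patch.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical, simplified stand-in for the kernel's struct sem_queue.
 * Only the two fields the check reads are modelled here.
 */
struct waiter {
	int status;	/* negative errno set at wakeup time */
	void *prev;	/* non-NULL while still linked on the wait queue */
};

/*
 * Returns true when the state seen after the semaphore ID has
 * disappeared is one the patch would still treat as a BUG():
 *   - the status is neither -EINTR nor -EIDRM, or
 *   - the status is -EIDRM but the waiter is still linked.
 * An interrupted waiter (-EINTR) that is still linked is tolerated.
 */
static bool wakeup_state_is_bug(const struct waiter *q)
{
	int error = q->status;

	if (error != -EINTR && error != -EIDRM)
		return true;
	if (error == -EIDRM && q->prev != NULL)
		return true;
	return false;
}
```

With this rule, a waiter woken by a signal while the ID is being
removed returns -EINTR instead of tripping BUG(), while an -EIDRM
wakeup that left the waiter linked is still reported.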



 25-akpm/ipc/sem.c |    6 ++++--
 1 files changed, 4 insertions(+), 2 deletions(-)

diff -puN ipc/sem.c~semop-race-fix ipc/sem.c
--- 25/ipc/sem.c~semop-race-fix	Thu Apr 24 17:45:04 2003
+++ 25-akpm/ipc/sem.c	Thu Apr 24 17:46:02 2003
@@ -1114,10 +1114,12 @@ asmlinkage long sys_semtimedop(int semid
 		lock_semundo();
 		sma = sem_lock(semid);
 		if(sma==NULL) {
-			if(queue.prev != NULL)
+			error = queue.status;	/* EIDRM or EINTR*/
+			if (((error != -EINTR) && (error != -EIDRM))
+				|| ((error == -EIDRM) && (queue.prev !=NULL))) {
 				BUG();
+			}
 			current->sysvsem.sleep_list = NULL;
-			error = -EIDRM;
 			goto out_semundo_free;
 		}
 		/*

_
