
From: "Chen, Kenneth W" <kenneth.w.chen@intel.com>

We hit a memory ordering race on the AIO ring buffer tail pointer between
aio_complete() and aio_read_evt().

On an architecture with a relaxed memory ordering model, such as IPF
(ia64), explicit memory barriers are required in an SMP execution
environment.  Consider the following case:

One CPU is executing a tight loop of aio_read_evt(), pulling events off the
ring buffer.  During that loop, another CPU is executing aio_complete(),
which puts an event into the ring buffer and then updates the tail pointer.
However, due to the relaxed memory ordering model, the tail pointer update
can become visible before the event data itself.  So the reading CPU sees
the updated tail pointer but picks up stale event data.

A memory barrier is required in this case between the event data update and
the tail pointer update.  The same is true for the head pointer, although
the race window there is practically nil.  For correctness, it is fixed
here as well.
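
For illustration only, here is a minimal userspace sketch of the same
ordering requirement, written with C11 atomics rather than the kernel
primitives.  The names (struct ring, ring_put, ring_get, RING_NR) are
hypothetical and are not taken from fs/aio.c.  The release fence in
ring_put() plays the role smp_wmb() plays in aio_complete(); the fence
before the head store in ring_get() plays the role of the barrier added
before the head update in aio_read_evt().  The acquire fence on the reader
side is something this sketch adds for C11 correctness; it is not part of
this patch.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define RING_NR 8	/* hypothetical ring size, one slot is kept empty */

struct event { long data; };

struct ring {
	_Atomic unsigned head;	/* next slot the consumer will read */
	_Atomic unsigned tail;	/* next slot the producer will fill */
	struct event events[RING_NR];
};

/* Producer side, loosely analogous to aio_complete(). */
static bool ring_put(struct ring *r, long data)
{
	unsigned tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
	unsigned next = (tail + 1) % RING_NR;

	/* Acquire pairs with the consumer's fence before its head store. */
	if (next == atomic_load_explicit(&r->head, memory_order_acquire))
		return false;	/* ring is full */

	r->events[tail].data = data;

	/* Make the event visible before the new tail (role of smp_wmb()). */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&r->tail, next, memory_order_relaxed);
	return true;
}

/* Consumer side, loosely analogous to aio_read_evt(). */
static bool ring_get(struct ring *r, long *out)
{
	unsigned head = atomic_load_explicit(&r->head, memory_order_relaxed);

	if (head == atomic_load_explicit(&r->tail, memory_order_relaxed))
		return false;	/* ring is empty */

	/* Do not read the event before the tail that announced it. */
	atomic_thread_fence(memory_order_acquire);
	*out = r->events[head].data;

	/* Finish reading the event before handing the slot back. */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&r->head, (head + 1) % RING_NR,
			      memory_order_relaxed);
	return true;
}
```

Without the release fence in ring_put(), a weakly ordered CPU is allowed to
make the tail store visible first, which is the stale-event case described
above.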

By the way, this bug has been fixed in the major distributor's 2.4.x kernel
series for a while, but somehow the fix hasn't been propagated to the 2.5
kernel yet.



 25-akpm/fs/aio.c |    7 +++----
 1 files changed, 3 insertions(+), 4 deletions(-)

diff -puN fs/aio.c~aio_complete-barrier-fix fs/aio.c
--- 25/fs/aio.c~aio_complete-barrier-fix	Tue Jul  8 16:06:56 2003
+++ 25-akpm/fs/aio.c	Tue Jul  8 16:06:56 2003
@@ -679,12 +679,11 @@ int aio_complete(struct kiocb *iocb, lon
 	/* after flagging the request as done, we
 	 * must never even look at it again
 	 */
-	barrier();
+	smp_wmb();	/* make event visible before updating tail */
 
 	info->tail = tail;
 	ring->tail = tail;
 
-	wmb();
 	put_aio_ring_event(event, KM_IRQ0);
 	kunmap_atomic(ring, KM_IRQ1);
 
@@ -721,7 +720,7 @@ static int aio_read_evt(struct kioctx *i
 	dprintk("in aio_read_evt h%lu t%lu m%lu\n",
 		 (unsigned long)ring->head, (unsigned long)ring->tail,
 		 (unsigned long)ring->nr);
-	barrier();
+
 	if (ring->head == ring->tail)
 		goto out;
 
@@ -732,7 +731,7 @@ static int aio_read_evt(struct kioctx *i
 		struct io_event *evp = aio_ring_event(info, head, KM_USER1);
 		*ent = *evp;
 		head = (head + 1) % info->nr;
-		barrier();
+		smp_mb(); /* finish reading the event before updating the head */
 		ring->head = head;
 		ret = 1;
 		put_aio_ring_event(evp, KM_USER1);

_
