
From: Ingo Molnar <mingo@elte.hu>

the attached patch fixes SMP/x86 systems to not hang during reboot in
certain circumstances, printing out an endless stream of:

 APIC error on CPU1: 00(04)
 APIC error on CPU1: 04(04)
 APIC error on CPU1: 04(04)

The bug is this: sys_reboot() calls device_shutdown(), which calls each
registered driver and shuts it down.  One of them is lapic0, which is
lethal: it will disable the local APIC on the currently executing CPU. 
This is buggy in a number of ways 1) there's no guarantee that the reboot
process wont migrate to another CPU at this point (it's still a fully
functioning kernel) 2) if another CPU tries to send an cross-CPU IPI
message after this CPU's local APIC has been disabled, that other CPU will
get an infinite stream of APIC error interrupts - locking the system up in
a flood of messages.  The reboot never happens.

The fix is to do what we always did: only disable the local APIC in the
very final moments, as part of the smp_send_stop() logic.

AFAICS, this is a fresh changeset that came in over the ACPI BK tree, it
has not been committed to Linus' tree yet. This bug was found via
PREEMPT_RT where the race is much more likely to trigger, but there's no
reason why this could not occur in the generic kernel.

Similarly, i think it's unfortunate that the IO-APIC driver has a shutdown
handler as well - while i cannot see a bug scenario right now, i think it's
quite fragile to disable all external interrupts (including the timer
interrupt!) at this point.  Again, this is best done in machine_restart(). 
(which already does it.)

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
---

 25-akpm/arch/i386/kernel/apic.c |    5 ++++-
 1 files changed, 4 insertions(+), 1 deletion(-)

diff -puN arch/i386/kernel/apic.c~x86-fix-reboot-hang--apic-errors arch/i386/kernel/apic.c
--- 25/arch/i386/kernel/apic.c~x86-fix-reboot-hang--apic-errors	2004-11-28 01:33:11.340422216 -0800
+++ 25-akpm/arch/i386/kernel/apic.c	2004-11-28 01:33:11.344421608 -0800
@@ -618,9 +618,12 @@ static int lapic_resume(struct sys_devic
 }
 
 
+/*
+ * This device has no shutdown method - fully functioning local APICs
+ * are needed on every CPU up until machine_halt/restart/poweroff.
+ */
 static struct sysdev_class lapic_sysclass = {
 	set_kset_name("lapic"),
-	.shutdown	= lapic_shutdown,
 	.resume		= lapic_resume,
 	.suspend	= lapic_suspend,
 };
_
