pandora-kernel.git
16 years ago  x86: convert some existing cpuid disable options to new generic bitmap
Andi Kleen [Wed, 30 Jan 2008 12:33:20 +0000 (13:33 +0100)]
x86: convert some existing cpuid disable options to new generic bitmap

This converts nofxsr, mem=nopentium and nosep to use the new
generic cpuid disable bitmap instead of using their own variables.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago  x86: add framework to disable CPUID bits on the command line
Andi Kleen [Wed, 30 Jan 2008 12:33:20 +0000 (13:33 +0100)]
x86: add framework to disable CPUID bits on the command line

There are already various options to disable specific cpuid bits
on the command line. They all use their own variable. Add a generic
mask to make this easier in the future.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
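The generic mask described in the two commits above can be sketched in user-space C as follows. This is only an illustration of the idea, not the kernel's code: the names cleared_caps, setup_clear_cap, parse_disable_option and apply_cleared_caps, and the NCAPINTS value, are assumptions made for this sketch. The bit positions are the CPUID leaf 1 EDX positions for PSE, SEP and FXSR.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical feature numbering for this sketch: word * 32 + bit. */
#define NCAPINTS     4
#define FEATURE_PSE  (0 * 32 + 3)
#define FEATURE_SEP  (0 * 32 + 11)
#define FEATURE_FXSR (0 * 32 + 24)

/* One shared mask of features to clear, instead of one variable per option. */
static uint32_t cleared_caps[NCAPINTS];

/* Mark a feature as disabled in the shared mask. */
void setup_clear_cap(unsigned int feature)
{
	cleared_caps[feature / 32] |= UINT32_C(1) << (feature % 32);
}

/* Map a boot option to the feature it disables; returns 0 if unknown. */
int parse_disable_option(const char *opt)
{
	if (strcmp(opt, "nofxsr") == 0)
		setup_clear_cap(FEATURE_FXSR);
	else if (strcmp(opt, "nosep") == 0)
		setup_clear_cap(FEATURE_SEP);
	else if (strcmp(opt, "mem=nopentium") == 0)
		setup_clear_cap(FEATURE_PSE);
	else
		return 0;
	return 1;
}

/* Apply the mask to the capability words obtained from CPUID. */
void apply_cleared_caps(uint32_t caps[NCAPINTS])
{
	for (int i = 0; i < NCAPINTS; i++)
		caps[i] &= ~cleared_caps[i];
}
```

With this shape, a new disable option only needs one more branch in the parser; the code that applies the mask stays unchanged.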
16 years ago  x86: fill in missing pv_mmu_ops entries for PAGETABLE_LEVELS >= 3
Eduardo Habkost [Wed, 30 Jan 2008 12:33:20 +0000 (13:33 +0100)]
x86: fill in missing pv_mmu_ops entries for PAGETABLE_LEVELS >= 3

This finally makes paravirt-ops able to compile and boot under x86_64.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago  x86: don't set pagetable_setup_{start,done} hooks on 64-bit
Eduardo Habkost [Wed, 30 Jan 2008 12:33:20 +0000 (13:33 +0100)]
x86: don't set pagetable_setup_{start,done} hooks on 64-bit

paravirt_pagetable_setup_{start,done}() are not used (yet) under x86_64,
and native_pagetable_setup_{start,done}() don't exist on x86_64. So they
don't need to be set.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago  x86: include/asm-x86/paravirt.h: x86_64 mmu operations
Eduardo Habkost [Wed, 30 Jan 2008 12:33:20 +0000 (13:33 +0100)]
x86: include/asm-x86/paravirt.h: x86_64 mmu operations

Add .set_pgd field to pv_mmu_ops.

Implement pud_val(), __pud(), set_pgd(), pud_clear(), pgd_clear().

pud_clear() and pgd_clear() are implemented simply using set_pud()
and set_pmd(). They don't have a field at pv_mmu_ops.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago  x86: change function orders in paravirt.h
Glauber de Oliveira Costa [Wed, 30 Jan 2008 12:33:19 +0000 (13:33 +0100)]
x86: change function orders in paravirt.h

__pmd, pmd_val and set_pud are used before they are defined (as static).
Move them up a little in the file so that this no longer happens.

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago  x86: provide __parainstructions section
Glauber de Oliveira Costa [Wed, 30 Jan 2008 12:33:19 +0000 (13:33 +0100)]
x86: provide __parainstructions section

This patch adds the __parainstructions section to vmlinux.lds.S.
It's needed for the patching system.

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago  x86: add asm_offset PARAVIRT constants
Glauber de Oliveira Costa [Wed, 30 Jan 2008 12:33:19 +0000 (13:33 +0100)]
x86: add asm_offset PARAVIRT constants

This patch adds the constants PARAVIRT needs in asm_offsets_64.c.

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago  x86: fill pv_cpu_ops structure with cr8 fields
Glauber de Oliveira Costa [Wed, 30 Jan 2008 12:33:19 +0000 (13:33 +0100)]
x86: fill pv_cpu_ops structure with cr8 fields

This patch fills in the read and write cr8 fields with their
native version.

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago  x86: provide read and write cr8 paravirt hooks
Glauber de Oliveira Costa [Wed, 30 Jan 2008 12:33:19 +0000 (13:33 +0100)]
x86: provide read and write cr8 paravirt hooks

Since the cr8 manipulation functions ended up staying in the tree,
they can't be defined just when PARAVIRT is off: In this patch,
those functions are defined for the PARAVIRT case too.

[ mingo@elte.hu: fixes ]

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago  x86: puts read and write cr8 into pv_cpu_ops
Glauber de Oliveira Costa [Wed, 30 Jan 2008 12:33:19 +0000 (13:33 +0100)]
x86: puts read and write cr8 into pv_cpu_ops

This patch adds room for the read_cr8 and write_cr8 functions back in
the pv_cpu_ops struct.

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago  x86: put generic mm_hooks include into PARAVIRT
Glauber de Oliveira Costa [Wed, 30 Jan 2008 12:33:19 +0000 (13:33 +0100)]
x86: put generic mm_hooks include into PARAVIRT

With PARAVIRT, we actually have arch_{dup,exit}_mmap functions,
so we can't include the generic header

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago  x86: provide a native_init_IRQ function on 64-bit
Glauber de Oliveira Costa [Wed, 30 Jan 2008 12:33:19 +0000 (13:33 +0100)]
x86: provide a native_init_IRQ function on 64-bit

x86_64 lacks a native_init_IRQ() function, so we turn the arch's
init_IRQ() function into a native construct

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago  x86: add stringify header
Glauber de Oliveira Costa [Wed, 30 Jan 2008 12:33:19 +0000 (13:33 +0100)]
x86: add stringify header

We use a __stringify construction at paravirt_patch_64.c.
It's better practice to include the stringify header directly

Signed-off-by: Glauber de Oliveira Costa <gcosta@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago  x86: checking aperture report for node instead
Yinghai Lu [Wed, 30 Jan 2008 12:33:18 +0000 (13:33 +0100)]
x86: checking aperture report for node instead

Currently, when the GART IOMMU has been enabled by the BIOS or previously,
we get:

"
Checking aperture...
CPU 0: aperture @4000000 size 64MB
CPU 1: aperture @4000000 size 64MB
"
We should use Node instead.

With this patch we get:
"
Checking aperture...
Node 0: aperture @4000000 size 64MB
Node 1: aperture @4000000 size 64MB
"

Signed-off-by: Yinghai Lu <yinghai.lu@sun.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago  x86: move select_idle_routine() call after detect_ht()
Hiroshi Shimamoto [Wed, 30 Jan 2008 12:33:18 +0000 (13:33 +0100)]
x86: move select_idle_routine() call after detect_ht()

Move the select_idle_routine() call to after the detect_ht() call at
identify_cpu() on 64-bit.

This change is for printing the polling idle and HT enabled warning
message properly.

Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago  x86: move warning message of polling idle and HT enabled
Hiroshi Shimamoto [Wed, 30 Jan 2008 12:33:18 +0000 (13:33 +0100)]
x86: move warning message of polling idle and HT enabled

The warning message at idle_setup() is never shown because
smp_num_sibling hasn't been updated at this point yet.

Move this polling idle and HT enabled warning to select_idle_routine().
I also implement this warning on 64-bit kernel.

Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago  x86: msr for AMD Fam 10h mmio
Yinghai Lu [Wed, 30 Jan 2008 12:33:18 +0000 (13:33 +0100)]
x86: msr for AMD Fam 10h mmio

Signed-off-by: Yinghai Lu <yinghai.lu@sun.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago  x86: fix unconditional arch/x86/kernel/pcspeaker.o compiling
Michael Opdenacker [Wed, 30 Jan 2008 12:33:18 +0000 (13:33 +0100)]
x86: fix unconditional arch/x86/kernel/pcspeaker.o compiling

do not add the pcspkr platform device if pcspkr support is disabled.

Signed-off-by: Michael Opdenacker <michael@free-electrons.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago  x86: only call early_init_amd one time
Yinghai Lu [Wed, 30 Jan 2008 12:33:18 +0000 (13:33 +0100)]
x86: only call early_init_amd one time

Andi's patch
"
    x86: move X86_FEATURE_CONSTANT_TSC into early cpu feature detection

    Need this in the next patch in time_init and that happens early.

    This includes a minor fix on i386 where early_intel_workarounds()
    [which is now called early_init_intel] really executes early as
    the comments say.
"
ends up calling early_init_amd twice, once in early_identify_cpu and once
in identify_cpu.

This patch removes the call in identify_cpu.

Signed-off-by: Yinghai Lu <yinghai.lu@sun.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago  x86, 32-bit: trim memory not covered by wb mtrrs
Jesse Barnes [Wed, 30 Jan 2008 12:33:18 +0000 (13:33 +0100)]
x86, 32-bit: trim memory not covered by wb mtrrs

On some machines, buggy BIOSes don't properly set up WB MTRRs to cover all
available RAM, meaning the last few megs (or even gigs) of memory will be
marked uncached.  Since Linux tends to allocate from high memory addresses
first, this causes the machine to be unusably slow as soon as the kernel
starts really using memory (i.e.  right around init time).

This patch works around the problem by scanning the MTRRs at boot and
figuring out whether the current end_pfn value (set up by early e820 code)
goes beyond the highest WB MTRR range, and if so, trimming it to match.  A
fairly obnoxious KERN_WARNING is printed too, letting the user know that
not all of their memory is available due to a likely BIOS bug.

Something similar could be done on i386 if needed, but the boot ordering
would be slightly different, since the MTRR code on i386 depends on the
boot_cpu_data structure being set up.

This patch fixes a bug in the last patch that caused the code to run on
non-Intel machines (AMD machines apparently don't need it and it's untested
on other non-Intel machines, so best keep it off).

Further enhancements and fixes from:

  Yinghai Lu <Yinghai.Lu@Sun.COM>
  Andi Kleen <ak@suse.de>

Signed-off-by: Jesse Barnes <jesse.barnes@intel.com>
Tested-by: Justin Piszcz <jpiszcz@lucidpixels.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Yinghai Lu <yhlu.kernel@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
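The trimming step described in the commit above can be sketched roughly as follows. This is a simplified user-space illustration under stated assumptions, not the kernel's implementation: struct mtrr_range and both function names are hypothetical, ranges are given directly in page frames, and holes between WB ranges are ignored. Leaving end_pfn unchanged when no WB range is found is also an assumption of this sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of one variable-range MTRR. */
struct mtrr_range {
	uint64_t base_pfn;   /* first page frame covered */
	uint64_t size_pfn;   /* number of page frames covered */
	int      write_back; /* nonzero if the range type is WB */
};

/* Highest page frame number (exclusive) covered by any WB range. */
uint64_t highest_wb_pfn(const struct mtrr_range *r, size_t n)
{
	uint64_t highest = 0;

	for (size_t i = 0; i < n; i++) {
		uint64_t end;

		if (!r[i].write_back)
			continue;
		end = r[i].base_pfn + r[i].size_pfn;
		if (end > highest)
			highest = end;
	}
	return highest;
}

/*
 * Return the possibly trimmed end_pfn. If no WB range was found at all,
 * leave end_pfn untouched (an assumption of this sketch).
 */
uint64_t trim_end_pfn(uint64_t end_pfn, const struct mtrr_range *r, size_t n)
{
	uint64_t highest = highest_wb_pfn(r, n);

	if (highest != 0 && end_pfn > highest)
		return highest;
	return end_pfn;
}
```

A caller would compare the returned value with the original end_pfn and print the warning only when the two differ.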
16 years ago  x86: print which shared library/executable faulted in segfault etc. messages v3
Andi Kleen [Wed, 30 Jan 2008 12:33:18 +0000 (13:33 +0100)]
x86: print which shared library/executable faulted in segfault etc. messages v3

They now look like:

hal-resmgr[13791]: segfault at 3c rip 2b9c8caec182 rsp 7fff1e825d30 error 4 in libacl.so.1.1.0[2b9c8caea000+6000]

This makes it easier to pinpoint bugs to specific libraries.

Printing the offset into a mapping also makes it possible to find the
correct fault point in a library even with randomized mappings. Previously
there was no way to find the correct code address inside
the randomized mapping.

Relies on earlier patch to shorten the printk formats.

They are often now longer than 80 characters, but I think that's worth it.

[includes fix from Eric Dumazet to check d_path error value]

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
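To illustrate how the bracketed suffix in the message above can be used, here is a small user-space sketch. The struct and function names are hypothetical; only the "name[start+size]" layout and the example values are taken from the commit message. Subtracting the mapping start from the faulting instruction pointer gives an offset that stays the same no matter where the library was loaded.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the mapping that contains the faulting address. */
struct mapping {
	const char *name;
	uint64_t start; /* first address of the mapping */
	uint64_t end;   /* one past the last address */
};

/* Offset of the faulting ip inside the mapping; independent of load address. */
uint64_t fault_offset(const struct mapping *m, uint64_t ip)
{
	return ip - m->start;
}

/* Format the " in name[start+size]" suffix shown in the message above. */
int format_fault_location(char *buf, size_t len, const struct mapping *m)
{
	return snprintf(buf, len, " in %s[%llx+%llx]", m->name,
			(unsigned long long)m->start,
			(unsigned long long)(m->end - m->start));
}
```

For the example message, rip 2b9c8caec182 minus the mapping start 2b9c8caea000 gives offset 0x2182, which can then be looked up in a disassembly of the library.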
16 years ago  x86: don't disable the APIC if it hasn't been mapped yet
Andi Kleen [Wed, 30 Jan 2008 12:33:17 +0000 (13:33 +0100)]
x86: don't disable the APIC if it hasn't been mapped yet

When the kernel panics early for some unrelated reason,
there would eventually be an early exception inside panic because
clear_local_APIC tried to disable the not-yet-mapped APIC.
Check for that explicitly.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago  x86: optimize lock prefix switching to run less frequently
Andi Kleen [Wed, 30 Jan 2008 12:33:17 +0000 (13:33 +0100)]
x86: optimize lock prefix switching to run less frequently

On VMs implemented using JITs that cache translated code, changing the lock
prefixes is quite a costly operation that forces the JIT to throw away and
retranslate a lot of code.

Previously an SMP kernel would rewrite the locks once for each CPU, which
is quite unnecessary. This patch changes the code to never switch at boot in
the normal case (SMP kernel booting with >1 CPU) or only once for an SMP
kernel on UP.

This makes a significant difference in boot-up performance on AMD SimNow!
I also expect it to be a little faster on native systems because an SMP
switch does a lot of text_poke()s, each of which synchronizes the pipeline.

v1->v2: Rename max_cpus
v1->v2: Fix off by one in UP check (Thomas Gleixner)

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago  x86: replace hard coded reservations in 64-bit early boot code with dynamic table
Andi Kleen [Wed, 30 Jan 2008 12:33:17 +0000 (13:33 +0100)]
x86: replace hard coded reservations in 64-bit early boot code with dynamic table

On x86-64 there are several memory allocations before bootmem. To avoid
them stomping on each other they used to be all hard coded in bad_area().
Replace this with an array that is filled as needed.

This cleans up the code considerably and allows its use to be expanded.

Cc: peterz@infradead.org
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
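The "array that is filled as needed" from the commit above can be sketched like this. It is a minimal user-space illustration, not the kernel's code: the table size, the struct layout and all three function names are assumptions for this example. Early allocators record the ranges they take, and later allocators skip over anything already recorded.

```c
#include <stdint.h>

#define MAX_EARLY_RES 20 /* table size chosen arbitrarily for this sketch */

/* One reserved physical range, [start, end). */
struct early_res {
	uint64_t start;
	uint64_t end;
};

static struct early_res early_res_table[MAX_EARLY_RES];
static int early_res_count;

/* Record a reserved range; returns -1 if the table is full. */
int reserve_early(uint64_t start, uint64_t end)
{
	if (early_res_count >= MAX_EARLY_RES)
		return -1;
	early_res_table[early_res_count].start = start;
	early_res_table[early_res_count].end = end;
	early_res_count++;
	return 0;
}

/* If [addr, addr + size) hits a reserved range, return 1 and its end. */
int overlaps_early_res(uint64_t addr, uint64_t size, uint64_t *next)
{
	uint64_t last = addr + size;

	for (int i = 0; i < early_res_count; i++) {
		const struct early_res *r = &early_res_table[i];

		if (last > r->start && addr < r->end) {
			*next = r->end;
			return 1;
		}
	}
	return 0;
}

/* Find the first free block of 'size' bytes in [start, limit). */
int find_free_early(uint64_t start, uint64_t limit, uint64_t size,
		    uint64_t *out)
{
	uint64_t addr = start;

	while (addr + size <= limit) {
		uint64_t next;

		if (!overlaps_early_res(addr, size, &next)) {
			*out = addr;
			return 0;
		}
		addr = next; /* skip past the conflicting reservation */
	}
	return -1;
}
```

Each conflict moves the candidate address to the end of the conflicting range, so the search always makes forward progress.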
16 years ago  x86: unify printk strings in fault_32|64.c
Harvey Harrison [Wed, 30 Jan 2008 12:33:16 +0000 (13:33 +0100)]
x86: unify printk strings in fault_32|64.c

Adding the address of the faulting library missed removing a
line ending from X86_32.

Also update the shorter printk format for X86_32 in fault_64.c
to make it easier to see the remaining differences.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago  x86: use shorter addresses in i386 segfault printks
Andi Kleen [Wed, 30 Jan 2008 12:33:16 +0000 (13:33 +0100)]
x86: use shorter addresses in i386 segfault printks

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago  x86: use the correct cpuid method to detect MWAIT support for C states
Andi Kleen [Wed, 30 Jan 2008 12:33:16 +0000 (13:33 +0100)]
x86: use the correct cpuid method to detect MWAIT support for C states

Previously there was an AMD-specific quirk to handle the case of
AMD Fam10h MWAIT not supporting any C states. But it turns out
that CPUID already has ways to directly detect that without
using special quirks.

The new code simply checks if MWAIT supports at least C1 and doesn't
use it if it doesn't. No more vendor-specific code.

Note this does not simply clear MWAIT because MWAIT can still be
useful even without C states.

Credit goes to Ben Serebrin for pointing out the (nearly) obvious.

Cc: "Andreas Herrmann" <andreas.herrmann3@amd.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
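The check described in the commit above can be sketched as a pure function. This is an illustration rather than the patch itself: the function and macro names are made up for this example, and it assumes the MONITOR/MWAIT CPUID leaf (leaf 5) layout in which EDX bits 7:4 hold the number of C1 sub-states. The caller is assumed to have executed CPUID already and to pass in the raw values.

```c
#include <stdint.h>

#define MWAIT_CPUID_LEAF  5u    /* MONITOR/MWAIT leaf */
#define MWAIT_EDX_C1_MASK 0xf0u /* EDX[7:4]: number of C1 sub-states (assumed layout) */

/*
 * max_leaf:  highest basic CPUID leaf the CPU reports
 * leaf5_edx: EDX value returned by CPUID leaf 5
 * Returns nonzero if MWAIT can be used to enter at least C1.
 */
int mwait_supports_c1(uint32_t max_leaf, uint32_t leaf5_edx)
{
	if (max_leaf < MWAIT_CPUID_LEAF)
		return 0; /* leaf 5 not available, nothing to go on */
	return (leaf5_edx & MWAIT_EDX_C1_MASK) != 0;
}
```

Because the decision depends only on CPUID output, no vendor check is needed, which matches the intent stated in the commit message.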
16 years ago  x86: move MWAIT idle check to generic CPU initialization on 32-bit
Andi Kleen [Wed, 30 Jan 2008 12:33:16 +0000 (13:33 +0100)]
x86: move MWAIT idle check to generic CPU initialization on 32-bit

Previously it was only run for Intel CPUs, but AMD Fam10h implements MWAIT too.

This matches 64bit behaviour.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago  x86: rename stack_pointer to kernel_trap_sp
Harvey Harrison [Wed, 30 Jan 2008 12:33:16 +0000 (13:33 +0100)]
x86: rename stack_pointer to kernel_trap_sp

Choose a less generic name for such a special case.  Add
a comment explaining the odd use in X86_32.

Change the one user of stack_pointer.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago  x86: clean up ptrace.h
Harvey Harrison [Wed, 30 Jan 2008 12:33:16 +0000 (13:33 +0100)]
x86: clean up ptrace.h

Leave the definition of pt_regs in its own section, move all kernel
code to the section afterwards, unify prototype definitions, and add some
conditional prototypes to make it clear what is defined only on
32-bit or only on 64-bit.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago  x86: unify pt_regs accessors ptrace.h
Harvey Harrison [Wed, 30 Jan 2008 12:33:16 +0000 (13:33 +0100)]
x86: unify pt_regs accessors ptrace.h

Unify the definition of:
v8086_mode
user_mode
user_mode_vm
stack_pointer
instruction_pointer
frame_pointer

in ptrace.h to make it clear where the differences are between
32 and 64 bit.  Changes macros to static inlines as well.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
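As an illustration of the macro-to-static-inline change mentioned in the commit above, here is a reduced sketch. The struct regs_sketch type and the *_sketch function names are hypothetical and far smaller than the real pt_regs; the privilege test shown (requested privilege level in the low two bits of cs equal to 3) is the usual x86 convention, but the exact per-architecture tests in ptrace.h are not reproduced here.

```c
/* Hypothetical, heavily reduced register frame for this sketch. */
struct regs_sketch {
	unsigned long ip; /* instruction pointer */
	unsigned long sp; /* stack pointer */
	unsigned long bp; /* frame pointer */
	unsigned long cs; /* code segment selector */
};

#define RPL_MASK 0x3UL /* low two selector bits: requested privilege level */
#define USER_RPL 0x3UL /* ring 3 */

/* Static inlines type-check their argument, unlike the macros they replace. */
static inline int user_mode_sketch(const struct regs_sketch *regs)
{
	return (regs->cs & RPL_MASK) == USER_RPL;
}

static inline unsigned long instruction_pointer_sketch(const struct regs_sketch *regs)
{
	return regs->ip;
}

static inline unsigned long frame_pointer_sketch(const struct regs_sketch *regs)
{
	return regs->bp;
}
```

Passing anything other than a pointer to the register frame is now a compile-time error instead of a silently expanding macro.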
16 years ago  x86: kdump failure
Hiroshi Shimamoto [Wed, 30 Jan 2008 12:33:16 +0000 (13:33 +0100)]
x86: kdump failure

kdump needs ELF_CORE_COPY_REGS in crash_save_cpu().
The lack of this macro causes the following BUG:

 SysRq : Trigger a crashdump
 ------------[ cut here ]------------
 kernel BUG at include/linux/elfcore.h:105!
 invalid opcode: 0000 [1] PREEMPT SMP

Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago  x86_32: remove the useless NR_syscalls macro
Dmitri Vorobiev [Wed, 30 Jan 2008 12:33:16 +0000 (13:33 +0100)]
x86_32: remove the useless NR_syscalls macro

This is against current x86.git.

The size of the system call table for 32-bit x86 kernels is obtained by
compile-time calculation from the sys_call_table array, not from the value
to which the NR_syscalls macro expands. This trivial patch removes the
fossil macro.

Manually tested by grepping the x86 files for the "NR_syscalls" string.
No relevant use cases found.

Build-tested using allyesconfig, allnoconfig and a couple of randconfig
instances. All builds successfully finished.

Runtime test performed using a stripped-down Debian-ish config. The system
booted successfully.

Signed-off-by: Dmitri Vorobiev <dmitri.vorobiev@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
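The point made in the commit above, that the table size comes from the table itself and not from a separately maintained constant, can be shown with a C analogue. Everything here is a hypothetical illustration: the three stub handlers, the call_table name and the bounds-check helper are invented for this sketch and do not correspond to the kernel's entry code.

```c
#include <stddef.h>

typedef long (*syscall_fn)(void);

/* Hypothetical stub handlers standing in for real system calls. */
long sys_stub_a(void) { return 0; }
long sys_stub_b(void) { return 1; }
long sys_stub_c(void) { return 2; }

/* The table itself is the single source of truth for its length. */
const syscall_fn call_table[] = {
	sys_stub_a,
	sys_stub_b,
	sys_stub_c,
};

/* Entry count derived at compile time; no separate constant to keep in sync. */
#define CALL_TABLE_ENTRIES (sizeof(call_table) / sizeof(call_table[0]))

/* Bounds check that follows the table automatically when entries are added. */
int syscall_nr_valid(unsigned long nr)
{
	return nr < CALL_TABLE_ENTRIES;
}
```

Adding a fourth handler to the initializer changes CALL_TABLE_ENTRIES with no further edits, which is why a hand-maintained count is redundant.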
16 years ago  x86: 64-bit, remove redundant cpu_has_ definitions
Kyle McMartin [Wed, 30 Jan 2008 12:33:15 +0000 (13:33 +0100)]
x86: 64-bit, remove redundant cpu_has_ definitions

PSE, PGE, XMM, XMM2, and FXSR are defined as required features, and
will be optimized to a constant at compile time. Remove their redundant
definitions.

Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago  x86: fixup NR-CPUS patch for numa
travis@sgi.com [Wed, 30 Jan 2008 12:33:15 +0000 (13:33 +0100)]
x86: fixup NR-CPUS patch for numa

This patch removes the EXPORT_SYMBOL for:

x86_cpu_to_node_map_init
x86_cpu_to_node_map_early_ptr

... thus fixing the section mismatch problem.

Also, the mem -> node hash lookup is fixed.

Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago  x86/paravirt: make set_pud operation common
Jeremy Fitzhardinge [Wed, 30 Jan 2008 12:33:15 +0000 (13:33 +0100)]
x86/paravirt: make set_pud operation common

Remove duplicate set_pud()s.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago  x86/paravirt: make set_pmd operation common
Jeremy Fitzhardinge [Wed, 30 Jan 2008 12:33:15 +0000 (13:33 +0100)]
x86/paravirt: make set_pmd operation common

Remove duplicate set_pmd()s.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago  x86/paravirt: make set_pte operations common
Jeremy Fitzhardinge [Wed, 30 Jan 2008 12:33:15 +0000 (13:33 +0100)]
x86/paravirt: make set_pte operations common

Remove duplicate set_pte* operations.  PAE still needs to have special
variants of some of these because it can't atomically update a 64-bit
pte, so there's still some duplication.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago  x86/paravirt: common implementation for pmd value ops
Jeremy Fitzhardinge [Wed, 30 Jan 2008 12:33:15 +0000 (13:33 +0100)]
x86/paravirt: common implementation for pmd value ops

Remove duplicate __pmd/pmd_val functions.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago  x86/paravirt: common implementation for pgd value ops
Jeremy Fitzhardinge [Wed, 30 Jan 2008 12:33:15 +0000 (13:33 +0100)]
x86/paravirt: common implementation for pgd value ops

Remove duplicate __pgd/pgd_val functions.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago  x86/paravirt: common implementation for pte value ops
Jeremy Fitzhardinge [Wed, 30 Jan 2008 12:33:15 +0000 (13:33 +0100)]
x86/paravirt: common implementation for pte value ops

Remove duplicate __pte/pte_val functions.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago  x86/paravirt: rearrange common mmu_ops
Jeremy Fitzhardinge [Wed, 30 Jan 2008 12:33:15 +0000 (13:33 +0100)]
x86/paravirt: rearrange common mmu_ops

Rearrange the various pagetable mmu_ops to remove duplication.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago  add native_pud_val and _pmd_val for 2 and 3
Jeremy Fitzhardinge [Wed, 30 Jan 2008 12:33:14 +0000 (13:33 +0100)]
add native_pud_val and _pmd_val for 2 and 3

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago  arch/x86/mm/numa_64.c: section fix
Andrew Morton [Wed, 30 Jan 2008 12:33:14 +0000 (13:33 +0100)]
arch/x86/mm/numa_64.c: section fix

WARNING: vmlinux.o(__ksymtab+0x670): Section mismatch: reference to .init.data:x86_cpu_to_node_map_init (between '__ksymtab_x86_cpu_to_node_map_init' and '__ksymtab_node_data')

Cc: Matthew Dobson <colpatch@us.ibm.com>
Cc: Mike Travis <travis@sgi.com>
Cc: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago  x86: reduce memory and intra-node effects
Mike Travis [Wed, 30 Jan 2008 12:33:14 +0000 (13:33 +0100)]
x86: reduce memory and intra-node effects

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago  x86: adjust/fix LDT handling for Xen
Jan Beulich [Wed, 30 Jan 2008 12:33:14 +0000 (13:33 +0100)]
x86: adjust/fix LDT handling for Xen

Based on patch from Jan Beulich <jbeulich@novell.com>.

Don't rely on kmalloc(PAGE_SIZE) returning PAGE_SIZE aligned memory
(Xen requires GDT *and* LDT to be page-aligned). Using the page
allocator interface also removes the (albeit small) slab allocator
overhead. The same change is made for 64-bit for consistency.

Further, the Xen hypercall interface expects the LDT address to be
virtual, not machine.

[ Adjusted to unified ldt.c - Jeremy ]

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Acked-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago  x86-64: clean up linker script
Jan Beulich [Wed, 30 Jan 2008 12:33:14 +0000 (13:33 +0100)]
x86-64: clean up linker script

Remove the dead .text.lock. Move _etext and __{start,stop}___ex_table
into their sections.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago  x86: serverworks: IRQ routing needs no _p
Alan Cox [Wed, 30 Jan 2008 12:33:14 +0000 (13:33 +0100)]
x86: serverworks: IRQ routing needs no _p

I can find no reason for the _p on the serverworks IRQ routing logic, and
a review of the documentation contains no indication that any such delay
is needed, so let's try this.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago  x86: isolate PIC/PIT in/out calls
Alan Cox [Wed, 30 Jan 2008 12:33:14 +0000 (13:33 +0100)]
x86: isolate PIC/PIT in/out calls

Rather than remove and/or mangle inb_p/outb_p we want to remove the use
of them from inappropriate places. For the PIC/PIT this may eventually
depend on 32/64bitism or similar so start by adding inb/outb_pit and
inb/outb_pic so that we can make them use any scheme we settle on without
disturbing the existing, correct (for ISA), port 0x80 usage. (eg we can
make inb_pit use udelay without messing up inb_p).

Floppy already does this for the fdc. That really only leaves the CMOS as
a core logic item to tackle, and bits of parallel port handling in the
chipset layers.

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago  x86: fix singlestep handling in reenter_kprobe
Abhishek Sagar [Wed, 30 Jan 2008 12:33:13 +0000 (13:33 +0100)]
x86: fix singlestep handling in reenter_kprobe

Highlight peculiar cases in single-step kprobe handling.

In reenter_kprobe(), a breakpoint in KPROBE_HIT_SS case can only occur
when single-stepping a breakpoint on which a probe was installed. Since
such probes are single-stepped inline, identifying these cases is
unambiguous. All other cases leading up to KPROBE_HIT_SS are possible
bugs. Identify and WARN_ON such cases.

Signed-off-by: Abhishek Sagar <sagar.abhishek@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago  x86: fix synchronize_rcu(): high latency on idle system
Benjamin LaHaise [Wed, 30 Jan 2008 12:33:13 +0000 (13:33 +0100)]
x86: fix synchronize_rcu(): high latency on idle system

An otherwise idle system takes about 3 ticks per network
interface in unregister_netdev() due to multiple calls to synchronize_rcu(),
which adds up to quite a few seconds for tearing down thousands of
interfaces.  By flushing pending rcu callbacks in the idle loop, the system
makes progress hundreds of times faster.  If this is indeed a sane thing to do,
it probably needs to be done for other architectures than x86.  And yes, the
network stack shouldn't call synchronize_rcu() quite so much, but fixing that
is a little more involved.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago  x86: add ENDPROC() markers
John Reiser [Wed, 30 Jan 2008 12:33:13 +0000 (13:33 +0100)]
x86: add ENDPROC() markers

ENDPROC() was not used everywhere.  Some code used just END() instead,
while other code used nothing.  um/sys-i386/checksum.S didn't #include
<linux/linkage.h>.  I also got confused because gcc puts the
.type near the ENTRY, while ENDPROC puts it on the opposite end.

Signed-off-by: John Reiser <jreiser@BitWagon.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago  x86: function ifdefs in fault_32|64.c
Harvey Harrison [Wed, 30 Jan 2008 12:33:13 +0000 (13:33 +0100)]
x86: function ifdefs in fault_32|64.c

Add caller of is_errata93() to X86_32, ifdef'd to do
nothing.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago  x86: last of trivial fault_32|64.c unification
Harvey Harrison [Wed, 30 Jan 2008 12:33:13 +0000 (13:33 +0100)]
x86: last of trivial fault_32|64.c unification

Comments, indentation, printk format.

Uses task_pid_nr() on X86_64 now, but this is always defined
to task->pid.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago  x86: do_page_fault small unification
Harvey Harrison [Wed, 30 Jan 2008 12:33:12 +0000 (13:33 +0100)]
x86: do_page_fault small unification

Copy the prefetch of mmap_sem from X86_64 and move the
notify_page_fault check (soon to be kprobe_handle_fault) out of
the unlikely if() statement.

This makes the X86_32|64 pagefault handlers closer to each
other.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago  x86: remove last user of get_segment_eip
Harvey Harrison [Wed, 30 Jan 2008 12:33:12 +0000 (13:33 +0100)]
x86: remove last user of get_segment_eip

is_prefetch was the last user of get_segment_eip and only on
X86_32.  This function returned the faulting instruction's
address and set the upper segment limit.

Instead, use the convert_ip_to_linear helper and rely on
probe_kernel_address to do the segment checks which was
already done everywhere the segment limit was being checked
on X86_32.

Remove get_segment_eip as well.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago  x86: common x86_32|64 naming
Harvey Harrison [Wed, 30 Jan 2008 12:33:12 +0000 (13:33 +0100)]
x86: common x86_32|64 naming

Rename convert_rip_to_linear to convert_ip_to_linear for shared
X86_32|64 use.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years ago  x86: use wrmsrl in kprobes.c, step.c
Harvey Harrison [Wed, 30 Jan 2008 12:33:12 +0000 (13:33 +0100)]
x86: use wrmsrl in kprobes.c, step.c

Where x86_32 passed zero in the high 32 bits, use wrmsrl which
will zero extend for us.  This allows ifdefs for 32/64 bit to
be eliminated.

Eliminate ifdef in step.c.  Similar cleanup was done when unifying
kprobes_32|64.c and wrmsr() was chosen there over wrmsrl().  This
patch changes these to wrmsrl.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
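The zero-extension argument in the commit above can be illustrated in plain C. This sketch does not touch any MSR: struct msr_halves and split_msr_value are hypothetical names that only model how a single 64-bit value maps onto the low and high 32-bit halves that a two-argument interface would take separately.

```c
#include <stdint.h>

/* Stand-in for the two 32-bit halves of an MSR value. */
struct msr_halves {
	uint32_t lo;
	uint32_t hi;
};

/*
 * Model of a 64-bit (wrmsrl-style) interface: the caller passes one value
 * and the split into halves happens here. An unsigned 32-bit argument is
 * zero-extended on conversion to uint64_t, so its high half becomes 0.
 */
struct msr_halves split_msr_value(uint64_t val)
{
	struct msr_halves h;

	h.lo = (uint32_t)(val & 0xffffffffu);
	h.hi = (uint32_t)(val >> 32);
	return h;
}
```

A 32-bit caller that used to pass an explicit zero for the high half and a 64-bit caller that passes a full value can therefore share one call site. Note that this relies on the argument being unsigned; a negative signed value would be sign-extended instead.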
16 years agox86: change bios_cpu_apicid to percpu data variable
travis@sgi.com [Wed, 30 Jan 2008 12:33:12 +0000 (13:33 +0100)]
x86: change bios_cpu_apicid to percpu data variable

Change static bios_cpu_apicid array to a per_cpu data variable.
This includes using a static array used during initialization
similar to the way x86_cpu_to_apicid[] is handled.

There is one early use of bios_cpu_apicid in apic_is_clustered_box().
The other reference in cpu_present_to_apicid() is called after
smp_set_apicids() has set up the percpu version of bios_cpu_apicid.

[ mingo@elte.hu: build fix ]

Signed-off-by: Mike Travis <travis@sgi.com>
Reviewed-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: change NR_CPUS arrays in acpi-cpufreq
travis@sgi.com [Wed, 30 Jan 2008 12:33:12 +0000 (13:33 +0100)]
x86: change NR_CPUS arrays in acpi-cpufreq

Change the following static arrays sized by NR_CPUS to
per_cpu data variables:

acpi_cpufreq_data *drv_data[NR_CPUS]

Signed-off-by: Mike Travis <travis@sgi.com>
Reviewed-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: change NR_CPUS arrays in numa_64
travis@sgi.com [Wed, 30 Jan 2008 12:33:11 +0000 (13:33 +0100)]
x86: change NR_CPUS arrays in numa_64

Change the following static arrays sized by NR_CPUS to
per_cpu data variables:

char cpu_to_node_map[NR_CPUS];

Signed-off-by: Mike Travis <travis@sgi.com>
Reviewed-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: cleanup x86_cpu_to_apicid references
travis@sgi.com [Wed, 30 Jan 2008 12:33:11 +0000 (13:33 +0100)]
x86: cleanup x86_cpu_to_apicid references

Clean up references to x86_cpu_to_apicid.  Removes extraneous
comments and standardizes on "x86_*_early_ptr" for the early
kernel init references.

Signed-off-by: Mike Travis <travis@sgi.com>
Reviewed-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: change NR_CPUS arrays in topology
travis@sgi.com [Wed, 30 Jan 2008 12:33:11 +0000 (13:33 +0100)]
x86: change NR_CPUS arrays in topology

Change the following static arrays sized by NR_CPUS to
per_cpu data variables:

i386_cpu cpu_devices[NR_CPUS];

(And change the struct name to x86_cpu.)

Signed-off-by: Mike Travis <travis@sgi.com>
Reviewed-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: change NR_CPUS arrays in smpboot_64
travis@sgi.com [Wed, 30 Jan 2008 12:33:11 +0000 (13:33 +0100)]
x86: change NR_CPUS arrays in smpboot_64

Change the following static arrays sized by NR_CPUS to
per_cpu data variables:

task_struct *idle_thread_array[NR_CPUS];

This is only done if CONFIG_HOTPLUG_CPU is defined,
as otherwise the array is removed after initialization
anyway.

Signed-off-by: Mike Travis <travis@sgi.com>
Reviewed-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: change NR_CPUS arrays in powernow-k8
travis@sgi.com [Wed, 30 Jan 2008 12:33:11 +0000 (13:33 +0100)]
x86: change NR_CPUS arrays in powernow-k8

Change the following static arrays sized by NR_CPUS to
per_cpu data variables:

powernow_k8_data *powernow_data[NR_CPUS];

Signed-off-by: Mike Travis <travis@sgi.com>
Reviewed-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: change size of node ids from u8 to u16
travis@sgi.com [Wed, 30 Jan 2008 12:33:11 +0000 (13:33 +0100)]
x86: change size of node ids from u8 to u16

Change the size of node ids from 8 bits to 16 bits to
accommodate more than 256 nodes.

Signed-off-by: Mike Travis <travis@sgi.com>
Reviewed-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: change size of APICIDs from u8 to u16
travis@sgi.com [Wed, 30 Jan 2008 12:33:10 +0000 (13:33 +0100)]
x86: change size of APICIDs from u8 to u16

Change the size of APICIDs from u8 to u16.  This partially
supports the new x2apic mode that will be present on future
processor chips. (Chips actually support 32-bit APICIDs, but that
change is more intrusive. Supporting 16-bit is sufficient for now).

Signed-off-by: Jack Steiner <steiner@sgi.com>
I've included just the partial change from u8 to u16 apicids.  The
remaining x2apic changes will be in a separate patch.

In addition, the fake_node_to_pxm_map[] and fake_apicid_to_node[]
tables have been moved from local data to the __initdata section
reducing stack pressure when MAX_NUMNODES and MAX_LOCAL_APIC are
increased in size.

Signed-off-by: Mike Travis <travis@sgi.com>
Reviewed-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: refactor ioport unification
Chris Wright [Wed, 30 Jan 2008 12:33:10 +0000 (13:33 +0100)]
x86: refactor ioport unification

Refactor ioport unification to pull out common code.

Cc: mboton@gmail.com
Cc: Kevin Winchester <kjwinchester@gmail.com>
Cc: Zach Brown <zach.brown@oracle.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: fix ioport unification on 32-bit
Chris Wright [Wed, 30 Jan 2008 12:33:10 +0000 (13:33 +0100)]
x86: fix ioport unification on 32-bit

ioport unification was broken for 32-bit; it was missing
the actual pushf/popf EFLAGS manipulation (set_iopl_mask()).
Also, use of volatile looks like leftover cruft.

Cc: mboton@gmail.com
Cc: Kevin Winchester <kjwinchester@gmail.com>
Cc: Zach Brown <zach.brown@oracle.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: ioport_{32|64}.c unification
mboton@gmail.com [Wed, 30 Jan 2008 12:33:10 +0000 (13:33 +0100)]
x86: ioport_{32|64}.c unification

ioport_{32|64}.c unification.

This patch unifies the code from the ioport_32.c and ioport_64.c files.

Tested and working fine with i386 and x86_64 kernels.

Signed-off-by: Miguel Botón <mboton@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: disable the GART early, 64-bit
Yinghai Lu [Wed, 30 Jan 2008 12:33:09 +0000 (13:33 +0100)]
x86: disable the GART early, 64-bit

This affects K8 systems with 4G of RAM and memory hole remapping enabled,
or with more than 4G of RAM installed.

When kexec is used to boot a second kernel and the first kernel does not
include gart_shutdown, the second kernel can end up with a different
aperture position than the first. The second kernel may then use as RAM
a hole that is still in use by the GART set up by the first kernel. This
happens especially when kexec'ing 2.6.24 with sparsemem enabled from an
older kernel (from RHEL 5 or SLES 10): the new kernel uses the aperture
claimed by the GART (set up by the first kernel) for vmemmap, and once
the new kernel sets up a new GART, that position becomes real RAM again
and the _mapcount that was set is lost.

Bad page state in process 'swapper'
page:ffffe2000e600020 flags:0x0000000000000000 mapping:0000000000000000 mapcount:1 count:0
Trying to fix it up, but a reboot is needed
Backtrace:
Pid: 0, comm: swapper Not tainted 2.6.24-rc7-smp-gcdf71a10-dirty #13

Call Trace:
 [<ffffffff8026401f>] bad_page+0x63/0x8d
 [<ffffffff80264169>] __free_pages_ok+0x7c/0x2a5
 [<ffffffff80ba75d1>] free_all_bootmem_core+0xd0/0x198
 [<ffffffff80ba3a42>] numa_free_all_bootmem+0x3b/0x76
 [<ffffffff80ba3461>] mem_init+0x3b/0x152
 [<ffffffff80b959d3>] start_kernel+0x236/0x2c2
 [<ffffffff80b9511a>] _sinittext+0x11a/0x121

and
 [ffffe2000e600000-ffffe2000e7fffff] PMD ->ffff81001c200000 on node 0
phys addr is : 0x1c200000

RHEL 5.1 kernel -53 said:
PCI-DMA: aperture base @ 1c000000 size 65536 KB

new kernel said:
Mapping aperture over 65536 KB of RAM @ 3c000000

So we could try to disable that GART if possible.

According to Ingo:

> hm, i'm wondering, instead of modifying the GART, why dont we simply
> _detect_ whatever GART settings we have inherited, and propagate that
> into our e820 maps? I.e. if there's inconsistency, then punch that out
> from the memory maps and just dont use that memory.
>
> that way it would not matter whether the GART settings came from a [old
> or crashing] Linux kernel that has not called gart_iommu_shutdown(), or
> whether it's a BIOS that has set up an aperture hole inconsistent with
> the memory map it passed. (or the memory map we _think_ it tried to pass
> us)
>
> it would also be more robust to only read and do a memory map quirk
> based on that, than actively trying to change the GART so early in the
> bootup. Later on we have to re-enable the GART _anyway_ and have to
> punch a hole for it.
>
> and as a bonus, we would have shored up our defenses against crappy
> BIOSes as well.

Add an e820 modification for inconsistent GART settings.

gart_fix_e820=off can be used to disable the e820 fix.

Signed-off-by: Yinghai Lu <yinghai.lu@sun.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: cleanup setup_node_zones called by paging_init()
Yinghai Lu [Wed, 30 Jan 2008 12:33:09 +0000 (13:33 +0100)]
x86: cleanup setup_node_zones called by paging_init()

setup_node_zones() calculates some variables but only uses them when
FLAT_NODE_MEM_MAP is set,

so change the macro position to avoid the calculation.

Also change it to static, and rename it to flat_setup_node_zones().

Signed-off-by: Yinghai Lu <yinghai.lu@sun.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: early fault debugging improvement
Ingo Molnar [Wed, 30 Jan 2008 12:33:09 +0000 (13:33 +0100)]
x86: early fault debugging improvement

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: fix DMI ioremap leak
Ingo Molnar [Wed, 30 Jan 2008 12:33:09 +0000 (13:33 +0100)]
x86: fix DMI ioremap leak

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: pat: e820 cleanup
Ingo Molnar [Wed, 30 Jan 2008 12:33:08 +0000 (13:33 +0100)]
x86: pat: e820 cleanup

NOP change.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: smp_scan_config() debugging printouts
Ingo Molnar [Wed, 30 Jan 2008 12:33:08 +0000 (13:33 +0100)]
x86: smp_scan_config() debugging printouts

These are useful in figuring out early-mapping problems.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: make printk_address regs->ip always reliable
Arjan van de Ven [Wed, 30 Jan 2008 12:33:08 +0000 (13:33 +0100)]
x86: make printk_address regs->ip always reliable

printk_address()'s second parameter is the reliability indication,
not the ebp. If we're printing regs->ip we're reliable by definition,
so pass a 1 here.

Signed-off-by: Arjan van de Ven
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: add the "print code before the trapping instruction" feature to 64 bit
Arjan van de Ven [Wed, 30 Jan 2008 12:33:08 +0000 (13:33 +0100)]
x86: add the "print code before the trapping instruction" feature to 64 bit

The 32 bit x86 tree has a very useful feature that prints the Code: line
for the code even before the trapping instruction (and the start of the
trapping instruction is then denoted with a <>). Unfortunately, the 64 bit
x86 tree does not yet have this feature, making diagnosing backtraces harder
than needed.

This patch adds this feature in the same way as the 32 bit tree has it
(including the same kernel boot parameter), and includes a bugfix
to make the code use probe_kernel_address() rather than a buggy (deadlocking)
__get_user.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: add a simple backtrace test module
Arjan van de Ven [Wed, 30 Jan 2008 12:33:08 +0000 (13:33 +0100)]
x86: add a simple backtrace test module

During the work on the x86 32 and 64 bit backtrace code I found it useful
to have a simple test module to test a process and irq context backtrace.
Since the existing backtrace code was buggy, I figure it might be useful
to have such a test module in the kernel so that maybe we can even
detect such bugs earlier.

[ mingo@elte.hu: build fix ]

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: use the stack frames to get exact stack-traces for CONFIG_FRAMEPOINTER on x86-64
Arjan van de Ven [Wed, 30 Jan 2008 12:33:07 +0000 (13:33 +0100)]
x86: use the stack frames to get exact stack-traces for CONFIG_FRAMEPOINTER on x86-64

x86 32 bit already has this feature: this patch turns the stack frames
into an exact stack trace by following the frame pointer.
This only affects kernels built with the CONFIG_FRAME_POINTER config option
enabled, and greatly reduces the amount of noise in oopses.

This code uses the traditional method of doing backtraces, but if it
finds a valid frame pointer chain, will use that to show which parts
of the backtrace are reliable and which parts are not.

Due to the fragility and importance of the backtrace code, this needs to
be well reviewed and well tested before merging into mainline.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: turn 64 bit x86 HANDLE_STACK into print_context_stack like 32 bit has
Arjan van de Ven [Wed, 30 Jan 2008 12:33:07 +0000 (13:33 +0100)]
x86: turn 64 bit x86 HANDLE_STACK into print_context_stack like 32 bit has

This patch turns the x86 64 bit HANDLE_STACK macro in the backtrace code
into a function, just like 32 bit has. This is needed pre work in order to
get exact backtraces for CONFIG_FRAME_POINTER to work.

The function and its arguments are not the same as 32 bit; due to the
exception/interrupt stack way of x86-64 there are a few differences.

This patch should not have any behavior changes, only code movement.

Due to the fragility and importance of the backtrace code, this needs to be
well reviewed and well tested before merging into mainline.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: pull bp calculation earlier into the backtrace path
Arjan van de Ven [Wed, 30 Jan 2008 12:33:07 +0000 (13:33 +0100)]
x86: pull bp calculation earlier into the backtrace path

Right now, we take the stack pointer early during the backtrace path, but
only calculate bp several functions deep later, making it hard to reconcile
the stack and bp backtraces (as well as showing several internal backtrace
functions on the stack with bp based backtracing).

This patch moves the bp taking to the same place we take the stack pointer;
sadly this ripples through several layers of the back tracing stack,
but it's not all that bad in the end I hope.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: improve the 32 bit Frame Pointer backtracer to also use the traditional backtrace
Arjan van de Ven [Wed, 30 Jan 2008 12:33:07 +0000 (13:33 +0100)]
x86: improve the 32 bit Frame Pointer backtracer to also use the traditional backtrace

The 32 bit Frame Pointer backtracer code checks if the EBP is valid
to do a backtrace; however currently on a failure it just gives up
and prints nothing. That's not very nice; we can do better and still
print a decent backtrace.

This patch changes the backtracer to use the regular backtracing algorithm
at the same time as the EBP backtracer; the EBP backtracer is basically
used to figure out which parts of the backtrace are reliable vs those
which are likely to be noise.
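
The combined approach can be sketched in userspace C roughly as follows.
This is only an illustration under stated assumptions, not the kernel's
backtracer: the stack is simulated as an array of words, frame pointers are
array indices, and TEXT_START/TEXT_END, struct trace_entry and scan_stack()
are hypothetical names. Every word that looks like a text address is
reported (the regular algorithm), and an entry is marked reliable only when
it sits in the return-address slot of the frame the frame-pointer chain
currently points at:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bounds of the "text" region in this simulation. */
#define TEXT_START 0x1000u
#define TEXT_END   0x2000u

struct trace_entry {
    uint32_t addr;      /* candidate return address */
    int reliable;       /* 1 if confirmed by the frame-pointer chain */
};

static int looks_like_text(uint32_t v)
{
    return v >= TEXT_START && v < TEXT_END;
}

/*
 * Simulated frame layout: stack[bp] holds the index of the caller's
 * frame, stack[bp + 1] holds the return address of the current frame.
 * Returns the number of entries written to out[].
 */
static size_t scan_stack(const uint32_t *stack, size_t nwords, size_t bp,
                         struct trace_entry *out, size_t max_out)
{
    size_t n = 0;

    for (size_t i = 0; i < nwords && n < max_out; i++) {
        if (!looks_like_text(stack[i]))
            continue;               /* not a plausible return address */

        out[n].addr = stack[i];
        if (i == bp + 1) {
            out[n].reliable = 1;    /* matches the frame-pointer chain */
            bp = stack[bp];         /* advance to the caller's frame */
        } else {
            out[n].reliable = 0;    /* stale word: likely noise */
        }
        n++;
    }
    return n;
}
```

If the frame-pointer chain is broken, the scan still reports every
plausible address; those entries are simply left marked unreliable.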

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: add the capability to print fuzzy backtraces
Arjan van de Ven [Wed, 30 Jan 2008 12:33:07 +0000 (13:33 +0100)]
x86: add the capability to print fuzzy backtraces

For enhancing the 32 bit EBP based backtracer, I need the capability
for the backtracer to tell its customer that an entry is either
reliable or unreliable, and the backtrace printing code then needs to
print the unreliable ones slightly differently.
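
A minimal sketch of what "printing slightly differently" could look like,
assuming a "? " marker in front of unreliable entries (the marker, the
output layout and the format_trace_entry() name are assumptions made for
illustration, not taken from this patch):

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Format one backtrace entry into buf. Unreliable entries get a "? "
 * marker in front of the symbol (assumed marker, see above).
 * Returns what snprintf() returns.
 */
static int format_trace_entry(char *buf, size_t size, unsigned long addr,
                              const char *sym, int reliable)
{
    return snprintf(buf, size, " [<%08lx>] %s%s",
                    addr, reliable ? "" : "? ", sym);
}
```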

This patch adds the basic capability, the next patch will add a user
of this capability.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: fix 32-bit FRAME_POINTER chasing code
Arjan van de Ven [Wed, 30 Jan 2008 12:33:06 +0000 (13:33 +0100)]
x86: fix 32-bit FRAME_POINTER chasing code

The current x86 32 bit FRAME_POINTER chasing code has a nasty bug in
that the EBP tracer doesn't actually update the value of EBP it is
tracing, so that the code doesn't actually switch to the irq stack
properly.

The result is a truncated backtrace:

 WARNING: at timeroops.c:8 kerneloops_regression_test() (Not tainted)
 Pid: 0, comm: swapper Not tainted 2.6.24-0.77.rc4.git4.fc9 #1
  [<c040649a>] show_trace_log_lvl+0x1a/0x2f
  [<c0406d41>] show_trace+0x12/0x14
  [<c0407061>] dump_stack+0x6c/0x72
  [<e0258049>] kerneloops_regression_test+0x44/0x46 [timeroops]
  [<c04371ac>] run_timer_softirq+0x127/0x18f
  [<c0434685>] __do_softirq+0x78/0xff
  [<c0407759>] do_softirq+0x74/0xf7
  =======================

This patch fixes the code to update EBP properly, and to check the EIP
before printing (as the non-framepointer backtracer does) so that
the same test backtrace now looks like this:

 WARNING: at timeroops.c:8 kerneloops_regression_test()
 Pid: 0, comm: swapper Not tainted 2.6.24-rc7 #4
  [<c0405d17>] show_trace_log_lvl+0x1a/0x2f
  [<c0406681>] show_trace+0x12/0x14
  [<c0406ef2>] dump_stack+0x6a/0x70
  [<e01f6040>] kerneloops_regression_test+0x3b/0x3d [timeroops]
  [<c0426f07>] run_timer_softirq+0x11b/0x17c
  [<c04243ac>] __do_softirq+0x42/0x94
  [<c040704c>] do_softirq+0x50/0xb6
  [<c04242a9>] irq_exit+0x37/0x67
  [<c040714c>] do_IRQ+0x9a/0xaf
  [<c04057da>] common_interrupt+0x2e/0x34
  [<c05807fe>] cpuidle_idle_call+0x52/0x78
  [<c04034f3>] cpu_idle+0x46/0x60
  [<c05fbbd3>] rest_init+0x43/0x45
  [<c070aa3d>] start_kernel+0x279/0x27f
  =======================

This shows that the backtrace goes all the way down to user context now.
This bug was found during the port to 64 bit of the frame pointer backtracer.

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: make early printk selectable on 64-bit as well
Ingo Molnar [Wed, 30 Jan 2008 12:33:06 +0000 (13:33 +0100)]
x86: make early printk selectable on 64-bit as well

Enable CONFIG_EMBEDDED to select CONFIG_EARLY_PRINTK on 64-bit as well.

saves ~2K:

      text    data     bss      dec    hex filename
   7290283 3672091 1907848 12870222 c4624e vmlinux.before
   7288373 3671795 1907848 12868016 c459b0 vmlinux.after

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: early_idt_handler improvements, 64-bit
Roland McGrath [Wed, 30 Jan 2008 12:33:06 +0000 (13:33 +0100)]
x86: early_idt_handler improvements, 64-bit

It's not too pretty, but I found this made the "PANIC: early exception"
messages become much more reliably useful: 1. print the vector number,
2. print the %cs value, 3. handle error-code-pushing vs non-pushing vectors.

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: discover_ebda section mismatch
Randy Dunlap [Wed, 30 Jan 2008 12:33:05 +0000 (13:33 +0100)]
x86: discover_ebda section mismatch

Fix section mismatches.  discover_ebda() can be __init.

WARNING: vmlinux.o(.text+0x738a): Section mismatch: reference to .init.data:ebda_addr (between 'discover_ebda' and 'get_model_name')
WARNING: vmlinux.o(.text+0x73c4): Section mismatch: reference to .init.data:ebda_size (between 'discover_ebda' and 'get_model_name')

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86: preset apic clockevents multiplicator
Thomas Gleixner [Wed, 30 Jan 2008 12:33:04 +0000 (13:33 +0100)]
x86: preset apic clockevents multiplicator

The check for an uninitialized clock event device triggers when the local
APIC timer is registered as a dummy clock event device for broadcasting.
Preset the multiplicator to avoid a false positive.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86: sanity check APIC timer frequency
Thomas Gleixner [Wed, 30 Jan 2008 12:33:04 +0000 (13:33 +0100)]
x86: sanity check APIC timer frequency

Check the APIC timer calibration result for sanity. When the frequency
is out of range, issue a warning and disable the local APIC timer.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
16 years agox86 setup: correct the base in the GDT_ENTRY() macro
H. Peter Anvin [Wed, 30 Jan 2008 12:33:04 +0000 (13:33 +0100)]
x86 setup: correct the base in the GDT_ENTRY() macro

The GDT_ENTRY() macro in pm.c would incorrectly cut the bottom 8 bits
off the base.  We didn't define any bases with the bottom 8 bits
nonzero, so it is a non-manifest bug, but it's still a bug.
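
For reference, the x86 segment descriptor spreads the 32-bit base over two
fields: base[23:0] in descriptor bits 16-39 and base[31:24] in bits 56-63.
The sketch below packs and unpacks a descriptor in plain C to show that all
24 low base bits have to be kept. It is not the macro from pm.c;
gdt_entry_sketch() and gdt_entry_base() are hypothetical names, and flags
is assumed to carry the access byte in bits 0-7 and the granularity/size
flags in bits 12-15:

```c
#include <stdint.h>

/* Pack an 8-byte x86 segment descriptor (illustrative only). */
static uint64_t gdt_entry_sketch(uint16_t flags, uint32_t base, uint32_t limit)
{
    return ((uint64_t)(base  & 0xff000000u) << 32) | /* base[31:24] -> bits 56-63 */
           ((uint64_t)(flags & 0xf0ffu)     << 40) | /* access byte and flags     */
           ((uint64_t)(limit & 0x000f0000u) << 32) | /* limit[19:16] -> bits 48-51 */
           ((uint64_t)(base  & 0x00ffffffu) << 16) | /* base[23:0] -> bits 16-39  */
           ((uint64_t)(limit & 0x0000ffffu));        /* limit[15:0] -> bits 0-15  */
}

/* Recover the 32-bit base from a packed descriptor. */
static uint32_t gdt_entry_base(uint64_t desc)
{
    return (uint32_t)((desc >> 16) & 0x00ffffffu) |
           (uint32_t)((desc >> 32) & 0xff000000u);
}
```

A base with nonzero low bits, such as 0x12345678, only round-trips through
gdt_entry_base() when the full 24-bit mask is used for the low field.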

Pointed out by John Smith <johnsmith9344@gmail.com>.
Cc: John Smith <johnsmith9344@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86 setup: guard the heap against invalid stack setups
H. Peter Anvin [Wed, 30 Jan 2008 12:33:04 +0000 (13:33 +0100)]
x86 setup: guard the heap against invalid stack setups

If we use the bootloader-provided stack pointer, we might end up in a
situation where the bootloader (incorrectly) pointed the stack in the
middle of our heap.  Catch this by simply comparing the computed heap
end value to the stack pointer minus the defined stack size.
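
The comparison itself is simple; a sketch, with heap_clear_of_stack() and
its parameter names being hypothetical and the reaction to a failed check
left to the caller:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if the heap, ending at heap_end, stays below the lowest
 * address the stack may grow down to (sp - stack_size).
 */
static bool heap_clear_of_stack(uint32_t heap_end, uint32_t sp,
                                uint32_t stack_size)
{
    if (sp < stack_size)
        return false;   /* stack region would wrap below address 0 */

    return heap_end <= sp - stack_size;
}
```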

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86 setup: push video mode setup as late as possible
H. Peter Anvin [Wed, 30 Jan 2008 12:33:03 +0000 (13:33 +0100)]
x86 setup: push video mode setup as late as possible

Push video mode setup as late as possible; messages issued through the
BIOS interface after video mode setup will either not be seen (for
framebuffer modes) or will screw up the cursor (for text modes.)

In particular, this makes the EDD probing message show up correctly.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86 setup: add note to use edd=off on EDD probing hangs
H. Peter Anvin [Wed, 30 Jan 2008 12:33:03 +0000 (13:33 +0100)]
x86 setup: add note to use edd=off on EDD probing hangs

Tell the user to specify edd=off in the case of EDD probing hangs.
Per LKML discussion.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86 setup: add missing prototype; formatting fix
H. Peter Anvin [Wed, 30 Jan 2008 12:33:03 +0000 (13:33 +0100)]
x86 setup: add missing prototype; formatting fix

Add prototype for cmdline_find_option_bool() missing from:

    x86 setup: early cmdline parser handle boolean options

Also, fix up a minor formatting error in that patch.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86 setup: OK -> ok (no need to scream)
H. Peter Anvin [Wed, 30 Jan 2008 12:33:03 +0000 (13:33 +0100)]
x86 setup: OK -> ok (no need to scream)

Unnecessary capitals are shouting; no need for it here.
Thus, change "OK" to "ok" and add a space.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86 setup: be more verbose when probing EDD
devzero@web.de [Wed, 30 Jan 2008 12:33:03 +0000 (13:33 +0100)]
x86 setup: be more verbose when probing EDD

On early boot, probing the BIOS for EDD happens without any message.

Enhanced Disk Drive Services (EDD) is a mechanism to match x86 BIOS device
names (int13 device 80h) to Linux device names (e.g. /dev/sda, /dev/hda).

There are buggy BIOSes out there that have problems with EDD. The problem
can lie in the BIOS itself or in add-on cards.

This patch adds an informational message on early boot.

CONFIG_EDD is not set with defconfig, but with allmodconfig (i.e. CONFIG_EDD=m)
so the EDD probe may be active on early boot on many systems nowadays.

I can tell that the probe is active on the SuSE distro, and there I have seen
more than one system hang endlessly with a "black screen with a blinking
cursor in the upper left" on installation, making it difficult for the end
user to find out what the issue is.
I have definitely seen this on Fujitsu Siemens PCs with i810 and i815 chipsets.

This one also honours the "quiet" bootparam.

Also see:
http://marc.info/?l=linux-kernel&m=119781937207969&w=2
http://marc.info/?l=linux-kernel&m=119783934032326&w=2
http://marc.info/?l=linux-kernel&m=119783678529100&w=2

Signed-off-by: Roland Kletzing <devzero@web.de>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86 setup: early cmdline parser handle boolean options
devzero@web.de [Wed, 30 Jan 2008 12:33:02 +0000 (13:33 +0100)]
x86 setup: early cmdline parser handle boolean options

This patch extends the early commandline parser to support boolean options.
The current version in mainline only supports parsing "option=arg" value pairs.
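
To illustrate the difference from "option=arg" parsing, a boolean option is
a bare word on the command line. The following is a minimal userspace
sketch, not the parser from this patch; cmdline_has_flag() is a
hypothetical name, and options are assumed to be separated by single
spaces and to contain no spaces themselves:

```c
#include <stdbool.h>
#include <string.h>

/* Return true if option appears as a whole, bare word in cmdline. */
static bool cmdline_has_flag(const char *cmdline, const char *option)
{
    size_t len = strlen(option);
    const char *p = cmdline;

    if (len == 0)
        return false;

    while ((p = strstr(p, option)) != NULL) {
        bool starts_word = (p == cmdline) || (p[-1] == ' ');
        bool ends_word = (p[len] == '\0') || (p[len] == ' ');

        if (starts_word && ends_word)
            return true;
        p++;    /* keep searching after this partial match */
    }
    return false;
}
```

With this sketch, "quiet" is found in "root=/dev/sda1 quiet" but not in
"root=/dev/sda1 quiet=1" or "noquiet".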

With this it should be easy to make other messages like "Uncompressing kernel"
honour the "quiet" parameter, too.

Signed-off-by: Roland Kletzing <devzero@web.de>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
16 years agox86 setup: fix constraints in segment accessor functions
H. Peter Anvin [Wed, 30 Jan 2008 12:33:02 +0000 (13:33 +0100)]
x86 setup: fix constraints in segment accessor functions

Fix the operand constraints for the segment accessor functions,
{rd,wr}{fs,gs}*.  In particular, the 8-bit functions used "r"
constraints instead of "q" constraints.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>