Adrian Bunk [Sat, 20 Jan 2007 17:54:03 +0000 (18:54 +0100)]
Linux 2.6.16.38
Adrian Bunk [Thu, 18 Jan 2007 15:13:44 +0000 (16:13 +0100)]
Linux 2.6.16.38-rc2
YOSHIFUJI Hideaki [Thu, 18 Jan 2007 15:11:56 +0000 (16:11 +0100)]
[IPV6] Fix joining all-node multicast group.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Paolo 'Blaisorblade' Giarrusso [Sun, 14 Jan 2007 14:42:49 +0000 (15:42 +0100)]
UML: fix the MODE_TT compilation
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Adrian Bunk [Tue, 9 Jan 2007 03:12:13 +0000 (04:12 +0100)]
Linux 2.6.16.38-rc1
Adrian Bunk [Tue, 9 Jan 2007 02:36:59 +0000 (03:36 +0100)]
x86_64: re-add a newline to RESTORE_CONTEXT
RESTORE_CONTEXT lost a newline:
http://www.mail-archive.com/kgdb-bugreport@lists.sourceforge.net/msg00559.html
Reported by Steven M. Christey.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Clemens Ladisch [Mon, 8 Jan 2007 22:12:26 +0000 (23:12 +0100)]
ALSA: snd_rtctimer: handle RTC interrupts with a tasklet
The calls to rtc_control() from inside the interrupt handler can
deadlock the RTC code, so move our interrupt handling code to a tasklet.
Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
Acked-By: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Arnaud Patard [Mon, 8 Jan 2007 22:09:59 +0000 (23:09 +0100)]
ALSA: emu10k1: Fix outl() in snd_emu10k1_resume_regs()
The emu10k1 driver saves the A_IOCFG and HCFG registers on suspend and restores
them on resume. Unfortunately, this doesn't work as the arguments to outl() are
reversed.
Signed-off-by: Arnaud Patard <arnaud.patard@rtp-net.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
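To illustrate the emu10k1 fix above: the kernel's outl() takes the value first and the port second, so swapping the two sends the write to the wrong place. The sketch below is not the driver code. It uses hypothetical fake_outl()/fake_inl() helpers backed by an array instead of real port I/O, and the port offsets and values are made up.

```c
#include <stdint.h>

#define NPORTS 0x100u  /* size of the simulated I/O space (assumption) */

static uint32_t io_space[NPORTS];  /* stand-in for device registers */

/* Same argument order as the kernel's outl(): value first, then port. */
void fake_outl(uint32_t value, uint32_t port)
{
    io_space[port % NPORTS] = value;
}

/* Read back a simulated register. */
uint32_t fake_inl(uint32_t port)
{
    return io_space[port % NPORTS];
}

/* Restore a saved register value to its port, arguments in the right order. */
void restore_reg(uint32_t port, uint32_t saved)
{
    fake_outl(saved, port);
}
```

Calling fake_outl(port, saved) instead leaves the intended register untouched and writes the port number to some unrelated slot, which is the shape of the bug described in the commit message.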
Takashi Iwai [Mon, 8 Jan 2007 22:09:11 +0000 (23:09 +0100)]
ALSA: Fix initialization of user-space controls
Fix an assertion when accessing a user-defined control due to lack of
initialization (appears only when CONFIG_SND_DEBUG is enabled).
ALSA sound/core/control.c:660: BUG? (info->access == 0)
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Andrey Mirkin [Mon, 8 Jan 2007 22:07:27 +0000 (23:07 +0100)]
skip data conversion in compat_sys_mount when data_page is NULL
OpenVZ Linux kernel team has found a problem with mounting in compat mode.
Simple command "mount -t smbfs ..." on Fedora Core 5 distro in 32-bit mode
leads to oops:
Unable to handle kernel NULL pointer dereference at 0000000000000000 RIP:
[<ffffffff802bc7c6>] compat_sys_mount+0xd6/0x290
PGD 34d48067 PUD 34d03067 PMD 0
Oops: 0000 [1] SMP
CPU: 0
Modules linked in: iptable_nat simfs smbfs ip_nat ip_conntrack vzdquota
parport_pc lp parport 8021q bridge llc vznetdev vzmon nfs lockd sunrpc vzdev
iptable_filter af_packet xt_length ipt_ttl xt_tcpmss ipt_TCPMSS
iptable_mangle xt_limit ipt_tos ipt_REJECT ip_tables x_tables thermal
processor fan button battery asus_acpi ac uhci_hcd ehci_hcd usbcore i2c_i801
i2c_core e100 mii floppy ide_cd cdrom
Pid: 14656, comm: mount
RIP: 0060:[<ffffffff802bc7c6>] [<ffffffff802bc7c6>] compat_sys_mount+0xd6/0x290
RSP: 0000:ffff810034d31f38 EFLAGS: 00010292
RAX: 000000000000002c RBX: 0000000000000000 RCX: 0000000000000000
RDX: ffff810034c86bc0 RSI: 0000000000000096 RDI: ffffffff8061fc90
RBP: ffff810034d31f78 R08: 0000000000000000 R09: 000000000000000d
R10: ffff810034d31e58 R11: 0000000000000001 R12: ffff810039dc3000
R13: 000000000805ea48 R14: 0000000000000000 R15: 00000000c0ed0000
FS: 0000000000000000(0000) GS:ffffffff80749000(0033) knlGS:00000000b7d556b0
CS: 0060 DS: 007b ES: 007b CR0: 000000008005003b
CR2: 0000000000000000 CR3: 0000000034d43000 CR4: 00000000000006e0
Process mount (pid: 14656, veid=300, threadinfo ffff810034d30000, task ffff810034c86bc0)
Stack:
0000000000000000 ffff810034dd0000 ffff810034e4a000 000000000805ea48
0000000000000000 0000000000000000 0000000000000000 0000000000000000
000000000805ea48 ffffffff8021e64e 0000000000000000 0000000000000000
Call Trace:
[<ffffffff8021e64e>] ia32_sysret+0x0/0xa
Code: 83 3b 06 0f 85 41 01 00 00 0f b7 43 0c 89 43 14 0f b7 43 0a
RIP [<ffffffff802bc7c6>] compat_sys_mount+0xd6/0x290
RSP <ffff810034d31f38>
CR2: 0000000000000000
The problem is that data_page pointer can be NULL, so we should skip data
conversion in this case.
Signed-off-by: Andrey Mirkin <amirkin@openvz.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Peter Zijlstra [Mon, 8 Jan 2007 08:09:15 +0000 (09:09 +0100)]
rtc: lockdep fix/workaround
BUG: warning at kernel/lockdep.c:1816/trace_hardirqs_on() (Not tainted)
[<c04051ee>] show_trace_log_lvl+0x58/0x171
[<c0405802>] show_trace+0xd/0x10
[<c040591b>] dump_stack+0x19/0x1b
[<c043abee>] trace_hardirqs_on+0xa2/0x11e
[<c06143c3>] _spin_unlock_irq+0x22/0x26
[<c0541540>] rtc_get_rtc_time+0x32/0x176
[<c0419ba4>] hpet_rtc_interrupt+0x92/0x14d
[<c0450f94>] handle_IRQ_event+0x20/0x4d
[<c0451055>] __do_IRQ+0x94/0xef
[<c040678d>] do_IRQ+0x9e/0xbd
[<c0404a49>] common_interrupt+0x25/0x2c
DWARF2 unwinder stuck at common_interrupt+0x25/0x2c
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Chuck Ebbert [Mon, 8 Jan 2007 06:11:50 +0000 (07:11 +0100)]
ebtables: check struct type before computing gap
Check struct type before dereferencing fields in ebt_entry.
Failure to check can cause oops.
Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Maxime Bizon [Mon, 8 Jan 2007 06:07:36 +0000 (07:07 +0100)]
i2c-mv64xxx: Fix random oops at boot
I have a Marvell board which has the same i2c hw block as mv64xxx, so
I'm trying to use i2c-mv64xxx driver.
But I get the following random oops at boot:
Unable to handle kernel NULL pointer dereference at virtual address 00000002
Backtrace:
[<c0397e4c>] (mv64xxx_i2c_intr+0x0/0x2b8) from [<c02879c4>] (__do_irq+0x4c/0x8c)
[<c0287978>] (__do_irq+0x0/0x8c) from [<c0287c0c>] (do_level_IRQ+0x68/0xc0)
r8 = C0501E08 r7 = 00000005 r6 = C0501E08 r5 = 00000005
r4 = C048BB78
[<c0287ba4>] (do_level_IRQ+0x0/0xc0) from [<c02885f8>] (asm_do_IRQ+0x50/0x134)
r6 = C0449C78 r5 = F1020000 r4 = FFFFFFFF
[<c02885a8>] (asm_do_IRQ+0x0/0x134) from [<c02869c4>] (__irq_svc+0x24/0x100)
r8 = C1CAC400 r7 = 00000005 r6 = 00000002 r5 = F1020000
r4 = FFFFFFFF
[<c0287efc>] (setup_irq+0x0/0x124) from [<c02880d0>] (request_irq+0xb0/0xd0)
r7 = C041B2AC r6 = C0397E4C r5 = 00000000 r4 = 00000005
[<c0288020>] (request_irq+0x0/0xd0) from [<c03985f4>] (mv64xxx_i2c_probe+0x148/0x244)
[<c03984ac>] (mv64xxx_i2c_probe+0x0/0x244) from [<c038bedc>] (platform_drv_probe+0x20/0x24)
The oops is caused by a spurious interrupt that occurs when request_irq
is called. mv64xxx_i2c_fsm() tries to read drv_data->msg, which is NULL.
I noticed that hardware init is done after requesting irq. Thus any
pending irq from previous hardware usage may cause this.
Signed-off-by: Maxime Bizon <mbizon@freebox.fr>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Jean Delvare [Mon, 8 Jan 2007 06:05:19 +0000 (07:05 +0100)]
V4L: cx88: Fix leadtek_eeprom tagging
reference to .init.text: from .text between 'cx88_card_setup'
(at offset 0x68c) and 'cx88_risc_field'
Caused by leadtek_eeprom() being declared __devinit and called from
a non-devinit context.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Phillip Lougher [Mon, 8 Jan 2007 06:02:45 +0000 (07:02 +0100)]
corrupted cramfs filesystems cause kernel oops (CVE-2006-5823)
Steve Grubb's fsfuzzer tool
(http://people.redhat.com/sgrubb/files/fsfuzzer-0.6.tar.gz) generates
corrupt Cramfs filesystems which cause
Cramfs to kernel oops in cramfs_uncompress_block(). The cause of the oops
is an unchecked corrupted block length field read by cramfs_readpage().
This patch adds a sanity check to cramfs_readpage() which checks that the
block length field is sensible. The (PAGE_CACHE_SIZE << 1) size check is
intentional: even though the uncompressed data is not going to be larger
than PAGE_CACHE_SIZE, gzip sometimes generates compressed data larger than
the original source data. Mkcramfs checks that the compressed size is
always less than or equal to PAGE_CACHE_SIZE << 1. Of course Cramfs could
use the original uncompressed data in this case, but it doesn't.
Signed-off-by: Phillip Lougher <phillip@lougher.org.uk>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
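The cramfs check described above can be sketched as a single bound on the on-disk length field. This is only an illustration of the (PAGE_CACHE_SIZE << 1) rule from the commit message, not the patch itself: the helper name block_len_is_sane() is hypothetical and the 4096-byte page size is an assumption.

```c
#include <stddef.h>

#define PAGE_CACHE_SIZE 4096u  /* assumed page size for this sketch */

/*
 * Nonzero if a compressed block length read from disk is one that
 * mkcramfs could have written, i.e. at most twice the page size.
 * Anything larger is treated as corruption and must not be handed
 * to the decompressor.
 */
int block_len_is_sane(size_t compr_len)
{
    return compr_len <= (PAGE_CACHE_SIZE << 1);
}
```

The factor of two leaves room for the case the message mentions, where gzip output is larger than the page it was produced from.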
Eric Sandeen [Mon, 8 Jan 2007 06:01:06 +0000 (07:01 +0100)]
handle ext3 directory corruption better (CVE-2006-6053)
I've been using Steve Grubb's purely evil "fsfuzzer" tool, at
http://people.redhat.com/sgrubb/files/fsfuzzer-0.4.tar.gz
Basically it makes a filesystem, splats some random bits over it, then
tries to mount it and do some simple filesystem actions.
At best, the filesystem catches the corruption gracefully. At worst,
things spin out of control.
As you might guess, we found a couple places in ext3 where things spin out
of control :)
First, we had a corrupted directory that was never checked for
consistency... it was corrupt, and pointed to another bad "entry" of
length 0. The for() loop looped forever, since the length of
ext3_next_entry(de) was 0, and we kept looking at the same pointer over and
over and over and over... I modeled this check and subsequent action on
what is done for other directory types in ext3_readdir...
(adding this check adds some computational expense; I am testing a followup
patch to reduce the number of times we check and re-check these directory
entries, in all cases. Thanks for the idea, Andreas).
Next we had a root directory inode which had a corrupted size, claimed to
be > 200M on a 4M filesystem. There was only really 1 block in the
directory, but because the size was so large, readdir kept coming back for
more, spewing thousands of printk's along the way.
Per Andreas' suggestion, if we're in this read error condition and we're
trying to read an offset which is greater than i_blocks worth of bytes,
stop trying, and break out of the loop.
With these two changes fsfuzz test survives quite well on ext3.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
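The first ext3 problem above (a directory entry of length 0 making the for() loop spin) comes down to a walk over length-prefixed records that never advances. A minimal sketch follows, assuming a hypothetical record layout that holds only a 16-bit little-endian length; the real ext3 entry layout and the real check are different.

```c
#include <stddef.h>
#include <stdint.h>

#define HDR_SIZE 2u  /* hypothetical header: just a 16-bit record length */

/* Read the little-endian record length stored at buf[off]. */
static unsigned rec_len_at(const uint8_t *buf, size_t off)
{
    return (unsigned)buf[off] | ((unsigned)buf[off + 1] << 8);
}

/*
 * Count the records in buf[0..size). Returns -1 if a record length is
 * too small to make progress or runs past the end of the buffer.
 */
int count_records(const uint8_t *buf, size_t size)
{
    size_t off = 0;
    int n = 0;

    while (off + HDR_SIZE <= size) {
        unsigned len = rec_len_at(buf, off);

        /* Without this check, len == 0 would revisit the same offset forever. */
        if (len < HDR_SIZE || len > size - off)
            return -1;
        off += len;
        n++;
    }
    return n;
}
```

Rejecting the record before advancing is what turns an endless loop into an error the caller can report.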
Eric Sandeen [Mon, 8 Jan 2007 05:59:28 +0000 (06:59 +0100)]
ext2: skip pages past number of blocks in ext2_find_entry (CVE-2006-6054)
This one was pointed out on the MOKB site:
http://kernelfun.blogspot.com/2006/11/mokb-09-11-2006-linux-26x-ext2checkpage.html
If a directory's i_size is corrupted, ext2_find_entry() will keep processing
pages until the i_size is reached, even if there are no more blocks associated
with the directory inode. This patch puts in some minimal sanity-checking
so that we don't keep checking pages (and issuing errors) if we know there
can be no more data to read, based on the block count of the directory inode.
This is somewhat similar in approach to the ext3 patch I sent earlier this
year.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
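The ext2 sanity check above bounds the page scan by the inode's block count rather than by a possibly corrupt i_size. A rough sketch of such a bound is given here, assuming 4 KiB pages and relying on i_blocks being counted in 512-byte units. The helper names are hypothetical and this is not the code from the patch.

```c
#include <stdint.h>

#define PAGE_SHIFT_ASSUMED 12u  /* 4 KiB pages, an assumption for this sketch */

/*
 * i_blocks counts 512-byte units, so shifting it right by
 * (PAGE_SHIFT_ASSUMED - 9) gives the number of whole pages that
 * can be backed by allocated blocks.
 */
uint64_t pages_backed_by_blocks(uint64_t i_blocks)
{
    return i_blocks >> (PAGE_SHIFT_ASSUMED - 9u);
}

/*
 * Nonzero if page index n is strictly greater than the number of
 * backed pages. The bound is deliberately loose by one page.
 */
int page_beyond_blocks(uint64_t n, uint64_t i_blocks)
{
    return n > pages_backed_by_blocks(i_blocks);
}
```

With one 4 KiB block allocated (i_blocks == 8), page indexes 0 and 1 pass the check and everything from index 2 upward is rejected, however large i_size claims to be.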
Eric Sandeen [Mon, 8 Jan 2007 05:55:37 +0000 (06:55 +0100)]
hfs_fill_super returns success even if no root inode (CVE-2006-6056)
http://kernelfun.blogspot.com/2006/11/mokb-14-11-2006-linux-26x-selinux.html
mount that image...
fs: filesystem was not cleanly unmounted, running fsck.hfs is recommended. mounting read-only.
hfs: get root inode failed.
BUG: unable to handle kernel NULL pointer dereference at virtual address 00000018
printing eip
...
EIP is at superblock_doinit+0x21/0x767
...
[] selinux_sb_kern_mount+0xc/0x4b
[] vfs_kern_mount+0x99/0xf6
[] do_kern_mount+0x2d/0x3e
[] do_mount+0x5fa/0x66d
[] sys_mount+0x77/0xae
[] syscall_call+0x7/0xb
DWARF2 unwinder stuck at syscall_call+0x7/0xb
hfs_fill_super() returns success even if
root_inode = hfs_iget(sb, &fd.search_key->cat, &rec);
or
sb->s_root = d_alloc_root(root_inode);
fails. This superblock finds its way to superblock_doinit() which does:
struct dentry *root = sb->s_root;
struct inode *inode = root->d_inode;
and boom. Need to make sure the error cases return an error, I think.
[akpm@osdl.org: return -ENOMEM on oom]
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Adrian Bunk [Sun, 7 Jan 2007 00:01:19 +0000 (01:01 +0100)]
USB_RTL8150 must select MII to avoid link errors.
Stolen from a patch by Randy Dunlap.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Badari Pulavarty [Sat, 6 Jan 2007 23:58:15 +0000 (00:58 +0100)]
Fix for shmem_truncate_range() BUG_ON()
Ran into BUG() while doing madvise(REMOVE) testing. If we are punching a
hole into shared memory segment using madvise(REMOVE) and the entire hole
is below the indirect blocks, we hit following assert.
BUG_ON(limit <= SHMEM_NR_DIRECT);
Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
Forwarded-by: Jordan Neumeyer
Signed-off-by: Adrian Bunk <bunk@stusta.de>
John Heffner [Sat, 6 Jan 2007 21:31:44 +0000 (22:31 +0100)]
TCP: Fix and simplify microsecond rtt sampling
This changes the microsecond RTT sampling so that samples are taken in
the same way that RTT samples are taken for the RTO calculator: on the
last segment acknowledged, and only when the segment hasn't been
retransmitted.
Signed-off-by: John Heffner <jheffner@psc.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Paolo 'Blaisorblade' Giarrusso [Sat, 6 Jan 2007 04:18:55 +0000 (05:18 +0100)]
uml: fix processor selection
Makes UML compile on any possible processor choice. The two problems were:
*) x86 code, when 386 is selected, checks at runtime boot_cpuflags, which we
do not have.
*) 3Dnow support for memcpy() et al. does not compile currently and fixing this
is not trivial, so simply disable it; with this change, if one selects MK
UML compiles (while it did not).
Merged upstream.
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Willy Tarreau [Sat, 6 Jan 2007 01:31:24 +0000 (02:31 +0100)]
rio: typo in bitwise AND expression.
The line:
hp->Mode &= !RIO_PCI_INT_ENABLE;
is obviously wrong as RIO_PCI_INT_ENABLE=0x04 and is used as a bitmask
2 lines before. Getting no IRQ would not disable RIO_PCI_INT_ENABLE
but rather RIO_PCI_BOOT_FROM_RAM which equals 0x01.
Obvious fix is to change ! for ~.
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
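The difference between ! and ~ in the rio fix above is easy to check in isolation. The sketch below reuses the two mask values quoted in the commit message; the function names and the 8-bit width of the mode field are assumptions made for the example. In standard C, !0x04 evaluates to 0, so the buggy form clears every bit of the field, RIO_PCI_BOOT_FROM_RAM included.

```c
#include <stdint.h>

/* Values quoted in the commit message. */
#define RIO_PCI_BOOT_FROM_RAM 0x01
#define RIO_PCI_INT_ENABLE    0x04

/* Buggy form: !0x04 is 0, so the AND clears every bit. */
uint8_t clear_int_enable_buggy(uint8_t mode)
{
    mode &= !RIO_PCI_INT_ENABLE;
    return mode;
}

/* Fixed form: ~0x04 keeps every bit except RIO_PCI_INT_ENABLE. */
uint8_t clear_int_enable_fixed(uint8_t mode)
{
    mode &= (uint8_t)~RIO_PCI_INT_ENABLE;
    return mode;
}
```

Starting from a mode of 0x05 (both flags set), the buggy form yields 0x00 while the fixed form yields 0x01.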
Chuck Short [Sat, 6 Jan 2007 00:22:29 +0000 (01:22 +0100)]
drm: allow detection of new VIA chipsets
Update pci ids.
Signed-off-by: Chuck Short <zulcss@gmail.com>
Signed-off-by: Ben Collins <bcollins@ubuntu.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Dave Airlie [Sat, 6 Jan 2007 00:21:40 +0000 (01:21 +0100)]
drm: Add the P4VM800PRO PCI ID.
Signed-off-by: Dave Airlie <airlied@linux.ie>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Jason Gaston [Sat, 6 Jan 2007 00:16:39 +0000 (01:16 +0100)]
i2c-i801: SMBus patch for Intel ICH9
This updated patch adds the Intel ICH9 LPC and SMBus Controller DID's.
Signed-off-by: Jason Gaston <jason.d.gaston@intel.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Jason Gaston [Sat, 6 Jan 2007 00:15:36 +0000 (01:15 +0100)]
PCI: irq: irq and pci_ids patch for Intel ICH9
This updated patch adds the Intel ICH9 LPC and SMBus Controller DID's.
Signed-off-by: Jason Gaston <jason.d.gaston@intel.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Rudolf Marek [Sat, 6 Jan 2007 00:13:13 +0000 (01:13 +0100)]
i2c-viapro: Add support for the VT8237A and VT8251
Documentation update included. Compile tested.
Signed-off-by: Rudolf Marek <r.marek@sh.cvut.cz>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
David Brownell [Sat, 6 Jan 2007 00:08:47 +0000 (01:08 +0100)]
SPI/MTD: mtd_dataflash oops prevention
Return a fault code if the Dataflash driver runs into a "no device present"
error when the MISO line has a pulldown (it currently expects a pullup), so
that rmmod won't oops.
Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
David L Stevens [Sat, 6 Jan 2007 00:06:28 +0000 (01:06 +0100)]
[IPV4/IPV6]: Fix inet{,6} device initialization order.
It is important that we assign dev->ip{,6}_ptr
only after all portions of the inet{,6} are set up.
Otherwise we can receive packets before the multicast
spinlocks et al. are initialized.
Signed-off-by: David L Stevens <dlstevens@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
David S. Miller [Sat, 6 Jan 2007 00:00:48 +0000 (01:00 +0100)]
[SOUND] Sparc CS4231: Use 64 for period_bytes_min
This matches what the ISA cs4231 driver uses.
Tested by Georg Chini.
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Georg Chini [Sat, 6 Jan 2007 00:00:08 +0000 (01:00 +0100)]
[SOUND] Sparc CS4231: Fix IRQ return value and initialization.
SBUS: Change IRQ-handler return value from 0 to IRQ_HANDLED and
fix some initialisation problems.
Change period_bytes_min from 4096 to 256 to allow driver to work with
low latency (VOIP) applications. Hope this does not break EBUS.
Signed-off-by: Georg Chini <georg.chini@triaton-webhosting.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Mikael Pettersson [Fri, 5 Jan 2007 23:54:18 +0000 (00:54 +0100)]
USB: Fix alignment of buffer passed down to ->hub_control()
Implementations assume the buffer is at least 4 byte aligned.
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Adrian Bunk [Fri, 5 Jan 2007 02:17:11 +0000 (03:17 +0100)]
fix the UML compilation
Based on patches from Linus' tree.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Fabrice Knevez [Thu, 4 Jan 2007 23:29:31 +0000 (00:29 +0100)]
[SUNKBD]: Fix sunkbd_enable(sunkbd, 0); obvious.
"sunkbd_enable(sunkbd, 0);" has no effect. Adding "sunkbd->enabled =
enable" in sunkbd_enable (obvious)
Signed-off-by: Fabrice Knevez <nuxdoors@cegetel.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Andrew Morton [Thu, 4 Jan 2007 22:29:51 +0000 (23:29 +0100)]
ibmtr section fixes
WARNING: drivers/net/tokenring/ibmtr.o - Section mismatch: reference to .init.data:ibmtr_mem_base from .text between 'ibmtr_probe1' (at offset 0x6e6) and 'ibmtr_probe_card'
WARNING: drivers/net/tokenring/ibmtr.o - Section mismatch: reference to .init.data:ibmtr_mem_base from .text between 'ibmtr_probe1' (at offset 0x74a) and 'ibmtr_probe_card'
WARNING: drivers/net/tokenring/ibmtr.o - Section mismatch: reference to .init.data:ibmtr_mem_base from .text between 'ibmtr_probe1' (at offset 0x7fd) and 'ibmtr_probe_card'
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Andi Kleen [Mon, 8 Jan 2007 21:44:07 +0000 (22:44 +0100)]
x86_64: Don't leak NT bit into next task (CVE-2006-5755)
SYSENTER can cause a NT to be set which might cause crashes on the IRET
in the next task.
Following similar i386 patch from Linus.
Backport to 2.6.16 by Chuck Ebbert <76306.1226@compuserve.com>
[Changed 'set_debugreg' to the older 'set_debug' in setup64.c
and added raw_local_save_flags() from 2.6.19 to system.h]
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Chuck Ebbert [Thu, 4 Jan 2007 22:01:18 +0000 (23:01 +0100)]
x86_64: fix ia32 syscall count
Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Marcel Holtmann [Thu, 4 Jan 2007 21:57:52 +0000 (22:57 +0100)]
Bluetooth: Add packet size checks for CAPI messages (CVE-2006-6106)
With malformed packets it might be possible to overwrite internal
CMTP and CAPI data structures. This patch adds additional length
checks to prevent these kinds of remote attacks.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Andrew Morton [Tue, 9 Jan 2007 02:23:15 +0000 (03:23 +0100)]
grow_buffers() infinite loop fix (CVE-2006-5757/CVE-2006-6060)
If grow_buffers() is for some reason passed a block number which wants to lie
outside the maximum-addressable pagecache range (PAGE_SIZE * 4G bytes) then it
will accidentally truncate `index' and will then instantiate a page at the
wrong pagecache offset. This causes __getblk_slow() to go into an infinite
loop.
This can happen with corrupted disks, or with software errors elsewhere.
Detect that, and handle it.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
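The truncation described in the grow_buffers() message above can be detected by converting to the narrower index type and comparing against the full-width value. The sketch assumes a 64-bit block number and a 32-bit pagecache index; block_to_index() is a hypothetical helper, not the kernel function.

```c
#include <stdint.h>

/*
 * Convert a block number to a page index by dropping sizebits low bits.
 * Returns 0 and stores the index on success, or -1 if the index does
 * not fit in 32 bits. sizebits must be less than 64.
 */
int block_to_index(uint64_t block, unsigned sizebits, uint32_t *index)
{
    uint32_t idx = (uint32_t)(block >> sizebits);

    /* If the round trip changes the value, high bits were lost. */
    if ((uint64_t)idx != (block >> sizebits))
        return -1;

    *index = idx;
    return 0;
}
```

Failing early here means the caller never instantiates a page at the wrong offset, which is the condition that sent __getblk_slow() into its loop.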
Linus Torvalds [Thu, 4 Jan 2007 22:23:27 +0000 (23:23 +0100)]
i386: save/restore eflags in context switch (CVE-2006-5173)
(And reset it on new thread creation)
It turns out that eflags is important to save and restore not just
because of iopl, but due to the magic bits like the NT bit, which we
don't want leaking between different threads.
Backported to 2.6.16 by Chuck Ebbert <76306.1226@compuserve.com>
[Backport consisted of removing the CFI annotations.]
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Marcel Holtmann [Thu, 4 Jan 2007 00:53:41 +0000 (01:53 +0100)]
Call init_timer() for ISDN PPP CCP reset state timer (CVE-2006-5749)
The function isdn_ppp_ccp_reset_alloc_state() sets ->timer.function
and ->timer.data and later on calls add_timer() with no init_timer()
ever done.
Noted by Al Viro.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Linus Torvalds [Thu, 4 Jan 2007 00:44:45 +0000 (01:44 +0100)]
Fix incorrect user space access locking in mincore() (CVE-2006-4814)
Doug Chapman noticed that mincore() will do a "copy_to_user()" of the
result while holding the mmap semaphore for reading, which is a big
no-no. While a recursive read-lock on a semaphore in the case of a page
fault happens to work, we don't actually allow them due to deadlock
scenarios with writers due to fairness issues.
Doug and Marcel sent in a patch to fix it, but I decided to just rewrite
the mess instead - not just fixing the locking problem, but making the
code smaller and (imho) much easier to understand.
Also included are two fixes for the original patch including one
by Oleg Nesterov.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Miklos Szeredi [Thu, 4 Jan 2007 00:14:06 +0000 (01:14 +0100)]
fuse: fix hang on SMP
Fuse didn't always call i_size_write() with i_mutex held which caused
rare hangs on SMP/32bit. This bug has been present since fuse-2.2,
well before being merged into mainline.
The simplest solution is to protect i_size_write() with the
per-connection spinlock. Using i_mutex for this purpose would require
some restructuring of the code and I'm not even sure it's always safe
to acquire i_mutex in all places i_size needs to be set.
Since most of vmtruncate is already duplicated for other reasons,
duplicate the remaining part as well, making all i_size_write() calls
internal to fuse.
Using i_size_write() was unnecessary in fuse_init_inode(), since this
function is only called on a newly created locked inode.
Reported by a few people over the years, but special thanks to Dana
Henriksen who was persistent enough in helping me debug it.
Adrian Bunk:
Backported to 2.6.16.
Signed-off-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Robert Olsson [Wed, 3 Jan 2007 23:57:17 +0000 (00:57 +0100)]
[PKTGEN]: Fix module load/unload races.
Adrian Bunk:
Backported to 2.6.16.
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Dirk Eibach [Wed, 3 Jan 2007 23:42:01 +0000 (00:42 +0100)]
i2c: fix broken ds1337 initialization
On a custom board with ds1337 RTC I found that upgrade from 2.6.15 to
2.6.18 broke RTC support.
The main problem is the changes to ds1337_init_client().
When a ds1337 recognizes a problem (e.g. power or clock failure) bit 7
in status register is set. This has to be reset by writing 0 to status
register. But since there are only 16 bytes written to the chip and the
first byte is interpreted as an address, the status register (which is
the 16th) is never written.
The other problem is that initializing all registers to zero is not
valid for the day, date and month registers. Funnily enough this is checked by
ds1337_detect(), which depends on these values not being zero. So once
treated by ds1337_init_client() the ds1337 is not detected anymore,
whereas the failure bit in the status register is still set.
Broken by commit f9e8957937ebf60d22732a5ca9130f48a7603f60 (2.6.16-rc1,
2006-01-06). This fix is in Linus' tree since 2.6.20-rc1 (commit
763d9c046a2e511ec090a8986d3f85edf7448e7e).
Signed-off-by: Dirk Stieler <stieler@gdsys.de>
Signed-off-by: Dirk Eibach <eibach@gdsys.de>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
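The off-by-one in the ds1337 message above is plain arithmetic: when the first byte of a block write is the starting register address, a 16-byte buffer carries only 15 data bytes. A small sketch follows, with hypothetical names and the status register taken as index 15 (the 16th register, as the commit message says).

```c
#define STATUS_REG 15  /* the 16th register, counting from 0 */

/*
 * Index of the last register touched by a block write of buf_len bytes
 * whose first byte is the starting register address.
 * Returns -1 if the buffer carries no data bytes.
 */
int last_register_written(int start_reg, int buf_len)
{
    if (buf_len < 2)
        return -1;
    return start_reg + (buf_len - 1) - 1;
}

/* Nonzero if the write reaches the status register. */
int write_reaches_status(int start_reg, int buf_len)
{
    return last_register_written(start_reg, buf_len) >= STATUS_REG;
}
```

A 16-byte write starting at register 0 stops at register 14, so the failure bit in the status register is left as it was; one more byte would be needed to reach it.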
Patrick McHardy [Wed, 3 Jan 2007 23:38:10 +0000 (00:38 +0100)]
NET_SCHED: Fix fallout from dev->qdisc RCU change
The move of qdisc destruction to a rcu callback broke locking in the
entire qdisc layer by invalidating previously valid assumptions about
the context in which changes to the qdisc tree occur.
The two assumptions were:
- since changes only happen in process context, read_lock doesn't need
bottom half protection. Now invalid since destruction of inner qdiscs,
classifiers, actions and estimators happens in the RCU callback unless
they're manually deleted, resulting in dead-locks when read_lock in
process context is interrupted by write_lock_bh in bottom half context.
- since changes only happen under the RTNL, no additional locking is
necessary for data not used during packet processing (f.e. u32_list).
Again, since destruction now happens in the RCU callback, this assumption
is not valid anymore, causing races while using this data, which can
result in corruption or use-after-free.
Instead of "fixing" this by disabling bottom halves everywhere and adding
new locks/refcounting, this patch makes these assumptions valid again by
moving destruction back to process context. Since only the dev->qdisc
pointer is protected by RCU, but ->enqueue and the qdisc tree are still
protected by dev->qdisc_lock, destruction of the tree can be performed
immediately and only the final free needs to happen in the rcu callback
to make sure dev_queue_xmit doesn't access already freed memory.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Adrian Bunk [Tue, 26 Dec 2006 10:47:45 +0000 (11:47 +0100)]
Linux 2.6.16.37
Adrian Bunk [Mon, 18 Dec 2006 20:08:14 +0000 (21:08 +0100)]
Linux 2.6.16.37-rc1
Trond Myklebust [Mon, 18 Dec 2006 19:55:04 +0000 (20:55 +0100)]
NFS: nfs_lookup - don't hash dentry when optimising away the lookup
If the open intents tell us that a given lookup is going to result in an
exclusive create, we currently optimize away the lookup call itself. The
reason is that the lookup would not be atomic with the create RPC call, so
why do it in the first place?
A problem occurs, however, if the VFS aborts the exclusive create operation
after the lookup, but before the call to create the file/directory: in this
case we will end up with a hashed negative dentry in the dcache that has
never been looked up.
Fix this by only actually hashing the dentry once the create operation has
been successfully completed.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Brian King [Mon, 18 Dec 2006 19:53:26 +0000 (20:53 +0100)]
[SCSI] DAC960: PCI id table fixup
The PCI ID table in the DAC960 driver conflicts with some devices
that use the ipr driver. All ipr adapters that use this chip
have an IBM subvendor ID and all DAC960 adapters that use this
chip have a Mylex subvendor id.
Signed-off-by: Brian King <brking@us.ibm.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Stephen Hemminger [Mon, 18 Dec 2006 19:51:43 +0000 (20:51 +0100)]
bridge-netfilter: don't overwrite memory outside of skb
The bridge netfilter code needs to check for space at the
front of the skb before overwriting; otherwise if skb from
device doesn't have headroom, then it will cause random
memory corruption.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Andrew Morton [Sun, 17 Dec 2006 23:39:11 +0000 (00:39 +0100)]
hvc_console suspend fix
Fix http://bugzilla.kernel.org/show_bug.cgi?id=7152
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Akinobu Mita [Sun, 17 Dec 2006 23:23:42 +0000 (00:23 +0100)]
[WATCHDOG] sc1200wdt.c pnp unregister fix.
If no devices are found or an invalid parameter is specified,
scl200wdt_pnp_driver is never unregistered.
This breaks the global list of pnp drivers.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Dave Jones [Sun, 17 Dec 2006 23:13:47 +0000 (00:13 +0100)]
[WATCHDOG] sc1200wdt.c printk fix
Fix printk output.
sc1200wdt: build 20020303<3>sc1200wdt: io parameter must be specified
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Jeff Garzik [Sun, 17 Dec 2006 23:07:43 +0000 (00:07 +0100)]
ISDN: fix drivers, by handling errors thrown by ->readstat()
This is a particularly ugly on-failure bug, possibly security, since the
lack of error handling here is covering up another class of bug: failure to
handle copy_to_user() return values.
The I4L API function ->readstat() returns an integer, and by looking at
several existing driver implementations, it is clear that a negative return
value was meant to indicate an error.
Given that several drivers already return a negative value indicating an
errno-style error, the current code would blindly accept that [negative]
value as a valid amount of bytes read. Obvious damage ensues.
Correcting ->readstat() handling to properly notice errors fixes the
existing code to work correctly on error, and enables future patches to
more easily indicate errors during operation.
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
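The ->readstat() handling described above amounts to checking the sign of the return value before using it as a length. The sketch below shows that pattern with a hypothetical helper, account_readstat(); it is not the I4L code, and the running-total bookkeeping is an assumption made for the example.

```c
#include <stddef.h>

/*
 * Fold one ->readstat()-style return value into a running byte count.
 * Negative values are errno-style errors and are passed back unchanged;
 * non-negative values are byte counts and are added to *total.
 */
int account_readstat(int ret, size_t *total)
{
    if (ret < 0)
        return ret;  /* error: never treat this as a number of bytes */

    *total += (size_t)ret;
    return 0;
}
```

Without the sign check, a negative return converted to an unsigned count becomes a very large length, which is the "obvious damage" the message refers to.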
Francois Romieu [Sun, 17 Dec 2006 21:14:09 +0000 (22:14 +0100)]
r8169: tweak the PCI data parity error recovery
The 8110SB based n2100 board signals a lot of what ought to be
PCI data parity errors during operation of the 8169 as target.
Experiment proved that the driver can ignore the error and
process the packet as if nothing had happened.
Let's add an ad-hoc knob to enable users to fix their system while
avoiding the risks of a wholesale change.
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Arnaud Patard [Sun, 17 Dec 2006 21:12:09 +0000 (22:12 +0100)]
r8169: fix infinite loop during hotplug
Bug reported for PCMCIA.
Signed-off-by: Arnaud Patard <apatard@mandriva.com>
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Francois Romieu [Sun, 17 Dec 2006 21:11:16 +0000 (22:11 +0100)]
r8169: RX fifo overflow recovery
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Arjan van de Ven [Sun, 17 Dec 2006 20:49:57 +0000 (21:49 +0100)]
x86-64: Mark rdtsc as sync only for netburst, not for core2
On the Core2 cpus, the rdtsc instruction is not serializing (as defined
in the architecture reference since rdtsc exists) and due to the deep
speculation of these cores, it's possible that you can observe time go
backwards between cores due to this speculation. Since the kernel
already deals with this with the SYNC_RDTSC flag, the solution is
simple, only assume that the instruction is serializing on family 15...
The price one pays for this is a slightly slower gettimeofday (by a
dozen or two cycles), but that increase is quite small to pay for a
really-going-forward tsc counter.
Backport by Chris Wright.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
David S. Miller [Sun, 17 Dec 2006 20:37:17 +0000 (21:37 +0100)]
[IPV4] ip_fragment: Always compute hash with ipfrag_lock held.
Otherwise we could compute an inaccurate hash due to the
random seed changing.
Noticed by Zach Brown and patch is based upon some feedback
from Herbert Xu.
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Robin Holt [Sun, 17 Dec 2006 20:34:56 +0000 (21:34 +0100)]
IA64: bte_unaligned_copy() transfers one extra cache line.
When called to do a transfer that has a start offset within the cache
line which is uneven between source and destination and a length which
terminates the source of the copy exactly on a cache line, one extra
line gets copied into a temporary buffer. This is normally not an issue
since the buffer is a kernel buffer and only the requested information
gets copied into the user buffer.
The problem arises when the source ends at the very last physical page
of memory. That last cache line does not exist and results in the SHUB
chip raising an MCA.
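The arithmetic behind this kind of bug can be sketched as follows. This is
an illustration only: the 128-byte line size and both helper functions are
assumptions, and `lines_off_by_one()` is a hypothetical example of the
mistake, not the code that was in bte_unaligned_copy().

```c
#include <stddef.h>

#define CACHE_LINE 128u     /* assumed cache line size, for illustration */

/* Number of source cache lines a copy really touches. */
static size_t lines_needed(size_t src_off, size_t len)
{
    return (src_off + len + CACHE_LINE - 1) / CACHE_LINE;
}

/*
 * Hypothetical faulty variant: adds one line unconditionally, which is
 * one too many whenever the source ends exactly on a line boundary.
 */
static size_t lines_off_by_one(size_t src_off, size_t len)
{
    return (src_off + len) / CACHE_LINE + 1;
}
```

With a source that starts 64 bytes into a line and is 64 bytes long, the
copy ends exactly on the boundary: one line is needed, but the faulty
variant asks for two.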
Signed-off-by: Robin Holt <holt@sgi.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Tejun Heo [Sun, 17 Dec 2006 20:32:00 +0000 (21:32 +0100)]
scsi: clear garbage after CDBs on SG_IO
ATAPI devices transfer a fixed number of bytes for CDBs (12 or 16). Some
ATAPI devices choke when a shorter CDB is used and the remaining bytes
contain garbage. Block SG_IO cleared the remaining bytes but SCSI SG_IO
didn't. This patch makes SCSI SG_IO clear them and simplifies CDB clearing
in block SG_IO.
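A minimal userspace sketch of the clearing technique, assuming a 16-byte
command buffer. `MAX_CDB` and `fill_cdb()` are hypothetical names and do
not appear in the patch.

```c
#include <string.h>

#define MAX_CDB 16      /* assumed size of the command buffer */

/*
 * Copy a CDB of cmd_len bytes into a MAX_CDB-byte buffer and zero the
 * unused tail, so a device that always reads 12 or 16 bytes never sees
 * stale data. Returns 0 on success, -1 if the CDB does not fit.
 */
static int fill_cdb(unsigned char dst[MAX_CDB],
                    const unsigned char *cmd, size_t cmd_len)
{
    if (cmd_len > MAX_CDB)
        return -1;
    memset(dst, 0, MAX_CDB);        /* clear everything first */
    memcpy(dst, cmd, cmd_len);      /* then copy the real command */
    return 0;
}
```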
Signed-off-by: Tejun Heo <htejun@gmail.com>
Acked-by: Jens Axboe <jens.axboe@oracle.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Linus Torvalds [Fri, 15 Dec 2006 00:56:30 +0000 (01:56 +0100)]
AGP: Allocate AGP pages with GFP_DMA32 by default
Not all graphic page remappers support physical addresses over the 4GB
mark for remapping, so while some do (the AMD64 GART always did, and I
just fixed the i965 to do so properly), we're safest off just forcing
GFP_DMA32 allocations to make sure graphics pages get allocated in the
low 32-bit address space by default.
AGP sub-drivers that really care, and can do better, could just choose
to implement their own allocator (or we could add another "64-bit safe"
default allocator for their use), but quite frankly, you're not likely
to care in practice.
So for now, this trivial change means that we won't be allocating pages
that we can't map correctly by mistake on x86-64.
[ On traditional 32-bit x86, this could never happen, because GFP_KERNEL
would never allocate any highmem memory anyway ]
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Neil Brown [Fri, 15 Dec 2006 00:48:58 +0000 (01:48 +0100)]
md: Fix md grow/size code to correctly find the maximum available space
An md array can be asked to change the amount of each device that it is using,
and in particular can be asked to use the maximum available space. This
currently only works if the first device is not larger than the rest,
because 'size' gets changed and so 'fit' becomes wrong. So check if a 'fit'
is required early and don't corrupt it.
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Zachary Amsden [Fri, 15 Dec 2006 00:38:04 +0000 (01:38 +0100)]
softirq: remove BUG_ONs which can incorrectly trigger
It is possible to have tasklets get scheduled before softirqd has had a chance
to spawn on all CPUs. This is totally harmless; after success during action
CPU_UP_PREPARE, action CPU_ONLINE will be called, which immediately wakes
softirqd on the appropriate CPU to process the already pending tasklets. So
there is no danger of having a missed wakeup for any tasklets that were
already pending.
In particular, i386 is affected by this during startup, and is visible when
using a very large initrd; during the time it takes for the initrd to be
decompressed, a timer IRQ can come in and schedule RCU callbacks. It is also
possible that resending of a hardware IRQ via a softirq triggers the same bug.
Because of different timing conditions, this shows up in all emulators and
virtual machines tested, including Xen, VMware, Virtual PC, and Qemu. It is
also possible to trigger on native hardware with a large enough initrd,
although I don't have a reliable case demonstrating that.
Signed-off-by: Zachary Amsden <zach@vmware.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Christophe Saout [Fri, 15 Dec 2006 00:21:59 +0000 (01:21 +0100)]
dm crypt: Fix data corruption with dm-crypt over RAID5
Fix corruption issue with dm-crypt on top of software raid5. Cancelled
readahead bios that report no error but merely have BIO_UPTODATE cleared
were reported as successful reads to the higher layers, leaving random
content in the buffer cache. Already fixed in 2.6.19.
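The check can be sketched in userspace as below. `read_status()` is a
hypothetical helper, and the boolean `uptodate` stands in for the
BIO_UPTODATE flag; the actual patch operates on struct bio inside dm-crypt.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Decide what status a completed read should report upward.
 * A bio with no error code but without the up-to-date flag (for example
 * a cancelled readahead) must not count as a successful read.
 */
static int read_status(int error, bool uptodate)
{
    if (!error && !uptodate)
        return -EIO;
    return error;
}
```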
Signed-off-by: Christophe Saout <christophe@saout.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Christophe Saout [Fri, 15 Dec 2006 00:20:35 +0000 (01:20 +0100)]
Fix SUNRPC wakeup/execute race condition
The sunrpc scheduler contains a race condition that can let an RPC
task end up being neither running nor on any wait queue. The race takes
place between rpc_make_runnable (called from rpc_wake_up_task) and
__rpc_execute under the following condition:
First __rpc_execute calls tk_action which puts the task on some wait
queue. The task is dequeued by another process before __rpc_execute
continues its execution. While executing rpc_make_runnable exactly after
setting the task `running' bit and before clearing the `queued' bit
__rpc_execute picks up execution, clears `running' and subsequently
both functions fall through, both under the false assumption somebody
else took the job.
Swapping rpc_test_and_set_running with rpc_clear_queued in
rpc_make_runnable fixes that hole. This introduces another possible
race condition that can be handled by checking for `queued' after
setting the `running' bit.
Bug noticed on a 4-way x86_64 system under XEN with an NFSv4 server
on the same physical machine, apparently one of the few ways to hit
this race condition at all.
Signed-off-by: Christophe Saout <christophe@saout.de>
Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Dave Jones [Thu, 14 Dec 2006 23:40:39 +0000 (00:40 +0100)]
[ALSA] fix usbmixer double kfree
snd_ctl_add() kfree's the kcontrol already if we fail there,
so this driver is currently doing a double kfree.
Coverity bug #959
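The ownership rule behind this and the following double-free fixes can be
sketched in userspace. All names here (`struct ctl`, `ctl_add()`,
`register_ctl()`, `free_calls`) are hypothetical stand-ins; `ctl_add()`
only models the documented behaviour that snd_ctl_add() frees the control
itself when it fails.

```c
#include <stdlib.h>

struct ctl { int id; };

static int free_calls;      /* counts releases, for illustration only */

static void ctl_free(struct ctl *c)
{
    free(c);
    free_calls++;
}

/* Models an add function that consumes (frees) its argument on failure. */
static int ctl_add(struct ctl *c, int fail)
{
    if (fail) {
        ctl_free(c);
        return -1;
    }
    return 0;
}

static int register_ctl(int fail)
{
    struct ctl *c = malloc(sizeof *c);
    int err;

    if (!c)
        return -1;
    c->id = 1;
    err = ctl_add(c, fail);
    if (err < 0)
        return err;     /* no free here: ctl_add() already released c */
    ctl_free(c);        /* demo-only teardown so the sketch does not leak */
    return 0;
}
```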
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Dave Jones [Thu, 14 Dec 2006 23:39:47 +0000 (00:39 +0100)]
[ALSA] sound/isa/sb/sb_mixer.c double kfree
snd_ctl_add() already does the free on error.
Coverity bug #957
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Dave Jones [Thu, 14 Dec 2006 23:39:00 +0000 (00:39 +0100)]
[ALSA] Fix use after free in opl3_seq and opl3_oss
Don't read from free'd memory. Also make use of the return
value, and don't register the device if something went wrong
creating the port.
Coverity #954, #955
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Dave Jones [Thu, 14 Dec 2006 23:37:55 +0000 (00:37 +0100)]
[ALSA] ad1848 double free
snd_ctl_add() already kfree's on error.
Coverity #956
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Adrian Bunk [Thu, 14 Dec 2006 23:28:27 +0000 (00:28 +0100)]
[ALSA] sound/pci/rme9652/hdspm.c: fix off-by-one errors
This patch fixes off-by-one errors found by the Coverity checker.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Adrian Bunk [Thu, 14 Dec 2006 23:25:52 +0000 (00:25 +0100)]
[ALSA] fix some memory leaks
This patch fixes two memory leaks spotted by the Coverity checker.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Adrian Bunk [Thu, 14 Dec 2006 23:24:31 +0000 (00:24 +0100)]
[ALSA] sound/core/: fix 3 off-by-one errors
This patch fixes three off-by-one errors found by the Coverity checker.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Peer Chen [Thu, 14 Dec 2006 22:25:25 +0000 (23:25 +0100)]
IDE: Add the support of nvidia PATA controllers of MCP67 to amd74xx.c
Add support for PATA controllers of MCP67 to amd74xx.c.
Signed-off-by: Peer Chen <pchen@nvidia.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Peer Chen [Thu, 14 Dec 2006 22:23:11 +0000 (23:23 +0100)]
pci_ids.h: Add NVIDIA PCI ID
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Randy Dunlap [Thu, 14 Dec 2006 22:21:30 +0000 (23:21 +0100)]
amd74xx.c: add some NVIDIA chipset IDs
Add support for some NVIDIA chipset IDs.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Andrew Chew [Thu, 14 Dec 2006 22:18:27 +0000 (23:18 +0100)]
sata_nv/amd74xx: Add MCP61 support
Added MCP61 support to sata_nv and amd74xx.
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Jeff Garzik [Thu, 14 Dec 2006 22:15:42 +0000 (23:15 +0100)]
[libata] sata_nv: add PCI IDs
Based on a patch contributed by Andrew Chew @ NVIDIA.
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Mark McLoughlin [Thu, 14 Dec 2006 22:09:07 +0000 (23:09 +0100)]
dm snapshot: fix metadata writing when suspending
When suspending a device-mapper device, dm_suspend() sleeps until all
necessary I/O is completed. This state is triggered by a callback from
persistent_commit(). But some I/O can still be issued *after* the callback
(to prepare the next metadata area for use if the current one is full). This
patch delays the callback until after that I/O is complete.
Signed-off-by: Mark McLoughlin <markmc@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Daniel Kobras [Thu, 14 Dec 2006 22:08:32 +0000 (23:08 +0100)]
dm: Fix deadlock under high i/o load in raid1 setup.
On an nForce4-equipped machine with two SATA disks in a raid1 setup using
dmraid, we experienced frequent deadlocks of the system under high i/o load.
'cat /dev/zero > ~/zero' was the most reliable way to reproduce them: Randomly
after a few GB, 'cp' would be left in 'D' state along with kjournald and
kmirrord. The functions in which cp and kjournald were blocked varied, but
kmirrord's wchan always pointed to 'mempool_alloc()'. We've seen this pattern
on 2.6.15 and 2.6.17 kernels. http://lkml.org/lkml/2005/4/20/142 indicates
that this problem has been around even longer.
So much for the facts, here's my interpretation: mempool_alloc() first tries
to atomically allocate the requested memory, or falls back to handing out
preallocated chunks from the mempool. If both fail, it puts the calling
process (kmirrord in this case) on a private waitqueue until somebody refills
the pool. But the only 'somebody' is kmirrord itself, so we have a
deadlock.
I worked around this problem by falling back to a (blocking) kmalloc where
kmirrord would previously have ended up on the waitqueue. This defeats part of
the benefits of using the mempool, but at least keeps the system running. And
it could be done with a two-line change. Note that mempool_alloc() clears the
GFP_NOIO flag internally, and only uses it to decide whether to wait or return
an error if immediate allocation fails, so the attached patch doesn't change
behaviour in the non-deadlocking case. Patch is against current git
(2.6.18-rc4), but should apply to earlier versions as well. I've tested on
2.6.15, where this patch makes the difference between random lockup and a
stable system.
Signed-off-by: Daniel Kobras <kobras@linux.de>
Acked-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Neil Brown [Thu, 14 Dec 2006 22:07:54 +0000 (23:07 +0100)]
dm: mirror sector offset fix
The device-mapper core does not perform any remapping of bios before passing
them to the targets. If a particular mapping begins part-way into a device,
targets obtain the sector relative to the start of the mapping by subtracting
ti->begin.
The dm-raid1 target didn't do this everywhere: this patch fixes it, taking
care to subtract ti->begin exactly once for each bio.
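A minimal sketch of the remapping arithmetic, assuming simplified types.
`struct target` and `target_relative()` are hypothetical; only the field
name `begin` mirrors ti->begin from the text above.

```c
#include <stdint.h>

typedef uint64_t sector_t;

/* Simplified target: begin is the first sector of the mapping. */
struct target {
    sector_t begin;
};

/*
 * Convert the sector carried by a bio into a sector relative to the
 * start of the mapping. This must be applied exactly once per bio.
 */
static sector_t target_relative(const struct target *ti, sector_t bio_sector)
{
    return bio_sector - ti->begin;
}
```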
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Jeff Mahoney [Thu, 14 Dec 2006 22:07:12 +0000 (23:07 +0100)]
dm: add module ref counting
The reference counting on dm-mod is zero if no mapped devices are open. This
is incorrect, and can lead to an oops if the module is unloaded while mapped
devices exist.
This patch claims a reference to the module whenever a device is created, and
drops it again when the device is freed.
Devices must be removed before dm-mod is unloaded.
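The counting scheme can be modelled in userspace as below. The counter and
the three helpers are hypothetical; in the kernel the reference is taken
and dropped on the module itself rather than on a plain integer.

```c
#include <stdbool.h>

static int module_refs;     /* stand-in for the dm-mod reference count */

/* Creating a mapped device takes one reference on the module. */
static void dev_create(void)
{
    module_refs++;
}

/* Freeing the device drops that reference again. */
static void dev_free(void)
{
    module_refs--;
}

/* Unloading is only safe once no mapped devices remain. */
static bool can_unload(void)
{
    return module_refs == 0;
}
```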
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Alasdair G Kergon [Thu, 14 Dec 2006 22:04:35 +0000 (23:04 +0100)]
dm snapshot: unify chunk_size
Persistent snapshots currently store a private copy of the chunk size.
Userspace also supplies the chunk size when loading a snapshot. Ensure
consistency by only storing the chunk_size in one place instead of two.
Currently the two sizes will differ if the chunk size supplied by userspace
does not match the chunk size an existing snapshot actually uses. Amongst
other problems, this causes an incorrect 'percentage full' to be reported.
The patch ensures consistency by only storing the chunk_size in one place,
removing it from struct pstore. Some initialisation is delayed until the
correct chunk_size is known. If read_header() discovers that the wrong chunk
size was supplied, the 'area' buffer (which the header already got read into)
is reinitialised to the correct size.
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Michal Miroslaw [Thu, 14 Dec 2006 22:02:33 +0000 (23:02 +0100)]
dm: BUG/OOPS fix
Fix BUG I tripped on while testing failover and multipathing.
BUG shows up on error path in multipath_ctr() when parse_priority_group()
fails after returning at least once without error. The fix is to
initialize m->ti early - just after alloc()ing it.
BUG: unable to handle kernel NULL pointer dereference at virtual address 00000000
 printing eip:
c027c3d2
*pde = 00000000
Oops: 0000 [#3]
Modules linked in: qla2xxx ext3 jbd mbcache sg ide_cd cdrom floppy
CPU:    0
EIP:    0060:[<c027c3d2>]    Not tainted VLI
EFLAGS: 00010202   (2.6.17.3 #1)
EIP is at dm_put_device+0xf/0x3b
eax: 00000001   ebx: ee4fcac0   ecx: 00000000   edx: ee4fcac0
esi: ee4fc4e0   edi: ee4fc4e0   ebp: 00000000   esp: c5db3e78
ds: 007b   es: 007b   ss: 0068
Process multipathd (pid: 15912, threadinfo=c5db2000 task=ef485a90)
Stack: ec4eda40 c02816bd ee4fc4c0 00000000 f7e89498 f883e0bc c02816f6 f7e89480
       f7e8948c c0281801 ffffffea f7e89480 f883e080 c0281ffe 00000001 00000000
       00000004 dfe9cab8 f7a693c0 f883e080 f883e0c0 ca4b99c0 c027c6ee 01400000
Call Trace:
 <c02816bd> free_pgpaths+0x31/0x45  <c02816f6> free_priority_group+0x25/0x2e
 <c0281801> free_multipath+0x35/0x67  <c0281ffe> multipath_ctr+0x123/0x12d
 <c027c6ee> dm_table_add_target+0x11e/0x18b  <c027e5b4> populate_table+0x8a/0xaf
 <c027e62b> table_load+0x52/0xf9  <c027ec23> ctl_ioctl+0xca/0xfc
 <c027e5d9> table_load+0x0/0xf9  <c0152146> do_ioctl+0x3e/0x43
 <c0152360> vfs_ioctl+0x16c/0x178  <c01523b4> sys_ioctl+0x48/0x60
 <c01029b3> syscall_call+0x7/0xb
Code: 97 f0 00 00 00 89 c1 83 c9 01 80 e2 01 0f 44 c1 88 43 14 8b 04 24 59 5b 5e
5f 5d c3 53 89 c1 89 d3 ff 4a 08 0f 94 c0 84 c0 74 2a <8b> 01 8b 10 89 d8 e8 f6
fb ff ff 8b 03 8b 53 04 89 50 04 89 02
EIP: [<c027c3d2>] dm_put_device+0xf/0x3b SS:ESP 0068:c5db3e78
Signed-off-by: Michal Miroslaw <mirq-linux@rere.qmqm.pl>
Acked-by: Alasdair G Kergon <agk@redhat.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Jean Delvare [Thu, 14 Dec 2006 21:40:00 +0000 (22:40 +0100)]
[SCSI] gdth: Fix && typos
Fix uses of "&&" where "&" was obviously intended instead.
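The difference between the two operators can be shown with a short sketch.
Both helpers are hypothetical and are not taken from the gdth driver: the
logical form is true whenever both operands are nonzero, while the bitwise
form tests the specific bits.

```c
#include <stdbool.h>

/* Mistyped test: true for any nonzero status and any nonzero mask. */
static bool has_flag_logical(unsigned int status, unsigned int mask)
{
    return status && mask;
}

/* Intended test: true only if a bit of mask is actually set in status. */
static bool has_flag_bitwise(unsigned int status, unsigned int mask)
{
    return (status & mask) != 0;
}
```

For a status of 0x01 and a mask of 0x04 the logical form reports the flag
as set, although the bit is clear.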
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Joerg Ahrens [Thu, 14 Dec 2006 21:29:54 +0000 (22:29 +0100)]
xirc2ps_cs: Cannot reset card in atomic context
I am using a Xircom CEM33 pcmcia NIC which has occasional hardware problems.
If the netdev watchdog detects a transmit timeout, do_reset is called which
msleeps - this is illegal in atomic context.
This patch schedules the timeout handling as a workqueue item.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Alexey Kuznetsov [Thu, 14 Dec 2006 21:28:51 +0000 (22:28 +0100)]
[IPV4]: severe locking bug in fib_semantics.c
Found in 2.4 by Yixin Pan <yxpan@hotmail.com>.
> When I read fib_semantics.c of Linux-2.4.32, write_lock(&fib_info_lock)
> is used in fib_release_info() instead of write_lock_bh(&fib_info_lock).
> Is the following case possible: a BH interrupts fib_release_info() while
> holding the write lock, and calls ip_check_fib_default() which calls
> read_lock(&fib_info_lock), and spin forever.
Signed-off-by: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Hans Verkuil [Thu, 14 Dec 2006 21:26:40 +0000 (22:26 +0100)]
V4L: Fix broken TUNER_LG_NTSC_TAPE radio support
The TUNER_LG_NTSC_TAPE is identical in all respects to the
TUNER_PHILIPS_FM1236_MK3. So use the params struct for the Philips
tuner.
Also add this LG_NTSC_TAPE tuner to the switches where radio specific
parameters are set so it behaves like a TUNER_PHILIPS_FM1236_MK3. This
change fixes the radio support for this tuner (the wrong bandswitch byte
was used).
Thanks to Andy Walls <cwalls@radix.net> for finding this bug.
Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
Signed-off-by: Michael Krufky <mkrufky@linuxtv.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Michael Krufky [Thu, 14 Dec 2006 21:25:04 +0000 (22:25 +0100)]
DVB: lgdt330x: fix signal / lock status detection bug
In some cases when using VSB, the AGC status register has been known to
falsely report "no signal" when in fact there is a carrier lock. The
datasheet labels these status flags as QAM only, yet the lgdt330x
module is using these flags for both QAM and VSB.
This patch allows for the carrier recovery lock status register to be
tested, even if the agc signal status register falsely reports no signal.
Thanks to jcrews from #linuxtv in irc, for initially reporting this bug.
Signed-off-by: Michael Krufky <mkrufky@linuxtv.org>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Herbert Xu [Thu, 14 Dec 2006 20:27:09 +0000 (21:27 +0100)]
[CRYPTO] sha512: Fix sha384 block size
The SHA384 block size should be 128 bytes, not 96 bytes. This was
spotted by Andrew Donofrio.
The wrong value breaks HMAC, which uses the block size during setup and
the final calculation.
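To see why HMAC depends on the block size, here is a minimal sketch of the
inner-pad setup step: the key is zero-padded to one full block and XORed
with 0x36. `hmac_ipad()` is a hypothetical helper, not the kernel's crypto
API, and it omits the case of keys longer than one block (which HMAC
hashes first).

```c
#include <stddef.h>
#include <string.h>

#define SHA384_BLOCK_SIZE  128  /* bytes; SHA-384 shares SHA-512's block */
#define SHA384_DIGEST_SIZE 48   /* bytes */

/*
 * Build the HMAC inner pad for a key that fits in one block.
 * pad must have room for block bytes. Returns 0 on success, -1 if the
 * key is longer than the block.
 */
static int hmac_ipad(unsigned char *pad, size_t block,
                     const unsigned char *key, size_t key_len)
{
    size_t i;

    if (key_len > block)
        return -1;
    memset(pad, 0, block);          /* zero-pad to a full block */
    memcpy(pad, key, key_len);
    for (i = 0; i < block; i++)
        pad[i] ^= 0x36;             /* ipad constant from the HMAC spec */
    return 0;
}
```

With a 96-byte block the pad would be 32 bytes short, so the resulting MAC
would not match any other SHA-384 HMAC implementation.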
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Adrian Bunk [Wed, 13 Dec 2006 17:15:47 +0000 (18:15 +0100)]
Linux 2.6.16.36
Adrian Bunk [Sat, 9 Dec 2006 15:23:02 +0000 (16:23 +0100)]
Linux 2.6.16.36-rc1
Chuck Ebbert [Sat, 9 Dec 2006 15:21:59 +0000 (16:21 +0100)]
binfmt_elf: fix checks for bad address
Fix check for bad address; use macro instead of open-coding two checks.
Taken from RHEL4 kernel update.
From: Ernie Petrides <petrides@redhat.com>
For background, the BAD_ADDR() macro should return TRUE if the address is
TASK_SIZE, because that's the lowest address that is *not* valid for
user-space mappings. The macro was correct in binfmt_aout.c but was wrong
for the "equal to" case in binfmt_elf.c. There were two in-line validations
of user-space addresses in binfmt_elf.c, which have been appropriately
converted to use the corrected BAD_ADDR() macro in the patch you posted
yesterday. Note that the size checks against TASK_SIZE are okay as coded.
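The boundary case can be sketched as follows. The TASK_SIZE value is an
assumption chosen for illustration (the common i386 3G/1G split), and
BAD_ADDR_OLD is a reconstruction of the pre-fix behaviour from the
description above, not a quote from the source.

```c
/* Assumed value for illustration: first address above user space. */
#define TASK_SIZE       0xC0000000UL

/* Corrected check: TASK_SIZE itself is already an invalid address. */
#define BAD_ADDR(x)     ((unsigned long)(x) >= TASK_SIZE)

/* Reconstructed old check: misses the "equal to" case. */
#define BAD_ADDR_OLD(x) ((unsigned long)(x) > TASK_SIZE)
```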
The additional changes that I propose are below. These are in the error
paths for bad ELF entry addresses once load_elf_binary() has already
committed to exec'ing the new image (following the tearing down of the
task's original address space).
The 1st hunk deals with the interp-side of the outer "if". There were two
problems here. The printk() should be removed because this path can be
triggered at will by a bogus interpreter image created and used by a
malicious user. Further, the error code should not be ENOEXEC, because that
causes the loop in search_binary_handler() to continue trying other exec
handlers (twice, in fact). But it's too late for this to work correctly,
because the user address space has already been torn down, and an exec()
failure cannot be returned to the user code because the code no longer
exists. The only recovery is to force a SIGSEGV, but it's best to terminate
the search loop immediately. I somewhat arbitrarily chose EINVAL as a
fallback error code, but any error returned by load_elf_interp() will
override that (but this value will never be seen by user-space).
The 2nd hunk deals with the non-interp-side of the outer "if". There were
two problems here as well. The SIGSEGV needs to be forced, because a prior
sigaction() syscall might have set the associated disposition to SIG_IGN.
And the ENOEXEC should be changed to EINVAL as described above.
Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
David S. Miller [Sat, 9 Dec 2006 15:16:56 +0000 (16:16 +0100)]
[IPSEC]: Fix inetpeer leak in ipv4 xfrm dst entries.
We grab a reference to the route's inetpeer entry but
forget to release it in xfrm4_dst_destroy().
Bug discovered by Kazunori MIYAZAWA <kazunori@miyazawa.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Patrick McHardy [Sat, 9 Dec 2006 15:14:39 +0000 (16:14 +0100)]
[XFRM]: Use output device disable_xfrm for forwarded packets
Currently the behaviour of disable_xfrm is inconsistent between
locally generated and forwarded packets. For locally generated
packets disable_xfrm disables the policy lookup if it is set on
the output device, for forwarded traffic however it looks at the
input device. This makes it impossible to disable xfrm on all
devices but a dummy device and use normal routing to direct
traffic to that device.
Always use the output device when checking disable_xfrm.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Adrian Bunk [Fri, 8 Dec 2006 16:00:35 +0000 (17:00 +0100)]
revert the quirk_via_irq changes
This patch reverts the quirk_via_irq changes in 2.6.16.17 that
caused regressions for several people.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Daniel Ritz [Wed, 6 Dec 2006 19:36:32 +0000 (20:36 +0100)]
PCI: add ICH7/8 ACPI/GPIO io resource quirks
Signed-off-by: Daniel Ritz <daniel.ritz@gmx.ch>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Daniel Ritz [Wed, 6 Dec 2006 19:19:36 +0000 (20:19 +0100)]
PCI: fix ICH6 quirks
- add the ICH6(R) LPC to the ICH6 ACPI quirks. currently only the ICH6-M
is handled. [ PCI_DEVICE_ID_INTEL_ICH6_1 is the ICH6-M LPC, ICH6_0 is
the ICH6(R) ]
Signed-off-by: Daniel Ritz <daniel.ritz@gmx.ch>
Signed-off-by: Adrian Bunk <bunk@stusta.de>