Fernando Luis Vázquez Cao [Mon, 13 Jun 2011 15:06:58 +0000 (15:06 +0000)]
IGMP snooping: set mrouters_only flag for IPv6 traffic properly
[ upstream commit
fc2af6c73fc9449cd5894a36bb76b8f8c0e49fd8 ]
Upon reception of an MGM report packet the kernel sets the mrouters_only flag
in an skb that is a clone of the original skb, which means that the bridge
loses track of MGM packets (cb buffers are tied to a specific skb and not
shared) and it ends up forwarding join requests to the bridge interface.
This can cause unexpected membership timeouts and intermittent/permanent loss
of connectivity as described in RFC 4541 [2.1.1. IGMP Forwarding Rules]:
A snooping switch should forward IGMP Membership Reports only to
those ports where multicast routers are attached.
[...]
Sending membership reports to other hosts can result, for IGMPv1
and IGMPv2, in unintentionally preventing a host from joining a
specific multicast group.
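A minimal sketch of why the flag gets lost, assuming a simplified model in
which every buffer carries its own private control block. toy_skb, toy_br_cb
and toy_clone() are hypothetical stand-ins, not the kernel's sk_buff API:

```c
#include <stdbool.h>

/* Hypothetical stand-in for the bridge's per-packet control block. */
struct toy_br_cb {
    bool mrouters_only;
};

/* Hypothetical stand-in for an skb: the cb area belongs to this buffer only. */
struct toy_skb {
    struct toy_br_cb cb;
};

/* Cloning copies the cb by value; later writes to the clone stay in the clone. */
static struct toy_skb toy_clone(const struct toy_skb *orig)
{
    return *orig;
}

/* Buggy pattern: the flag is set on the clone, then the original is forwarded. */
static bool toy_flag_visible_when_set_on_clone(void)
{
    struct toy_skb orig = { .cb = { .mrouters_only = false } };
    struct toy_skb clone = toy_clone(&orig);

    clone.cb.mrouters_only = true;
    return orig.cb.mrouters_only;
}

/* Fixed pattern: the flag is set on the buffer that is actually forwarded. */
static bool toy_flag_visible_when_set_on_original(void)
{
    struct toy_skb orig = { .cb = { .mrouters_only = false } };

    orig.cb.mrouters_only = true;
    return orig.cb.mrouters_only;
}
```

In this model the forwarding path only ever looks at the original buffer, so
a flag written to the clone never influences the forwarding decision.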
Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Signed-off-by: David S. Miller <davem@conan.davemloft.net>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Fernando Luis Vázquez Cao [Mon, 13 Jun 2011 15:04:43 +0000 (15:04 +0000)]
IGMP snooping: set mrouters_only flag for IPv4 traffic properly
[ upstream commit
62b2bcb49cca72f6d3f39f831127a6ab315a475d ]
Upon reception of an IGMP/IGMPv2 membership report the kernel sets the
mrouters_only flag in an skb that may be a clone of the original skb, which
means that sometimes the bridge loses track of membership report packets (cb
buffers are tied to a specific skb and not shared) and it ends up forwarding
join requests to the bridge interface.
This can cause unexpected membership timeouts and intermittent/permanent loss
of connectivity as described in RFC 4541 [2.1.1. IGMP Forwarding Rules]:
A snooping switch should forward IGMP Membership Reports only to
those ports where multicast routers are attached.
[...]
Sending membership reports to other hosts can result, for IGMPv1
and IGMPv2, in unintentionally preventing a host from joining a
specific multicast group.
Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Tested-by: Hayato Kakuta <kakuta.hayato@oss.ntt.co.jp>
Signed-off-by: David S. Miller <davem@conan.davemloft.net>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Thomas Graf [Fri, 3 Sep 2010 03:04:20 +0000 (03:04 +0000)]
ipv6: add special mode forwarding=2 to send RS while configured as router
[ upstream commit
c3bccac2fa76f1619dfe4fb7b9bee69de7f066d8 ]
Similar to accepting router advertisement, the IPv6 stack does not send router
solicitations if forwarding is enabled.
This patch enables this behavior to be overruled by setting forwarding to the
special value 2.
Signed-off-by: Thomas Graf <tgraf@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Suresh Siddha [Thu, 23 Jun 2011 18:19:26 +0000 (11:19 -0700)]
x86, mtrr: lock stop machine during MTRR rendezvous sequence
[ upstream commit
6d3321e8e2b3bf6a5892e2ef673c7bf536e3f904 ]
The MTRR rendezvous sequence using stop_one_cpu_nowait() can potentially
happen in parallel with another system-wide rendezvous using
stop_machine(). This can lead to deadlock: the order in which
works are queued can be different on different CPUs, so some CPUs
will be running the first rendezvous handler and others will be running
the second rendezvous handler, each set waiting for the other set to join
for the system-wide rendezvous.
The MTRR rendezvous sequence is not implemented using stop_machine() because it
gets called both from process context and from the CPU online paths
(where the CPU has not come online and interrupts are disabled, etc.).
stop_machine() works only with online CPUs.
For now, take the stop_machine mutex in the MTRR rendezvous sequence that
gets called from an online CPU (here we are in process context
and can potentially sleep while taking the mutex). The MTRR rendezvous
that gets triggered during CPU online doesn't need to take this stop_machine
lock (as stop_machine() already ensures that there is no CPU hotplug
going on in parallel by doing get_online_cpus()).
TBD: Pursue a cleaner solution of extending the stop_machine()
infrastructure to handle the case where the calling cpu is
still not online and use this for MTRR rendezvous sequence.
fixes: https://bugzilla.novell.com/show_bug.cgi?id=672008
Reported-by: Vadim Kotelnikov <vadimuzzz@inbox.ru>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/20110623182056.807230326@sbsiddha-MOBL3.sc.intel.com
Cc: stable@kernel.org # 2.6.35+, backport a week or two after this gets more testing in mainline
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Ralf Baechle [Mon, 27 Jun 2011 13:26:56 +0000 (14:26 +0100)]
SERIAL: SC26xx: Fix link error.
[ upstream commit
f2eb3cdf14457fccb14ae8c4d7d7cee088cd3957 ]
Kconfig allows enabling console support for the SC26xx driver even when
it's configured as a module resulting in a:
ERROR: "uart_console_device" [drivers/tty/serial/sc26xx.ko] undefined!
modpost error since the driver was merged in
eea63e0e8a60d00485b47fb6e75d9aa2566b989b [SC26XX: New serial driver for
SC2681 uarts] in 2.6.25. Fixed by only allowing console support to be
enabled if the driver is builtin.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-serial@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mips@linux-mips.org
Cc: stable <stable@kernel.org>
Acked-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Mark Brown [Sat, 2 Jul 2011 00:24:46 +0000 (17:24 -0700)]
ASoC: Ensure we delay long enough for WM8994 FLL to lock when starting
[ upstream commit
8e9ddf811ba021506d2316fcfe619faa0ab3f567 ]
This delay is very conservative.
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Acked-by: Liam Girdwood <lrg@ti.com>
Cc: stable@kernel.org
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Will Deacon [Fri, 1 Jul 2011 13:38:12 +0000 (14:38 +0100)]
ARM: 6989/1: perf: do not start the PMU when no events are present
[ upstream commit
f4f38430c94c38187db73a2cf3892cc8b12a2713 ]
armpmu_enable can be called in situations where no events are present
(for example, from the event rotation tick after a profiled task has
exited). In this case, we currently start the PMU anyway which may
leave it active inevitably without any events being monitored.
This patch adds a simple check to the enabling code so that we avoid
starting the PMU when no events are present.
Cc: <stable@kernel.org>
Reported-by: Ashwin Chaugle <ashwinc@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
K. Y. Srinivasan [Thu, 16 Jun 2011 20:16:37 +0000 (13:16 -0700)]
Staging: hv: netvsc: Fix a bug in accounting transmit slots
[ upstream commit
9079ce691255792009c446d8c3382507b8d38635 ]
The transmit slots were manipulated without proper locking. Fix this bug by
making the variable tracking the transmit slots atomic.
This patch should be ported to prior stable kernels 2.6.32 and later.
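A rough sketch of the idea in portable C11, assuming a simple counter of free
slots. The names toy_netdev, toy_take_slot() and toy_release_slot() are
hypothetical; the driver's real variable and helpers are not shown in this
message:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical device state: number of transmit slots currently free. */
struct toy_netdev {
    atomic_int avail_slots;
};

/* Claim one slot; fails without blocking when none are left. */
static bool toy_take_slot(struct toy_netdev *dev)
{
    int cur = atomic_load(&dev->avail_slots);

    while (cur > 0) {
        /* On failure 'cur' is reloaded with the current value and we retry. */
        if (atomic_compare_exchange_weak(&dev->avail_slots, &cur, cur - 1))
            return true;
    }
    return false;
}

/* Return a slot once the transmit has completed. */
static void toy_release_slot(struct toy_netdev *dev)
{
    atomic_fetch_add(&dev->avail_slots, 1);
}
```

With an atomic counter, concurrent senders and completion handlers cannot
lose an update to the slot count, which is the failure mode of the unlocked
read-modify-write.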
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: Hank Janssen <hjanssen@microsoft.com>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Vasiliy Kulikov [Sun, 26 Jun 2011 08:56:22 +0000 (12:56 +0400)]
staging: comedi: fix infoleak to userspace
[ upstream commit
819cbb120eaec7e014e5abd029260db1ca8c5735 ]
driver_name and board_name are pointers to strings, not buffers of size
COMEDI_NAMELEN. Copying COMEDI_NAMELEN bytes of a string containing
less than COMEDI_NAMELEN-1 bytes would leak some unrelated bytes.
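One way to avoid this kind of leak, sketched in plain C under the assumption
that the destination is a fixed-size field copied to userspace as a whole.
TOY_NAMELEN, toy_devinfo and toy_fill_name() are illustrative names only, not
the comedi structures:

```c
#include <string.h>

#define TOY_NAMELEN 20  /* hypothetical fixed field size */

/* Hypothetical info structure that is later copied out in full. */
struct toy_devinfo {
    char driver_name[TOY_NAMELEN];
};

/*
 * Zero the whole structure first, then copy at most TOY_NAMELEN - 1
 * characters of the source string. Bytes past the string stay zero, so
 * nothing adjacent to the source string is ever copied out.
 */
static void toy_fill_name(struct toy_devinfo *info, const char *name)
{
    memset(info, 0, sizeof(*info));
    strncpy(info->driver_name, name, TOY_NAMELEN - 1);
}
```

The leaking pattern would be a memcpy() of TOY_NAMELEN bytes from a pointer
to a shorter string, which reads whatever follows the string in memory.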
Signed-off-by: Vasiliy Kulikov <segoon@openwall.com>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Larry Finger [Sun, 19 Jun 2011 03:34:34 +0000 (22:34 -0500)]
staging: r8192e_pci: Handle duplicate PCI ID 0x10ec:0x8192 conflict with rtl8192se
[ upstream commit
1c50bf7e415cf6ce9545dbecc2ac0d89d3916c53 ]
There are two devices with PCI ID 0x10ec:0x8192, namely RTL8192E and
RTL8192SE. The method of distinguishing them is by the revision ID
at offset 0x8 of the PCI configuration space. If the value is 0x10,
then the device uses rtl8192se for a driver.
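The probe-time decision can be sketched as a pure function, assuming the
caller has already read the vendor ID, device ID and revision ID from PCI
configuration space. The TOY_* constants and toy_r8192e_should_bind() are
hypothetical names that mirror the values quoted above:

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_VENDOR_REALTEK 0x10ec
#define TOY_DEVICE_8192    0x8192
#define TOY_REV_RTL8192SE  0x10   /* revision ID (config offset 0x8) of the SE part */

/* Return true if the r8192e driver should claim the device. */
static bool toy_r8192e_should_bind(uint16_t vendor, uint16_t device,
                                   uint8_t revision)
{
    if (vendor != TOY_VENDOR_REALTEK || device != TOY_DEVICE_8192)
        return false;

    /* Same PCI ID, but revision 0x10 belongs to the rtl8192se driver. */
    return revision != TOY_REV_RTL8192SE;
}
```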
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Cc: Stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Joerg Roedel [Wed, 6 Jul 2011 15:14:44 +0000 (17:14 +0200)]
iommu/amd: Don't use MSI address range for DMA addresses
[ upstream commit
17f5b569e09cfa3488eaa663cbf9feada2e789f5 ]
Reserve the MSI address range in the address allocator so
that MSI addresses are not handed out as dma handles.
Cc: stable@kernel.org
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Paul Parsons [Sun, 8 May 2011 01:54:33 +0000 (01:54 +0000)]
ARM: pxa: fix PGSR register address calculation
[ upstream commit
beb0c9b056b1c23d2029b46a425362e9ccbeba01 ]
The file mfp-pxa2xx.c defines a macro, PGSR(), which translates a gpio
bank number to a PGSR register address. The function pxa2xx_mfp_suspend()
erroneously passed in a gpio number instead of a gpio bank number.
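A small sketch of the difference, assuming 32 GPIOs per bank, one 32-bit
register per bank, and an illustrative base address. TOY_PGSR_BASE and the
helper names are assumptions, not the definitions in mfp-pxa2xx.c:

```c
#include <stdint.h>

#define TOY_PGSR_BASE      0x40F00020u  /* assumed address of the bank 0 register */
#define TOY_GPIOS_PER_BANK 32u          /* assumed bank width */

/* Translate a GPIO bank number to its register address. */
static uint32_t toy_pgsr_addr(unsigned int bank)
{
    return TOY_PGSR_BASE + bank * 4u;
}

/* Translate a GPIO number to the bank that holds it. */
static unsigned int toy_gpio_to_bank(unsigned int gpio)
{
    return gpio / TOY_GPIOS_PER_BANK;
}
```

Passing GPIO 33 straight to toy_pgsr_addr() would select "bank 33", far past
the real registers, whereas toy_pgsr_addr(toy_gpio_to_bank(33)) selects bank 1.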
Signed-off-by: Paul Parsons <lost.distance@yahoo.com>
Cc: stable@kernel.org
Signed-off-by: Eric Miao <eric.y.miao@gmail.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Steven Rostedt [Tue, 5 Jul 2011 18:32:51 +0000 (14:32 -0400)]
tracing: Have "enable" file use refcounts like the "filter" file
[ upstream commit
40ee4dffff061399eb9358e0c8fcfbaf8de4c8fe ]
The "enable" file for the event system can be removed when a module
is unloaded and the event system only has events from that module.
As the event system nr_events count goes to zero, it may be freed
if its ref_count is also set to zero.
Like the "filter" file, the "enable" file may be opened by a task and
referenced later, after a module has been unloaded and the events for
that event system have been removed.
Although the "filter" file referenced the event system structure,
the "enable" file only references a pointer to the event system
name. Since the name is freed when the event system is removed,
it is possible that an access to the "enable" file may reference
a freed pointer.
Update the "enable" file to use the subsystem_open() routine that
the "filter" file uses, to keep a reference to the event system
structure while the "enable" file is opened.
Cc: <stable@kernel.org>
Reported-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Steven Rostedt [Tue, 5 Jul 2011 15:36:06 +0000 (11:36 -0400)]
tracing: Fix bug when reading system filters on module removal
[ upstream commit
e9dbfae53eeb9fc3d4bb7da3df87fa9875f5da02 ]
The event system is freed when its nr_events is set to zero. This happens
when a module created an event system and then later the module is
removed. Modules may share systems, so the system is allocated when
it is created and freed when the modules are unloaded and all the
events under the system are removed (nr_events set to zero).
The problem arises when a task opened the "filter" file for the
system. If the module is unloaded and it removed the last event for
that system, the system structure is freed. If the task that opened
the filter file accesses the "filter" file after the system has
been freed, the system will access an invalid pointer.
By adding a ref_count, and using it to keep track of what
is using the event system, we can free it after all users
are finished with the event system.
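The lifetime rule can be illustrated with a single-threaded sketch, assuming
one reference is held on behalf of the registered events and one per open
file. toy_event_system and its helpers are hypothetical, and the real code
needs locking around the counter, which is omitted here:

```c
#include <stdbool.h>

/* Hypothetical event system; 'freed' stands in for the actual kfree(). */
struct toy_event_system {
    int ref_count;  /* users that still need the structure */
    int nr_events;  /* events currently registered under this system */
    bool freed;
};

/* Take a reference, e.g. when a task opens the "filter" file. */
static void toy_get_system(struct toy_event_system *sys)
{
    sys->ref_count++;
}

/* Drop a reference; the structure goes away only with the last user. */
static void toy_put_system(struct toy_event_system *sys)
{
    if (--sys->ref_count == 0)
        sys->freed = true;
}
```

A module unload that removes the last event drops only its own reference, so
a task that still has the file open keeps the structure alive until it closes
the file.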
Cc: <stable@kernel.org>
Reported-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Kuninori Morimoto [Thu, 7 Jul 2011 00:58:56 +0000 (17:58 -0700)]
ASoC: ak4642: fixup snd_soc_update_bits mask for PW_MGMT2
[ upstream commit
bd7fdbcaa2d06d446577fd3c9b81847b04469e01 ]
mask didn't cover update-data
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Acked-by: Liam Girdwood <lrg@ti.com>
Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
Cc: stable@kernel.org
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Johannes Berg [Thu, 7 Jul 2011 16:24:54 +0000 (18:24 +0200)]
mac80211: fix TKIP replay vulnerability
[ upstream commit
34459512ffa7236c849466e3bd604801389734e1 ]
Unlike CCMP, the presence or absence of the QoS
field doesn't change the encryption, only the
TID is used. When no QoS field is present, zero
is used as the TID value. This means that it is
possible for an attacker to take a QoS packet
with TID 0 and replay it as a non-QoS packet.
Unfortunately, mac80211 uses different IVs for
checking the validity of the packet's TKIP IV
when it checks TID 0 and when it checks non-QoS
packets. This means it is vulnerable to this
replay attack.
To fix this, use the same replay counter for
TID 0 and non-QoS packets by overriding the
rx->queue value to 0 if it is 16 (non-QoS).
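A toy model of the fix, assuming one replay counter per receive queue with
index 16 reserved for non-QoS frames. toy_key, toy_tkip_replay_queue() and
toy_accept() are illustrative only, and the IV is simplified to one integer:

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_NUM_RX_QUEUES 17  /* TIDs 0..15 plus one non-QoS queue */
#define TOY_NON_QOS_QUEUE 16

/* Hypothetical key state: last accepted IV per receive queue. */
struct toy_key {
    uint64_t last_iv[TOY_NUM_RX_QUEUES];
};

/* Non-QoS frames share the replay counter of TID 0. */
static int toy_tkip_replay_queue(int rx_queue)
{
    return rx_queue == TOY_NON_QOS_QUEUE ? 0 : rx_queue;
}

/* Accept a frame only if its IV is newer than the last one on that counter. */
static bool toy_accept(struct toy_key *key, int rx_queue, uint64_t iv)
{
    int q = toy_tkip_replay_queue(rx_queue);

    if (iv <= key->last_iv[q])
        return false;
    key->last_iv[q] = iv;
    return true;
}
```

Without the mapping, a frame first accepted on TID 0 would be checked against
the untouched counter of queue 16 when replayed as non-QoS, and accepted again.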
This is a minimal fix for now. I caused this
issue in
commit 1411f9b531f0a910cd1c85a337737c1e6ffbae6a
Author: Johannes Berg <johannes@sipsolutions.net>
Date: Thu Jul 10 10:11:02 2008 +0200
mac80211: fix RX sequence number check
while fixing a sequence number issue (there,
a separate counter needs to be used).
[AK: This was a non trivial backport. Johannes, John,
please double check]
Cc: stable@kernel.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Hans Verkuil [Sun, 12 Jun 2011 09:36:41 +0000 (06:36 -0300)]
v4l2-ioctl.c: prefill tuner type for g_frequency and g/s_tuner
[ upstream commit
227690df75382e46a4f6ea1bbc5df855a674b47f ]
The subdevs are supposed to receive a valid tuner type for the g_frequency
and g/s_tuner subdev ops. Some drivers do this, others don't. So prefill
this in v4l2-ioctl.c based on whether the device node from which this is
called is a radio node or not.
The spec does not require applications to fill in the type, and if they
leave it at 0 then the 'check_mode' call in tuner-core.c will return
an error and the ioctl does nothing.
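The prefill step can be sketched as follows, assuming a type field where 0
means "not set". The toy_* names and enum values are hypothetical stand-ins
for the V4L2 structures, and toy_check_mode() only imitates the rejection
described above:

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical tuner types; 0 is what an application leaves behind by default. */
enum toy_tuner_type {
    TOY_TUNER_UNSET = 0,
    TOY_TUNER_RADIO = 1,
    TOY_TUNER_ANALOG_TV = 2,
};

struct toy_tuner {
    enum toy_tuner_type type;
};

/* Core-side prefill: derive the type from the kind of device node in use. */
static void toy_prefill_tuner_type(struct toy_tuner *t, bool is_radio_node)
{
    t->type = is_radio_node ? TOY_TUNER_RADIO : TOY_TUNER_ANALOG_TV;
}

/* Subdev-side check: an unset type is rejected. */
static int toy_check_mode(const struct toy_tuner *t)
{
    return t->type == TOY_TUNER_UNSET ? -EINVAL : 0;
}
```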
Cc: stable@kernel.org
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Hans Verkuil [Sun, 12 Jun 2011 09:39:52 +0000 (06:39 -0300)]
pvrusb2: fix g/s_tuner support
[ upstream commit
50e9efd60b213ce43ad6979bfc18e25eec2d8413 ]
The tuner-core subdev requires that the type field of v4l2_tuner is
filled in correctly. This is done in v4l2-ioctl.c, but pvrusb2 doesn't
use that yet, so we have to do it manually based on whether the current
input is radio or not.
Tested with my pvrusb2.
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Acked-by: Mike Isely <isely@pobox.com>
Cc: stable@kernel.org
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Hans Verkuil [Sun, 12 Jun 2011 10:02:43 +0000 (07:02 -0300)]
bttv: fix s_tuner for radio
[ upstream commit
a024c1a6b274e11596d124619e43c25560f64c01 ]
Fix typo: g_tuner should have been s_tuner.
Tested with a bttv card.
Cc: stable@kernel.org
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Trond Myklebust [Wed, 6 Jul 2011 23:58:23 +0000 (19:58 -0400)]
SUNRPC: Fix a race between work-queue and rpc_killall_tasks
[ upstream commit
b55c59892e1f3b6c7d4b9ccffb4263e1486fb990 ]
Since rpc_killall_tasks may modify the rpc_task's tk_action field
without any locking, we need to be careful when dereferencing it.
Reported-by: Ben Greear <greearb@candelatech.com>
Tested-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@kernel.org
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Ajay Kumar Gupta [Fri, 8 Jul 2011 09:36:13 +0000 (15:06 +0530)]
usb: musb: restore INDEX register in resume path
[ upstream commit
3c5fec75e121b21a2eb35e5a6b44291509abba6f ]
Restoring the missing INDEX register value in musb_restore_context().
Without this, suspend/resume functionality is broken with offmode
enabled.
Cc: stable@kernel.org
Acked-by: Anand Gadiyar <gadiyar@ti.com>
Signed-off-by: Ajay Kumar Gupta <ajay.gupta@ti.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Rajkumar Manoharan [Thu, 7 Jul 2011 18:03:39 +0000 (23:33 +0530)]
mac80211: Restart STA timers only on associated state
[ upstream commit
676b58c27475a9defccc025fea1cbd2b141ee539 ]
A panic was observed when the device failed to resume properly
and there were no running interfaces: ieee80211_reconfig tries
to restart STA timers in the unassociated state.
Cc: stable@kernel.org
Signed-off-by: Rajkumar Manoharan <rmanohar@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Sergei Shtylyov [Wed, 6 Jul 2011 19:19:38 +0000 (23:19 +0400)]
EHCI: only power off port if over-current is active
[ upstream commit
81463c1d707186adbbe534016cd1249edeab0dac ]
The MAX4967 USB power supply chip we use on our boards signals over-current when
power is not enabled; once it's enabled, the over-current signal returns to normal.
That unfortunately caused an endless stream of "over-current change on port"
messages. The EHCI root hub code reacts to every over-current signal change
by powering off the port -- such a change event is generated the moment the
port power is enabled, so once enabled the power is immediately cut off.
I think we should only cut off power when we're seeing the active over-current
signal, so I'm adding such check to that code. I also think that the fact that
we've cut off the port power should be reflected in the result of GetPortStatus
request immediately, hence I'm adding a PORTSCn register readback after write...
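The added check can be sketched on a plain register value. The TOY_PORT_*
bit positions below follow the usual EHCI PORTSC layout but should be treated
as assumptions here, and toy_handle_oc_change() is a hypothetical helper, not
the root hub code itself:

```c
#include <stdint.h>

#define TOY_PORT_OC    (1u << 4)   /* over-current currently active */
#define TOY_PORT_OCC   (1u << 5)   /* over-current state changed */
#define TOY_PORT_POWER (1u << 12)  /* port power enabled */

/*
 * React to an over-current change event: cut port power only if the
 * over-current condition is active right now, not on every change.
 */
static uint32_t toy_handle_oc_change(uint32_t portsc)
{
    if ((portsc & TOY_PORT_OCC) && (portsc & TOY_PORT_OC))
        portsc &= ~TOY_PORT_POWER;
    return portsc;
}
```

A change event that reports the signal going back to normal, as produced by
the chip described above when power is switched on, leaves the port powered.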
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Cc: stable@kernel.org
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Bob Liu [Fri, 8 Jul 2011 22:39:46 +0000 (15:39 -0700)]
mm/nommu.c: fix remap_pfn_range()
[ upstream commit
8f3b1327aa454bc8283e96bca7669c3c88b83f79 ]
remap_pfn_range() is meant to map the physical address pfn << PAGE_SHIFT to a
user address. On nommu architectures it is implemented as
vma->vm_start = pfn << PAGE_SHIFT, which is wrong according to the original
meaning of this function. Driver developers who call remap_pfn_range() with
correct parameters get unexpected results because vm_start is changed. It
should be implemented as addr = pfn << PAGE_SHIFT, but that is meaningless on
nommu architectures, so this patch makes it simply return.
The parameter name and the setting of vma->vm_flags are also fixed.
Signed-off-by: Bob Liu <lliubbo@gmail.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: David Howells <dhowells@redhat.com>
Acked-by: Greg Ungerer <gerg@uclinux.org>
Cc: Mike Frysinger <vapier@gentoo.org>
Cc: Bob Liu <lliubbo@gmail.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Stefan Richter [Sat, 9 Jul 2011 22:23:03 +0000 (00:23 +0200)]
firewire: ohci: do not bind to Pinnacle cards, avert panic
[ upstream commit
7f7e37115a8b6724f26d0637a04e1d35e3c59717 ]
When firewire-ohci is bound to a Pinnacle MovieBoard, eventually a
"Register access failure" is logged and an interrupt storm or a kernel
panic happens. https://bugzilla.kernel.org/show_bug.cgi?id=36622
Until this is sorted out (if that is going to succeed at all), let's
just prevent firewire-ohci from touching these devices.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Cc: <stable@kernel.org>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Igor Grinberg [Mon, 9 May 2011 11:41:46 +0000 (14:41 +0300)]
ARM: pxa/cm-x300: fix V3020 RTC functionality
[ upstream commit
6c7b3ea52e345ab614edb91d3f0e9f3bb3713871 ]
While in sleep mode the CS# and other V3020 RTC GPIOs must be driven
high, otherwise V3020 RTC fails to keep the right time in sleep mode.
Signed-off-by: Igor Grinberg <grinberg@compulab.co.il>
Cc: stable@kernel.org
Signed-off-by: Eric Miao <eric.y.miao@gmail.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Ben Greear [Tue, 12 Jul 2011 17:27:55 +0000 (10:27 -0700)]
SUNRPC: Fix use of static variable in rpcb_getport_async
[ upstream commit
ec0dd267bf7d08cb30e321e45a75fd40edd7e528 ]
Because struct rpcbind_args *map was declared static, if two
threads entered this method at the same time, the values
assigned to map could be sent to two different tasks.
This could cause all sorts of problems, including use-after-free
and double-free of memory.
Fix this by removing the static declaration so that the map
pointer is on the stack.
Signed-off-by: Ben Greear <greearb@candelatech.com>
Cc: stable@kernel.org
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Len Brown [Thu, 14 Jul 2011 04:53:24 +0000 (00:53 -0400)]
x86, intel, power: Initialize MSR_IA32_ENERGY_PERF_BIAS
[ upstream commit
abe48b108247e9b90b4c6739662a2e5c765ed114 ]
Since 2.6.36 (23016bf0d25), Linux prints the existence of "epb" in /proc/cpuinfo.
Since 2.6.38 (d5532ee7b40), the x86_energy_perf_policy(8) utility has
been available in-tree to update MSR_IA32_ENERGY_PERF_BIAS.
However, the typical BIOS fails to initialize the MSR, presumably
because this is handled by high-volume shrink-wrap operating systems...
Linux distros, on the other hand, do not yet invoke x86_energy_perf_policy(8).
As a result, WSM-EP, SNB, and later hardware from Intel will run in its
default hardware power-on state (performance), which assumes that users
care for performance at all costs and not for energy efficiency.
While that is fine for performance benchmarks, the hardware's intended default
operating point is "normal" mode...
Initialize the MSR to "normal" by default during kernel boot.
x86_energy_perf_policy(8) is available to change the default after boot,
should the user have a different preference.
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1107140051020.18606@x980
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: <stable@kernel.org>
Venkatesh Pallipadi [Fri, 4 Jun 2010 03:22:28 +0000 (23:22 -0400)]
x86: Look for IA32_ENERGY_PERF_BIAS support
[ upstream commit
23016bf0 ]
The new IA32_ENERGY_PERF_BIAS MSR allows system software to give
hardware a hint whether OS policy favors more power saving,
or more performance. This allows the OS to have some influence
on internal hardware power/performance tradeoffs where the OS
has previously had no influence.
The support for this feature is indicated by CPUID.06H.ECX.bit3,
as documented in the Intel Architectures Software Developer's Manual.
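The bit test itself is tiny. A sketch, assuming the caller has already
obtained the ECX output of CPUID leaf 06H (on x86 with GCC or Clang this
could come from __get_cpuid() in <cpuid.h>); toy_cpu_has_epb() is a
hypothetical helper name:

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 3 of CPUID.06H:ECX advertises IA32_ENERGY_PERF_BIAS support. */
#define TOY_CPUID6_ECX_EPB (1u << 3)

/* Return true if the given CPUID.06H:ECX value reports the feature. */
static bool toy_cpu_has_epb(uint32_t cpuid6_ecx)
{
    return (cpuid6_ecx & TOY_CPUID6_ECX_EPB) != 0;
}
```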
This patch discovers support of this feature and displays it
as "epb" in /proc/cpuinfo.
Signed-off-by: Venkatesh Pallipadi <venki@google.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
LKML-Reference: <alpine.LFD.2.00.1006032310160.6669@localhost.localdomain>
Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
J. Bruce Fields [Wed, 29 Jun 2011 20:49:04 +0000 (16:49 -0400)]
svcrpc: fix list-corrupting race on nfsd shutdown
[ upstream commit
ebc63e531cc6a457595dd110b07ac530eae788c3 ]
After commit 3262c816a3d7fb1eaabce633caa317887ed549ae "[PATCH] knfsd:
split svc_serv into pools", svc_delete_xprt (then svc_delete_socket) no
longer removed its xpt_ready (then sk_ready) field from whatever list it
was on, noting that there was no point since the whole list was about to
be destroyed anyway.
That was mostly true, but forgot that a few svc_xprt_enqueue()'s might
still be hanging around playing with the about-to-be-destroyed list, and
could get themselves into trouble writing to freed memory if we left
this xprt on the list after freeing it.
(This is actually functionally identical to a patch made first by Ben
Greear, but with more comments.)
Cc: stable@kernel.org
Cc: gnb@fmeh.org
Reported-by: Ben Greear <greearb@candelatech.com>
Tested-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Stefan Richter [Sat, 9 Jul 2011 14:42:26 +0000 (16:42 +0200)]
firewire: cdev: return -ENOTTY for unimplemented ioctls, not -EINVAL
[ upstream commit
d873d794235efa590ab3c94d5ee22bb1fab19ac4 ]
On Jun 27 Linus Torvalds wrote:
> The correct error code for "I don't understand this ioctl" is ENOTTY.
> The naming may be odd, but you should think of that error value as a
> "unrecognized ioctl number, you're feeding me random numbers that I
> don't understand and I assume for historical reasons that you tried to
> do some tty operation on me".
[...]
> The EINVAL thing goes way back, and is a disaster. It predates Linux
> itself, as far as I can tell. You'll find lots of man-pages that have
> this line in it:
>
> EINVAL Request or argp is not valid.
>
> and it shows up in POSIX etc. And sadly, it generally shows up
> _before_ the line that says
>
> ENOTTY The specified request does not apply to the kind of object
> that the descriptor d references.
>
> so a lot of people get to the EINVAL, and never even notice the ENOTTY.
[...]
> At least glibc (and hopefully other C libraries) use a _string_ that
> makes much more sense: strerror(ENOTTY) is "Inappropriate ioctl for
> device"
So let's correct this in the <linux/firewire-cdev.h> ABI while it is
still young, relative to distributor adoption.
Side note: We return -ENOTTY not only on _IOC_TYPE or _IOC_NR mismatch,
but also on _IOC_SIZE mismatch. An ioctl with an unsupported size of
argument structure can be seen as an unsupported version of that ioctl.
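A sketch of a dispatcher that follows this rule, assuming each supported
request is described by its type, number and argument size. The table
contents and the toy_* names are made up for illustration and do not reflect
the real firewire-cdev ioctl list:

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical description of one supported ioctl request. */
struct toy_ioctl_desc {
    unsigned int type;  /* "magic" type character */
    unsigned int nr;    /* request number within the type */
    size_t size;        /* expected size of the argument structure */
};

static const struct toy_ioctl_desc toy_ioctl_table[] = {
    { '#', 0, 16 },
    { '#', 1, 8 },
};

/* Return 0 for a known request, -ENOTTY for anything we do not implement. */
static int toy_dispatch(unsigned int type, unsigned int nr, size_t size)
{
    size_t n = sizeof(toy_ioctl_table) / sizeof(toy_ioctl_table[0]);

    for (size_t i = 0; i < n; i++) {
        const struct toy_ioctl_desc *d = &toy_ioctl_table[i];

        if (d->type == type && d->nr == nr && d->size == size)
            return 0;
    }
    /* Type, number or size mismatch: not an ioctl we understand. */
    return -ENOTTY;
}
```

-EINVAL then remains available for requests that are recognized but carry
invalid argument values.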
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Cc: <stable@kernel.org>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Stefan Richter [Sat, 9 Jul 2011 14:43:22 +0000 (16:43 +0200)]
firewire: cdev: prevent race between first get_info ioctl and bus reset event queuing
[ upstream commit
93b37905f70083d6143f5f4dba0a45cc64379a62 ]
Between open(2) of a /dev/fw* and the first FW_CDEV_IOC_GET_INFO
ioctl(2) on it, the kernel already queues FW_CDEV_EVENT_BUS_RESET events
to be read(2) by the client. The get_info ioctl is practically always
issued right away after open, hence this condition only occurs if the
client opens during a bus reset, especially during a rapid series of bus
resets.
The problem with this condition is twofold:
- These bus reset events carry the (as yet undocumented) @closure
value of 0. But it is not the kernel's place to choose closures;
they are private to the client. E.g., this 0 value forced from the
kernel makes it unsafe for clients to dereference it as a pointer to
a closure object without NULL pointer check.
- It is impossible for clients to determine the relative order of bus
reset events from get_info ioctl(2) versus those from read(2),
except in one way: By comparison of closure values. Again, such a
procedure imposes complexity on clients and reduces freedom in use
of the bus reset closure.
So, change the ABI to suppress queuing of bus reset events before the
first FW_CDEV_IOC_GET_INFO ioctl was issued by the client.
Note, this ABI change cannot be version-controlled. The kernel cannot
distinguish old from new clients before the first FW_CDEV_IOC_GET_INFO
ioctl.
We will try to back-merge this change into currently maintained stable/
longterm series, and we only document the new behaviour. The old
behavior is now considered a kernel bug, which it basically is.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Cc: <stable@kernel.org>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Alan Stern [Fri, 15 Jul 2011 21:22:15 +0000 (17:22 -0400)]
USB: OHCI: fix another regression for NVIDIA controllers
[ upstream commit
6ea12a04d295235ed67010a09fdea58c949e3eb0 ]
The NVIDIA series of OHCI controllers continues to be troublesome. A
few people using the MCP67 chipset have reported that even with the
most recent kernels, the OHCI controller fails to handle new
connections and spams the system log with "unable to enumerate USB
port" messages. This is different from the other problems previously
reported for NVIDIA OHCI controllers, although it is probably related.
It turns out that the MCP67 controller does not like to be kept in the
RESET state very long. After only a few seconds, it decides not to
work any more. This patch (as1479) changes the PCI initialization
quirk code so that NVIDIA controllers are switched into the SUSPEND
state after 50 ms of RESET. With no interrupts enabled and all the
downstream devices reset, and thus unable to send wakeup requests,
this should be perfectly safe (even for non-NVIDIA hardware).
The removal code in ohci-hcd hasn't been changed; it will still leave
the controller in the RESET state. As a result, if someone unloads
ohci-hcd and then reloads it, the controller won't work again until
the system is rebooted. If anybody complains about this, the removal
code can be updated similarly.
This fixes Bugzilla #22052.
Tested-by: Larry Finger <Larry.Finger@lwfinger.net>
Cc: stable <stable@kernel.org>
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Luca Tettamanti [Sun, 17 Jul 2011 16:39:18 +0000 (18:39 +0200)]
hwmon: (asus_atk0110) Fix memory leak
[ upstream commit
0b8e77f12cb6bfe2e5a67f2cdc8c7af23abc4ccf ]
The object returned by atk_gitm is dynamically allocated and must be
freed.
Signed-off-by: Luca Tettamanti <kronos.it@gmail.com>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: stable@kernel.org
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Pavel Herrmann [Sun, 17 Jul 2011 16:39:19 +0000 (18:39 +0200)]
hwmon: (max1111) Fix race condition causing NULL pointer exception
[ upstream commit
d3f684f2820a7f42acef68bea6622d9032127fb2 ]
The spi_sync call uses its spi_message parameter to keep completion information,
so using a drvdata structure is not thread-safe. Use a mutex to prevent
concurrent access to shared driver data.
Signed-off-by: Pavel Herrmann <morpheus.ibis@gmail.com>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Pavel Machek <pavel@ucw.cz>
Acked-by: Marek Vasut <marek.vasut@gmail.com>
Acked-by: Cyril Hrubis <metan@ucw.cz>
Tested-by: Stanislav Brabec <utx@penguin.cz>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: stable@kernel.org
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Mauro Carvalho Chehab [Sun, 17 Jul 2011 03:24:37 +0000 (00:24 -0300)]
si4713-i2c: avoid potential buffer overflow on si4713
[ upstream commit
dc6b845044ccb7e9e6f3b7e71bd179b3cf0223b6 ]
While compiling it with Fedora 15, I noticed this issue:
inlined from ‘si4713_write_econtrol_string’ at drivers/media/radio/si4713-i2c.c:1065:24:
arch/x86/include/asm/uaccess_32.h:211:26: error: call to ‘copy_from_user_overflow’ declared with attribute error: copy_from_user() buffer size is not provably correct
Cc: stable@kernel.org
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Acked-by: Sakari Ailus <sakari.ailus@maxwell.research.nokia.com>
Acked-by: Eduardo Valentin <edubezval@gmail.com>
Reviewed-by: Eugene Teo <eugeneteo@kernel.sg>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Michael Neuling [Mon, 4 Jul 2011 20:40:10 +0000 (20:40 +0000)]
powerpc/kdump: Fix timeout in crash_kexec_wait_realmode
[ upstream commit
63f21a56f1cc0b800a4c00349c59448f82473d19 ]
The existing code is pretty ugly. How about we clean it up even more
like this?
From: Anton Blanchard <anton@samba.org>
We check for timeout expiry in the outer loop, but we also need to
check it in the inner loop or we can lock up forever waiting for a
CPU to hit real mode.
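A minimal sketch of the nested-loop timeout check, assuming a simple poll
budget instead of a real delay. The in_real_mode[] array, NCPUS and
wait_realmode() are hypothetical and only model the idea; they are not the
code from crash_kexec_wait_realmode().

```c
#include <assert.h>
#include <stdbool.h>

#define NCPUS 4

/* Hypothetical flag array: true once a CPU has reached real mode. */
static bool in_real_mode[NCPUS];

/*
 * Waits for every CPU to reach real mode, spending at most `budget` polls
 * in total. Returns the number of polls left; 0 means the wait timed out.
 */
static int wait_realmode(int budget)
{
    assert(budget >= 0);
    for (int cpu = 0; cpu < NCPUS && budget > 0; cpu++) {
        /* The budget is checked here too, or a stuck CPU would spin forever. */
        while (!in_real_mode[cpu] && budget > 0)
            budget--;   /* stand-in for a short delay between polls */
    }
    return budget;
}
```

If the inner loop tested only in_real_mode[cpu], a single CPU that never
arrives would keep the caller in that loop regardless of the outer check.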
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Cc: <stable@kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Hendrik Brueckner [Tue, 5 Jul 2011 21:50:18 +0000 (21:50 +0000)]
hvc_console: Improve tty/console put_chars handling
[ upstream commit
8c2381af0d3ef62a681dac5a141b6dabb27bf2e1 ]
Currently, the hvc_console_print() function drops console output if the
hvc backend's put_chars() returns 0. This patch changes this behavior
to allow a retry through returning -EAGAIN.
This change also affects the hvc_push() function. Both functions are
changed to handle -EAGAIN and to retry the put_chars() operation.
If a hvc backend returns -EAGAIN, the retry handling differs:
- hvc_console_print() spins to write the complete console output.
- hvc_push() behaves the same way as for returning 0.
Now hvc backends can indirectly control how console output is
handled through the hvc console layer.
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Acked-by: Anton Blanchard <anton@samba.org>
Cc: <stable@kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Anton Blanchard [Tue, 5 Jul 2011 21:51:36 +0000 (21:51 +0000)]
powerpc/pseries/hvconsole: Fix dropped console output
[ upstream commit
51d33021425e1f905beb4208823146f2fb6517da ]
Return -EAGAIN when we get H_BUSY back from the hypervisor. This
makes the hvc console driver retry, avoiding dropped printks.
Signed-off-by: Anton Blanchard <anton@samba.org>
Cc: <stable@kernel.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Jon Povey [Tue, 19 Jul 2011 03:30:11 +0000 (12:30 +0900)]
davinci: DM365 EVM: fix video input mux bits
[ upstream commit
9daedd833a38edd90cf7baa1b1fcf61c3a0721e3 ]
Video input mux settings for tvp7002 and imager inputs were swapped.
Comment was correct.
Tested on EVM with tvp7002 input.
Signed-off-by: Jon Povey <jon.povey@racelogic.co.uk>
Acked-by: Manjunath Hadli <manjunath.hadli@ti.com>
Cc: stable@kernel.org
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Wolfgang Denk [Tue, 19 Jul 2011 09:25:38 +0000 (11:25 +0200)]
USB: serial: add IDs for WinChipHead USB->RS232 adapter
[ upstream commit
026dfaf18973404a01f488d6aa556a8c466e06a4 ]
Add ID 4348:5523 for WinChipHead USB->RS232 adapter with
Prolific PL2303 chipset.
Signed-off-by: Wolfgang Denk <wd@denx.de>
Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Huang Ying [Thu, 14 Jul 2011 01:34:37 +0000 (09:34 +0800)]
kexec, x86: Fix incorrect jump back address if not preserving context
[ upstream commit
050438ed5a05b25cdf287f5691e56a58c2606997 ]
In kexec jump support, the jump back address is passed to the kexeced
kernel via the function calling ABI; that is, the function call
return address is the jump back entry.
Furthermore, jump back entry == 0 should be used to signal that
the jump back or preserve context is not enabled in the original
kernel.
But in the current implementation the stack position used for the
function call return address is not cleared when context
preservation is disabled. The patch fixes this bug.
Reported-and-tested-by: Yin Kangkai <kangkai.yin@intel.com>
Signed-off-by: Huang Ying <ying.huang@intel.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1310607277-25029-1-git-send-email-ying.huang@intel.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Andi Kleen [Mon, 1 Aug 2011 20:55:00 +0000 (13:55 -0700)]
Drop -Werror in perf
Drop -Werror in perf Makefile. This fixes the build with gcc 4.5.x
Not in mainline.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Manoj Iyer [Mon, 11 Jul 2011 21:28:35 +0000 (16:28 -0500)]
mmc: Added quirks for Ricoh 1180:e823 lower base clock frequency
[ upstream commit
15bed0f2fa8e1d7db201692532c210a7823d2d21 ]
Ricoh 1180:e823 does not recognize certain types of SD/MMC cards,
as reported at http://launchpad.net/bugs/773524. Lowering the SD
base clock frequency from 200MHz to 50MHz fixes this issue. This
solution was suggested by Koji Matsumuro, Ricoh Company, Ltd.
This change has no negative performance effect on standard SD
cards, though it's quite possible that there will be one on
UHS-1 cards.
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Tested-by: Daniel Manrique <daniel.manrique@canonical.com>
Cc: Koji Matsumuro <matsumur@nts.ricoh.co.jp>
Cc: <stable@kernel.org>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Manoj Iyer [Thu, 26 May 2011 16:19:05 +0000 (11:19 -0500)]
mmc: Add PCI fixup quirks for Ricoh 1180:e823 reader
[ upstream commit
be98ca652faa6468916a9b7608befff215a8ca70 ]
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Cc: <stable@kernel.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
James Bottomley [Thu, 7 Jul 2011 20:45:40 +0000 (15:45 -0500)]
fix crash in scsi_dispatch_cmd()
[ upstream commit
bfe159a51203c15d23cb3158fffdc25ec4b4dda1 ]
USB surprise removal of sr is triggering an oops in
scsi_dispatch_cmd(). What seems to be happening is that USB is
hanging on to a queue reference until the last close of the upper
device, so the crash is caused by surprise remove of a mounted CD
followed by attempted unmount.
The problem is that USB doesn't issue its final commands as part of
the SCSI teardown path, but on last close when the block queue is long
gone. The long term fix is probably to make sr do the teardown in the
same way as sd (so remove all the lower bits on ejection, but keep the
upper disk alive until last close of user space). However, the
current oops can be simply fixed by not allowing any commands to be
sent to a dead queue.
Cc: stable@kernel.org
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Guo-Fu Tseng [Wed, 20 Jul 2011 16:57:36 +0000 (16:57 +0000)]
jme: Fix unmap error (Causing system freeze)
[ upstream commit
94c5b41b327e08de0ddf563237855f55080652a1 ]
This patch adds the missing dma_unmap(), which solves the critical
issue of a system freeze under heavy load.
Michal Miroslaw's rejected patch:
[PATCH v2 10/46] net: jme: convert to generic DMA API
also pointed out the issue, thank you Michal.
But that fix was incorrect: it would unmap a needed address
when memory is low.
Got lots of feedback from end users and Gentoo Bugzilla.
https://bugs.gentoo.org/show_bug.cgi?id=373109
Thank you all. :)
Cc: stable@kernel.org
Signed-off-by: Guo-Fu Tseng <cooldavid@cooldavid.org>
Acked-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Manfred Spraul [Tue, 26 Jul 2011 00:11:47 +0000 (17:11 -0700)]
ipc/sem.c: fix race with concurrent semtimedop() timeouts and IPC_RMID
[ upstream commit
d694ad62bf539dbb20a0899ac2a954555f9e4a83 ]
If a semaphore array is removed and in parallel a sleeping task is woken
up (signal or timeout, does not matter), then the woken up task does not
wait until wake_up_sem_queue_do() is completed. This will cause crashes,
because wake_up_sem_queue_do() will read from a stale pointer.
The fix is simple: Regardless of anything, always call get_queue_result().
This function waits until wake_up_sem_queue_do() has finished its task.
Addresses https://bugzilla.kernel.org/show_bug.cgi?id=27142
Reported-by: Yuriy Yevtukhov <yuriy@ucoz.com>
Reported-by: Harald Laabs <kernel@dasr.de>
Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: <stable@kernel.org> [2.6.35+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Dan Rosenberg [Tue, 26 Jul 2011 00:11:53 +0000 (17:11 -0700)]
xtensa: prevent arbitrary read in ptrace
[ upstream commit
0d0138ebe24b94065580bd2601f8bb7eb6152f56 ]
Prevent an arbitrary kernel read. Check the user pointer with access_ok()
before copying data in.
[akpm@linux-foundation.org: s/EIO/EFAULT/]
Signed-off-by: Dan Rosenberg <drosenberg@vsecurity.com>
Cc: Christian Zankel <chris@zankel.net>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Peter Zijlstra [Tue, 26 Jul 2011 00:11:57 +0000 (17:11 -0700)]
mm/backing-dev.c: reset bdi min_ratio in bdi_unregister()
[ upstream commit
ccb6108f5b0b541d3eb332c3a73e645c0f84278e ]
Vito said:
: The system has many usb disks coming and going day to day, with their
: respective bdi's having min_ratio set to 1 when inserted. It works for
: some time until eventually min_ratio can no longer be set, even when the
: active set of bdi's seen in /sys/class/bdi/*/min_ratio doesn't add up to
: anywhere near 100.
:
: This then leads to an unrelated starvation problem caused by write-heavy
: fuse mounts being used atop the usb disks, a problem the min_ratio setting
: at the underlying devices bdi effectively prevents.
Fix this leakage by resetting the bdi min_ratio when unregistering the
BDI.
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Reported-by: Vito Caputo <lkml@pengaru.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Benjamin Herrenschmidt [Tue, 26 Jul 2011 00:12:32 +0000 (17:12 -0700)]
mm/futex: fix futex writes on archs with SW tracking of dirty & young
[ upstream commit
2efaca927f5cd7ecd0f1554b8f9b6a9a2c329c03 ]
I haven't reproduced it myself but the fail scenario is that on such
machines (notably ARM and some embedded powerpc), if you manage to hit
that futex path on a writable page whose dirty bit has gone from the PTE,
you'll livelock inside the kernel from what I can tell.
It will go in a loop of trying the atomic access, failing, trying gup to
"fix it up", getting success from gup, going back to the atomic access,
failing again because dirty wasn't fixed etc...
So I think you essentially hang in the kernel.
The scenario is probably rare'ish because affected architectures are
embedded and tend to not swap much (if at all) so we probably rarely hit
the case where dirty is missing or young is missing, but I think Shan has
a piece of SW that can reliably reproduce it using a shared writable
mapping & fork or something like that.
On archs that use SW tracking of dirty & young, a page without dirty is
effectively mapped read-only and a page without young is inaccessible in
the PTE.
Additionally, some architectures might lazily flush the TLB when relaxing
write protection (by doing only a local flush), and expect a fault to
invalidate the stale entry if it's still present on another processor.
The futex code assumes that if the "in_atomic()" access -EFAULT's, it can
"fix it up" by calling get_user_pages() which would then be equivalent to
taking the fault.
However that isn't the case. get_user_pages() will not call
handle_mm_fault() in the case where the PTE seems to have the right
permissions, regardless of the dirty and young state. It will eventually
update those bits ... in the struct page, but not in the PTE.
Additionally, it will not handle the lazy TLB flushing that can be
required by some architectures in the fault case.
Basically, gup is the wrong interface for the job. The patch provides a
more appropriate one which boils down to just calling handle_mm_fault()
since what we are trying to do is simulate a real page fault.
The futex code currently attempts to write to user memory within a
pagefault disabled section, and if that fails, tries to fix it up using
get_user_pages().
This doesn't work on archs where the dirty and young bits are maintained
by software, since they will gate access permission in the TLB, and will
not be updated by gup().
In addition, there's an expectation on some archs that a spurious write
fault triggers a local TLB flush, and that is missing from the picture as
well.
I decided that adding those "features" to gup() would be too much for this
already too complex function, and instead added a new simpler
fixup_user_fault() which is essentially a wrapper around handle_mm_fault()
which the futex code can call.
[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: fix some nits Darren saw, fiddle comment layout]
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reported-by: Shan Hai <haishan.bai@gmail.com>
Tested-by: Shan Hai <haishan.bai@gmail.com>
Cc: David Laight <David.Laight@ACULAB.COM>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Darren Hart <darren.hart@intel.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
H. Peter Anvin [Thu, 21 Jul 2011 18:22:21 +0000 (11:22 -0700)]
x86: Make Dell Latitude E6420 use reboot=pci
[ upstream commit
a536877e77f73ea22d12d94a019fedd9671b6acd ]
Yet another variant of the Dell Latitude series which requires
reboot=pci.
From the E5420 bug report by Daniel J Blueman:
> The E6420 is affected also (same platform, different casing and
> features), which provides an external confirmation of the issue; I can
> submit a patch for that later or include it if you prefer:
> http://linux.koolsolutions.com/2009/08/04/howto-fix-linux-hangfreeze-during-reboots-and-restarts/
Reported-by: Daniel J Blueman <daniel.blueman@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: <stable@kernel.org>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Daniel J Blueman [Fri, 13 May 2011 01:04:59 +0000 (09:04 +0800)]
x86: Make Dell Latitude E5420 use reboot=pci
[ upstream commit
b7798d28ec15d20fd34b70fa57eb13f0cf6d1ecd ]
Rebooting on the Dell E5420 often hangs with the keyboard or ACPI
methods, but is reliable via the PCI method.
[ hpa: this was deferred because we believed for a long time that the
recent reshuffling of the boot priorities in commit
660e34cebf0a11d54f2d5dd8838607452355f321 fixed this platform.
Unfortunately that turned out to be incorrect. ]
Signed-off-by: Daniel J Blueman <daniel.blueman@gmail.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1305248699-2347-1-git-send-email-daniel.blueman@gmail.com
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Cc: <stable@kernel.org>
Dan Rosenberg [Wed, 15 Jun 2011 22:09:01 +0000 (15:09 -0700)]
alpha: fix several security issues
[ upstream commit
21c5977a836e399fc710ff2c5367845ed5c2527f ]
Fix several security issues in Alpha-specific syscalls. Untested, but
mostly trivial.
1. Signedness issue in osf_getdomainname allows copying out-of-bounds
kernel memory to userland.
2. Signedness issue in osf_sysinfo allows copying large amounts of
kernel memory to userland.
3. Typo (?) in osf_getsysinfo bounds minimum instead of maximum copy
size, allowing copying large amounts of kernel memory to userland.
4. Usage of user pointer in osf_wait4 while under KERNEL_DS allows
privilege escalation via writing return value of sys_wait4 to kernel
memory.
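The signedness problem behind items 1 and 2 follows a common pattern, sketched
here in userspace C. NAME_MAX_LEN and the helper names are hypothetical; the
sketch does not reproduce the Alpha syscall code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NAME_MAX_LEN 32

/* Buggy pattern: a negative len passes a signed upper-bound check. */
static int len_ok_buggy(int len)
{
    return len <= NAME_MAX_LEN;
}

/* Fixed pattern: reject negative values before the length is used. */
static int len_ok_fixed(int len)
{
    return len >= 0 && len <= NAME_MAX_LEN;
}

/* Copies len bytes of src into dst; returns bytes copied or -1 if rejected. */
static long bounded_copy(char *dst, const char *src, int len)
{
    assert(dst != NULL && src != NULL);
    if (!len_ok_fixed(len))
        return -1;
    memcpy(dst, src, (size_t)len);
    return len;
}
```

With the buggy check, a length of -1 is accepted and then converted to a huge
unsigned size by the copy routine.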
Said to fix CVE-2011-2208 to CVE-2011-2211
Signed-off-by: Dan Rosenberg <drosenberg@vsecurity.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Vasiliy Kulikov [Fri, 24 Jun 2011 12:08:38 +0000 (16:08 +0400)]
proc: restrict access to /proc/PID/io
[ upstream commit
1d1221f375c94ef961ba8574ac4f85c8870ddd51 ]
/proc/PID/io may be used for gathering private information. E.g. for
openssh and vsftpd daemons wchars/rchars may be used to learn the
precise password length. Restrict it to processes being able to ptrace
the target process.
ptrace_may_access() is needed to prevent keeping open file descriptor of
"io" file, executing setuid binary and gathering io information of the
setuid'ed process.
Said to be CVE-2011-2495
Signed-off-by: Vasiliy Kulikov <segoon@openwall.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Miklos Szeredi [Wed, 23 Feb 2011 12:49:47 +0000 (13:49 +0100)]
mm: prevent concurrent unmap_mapping_range() on the same inode
commit
2aa15890f3c191326678f1bd68af61ec6b8753ec upstream.
Michael Leun reported that running parallel opens on a fuse filesystem
can trigger a "kernel BUG at mm/truncate.c:475"
Gurudas Pai reported the same bug on NFS.
The reason is, unmap_mapping_range() is not prepared for more than
one concurrent invocation per inode. For example:
thread1: going through a big range, stops in the middle of a vma and
stores the restart address in vm_truncate_count.
thread2: comes in with a small (e.g. single page) unmap request on
the same vma, somewhere before restart_address, finds that the
vma was already unmapped up to the restart address and happily
returns without doing anything.
Another scenario would be two big unmap requests, both having to
restart the unmapping and each one setting vm_truncate_count to its
own value. This could go on forever without any of them being able to
finish.
Truncate and hole punching already serialize with i_mutex. Other
callers of unmap_mapping_range() do not, and it's difficult to get
i_mutex protection for all callers. In particular ->d_revalidate(),
which calls invalidate_inode_pages2_range() in fuse, may be called
with or without i_mutex.
This patch adds a new mutex to 'struct address_space' to prevent
running multiple concurrent unmap_mapping_range() on the same mapping.
[ We'll hopefully get rid of all this with the upcoming mm
preemptibility series by Peter Zijlstra, the "mm: Remove i_mmap_mutex
lockbreak" patch in particular. But that is for 2.6.39 ]
Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Reported-by: Michael Leun <lkml20101129@newton.leun.net>
Reported-by: Gurudas Pai <gurudas.pai@oracle.com>
Tested-by: Gurudas Pai <gurudas.pai@oracle.com>
Acked-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Xufeng Zhang [Tue, 21 Jun 2011 10:43:40 +0000 (10:43 +0000)]
udp/recvmsg: Clear MSG_TRUNC flag when starting over for a new packet
[ Upstream commit
9cfaa8def1c795a512bc04f2aec333b03724ca2e ]
Consider this scenario: when the size of the first received UDP packet
is bigger than the receive buffer, the MSG_TRUNC bit is set in
msg->msg_flags. However, if a checksum error happens and this is a blocking
socket, the code jumps back to try_again to receive the next packet. If the
size of the next UDP packet is smaller than the receive buffer, the MSG_TRUNC
flag should not be set, but because the MSG_TRUNC bit is not cleared in
msg->msg_flags before receiving the next packet, MSG_TRUNC is still set,
which is wrong.
Fix this problem by clearing MSG_TRUNC flag when starting over for a
new packet.
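The scenario can be reproduced with a small model of the receive loop. The
struct fake_pkt queue and recv_flags() are hypothetical and only imitate the
control flow described above; MSG_TRUNC comes from <sys/socket.h>.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/socket.h>   /* MSG_TRUNC */

/* Hypothetical queued datagram. */
struct fake_pkt {
    size_t len;
    bool csum_ok;
};

/*
 * Walks the queue until a packet with a good checksum is found and returns
 * the msg_flags a caller with a buflen-byte buffer would see, or -1 if no
 * usable packet is queued.
 */
static int recv_flags(const struct fake_pkt *q, size_t n, size_t buflen)
{
    int msg_flags = 0;

    assert(q != NULL);
    for (size_t i = 0; i < n; i++) {
        msg_flags &= ~MSG_TRUNC;    /* start over: forget the previous packet */
        if (q[i].len > buflen)
            msg_flags |= MSG_TRUNC;
        if (q[i].csum_ok)
            return msg_flags;
        /* bad checksum: drop this packet and try the next one */
    }
    return -1;
}
```

Removing the line that clears MSG_TRUNC makes the second, smaller packet
inherit the flag from the oversized packet that was dropped.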
Signed-off-by: Xufeng Zhang <xufeng.zhang@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Xufeng Zhang [Tue, 21 Jun 2011 10:43:39 +0000 (10:43 +0000)]
ipv6/udp: Use the correct variable to determine non-blocking condition
[ Upstream commit
32c90254ed4a0c698caa0794ebb4de63fcc69631 ]
udpv6_recvmsg() function is not using the correct variable to determine
whether or not the socket is in non-blocking operation, this will lead
to unexpected behavior when a UDP checksum error occurs.
Consider a non-blocking udp receive scenario: when udpv6_recvmsg() is
called by sock_common_recvmsg(), MSG_DONTWAIT bit of flags variable in
udpv6_recvmsg() is cleared by "flags & ~MSG_DONTWAIT" in this call:
err = sk->sk_prot->recvmsg(iocb, sk, msg, size, flags & MSG_DONTWAIT,
flags & ~MSG_DONTWAIT, &addr_len);
i.e. with udpv6_recvmsg() getting these values:
int noblock = flags & MSG_DONTWAIT
int flags = flags & ~MSG_DONTWAIT
So, when udp checksum error occurs, the execution will go to
csum_copy_err, and then the problem happens:
csum_copy_err:
...............
if (flags & MSG_DONTWAIT)
return -EAGAIN;
goto try_again;
...............
But it will always go to try_again as MSG_DONTWAIT has been cleared
from flags at call time -- only noblock contains the original value
of MSG_DONTWAIT, so the test should be:
if (noblock)
return -EAGAIN;
This is also consistent with what the ipv4/udp code does.
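The difference between the two tests can be checked in isolation. In this
sketch, csum_err_buggy(), csum_err_fixed() and call_with() are hypothetical
helpers, and a return value of 0 stands for "goto try_again".

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>   /* MSG_DONTWAIT */

/* Checksum-error path testing the wrong variable. */
static int csum_err_buggy(int noblock, int flags)
{
    (void)noblock;
    return (flags & MSG_DONTWAIT) ? -EAGAIN : 0;
}

/* Checksum-error path testing noblock, which kept the original bit. */
static int csum_err_fixed(int noblock, int flags)
{
    (void)flags;
    return noblock ? -EAGAIN : 0;
}

/* Mimics the caller splitting the user's flags into the two arguments. */
static int call_with(int (*handler)(int, int), int user_flags)
{
    assert(handler != NULL);
    return handler(user_flags & MSG_DONTWAIT, user_flags & ~MSG_DONTWAIT);
}
```

For a non-blocking caller the buggy variant returns 0 (retry) while the fixed
variant returns -EAGAIN.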
Signed-off-by: Xufeng Zhang <xufeng.zhang@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Marcus Meissner [Thu, 2 Jun 2011 04:05:22 +0000 (21:05 -0700)]
net/ipv4: Check for mistakenly passed in non-IPv4 address
[ Upstream commit
d0733d2e29b652b2e7b1438ececa732e4eed98eb ]
Check against mistakenly passing in IPv6 addresses (which would result
in an INADDR_ANY bind) or similar incompatible sockaddrs.
Signed-off-by: Marcus Meissner <meissner@suse.de>
Cc: Reinhard Max <max@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Eric Dumazet [Tue, 7 Jun 2011 05:42:06 +0000 (22:42 -0700)]
af_packet: prevent information leak
[ Upstream commit
13fcb7bd322164c67926ffe272846d4860196dc6 ]
In 2.6.27, commit
393e52e33c6c2 (packet: deliver VLAN TCI to userspace)
added a small information leak.
Add a padding field and make sure it's zeroed before copying to user space.
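A minimal sketch of the idea, assuming a made-up record layout. struct
aux_record and fill_record() are hypothetical and do not match the af_packet
structures; clearing the whole record with memset() is one way to guarantee
that the padding carries no stale bytes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record layout with an explicit, named padding field. */
struct aux_record {
    uint32_t status;
    uint16_t vlan_tci;
    uint16_t padding;   /* named so that it can be cleared explicitly */
};

/* Fills a record that is about to be copied out to an untrusted reader. */
static void fill_record(struct aux_record *r, uint32_t status, uint16_t tci)
{
    assert(r != NULL);
    memset(r, 0, sizeof(*r));   /* no stale bytes survive, padding included */
    r->status = status;
    r->vlan_tci = tci;
}
```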
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
CC: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Joe Perches [Sat, 21 May 2011 07:48:40 +0000 (07:48 +0000)]
net: filter: Use WARN_RATELIMIT
[ Upstream commit
6c4a5cb219520c7bc937ee186ca53f03733bd09f ]
A mis-configured filter can spam the logs with lots of stack traces.
Rate-limit the warnings and add printout of the bogus filter information.
Original-patch-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Joe Perches [Sat, 21 May 2011 07:48:39 +0000 (07:48 +0000)]
bug.h: Add WARN_RATELIMIT
[ Upstream commit
b3eec79b0776e5340a3db75b34953977c7e5086e ]
Add a generic mechanism to ratelimit WARN(foo, fmt, ...) messages
using a hidden per call site static struct ratelimit_state.
Also add a __WARN_RATELIMIT variant to be able to use a specific
struct ratelimit_state.
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Rafael J. Wysocki [Wed, 6 Jul 2011 18:15:23 +0000 (20:15 +0200)]
PM / Hibernate: Fix free_unnecessary_pages()
commit
4d4cf23cdde2f8f9324f5684a7f349e182039529 upstream.
There is a bug in free_unnecessary_pages() that causes it to
attempt to free too many pages in some cases, which triggers the
BUG_ON() in memory_bm_clear_bit() for copy_bm. Namely, if
count_data_pages() is initially greater than alloc_normal, we get
to_free_normal equal to 0 and "save" greater than 0. In that case,
if the sum of "save" and count_highmem_pages() is greater than
alloc_highmem, we subtract a positive number from to_free_normal.
Hence, since to_free_normal was 0 before the subtraction and is
an unsigned int, the result is converted to a huge positive number
that is used as the number of pages to free.
Fix this bug by checking if to_free_normal is actually greater
than or equal to the number we're going to subtract from it.
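The wrap-around is easy to demonstrate outside the kernel. adjust_buggy() and
adjust_fixed() are hypothetical helpers; clamping to 0 in the fixed variant is
an assumption made for this sketch and may differ from what the patch does in
that branch.

```c
#include <assert.h>
#include <limits.h>

/* Buggy: unsigned subtraction wraps around when sub > to_free. */
static unsigned int adjust_buggy(unsigned int to_free, unsigned int sub)
{
    return to_free - sub;
}

/* Fixed: subtract only when the result cannot drop below zero. */
static unsigned int adjust_fixed(unsigned int to_free, unsigned int sub)
{
    if (to_free >= sub)
        return to_free - sub;
    return 0;   /* assumption for this sketch: nothing left to free */
}
```

With to_free equal to 0 and sub equal to 5, the buggy variant yields
UINT_MAX - 4, which is the "huge positive number" mentioned above.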
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Reported-and-tested-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Eric Dumazet [Fri, 17 Jun 2011 20:25:39 +0000 (16:25 -0400)]
inet_diag: fix inet_diag_bc_audit()
[ Upstream commit
eeb1497277d6b1a0a34ed36b97e18f2bd7d6de0d ]
A malicious user or buggy application can inject code and trigger an
infinite loop in inet_diag_bc_audit()
Also make sure each instruction is aligned on a 4-byte boundary, to avoid
unaligned accesses.
Reported-by: Dan Rosenberg <drosenberg@vsecurity.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Roland McGrath [Tue, 26 Oct 2010 21:22:19 +0000 (14:22 -0700)]
uml: fix CONFIG_STATIC_LINK=y build failure with newer glibc
commit
aa5fb4dbfd121296ca97c68cf90043a7ea97579d upstream.
With glibc 2.11 or later that was built with --enable-multi-arch, the UML
link fails with undefined references to __rel_iplt_start and similar
symbols. In recent binutils, the default linker script defines these
symbols (see ld --verbose). Fix the UML linker scripts to match the new
defaults for these sections.
Signed-off-by: Roland McGrath <roland@redhat.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Alan Stern [Wed, 15 Jun 2011 20:29:16 +0000 (16:29 -0400)]
USB: don't let the hub driver prevent system sleep
commit
cbb330045e5df8f665ac60227ff898421fc8fb92 upstream.
This patch (as1465) continues implementation of the policy that errors
during suspend or hibernation should not prevent the system from going
to sleep.
In this case, failure to turn on the Suspend feature for a hub port
shouldn't be reported as an error. There are situations where this
does actually occur (such as when the device plugged into that port
was disconnected in the recent past), and it turns out to be harmless.
There's no reason for it to prevent a system sleep.
Also, don't allow the hub driver to fail a system suspend if the
downstream ports aren't all suspended. This is also harmless (and
should never happen, given the change mentioned above); printing a
warning message in the kernel log is all we really need to do.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Alan Stern [Wed, 15 Jun 2011 20:27:43 +0000 (16:27 -0400)]
USB: don't let errors prevent system sleep
commit
0af212ba8f123c2eba151af7726c34a50b127962 upstream.
This patch (as1464) implements the recommended policy that most errors
during suspend or hibernation should not prevent the system from going
to sleep. In particular, failure to suspend a USB driver or a USB
device should not prevent the sleep from succeeding:
Failure to suspend a device won't matter, because the device will
automatically go into suspend mode when the USB bus stops carrying
packets. (This might be less true for USB-3.0 devices, but let's not
worry about them now.)
Failure of a driver to suspend might lead to trouble later on when the
system wakes up, but it isn't sufficient reason to prevent the system
from going to sleep.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Vasiliy Kulikov [Mon, 27 Jun 2011 23:18:11 +0000 (16:18 -0700)]
taskstats: don't allow duplicate entries in listener mode
commit
26c4caea9d697043cc5a458b96411b86d7f6babd upstream.
Currently a single process may register exit handlers unlimited times.
It may lead to a bloated listeners chain and very slow process
terminations.
E.g. after 10 million sent TASKSTATS_CMD_ATTR_REGISTER_CPUMASKs, ~300 Mb of
kernel memory is stolen for the handlers chain and "time id" shows 2-7
seconds instead of normal 0.003. It makes it possible to exhaust all
kernel memory and to eat much of CPU time by triggering numerous exits
on a single CPU.
The patch limits the number of times a single process may register
itself on a single CPU to one.
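The duplicate check can be sketched with a plain singly linked list. struct
listener, add_listener() and the other helpers are hypothetical names for this
illustration; the real listener list and its locking are not modelled.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical listener list for one CPU. */
struct listener {
    int pid;
    struct listener *next;
};

/* Adds pid unless present. Returns 1 if added, 0 if duplicate, -1 on failure. */
static int add_listener(struct listener **head, int pid)
{
    assert(head != NULL);
    for (struct listener *l = *head; l != NULL; l = l->next)
        if (l->pid == pid)
            return 0;   /* already registered: do not grow the chain */

    struct listener *n = malloc(sizeof(*n));
    if (n == NULL)
        return -1;
    n->pid = pid;
    n->next = *head;
    *head = n;
    return 1;
}

static int count_listeners(const struct listener *head)
{
    int n = 0;

    for (; head != NULL; head = head->next)
        n++;
    return n;
}

static void free_listeners(struct listener *head)
{
    while (head != NULL) {
        struct listener *next = head->next;

        free(head);
        head = next;
    }
}
```

Repeated registrations from the same pid leave the chain at one entry, so the
exit path no longer walks an ever-growing list.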
One little issue is kept unfixed - as taskstats_exit() is called before
exit_files() in do_exit(), the orphaned listener entry (if it was not
explicitly deregistered) is kept until the next exit() of any process and
the implicit deregistration in send_cpu_listeners(). So, if a process that
registered itself as a listener exits and the next spawned process gets
the same pid, it would inherit taskstats attributes.
Signed-off-by: Vasiliy Kulikov <segooon@gmail.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Arnd Bergmann [Sat, 2 Jul 2011 00:30:00 +0000 (17:30 -0700)]
6pack,mkiss: fix lock inconsistency
commit
6e4e2f811bade330126d4029c88c831784a7efd9 upstream.
Lockdep found a locking inconsistency in the mkiss_close function:
> kernel: [ INFO: inconsistent lock state ]
> kernel: 2.6.39.1 #3
> kernel: ---------------------------------
> kernel: inconsistent {IN-SOFTIRQ-R} -> {SOFTIRQ-ON-W} usage.
> kernel: ax25ipd/2813 [HC0[0]:SC0[0]:HE1:SE1] takes:
> kernel: (disc_data_lock){+++?.-}, at: [<ffffffffa018552b>] mkiss_close+0x1b/0x90 [mkiss]
> kernel: {IN-SOFTIRQ-R} state was registered at:
The message hints that disc_data_lock is acquired with softirqs disabled,
but does not itself disable softirqs, which can in rare circumstances
lead to a deadlock.
The same problem is present in the 6pack driver; this patch fixes both
by using write_lock_bh instead of write_lock.
Reported-by: Bernard F6BVP <f6bvp@free.fr>
Tested-by: Bernard F6BVP <f6bvp@free.fr>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Trond Myklebust [Fri, 17 Jun 2011 14:14:59 +0000 (10:14 -0400)]
SUNRPC: Ensure the RPC client only quits on fatal signals
commit
5afa9133cfe67f1bfead6049a9640c9262a7101c upstream.
Fix a couple of instances where we were exiting the RPC client on
arbitrary signals. We should only do so on fatal signals.
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
NeilBrown [Tue, 28 Jun 2011 06:59:42 +0000 (16:59 +1000)]
md: avoid endless recovery loop when waiting for fail device to complete.
commit
4274215d24633df7302069e51426659d4759c5ed upstream.
If a device fails in a way that causes pending requests to take a while
to complete, md will not be able to immediately remove it from the
array in remove_and_add_spares.
It will then incorrectly look like a spare device and md will try to
recover it even though it is failed.
This leads to a recovery process starting and instantly aborting over
and over again.
We should check if the device is faulty before considering it to be a
spare. This will avoid trying to start a recovery that cannot
proceed.
This bug was introduced in 2.6.26, so this patch is suitable for any
kernel since then.
Reported-by: Jim Paradis <james.paradis@stratus.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Jean Delvare [Wed, 29 Jun 2011 09:36:10 +0000 (11:36 +0200)]
i2c-taos-evm: Fix log messages
commit 9b640f2e154268cb516efcaf9c434f2e73c6783e upstream.
* Print all error and information messages even when debugging is
disabled.
* Don't use adapter device to log messages before it is ready.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Shaohua Li [Mon, 27 Jun 2011 07:03:47 +0000 (09:03 +0200)]
cfq-iosched: fix a rcu warning
commit 3181faa85bda3dc3f5e630a1846526c9caaa38e3 upstream.
I got an RCU warning at boot: ioc->ioc_data is rcu_dereferenced
without holding rcu_read_lock.
Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Jens Axboe [Sun, 5 Jun 2011 04:01:13 +0000 (06:01 +0200)]
cfq-iosched: fix locking around ioc->ioc_data assignment
commit ab4bd22d3cce6977dc039664cc2d052e3147d662 upstream.
Since we are modifying this RCU pointer, we need to hold
the lock protecting it around it.
This fixes a potential reuse and double free of a cfq
io_context structure. The bug has been in CFQ for a long
time, it hit very few people but those it did hit seemed
to see it a lot.
Tracked in RH bugzilla here:
https://bugzilla.redhat.com/show_bug.cgi?id=577968
Credit goes to Paul Bolle for figuring out that the issue
was around the one-hit ioc->ioc_data cache. Thanks to his
hard work the issue is now fixed.
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Marcin Slusarz [Sat, 28 May 2011 11:23:42 +0000 (13:23 +0200)]
debugobjects: Fix boot crash when kmemleak and debugobjects enabled
commit 161b6ae0e067e421b20bb35caf66bdb405c929ac upstream.
The order of initialization looks like this:
...
debugobjects
kmemleak
...(lots of other subsystems)...
workqueues (through early initcall)
...
debugobjects uses schedule_work() for batch freeing of its data, and kmemleak
heavily uses debugobjects, so when an object is freed before workqueues have
been initialized, the kernel crashes:
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<ffffffff810854d1>] __queue_work+0x29/0x41a
[<ffffffff81085910>] queue_work_on+0x16/0x1d
[<ffffffff81085abc>] queue_work+0x29/0x55
[<ffffffff81085afb>] schedule_work+0x13/0x15
[<ffffffff81242de1>] free_object+0x90/0x95
[<ffffffff81242f6d>] debug_check_no_obj_freed+0x187/0x1d3
[<ffffffff814b6504>] ? _raw_spin_unlock_irqrestore+0x30/0x4d
[<ffffffff8110bd14>] ? free_object_rcu+0x68/0x6d
[<ffffffff8110890c>] kmem_cache_free+0x64/0x12c
[<ffffffff8110bd14>] free_object_rcu+0x68/0x6d
[<ffffffff810b58bc>] __rcu_process_callbacks+0x1b6/0x2d9
...
because system_wq is NULL.
Fix it by checking whether the workqueue subsystem has been initialized before using it.
Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Dipankar Sarma <dipankar@in.ibm.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/20110528112342.GA3068@joi.lan
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Florian Fainelli [Wed, 15 Jun 2011 17:15:23 +0000 (19:15 +0200)]
watchdog: mtx1-wdt: request gpio before using it
commit 9b19d40aa3ebaf1078779da10555da2ab8512422 upstream.
Otherwise, the gpiolib autorequest feature will produce a WARN_ON():
WARNING: at drivers/gpio/gpiolib.c:101 0x8020ec6c()
autorequest GPIO-215
[...]
Signed-off-by: Florian Fainelli <florian@openwrt.org>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Sjoerd Simons [Tue, 24 May 2011 15:22:03 +0000 (12:22 -0300)]
uvcvideo: Remove buffers from the queues when freeing
commit 8ca2c80b170c47eeb55f0c2a0f2b8edf85f35d49 upstream.
When freeing memory for the video buffers also remove them from the
irq & main queues.
This fixes an oops when doing the following:
open ("/dev/video", ..)
VIDIOC_REQBUFS
VIDIOC_QBUF
VIDIOC_REQBUFS
close ()
The second VIDIOC_REQBUFS causes the list entries of the buffers
to be cleared while they still hang around on the main and irq queues.
Signed-off-by: Sjoerd Simons <sjoerd.simons@collabora.co.uk>
Acked-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Rafael Aquini [Wed, 15 Jun 2011 22:08:39 +0000 (15:08 -0700)]
mm: fix negative commitlimit when gigantic hugepages are allocated
commit b0320c7b7d1ac1bd5c2d9dff3258524ab39bad32 upstream.
When 1GB hugepages are allocated on a system, free(1) reports less
available memory than what really is installed in the box. Also, if the
total size of hugepages allocated on a system is over half of the total
memory size, CommitLimit becomes a negative number.
The problem is that gigantic hugepages (order > MAX_ORDER) can only be
allocated at boot with bootmem, thus their frames are not accounted to
'totalram_pages'. However, they are accounted to hugetlb_total_pages().
What happens to turn CommitLimit into a negative number is this
calculation, in fs/proc/meminfo.c:
allowed = ((totalram_pages - hugetlb_total_pages())
* sysctl_overcommit_ratio / 100) + total_swap_pages;
A similar calculation occurs in __vm_enough_memory() in mm/mmap.c.
Also, every vm statistic which depends on 'totalram_pages' will render
confusing values, as if system were 'missing' some part of its memory.
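The arithmetic above can be reproduced outside the kernel. The following is a minimal userspace sketch, not the kernel code: commit_limit_pages() is a hypothetical helper that mirrors the quoted 'allowed' expression, with every quantity expressed in pages and signed arithmetic so the negative intermediate stays visible.

```c
/*
 * Hypothetical userspace model of the 'allowed' expression quoted
 * above. All arguments are page counts; none of these names are
 * kernel symbols.
 */
static long commit_limit_pages(long totalram_pages,
                               long hugetlb_total_pages,
                               long overcommit_ratio,
                               long total_swap_pages)
{
    /* Same shape as the fs/proc/meminfo.c calculation. */
    return (totalram_pages - hugetlb_total_pages)
           * overcommit_ratio / 100 + total_swap_pages;
}
```

As an assumed example, take a 16 GiB machine with 4 KiB pages (4194304 pages), ten 1 GiB hugepages (2621440 pages), an overcommit ratio of 50 and no swap. If the gigantic pages are missing from totalram_pages, the call is commit_limit_pages(1572864, 2621440, 50, 0) and yields -524288 pages. If they are counted, commit_limit_pages(4194304, 2621440, 50, 0) yields 786432 pages.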
Impact of this bug:
This matters when gigantic hugepages are allocated and
sysctl_overcommit_memory == OVERCOMMIT_NEVER. In such a situation,
__vm_enough_memory() goes through the mentioned 'allowed' calculation and
might end up mistakenly returning -ENOMEM, thus forcing the system to start
reclaiming pages earlier than usual, which could hurt overall system
performance, depending on the workload.
Besides the aforementioned scenario, I can only think of this causing
annoyances with memory reports from /proc/meminfo and free(1).
[akpm@linux-foundation.org: standardize comment layout]
Reported-by: Russ Anderson <rja@sgi.com>
Signed-off-by: Rafael Aquini <aquini@linux.com>
Acked-by: Russ Anderson <rja@sgi.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Christoph Lameter <cl@linux.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Eugene A. Shatokhin [Wed, 29 Jun 2011 03:04:51 +0000 (23:04 -0400)]
ath5k: fix memory leak when fewer than N_PD_CURVES are in use
commit a0b8de350be458b33248e48b2174d9af8a4c4798 upstream.
We would free the proper number of curves, but in the wrong
slots, due to a missing level of indirection through
the pdgain_idx table.
It's simpler just to try to free all four slots, so do that.
Signed-off-by: Bob Copeland <me@bobcopeland.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Michal Kubecek [Sat, 18 Jun 2011 18:34:01 +0000 (20:34 +0200)]
PM: Free memory bitmaps if opening /dev/snapshot fails
commit 8440f4b19494467883f8541b7aa28c7bbf6ac92b upstream.
When opening /dev/snapshot device, snapshot_open() creates memory
bitmaps which are freed in snapshot_release(). But if any of the
callbacks called by pm_notifier_call_chain() returns NOTIFY_BAD, open()
fails, snapshot_release() is never called and bitmaps are not freed.
Next attempt to open /dev/snapshot then triggers BUG_ON() check in
create_basic_memory_bitmaps(). This happens e.g. when vmwatchdog module
is active on s390x.
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Sarah Sharp [Mon, 6 Jun 2011 06:10:04 +0000 (23:10 -0700)]
xhci: Reject double add of active endpoints.
commit fa75ac379e63c2864e9049b5e8615e40f65c1e70 upstream.
While trying to switch a UAS device from the BOT configuration to the UAS
configuration via the bConfigurationValue file, Tanya ran into an issue in
the USB core. usb_disable_device() sets entries in udev->ep_in and
udev->ep_out to NULL, but doesn't call into the xHCI bandwidth management
functions to remove the BOT configuration endpoints from the xHCI host's
internal structures.
The USB core would then attempt to add endpoints for the UAS
configuration, and some of the endpoints had the same address as endpoints
in the BOT configuration. The xHCI driver blindly added the endpoints
again, but the xHCI host controller rejected the Configure Endpoint
command because active endpoints were added without being dropped.
Make the xHCI driver reject calls to xhci_add_endpoint() that attempt to
add active endpoints without first calling xhci_drop_endpoint().
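The rule described above can be sketched with plain bitmasks. This is a simplified userspace illustration, not the xHCI driver: struct ep_state, add_endpoint() and drop_endpoint() are hypothetical stand-ins for the driver's per-device state and its xhci_add_endpoint()/xhci_drop_endpoint() entry points.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical per-device bookkeeping, one bit per endpoint index. */
struct ep_state {
    uint32_t active;     /* endpoints the host controller already has */
    uint32_t add_flags;  /* additions queued for the next configure   */
    uint32_t drop_flags; /* removals queued for the next configure    */
};

/* Queue an endpoint removal. */
static void drop_endpoint(struct ep_state *s, unsigned int ep_index)
{
    s->drop_flags |= 1u << ep_index;
}

/* Queue an endpoint addition, refusing a double add. */
static int add_endpoint(struct ep_state *s, unsigned int ep_index)
{
    uint32_t flag = 1u << ep_index;

    /* Active and not dropped first: the controller would reject this. */
    if ((s->active & flag) && !(s->drop_flags & flag))
        return -EINVAL;

    s->add_flags |= flag;
    return 0;
}
```

With endpoint 2 marked active, add_endpoint() fails until drop_endpoint() has been called for the same index, which is the ordering the commit asks callers to respect.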
This should be backported to kernels as old as 2.6.31.
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Reported-by: Tanya Brokhman <tlinder@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Thomas Gleixner [Thu, 16 Jun 2011 14:22:08 +0000 (16:22 +0200)]
clocksource: Make watchdog robust vs. interruption
commit b5199515c25cca622495eb9c6a8a1d275e775088 upstream.
The clocksource watchdog code is interruptible and it has been
observed that this can trigger false positives which disable the TSC.
The reason is that an interrupt storm or a long running interrupt
handler between the read of the watchdog source and the read of the
TSC brings the two far enough apart that the delta is larger than the
unstable threshold. Move both reads into a short interrupt disabled
region to avoid that.
Reported-and-tested-by: Vernon Mauery <vernux@us.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Stefano Stabellini [Fri, 3 Jun 2011 09:51:34 +0000 (09:51 +0000)]
xen: partially revert "xen: set max_pfn_mapped to the last pfn mapped"
commit a91d92875ee94e4703fd017ccaadb48cfb344994 upstream.
We only need to set max_pfn_mapped to the last pfn mapped on x86_64 to
make sure that cleanup_highmap doesn't remove important mappings at
_end.
We don't need to do this on x86_32 because cleanup_highmap is not called
on x86_32. Besides, lowering max_pfn_mapped on x86_32 has the unwanted
side effect of limiting the amount of memory available for the 1:1
kernel pagetable allocation.
This patch reverts the x86_32 part of the original patch.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Andrea Arcangeli [Thu, 16 Jun 2011 19:56:19 +0000 (12:56 -0700)]
migrate: don't account swapcache as shmem
commit 99a15e21d96f6857dafab1e5167e5e8183215c9c upstream.
swapcache will reach the below code path in migrate_page_move_mapping,
and swapcache is accounted as NR_FILE_PAGES but it's not accounted as
NR_SHMEM.
Hugh pointed out we must use PageSwapCache instead of comparing
mapping to &swapper_space, to avoid build failure with CONFIG_SWAP=n.
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Hugh Dickins [Wed, 15 Jun 2011 22:08:58 +0000 (15:08 -0700)]
ksm: fix NULL pointer dereference in scan_get_next_rmap_item()
commit 2b472611a32a72f4a118c069c2d62a1a3f087afd upstream.
Andrea Righi reported a case where an exiting task can race against
ksmd::scan_get_next_rmap_item (http://lkml.org/lkml/2011/6/1/742) easily
triggering a NULL pointer dereference in ksmd.
ksm_scan.mm_slot == &ksm_mm_head with only one registered mm
CPU 1 (__ksm_exit)          CPU 2 (scan_get_next_rmap_item)
                            list_empty() is false
lock                        slot == &ksm_mm_head
list_del(slot->mm_list)
(list now empty)
unlock
                            lock
                            slot = list_entry(slot->mm_list.next)
                            (list is empty, so slot is still ksm_mm_head)
                            unlock
                            slot->mm == NULL ... Oops
Close this race by revalidating that the new slot is not simply the list
head again.
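The revalidation can be illustrated with a tiny circular list. This is a userspace sketch under simplifying assumptions (no locking, a hand-rolled list instead of the kernel's list_head); struct slot and next_slot() are hypothetical names, not the ksm code.

```c
#include <stddef.h>

/* Hypothetical stand-in for an mm slot on a circular list. */
struct slot {
    struct slot *next; /* circular: the last entry points back at the head */
    void *mm;          /* NULL only for the list head                      */
};

/*
 * Advance from cur and revalidate the result: if the list emptied
 * behind our back, the "next" entry is the head itself, whose mm is
 * NULL and must not be dereferenced.
 */
static struct slot *next_slot(struct slot *head, struct slot *cur)
{
    struct slot *slot = cur->next;

    if (slot == head)
        return NULL; /* nothing left to scan */
    return slot;
}
```

A caller that gets NULL back simply has nothing to scan, instead of going on to read slot->mm from the head.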
Andrea's test case:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/mman.h>
#define BUFSIZE getpagesize()
int main(int argc, char **argv)
{
	void *ptr;
	if (posix_memalign(&ptr, getpagesize(), BUFSIZE) < 0) {
		perror("posix_memalign");
		exit(1);
	}
	if (madvise(ptr, BUFSIZE, MADV_MERGEABLE) < 0) {
		perror("madvise");
		exit(1);
	}
	*(char *)NULL = 0;
	return 0;
}
Reported-by: Andrea Righi <andrea@betterlinux.com>
Tested-by: Andrea Righi <andrea@betterlinux.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Kasper Pedersen [Wed, 20 Oct 2010 22:55:15 +0000 (15:55 -0700)]
time: Compensate for rounding on odd-frequency clocksources
commit a386b5af8edda1c742ce9f77891e112eefffc005 upstream.
When the clocksource frequency is not a multiple of HZ, the clock will be
off. For acpi_pm with HZ=1000, the error is 127.111 ppm:
The rounding of cycle_interval ends up generating a false error term in
ntp_error accumulation since xtime_interval is not exactly 1/HZ. So, we
subtract out the error caused by the rounding.
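The 127.111 ppm figure can be checked with integer arithmetic. The sketch below is a userspace illustration, assuming the ACPI PM timer's nominal rate of 3579545 Hz and round-to-nearest when picking the cycles per tick; cycles_per_tick() and tick_error_ppb() are hypothetical helpers, not timekeeping code.

```c
#include <stdint.h>

/* Whole number of clocksource cycles per tick, rounded to nearest. */
static uint64_t cycles_per_tick(uint64_t freq_hz, uint64_t hz)
{
    return (freq_hz + hz / 2) / hz;
}

/*
 * Relative error introduced by that rounding, in parts per billion.
 * Positive means the rounded tick is longer than the ideal 1/HZ.
 */
static int64_t tick_error_ppb(uint64_t freq_hz, uint64_t hz)
{
    int64_t rounded = (int64_t)(cycles_per_tick(freq_hz, hz) * hz);
    int64_t diff = rounded - (int64_t)freq_hz;

    return diff * 1000000000LL / (int64_t)freq_hz;
}
```

For 3579545 Hz and HZ=1000 the ideal tick is 3579.545 cycles, which rounds to 3580. The excess of 0.455 cycles per tick gives tick_error_ppb(3579545, 1000) == 127111, that is 127.111 ppm.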
This has been visible since 2.6.32-rc2, with commit
a092ff0f90cae22b2ac8028ecd2c6f6c1a9e4601
("time: Implement logarithmic time accumulation").
That commit raised NTP_INTERVAL_FREQ and exposed the rounding error.
testing tool: http://n1.taur.dk/permanent/testpmt.c
Also tested with ntpd and a frequency counter.
Signed-off-by: Kasper Pedersen <kkp2010@kasperkp.dk>
Acked-by: john stultz <johnstul@us.ibm.com>
Cc: John Kacur <jkacur@redhat.com>
Cc: Clark Williams <williams@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Will Tisdale <willtisdale@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Thomas Gleixner [Sat, 5 Feb 2011 20:08:59 +0000 (20:08 +0000)]
xen: Use IRQF_FORCE_RESUME
commit 676dc3cf5bc36a9e129a3ad8fe3bd7b2ebf20f5d upstream.
Mark the IRQF_NO_SUSPEND interrupts IRQF_FORCE_RESUME and remove the extra
walk through the interrupt descriptors.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Thomas Gleixner [Fri, 4 Feb 2011 12:19:20 +0000 (13:19 +0100)]
genirq: Add IRQF_FORCE_RESUME
commit dc5f219e88294b93009eef946251251ffffb6d60 upstream.
Xen needs to reenable interrupts which are marked IRQF_NO_SUSPEND in the
resume path. Add a flag to force the reenabling in the resume code.
Tested-and-acked-by: Ian Campbell <Ian.Campbell@eu.citrix.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Ian Campbell [Mon, 1 Nov 2010 16:30:09 +0000 (16:30 +0000)]
xen: events: do not unmask event channels on resume
commit 6903591f314b8947d0e362bda7715e90eb9df75e upstream.
The IRQ core code will take care of disabling and reenabling
interrupts over suspend/resume automatically, therefore we do not need
to do this in the Xen event channel code.
The only exception is those event channels marked IRQF_NO_SUSPEND
which the IRQ core ignores. We must unmask these ourselves, taking
care to obey the current IRQ_DISABLED status. Failure to check for
IRQ_DISABLED leads to enabling polled-only event channels, such as
that associated with the pv spinlocks, which must never be enabled:
[ 21.970432] ------------[ cut here ]------------
[ 21.970432] kernel BUG at arch/x86/xen/spinlock.c:343!
[ 21.970432] invalid opcode: 0000 [#1] SMP
[ 21.970432] last sysfs file: /sys/devices/virtual/net/lo/operstate
[ 21.970432] Modules linked in:
[ 21.970432]
[ 21.970432] Pid: 0, comm: swapper Not tainted (2.6.32.24-x86_32p-xen-01034-g787c727 #34)
[ 21.970432] EIP: 0061:[<c102e209>] EFLAGS: 00010046 CPU: 3
[ 21.970432] EIP is at dummy_handler+0x3/0x7
[ 21.970432] EAX: 0000021c EBX: dfc16880 ECX: 0000001a EDX: 00000000
[ 21.970432] ESI: dfc02c00 EDI: 00000001 EBP: dfc47e10 ESP: dfc47e10
[ 21.970432] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
[ 21.970432] Process swapper (pid: 0, ti=dfc46000 task=dfc39440 task.ti=dfc46000)
[ 21.970432] Stack:
[ 21.970432] dfc47e30 c10a39f0 0000021c 00000000 00000000 dfc16880 0000021c 00000001
[ 21.970432] <0> dfc47e40 c10a4f08 0000021c 00000000 dfc47e78 c12240a7 c1839284 c1839284
[ 21.970432] <0> 00000200 00000000 00000000 f5720000 c1f3d028 c1f3d02c 00000180 dfc47e90
[ 21.970432] Call Trace:
[ 21.970432] [<c10a39f0>] ? handle_IRQ_event+0x5f/0x122
[ 21.970432] [<c10a4f08>] ? handle_percpu_irq+0x2f/0x55
[ 21.970432] [<c12240a7>] ? __xen_evtchn_do_upcall+0xdb/0x15f
[ 21.970432] [<c122481e>] ? xen_evtchn_do_upcall+0x20/0x30
[ 21.970432] [<c1030d47>] ? xen_do_upcall+0x7/0xc
[ 21.970432] [<c102007b>] ? apic_reg_read+0xd3/0x22d
[ 21.970432] [<c1002227>] ? hypercall_page+0x227/0x1005
[ 21.970432] [<c102d30b>] ? xen_force_evtchn_callback+0xf/0x14
[ 21.970432] [<c102da7c>] ? check_events+0x8/0xc
[ 21.970432] [<c102da3b>] ? xen_irq_enable_direct_end+0x0/0x1
[ 21.970432] [<c105e485>] ? finish_task_switch+0x62/0xba
[ 21.970432] [<c14e3f84>] ? schedule+0x808/0x89d
[ 21.970432] [<c1084dc5>] ? hrtimer_start_expires+0x1a/0x22
[ 21.970432] [<c1085154>] ? tick_nohz_restart_sched_tick+0x15a/0x162
[ 21.970432] [<c102f43a>] ? cpu_idle+0x6d/0x6f
[ 21.970432] [<c14db29e>] ? cpu_bringup_and_idle+0xd/0xf
[ 21.970432] Code: 5d 0f 95 c0 0f b6 c0 c3 55 66 83 78 02 00 89 e5 5d 0f 95 \
c0 0f b6 c0 c3 55 b2 01 86 10 31 c0 84 d2 89 e5 0f 94 c0 5d c3 55 89 e5 <0f> 0b \
eb fe 55 80 3d 4c ce 84 c1 00 89 e5 57 56 89 c6 53 74 15
[ 21.970432] EIP: [<c102e209>] dummy_handler+0x3/0x7 SS:ESP 0069:dfc47e10
[ 21.970432] ---[ end trace c0b71f7e12cf3011 ]---
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Fernando Luis Vazquez Cao [Tue, 10 May 2011 08:00:21 +0000 (10:00 +0200)]
netfilter: IPv6: fix DSCP mangle code
commit 1ed2f73d90fb49bcf5704aee7e9084adb882bfc5 upstream.
The mask indicates the bits one wants to zero out, so it needs to be
inverted before applying to the original TOS field.
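The effect of the missing inversion is easy to see on a single byte. The following is a userspace sketch of the mask-then-XOR rule described above, not the netfilter code; mangle_tos() and mangle_tos_buggy() are hypothetical names.

```c
#include <stdint.h>

/* mask selects the bits to zero out; value is then XORed in. */
static uint8_t mangle_tos(uint8_t orig, uint8_t mask, uint8_t value)
{
    return (uint8_t)((orig & (uint8_t)~mask) ^ value);
}

/* Without the inversion, exactly the wrong bits survive. */
static uint8_t mangle_tos_buggy(uint8_t orig, uint8_t mask, uint8_t value)
{
    return (uint8_t)((orig & mask) ^ value);
}
```

As an assumed example, take an original field of 0xB9 (DSCP 46 with ECN bits 01), a mask of 0xFC to clear the six DSCP bits, and a value of 0x28 (DSCP 10). mangle_tos() gives 0x29, the new DSCP with ECN preserved, while mangle_tos_buggy() gives 0x90, which has neither the requested DSCP nor the original ECN bits.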
Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Fernando Luis Vazquez Cao [Tue, 10 May 2011 07:55:44 +0000 (09:55 +0200)]
netfilter: IPv6: initialize TOS field in REJECT target module
commit 4319cc0cf5bb894b7368008cdf6dd20eb8868018 upstream.
The IPv6 header is not zeroed out in alloc_skb so we must initialize
it properly unless we want to see IPv6 packets with random TOS fields
floating around. The current implementation resets the flow label
but this could be changed if deemed necessary.
We stumbled upon this issue when trying to apply a mangle rule to
the RST packet generated by the REJECT target module.
Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Mathias Krause [Thu, 9 Jun 2011 18:05:18 +0000 (20:05 +0200)]
exec: delay address limit change until point of no return
commit dac853ae89043f1b7752875300faf614de43c74b upstream.
Unconditionally changing the address limit to USER_DS and not restoring
it to its old value in the error path is wrong because it prevents us
from using kernel memory on repeated calls to this function. This, in
fact, prevents the fallback through the hard-coded paths to the init
program from ever succeeding if the first candidate fails to load.
With this patch applied switching to USER_DS is delayed until the point
of no return is reached which makes it possible to have a multi-arch
rootfs with one arch specific init binary for each of the (hard coded)
probed paths.
Since the address limit is already set to USER_DS when start_thread()
will be invoked, this redundancy can be safely removed.
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Namhyung Kim [Mon, 13 Jun 2011 05:48:22 +0000 (14:48 +0900)]
md/raid5: fix raid5_set_bi_hw_segments
commit 9b2dc8b665932a8e681a7ab3237f60475e75e161 upstream.
The @bio->bi_phys_segments consists of the active stripes count in the
lower 16 bits and the processed stripes count in the upper 16 bits. So
the logical-OR operator should be the bitwise one.
This bug has been present since 2.6.27 and the fix is suitable for any
-stable kernel since then. Fortunately the bad code is only used on
error paths and is relatively unlikely to be hit.
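The difference between the two operators shows up immediately on a packed 32-bit value. This is a userspace sketch of the packing described above, not the raid5 code; set_processed() and set_processed_buggy() are hypothetical names.

```c
#include <stdint.h>

/*
 * packed layout, as described above:
 *   bits  0..15  active stripes count
 *   bits 16..31  processed stripes count
 */
static uint32_t set_processed(uint32_t packed, uint32_t cnt)
{
    /* Bitwise OR merges the two 16-bit fields. */
    return (packed & 0xffffu) | (cnt << 16);
}

static uint32_t set_processed_buggy(uint32_t packed, uint32_t cnt)
{
    /* Logical OR collapses the whole expression to 0 or 1. */
    return (packed & 0xffffu) || (cnt << 16);
}
```

With three active stripes and a processed count of two, set_processed(3, 2) returns 0x00020003, whereas set_processed_buggy(3, 2) returns 1 and loses both counts.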
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Namhyung Kim [Thu, 9 Jun 2011 01:42:54 +0000 (11:42 +1000)]
md: check ->hot_remove_disk when removing disk
commit 01393f3d5836b7d62e925e6f4658a7eb22b83a11 upstream.
Check pers->hot_remove_disk instead of pers->hot_add_disk in slot_store()
during disk removal. The linear personality only has ->hot_add_disk and
no ->hot_remove_disk, so removing a disk from the array resulted in the
following kernel bug:
$ sudo mdadm --create /dev/md0 --level=linear --raid-devices=4 /dev/loop[0-3]
$ echo none | sudo tee /sys/block/md0/md/dev-loop2/slot
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [< (null)>] (null)
PGD c9f5d067 PUD 8575a067 PMD 0
Oops: 0010 [#1] SMP
CPU 2
Modules linked in: linear loop bridge stp llc kvm_intel kvm asus_atk0110 sr_mod cdrom sg
Pid: 10450, comm: tee Not tainted 3.0.0-rc1-leonard+ #173 System manufacturer System Product Name/P5G41TD-M PRO
RIP: 0010:[<0000000000000000>] [< (null)>] (null)
RSP: 0018:ffff880085757df0 EFLAGS: 00010282
RAX: ffffffffa00168e0 RBX: ffff8800d1431800 RCX: 000000000000006e
RDX: 0000000000000001 RSI: 0000000000000002 RDI: ffff88008543c000
RBP: ffff880085757e48 R08: 0000000000000002 R09: 000000000000000a
R10: 0000000000000000 R11: ffff88008543c2e0 R12: 00000000ffffffff
R13: ffff8800b4641000 R14: 0000000000000005 R15: 0000000000000000
FS: 00007fe8c9e05700(0000) GS:ffff88011fa00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 00000000b4502000 CR4: 00000000000406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process tee (pid: 10450, threadinfo ffff880085756000, task ffff8800c9f08000)
Stack:
ffffffff8138496a ffff8800b4641000 ffff88008543c268 0000000000000000
ffff8800b4641000 ffff88008543c000 ffff8800d1431868 ffffffff81a78a90
ffff8800b4641000 ffff88008543c000 ffff8800d1431800 ffff880085757e98
Call Trace:
[<ffffffff8138496a>] ? slot_store+0xaa/0x265
[<ffffffff81384bae>] rdev_attr_store+0x89/0xa8
[<ffffffff8115a96a>] sysfs_write_file+0x108/0x144
[<ffffffff81106b87>] vfs_write+0xb1/0x10d
[<ffffffff8106e6c0>] ? trace_hardirqs_on_caller+0x111/0x135
[<ffffffff81106cac>] sys_write+0x4d/0x77
[<ffffffff814fe702>] system_call_fastpath+0x16/0x1b
Code: Bad RIP value.
RIP [< (null)>] (null)
RSP <ffff880085757df0>
CR2: 0000000000000000
---[ end trace ba5fc64319a826fb ]---
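The underlying rule is to test the same optional callback that is about to be invoked. The sketch below is a userspace illustration with hypothetical names (struct personality, remove_disk(), linear_like, raid1_like); it is not the md code, and the -EINVAL return value is an assumption.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a personality's optional operations. */
struct personality {
    int (*hot_add_disk)(int slot);
    int (*hot_remove_disk)(int slot);
};

/* Guard on the callback that is actually called below. */
static int remove_disk(const struct personality *pers, int slot)
{
    if (pers->hot_remove_disk == NULL)
        return -EINVAL; /* assumed error code for "not supported" */
    return pers->hot_remove_disk(slot);
}

/* Demo callbacks and two sample personalities. */
static int demo_add(int slot)    { (void)slot; return 0; }
static int demo_remove(int slot) { (void)slot; return 0; }

/* Like linear: can add disks but has no removal hook. */
static const struct personality linear_like = { demo_add, NULL };
/* A personality that implements both hooks. */
static const struct personality raid1_like  = { demo_add, demo_remove };
```

Here remove_disk(&linear_like, 2) returns -EINVAL instead of jumping through a NULL pointer, while remove_disk(&raid1_like, 2) calls the hook and returns 0.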
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Dave Jones [Sun, 12 Jun 2011 20:35:28 +0000 (16:35 -0400)]
CPUFREQ: Remove cpufreq_stats sysfs entries on module unload.
commit 13f067537f34456443f61c950cd6dc37d1d5f3ee upstream.
cpufreq_stats leaves behind its sysfs entries, which causes a panic
when something stumbles across them.
(Discovered by unloading cpufreq_stats while powertop was loaded).
Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Robert Richter [Tue, 31 May 2011 10:35:41 +0000 (12:35 +0200)]
oprofile, dcookies: Fix possible circular locking dependency
commit fe47ae7f53e179d2ef6771024feb000cbb86640f upstream.
The lockdep warning below detects a possible A->B/B->A locking
dependency of mm->mmap_sem and dcookie_mutex. The order in
sync_buffer() is mm->mmap_sem/dcookie_mutex, while in
sys_lookup_dcookie() it is vice versa.
Fixing it in sys_lookup_dcookie() by unlocking dcookie_mutex before
copy_to_user().
oprofiled/4432 is trying to acquire lock:
(&mm->mmap_sem){++++++}, at: [<ffffffff810b444b>] might_fault+0x53/0xa3
but task is already holding lock:
(dcookie_mutex){+.+.+.}, at: [<ffffffff81124d28>] sys_lookup_dcookie+0x45/0x149
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (dcookie_mutex){+.+.+.}:
[<ffffffff8106557f>] lock_acquire+0xf8/0x11e
[<ffffffff814634f0>] mutex_lock_nested+0x63/0x309
[<ffffffff81124e5c>] get_dcookie+0x30/0x144
[<ffffffffa0000fba>] sync_buffer+0x196/0x3ec [oprofile]
[<ffffffffa0001226>] task_exit_notify+0x16/0x1a [oprofile]
[<ffffffff81467b96>] notifier_call_chain+0x37/0x63
[<ffffffff8105803d>] __blocking_notifier_call_chain+0x50/0x67
[<ffffffff81058068>] blocking_notifier_call_chain+0x14/0x16
[<ffffffff8105a718>] profile_task_exit+0x1a/0x1c
[<ffffffff81039e8f>] do_exit+0x2a/0x6fc
[<ffffffff8103a5e4>] do_group_exit+0x83/0xae
[<ffffffff8103a626>] sys_exit_group+0x17/0x1b
[<ffffffff8146ad4b>] system_call_fastpath+0x16/0x1b
-> #0 (&mm->mmap_sem){++++++}:
[<ffffffff81064dfb>] __lock_acquire+0x1085/0x1711
[<ffffffff8106557f>] lock_acquire+0xf8/0x11e
[<ffffffff810b4478>] might_fault+0x80/0xa3
[<ffffffff81124de7>] sys_lookup_dcookie+0x104/0x149
[<ffffffff8146ad4b>] system_call_fastpath+0x16/0x1b
other info that might help us debug this:
1 lock held by oprofiled/4432:
#0: (dcookie_mutex){+.+.+.}, at: [<ffffffff81124d28>] sys_lookup_dcookie+0x45/0x149
stack backtrace:
Pid: 4432, comm: oprofiled Not tainted 2.6.39-00008-ge5a450d #9
Call Trace:
[<ffffffff81063193>] print_circular_bug+0xae/0xbc
[<ffffffff81064dfb>] __lock_acquire+0x1085/0x1711
[<ffffffff8102ef13>] ? get_parent_ip+0x11/0x42
[<ffffffff810b444b>] ? might_fault+0x53/0xa3
[<ffffffff8106557f>] lock_acquire+0xf8/0x11e
[<ffffffff810b444b>] ? might_fault+0x53/0xa3
[<ffffffff810d7d54>] ? path_put+0x22/0x27
[<ffffffff810b4478>] might_fault+0x80/0xa3
[<ffffffff810b444b>] ? might_fault+0x53/0xa3
[<ffffffff81124de7>] sys_lookup_dcookie+0x104/0x149
[<ffffffff8146ad4b>] system_call_fastpath+0x16/0x1b
References: https://bugzilla.kernel.org/show_bug.cgi?id=13809
Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Daniel T Chen [Mon, 6 Jun 2011 22:55:34 +0000 (18:55 -0400)]
ALSA: hda: Fix quirk for Dell Inspiron 910
commit 0a1896b27b030529ec770aefd790544a1bdb7d5a upstream.
BugLink: https://launchpad.net/bugs/792712
The original reporter states that sound from the internal speakers is
inaudible until using the model=auto quirk. This symptom is due to an
existing quirk mask for 0x102802b* that uses the model=dell quirk. To
limit the possible regressions, leave the existing quirk mask but add
a higher priority specific mask for the reporter's PCI SSID.
Reported-and-tested-by: rodni hipp
Signed-off-by: Daniel T Chen <crimsun@ubuntu.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Dmitry Torokhov [Tue, 31 May 2011 21:37:23 +0000 (14:37 -0700)]
USB: xhci - fix interval calculation for FS isoc endpoints
commit cd3c18ba2fac14b34d03cae111f215009735ea06 upstream.
Full-speed isoc endpoints specify interval in exponent based form in
frames, not microframes, so we need to adjust accordingly.
NEC xHCI host controllers will return an error code of 0x11 if a full
speed isochronous endpoint is added with the Interval field set to
something less than 3 (2^3 = 8 microframes, or one frame). It is
impossible for a full speed device to have an interval smaller than one
frame.
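The unit conversion can be written out explicitly. This is a userspace sketch under the assumptions stated in the text (bInterval in the range 1 to 16 encodes a period of 2^(bInterval-1) frames, one frame is 8 = 2^3 microframes, and the Interval field is an exponent in microframes); fs_isoc_xhci_interval() and interval_to_microframes() are hypothetical helpers, not driver functions.

```c
/*
 * Convert a full-speed isochronous bInterval (exponent in frames)
 * into an Interval exponent expressed in microframes.
 */
static unsigned int fs_isoc_xhci_interval(unsigned int bInterval)
{
    /* Clamp to the valid descriptor range before converting. */
    if (bInterval < 1)
        bInterval = 1;
    if (bInterval > 16)
        bInterval = 16;

    /* 2^(bInterval-1) frames == 2^(bInterval-1+3) microframes. */
    return (bInterval - 1) + 3;
}

/* Period in microframes for a given Interval exponent. */
static unsigned long interval_to_microframes(unsigned int interval)
{
    return 1UL << interval;
}
```

The smallest legal bInterval of 1 maps to an Interval of 3, that is 8 microframes or exactly one frame, so the result can never fall below the value the NEC controllers reject.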
This was always an issue in the xHCI driver, but commit
dfa49c4ad120a784ef1ff0717168aa79f55a483a "USB: xhci - fix math in
xhci_get_endpoint_interval()" removed the clamping of the minimum value
in the Interval field, which revealed this bug.
This needs to be backported to stable kernels back to 2.6.31.
Reported-by: Matt Evans <matt@ozlabs.org>
Signed-off-by: Dmitry Torokhov <dtor@vmware.com>
Signed-off-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Steffen Sledz [Tue, 7 Jun 2011 12:01:56 +0000 (14:01 +0200)]
USB: serial: add another 4N-GALAXY.DE PID to ftdi_sio driver
commit a26d31cef06f43a76327c21235e75450869df2b8 upstream.
E.g. newer CAN 2.0 A/B <=> USB 2.0 converters report idProduct=f3c2.
Signed-off-by: Steffen Sledz <sledz@dresearch-fe.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Libor Pechacek [Fri, 20 May 2011 12:53:25 +0000 (14:53 +0200)]
USB: core: Tolerate protocol stall during hub and port status read
commit 3824c1ddaf744be44b170a335332b9d6afe79254 upstream.
A protocol stall should not be fatal while reading port or hub status, as it
is a transient state. Currently a hub EP0 STALL during port status read
results in failed device enumeration. This has been observed with an
ST-Ericsson (formerly Philips) USB 2.0 Hub (04cc:1521) after connecting a
keyboard.
Signed-off-by: Libor Pechacek <lpechacek@suse.cz>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Andi Kleen <ak@linux.intel.com>