Lukasz Odzioba [Mon, 12 Oct 2015 11:53:32 +0000 (13:53 +0200)]
hwmon: (coretemp) Increase limit of maximum core ID from 32 to 128.
commit
cc904f9cf26dc455cffbdf6e7eb86717e047d8ff upstream.
A new limit selected arbitrarily as power of two greater than
required minimum for Xeon Phi processor (72 for Knights Landing).
Currently driver is not able to handle cores with core ID greater than 32.
Such attempt ends up with the following error in dmesg:
coretemp coretemp.0: Adding Core XXX failed
Signed-off-by: Lukasz Odzioba <lukasz.odzioba@intel.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Martin Schwidefsky [Mon, 15 Feb 2016 13:46:49 +0000 (14:46 +0100)]
s390/mm: four page table levels vs. fork
commit
3446c13b268af86391d06611327006b059b8bab1 upstream.
The fork of a process with four page table levels is broken since
git commit
6252d702c5311ce9 "[S390] dynamic page tables."
All new mm contexts are created with three page table levels and
an asce limit of 4TB. If the parent has four levels dup_mmap will
add vmas to the new context which are outside of the asce limit.
The subsequent call to copy_page_range will walk the three level
page table structure of the new process with non-zero pgd and pud
indexes. This leads to memory clobbers as the pgd_index *and* the
pud_index is added to the mm->pgd pointer without a pgd_deref
in between.
The init_new_context() function is selecting the number of page
table levels for a new context. The function is used by mm_init()
which in turn is called by dup_mm() and mm_alloc(). These two are
used by fork() and exec(). The init_new_context() function can
distinguish the two cases by looking at mm->context.asce_limit,
for fork() the mm struct has been copied and the number of page
table levels may not change. For exec() the mm_alloc() function
set the new mm structure to zero, in this case a three-level page
table is created as the temporary stack space is located at
STACK_TOP_MAX = 4TB.
This fixes CVE-2016-2143.
Reported-by: Marcin Kościelnicki <koriakin@0x04.net>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Johan Hovold [Tue, 12 Jan 2016 11:05:20 +0000 (12:05 +0100)]
USB: visor: fix null-deref at probe
commit
cac9b50b0d75a1d50d6c056ff65c005f3224c8e0 upstream.
Fix null-pointer dereference at probe should a (malicious) Treo device
lack the expected endpoints.
Specifically, the Treo port-setup hack was dereferencing the bulk-in and
interrupt-in urbs without first making sure they had been allocated by
core.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Wei Huang [Wed, 13 Aug 2014 16:06:14 +0000 (12:06 -0400)]
KVM: SVM: add rdmsr support for AMD event registers
commit
dc9b2d933a1d5782b70977024f862759c8ebb2f7 upstream.
Current KVM only supports RDMSR for K7_EVNTSEL0 and K7_PERFCTR0
MSRs. Reading the rest MSRs will trigger KVM to inject #GP into
guest VM. This causes a warning message "Failed to access perfctr
msr (MSR
c0010001 is
ffffffffffffffff)" on AMD host. This patch
adds RDMSR support for all K7_EVNTSELn and K7_PERFCTRn registers
and thus supresses the warning message.
Signed-off-by: Wei Huang <wehuang@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Dirk Brandewie [Mon, 24 Mar 2014 14:41:29 +0000 (07:41 -0700)]
intel_pstate: Use del_timer_sync in intel_pstate_cpu_stop
commit
c2294a2f7853e6450361d078b65407bdaa6d1d11 upstream.
Ensure that no timer callback is running since we are about to free
the timer structure. We cannot guarantee that the call back is called
on the CPU where the timer is running.
Reported-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Dirk Brandewie <dirk.j.brandewie@intel.com>
Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Alan Stern [Wed, 16 Dec 2015 18:32:38 +0000 (13:32 -0500)]
USB: fix invalid memory access in hub_activate()
commit
e50293ef9775c5f1cf3fcc093037dd6a8c5684ea upstream.
Commit
8520f38099cc ("USB: change hub initialization sleeps to
delayed_work") changed the hub_activate() routine to make part of it
run in a workqueue. However, the commit failed to take a reference to
the usb_hub structure or to lock the hub interface while doing so. As
a result, if a hub is plugged in and quickly unplugged before the work
routine can run, the routine will try to access memory that has been
deallocated. Or, if the hub is unplugged while the routine is
running, the memory may be deallocated while it is in active use.
This patch fixes the problem by taking a reference to the usb_hub at
the start of hub_activate() and releasing it at the end (when the work
is finished), and by locking the hub interface while the work routine
is running. It also adds a check at the start of the routine to see
if the hub has already been disconnected, in which nothing should be
done.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Reported-by: Alexandru Cornea <alexandru.cornea@intel.com>
Tested-by: Alexandru Cornea <alexandru.cornea@intel.com>
Fixes: 8520f38099cc ("USB: change hub initialization sleeps to delayed_work")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Michal Hocko [Wed, 4 Jun 2014 23:07:36 +0000 (16:07 -0700)]
memcg: do not hang on OOM when killed by userspace OOM access to memory reserves
commit
d8dc595ce3909fbc131bdf5ab8c9808fe624b18d upstream.
Eric has reported that he can see task(s) stuck in memcg OOM handler
regularly. The only way out is to
echo 0 > $GROUP/memory.oom_control
His usecase is:
- Setup a hierarchy with memory and the freezer (disable kernel oom and
have a process watch for oom).
- In that memory cgroup add a process with one thread per cpu.
- In one thread slowly allocate once per second I think it is 16M of ram
and mlock and dirty it (just to force the pages into ram and stay
there).
- When oom is achieved loop:
* attempt to freeze all of the tasks.
* if frozen send every task SIGKILL, unfreeze, remove the directory in
cgroupfs.
Eric has then pinpointed the issue to be memcg specific.
All tasks are sitting on the memcg_oom_waitq when memcg oom is disabled.
Those that have received fatal signal will bypass the charge and should
continue on their way out. The tricky part is that the exit path might
trigger a page fault (e.g. exit_robust_list), thus the memcg charge,
while its memcg is still under OOM because nobody has released any charges
yet.
Unlike with the in-kernel OOM handler the exiting task doesn't get
TIF_MEMDIE set so it doesn't shortcut further charges of the killed task
and falls to the memcg OOM again without any way out of it as there are no
fatal signals pending anymore.
This patch fixes the issue by checking PF_EXITING early in
mem_cgroup_try_charge and bypass the charge same as if it had fatal
signal pending or TIF_MEMDIE set.
Normally exiting tasks (aka not killed) will bypass the charge now but
this should be OK as the task is leaving and will release memory and
increasing the memory pressure just to release it in a moment seems
dubious wasting of cycles. Besides that charges after exit_signals should
be rare.
I am bringing this patch again (rebased on the current mmotm tree). I
hope we can move forward finally. If there is still an opposition then
I would really appreciate a concurrent approach so that we can discuss
alternatives.
http://comments.gmane.org/gmane.linux.kernel.stable/77650 is a reference
to the followup discussion when the patch has been dropped from the mmotm
last time.
Reported-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Takashi Iwai [Mon, 15 Feb 2016 15:20:24 +0000 (16:20 +0100)]
ALSA: seq: Fix leak of pool buffer at concurrent writes
commit
d99a36f4728fcbcc501b78447f625bdcce15b842 upstream.
When multiple concurrent writes happen on the ALSA sequencer device
right after the open, it may try to allocate vmalloc buffer for each
write and leak some of them. It's because the presence check and the
assignment of the buffer is done outside the spinlock for the pool.
The fix is to move the check and the assignment into the spinlock.
(The current implementation is suboptimal, as there can be multiple
unnecessary vmallocs because the allocation is done before the check
in the spinlock. But the pool size is already checked beforehand, so
this isn't a big problem; that is, the only possible path is the
multiple writes before any pool assignment, and practically seen, the
current coverage should be "good enough".)
The issue was triggered by syzkaller fuzzer.
Buglink: http://lkml.kernel.org/r/CACT4Y+bSzazpXNvtAr=WXaL8hptqjHwqEyFA+VN2AWEx=aurkg@mail.gmail.com
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Tested-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Takashi Iwai [Sun, 31 Jan 2016 10:57:41 +0000 (11:57 +0100)]
ALSA: rawmidi: Make snd_rawmidi_transmit() race-free
commit
06ab30034ed9c200a570ab13c017bde248ddb2a6 upstream.
A kernel WARNING in snd_rawmidi_transmit_ack() is triggered by
syzkaller fuzzer:
WARNING: CPU: 1 PID: 20739 at sound/core/rawmidi.c:1136
Call Trace:
[< inline >] __dump_stack lib/dump_stack.c:15
[<
ffffffff82999e2d>] dump_stack+0x6f/0xa2 lib/dump_stack.c:50
[<
ffffffff81352089>] warn_slowpath_common+0xd9/0x140 kernel/panic.c:482
[<
ffffffff813522b9>] warn_slowpath_null+0x29/0x30 kernel/panic.c:515
[<
ffffffff84f80bd5>] snd_rawmidi_transmit_ack+0x275/0x400 sound/core/rawmidi.c:1136
[<
ffffffff84fdb3c1>] snd_virmidi_output_trigger+0x4b1/0x5a0 sound/core/seq/seq_virmidi.c:163
[< inline >] snd_rawmidi_output_trigger sound/core/rawmidi.c:150
[<
ffffffff84f87ed9>] snd_rawmidi_kernel_write1+0x549/0x780 sound/core/rawmidi.c:1223
[<
ffffffff84f89fd3>] snd_rawmidi_write+0x543/0xb30 sound/core/rawmidi.c:1273
[<
ffffffff817b0323>] __vfs_write+0x113/0x480 fs/read_write.c:528
[<
ffffffff817b1db7>] vfs_write+0x167/0x4a0 fs/read_write.c:577
[< inline >] SYSC_write fs/read_write.c:624
[<
ffffffff817b50a1>] SyS_write+0x111/0x220 fs/read_write.c:616
[<
ffffffff86336c36>] entry_SYSCALL_64_fastpath+0x16/0x7a arch/x86/entry/entry_64.S:185
Also a similar warning is found but in another path:
Call Trace:
[< inline >] __dump_stack lib/dump_stack.c:15
[<
ffffffff82be2c0d>] dump_stack+0x6f/0xa2 lib/dump_stack.c:50
[<
ffffffff81355139>] warn_slowpath_common+0xd9/0x140 kernel/panic.c:482
[<
ffffffff81355369>] warn_slowpath_null+0x29/0x30 kernel/panic.c:515
[<
ffffffff8527e69a>] rawmidi_transmit_ack+0x24a/0x3b0 sound/core/rawmidi.c:1133
[<
ffffffff8527e851>] snd_rawmidi_transmit_ack+0x51/0x80 sound/core/rawmidi.c:1163
[<
ffffffff852d9046>] snd_virmidi_output_trigger+0x2b6/0x570 sound/core/seq/seq_virmidi.c:185
[< inline >] snd_rawmidi_output_trigger sound/core/rawmidi.c:150
[<
ffffffff85285a0b>] snd_rawmidi_kernel_write1+0x4bb/0x760 sound/core/rawmidi.c:1252
[<
ffffffff85287b73>] snd_rawmidi_write+0x543/0xb30 sound/core/rawmidi.c:1302
[<
ffffffff817ba5f3>] __vfs_write+0x113/0x480 fs/read_write.c:528
[<
ffffffff817bc087>] vfs_write+0x167/0x4a0 fs/read_write.c:577
[< inline >] SYSC_write fs/read_write.c:624
[<
ffffffff817bf371>] SyS_write+0x111/0x220 fs/read_write.c:616
[<
ffffffff86660276>] entry_SYSCALL_64_fastpath+0x16/0x7a arch/x86/entry/entry_64.S:185
In the former case, the reason is that virmidi has an open code
calling snd_rawmidi_transmit_ack() with the value calculated outside
the spinlock. We may use snd_rawmidi_transmit() in a loop just for
consuming the input data, but even there, there is a race between
snd_rawmidi_transmit_peek() and snd_rawmidi_tranmit_ack().
Similarly in the latter case, it calls snd_rawmidi_transmit_peek() and
snd_rawmidi_tranmit_ack() separately without protection, so they are
racy as well.
The patch tries to address these issues by the following ways:
- Introduce the unlocked versions of snd_rawmidi_transmit_peek() and
snd_rawmidi_transmit_ack() to be called inside the explicit lock.
- Rewrite snd_rawmidi_transmit() to be race-free (the former case).
- Make the split calls (the latter case) protected in the rawmidi spin
lock.
Buglink: http://lkml.kernel.org/r/CACT4Y+YPq1+cYLkadwjWa5XjzF1_Vki1eHnVn-Lm0hzhSpu5PA@mail.gmail.com
Buglink: http://lkml.kernel.org/r/CACT4Y+acG4iyphdOZx47Nyq_VHGbpJQK-6xNpiqUjaZYqsXOGw@mail.gmail.com
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Tested-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
John Allen [Thu, 14 Jan 2016 23:22:16 +0000 (15:22 -0800)]
drivers/base/memory.c: fix kernel warning during memory hotplug on ppc64
commit
cb5490a5eea415106d7438df440da5fb1e17318d upstream.
Fix a bug where a kernel warning is triggered when performing a memory
hotplug on ppc64. This warning may also occur on any architecture that
uses the memory_probe_store interface.
WARNING: at drivers/base/memory.c:200
CPU: 9 PID: 13042 Comm: systemd-udevd Not tainted
4.4.0-rc4-00113-g0bd0f1e-dirty #7
NIP [
c00000000055e034] pages_correctly_reserved+0x134/0x1b0
LR [
c00000000055e7f8] memory_subsys_online+0x68/0x140
Call Trace:
memory_subsys_online+0x68/0x140
device_online+0xb4/0x120
store_mem_state+0xb0/0x180
dev_attr_store+0x34/0x60
sysfs_kf_write+0x64/0xa0
kernfs_fop_write+0x17c/0x1e0
__vfs_write+0x40/0x160
vfs_write+0xb8/0x200
SyS_write+0x60/0x110
system_call+0x38/0xd0
The warning is triggered because there is a udev rule that automatically
tries to online memory after it has been added. The udev rule varies
from distro to distro, but will generally look something like:
SUBSYSTEM=="memory", ACTION=="add", ATTR{state}=="offline", ATTR{state}="online"
On any architecture that uses memory_probe_store to reserve memory, the
udev rule will be triggered after the first section of the block is
reserved and will subsequently attempt to online the entire block,
interrupting the memory reservation process and causing the warning.
This patch modifies memory_probe_store to add a block of memory with a
single call to add_memory as opposed to looping through and adding each
section individually. A single call to add_memory is protected by the
mem_hotplug mutex which will prevent the udev rule from onlining memory
until the reservation of the entire block is complete.
Signed-off-by: John Allen <jallen@linux.vnet.ibm.com>
Acked-by: Dave Hansen <dave.hansen@intel.com>
Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Yuval Mintz [Thu, 27 Aug 2015 05:03:08 +0000 (08:03 +0300)]
bnx2x: Add new device ids under the Qlogic vendor
commit
9c9a6524b5fdf6cb57c9ff627b7f242a6a4e0b00 upstream.
This adds support for 3 new PCI device combinations -
1077:16a1, 1077:16a4 and 1077:16ad.
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Wang Shilong [Thu, 13 Feb 2014 03:19:47 +0000 (11:19 +0800)]
Btrfs: skip locking when searching commit root
commit
e84752d434b5cca0869e906e7b94d0531b25c6d3 upstream.
We won't change commit root, skip locking dance with commit root
when walking backrefs, this can speed up btrfs send operations.
Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Kirill Tkhai [Mon, 27 Oct 2014 10:18:25 +0000 (14:18 +0400)]
sched: Fix race between task_group and sched_task_group
commit
eeb61e53ea19be0c4015b00b2e8b3b2185436f2b upstream.
The race may happen when somebody is changing task_group of a forking task.
Child's cgroup is the same as parent's after dup_task_struct() (there just
memory copying). Also, cfs_rq and rt_rq are the same as parent's.
But if parent changes its task_group before it's called cgroup_post_fork(),
we do not reflect this situation on child. Child's cfs_rq and rt_rq remain
the same, while child's task_group changes in cgroup_post_fork().
To fix this we introduce fork() method, which calls sched_move_task() directly.
This function changes sched_task_group on appropriate (also its logic has
no problem with freshly created tasks, so we shouldn't introduce something
special; we are able just to use it).
Possibly, this decides the Burke Libbey's problem: https://lkml.org/lkml/2014/10/24/456
Signed-off-by: Kirill Tkhai <ktkhai@parallels.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1414405105.19914.169.camel@tkhai
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Eric Sandeen [Thu, 24 Jul 2014 10:51:54 +0000 (20:51 +1000)]
xfs: allow inode allocations in post-growfs disk space
commit
9de67c3ba9ea961ba420573d56479d09d33a7587 upstream.
Today, if we perform an xfs_growfs which adds allocation groups,
mp->m_maxagi is not properly updated when the growfs is complete.
Therefore inodes will continue to be allocated only in the
AGs which existed prior to the growfs, and the new space
won't be utilized.
This is because of this path in xfs_growfs_data_private():
xfs_growfs_data_private
xfs_initialize_perag(mp, nagcount, &nagimax);
if (mp->m_flags & XFS_MOUNT_32BITINODES)
index = xfs_set_inode32(mp);
else
index = xfs_set_inode64(mp);
if (maxagi)
*maxagi = index;
where xfs_set_inode* iterates over the (old) agcount in
mp->m_sb.sb_agblocks, which has not yet been updated
in the growfs path. So "index" will be returned based on
the old agcount, not the new one, and new AGs are not available
for inode allocation.
Fix this by explicitly passing the proper AG count (which
xfs_initialize_perag() already has) down another level,
so that xfs_set_inode* can make the proper decision about
acceptable AGs for inode allocation in the potentially
newly-added AGs.
This has been broken since 3.7, when these two
xfs_set_inode* functions were added in commit
2d2194f.
Prior to that, we looped over "agcount" not sb_agblocks
in these calculations.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Konrad Rzeszutek Wilk [Thu, 11 Feb 2016 21:10:24 +0000 (16:10 -0500)]
xen/pciback: Save the number of MSI-X entries to be copied later.
commit
d159457b84395927b5a52adb72f748dd089ad5e5 upstream.
Commit
8135cf8b092723dbfcc611fe6fdcb3a36c9951c5 (xen/pciback: Save
xen_pci_op commands before processing it) broke enabling MSI-X because
it would never copy the resulting vectors into the response. The
number of vectors requested was being overwritten by the return value
(typically zero for success).
Save the number of vectors before processing the op, so the correct
number of vectors are copied afterwards.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Cc: "Jan Beulich" <JBeulich@suse.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Konrad Rzeszutek Wilk [Mon, 16 Nov 2015 17:40:48 +0000 (12:40 -0500)]
xen/pciback: Save xen_pci_op commands before processing it
commit
8135cf8b092723dbfcc611fe6fdcb3a36c9951c5 upstream.
Double fetch vulnerabilities that happen when a variable is
fetched twice from shared memory but a security check is only
performed the first time.
The xen_pcibk_do_op function performs a switch statements on the op->cmd
value which is stored in shared memory. Interestingly this can result
in a double fetch vulnerability depending on the performed compiler
optimization.
This patch fixes it by saving the xen_pci_op command before
processing it. We also use 'barrier' to make sure that the
compiler does not perform any optimization.
This is part of XSA155.
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Jan Beulich <JBeulich@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: "Jan Beulich" <JBeulich@suse.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Roger Pau Monné [Tue, 3 Nov 2015 16:40:43 +0000 (16:40 +0000)]
xen-blkback: read from indirect descriptors only once
commit
18779149101c0dd43ded43669ae2a92d21b6f9cb upstream.
Since indirect descriptors are in memory shared with the frontend, the
frontend could alter the first_sect and last_sect values after they have
been validated but before they are recorded in the request. This may
result in I/O requests that overflow the foreign page, possibly
overwriting local pages when the I/O request is executed.
When parsing indirect descriptors, only read first_sect and last_sect
once.
This is part of XSA155.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Jan Beulich <JBeulich@suse.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Roger Pau Monné [Tue, 3 Nov 2015 16:34:09 +0000 (16:34 +0000)]
xen-blkback: only read request operation from shared ring once
commit
1f13d75ccb806260079e0679d55d9253e370ec8a upstream.
A compiler may load a switch statement value multiple times, which could
be bad when the value is in memory shared with the frontend.
When converting a non-native request to a native one, ensure that
src->operation is only loaded once by using READ_ONCE().
This is part of XSA155.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: "Jan Beulich" <JBeulich@suse.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
David Vrabel [Fri, 30 Oct 2015 15:17:06 +0000 (15:17 +0000)]
xen-netback: use RING_COPY_REQUEST() throughout
commit
68a33bfd8403e4e22847165d149823a2e0e67c9c upstream.
Instead of open-coding memcpy()s and directly accessing Tx and Rx
requests, use the new RING_COPY_REQUEST() that ensures the local copy
is correct.
This is more than is strictly necessary for guest Rx requests since
only the id and gref fields are used and it is harmless if the
frontend modifies these.
This is part of XSA155.
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
David Vrabel [Fri, 30 Oct 2015 15:16:01 +0000 (15:16 +0000)]
xen-netback: don't use last request to determine minimum Tx credit
commit
0f589967a73f1f30ab4ac4dd9ce0bb399b4d6357 upstream.
The last from guest transmitted request gives no indication about the
minimum amount of credit that the guest might need to send a packet
since the last packet might have been a small one.
Instead allow for the worst case 128 KiB packet.
This is part of XSA155.
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
David Vrabel [Fri, 30 Oct 2015 14:58:08 +0000 (14:58 +0000)]
xen: Add RING_COPY_REQUEST()
commit
454d5d882c7e412b840e3c99010fe81a9862f6fb upstream.
Using RING_GET_REQUEST() on a shared ring is easy to use incorrectly
(i.e., by not considering that the other end may alter the data in the
shared ring while it is being inspected). Safe usage of a request
generally requires taking a local copy.
Provide a RING_COPY_REQUEST() macro to use instead of
RING_GET_REQUEST() and an open-coded memcpy(). This takes care of
ensuring that the copy is done correctly regardless of any possible
compiler optimizations.
Use a volatile source to prevent the compiler from reordering or
omitting the copy.
This is part of XSA155.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Peter Zijlstra [Thu, 26 Mar 2015 16:45:37 +0000 (17:45 +0100)]
locking: Remove atomicy checks from {READ,WRITE}_ONCE
commit
7bd3e239d6c6d1cad276e8f130b386df4234dcd7 upstream.
The fact that volatile allows for atomic load/stores is a special case
not a requirement for {READ,WRITE}_ONCE(). Their primary purpose is to
force the compiler to emit load/stores _once_.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Linus Torvalds [Fri, 20 Feb 2015 23:46:31 +0000 (15:46 -0800)]
kernel: make READ_ONCE() valid on const arguments
commit
dd36929720f40f17685e841ae0d4c581c165ea60 upstream.
The use of READ_ONCE() causes lots of warnings witht he pending paravirt
spinlock fixes, because those ends up having passing a member to a
'const' structure to READ_ONCE().
There should certainly be nothing wrong with using READ_ONCE() with a
const source, but the helper function __read_once_size() would cause
warnings because it would drop the 'const' qualifier, but also because
the destination would be marked 'const' too due to the use of 'typeof'.
Use a union of types in READ_ONCE() to avoid this issue.
Also make sure to use parenthesis around the macro arguments to avoid
possible operator precedence issues.
Tested-by: Ingo Molnar <mingo@kernel.org>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Christian Borntraeger [Tue, 13 Jan 2015 09:46:42 +0000 (10:46 +0100)]
kernel: Change ASSIGN_ONCE(val, x) to WRITE_ONCE(x, val)
commit
43239cbe79fc369f5d2160bd7f69e28b5c50a58c upstream.
Feedback has shown that WRITE_ONCE(x, val) is easier to use than
ASSIGN_ONCE(val,x).
There are no in-tree users yet, so lets change it for 3.19.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Davidlohr Bueso <dave@stgolabs.net>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Christian Borntraeger [Tue, 25 Nov 2014 09:01:16 +0000 (10:01 +0100)]
kernel: Provide READ_ONCE and ASSIGN_ONCE
commit
230fa253df6352af12ad0a16128760b5cb3f92df upstream.
ACCESS_ONCE does not work reliably on non-scalar types. For
example gcc 4.6 and 4.7 might remove the volatile tag for such
accesses during the SRA (scalar replacement of aggregates) step
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145)
Let's provide READ_ONCE/ASSIGN_ONCE that will do all accesses via
scalar types as suggested by Linus Torvalds. Accesses larger than
the machines word size cannot be guaranteed to be atomic. These
macros will use memcpy and emit a build warning.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Eric W. Biederman [Tue, 7 Oct 2014 19:19:53 +0000 (12:19 -0700)]
umount: Do not allow unmounting rootfs.
commit
da362b09e42ee0bcaf0356afee6078b4f324baff upstream.
Andrew Vagin <avagin@parallels.com> writes:
> #define _GNU_SOURCE
> #include <sys/types.h>
> #include <sys/stat.h>
> #include <fcntl.h>
> #include <sched.h>
> #include <unistd.h>
> #include <sys/mount.h>
>
> int main(int argc, char **argv)
> {
> int fd;
>
> fd = open("/proc/self/ns/mnt", O_RDONLY);
> if (fd < 0)
> return 1;
> while (1) {
> if (umount2("/", MNT_DETACH) ||
> setns(fd, CLONE_NEWNS))
> break;
> }
>
> return 0;
> }
>
> root@ubuntu:/home/avagin# gcc -Wall nsenter.c -o nsenter
> root@ubuntu:/home/avagin# strace ./nsenter
> execve("./nsenter", ["./nsenter"], [/* 22 vars */]) = 0
> ...
> open("/proc/self/ns/mnt", O_RDONLY) = 3
> umount("/", MNT_DETACH) = 0
> setns(3, 131072) = 0
> umount("/", MNT_DETACH
>
causes:
> [ 260.548301] ------------[ cut here ]------------
> [ 260.550941] kernel BUG at /build/buildd/linux-3.13.0/fs/pnode.c:372!
> [ 260.552068] invalid opcode: 0000 [#1] SMP
> [ 260.552068] Modules linked in: xt_CHECKSUM iptable_mangle xt_tcpudp xt_addrtype xt_conntrack ipt_MASQUERADE iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack bridge stp llc dm_thin_pool dm_persistent_data dm_bufio dm_bio_prison iptable_filter ip_tables x_tables crct10dif_pclmul crc32_pclmul ghash_clmulni_intel binfmt_misc nfsd auth_rpcgss nfs_acl aesni_intel nfs lockd aes_x86_64 sunrpc fscache lrw gf128mul glue_helper ablk_helper cryptd serio_raw ppdev parport_pc lp parport btrfs xor raid6_pq libcrc32c psmouse floppy
> [ 260.552068] CPU: 0 PID: 1723 Comm: nsenter Not tainted 3.13.0-30-generic #55-Ubuntu
> [ 260.552068] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
> [ 260.552068] task:
ffff8800376097f0 ti:
ffff880074824000 task.ti:
ffff880074824000
> [ 260.552068] RIP: 0010:[<
ffffffff811e9483>] [<
ffffffff811e9483>] propagate_umount+0x123/0x130
> [ 260.552068] RSP: 0018:
ffff880074825e98 EFLAGS:
00010246
> [ 260.552068] RAX:
ffff88007c741140 RBX:
0000000000000002 RCX:
ffff88007c741190
> [ 260.552068] RDX:
ffff88007c741190 RSI:
ffff880074825ec0 RDI:
ffff880074825ec0
> [ 260.552068] RBP:
ffff880074825eb0 R08:
00000000000172e0 R09:
ffff88007fc172e0
> [ 260.552068] R10:
ffffffff811cc642 R11:
ffffea0001d59000 R12:
ffff88007c741140
> [ 260.552068] R13:
ffff88007c741140 R14:
ffff88007c741140 R15:
0000000000000000
> [ 260.552068] FS:
00007fd5c7e41740(0000) GS:
ffff88007fc00000(0000) knlGS:
0000000000000000
> [ 260.552068] CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
> [ 260.552068] CR2:
00007fd5c7968050 CR3:
0000000070124000 CR4:
00000000000406f0
> [ 260.552068] Stack:
> [ 260.552068]
0000000000000002 0000000000000002 ffff88007c631000 ffff880074825ed8
> [ 260.552068]
ffffffff811dcfac ffff88007c741140 0000000000000002 ffff88007c741160
> [ 260.552068]
ffff880074825f38 ffffffff811dd12b ffffffff811cc642 0000000075640000
> [ 260.552068] Call Trace:
> [ 260.552068] [<
ffffffff811dcfac>] umount_tree+0x20c/0x260
> [ 260.552068] [<
ffffffff811dd12b>] do_umount+0x12b/0x300
> [ 260.552068] [<
ffffffff811cc642>] ? final_putname+0x22/0x50
> [ 260.552068] [<
ffffffff811cc849>] ? putname+0x29/0x40
> [ 260.552068] [<
ffffffff811dd88c>] SyS_umount+0xdc/0x100
> [ 260.552068] [<
ffffffff8172aeff>] tracesys+0xe1/0xe6
> [ 260.552068] Code: 89 50 08 48 8b 50 08 48 89 02 49 89 45 08 e9 72 ff ff ff 0f 1f 44 00 00 4c 89 e6 4c 89 e7 e8 f5 f6 ff ff 48 89 c3 e9 39 ff ff ff <0f> 0b 66 2e 0f 1f 84 00 00 00 00 00 90 66 66 66 66 90 55 b8 01
> [ 260.552068] RIP [<
ffffffff811e9483>] propagate_umount+0x123/0x130
> [ 260.552068] RSP <
ffff880074825e98>
> [ 260.611451] ---[ end trace
11c33d85f1d4c652 ]--
Which in practice is totally uninteresting. Only the global root user can
do it, and it is just a stupid thing to do.
However that is no excuse to allow a silly way to oops the kernel.
We can avoid this silly problem by setting MNT_LOCKED on the rootfs
mount point and thus avoid needing any special cases in the unmount
code.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
David S. Miller [Mon, 14 Mar 2016 03:28:00 +0000 (23:28 -0400)]
ipv4: Don't do expensive useless work during inetdev destroy.
commit
fbd40ea0180a2d328c5adc61414dc8bab9335ce2 upstream.
When an inetdev is destroyed, every address assigned to the interface
is removed. And in this scenerio we do two pointless things which can
be very expensive if the number of assigned interfaces is large:
1) Address promotion. We are deleting all addresses, so there is no
point in doing this.
2) A full nf conntrack table purge for every address. We only need to
do this once, as is already caught by the existing
masq_dev_notifier so masq_inet_event() can skip this.
[mk] 3.12.*: The change in masq_inet_event() needs to be duplicated in
both IPv4 and IPv6 version of the function, these two were merged in
3.18.
Reported-by: Solar Designer <solar@openwall.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tested-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Gabriel Krisman Bertazi [Thu, 25 Feb 2016 16:54:20 +0000 (13:54 -0300)]
ipr: Fix regression when loading firmware
commit
21b81716c6bff24cda52dc75588455f879ddbfe9 upstream.
Commit
d63c7dd5bcb9 ("ipr: Fix out-of-bounds null overwrite") removed
the end of line handling when storing the update_fw sysfs attribute.
This changed the userpace API because it started refusing writes
terminated by a line feed, which broke the update tools we already have.
This patch re-adds that handling, so both a write terminated by a line
feed or not can make it through with the update.
Fixes: d63c7dd5bcb9 ("ipr: Fix out-of-bounds null overwrite")
Signed-off-by: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com>
Cc: Insu Yun <wuninsu@gmail.com>
Acked-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Insu Yun [Wed, 6 Jan 2016 17:44:01 +0000 (12:44 -0500)]
ipr: Fix out-of-bounds null overwrite
commit
d63c7dd5bcb9441af0526d370c43a65ca2c980d9 upstream.
Return value of snprintf is not bound by size value, 2nd argument.
(https://www.kernel.org/doc/htmldocs/kernel-api/API-snprintf.html).
Return value is number of printed chars, can be larger than 2nd
argument. Therefore, it can write null byte out of bounds ofbuffer.
Since snprintf puts null, it does not need to put additional null byte.
Signed-off-by: Insu Yun <wuninsu@gmail.com>
Reviewed-by: Shane Seymour <shane.seymour@hpe.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Jiri Slaby [Wed, 16 Mar 2016 08:29:22 +0000 (09:29 +0100)]
Linux 3.12.57
Konrad Rzeszutek Wilk [Thu, 11 Feb 2016 21:10:23 +0000 (16:10 -0500)]
xen/pciback: Check PF instead of VF for PCI_COMMAND_MEMORY
commit
8d47065f7d1980dde52abb874b301054f3013602 upstream.
Commit
408fb0e5aa7fda0059db282ff58c3b2a4278baa0 (xen/pciback: Don't
allow MSI-X ops if PCI_COMMAND_MEMORY is not set) prevented enabling
MSI-X on passed-through virtual functions, because it checked the VF
for PCI_COMMAND_MEMORY but this is not a valid bit for VFs.
Instead, check the physical function for PCI_COMMAND_MEMORY.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Konrad Rzeszutek Wilk [Mon, 2 Nov 2015 23:13:27 +0000 (18:13 -0500)]
xen/pciback: Don't allow MSI-X ops if PCI_COMMAND_MEMORY is not set.
commit
408fb0e5aa7fda0059db282ff58c3b2a4278baa0 upstream.
commit
f598282f51 ("PCI: Fix the NIU MSI-X problem in a better way")
teaches us that dealing with MSI-X can be troublesome.
Further checks in the MSI-X architecture shows that if the
PCI_COMMAND_MEMORY bit is turned of in the PCI_COMMAND we
may not be able to access the BAR (since they are memory regions).
Since the MSI-X tables are located in there.. that can lead
to us causing PCIe errors. Inhibit us performing any
operation on the MSI-X unless the MEMORY bit is set.
Note that Xen hypervisor with:
"x86/MSI-X: access MSI-X table only after having enabled MSI-X"
will return:
xen_pciback: 0000:0a:00.1: error -6 enabling MSI-X for guest 3!
When the generic MSI code tries to setup the PIRQ without
MEMORY bit set. Which means with later versions of Xen
(4.6) this patch is not neccessary.
This is part of XSA-157
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Konrad Rzeszutek Wilk [Wed, 1 Apr 2015 14:49:47 +0000 (10:49 -0400)]
xen/pciback: For XEN_PCI_OP_disable_msi[|x] only disable if device has MSI(X) enabled.
commit
7cfb905b9638982862f0331b36ccaaca5d383b49 upstream.
Otherwise just continue on, returning the same values as
previously (return of 0, and op->result has the PIRQ value).
This does not change the behavior of XEN_PCI_OP_disable_msi[|x].
The pci_disable_msi or pci_disable_msix have the checks for
msi_enabled or msix_enabled so they will error out immediately.
However the guest can still call these operations and cause
us to disable the 'ack_intr'. That means the backend IRQ handler
for the legacy interrupt will not respond to interrupts anymore.
This will lead to (if the device is causing an interrupt storm)
for the Linux generic code to disable the interrupt line.
Naturally this will only happen if the device in question
is plugged in on the motherboard on shared level interrupt GSI.
This is part of XSA-157
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Konrad Rzeszutek Wilk [Mon, 2 Nov 2015 22:24:08 +0000 (17:24 -0500)]
xen/pciback: Do not install an IRQ handler for MSI interrupts.
commit
a396f3a210c3a61e94d6b87ec05a75d0be2a60d0 upstream.
Otherwise an guest can subvert the generic MSI code to trigger
an BUG_ON condition during MSI interrupt freeing:
for (i = 0; i < entry->nvec_used; i++)
BUG_ON(irq_has_action(entry->irq + i));
Xen PCI backed installs an IRQ handler (request_irq) for
the dev->irq whenever the guest writes PCI_COMMAND_MEMORY
(or PCI_COMMAND_IO) to the PCI_COMMAND register. This is
done in case the device has legacy interrupts the GSI line
is shared by the backend devices.
To subvert the backend the guest needs to make the backend
to change the dev->irq from the GSI to the MSI interrupt line,
make the backend allocate an interrupt handler, and then command
the backend to free the MSI interrupt and hit the BUG_ON.
Since the backend only calls 'request_irq' when the guest
writes to the PCI_COMMAND register the guest needs to call
XEN_PCI_OP_enable_msi before any other operation. This will
cause the generic MSI code to setup an MSI entry and
populate dev->irq with the new PIRQ value.
Then the guest can write to PCI_COMMAND PCI_COMMAND_MEMORY
and cause the backend to setup an IRQ handler for dev->irq
(which instead of the GSI value has the MSI pirq). See
'xen_pcibk_control_isr'.
Then the guest disables the MSI: XEN_PCI_OP_disable_msi
which ends up triggering the BUG_ON condition in 'free_msi_irqs'
as there is an IRQ handler for the entry->irq (dev->irq).
Note that this cannot be done using MSI-X as the generic
code does not over-write dev->irq with the MSI-X PIRQ values.
The patch inhibits setting up the IRQ handler if MSI or
MSI-X (for symmetry reasons) code had been called successfully.
P.S.
Xen PCIBack when it sets up the device for the guest consumption
ends up writting 0 to the PCI_COMMAND (see xen_pcibk_reset_device).
XSA-120 addendum patch removed that - however when upstreaming said
addendum we found that it caused issues with qemu upstream. That
has now been fixed in qemu upstream.
This is part of XSA-157
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Konrad Rzeszutek Wilk [Mon, 2 Nov 2015 23:07:44 +0000 (18:07 -0500)]
xen/pciback: Return error on XEN_PCI_OP_enable_msix when device has MSI or MSI-X enabled
commit
5e0ce1455c09dd61d029b8ad45d82e1ac0b6c4c9 upstream.
The guest sequence of:
a) XEN_PCI_OP_enable_msix
b) XEN_PCI_OP_enable_msix
results in hitting an NULL pointer due to using freed pointers.
The device passed in the guest MUST have MSI-X capability.
The a) constructs and SysFS representation of MSI and MSI groups.
The b) adds a second set of them but adding in to SysFS fails (duplicate entry).
'populate_msi_sysfs' frees the newly allocated msi_irq_groups (note that
in a) pdev->msi_irq_groups is still set) and also free's ALL of the
MSI-X entries of the device (the ones allocated in step a) and b)).
The unwind code: 'free_msi_irqs' deletes all the entries and tries to
delete the pdev->msi_irq_groups (which hasn't been set to NULL).
However the pointers in the SysFS are already freed and we hit an
NULL pointer further on when 'strlen' is attempted on a freed pointer.
The patch adds a simple check in the XEN_PCI_OP_enable_msix to guard
against that. The check for msi_enabled is not stricly neccessary.
This is part of XSA-157
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Konrad Rzeszutek Wilk [Fri, 3 Apr 2015 15:08:22 +0000 (11:08 -0400)]
xen/pciback: Return error on XEN_PCI_OP_enable_msi when device has MSI or MSI-X enabled
commit
56441f3c8e5bd45aab10dd9f8c505dd4bec03b0d upstream.
The guest sequence of:
a) XEN_PCI_OP_enable_msi
b) XEN_PCI_OP_enable_msi
c) XEN_PCI_OP_disable_msi
results in hitting an BUG_ON condition in the msi.c code.
The MSI code uses an dev->msi_list to which it adds MSI entries.
Under the above conditions an BUG_ON() can be hit. The device
passed in the guest MUST have MSI capability.
The a) adds the entry to the dev->msi_list and sets msi_enabled.
The b) adds a second entry but adding in to SysFS fails (duplicate entry)
and deletes all of the entries from msi_list and returns (with msi_enabled
is still set). c) pci_disable_msi passes the msi_enabled checks and hits:
BUG_ON(list_empty(dev_to_msi_list(&dev->dev)));
and blows up.
The patch adds a simple check in the XEN_PCI_OP_enable_msi to guard
against that. The check for msix_enabled is not stricly neccessary.
This is part of XSA-157.
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Rusty Russell [Wed, 3 Feb 2016 06:25:26 +0000 (16:55 +1030)]
modules: fix longstanding /proc/kallsyms vs module insertion race.
commit
8244062ef1e54502ef55f54cced659913f244c3e upstream.
For CONFIG_KALLSYMS, we keep two symbol tables and two string tables.
There's one full copy, marked SHF_ALLOC and laid out at the end of the
module's init section. There's also a cut-down version that only
contains core symbols and strings, and lives in the module's core
section.
After module init (and before we free the module memory), we switch
the mod->symtab, mod->num_symtab and mod->strtab to point to the core
versions. We do this under the module_mutex.
However, kallsyms doesn't take the module_mutex: it uses
preempt_disable() and rcu tricks to walk through the modules, because
it's used in the oops path. It's also used in /proc/kallsyms.
There's nothing atomic about the change of these variables, so we can
get the old (larger!) num_symtab and the new symtab pointer; in fact
this is what I saw when trying to reproduce.
By grouping these variables together, we can use a
carefully-dereferenced pointer to ensure we always get one or the
other (the free of the module init section is already done in an RCU
callback, so that's safe). We allocate the init one at the end of the
module init section, and keep the core one inside the struct module
itself (it could also have been allocated at the end of the module
core, but that's probably overkill).
[ Rebased for 4.4-stable and older, because the following changes aren't
in the older trees:
-
e0224418516b4d8a6c2160574bac18447c354ef0: adds arg to is_core_symbol
-
7523e4dc5057e157212b4741abd6256e03404cf1: module_init/module_core/init_size/core_size
become init_layout.base/core_layout.base/init_layout.size/core_layout.size.
Original commit:
8244062ef1e54502ef55f54cced659913f244c3e
]
Reported-by: Weilong Chen <chenweilong@huawei.com>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=111541
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Jason Andryuk [Fri, 12 Feb 2016 23:13:33 +0000 (23:13 +0000)]
lib/ucs2_string: Correct ucs2 -> utf8 conversion
commit
a68075908a37850918ad96b056acc9ac4ce1bd90 upstream.
The comparisons should be >= since 0x800 and 0x80 require an additional bit
to store.
For the 3 byte case, the existing shift would drop off 2 more bits than
intended.
For the 2 byte case, there should be 5 bits bits in byte 1, and 6 bits in
byte 2.
Signed-off-by: Jason Andryuk <jandryuk@gmail.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Cc: Peter Jones <pjones@redhat.com>
Cc: Matthew Garrett <mjg59@coreos.com>
Cc: "Lee, Chun-Yi" <jlee@suse.com>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Matt Fleming [Mon, 15 Feb 2016 10:34:05 +0000 (10:34 +0000)]
efi: Add pstore variables to the deletion whitelist
commit
e246eb568bc4cbbdd8a30a3c11151ff9b7ca7312 upstream.
Laszlo explains why this is a good idea,
'This is because the pstore filesystem can be backed by UEFI variables,
and (for example) a crash might dump the last kilobytes of the dmesg
into a number of pstore entries, each entry backed by a separate UEFI
variable in the above GUID namespace, and with a variable name
according to the above pattern.
Please see "drivers/firmware/efi/efi-pstore.c".
While this patch series will not prevent the user from deleting those
UEFI variables via the pstore filesystem (i.e., deleting a pstore fs
entry will continue to delete the backing UEFI variable), I think it
would be nice to preserve the possibility for the sysadmin to delete
Linux-created UEFI variables that carry portions of the crash log,
*without* having to mount the pstore filesystem.'
There's also no chance of causing machines to become bricked by
deleting these variables, which is the whole purpose of excluding
things from the whitelist.
Use the LINUX_EFI_CRASH_GUID guid and a wildcard '*' for the match so
that we don't have to update the string in the future if new variable
name formats are created for crash dump variables.
Reported-by: Laszlo Ersek <lersek@redhat.com>
Acked-by: Peter Jones <pjones@redhat.com>
Tested-by: Peter Jones <pjones@redhat.com>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: "Lee, Chun-Yi" <jlee@suse.com>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Peter Jones [Mon, 8 Feb 2016 19:48:15 +0000 (14:48 -0500)]
efi: Make efivarfs entries immutable by default
commit
ed8b0de5a33d2a2557dce7f9429dca8cb5bc5879 upstream.
"rm -rf" is bricking some peoples' laptops because of variables being
used to store non-reinitializable firmware driver data that's required
to POST the hardware.
These are 100% bugs, and they need to be fixed, but in the mean time it
shouldn't be easy to *accidentally* brick machines.
We have to have delete working, and picking which variables do and don't
work for deletion is quite intractable, so instead make everything
immutable by default (except for a whitelist), and make tools that
aren't quite so broad-spectrum unset the immutable flag.
Signed-off-by: Peter Jones <pjones@redhat.com>
Tested-by: Lee, Chun-Yi <jlee@suse.com>
Acked-by: Matthew Garrett <mjg59@coreos.com>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Peter Jones [Mon, 8 Feb 2016 19:48:14 +0000 (14:48 -0500)]
efi: Make our variable validation list include the guid
commit
8282f5d9c17fe15a9e658c06e3f343efae1a2a2f upstream.
All the variables in this list so far are defined to be in the global
namespace in the UEFI spec, so this just further ensures we're
validating the variables we think we are.
Including the guid for entries will become more important in future
patches when we decide whether or not to allow deletion of variables
based on presence in this list.
Signed-off-by: Peter Jones <pjones@redhat.com>
Tested-by: Lee, Chun-Yi <jlee@suse.com>
Acked-by: Matthew Garrett <mjg59@coreos.com>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Peter Jones [Mon, 8 Feb 2016 19:48:13 +0000 (14:48 -0500)]
efi: Do variable name validation tests in utf8
commit
3dcb1f55dfc7631695e69df4a0d589ce5274bd07 upstream.
Actually translate from ucs2 to utf8 before doing the test, and then
test against our other utf8 data, instead of fudging it.
Signed-off-by: Peter Jones <pjones@redhat.com>
Acked-by: Matthew Garrett <mjg59@coreos.com>
Tested-by: Lee, Chun-Yi <jlee@suse.com>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Peter Jones [Mon, 8 Feb 2016 19:48:12 +0000 (14:48 -0500)]
efi: Use ucs2_as_utf8 in efivarfs instead of open coding a bad version
commit
e0d64e6a880e64545ad7d55786aa84ab76bac475 upstream.
Translate EFI's UCS-2 variable names to UTF-8 instead of just assuming
all variable names fit in ASCII.
Signed-off-by: Peter Jones <pjones@redhat.com>
Acked-by: Matthew Garrett <mjg59@coreos.com>
Tested-by: Lee, Chun-Yi <jlee@suse.com>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Peter Jones [Mon, 8 Feb 2016 19:48:11 +0000 (14:48 -0500)]
lib/ucs2_string: Add ucs2 -> utf8 helper functions
commit
73500267c930baadadb0d02284909731baf151f7 upstream.
This adds ucs2_utf8size(), which tells us how big our ucs2 string is in
bytes, and ucs2_as_utf8, which translates from ucs2 to utf8..
Signed-off-by: Peter Jones <pjones@redhat.com>
Tested-by: Lee, Chun-Yi <jlee@suse.com>
Acked-by: Matthew Garrett <mjg59@coreos.com>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Marcelo Tosatti [Wed, 14 Oct 2015 22:33:09 +0000 (19:33 -0300)]
KVM: x86: move steal time initialization to vcpu entry time
commit
7cae2bedcbd4680b155999655e49c27b9cf020fa upstream.
As reported at https://bugs.launchpad.net/qemu/+bug/
1494350,
it is possible to have vcpu->arch.st.last_steal initialized
from a thread other than vcpu thread, say the iothread, via
KVM_SET_MSRS.
Which can cause an overflow later (when subtracting from vcpu threads
sched_info.run_delay).
To avoid that, move steal time accumulation to vcpu entry time,
before copying steal time data to guest.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Reviewed-by: David Matlack <dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Andreas Schwab [Fri, 5 Feb 2016 18:50:03 +0000 (19:50 +0100)]
powerpc: Fix dedotify for binutils >= 2.26
commit
f15838e9cac8f78f0cc506529bb9d3b9fa589c1f upstream.
Since binutils 2.26 BFD is doing suffix merging on STRTAB sections. But
dedotify modifies the symbol names in place, which can also modify
unrelated symbols with a name that matches a suffix of a dotted name. To
remove the leading dot of a symbol name we can just increment the pointer
into the STRTAB section instead.
Backport to all stables to avoid breakage when people update their
binutils - mpe.
Signed-off-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Felix Fietkau [Thu, 18 Feb 2016 18:49:18 +0000 (19:49 +0100)]
mac80211: minstrel_ht: set default tx aggregation timeout to 0
commit
7a36b930e6ed4702c866dc74a5ad07318a57c688 upstream.
The value 5000 was put here with the addition of the timeout field to
ieee80211_start_tx_ba_session. It was originally added in mac80211 to
save resources for drivers like iwlwifi, which only supports a limited
number of concurrent aggregation sessions.
Since iwlwifi does not use minstrel_ht and other drivers don't need
this, 0 is a better default - especially since there have been
recent reports of aggregation setup related issues reproduced with
ath9k. This should improve stability without causing any adverse
effects.
Acked-by: Avery Pennarun <apenwarr@gmail.com>
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Johannes Berg [Wed, 27 Jan 2016 11:37:52 +0000 (12:37 +0100)]
wext: fix message delay/ordering
commit
8bf862739a7786ae72409220914df960a0aa80d8 upstream.
Beniamino reported that he was getting an RTM_NEWLINK message for a
given interface, after the RTM_DELLINK for it. It turns out that the
message is a wireless extensions message, which was sent because the
interface had been connected and disconnection while it was deleted
caused a wext message.
For its netlink messages, wext uses RTM_NEWLINK, but the message is
without all the regular rtnetlink attributes, so "ip monitor link"
prints just rudimentary information:
5: wlan1: <BROADCAST,MULTICAST> mtu 1500 qdisc mq state DOWN group default
link/ether 02:00:00:00:01:00 brd ff:ff:ff:ff:ff:ff
Deleted 5: wlan1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default
link/ether 02:00:00:00:01:00 brd ff:ff:ff:ff:ff:ff
5: wlan1: <BROADCAST,MULTICAST,UP>
link/ether
(from my hwsim reproduction)
This can cause userspace to get confused since it doesn't expect an
RTM_NEWLINK message after RTM_DELLINK.
The reason for this is that wext schedules a worker to send out the
messages, and the scheduling delay can cause the messages to get out
to userspace in different order.
To fix this, have wext register a netdevice notifier and flush out
any pending messages when netdevice state changes. This fixes any
ordering whenever the original message wasn't sent by a notifier
itself.
Reported-by: Beniamino Galvani <bgalvani@redhat.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Takashi Iwai [Mon, 29 Feb 2016 17:01:12 +0000 (18:01 +0100)]
ASoC: wm8958: Fix enum ctl accesses in a wrong type
commit
d0784829ae3b0beeb69b476f017d5c8a2eb95198 upstream.
"MBC Mode", "VSS Mode", "VSS HPF Mode" and "Enhanced EQ Mode" ctls in
wm8958 codec driver are enum, while the current driver accesses
wrongly via value.integer.value[]. They have to be via
value.enumerated.item[] instead.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Takashi Iwai [Mon, 29 Feb 2016 17:01:15 +0000 (18:01 +0100)]
ASoC: wm8994: Fix enum ctl accesses in a wrong type
commit
8019c0b37cd5a87107808300a496388b777225bf upstream.
The DRC Mode like "AIF1DRC1 Mode" and EQ Mode like "AIF1.1 EQ Mode" in
wm8994 codec driver are enum ctls, while the current driver accesses
wrongly via value.integer.value[]. They have to be via
value.enumerated.item[] instead.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Steven Rostedt (Red Hat) [Wed, 9 Mar 2016 16:58:41 +0000 (11:58 -0500)]
tracing: Fix check for cpu online when event is disabled
commit
dc17147de328a74bbdee67c1bf37d2f1992de756 upstream.
Commit
f37755490fe9b ("tracepoints: Do not trace when cpu is offline") added
a check to make sure that tracepoints only get called when the cpu is
online, as it uses rcu_read_lock_sched() for protection.
Commit
3a630178fd5f3 ("tracing: generate RCU warnings even when tracepoints
are disabled") added lockdep checks (including rcu checks) for events that
are not enabled to catch possible RCU issues that would only be triggered if
a trace event was enabled. Commit
f37755490fe9b only stopped the warnings
when the trace event was enabled but did not prevent warnings if the trace
event was called when disabled.
To fix this, the cpu online check is moved to where the condition is added
to the trace event. This will place the cpu online check in all places that
it may be used now and in the future.
Fixes: f37755490fe9b ("tracepoints: Do not trace when cpu is offline")
Fixes: 3a630178fd5f3 ("tracing: generate RCU warnings even when tracepoints are disabled")
Reported-by: Sudeep Holla <sudeep.holla@arm.com>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Radim Krčmář [Fri, 4 Mar 2016 14:08:42 +0000 (15:08 +0100)]
KVM: VMX: disable PEBS before a guest entry
commit
7099e2e1f4d9051f31bbfa5803adf954bb5d76ef upstream.
Linux guests on Haswell (and also SandyBridge and Broadwell, at least)
would crash if you decided to run a host command that uses PEBS, like
perf record -e 'cpu/mem-stores/pp' -a
This happens because KVM is using VMX MSR switching to disable PEBS, but
SDM [2015-12] 18.4.4.4 Re-configuring PEBS Facilities explains why it
isn't safe:
When software needs to reconfigure PEBS facilities, it should allow a
quiescent period between stopping the prior event counting and setting
up a new PEBS event. The quiescent period is to allow any latent
residual PEBS records to complete its capture at their previously
specified buffer address (provided by IA32_DS_AREA).
There might not be a quiescent period after the MSR switch, so a CPU
ends up using host's MSR_IA32_DS_AREA to access an area in guest's
memory. (Or MSR switching is just buggy on some models.)
The guest can learn something about the host this way:
If the guest doesn't map address pointed by MSR_IA32_DS_AREA, it results
in #PF where we leak host's MSR_IA32_DS_AREA through CR2.
After that, a malicious guest can map and configure memory where
MSR_IA32_DS_AREA is pointing and can therefore get an output from
host's tracing.
This is not a critical leak as the host must initiate with PEBS tracing
and I have not been able to get a record from more than one instruction
before vmentry in vmx_vcpu_run() (that place has most registers already
overwritten with guest's).
We could disable PEBS just few instructions before vmentry, but
disabling it earlier shouldn't affect host tracing too much.
We also don't need to switch MSR_IA32_PEBS_ENABLE on VMENTRY, but that
optimization isn't worth its code, IMO.
(If you are implementing PEBS for guests, be sure to handle the case
where both host and guest enable PEBS, because this patch doesn't.)
Fixes: 26a4f3c08de4 ("perf/x86: disable PEBS on a guest entry.")
Reported-by: Jiří Olša <jolsa@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Jiri Slaby [Mon, 14 Mar 2016 12:33:30 +0000 (13:33 +0100)]
Revert "drm/radeon: hold reference to fences in radeon_sa_bo_new"
This reverts commit
40df18b49e7fe4ec9ab93f68c33661ee291149bd, commit
f6ff4f67cdf8455d0a4226eeeaf5af17c37d05eb upstream.
It causes oopses:
BUG: unable to handle kernel NULL pointer dereference at
0000000000000008
IP: [<
ffffffffa010345d>] radeon_fence_ref+0xd/0x50 [radeon]
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Nicolai Hähnle <nicolai.haehnle@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Richard Weinberger [Sun, 21 Feb 2016 09:53:03 +0000 (10:53 +0100)]
ubi: Fix out of bounds write in volume update code
commit
e4f6daac20332448529b11f09388f1d55ef2084c upstream.
ubi_start_leb_change() allocates too few bytes.
ubi_more_leb_change_data() will write up to req->upd_bytes +
ubi->min_io_size bytes.
Signed-off-by: Richard Weinberger <richard@nod.at>
Reviewed-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Maciej W. Rozycki [Fri, 4 Mar 2016 01:42:49 +0000 (01:42 +0000)]
MIPS: traps: Fix SIGFPE information leak from `do_ov' and `do_trap_or_bp'
commit
e723e3f7f9591b79e8c56b3d7c5a204a9c571b55 upstream.
Avoid sending a partially initialised `siginfo_t' structure along SIGFPE
signals issued from `do_ov' and `do_trap_or_bp', leading to information
leaking from the kernel stack.
Signed-off-by: Maciej W. Rozycki <macro@imgtec.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Yegor Yefremov [Mon, 29 Feb 2016 15:39:57 +0000 (16:39 +0100)]
USB: serial: option: add support for Quectel UC20
commit
c0992d0f54847d0d1d85c60fcaa054f175ab1ccd upstream.
Add support for Quectel UC20 and blacklist the QMI interface.
Signed-off-by: Yegor Yefremov <yegorslists@googlemail.com>
[johan: amend commit message ]
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Daniele Palmas [Mon, 29 Feb 2016 14:36:11 +0000 (15:36 +0100)]
USB: serial: option: add support for Telit LE922 PID 0x1045
commit
5deef5551c77e488922cc4bf4bc76df63be650d0 upstream.
This patch adds support for 0x1045 PID of Telit LE922.
Signed-off-by: Daniele Palmas <dnlplm@gmail.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Vittorio Alfieri [Sun, 28 Feb 2016 13:40:24 +0000 (14:40 +0100)]
USB: cp210x: Add ID for Parrot NMEA GPS Flight Recorder
commit
3c4c615d70c8cbdc8ba8c79ed702640930652a79 upstream.
The Parrot NMEA GPS Flight Recorder is a USB composite device
consisting of hub, flash storage, and cp210x usb to serial chip.
It is an accessory to the mass-produced Parrot AR Drone 2.
The device emits standard NMEA messages which make the it compatible
with NMEA compatible software. It was tested using gpsd version 3.11-3
as an NMEA interpreter and using the official Parrot Flight Recorder.
Signed-off-by: Vittorio Alfieri <vittorio88@gmail.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Takashi Iwai [Sun, 28 Feb 2016 10:36:14 +0000 (11:36 +0100)]
ALSA: timer: Fix broken compat timer user status ioctl
commit
3a72494ac2a3bd229db941d51e7efe2f6ccd947b upstream.
The timer user status compat ioctl returned the bogus struct used for
64bit architectures instead of the 32bit one. This patch addresses
it to return the proper struct.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Takashi Iwai [Mon, 29 Feb 2016 13:32:42 +0000 (14:32 +0100)]
ALSA: hdspm: Fix zero-division
commit
c1099c3294c2344110085a38c50e478a5992b368 upstream.
HDSPM driver contains a code issuing zero-division potentially in
system sample rate ctl code. This patch fixes it by not processing
a zero or invalid rate value as a divisor, as well as excluding the
invalid value to be passed via the given ctl element.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Takashi Iwai [Mon, 29 Feb 2016 13:26:43 +0000 (14:26 +0100)]
ALSA: hdsp: Fix wrong boolean ctl value accesses
commit
eab3c4db193f5fcccf70e884de9a922ca2c63d80 upstream.
snd-hdsp driver accesses enum item values (int) instead of boolean
values (long) wrongly for some ctl elements. This patch fixes them.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Takashi Iwai [Mon, 29 Feb 2016 13:25:16 +0000 (14:25 +0100)]
ALSA: hdspm: Fix wrong boolean ctl value accesses
commit
537e48136295c5860a92138c5ea3959b9542868b upstream.
snd-hdspm driver accesses enum item values (int) instead of boolean
values (long) wrongly for some ctl elements. This patch fixes them.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Takashi Iwai [Tue, 1 Mar 2016 17:30:18 +0000 (18:30 +0100)]
ALSA: seq: oss: Don't drain at closing a client
commit
197b958c1e76a575d77038cc98b4bebc2134279f upstream.
The OSS sequencer client tries to drain the pending events at
releasing. Unfortunately, as spotted by syzkaller fuzzer, this may
lead to an unkillable process state when the event has been queued at
the far future. Since the process being released can't be signaled
any longer, it remains and waits for the echo-back event in that far
future.
Back to history, the draining feature was implemented at the time we
misinterpreted POSIX definition for blocking file operation.
Actually, such a behavior is superfluous at release, and we should
just release the device as is instead of keeping it up forever.
This patch just removes the draining call that may block the release
for too long time unexpectedly.
BugLink: http://lkml.kernel.org/r/CACT4Y+Y4kD-aBGj37rf-xBw9bH3GMU6P+MYg4W1e-s-paVD2pg@mail.gmail.com
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Takashi Iwai [Sun, 28 Feb 2016 10:41:47 +0000 (11:41 +0100)]
ALSA: timer: Fix ioctls for X32 ABI
commit
b24e7ad1fdc22177eb3e51584e1cfcb45d818488 upstream.
X32 ABI takes the 64bit timespec, thus the timer user status ioctl becomes
incompatible with IA32. This results in NOTTY error when the ioctl is
issued.
Meanwhile, this struct in X32 is essentially identical with the one in
X86-64, so we can just bypassing to the existing code for this
specific compat ioctl.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Takashi Iwai [Sun, 28 Feb 2016 10:28:08 +0000 (11:28 +0100)]
ALSA: rawmidi: Fix ioctls X32 ABI
commit
2251fbbc1539f05b0b206b37a602d5776be37252 upstream.
Like the previous fixes for ctl and PCM, we need a fix for
incompatible X32 ABI regarding the rawmidi: namely, struct
snd_rawmidi_status has the timespec, and the size and the alignment on
X32 differ from IA32.
This patch fixes the incompatible ioctl for X32.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Takashi Iwai [Sat, 27 Feb 2016 16:52:42 +0000 (17:52 +0100)]
ALSA: ctl: Fix ioctls for X32 ABI
commit
6236d8bb2afcfe71b88ecea554e0dc638090a45f upstream.
The X32 ABI takes the same alignment like x86-64, and this may result
in the incompatible struct size from ia32. Unfortunately, we hit this
in some control ABI: struct snd_ctl_elem_value differs between them
due to the position of 64bit variable array. This ends up with the
unknown ioctl (ENOTTY) error.
The fix is to add the compat entries for the new aligned struct.
Reported-and-tested-by: Steven Newbury <steve@snewbury.org.uk>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
David Woodhouse [Mon, 1 Feb 2016 14:04:46 +0000 (14:04 +0000)]
Fix directory hardlinks from deleted directories
commit
be629c62a603e5935f8177fd8a19e014100a259e upstream.
When a directory is deleted, we don't take too much care about killing off
all the dirents that belong to it — on the basis that on remount, the scan
will conclude that the directory is dead anyway.
This doesn't work though, when the deleted directory contained a child
directory which was moved *out*. In the early stages of the fs build
we can then end up with an apparent hard link, with the child directory
appearing both in its true location, and as a child of the original
directory which are this stage of the mount process we don't *yet* know
is defunct.
To resolve this, take out the early special-casing of the "directories
shall not have hard links" rule in jffs2_build_inode_pass1(), and let the
normal nlink processing happen for directories as well as other inodes.
Then later in the build process we can set ic->pino_nlink to the parent
inode#, as is required for directories during normal operaton, instead
of the nlink. And complain only *then* about hard links which are still
in evidence even after killing off all the unreachable paths.
Reported-by: Liu Song <liu.song11@zte.com.cn>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
David Woodhouse [Mon, 1 Feb 2016 12:37:20 +0000 (12:37 +0000)]
jffs2: Fix page lock / f->sem deadlock
commit
49e91e7079febe59a20ca885a87dd1c54240d0f1 upstream.
With this fix, all code paths should now be obtaining the page lock before
f->sem.
Reported-by: Szabó Tamás <sztomi89@gmail.com>
Tested-by: Thomas Betker <thomas.betker@rohde-schwarz.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Thomas Betker [Tue, 10 Nov 2015 21:18:15 +0000 (22:18 +0100)]
Revert "jffs2: Fix lock acquisition order bug in jffs2_write_begin"
commit
157078f64b8a9cd7011b6b900b2f2498df850748 upstream.
This reverts commit
5ffd3412ae55
("jffs2: Fix lock acquisition order bug in jffs2_write_begin").
The commit modified jffs2_write_begin() to remove a deadlock with
jffs2_garbage_collect_live(), but this introduced new deadlocks found
by multiple users. page_lock() actually has to be called before
mutex_lock(&c->alloc_sem) or mutex_lock(&f->sem) because
jffs2_write_end() and jffs2_readpage() are called with the page locked,
and they acquire c->alloc_sem and f->sem, resp.
In other words, the lock order in jffs2_write_begin() was correct, and
it is the jffs2_garbage_collect_live() path that has to be changed.
Revert the commit to get rid of the new deadlocks, and to clear the way
for a better fix of the original deadlock.
Reported-by: Deng Chao <deng.chao1@zte.com.cn>
Reported-by: Ming Liu <liu.ming50@gmail.com>
Reported-by: wangzaiwei <wangzaiwei@top-vision.cn>
Signed-off-by: Thomas Betker <thomas.betker@rohde-schwarz.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Todd E Brandt [Thu, 3 Mar 2016 00:05:29 +0000 (16:05 -0800)]
PM / sleep / x86: Fix crash on graph trace through x86 suspend
commit
92f9e179a702a6adbc11e2fedc76ecd6ffc9e3f7 upstream.
Pause/unpause graph tracing around do_suspend_lowlevel as it has
inconsistent call/return info after it jumps to the wakeup vector.
The graph trace buffer will otherwise become misaligned and
may eventually crash and hang on suspend.
To reproduce the issue and test the fix:
Run a function_graph trace over suspend/resume and set the graph
function to suspend_devices_and_enter. This consistently hangs the
system without this fix.
Signed-off-by: Todd Brandt <todd.e.brandt@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Harvey Hunt [Wed, 24 Feb 2016 15:16:43 +0000 (15:16 +0000)]
libata: Align ata_device's id on a cacheline
commit
4ee34ea3a12396f35b26d90a094c75db95080baa upstream.
The id buffer in ata_device is a DMA target, but it isn't explicitly
cacheline aligned. Due to this, adjacent fields can be overwritten with
stale data from memory on non coherent architectures. As a result, the
kernel is sometimes unable to communicate with an ATA device.
Fix this by ensuring that the id buffer is cacheline aligned.
This issue is similar to that fixed by Commit
84bda12af31f
("libata: align ap->sector_buf").
Signed-off-by: Harvey Hunt <harvey.hunt@imgtec.com>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Arnd Bergmann [Thu, 11 Feb 2016 13:16:27 +0000 (14:16 +0100)]
libata: fix HDIO_GET_32BIT ioctl
commit
287e6611ab1eac76c2c5ebf6e345e04c80ca9c61 upstream.
As reported by Soohoon Lee, the HDIO_GET_32BIT ioctl does not
work correctly in compat mode with libata.
I have investigated the issue further and found multiple problems
that all appeared with the same commit that originally introduced
HDIO_GET_32BIT handling in libata back in linux-2.6.8 and presumably
also linux-2.4, as the code uses "copy_to_user(arg, &val, 1)" to copy
a 'long' variable containing either 0 or 1 to user space.
The problems with this are:
* On big-endian machines, this will always write a zero because it
stores the wrong byte into user space.
* In compat mode, the upper three bytes of the variable are updated
by the compat_hdio_ioctl() function, but they now contain
uninitialized stack data.
* The hdparm tool calling this ioctl uses a 'static long' variable
to store the result. This means at least the upper bytes are
initialized to zero, but calling another ioctl like HDIO_GET_MULTCOUNT
would fill them with data that remains stale when the low byte
is overwritten. Fortunately libata doesn't implement any of the
affected ioctl commands, so this would only happen when we query
both an IDE and an ATA device in the same command such as
"hdparm -N -c /dev/hda /dev/sda"
* The libata code for unknown reasons started using ATA_IOC_GET_IO32
and ATA_IOC_SET_IO32 as aliases for HDIO_GET_32BIT and HDIO_SET_32BIT,
while the ioctl commands that were added later use the normal
HDIO_* names. This is harmless but rather confusing.
This addresses all four issues by changing the code to use put_user()
on an 'unsigned long' variable in HDIO_GET_32BIT, like the IDE subsystem
does, and by clarifying the names of the ioctl commands.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reported-by: Soohoon Lee <Soohoon.Lee@f5.com>
Tested-by: Soohoon Lee <Soohoon.Lee@f5.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Timothy Pearson [Fri, 26 Feb 2016 21:29:32 +0000 (15:29 -0600)]
drm/ast: Fix incorrect register check for DRAM width
commit
2d02b8bdba322b527c5f5168ce1ca10c2d982a78 upstream.
During DRAM initialization on certain ASpeed devices, an incorrect
bit (bit 10) was checked in the "SDRAM Bus Width Status" register
to determine DRAM width.
Query bit 6 instead in accordance with the Aspeed AST2050 datasheet v1.05.
Signed-off-by: Timothy Pearson <tpearson@raptorengineeringinc.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Andy Lutomirski [Wed, 24 Feb 2016 20:18:49 +0000 (12:18 -0800)]
x86/entry/compat: Add missing CLAC to entry_INT80_32
commit
3d44d51bd339766f0178f0cf2e8d048b4a4872aa upstream.
This doesn't seem to fix a regression -- I don't think the CLAC was
ever there.
I double-checked in a debugger: entries through the int80 gate do
not automatically clear AC.
Stable maintainers: I can provide a backport to 4.3 and earlier if
needed. This needs to be backported all the way to 3.10.
Reported-by: Brian Gerst <brgerst@gmail.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 63bcff2a307b ("x86, smap: Add STAC and CLAC instructions to control user space access")
Link: http://lkml.kernel.org/r/b02b7e71ae54074be01fc171cbd4b72517055c0e.1456345086.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
[ kamal: backport to 3.10 through 3.19-stable: file rename; context ]
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Suravee Suthikulpanit [Tue, 23 Feb 2016 12:03:30 +0000 (13:03 +0100)]
iommu/amd: Fix boot warning when device 00:00.0 is not iommu covered
commit
38e45d02ea9f194b89d6bf41e52ccafc8e2c2b47 upstream.
The setup code for the performance counters in the AMD IOMMU driver
tests whether the counters can be written. It tests to setup a counter
for device 00:00.0, which fails on systems where this particular device
is not covered by the IOMMU.
Fix this by not relying on device 00:00.0 but only on the IOMMU being
present.
Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Pavel Shilovsky [Sat, 27 Feb 2016 08:58:18 +0000 (11:58 +0300)]
CIFS: Fix SMB2+ interim response processing for read requests
commit
6cc3b24235929b54acd5ecc987ef11a425bd209e upstream.
For interim responses we only need to parse a header and update
a number credits. Now it is done for all SMB2+ command except
SMB2_READ which is wrong. Fix this by adding such processing.
Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org>
Tested-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Justin Maggard [Tue, 9 Feb 2016 23:52:08 +0000 (15:52 -0800)]
cifs: fix out-of-bounds access in lease parsing
commit
deb7deff2f00bdbbcb3d560dad2a89ef37df837d upstream.
When opening a file, SMB2_open() attempts to parse the lease state from the
SMB2 CREATE Response. However, the parsing code was not careful to ensure
that the create contexts are not empty or invalid, which can lead to out-
of-bounds memory access. This can be seen easily by trying
to read a file from a OSX 10.11 SMB3 server. Here is sample crash output:
BUG: unable to handle kernel paging request at
ffff8800a1a77cc6
IP: [<
ffffffff8828a734>] SMB2_open+0x804/0x960
PGD
8f77067 PUD 0
Oops: 0000 [#1] SMP
Modules linked in:
CPU: 3 PID: 2876 Comm: cp Not tainted 4.5.0-rc3.x86_64.1+ #14
Hardware name: NETGEAR ReadyNAS 314 /ReadyNAS 314 , BIOS 4.6.5 10/11/2012
task:
ffff880073cdc080 ti:
ffff88005b31c000 task.ti:
ffff88005b31c000
RIP: 0010:[<
ffffffff8828a734>] [<
ffffffff8828a734>] SMB2_open+0x804/0x960
RSP: 0018:
ffff88005b31fa08 EFLAGS:
00010282
RAX:
0000000000000015 RBX:
0000000000000000 RCX:
0000000000000006
RDX:
0000000000000000 RSI:
0000000000000246 RDI:
ffff88007eb8c8b0
RBP:
ffff88005b31fad8 R08:
666666203d206363 R09:
6131613030383866
R10:
3030383866666666 R11:
00000000000002b0 R12:
ffff8800660fd800
R13:
ffff8800a1a77cc2 R14:
00000000424d53fe R15:
ffff88005f5a28c0
FS:
00007f7c8a2897c0(0000) GS:
ffff88007eb80000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
000000008005003b
CR2:
ffff8800a1a77cc6 CR3:
000000005b281000 CR4:
00000000000006e0
Stack:
ffff88005b31fa70 ffffffff88278789 00000000000001d3 ffff88005f5a2a80
ffffffff00000003 ffff88005d029d00 ffff88006fde05a0 0000000000000000
ffff88005b31fc78 ffff88006fde0780 ffff88005b31fb2f 0000000100000fe0
Call Trace:
[<
ffffffff88278789>] ? cifsConvertToUTF16+0x159/0x2d0
[<
ffffffff8828cf68>] smb2_open_file+0x98/0x210
[<
ffffffff8811e80c>] ? __kmalloc+0x1c/0xe0
[<
ffffffff882685f4>] cifs_open+0x2a4/0x720
[<
ffffffff88122cef>] do_dentry_open+0x1ff/0x310
[<
ffffffff88268350>] ? cifsFileInfo_get+0x30/0x30
[<
ffffffff88123d92>] vfs_open+0x52/0x60
[<
ffffffff88131dd0>] path_openat+0x170/0xf70
[<
ffffffff88097d48>] ? remove_wait_queue+0x48/0x50
[<
ffffffff88133a29>] do_filp_open+0x79/0xd0
[<
ffffffff8813f2ca>] ? __alloc_fd+0x3a/0x170
[<
ffffffff881240c4>] do_sys_open+0x114/0x1e0
[<
ffffffff881241a9>] SyS_open+0x19/0x20
[<
ffffffff8896e257>] entry_SYSCALL_64_fastpath+0x12/0x6a
Code: 4d 8d 6c 07 04 31 c0 4c 89 ee e8 47 6f e5 ff 31 c9 41 89 ce 44 89 f1 48 c7 c7 28 b1 bd 88 31 c0 49 01 cd 4c 89 ee e8 2b 6f e5 ff <45> 0f b7 75 04 48 c7 c7 31 b1 bd 88 31 c0 4d 01 ee 4c 89 f6 e8
RIP [<
ffffffff8828a734>] SMB2_open+0x804/0x960
RSP <
ffff88005b31fa08>
CR2:
ffff8800a1a77cc6
---[ end trace
d9f69ba64feee469 ]---
Signed-off-by: Justin Maggard <jmaggard@netgear.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Michal Marek [Wed, 9 Dec 2015 14:08:21 +0000 (15:08 +0100)]
genksyms: Handle string literals with spaces in reference files
commit
a78f70e8d65e88b9f631d073f68cb26dcd746298 upstream.
The reference files use spaces to separate tokens, however, we must
preserve spaces inside string literals. Currently the only case in the
tree is struct edac_raw_error_desc in <linux/edac.h>:
$ KBUILD_SYMTYPES=1 make -s drivers/edac/amd64_edac.symtypes
$ mv drivers/edac/amd64_edac.{symtypes,symref}
$ KBUILD_SYMTYPES=1 make -s drivers/edac/amd64_edac.symtypes
drivers/edac/amd64_edac.c:527: warning: amd64_get_dram_hole_info: modversion changed because of changes in struct edac_raw_error_desc
Signed-off-by: Michal Marek <mmarek@suse.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Vasu Dev [Mon, 23 Nov 2015 18:31:25 +0000 (10:31 -0800)]
ixgbe: fix broken PFC with X550
commit
cb78cf12d6e90f57f6e7d090867ef19b6a189dde upstream.
PFC is configuration is skipped for X550 devices due to a incorrect
device id check, fixing that to include X550 PFC configuration.
Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Vasu Dev [Mon, 23 Nov 2015 18:31:01 +0000 (10:31 -0800)]
ixgbe: use correct FCoE DDP max check
commit
f10166aba2def9bc6443290231c60f7e2f70129b upstream.
Use fcoe_ddp_xid from netdev as this is correctly set for different
device IDs to avoid DDP skip error on X550 as "xid=0x20b out-of-range"
Signed-off-by: Vasu Dev <vasu.dev@intel.com>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Al Cooper [Wed, 2 Dec 2015 19:55:07 +0000 (14:55 -0500)]
usb: Add connected retry on resume for non SS devices
commit
6b82b1223e3b14afd89d167795671a9f4f77b2f0 upstream.
Currently usb_port_resume waits for up to 2 seconds for CONNECT
status for SS devices only. This change will do the same thing for
non-SS devices even though the reason is a little different. This
will fix an issue where VBUS is turned off during system wide
"suspend to ram" and some 2.0 devices take greater than the current
max of 100ms to show connected after VBUS is enabled. This is most
commonly seen on hard drive based devices and USB3.0 devices plugged
into a 2.0 only port.
Signed-off-by: Al Cooper <alcooperx@gmail.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Oliver Neukum [Tue, 27 Oct 2015 11:51:34 +0000 (09:51 -0200)]
[media] usbvision fix overflow of interfaces array
commit
588afcc1c0e45358159090d95bf7b246fb67565f upstream.
This fixes the crash reported in:
http://seclists.org/bugtraq/2015/Oct/35
The interface number needs a sanity check.
Signed-off-by: Oliver Neukum <oneukum@suse.com>
Cc: Vladis Dronov <vdronov@redhat.com>
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Konstantin Shkolnyy [Wed, 28 Oct 2015 21:02:54 +0000 (16:02 -0500)]
USB: cp210x: work around cp2108 GET_LINE_CTL bug
commit
d0bf1ff0ae322aca59b00b9a2ad121d35a77e78f upstream.
Add helper to access line-control register in order to work around a
cp2108 GET_LINE_CTL bug.
cp2108 GET_LINE_CTL returns the 16-bit value with the 2 bytes swapped.
However, SET_LINE_CTL functions properly. When the driver tries to modify
the register, it reads it, modifies some bits and writes back. Because the
read bytes were swapped, this often results in an invalid value to be
written. In turn, this causes cp2108 respond with a stall. The stall
sometimes doesn't clear properly and cp2108 starts responding to following
valid commands also with stalls, effectively failing.
Signed-off-by: Konstantin Shkolnyy <konstantin.shkolnyy@gmail.com>
[johan: amend commit message, modify probe error handling ]
Signed-off-by: Johan Hovold <johan@kernel.org>
Cc: Oliver Neukum <oliver@neukum.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Konstantin Shkolnyy [Wed, 28 Oct 2015 21:02:34 +0000 (16:02 -0500)]
USB: cp210x: relocate private data from USB interface to port
commit
e2ae67a3b55188b0342522d8139acf013feb2a69 upstream.
This change is preparation for implementing a cp2108 bug workaround.
The workaround requires storing some private data. Right now the data is
attached to the USB interface and allocated in the attach() callback.
The bug detection requires USB I/O which is done easier from port_probe()
callback rather than attach(). Since the USB access functions take port
as a parameter, and since the private data is used exclusively by these
functions, it can be allocated in port_probe(). Also, all cp210x devices
have exactly 1 port per USB iterface, so moving private data from the USB
interface to port is trivial.
Signed-off-by: Konstantin Shkolnyy <konstantin.shkolnyy@gmail.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Cc: Oliver Neukum <oliver@neukum.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Konstantin Shkolnyy [Wed, 28 Oct 2015 21:02:03 +0000 (16:02 -0500)]
USB: cp210x: flush device queues at close
commit
ebfb319bb601e501f77809a83b0b69b529c22a8d upstream.
Flush all device queues at close in order to work around a cp2108 Tx
queue bug.
Occasionally, writing data and immediately closing the port makes cp2108
stop responding. The device has to be unplugged to clear the error.
The failure is induced by shutting down the device while its Tx queue
still has unsent data. This condition is avoided by issuing PURGE command
from the close() callback.
This change is applied to all cp210x devices. Clearing internal queues on
close is generally good.
Signed-off-by: Konstantin Shkolnyy <konstantin.shkolnyy@gmail.com>
[johan: amend commit message ]
Signed-off-by: Johan Hovold <johan@kernel.org>
Cc: Oliver Neukum <oliver@neukum.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Hannes Frederic Sowa [Wed, 3 Feb 2016 01:11:03 +0000 (02:11 +0100)]
unix: correctly track in-flight fds in sending process user_struct
commit
415e3d3e90ce9e18727e8843ae343eda5a58fad6 upstream.
The commit referenced in the Fixes tag incorrectly accounted the number
of in-flight fds over a unix domain socket to the original opener
of the file-descriptor. This allows another process to arbitrary
deplete the original file-openers resource limit for the maximum of
open files. Instead the sending processes and its struct cred should
be credited.
To do so, we add a reference counted struct user_struct pointer to the
scm_fp_list and use it to account for the number of inflight unix fds.
Fixes: 712f4aad406bb1 ("unix: properly account for FDs passed over unix sockets")
Reported-by: David Herrmann <dh.herrmann@gmail.com>
Cc: David Herrmann <dh.herrmann@gmail.com>
Cc: Willy Tarreau <w@1wt.eu>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Michal Kubecek <mkubecek@suse.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
willy tarreau [Sun, 10 Jan 2016 06:54:56 +0000 (07:54 +0100)]
unix: properly account for FDs passed over unix sockets
[ Upstream commit
712f4aad406bb1ed67f3f98d04c044191f0ff593 ]
It is possible for a process to allocate and accumulate far more FDs than
the process' limit by sending them over a unix socket then closing them
to keep the process' fd count low.
This change addresses this problem by keeping track of the number of FDs
in flight per user and preventing non-privileged processes from having
more FDs in flight than their configured FD limit.
Cc: Michal Kubecek <mkubecek@suse.com>
Reported-by: socketpair@gmail.com
Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Mitigates: CVE-2013-4312 (Linux 2.0+)
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
NeilBrown [Mon, 7 Mar 2016 09:19:02 +0000 (10:19 +0100)]
nfsd: fix problem with setting ACL on directories
This is a regression of 3.12 stable commit
ba1816b40a87a3f3c4ca9cd54a593a10d87bd391 (nfsd: fix NFS regression).
If a non-inherited ACL is set on a directory, nfsd will try to set the Posix
default ACL to NULL. This gets converted to "" by generic_setxattr().
As "" is not a valid posix acl attribute value, this results in an error.
So instead of setting the xattr to NULL, remove it.
Fixes: ba1816b40a ("nfsd: fix NFS regression")
Signed-off-by: NeilBrown <neilb@suse.com>
Cc: Sergio Gelato <Sergio.Gelato@astro.su.se>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Jiri Slaby [Thu, 3 Mar 2016 12:28:30 +0000 (13:28 +0100)]
Linux 3.12.56
Hannes Reinecke [Fri, 12 Feb 2016 08:39:15 +0000 (09:39 +0100)]
bio: return EINTR if copying to user space got interrupted
commit
2d99b55d378c996b9692a0c93dd25f4ed5d58934 upstream.
Commit
35dc248383bbab0a7203fca4d722875bc81ef091 introduced a check for
current->mm to see if we have a user space context and only copies data
if we do. Now if an IO gets interrupted by a signal data isn't copied
into user space any more (as we don't have a user space context) but
user space isn't notified about it.
This patch modifies the behaviour to return -EINTR from bio_uncopy_user()
to notify userland that a signal has interrupted the syscall, otherwise
it could lead to a situation where the caller may get a buffer with
no data returned.
This can be reproduced by issuing SG_IO ioctl()s in one thread while
constantly sending signals to it.
[js] backport to 3.12
Fixes: 35dc248 [SCSI] sg: Fix user memory corruption when SG_IO is interrupted by a signal
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Borislav Petkov [Tue, 1 Dec 2015 14:52:36 +0000 (15:52 +0100)]
EDAC, mc_sysfs: Fix freeing bus' name
commit
12e26969b32c79018165d52caff3762135614aa1 upstream.
I get the splat below when modprobing/rmmoding EDAC drivers. It happens
because bus->name is invalid after bus_unregister() has run. The Code: section
below corresponds to:
.loc 1 1108 0
movq 672(%rbx), %rax # mci_1(D)->bus, mci_1(D)->bus
.loc 1 1109 0
popq %rbx #
.loc 1 1108 0
movq (%rax), %rdi # _7->name,
jmp kfree #
and %rax has some funky stuff
2030203020312030 which looks a lot like
something walked over it.
Fix that by saving the name ptr before doing stuff to string it points to.
general protection fault: 0000 [#1] SMP
Modules linked in: ...
CPU: 4 PID: 10318 Comm: modprobe Tainted: G I EN 3.12.51-11-default+ #48
Hardware name: HP ProLiant DL380 G7, BIOS P67 05/05/2011
task:
ffff880311320280 ti:
ffff88030da3e000 task.ti:
ffff88030da3e000
RIP: 0010:[<
ffffffffa019da92>] [<
ffffffffa019da92>] edac_unregister_sysfs+0x22/0x30 [edac_core]
RSP: 0018:
ffff88030da3fe28 EFLAGS:
00010292
RAX:
2030203020312030 RBX:
ffff880311b4e000 RCX:
000000000000095c
RDX:
0000000000000001 RSI:
ffff880327bb9600 RDI:
0000000000000286
RBP:
ffff880311b4e750 R08:
0000000000000000 R09:
ffffffff81296110
R10:
0000000000000400 R11:
0000000000000000 R12:
ffff88030ba1ac68
R13:
0000000000000001 R14:
00000000011b02f0 R15:
0000000000000000
FS:
00007fc9bf8f5700(0000) GS:
ffff8801a7c40000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
000000008005003b
CR2:
0000000000403c90 CR3:
000000019ebdf000 CR4:
00000000000007e0
Stack:
Call Trace:
i7core_unregister_mci.isra.9
i7core_remove
pci_device_remove
__device_release_driver
driver_detach
bus_remove_driver
pci_unregister_driver
i7core_exit
SyS_delete_module
system_call_fastpath
0x7fc9bf426536
Code: 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 53 48 89 fb e8 52 2a 1f e1 48 8b bb a0 02 00 00 e8 46 59 1f e1 48 8b 83 a0 02 00 00 5b <48> 8b 38 e9 26 9a fe e0 66 0f 1f 44 00 00 66 66 66 66 90 48 8b
RIP [<
ffffffffa019da92>] edac_unregister_sysfs+0x22/0x30 [edac_core]
RSP <
ffff88030da3fe28>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
Fixes: 7a623c039075 ("edac: rewrite the sysfs code to use struct device")
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Jeff Layton [Thu, 7 Jan 2016 21:38:10 +0000 (16:38 -0500)]
locks: fix unlock when fcntl_setlk races with a close
commit
7f3697e24dc3820b10f445a4a7d914fc356012d1 upstream.
Dmitry reported that he was able to reproduce the WARN_ON_ONCE that
fires in locks_free_lock_context when the flc_posix list isn't empty.
The problem turns out to be that we're basically rebuilding the
file_lock from scratch in fcntl_setlk when we discover that the setlk
has raced with a close. If the l_whence field is SEEK_CUR or SEEK_END,
then we may end up with fl_start and fl_end values that differ from
when the lock was initially set, if the file position or length of the
file has changed in the interim.
Fix this by just reusing the same lock request structure, and simply
override fl_type value with F_UNLCK as appropriate. That ensures that
we really are unlocking the lock that was initially set.
While we're there, make sure that we do pop a WARN_ON_ONCE if the
removal ever fails. Also return -EBADF in this event, since that's
what we would have returned if the close had happened earlier.
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Fixes: c293621bbf67 (stale POSIX lock handling)
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Acked-by: "J. Bruce Fields" <bfields@fieldses.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Konrad Rzeszutek Wilk [Thu, 11 Feb 2016 21:10:26 +0000 (16:10 -0500)]
xen/pcifront: Fix mysterious crashes when NUMA locality information was extracted.
commit
4d8c8bd6f2062c9988817183a91fe2e623c8aa5e upstream.
Occasionaly PV guests would crash with:
pciback 0000:00:00.1: Xen PCI mapped GSI0 to IRQ16
BUG: unable to handle kernel paging request at
0000000d1a8c0be0
.. snip..
<
ffffffff8139ce1b>] find_next_bit+0xb/0x10
[<
ffffffff81387f22>] cpumask_next_and+0x22/0x40
[<
ffffffff813c1ef8>] pci_device_probe+0xb8/0x120
[<
ffffffff81529097>] ? driver_sysfs_add+0x77/0xa0
[<
ffffffff815293e4>] driver_probe_device+0x1a4/0x2d0
[<
ffffffff813c1ddd>] ? pci_match_device+0xdd/0x110
[<
ffffffff81529657>] __device_attach_driver+0xa7/0xb0
[<
ffffffff815295b0>] ? __driver_attach+0xa0/0xa0
[<
ffffffff81527622>] bus_for_each_drv+0x62/0x90
[<
ffffffff8152978d>] __device_attach+0xbd/0x110
[<
ffffffff815297fb>] device_attach+0xb/0x10
[<
ffffffff813b75ac>] pci_bus_add_device+0x3c/0x70
[<
ffffffff813b7618>] pci_bus_add_devices+0x38/0x80
[<
ffffffff813dc34e>] pcifront_scan_root+0x13e/0x1a0
[<
ffffffff817a0692>] pcifront_backend_changed+0x262/0x60b
[<
ffffffff814644c6>] ? xenbus_gather+0xd6/0x160
[<
ffffffff8120900f>] ? put_object+0x2f/0x50
[<
ffffffff81465c1d>] xenbus_otherend_changed+0x9d/0xa0
[<
ffffffff814678ee>] backend_changed+0xe/0x10
[<
ffffffff81463a28>] xenwatch_thread+0xc8/0x190
[<
ffffffff810f22f0>] ? woken_wake_function+0x10/0x10
which was the result of two things:
When we call pci_scan_root_bus we would pass in 'sd' (sysdata)
pointer which was an 'pcifront_sd' structure. However in the
pci_device_add it expects that the 'sd' is 'struct sysdata' and
sets the dev->node to what is in sd->node (offset 4):
set_dev_node(&dev->dev, pcibus_to_node(bus));
__pcibus_to_node(const struct pci_bus *bus)
{
const struct pci_sysdata *sd = bus->sysdata;
return sd->node;
}
However our structure was pcifront_sd which had nothing at that
offset:
struct pcifront_sd {
int domain; /* 0 4 */
/* XXX 4 bytes hole, try to pack */
struct pcifront_device * pdev; /* 8 8 */
}
That is an hole - filled with garbage as we used kmalloc instead of
kzalloc (the second problem).
This patch fixes the issue by:
1) Use kzalloc to initialize to a well known state.
2) Put 'struct pci_sysdata' at the start of 'pcifront_sd'. That
way access to the 'node' will access the right offset.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Al Viro [Sun, 28 Feb 2016 00:17:33 +0000 (19:17 -0500)]
do_last(): don't let a bogus return value from ->open() et.al. to confuse us
commit
c80567c82ae4814a41287618e315a60ecf513be6 upstream.
... into returning a positive to path_openat(), which would interpret that
as "symlink had been encountered" and proceed to corrupt memory, etc.
It can only happen due to a bug in some ->open() instance or in some LSM
hook, etc., so we report any such event *and* make sure it doesn't trick
us into further unpleasantness.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Simon Guinot [Wed, 9 Sep 2015 22:15:18 +0000 (00:15 +0200)]
kernel/resource.c: fix muxed resource handling in __request_region()
commit
59ceeaaf355fa0fb16558ef7c24413c804932ada upstream.
In __request_region, if a conflict with a BUSY and MUXED resource is
detected, then the caller goes to sleep and waits for the resource to be
released. A pointer on the conflicting resource is kept. At wake-up
this pointer is used as a parent to retry to request the region.
A first problem is that this pointer might well be invalid (if for
example the conflicting resource have already been freed). Another
problem is that the next call to __request_region() fails to detect a
remaining conflict. The previously conflicting resource is passed as a
parameter and __request_region() will look for a conflict among the
children of this resource and not at the resource itself. It is likely
to succeed anyway, even if there is still a conflict.
Instead, the parent of the conflicting resource should be passed to
__request_region().
As a fix, this patch doesn't update the parent resource pointer in the
case we have to wait for a muxed region right after.
Reported-and-tested-by: Vincent Pelletier <plr.vincent@gmail.com>
Signed-off-by: Simon Guinot <simon.guinot@sequanux.org>
Tested-by: Vincent Donnefort <vdonnefort@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Stefan Hajnoczi [Thu, 18 Feb 2016 18:55:54 +0000 (18:55 +0000)]
sunrpc/cache: fix off-by-one in qword_get()
commit
b7052cd7bcf3c1478796e93e3dff2b44c9e82943 upstream.
The qword_get() function NUL-terminates its output buffer. If the input
string is in hex format \xXXXX... and the same length as the output
buffer, there is an off-by-one:
int qword_get(char **bpp, char *dest, int bufsize)
{
...
while (len < bufsize) {
...
*dest++ = (h << 4) | l;
len++;
}
...
*dest = '\0';
return len;
}
This patch ensures the NUL terminator doesn't fall outside the output
buffer.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Steven Rostedt (Red Hat) [Wed, 24 Feb 2016 14:04:24 +0000 (09:04 -0500)]
tracing: Fix showing function event in available_events
commit
d045437a169f899dfb0f6f7ede24cc042543ced9 upstream.
The ftrace:function event is only displayed for parsing the function tracer
data. It is not used to enable function tracing, and does not include an
"enable" file in its event directory.
Originally, this event was kept separate from other events because it did
not have a ->reg parameter. But perf added a "reg" parameter for its use
which caused issues, because it made the event available to functions where
it was not compatible for.
Commit
9b63776fa3ca9 "tracing: Do not enable function event with enable"
added a TRACE_EVENT_FL_IGNORE_ENABLE flag that prevented the function event
from being enabled by normal trace events. But this commit missed keeping
the function event from being displayed by the "available_events" directory,
which is used to show what events can be enabled by set_event.
One documented way to enable all events is to:
cat available_events > set_event
But because the function event is displayed in the available_events, this
now causes an INVALID error:
cat: write error: Invalid argument
Reported-by: Chunyu Hu <chuhu@redhat.com>
Fixes: 9b63776fa3ca9 "tracing: Do not enable function event with enable"
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Christian Borntraeger [Fri, 19 Feb 2016 12:11:46 +0000 (13:11 +0100)]
KVM: async_pf: do not warn on page allocation failures
commit
d7444794a02ff655eda87e3cc54e86b940e7736f upstream.
In async_pf we try to allocate with NOWAIT to get an element quickly
or fail. This code also handle failures gracefully. Lets silence
potential page allocation failures under load.
qemu-system-s39: page allocation failure: order:0,mode:0x2200000
[...]
Call Trace:
([<
00000000001146b8>] show_trace+0xf8/0x148)
[<
000000000011476a>] show_stack+0x62/0xe8
[<
00000000004a36b8>] dump_stack+0x70/0x98
[<
0000000000272c3a>] warn_alloc_failed+0xd2/0x148
[<
000000000027709e>] __alloc_pages_nodemask+0x94e/0xb38
[<
00000000002cd36a>] new_slab+0x382/0x400
[<
00000000002cf7ac>] ___slab_alloc.constprop.30+0x2dc/0x378
[<
00000000002d03d0>] kmem_cache_alloc+0x160/0x1d0
[<
0000000000133db4>] kvm_setup_async_pf+0x6c/0x198
[<
000000000013dee8>] kvm_arch_vcpu_ioctl_run+0xd48/0xd58
[<
000000000012fcaa>] kvm_vcpu_ioctl+0x372/0x690
[<
00000000002f66f6>] do_vfs_ioctl+0x3be/0x510
[<
00000000002f68ec>] SyS_ioctl+0xa4/0xb8
[<
0000000000781c5e>] system_call+0xd6/0x264
[<
000003ffa24fa06a>] 0x3ffa24fa06a
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Benjamin Coddington [Wed, 17 Feb 2016 15:41:41 +0000 (10:41 -0500)]
NFSv4: Fix a dentry leak on alias use
commit
d9dfd8d741683347ee159d25f5b50c346a0df557 upstream.
In the case where d_add_unique() finds an appropriate alias to use it will
have already incremented the reference count. An additional dget() to swap
the open context's dentry is unnecessary and will leak a reference.
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Fixes: 275bb307865a3 ("NFSv4: Move dentry instantiation into the NFSv4-...")
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Christoph Hellwig [Mon, 8 Feb 2016 20:11:50 +0000 (21:11 +0100)]
nfs: fix nfs_size_to_loff_t
commit
50ab8ec74a153eb30db26529088bc57dd700b24c upstream.
See http: //www.infradead.org/rpr.html
X-Evolution-Source:
1451162204.2173.11@leira.trondhjem.org
Content-Transfer-Encoding: 8bit
Mime-Version: 1.0
We support OFFSET_MAX just fine, so don't round down below it. Also
switch to using min_t to make the helper more readable.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Fixes: 433c92379d9c ("NFS: Clean up nfs_size_to_loff_t()")
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>