git.fsl.cs.sunysb.edu Git - wrapfs-3.14.y.git/log

powerpc/MSI: Fix race condition in tearing down MSI interrupts

[ Upstream commit e297c939b745e420ef0b9dc989cb87bda617b399 ]

This fixes a race which can result in the same virtual IRQ number
being assigned to two different MSI interrupts.  The most visible
consequence of that is usually a warning and stack trace from the
sysfs code about an attempt to create a duplicate entry in sysfs.

The race happens when one CPU (say CPU 0) is disposing of an MSI
while another CPU (say CPU 1) is setting up an MSI.  CPU 0 calls
(for example) pnv_teardown_msi_irqs(), which calls
msi_bitmap_free_hwirqs() to indicate that the MSI (i.e. its
hardware IRQ number) is no longer in use.  Then, before CPU 0 gets
to calling irq_dispose_mapping() to free up the virtal IRQ number,
CPU 1 comes in and calls msi_bitmap_alloc_hwirqs() to allocate an
MSI, and gets the same hardware IRQ number that CPU 0 just freed.
CPU 1 then calls irq_create_mapping() to get a virtual IRQ number,
which sees that there is currently a mapping for that hardware IRQ
number and returns the corresponding virtual IRQ number (which is
the same virtual IRQ number that CPU 0 was using).  CPU 0 then
calls irq_dispose_mapping() and frees that virtual IRQ number.
Now, if another CPU comes along and calls irq_create_mapping(), it
is likely to get the virtual IRQ number that was just freed,
resulting in the same virtual IRQ number apparently being used for
two different hardware interrupts.

To fix this race, we just move the call to msi_bitmap_free_hwirqs()
to after the call to irq_dispose_mapping().  Since virq_to_hw()
doesn't work for the virtual IRQ number after irq_dispose_mapping()
has been called, we need to call it before irq_dispose_mapping() and
remember the result for the msi_bitmap_free_hwirqs() call.

The pattern of calling msi_bitmap_free_hwirqs() before
irq_dispose_mapping() appears in 5 places under arch/powerpc, and
appears to have originated in commit 05af7bd2d75e ("[POWERPC] MPIC
U3/U4 MSI backend") from 2007.

Fixes: 05af7bd2d75e ("[POWERPC] MPIC U3/U4 MSI backend")
Cc: stable@vger.kernel.org # v2.6.22+
Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

tools lib traceevent: Fix string handling in heterogeneous arch environments

[ Upstream commit c2e4b24ff848bb180f9b9cd873a38327cd219ad2 ]

When a trace recorded on a 32-bit device is processed with a 64-bit
binary, the higher 32-bits of the address need to ignored.

The lack of this results in the output of the 64-bit pointer
value to the trace as the 32-bit address lookup fails in find_printk().

Before:

burn-1778 [003] 548.600305: bputs: 0xc0046db2s: 2cec5c058d98c

After:

burn-1778 [003] 548.600305: bputs: 0xc0046db2s: RT throttling activated

The problem occurs in PRINT_FIELD when the field is recognized as a
pointer to a string (of the type const char *)

Heterogeneous architectures cases below can arise and should be handled:

* Traces recorded using 32-bit addresses processed on a 64-bit machine
* Traces recorded using 64-bit addresses processed on a 32-bit machine

Reported-by: Juri Lelli <juri.lelli@arm.com>
Signed-off-by: Kapileshwar Singh <kapileshwar.singh@arm.com>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Javi Merino <javi.merino@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1442928123-13824-1-git-send-email-kapileshwar.singh@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

batman-adv: Fix potentially broken skb network header access

[ Upstream commit 53cf037bf846417fd92dc92ddf97267f69b110f4 ]

The two commits noted below added calls to ip_hdr() and ipv6_hdr(). They
need a correctly set skb network header.

Unfortunately we cannot rely on the device drivers to set it for us.
Therefore setting it in the beginning of the according ndo_start_xmit
handler.

Fixes: 1d8ab8d3c176 ("batman-adv: Modified forwarding behaviour for multicast packets")
Fixes: ab49886e3da7 ("batman-adv: Add IPv4 link-local/IPv6-ll-all-nodes multicast support")
Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

batman-adv: Make TT capability changes atomic

[ Upstream commit ac4eebd48461ec993e7cb614d5afe7df8c72e6b7 ]

Bitwise OR/AND assignments in C aren't guaranteed to be atomic. One
OGM handler might undo the set/clear of a specific bit from another
handler run in between.

Fix this by using the atomic set_bit()/clear_bit()/test_bit() functions.

Fixes: e17931d1a61d ("batman-adv: introduce capability initialization bitfield")
Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

batman-adv: Make NC capability changes atomic

[ Upstream commit 4635469f5c617282f18c69643af36cd8c0acf707 ]

Bitwise OR/AND assignments in C aren't guaranteed to be atomic. One
OGM handler might undo the set/clear of a specific bit from another
handler run in between.

Fix this by using the atomic set_bit()/clear_bit()/test_bit() functions.

Fixes: 3f4841ffb336 ("batman-adv: tvlv - add network coding container")
Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

MIPS: dma-default: Fix 32-bit fall back to GFP_DMA

[ Upstream commit 53960059d56ecef67d4ddd546731623641a3d2d1 ]

If there is a DMA zone (usually 24bit = 16MB I believe), but no DMA32
zone, as is the case for some 32-bit kernels, then massage_gfp_flags()
will cause DMA memory allocated for devices with a 32..63-bit
coherent_dma_mask to fall back to using __GFP_DMA, even though there may
only be 32-bits of physical address available anyway.

Correct that case to compare against a mask the size of phys_addr_t
instead of always using a 64-bit mask.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Fixes: a2e715a86c6d ("MIPS: DMA: Fix computation of DMA flags from device's coherent_dma_mask.")
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: <stable@vger.kernel.org> # 2.6.36+
Patchwork: https://patchwork.linux-mips.org/patch/9610/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

cpufreq: dt: Tolerance applies on both sides of target voltage

[ Upstream commit a2022001cebd0825b96aa0f3345ea3ad44ae79d4 ]

Tolerance applies on both sides of the target voltage, i.e. both min and
max sides. But while checking if a voltage is supported by the regulator
or not, we haven't taken care of tolerance on the lower side. Fix that.

Cc: Lucas Stach <l.stach@pengutronix.de>
Fixes: 045ee45c4ff2 ("cpufreq: cpufreq-dt: disable unsupported OPPs")
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

USB: Add reset-resume quirk for two Plantronics usb headphones.

[ Upstream commit 8484bf2981b3d006426ac052a3642c9ce1d8d980 ]

These two headphones need a reset-resume quirk to properly resume to
original volume level.

Signed-off-by: Yao-Wen Mao <yaowen@google.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

usb: Add device quirk for Logitech PTZ cameras

[ Upstream commit 72194739f54607bbf8cfded159627a2015381557 ]

Add a device quirk for the Logitech PTZ Pro Camera and its sibling the
ConferenceCam CC3000e Camera.
This fixes the failed camera enumeration on some boot, particularly on
machines with fast CPU.

Tested by connecting a Logitech PTZ Pro Camera to a machine with a
Haswell Core i7-4600U CPU @ 2.10GHz, and doing thousands of reboot cycles
while recording the kernel logs and taking camera picture after each boot.
Before the patch, more than 7% of the boots show some enumeration transfer
failures and in a few of them, the kernel is giving up before actually
enumerating the webcam. After the patch, the enumeration has been correct
on every reboot.

Signed-off-by: Vincent Palatin <vpalatin@chromium.org>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

usb: musb: cppi41: allow it to work again

[ Upstream commit b0a688ddcc5015eb26000c63841db7c46cfb380a ]

since commit 33c300cb90a6 ("usb: musb: dsps:
don't fake of_node to musb core") we have been
preventing CPPI 4.1 from probing due to NULL
of_node. We can't revert said commit otherwise
a different regression would show up, so the fix
is to look for the parent device's (glue layer's)
of_node instead, since that's the thing which
is actually described in DTS.

Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

usb: Use the USB_SS_MULT() macro to get the burst multiplier.

[ Upstream commit ff30cbc8da425754e8ab96904db1d295bd034f27 ]

Bits 1:0 of the bmAttributes are used for the burst multiplier.
The rest of the bits used to be reserved (zero), but USB3.1 takes bit 7
into use.

Use the existing USB_SS_MULT() macro instead to make sure the mult value
and hence max packet calculations are correct for USB3.1 devices.

Note that burst multiplier in bmAttributes is zero based and that
the USB_SS_MULT() macro adds one.

Cc: <stable@vger.kernel.org>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

usb: chipidea: udc: using the correct stall implementation

[ Upstream commit 56ffa1d154c7e12af16273f0cdc42690dd05caf5 ]

According to spec, there are functional and protocol stalls.

For functional stall, it is for bulk and interrupt endpoints,
below are cases for it:
- Host sends SET_FEATURE request for Set-Halt, the udc driver
needs to set stall, and return true unconditionally.
- The gadget driver may call usb_ep_set_halt to stall certain
endpoints, if there is a transfer in pending, the udc driver
should not set stall, and return -EAGAIN accordingly.
These two kinds of stall need to be cleared by host using CLEAR_FEATURE
request (Clear-Halt).

For protocol stall, it is for control endpoint, this stall will
be set if the control request has failed. This stall will be
cleared by next setup request (hardware will do it).

It fixed usbtest (drivers/usb/misc/usbtest.c) Test 13 "set/clear halt"
test failure, meanwhile, this change has been verified by
USB2 CV Compliance Test and MSC Tests.

Cc: <stable@vger.kernel.org> #3.10+
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: Felipe Balbi <balbi@ti.com>
Signed-off-by: Peter Chen <peter.chen@freescale.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

security: fix typo in security_task_prctl

[ Upstream commit b7f76ea2ef6739ee484a165ffbac98deb855d3d3 ]

Signed-off-by: Jann Horn <jann@thejh.net>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

regmap: debugfs: Don't bother actually printing when calculating max length

[ Upstream commit 176fc2d5770a0990eebff903ba680d2edd32e718 ]

The in kernel snprintf() will conveniently return the actual length of
the printed string even if not given an output beffer at all so just do
that rather than relying on the user to pass in a suitable buffer,
ensuring that we don't need to worry if the buffer was truncated due to
the size of the buffer passed in.

Reported-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

regmap: debugfs: Ensure we don't underflow when printing access masks

[ Upstream commit b763ec17ac762470eec5be8ebcc43e4f8b2c2b82 ]

If a read is attempted which is smaller than the line length then we may
underflow the subtraction we're doing with the unsigned size_t type so
move some of the calculation to be additions on the right hand side
instead in order to avoid this.

Reported-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

PM / AVS: rockchip-io: depend on CONFIG_POWER_AVS

[ Upstream commit 28c1f1628ee4b163e615eefe1b6463e3d229a873 ]

The rockchip io-domain driver currently only depends on ARCH_ROCKCHIP
itself. This makes it possible to select the power-domain driver, but
not the POWER_AVS class and results in the iodomain-driver not getting
build in this case.

So add the additional dependency, which also results in the driver
config option now being placed nicely into the AVS submenu.

Fixes: 662a958638bd ("PM / AVS: rockchip-io: add driver handling Rockchip io domains")
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Acked-by: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

mtd: pxa3xx_nand: add a default chunk size

[ Upstream commit bc3e00f04cc1fe033a289c2fc2e5c73c0168d360 ]

When keeping the configuration set by the bootloader (by using
the marvell,nand-keep-config property), the pxa3xx_nand_detect_config()
function is called and set the chunk size to 512 as a default value if
NDCR_PAGE_SZ is not set.

In the other case, when not keeping the bootloader configuration, no
chunk size is set. Fix this by adding a default chunk size of 512.

Fixes: 70ed85232a93 ("mtd: nand: pxa3xx: Introduce multiple page I/O
support")

Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Acked-by: Robert Jarzmik <robert.jarzmik@free>
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

docs: update HOWTO for 3.x -> 4.x versioning

[ Upstream commit e4144fe5d47c91c92d36cdbd5f31ed8d6e3a57ab ]

The HOWTO document needed updating for the new kernel versioning.

Signed-off-by: Mario Carrillo <mario.alfredo.c.arevalo@intel.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

cifs: use server timestamp for ntlmv2 authentication

[ Upstream commit 98ce94c8df762d413b3ecb849e2b966b21606d04 ]

Linux cifs mount with ntlmssp against an Mac OS X (Yosemite
10.10.5) share fails in case the clocks differ more than +/-2h:

digest-service: digest-request: od failed with 2 proto=ntlmv2
digest-service: digest-request: kdc failed with -1561745592 proto=ntlmv2

Fix this by (re-)using the given server timestamp for the
ntlmv2 authentication (as Windows 7 does).

A related problem was also reported earlier by Namjae Jaen (see below):

Windows machine has extended security feature which refuse to allow
authentication when there is time difference between server time and
client time when ntlmv2 negotiation is used. This problem is prevalent
in embedded enviornment where system time is set to default 1970.

Modern servers send the server timestamp in the TargetInfo Av_Pair
structure in the challenge message [see MS-NLMP 2.2.2.1]
In [MS-NLMP 3.1.5.1.2] it is explicitly mentioned that the client must
use the server provided timestamp if present OR current time if it is
not

Reported-by: Namjae Jeon <namjae.jeon@samsung.com>
Signed-off-by: Peter Seiderer <ps.report@gmx.net>
Signed-off-by: Steve French <smfrench@gmail.com>
CC: Stable <stable@vger.kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

dts: imx25: fix sd card gpio polarity specified in device tree

[ Upstream commit cf75eb15be2bdd054a76aeef6458beaa4a6ab770 ]

cd-gpios polarity should be changed to GPIO_ACTIVE_LOW and wp-gpios
should be changed to GPIO_ACTIVE_HIGH.
Otherwise, the SD may not work properly due to wrong polarity inversion
specified in DT after switch to common parsing function mmc_of_parse().

Signed-off-by: Dong Aisheng <aisheng.dong@freescale.com>
Acked-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

dts: imx53: fix sd card gpio polarity specified in device tree

[ Upstream commit 94d76946859b4bcfa0da373357f14feda2af0622 ]

cd-gpios polarity should be changed to GPIO_ACTIVE_LOW and wp-gpios
should be changed to GPIO_ACTIVE_HIGH.
Otherwise, the SD may not work properly due to wrong polarity inversion
specified in DT after switch to common parsing function mmc_of_parse().

Signed-off-by: Dong Aisheng <aisheng.dong@freescale.com>
Acked-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

dts: imx51: fix sd card gpio polarity specified in device tree

[ Upstream commit aca45c0e95dad1c4ba4d38da192756b0e10cbbbd ]

cd-gpios polarity should be changed to GPIO_ACTIVE_LOW and wp-gpios
should be changed to GPIO_ACTIVE_HIGH.
Otherwise, the SD may not work properly due to wrong polarity inversion
specified in DT after switch to common parsing function mmc_of_parse().

Signed-off-by: Dong Aisheng <aisheng.dong@freescale.com>
Acked-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

batman-adv: Make DAT capability changes atomic

[ Upstream commit 65d7d46050704bcdb8121ddbf4110bfbf2b38baa ]

Bitwise OR/AND assignments in C aren't guaranteed to be atomic. One
OGM handler might undo the set/clear of a specific bit from another
handler run in between.

Fix this by using the atomic set_bit()/clear_bit()/test_bit() functions.

Fixes: 17cf0ea455f1 ("batman-adv: tvlv - add distributed arp table container")
Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

batman-adv: protect tt_local_entry from concurrent delete events

[ Upstream commit ef72706a0543d0c3a5ab29bd6378fdfb368118d9 ]

The tt_local_entry deletion performed in batadv_tt_local_remove() was neither
protecting against simultaneous deletes nor checking whether the element was
still part of the list before calling hlist_del_rcu().

Replacing the hlist_del_rcu() call with batadv_hash_remove() provides adequate
protection via hash spinlocks as well as an is-element-still-in-hash check to
avoid 'blind' hash removal.

Fixes: 068ee6e204e1 ("batman-adv: roaming handling mechanism redesign")
Reported-by: alfonsname@web.de
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

fbdev: select versatile helpers for the integrator

[ Upstream commit 2701fa0864ecb9e49d47a4aa1c02f172ab79639a ]

Commit 11c32d7b6274cb0f554943d65bd4a126c4a86dcd
"video: move Versatile CLCD helpers" missed the fact
that the Integrator/CP is also using the helper, and
as a result the platform got only stubs and no graphics.
Add this as a default selection to Kconfig so we have
graphics again.

Fixes: 11c32d7b6274 (video: move Versatile CLCD helpers)
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ti.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

ipvs: fix crash with sync protocol v0 and FTP

[ Upstream commit 56184858d1fc95c46723436b455cb7261cd8be6f ]

Fix crash in 3.5+ if FTP is used after switching
sync_version to 0.

Fixes: 749c42b620a9 ("ipvs: reduce sync rate with time thresholds")
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

ipvs: skb_orphan in case of forwarding

[ Upstream commit 71563f3414e917c62acd8e0fb0edf8ed6af63e4b ]

It is possible that we bind against a local socket in early_demux when we
are actually going to want to forward it. In this case, the socket serves
no purpose and only serves to confuse things (particularly functions which
implicitly expect sk_fullsock to be true, like ip_local_out).
Additionally, skb_set_owner_w is totally broken for non full-socks.

Signed-off-by: Alex Gartrell <agartrell@fb.com>
Fixes: 41063e9dd119 ("ipv4: Early TCP socket demux.")
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

ipvs: fix crash if scheduler is changed

[ Upstream commit 05f00505a89acd21f5d0d20f5797dfbc4cf85243 ]

I overlooked the svc->sched_data usage from schedulers
when the services were converted to RCU in 3.10. Now
the rare ipvsadm -E command can change the scheduler
but due to the reverse order of ip_vs_bind_scheduler
and ip_vs_unbind_scheduler we provide new sched_data
to the old scheduler resulting in a crash.

To fix it without changing the scheduler methods we
have to use synchronize_rcu() only for the editing case.
It means all svc->scheduler readers should expect a
NULL value. To avoid breakage for the service listing
and ipvsadm -R we can use the "none" name to indicate
that scheduler is not assigned, a state when we drop
new connections.

Reported-by: Alexander Vasiliev <a.vasylev@404-group.com>
Fixes: ceec4c381681 ("ipvs: convert services to rcu")
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

ipvs: do not use random local source address for tunnels

[ Upstream commit 4754957f04f5f368792a0eb7dab0ae89fb93dcfd ]

Michael Vallaly reports about wrong source address used
in rare cases for tunneled traffic. Looks like
__ip_vs_get_out_rt in 3.10+ is providing uninitialized
dest_dst->dst_saddr.ip because ip_vs_dest_dst_alloc uses
kmalloc. While we retry after seeing EINVAL from routing
for data that does not look like valid local address, it
still succeeded when this memory was previously used from
other dests and with different local addresses. As result,
we can use valid local address that is not suitable for
our real server.

Fix it by providing 0.0.0.0 every time our cache is refreshed.
By this way we will get preferred source address from routing.

Reported-by: Michael Vallaly <lvs@nolatency.com>
Fixes: 026ace060dfe ("ipvs: optimize dst usage for real server")
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

sched/fair: Prevent throttling in early pick_next_task_fair()

[ Upstream commit 54d27365cae88fbcc853b391dcd561e71acb81fa ]

The optimized task selection logic optimistically selects a new task
to run without first doing a full put_prev_task(). This is so that we
can avoid a put/set on the common ancestors of the old and new task.

Similarly, we should only call check_cfs_rq_runtime() to throttle
eligible groups if they're part of the common ancestry, otherwise it
is possible to end up with no eligible task in the simple task
selection.

Imagine:
/root
/prev /next
/A /B

If our optimistic selection ends up throttling /next, we goto simple
and our put_prev_task() ends up throttling /prev, after which we're
going to bug out in set_next_entity() because there aren't any tasks
left.

Avoid this scenario by only throttling common ancestors.

Reported-by: Mohammed Naser <mnaser@vexxhost.com>
Reported-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: Ben Segall <bsegall@google.com>
[ munged Changelog ]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Roman Gushchin <klamm@yandex-team.ru>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: pjt@google.com
Fixes: 678d5718d8d0 ("sched/fair: Optimize cgroup pick_next_task_fair()")
Link: http://lkml.kernel.org/r/xm26wq1oswoq.fsf@sword-of-the-dawn.mtv.corp.google.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

Initialize msg/shm IPC objects before doing ipc_addid()

[ Upstream commit b9a532277938798b53178d5a66af6e2915cb27cf ]

As reported by Dmitry Vyukov, we really shouldn't do ipc_addid() before
having initialized the IPC object state. Yes, we initialize the IPC
object in a locked state, but with all the lockless RCU lookup work,
that IPC object lock no longer means that the state cannot be seen.

We already did this for the IPC semaphore code (see commit e8577d1f0329:
"ipc/sem.c: fully initialize sem_array before making it visible") but we
clearly forgot about msg and shm.

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

usb: xhci: Add support for URB_ZERO_PACKET to bulk/sg transfers

[ Upstream commit 4758dcd19a7d9ba9610b38fecb93f65f56f86346 ]

This commit checks for the URB_ZERO_PACKET flag and creates an extra
zero-length td if the urb transfer length is a multiple of the endpoint's
max packet length.

Signed-off-by: Reyad Attiyat <reyad.attiyat@gmail.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

xhci: init command timeout timer earlier to avoid deleting it uninitialized

[ Upstream commit cc8e4fc0c3b5e8340bc8358990515d116a3c274c ]

Don't check if timer is running with a timer_pending() before
deleting it with del_timer_sync(), this defies the whole point of
the sync part and can cause a possible race.

Instead we just want to make sure the timer is initialized early enough
before we have a chance to delete it.

Cc: <stable@vger.kernel.org>
Reported-by: Oliver Neukum <oneukum@suse.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

xhci: change xhci 1.0 only restrictions to support xhci 1.1

[ Upstream commit dca7794539eff04b786fb6907186989e5eaaa9c2 ]

Some changes between xhci 0.96 and xhci 1.0 specifications forced us to
check the hci version in code, some of these checks were implemented as
hci_version == 1.0, which will not work with new xhci 1.1 controllers.

xhci 1.1 behaves similar to xhci 1.0 in these cases, so change these
checks to hci_version >= 1.0

Cc: <stable@vger.kernel.org>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

usb: xhci: exit early in xhci_setup_device() if we're halted or dying

[ Upstream commit 448116bfa856d3c076fa7178ed96661a008a5d45 ]

During quick plug/removal of OTG adapter during dual-role testing
it can happen that xhci_alloc_device() is called for the newly
detected device after the DRD library has called xhci_stop to
remove the HCD.

If that is the case, just fail early to prevent the following warning.

[  154.732649] hub 4-0:1.0: USB hub found
[  154.742204] hub 4-0:1.0: 1 port detected
[  154.824458] hub 3-0:1.0: state 7 ports 1 chg 0002 evt 0000
[  154.854609] hub 4-0:1.0: state 7 ports 1 chg 0000 evt 0000
[  154.944430] usb 3-1: new high-speed USB device number 2 using xhci-hcd
[  154.951009] xhci-hcd xhci-hcd.0.auto: xhci_setup_device
[  155.038191] xhci-hcd xhci-hcd.0.auto: remove, state 4
[  155.043315] usb usb4: USB disconnect, device number 1
[  155.055270] xhci-hcd xhci-hcd.0.auto: xhci_stop
[  155.060094] xhci-hcd xhci-hcd.0.auto: USB bus 4 deregistered
[  155.066576] xhci-hcd xhci-hcd.0.auto: remove, state 1
[  155.071710] usb usb3: USB disconnect, device number 1
[  155.077124] xhci-hcd xhci-hcd.0.auto: xhci_setup_device
[  155.082389] ------------[ cut here ]------------
[  155.087690] WARNING: CPU: 0 PID: 72 at drivers/usb/host/xhci.c:3800 xhci_setup_device+0x410/0x484 [xhci_hcd]()
[  155.097861] Modules linked in: sd_mod usb_storage scsi_mod usb_f_ss_lb g_zero libcomposite ipv6 xhci_plat_hcd xhci_hcd usbcore dwc3 udc_core evdev ti_am335x_adc joydev kfifo_buf industrialio snd_soc_simple_cc
[  155.146734] CPU: 0 PID: 72 Comm: kworker/0:3 Tainted: G        W       4.1.4-00834-gcd9380b-dirty #50
[  155.156073] Hardware name: Generic AM43 (Flattened Device Tree)
[  155.162117] Workqueue: usb_hub_wq hub_event [usbcore]
[  155.167249] Backtrace:
[  155.169751] [<c0012af0>] (dump_backtrace) from [<c0012c8c>] (show_stack+0x18/0x1c)
[  155.177390]  r6:c089d4a4 r5:ffffffff r4:00000000 r3:ee46c000
[  155.183137] [<c0012c74>] (show_stack) from [<c05f7c14>] (dump_stack+0x84/0xd0)
[  155.190446] [<c05f7b90>] (dump_stack) from [<c00439ac>] (warn_slowpath_common+0x80/0xbc)
[  155.198605]  r7:00000009 r6:00000ed8 r5:bf27eb70 r4:00000000
[  155.204348] [<c004392c>] (warn_slowpath_common) from [<c0043a0c>] (warn_slowpath_null+0x24/0x2c)
[  155.213202]  r8:ee49f000 r7:ee7c0004 r6:00000000 r5:ee7c0158 r4:ee7c0000
[  155.220051] [<c00439e8>] (warn_slowpath_null) from [<bf27eb70>] (xhci_setup_device+0x410/0x484 [xhci_hcd])
[  155.229816] [<bf27e760>] (xhci_setup_device [xhci_hcd]) from [<bf27ec10>] (xhci_address_device+0x14/0x18 [xhci_hcd])
[  155.240415]  r10:ee598200 r9:00000001 r8:00000002 r7:00000001 r6:00000003 r5:00000002
[  155.248363]  r4:ee49f000
[  155.250978] [<bf27ebfc>] (xhci_address_device [xhci_hcd]) from [<bf20cb94>] (hub_port_init+0x1b8/0xa9c [usbcore])
[  155.261403] [<bf20c9dc>] (hub_port_init [usbcore]) from [<bf2101e0>] (hub_event+0x738/0x1020 [usbcore])
[  155.270874]  r10:ee598200 r9:ee7c0000 r8:ee7c0038 r7:ee518800 r6:ee49f000 r5:00000001
[  155.278822]  r4:00000000
[  155.281426] [<bf20faa8>] (hub_event [usbcore]) from [<c005754c>] (process_one_work+0x128/0x340)
[  155.290196]  r10:00000000 r9:00000003 r8:00000000 r7:fedfa000 r6:eeec5400 r5:ee598314
[  155.298151]  r4:ee434380
[  155.300718] [<c0057424>] (process_one_work) from [<c00578f8>] (worker_thread+0x158/0x49c)
[  155.308963]  r10:ee434380 r9:00000003 r8:eeec5400 r7:00000008 r6:ee434398 r5:eeec5400
[  155.316913]  r4:eeec5414
[  155.319482] [<c00577a0>] (worker_thread) from [<c005cc40>] (kthread+0xdc/0xf8)
[  155.326765]  r10:00000000 r9:00000000 r8:00000000 r7:c00577a0 r6:ee434380 r5:ee4441c0
[  155.334713]  r4:00000000 r3:00000000
[  155.338341] [<c005cb64>] (kthread) from [<c000fc08>] (ret_from_fork+0x14/0x2c)
[  155.345626]  r7:00000000 r6:00000000 r5:c005cb64 r4:ee4441c0
[  155.356108] ---[ end trace a58d34c223b190e6 ]---
[  155.360783] xhci-hcd xhci-hcd.0.auto: Virt dev invalid for slot_id 0x1!
[  155.574404] xhci-hcd xhci-hcd.0.auto: xhci_setup_device
[  155.579667] ------------[ cut here ]------------

Cc: <stable@vger.kernel.org>
Signed-off-by: Roger Quadros <rogerq@ti.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

usb: xhci: Clear XHCI_STATE_DYING on start

[ Upstream commit e5bfeab0ad515b4f6df39fe716603e9dc6d3dfd0 ]

For whatever reason if XHCI died in the previous instant
then it will never recover on the next xhci_start unless we
clear the DYING flag.

Cc: <stable@vger.kernel.org>
Signed-off-by: Roger Quadros <rogerq@ti.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

USB: whiteheat: fix potential null-deref at probe

[ Upstream commit cbb4be652d374f64661137756b8f357a1827d6a4 ]

Fix potential null-pointer dereference at probe by making sure that the
required endpoints are present.

The whiteheat driver assumes there are at least five pairs of bulk
endpoints, of which the final pair is used for the "command port". An
attempt to bind to an interface with fewer bulk endpoints would
currently lead to an oops.

Fixes CVE-2015-5257.

Reported-by: Moein Ghasemzadeh <moein@istuary.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

drm/amdgpu: Restore LCD backlight level on resume

[ Upstream commit 74b3112e95073b351e3b0b9799795bc76f8415fa ]

Instead of only enabling the backlight (which seems to set it to max
brightness), just re-set the current backlight level, which also takes
care of enabling the backlight if necessary.

Port of radeon commit:
drm/radeon: Restore LCD backlight level on resume (>= R5xx)

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

drm: Reject DRI1 hw lock ioctl functions for kms drivers

[ Upstream commit da168d81b44898404d281d5dbe70154ab5f117c1 ]

I've done some extensive history digging across libdrm, mesa and
xf86-video-{intel,nouveau,ati}. The only potential user of this with
kms drivers I could find was ttmtest, which once used drmGetLock
still. But that mistake was quickly fixed up. Even the intel xvmc
library (which otherwise was really good with using dri1 stuff in kms
mode) managed to never take the hw lock for dri2 (and hence kms).

Hence it should be save to unconditionally disallow this.

Cc: Peter Antoine <peter.antoine@intel.com>
Reviewed-by: Peter Antoine <peter.antoine@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

drm/i915/bios: handle MIPI Sequence Block v3+ gracefully

[ Upstream commit cd67d226ebd909d239d2c6e5a6abd6e2a338d1cd ]

The VBT MIPI Sequence Block version 3 has forward incompatible changes:

First, the block size in the header has been specified reserved, and the
actual size is a separate 32-bit value within the block. The current
find_section() function to will only look at the size in the block
header, and, depending on what's in that now reserved size field,
continue looking for other sections in the wrong place.

Fix this by taking the new block size field into account. This will
ensure that the lookups for other sections will work properly, as long
as the new 32-bit size does not go beyond the opregion VBT mailbox size.

Second, the contents of the block have been completely
changed. Gracefully refuse parsing the yet unknown data version.

Cc: Deepak M <m.deepak@intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Deepak M <m.deepak@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

drm/qxl: recreate the primary surface when the bo is not primary

[ Upstream commit 8d0d94015e96b8853c4f7f06eac3f269e1b3d866 ]

When disabling/enabling a crtc the primary area must be updated
independently of which crtc has been disabled/enabled.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1264735

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

drm/qxl: only report first monitor as connected if we have no state

[ Upstream commit 69e5d3f893e19613486f300fd6e631810338aa4b ]

If the server isn't new enough to give us state, report the first
monitor as always connected, otherwise believe the server side.

Cc: stable@vger.kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

[SMB3] Do not fall back to SMBWriteX in set_file_size error cases

[ Upstream commit 646200a041203f440fb6fcf9cacd9efeda9de74c ]

The error paths in set_file_size for cifs and smb3 are incorrect.

In the unlikely event that a server did not support set file info
of the file size, the code incorrectly falls back to trying SMBWriteX
(note that only the original core SMB Write, used for example by DOS,
can set the file size this way - this actually  does not work for the more
recent SMBWriteX).  The idea was since the old DOS SMB Write could set
the file size if you write zero bytes at that offset then use that if
server rejects the normal set file info call.

Fortunately the SMBWriteX will never be sent on the wire (except when
file size is zero) since the length and offset fields were reversed
in the two places in this function that call SMBWriteX causing
the fall back path to return an error. It is also important to never call
an SMB request from an SMB2/sMB3 session (which theoretically would
be possible, and can cause a brief session drop, although the client
recovers) so this should be fixed.  In practice this path does not happen
with modern servers but the error fall back to SMBWriteX is clearly wrong.

Removing the calls to SMBWriteX in the error paths in cifs_set_file_size

Pointed out by PaX/grsecurity team

Signed-off-by: Steve French <steve.french@primarydata.com>
Reported-by: PaX Team <pageexec@freemail.hu>
CC: Emese Revfy <re.emese@gmail.com>
CC: Brad Spengler <spender@grsecurity.net>
CC: Stable <stable@vger.kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

disabling oplocks/leases via module parm enable_oplocks broken for SMB3

[ Upstream commit e0ddde9d44e37fbc21ce893553094ecf1a633ab5 ]

leases (oplocks) were always requested for SMB2/SMB3 even when oplocks
disabled in the cifs.ko module.

Signed-off-by: Steve French <steve.french@primarydata.com>
Reviewed-by: Chandrika Srinivasan <chandrika.srinivasan@citrix.com>
CC: Stable <stable@vger.kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

nfs: fix pg_test page count calculation

[ Upstream commit 048883e0b934d9a5103d40e209cb14b7f33d2933 ]

We really want sizeof(struct page *) instead. Otherwise we limit
maximum IO size to 64 pages rather than 512 pages on a 64bit system.

Fixes 2e11f829(nfs: cap request size to fit a kmalloced page array).

Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Peng Tao <tao.peng@primarydata.com>
Fixes: 2e11f8296d22 ("nfs: cap request size to fit a kmalloced page array")
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

netfilter: nf_log: don't zap all loggers on unregister

[ Upstream commit 205ee117d4dc4a11ac3bd9638bb9b2e839f4de9a ]

like nf_log_unset, nf_log_unregister must not reset the list of loggers.
Otherwise, a call to nf_log_unregister() will render loggers of other nf
protocols unusable:

iptables -A INPUT -j LOG
modprobe nf_log_arp ; rmmod nf_log_arp
iptables -A INPUT -j LOG
iptables: No chain/target/match by that name

Fixes: 30e0c6a6be ("netfilter: nf_log: prepare net namespace support for loggers")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

netfilter: nf_log: Introduce nft_log_dereference() macro

[ Upstream commit 0c26ed1c07f13ca27e2638ffdd1951013ed96c48 ]

Wrap up a common call pattern in an easier to handle call.

Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

netfilter: nft_compat: skip family comparison in case of NFPROTO_UNSPEC

[ Upstream commit ba378ca9c04a5fc1b2cf0f0274a9d02eb3d1bad9 ]

Fix lookup of existing match/target structures in the corresponding list
by skipping the family check if NFPROTO_UNSPEC is used.

This is resulting in the allocation and insertion of one match/target
structure for each use of them. So this not only bloats memory
consumption but also severely affects the time to reload the ruleset
from the iptables-compat utility.

After this patch, iptables-compat-restore and iptables-compat take
almost the same time to reload large rulesets.

Fixes: 0ca743a55991 ("netfilter: nf_tables: add compatibility layer for x_tables")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

netfilter: nf_log: wait for rcu grace after logger unregistration

[ Upstream commit ad5001cc7cdf9aaee5eb213fdee657e4a3c94776 ]

The nf_log_unregister() function needs to call synchronize_rcu() to make sure
that the objects are not dereferenced anymore on module removal.

Fixes: 5962815a6a56 ("netfilter: nf_log: use an array of loggers instead of list")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

netfilter: ctnetlink: put back references to master ct and expect objects

[ Upstream commit 95dd8653de658143770cb0e55a58d2aab97c79d2 ]

We have to put back the references to the master conntrack and the expectation
that we just created, otherwise we'll leak them.

Fixes: 0ef71ee1a5b9 ("netfilter: ctnetlink: refactor ctnetlink_create_expect")
Reported-by: Tim Wiess <Tim.Wiess@watchguard.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

netfilter: nf_conntrack: Support expectations in different zones

[ Upstream commit 4b31814d20cbe5cd4ccf18089751e77a04afe4f2 ]

When zones were originally introduced, the expectation functions were
all extended to perform lookup using the zone. However, insertion was
not modified to check the zone. This means that two expectations which
are intended to apply for different connections that have the same tuple
but exist in different zones cannot both be tracked.

Fixes: 5d0aa2ccd4 (netfilter: nf_conntrack: add support for "conntrack zones")
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

netfilter: nfnetlink: work around wrong endianess in res_id field

[ Upstream commit a9de9777d613500b089a7416f936bf3ae5f070d2 ]

The convention in nfnetlink is to use network byte order in every header field
as well as in the attribute payload. The initial version of the batching
infrastructure assumes that res_id comes in host byte order though.

The only client of the batching infrastructure is nf_tables, so let's add a
workaround to address this inconsistency. We currently have 11 nfnetlink
subsystems according to NFNL_SUBSYS_COUNT, so we can assume that the subsystem
2560, ie. htons(10), will not be allocated anytime soon, so it can be an alias
of nf_tables from the nfnetlink batching path when interpreting the res_id
field.

Based on original patch from Florian Westphal.

Reported-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

dm raid: fix round up of default region size

[ Upstream commit 042745ee53a0a7c1f5aff191a4a24213c6dcfb52 ]

Commit 3a0f9aaee028 ("dm raid: round region_size to power of two")
intended to make sure that the default region size is a power of two.
However, the logic in that commit is incorrect and sets the variable
region_size to 0 or 1, depending on whether min_region_size is a power
of two.

Fix this logic, using roundup_pow_of_two(), so that region_size is
properly rounded up to the next power of two.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Fixes: 3a0f9aaee028 ("dm raid: round region_size to power of two")
Cc: stable@vger.kernel.org # v3.8+
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

USB: option: add ZTE PIDs

[ Upstream commit 19ab6bc5674a30fdb6a2436b068d19a3c17dc73e ]

This is intended to add ZTE device PIDs on kernel.

Signed-off-by: Liu.Zhao <lzsos369@163.com>
Cc: stable <stable@vger.kernel.org>
[johan: sort the new entries ]
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

staging: ion: fix corruption of ion_import_dma_buf

[ Upstream commit 6fa92e2bcf6390e64895b12761e851c452d87bd8 ]

we found this issue but still exit in lastest kernel. Simply
keep ion_handle_create under mutex_lock to avoid this race.

WARNING: CPU: 2 PID: 2648 at drivers/staging/android/ion/ion.c:512 ion_handle_add+0xb4/0xc0()
ion_handle_add: buffer already found.
Modules linked in: iwlmvm iwlwifi mac80211 cfg80211 compat
CPU: 2 PID: 2648 Comm: TimedEventQueue Tainted: G        W    3.14.0 #7
00000000 00000000 9a3efd2c 80faf273 9a3efd6c 9a3efd5c 80935dc9 811d7fd3
9a3efd88 00000a58 812208a0 00000200 80e128d4 80e128d4 8d4ae00c a8cd8600
a8cd8094 9a3efd74 80935e0e 00000009 9a3efd6c 811d7fd3 9a3efd88 9a3efd9c
Call Trace:
  [<80faf273>] dump_stack+0x48/0x69
  [<80935dc9>] warn_slowpath_common+0x79/0x90
  [<80e128d4>] ? ion_handle_add+0xb4/0xc0
  [<80e128d4>] ? ion_handle_add+0xb4/0xc0
  [<80935e0e>] warn_slowpath_fmt+0x2e/0x30
  [<80e128d4>] ion_handle_add+0xb4/0xc0
  [<80e144cc>] ion_import_dma_buf+0x8c/0x110
  [<80c517c4>] reg_init+0x364/0x7d0
  [<80993363>] ? futex_wait+0x123/0x210
  [<80992e0e>] ? get_futex_key+0x16e/0x1e0
  [<8099308f>] ? futex_wake+0x5f/0x120
  [<80c51e19>] vpu_service_ioctl+0x1e9/0x500
  [<80994aec>] ? do_futex+0xec/0x8e0
  [<80971080>] ? prepare_to_wait_event+0xc0/0xc0
  [<80c51c30>] ? reg_init+0x7d0/0x7d0
  [<80a22562>] do_vfs_ioctl+0x2d2/0x4c0
  [<80b198ad>] ? inode_has_perm.isra.41+0x2d/0x40
  [<80b199cf>] ? file_has_perm+0x7f/0x90
  [<80b1a5f7>] ? selinux_file_ioctl+0x47/0xf0
  [<80a227a8>] SyS_ioctl+0x58/0x80
  [<80fb45e8>] syscall_call+0x7/0x7
  [<80fb0000>] ? mmc_do_calc_max_discard+0xab/0xe4

Fixes: 83271f626 ("ion: hold reference to handle...")
Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
Reviewed-by: Laura Abbott <labbott@redhat.com>
Cc: stable <stable@vger.kernel.org> # 3.14+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

dm btree: add ref counting ops for the leaves of top level btrees

[ Upstream commit b0dc3c8bc157c60b1d470163882be8c13e1950af ]

When using nested btrees, the top leaves of the top levels contain
block addresses for the root of the next tree down. If we shadow a
shared leaf node the leaf values (sub tree roots) should be incremented
accordingly.

This is only an issue if there is metadata sharing in the top levels.
Which only occurs if metadata snapshots are being used (as is possible
with dm-thinp). And could result in a block from the thinp metadata
snap being reused early, thus corrupting the thinp metadata snap.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

svcrdma: Fix send_reply() scatter/gather set-up

[ Upstream commit 9d11b51ce7c150a69e761e30518f294fc73d55ff ]

The Linux NFS server returns garbage in the data payload of inline
NFS/RDMA READ replies. These are READs of under 1000 bytes or so
where the client has not provided either a reply chunk or a write
list.

The NFS server delivers the data payload for an NFS READ reply to
the transport in an xdr_buf page list. If the NFS client did not
provide a reply chunk or a write list, send_reply() is supposed to
set up a separate sge for the page containing the READ data, and
another sge for XDR padding if needed, then post all of the sges via
a single SEND Work Request.

The problem is send_reply() does not advance through the xdr_buf
when setting up scatter/gather entries for SEND WR. It always calls
dma_map_xdr with xdr_off set to zero. When there's more than one
sge, dma_map_xdr() sets up the SEND sge's so they all point to the
xdr_buf's head.

The current Linux NFS/RDMA client always provides a reply chunk or
a write list when performing an NFS READ over RDMA. Therefore, it
does not exercise this particular case. The Linux server has never
had to use more than one extra sge for building RPC/RDMA replies
with a Linux client.

However, an NFS/RDMA client _is_ allowed to send small NFS READs
without setting up a write list or reply chunk. The NFS READ reply
fits entirely within the inline reply buffer in this case. This is
perhaps a more efficient way of performing NFS READs that the Linux
NFS/RDMA client may some day adopt.

Fixes: b432e6b3d9c1 ('svcrdma: Change DMA mapping logic to . . .')
BugLink: https://bugzilla.linux-nfs.org/show_bug.cgi?id=285
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

ath10k: fix dma_mapping_error() handling

[ Upstream commit 5e55e3cbd1042cffa6249f22c10585e63f8a29bf ]

The function returns 1 when DMA mapping fails. The
driver would return bogus values and could
possibly confuse itself if DMA failed.

Fixes: 767d34fc67af ("ath10k: remove DMA mapping wrappers")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Michal Kazior <michal.kazior@tieto.com>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

Btrfs: update fix for read corruption of compressed and shared extents

[ Upstream commit 808f80b46790f27e145c72112189d6a3be2bc884 ]

My previous fix in commit 005efedf2c7d ("Btrfs: fix read corruption of
compressed and shared extents") was effective only if the compressed
extents cover a file range with a length that is not a multiple of 16
pages. That's because the detection of when we reached a different range
of the file that shares the same compressed extent as the previously
processed range was done at extent_io.c:__do_contiguous_readpages(),
which covers subranges with a length up to 16 pages, because
extent_readpages() groups the pages in clusters no larger than 16 pages.
So fix this by tracking the start of the previously processed file
range's extent map at extent_readpages().

The following test case for fstests reproduces the issue:

  seq=`basename $0`
  seqres=$RESULT_DIR/$seq
  echo "QA output created by $seq"
  tmp=/tmp/$$
  status=1 # failure is the default!
  trap "_cleanup; exit \$status" 0 1 2 3 15

  _cleanup()
  {
      rm -f $tmp.*
  }

  # get standard environment, filters and checks
  . ./common/rc
  . ./common/filter

  # real QA test starts here
  _need_to_be_root
  _supported_fs btrfs
  _supported_os Linux
  _require_scratch
  _require_cloner

  rm -f $seqres.full

  test_clone_and_read_compressed_extent()
  {
      local mount_opts=$1

      _scratch_mkfs >>$seqres.full 2>&1
      _scratch_mount $mount_opts

      # Create our test file with a single extent of 64Kb that is going to
      # be compressed no matter which compression algo is used (zlib/lzo).
      $XFS_IO_PROG -f -c "pwrite -S 0xaa 0K 64K" \
          $SCRATCH_MNT/foo | _filter_xfs_io

      # Now clone the compressed extent into an adjacent file offset.
      $CLONER_PROG -s 0 -d $((64 * 1024)) -l $((64 * 1024)) \
          $SCRATCH_MNT/foo $SCRATCH_MNT/foo

      echo "File digest before unmount:"
      md5sum $SCRATCH_MNT/foo | _filter_scratch

      # Remount the fs or clear the page cache to trigger the bug in
      # btrfs. Because the extent has an uncompressed length that is a
      # multiple of 16 pages, all the pages belonging to the second range
      # of the file (64K to 128K), which points to the same extent as the
      # first range (0K to 64K), had their contents full of zeroes instead
      # of the byte 0xaa. This was a bug exclusively in the read path of
      # compressed extents, the correct data was stored on disk, btrfs
      # just failed to fill in the pages correctly.
      _scratch_remount

      echo "File digest after remount:"
      # Must match the digest we got before.
      md5sum $SCRATCH_MNT/foo | _filter_scratch
  }

  echo -e "\nTesting with zlib compression..."
  test_clone_and_read_compressed_extent "-o compress=zlib"

  _scratch_unmount

  echo -e "\nTesting with lzo compression..."
  test_clone_and_read_compressed_extent "-o compress=lzo"

  status=0
  exit

Cc: stable@vger.kernel.org
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Tested-by: Timofey Titovets <nefelim4ag@gmail.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

Btrfs: fix read corruption of compressed and shared extents

[ Upstream commit 005efedf2c7d0a270ffbe28d8997b03844f3e3e7 ]

If a file has a range pointing to a compressed extent, followed by
another range that points to the same compressed extent and a read
operation attempts to read both ranges (either completely or part of
them), the pages that correspond to the second range are incorrectly
filled with zeroes.

Consider the following example:

  File layout
  [0 - 8K]                      [8K - 24K]
      |                             |
      |                             |
   points to extent X,         points to extent X,
   offset 4K, length of 8K     offset 0, length 16K

  [extent X, compressed length = 4K uncompressed length = 16K]

If a readpages() call spans the 2 ranges, a single bio to read the extent
is submitted - extent_io.c:submit_extent_page() would only create a new
bio to cover the second range pointing to the extent if the extent it
points to had a different logical address than the extent associated with
the first range. This has a consequence of the compressed read end io
handler (compression.c:end_compressed_bio_read()) finish once the extent
is decompressed into the pages covering the first range, leaving the
remaining pages (belonging to the second range) filled with zeroes (done
by compression.c:btrfs_clear_biovec_end()).

So fix this by submitting the current bio whenever we find a range
pointing to a compressed extent that was preceded by a range with a
different extent map. This is the simplest solution for this corner
case. Making the end io callback populate both ranges (or more, if we
have multiple pointing to the same extent) is a much more complex
solution since each bio is tightly coupled with a single extent map and
the extent maps associated to the ranges pointing to the shared extent
can have different offsets and lengths.

The following test case for fstests triggers the issue:

  seq=`basename $0`
  seqres=$RESULT_DIR/$seq
  echo "QA output created by $seq"
  tmp=/tmp/$$
  status=1 # failure is the default!
  trap "_cleanup; exit \$status" 0 1 2 3 15

  _cleanup()
  {
      rm -f $tmp.*
  }

  # get standard environment, filters and checks
  . ./common/rc
  . ./common/filter

  # real QA test starts here
  _need_to_be_root
  _supported_fs btrfs
  _supported_os Linux
  _require_scratch
  _require_cloner

  rm -f $seqres.full

  test_clone_and_read_compressed_extent()
  {
      local mount_opts=$1

      _scratch_mkfs >>$seqres.full 2>&1
      _scratch_mount $mount_opts

      # Create a test file with a single extent that is compressed (the
      # data we write into it is highly compressible no matter which
      # compression algorithm is used, zlib or lzo).
      $XFS_IO_PROG -f -c "pwrite -S 0xaa 0K 4K"        \
                      -c "pwrite -S 0xbb 4K 8K"        \
                      -c "pwrite -S 0xcc 12K 4K"       \
                      $SCRATCH_MNT/foo | _filter_xfs_io

      # Now clone our extent into an adjacent offset.
      $CLONER_PROG -s $((4 * 1024)) -d $((16 * 1024)) -l $((8 * 1024)) \
          $SCRATCH_MNT/foo $SCRATCH_MNT/foo

      # Same as before but for this file we clone the extent into a lower
      # file offset.
      $XFS_IO_PROG -f -c "pwrite -S 0xaa 8K 4K"         \
                      -c "pwrite -S 0xbb 12K 8K"        \
                      -c "pwrite -S 0xcc 20K 4K"        \
                      $SCRATCH_MNT/bar | _filter_xfs_io

      $CLONER_PROG -s $((12 * 1024)) -d 0 -l $((8 * 1024)) \
          $SCRATCH_MNT/bar $SCRATCH_MNT/bar

      echo "File digests before unmounting filesystem:"
      md5sum $SCRATCH_MNT/foo | _filter_scratch
      md5sum $SCRATCH_MNT/bar | _filter_scratch

      # Evicting the inode or clearing the page cache before reading
      # again the file would also trigger the bug - reads were returning
      # all bytes in the range corresponding to the second reference to
      # the extent with a value of 0, but the correct data was persisted
      # (it was a bug exclusively in the read path). The issue happened
      # only if the same readpages() call targeted pages belonging to the
      # first and second ranges that point to the same compressed extent.
      _scratch_remount

      echo "File digests after mounting filesystem again:"
      # Must match the same digests we got before.
      md5sum $SCRATCH_MNT/foo | _filter_scratch
      md5sum $SCRATCH_MNT/bar | _filter_scratch
  }

  echo -e "\nTesting with zlib compression..."
  test_clone_and_read_compressed_extent "-o compress=zlib"

  _scratch_unmount

  echo -e "\nTesting with lzo compression..."
  test_clone_and_read_compressed_extent "-o compress=lzo"

  status=0
  exit

Cc: stable@vger.kernel.org
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Qu Wenruo<quwenruo@cn.fujitsu.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

btrfs: skip waiting on ordered range for special files

[ Upstream commit a30e577c96f59b1e1678ea5462432b09bf7d5cbc ]

In btrfs_evict_inode, we properly truncate the page cache for evicted
inodes but then we call btrfs_wait_ordered_range for every inode as well.
It's the right thing to do for regular files but results in incorrect
behavior for device inodes for block devices.

filemap_fdatawrite_range gets called with inode->i_mapping which gets
resolved to the block device inode before getting passed to
wbc_attach_fdatawrite_inode and ultimately to inode_to_bdi.  What happens
next depends on whether there's an open file handle associated with the
inode.  If there is, we write to the block device, which is unexpected
behavior.  If there isn't, we through normally and inode->i_data is used.
We can also end up racing against open/close which can result in crashes
when i_mapping points to a block device inode that has been closed.

Since there can't be any page cache associated with special file inodes,
it's safe to skip the btrfs_wait_ordered_range call entirely and avoid
the problem.

Cc: <stable@vger.kernel.org>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=100911
Tested-by: Christoph Biedl <linux-kernel.bfrz@manchmal.in-ulm.de>
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

ASoC: dwc: correct irq clear method

[ Upstream commit 4873867e5f2bd90faad861dd94865099fc3140f3 ]

from Designware I2S datasheet, tx/rx XRUN irq is cleared by
reading register TOR/ROR, rather than by writing into them.

Signed-off-by: Yitian Bu <yitian.bu@tangramtek.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

ASoC: fix broken pxa SoC support

[ Upstream commit 3c8f7710c1c44fb650bc29b6ef78ed8b60cfaa28 ]

The previous fix of pxa library support, which was introduced to fix the
library dependency, broke the previous SoC behavior, where a machine
code binding pxa2xx-ac97 with a coded relied on :
- sound/soc/pxa/pxa2xx-ac97.c
- sound/soc/codecs/XXX.c

For example, the mioa701_wm9713.c machine code is currently broken. The
"select ARM" statement wrongly selects the soc/arm/pxa2xx-ac97 for
compilation, as per an unfortunate fate SND_PXA2XX_AC97 is both declared
in sound/arm/Kconfig and sound/soc/pxa/Kconfig.

Fix this by ensuring that SND_PXA2XX_SOC correctly triggers the correct
pxa2xx-ac97 compilation.

Fixes: 846172dfe33c ("ASoC: fix SND_PXA2XX_LIB Kconfig warning")
Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

ASoC: pxa: pxa2xx-ac97: fix dma requestor lines

[ Upstream commit 8811191fdf7ed02ee07cb8469428158572d355a2 ]

PCM receive and transmit DMA requestor lines were reverted, breaking the
PCM playback interface for PXA platforms using the sound/soc/ variant
instead of the sound/arm variant.

The commit below shows the inversion in the requestor lines.

Fixes: d65a14587a9b ("ASoC: pxa: use snd_dmaengine_dai_dma_data")
Signed-off-by: Robert Jarzmik <robert.jarzmik@free.fr>
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

ALSA: hda - Apply SPDIF pin ctl to MacBookPro 12,1

[ Upstream commit e8ff581f7ac2bc3b8886094b7ca635dcc4d1b0e9 ]

The MacBookPro 12,1 has the same setup as the 11 for controlling the
status of the optical audio light. Simply apply the existing workaround
to the subsystem ID for the 12,1.

[sorted the fixup entry by tiwai]

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=105401
Signed-off-by: John Flatness <john@zerocrates.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

ALSA: hda: Add dock support for ThinkPad T550

[ Upstream commit d05ea7da0e8f6df3c62cfee75538f347cb3d89ef ]

Much like all the other Lenovo laptops, add a quirk to make
sound work with docking.

Reported-and-tested-by: lacknerflo@gmail.com
Signed-off-by: Laura Abbott <labbott@fedoraproject.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

ALSA: synth: Fix conflicting OSS device registration on AWE32

[ Upstream commit 225db5762dc1a35b26850477ffa06e5cd0097243 ]

When OSS emulation is loaded on ISA SB AWE32 chip, we get now kernel
warnings like:
  WARNING: CPU: 0 PID: 2791 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x51/0x80()
  sysfs: cannot create duplicate filename '/devices/isa/sbawe.0/sound/card0/seq-oss-0-0'

It's because both emux synth and opl3 drivers try to register their
OSS device object with the same static index number 0.  This hasn't
been a big problem until the recent rewrite of device management code
(that exposes sysfs at the same time), but it's been an obvious bug.

This patch works around it just by using a different index number of
emux synth object.  There can be a more elegant way to fix, but it's
enough for now, as this code won't be touched so often, in anyway.

Reported-and-tested-by: Michael Shell <list1@michaelshell.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

mm: hugetlbfs: skip shared VMAs when unmapping private pages to satisfy a fault

[ Upstream commit 2f84a8990ebbe235c59716896e017c6b2ca1200f ]

SunDong reported the following on

  https://bugzilla.kernel.org/show_bug.cgi?id=103841

I think I find a linux bug, I have the test cases is constructed. I
can stable recurring problems in fedora22(4.0.4) kernel version,
arch for x86_64.  I construct transparent huge page, when the parent
and child process with MAP_SHARE, MAP_PRIVATE way to access the same
huge page area, it has the opportunity to lead to huge page copy on
write failure, and then it will munmap the child corresponding mmap
area, but then the child mmap area with VM_MAYSHARE attributes, child
process munmap this area can trigger VM_BUG_ON in set_vma_resv_flags
functions (vma - > vm_flags & VM_MAYSHARE).

There were a number of problems with the report (e.g.  it's hugetlbfs that
triggers this, not transparent huge pages) but it was fundamentally
correct in that a VM_BUG_ON in set_vma_resv_flags() can be triggered that
looks like this

vma ffff8804651fd0d0 start 00007fc474e00000 end 00007fc475e00000
next ffff8804651fd018 prev ffff8804651fd188 mm ffff88046b1b1800
prot 8000000000000027 anon_vma           (null) vm_ops ffffffff8182a7a0
pgoff 0 file ffff88106bdb9800 private_data           (null)
flags: 0x84400fb(read|write|shared|mayread|maywrite|mayexec|mayshare|dontexpand|hugetlb)
------------
kernel BUG at mm/hugetlb.c:462!
SMP
Modules linked in: xt_pkttype xt_LOG xt_limit [..]
CPU: 38 PID: 26839 Comm: map Not tainted 4.0.4-default #1
Hardware name: Dell Inc. PowerEdge R810/0TT6JF, BIOS 2.7.4 04/26/2012
set_vma_resv_flags+0x2d/0x30

The VM_BUG_ON is correct because private and shared mappings have
different reservation accounting but the warning clearly shows that the
VMA is shared.

When a private COW fails to allocate a new page then only the process
that created the VMA gets the page -- all the children unmap the page.
If the children access that data in the future then they get killed.

The problem is that the same file is mapped shared and private.  During
the COW, the allocation fails, the VMAs are traversed to unmap the other
private pages but a shared VMA is found and the bug is triggered.  This
patch identifies such VMAs and skips them.

Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Reported-by: SunDong <sund_sky@126.com>
Reviewed-by: Michal Hocko <mhocko@suse.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: David Rientjes <rientjes@google.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

ocfs2/dlm: fix deadlock when dispatch assert master

[ Upstream commit 012572d4fc2e4ddd5c8ec8614d51414ec6cae02a ]

The order of the following three spinlocks should be:
dlm_domain_lock < dlm_ctxt->spinlock < dlm_lock_resource->spinlock

But dlm_dispatch_assert_master() is called while holding
dlm_ctxt->spinlock and dlm_lock_resource->spinlock, and then it calls
dlm_grab() which will take dlm_domain_lock.

Once another thread (for example, dlm_query_join_handler) has already
taken dlm_domain_lock, and tries to take dlm_ctxt->spinlock deadlock
happens.

Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: "Junxiao Bi" <junxiao.bi@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

spi: spi-pxa2xx: Check status register to determine if SSSR_TINT is disabled

[ Upstream commit 02bc933ebb59208f42c2e6305b2c17fd306f695d ]

On Intel Baytrail, there is case when interrupt handler get called, no SPI
message is captured. The RX FIFO is indeed empty when RX timeout pending
interrupt (SSSR_TINT) happens.

Use the BIOS version where both HSUART and SPI are on the same IRQ. Both
drivers are using IRQF_SHARED when calling the request_irq function. When
running two separate and independent SPI and HSUART application that
generate data traffic on both components, user will see messages like
below on the console:

pxa2xx-spi pxa2xx-spi.0: bad message state in interrupt handler

This commit will fix this by first checking Receiver Time-out Interrupt,
if it is disabled, ignore the request and return without servicing.

Signed-off-by: Tan, Jui Nee <jui.nee.tan@intel.com>
Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

spi: xtensa-xtfpga: fix register endianness

[ Upstream commit b0b4855099e301c8603ea37da9a0103a96c2e0b1 ]

XTFPGA SPI controller has native endian registers.
Fix register acessors so that they work in big-endian configurations.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

spi: Fix documentation of spi_alloc_master()

[ Upstream commit a394d635193b641f2c86ead5ada5b115d57c51f8 ]

Actually, spi_master_put() after spi_alloc_master() must _not_ be followed
by kfree(). The memory is already freed with the call to spi_master_put()
through spi_master_class, which registers a release function. Calling both
spi_master_put() and kfree() results in often nasty (and delayed) crashes
elsewhere in the kernel, often in the networking stack.

This reverts commit eb4af0f5349235df2e4a5057a72fc8962d00308a.

Link to patch and concerns: https://lkml.org/lkml/2012/9/3/269
or
http://lkml.iu.edu/hypermail/linux/kernel/1209.0/00790.html

Alexey Klimov: This revert becomes valid after
94c69f765f1b4a658d96905ec59928e3e3e07e6a when spi-imx.c
has been fixed and there is no need to call kfree() so comment
for spi_alloc_master() should be fixed.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Alexey Klimov <alexey.klimov@linaro.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

s390/boot/decompression: disable floating point in decompressor

[ Upstream commit adc0b7fbf6fe9967505c0254d9535ec7288186ae ]

my gcc 5.1 used an ldgr instruction with a register != 0,2,4,6 for
spilling/filling into a floating point register in our decompressor.

This will cause an AFP-register data exception as the decompressor
did not setup the additional floating point registers via cr0.
That causes a program check loop that looked like a hang with
one "Uncompressing Linux... " message (directly booted via kvm)
or a loop of "Uncompressing Linux... " messages (when booted via
zipl boot loader).

The offending code in my build was

   48e400:       e3 c0 af ff ff 71       lay     %r12,-1(%r10)
-->48e406:       b3 c1 00 1c             ldgr    %f1,%r12
   48e40a:       ec 6c 01 22 02 7f       clij    %r6,2,12,0x48e64e

but gcc could do spilling into an fpr at any function. We can
simply disable floating point support at that early stage.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

s390/compat: correct uc_sigmask of the compat signal frame

[ Upstream commit 8d4bd0ed0439dfc780aab801a085961925ed6838 ]

The uc_sigmask in the ucontext structure is an array of words to keep
the 64 signal bits (or 1024 if you ask glibc but the kernel sigset_t
only has 64 bits).

For 64 bit the sigset_t contains a single 8 byte word, but for 31 bit
there are two 4 byte words. The compat signal handler code uses a
simple copy of the 64 bit sigset_t to the 31 bit compat_sigset_t.
As s390 is a big-endian architecture this is incorrect, the two words
in the 31 bit sigset_t array need to be swapped.

Cc: <stable@vger.kernel.org>
Reported-by: Stefan Liebler <stli@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

sched/core: Fix TASK_DEAD race in finish_task_switch()

[ Upstream commit 95913d97914f44db2b81271c2e2ebd4d2ac2df83 ]

So the problem this patch is trying to address is as follows:

        CPU0                            CPU1

        context_switch(A, B)
                                        ttwu(A)
                                          LOCK A->pi_lock
                                          A->on_cpu == 0
        finish_task_switch(A)
          prev_state = A->state  <-.
          WMB                      |
          A->on_cpu = 0;           |
          UNLOCK rq0->lock         |
                                   |    context_switch(C, A)
                                   `--  A->state = TASK_DEAD
          prev_state == TASK_DEAD
            put_task_struct(A)
                                        context_switch(A, C)
                                        finish_task_switch(A)
                                          A->state == TASK_DEAD
                                            put_task_struct(A)

The argument being that the WMB will allow the load of A->state on CPU0
to cross over and observe CPU1's store of A->state, which will then
result in a double-drop and use-after-free.

Now the comment states (and this was true once upon a long time ago)
that we need to observe A->state while holding rq->lock because that
will order us against the wakeup; however the wakeup will not in fact
acquire (that) rq->lock; it takes A->pi_lock these days.

We can obviously fix this by upgrading the WMB to an MB, but that is
expensive, so we'd rather avoid that.

The alternative this patch takes is: smp_store_release(&A->on_cpu, 0),
which avoids the MB on some archs, but not important ones like ARM.

Reported-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: <stable@vger.kernel.org> # v3.1+
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Cc: manfred@colorfullife.com
Cc: will.deacon@arm.com
Fixes: e4a52bcb9a18 ("sched: Remove rq->lock from the first half of ttwu()")
Link: http://lkml.kernel.org/r/20150929124509.GG3816@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

x86/xen: Support kexec/kdump in HVM guests by doing a soft reset

[ Upstream commit 0b34a166f291d255755be46e43ed5497cdd194f2 ]

Currently there is a number of issues preventing PVHVM Xen guests from
doing successful kexec/kdump:

  - Bound event channels.
  - Registered vcpu_info.
  - PIRQ/emuirq mappings.
  - shared_info frame after XENMAPSPACE_shared_info operation.
  - Active grant mappings.

Basically, newly booted kernel stumbles upon already set up Xen
interfaces and there is no way to reestablish them. In Xen-4.7 a new
feature called 'soft reset' is coming. A guest performing kexec/kdump
operation is supposed to call SCHEDOP_shutdown hypercall with
SHUTDOWN_soft_reset reason before jumping to new kernel. Hypervisor
(with some help from toolstack) will do full domain cleanup (but
keeping its memory and vCPU contexts intact) returning the guest to
the state it had when it was first booted and thus allowing it to
start over.

Doing SHUTDOWN_soft_reset on Xen hypervisors which don't support it is
probably OK as by default all unknown shutdown reasons cause domain
destroy with a message in toolstack log: 'Unknown shutdown reason code
5. Destroying domain.'  which gives a clue to what the problem is and
eliminates false expectations.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

x86/mm: Set NX on gap between __ex_table and rodata

[ Upstream commit ab76f7b4ab2397ffdd2f1eb07c55697d19991d10 ]

Unused space between the end of __ex_table and the start of
rodata can be left W+x in the kernel page tables.  Extend the
setting of the NX bit to cover this gap by starting from
text_end rather than rodata_start.

  Before:
  ---[ High Kernel Mapping ]---
  0xffffffff80000000-0xffffffff81000000          16M                               pmd
  0xffffffff81000000-0xffffffff81600000           6M     ro         PSE     GLB x  pmd
  0xffffffff81600000-0xffffffff81754000        1360K     ro                 GLB x  pte
  0xffffffff81754000-0xffffffff81800000         688K     RW                 GLB x  pte
  0xffffffff81800000-0xffffffff81a00000           2M     ro         PSE     GLB NX pmd
  0xffffffff81a00000-0xffffffff81b3b000        1260K     ro                 GLB NX pte
  0xffffffff81b3b000-0xffffffff82000000        4884K     RW                 GLB NX pte
  0xffffffff82000000-0xffffffff82200000           2M     RW         PSE     GLB NX pmd
  0xffffffff82200000-0xffffffffa0000000         478M                               pmd

  After:
  ---[ High Kernel Mapping ]---
  0xffffffff80000000-0xffffffff81000000          16M                               pmd
  0xffffffff81000000-0xffffffff81600000           6M     ro         PSE     GLB x  pmd
  0xffffffff81600000-0xffffffff81754000        1360K     ro                 GLB x  pte
  0xffffffff81754000-0xffffffff81800000         688K     RW                 GLB NX pte
  0xffffffff81800000-0xffffffff81a00000           2M     ro         PSE     GLB NX pmd
  0xffffffff81a00000-0xffffffff81b3b000        1260K     ro                 GLB NX pte
  0xffffffff81b3b000-0xffffffff82000000        4884K     RW                 GLB NX pte
  0xffffffff82000000-0xffffffff82200000           2M     RW         PSE     GLB NX pmd
  0xffffffff82200000-0xffffffffa0000000         478M                               pmd

Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: <stable@vger.kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/1443704662-3138-1-git-send-email-sds@tycho.nsa.gov
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

x86/process: Add proper bound checks in 64bit get_wchan()

[ Upstream commit eddd3826a1a0190e5235703d1e666affa4d13b96 ]

Dmitry Vyukov reported the following using trinity and the memory
error detector AddressSanitizer
(https://code.google.com/p/address-sanitizer/wiki/AddressSanitizerForKernel).

[ 124.575597] ERROR: AddressSanitizer: heap-buffer-overflow on
address ffff88002e280000
[ 124.576801] ffff88002e280000 is located 131938492886538 bytes to
the left of 28857600-byte region [ffffffff81282e0a, ffffffff82e0830a)
[ 124.578633] Accessed by thread T10915:
[ 124.579295] inlined in describe_heap_address
./arch/x86/mm/asan/report.c:164
[ 124.579295] #0 ffffffff810dd277 in asan_report_error
./arch/x86/mm/asan/report.c:278
[ 124.580137] #1 ffffffff810dc6a0 in asan_check_region
./arch/x86/mm/asan/asan.c:37
[ 124.581050] #2 ffffffff810dd423 in __tsan_read8 ??:0
[ 124.581893] #3 ffffffff8107c093 in get_wchan
./arch/x86/kernel/process_64.c:444

The address checks in the 64bit implementation of get_wchan() are
wrong in several ways:

- The lower bound of the stack is not the start of the stack
   page. It's the start of the stack page plus sizeof (struct
   thread_info)

- The upper bound must be:

       top_of_stack - TOP_OF_KERNEL_STACK_PADDING - 2 * sizeof(unsigned long).

   The 2 * sizeof(unsigned long) is required because the stack pointer
   points at the frame pointer. The layout on the stack is: ... IP FP
   ... IP FP. So we need to make sure that both IP and FP are in the
   bounds.

Fix the bound checks and get rid of the mix of numeric constants, u64
and unsigned long. Making all unsigned long allows us to use the same
function for 32bit as well.

Use READ_ONCE() when accessing the stack. This does not prevent a
concurrent wakeup of the task and the stack changing, but at least it
avoids TOCTOU.

Also check task state at the end of the loop. Again that does not
prevent concurrent changes, but it avoids walking for nothing.

Add proper comments while at it.

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Based-on-patch-from: Wolfram Gloger <wmglo@dent.med.uni-muenchen.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@alien8.de>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Kostya Serebryany <kcc@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: kasan-dev <kasan-dev@googlegroups.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Wolfram Gloger <wmglo@dent.med.uni-muenchen.de>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20150930083302.694788319@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

x86/asm/entry: Create and use a 'TOP_OF_KERNEL_STACK_PADDING' macro

[ Upstream commit 3ee4298f440c81638cbb5ec06f2497fb7a9a9eb4 ]

x86_32, unlike x86_64, pads the top of the kernel stack, because the
hardware stack frame formats are variable in size.

Document this padding and give it a name.

This should make no change whatsoever to the compiled kernel
image. It also doesn't fix any of the current bugs in this area.

Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Acked-by: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/02bf2f54b8dcb76a62a142b6dfe07d4ef7fc582e.1426009661.git.luto@amacapital.net
[ Fixed small details, such as a missed magic constant in entry_32.S pointed out by Denys Vlasenko. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

x86/kexec: Fix kexec crash in syscall kexec_file_load()

[ Upstream commit e3c41e37b0f4b18cbd4dac76cbeece5a7558b909 ]

The original bug is a page fault crash that sometimes happens
on big machines when preparing ELF headers:

    BUG: unable to handle kernel paging request at ffffc90613fc9000
    IP: [<ffffffff8103d645>] prepare_elf64_ram_headers_callback+0x165/0x260

The bug is caused by us under-counting the number of memory ranges
and subsequently not allocating enough ELF header space for them.
The bug is typically masked on smaller systems, because the ELF header
allocation is rounded up to the next page.

This patch modifies the code in fill_up_crash_elf_data() by using
walk_system_ram_res() instead of walk_system_ram_range() to correctly
count the max number of crash memory ranges. That's because the
walk_system_ram_range() filters out small memory regions that
reside in the same page, but walk_system_ram_res() does not.

Here's how I found the bug:

After tracing prepare_elf64_headers() and prepare_elf64_ram_headers_callback(),
the code uses walk_system_ram_res() to fill-in crash memory regions information
to the program header, so it counts those small memory regions that
reside in a page area.

But, when the kernel was using walk_system_ram_range() in
fill_up_crash_elf_data() to count the number of crash memory regions,
it filters out small regions.

I printed those small memory regions, for example:

  kexec: Get nr_ram ranges. vaddr=0xffff880077592258 paddr=0x77592258, sz=0xdc0

Based on the code in walk_system_ram_range(), this memory region
will be filtered out:

  pfn = (0x77592258 + 0x1000 - 1) >> 12 = 0x77593
  end_pfn = (0x77592258 + 0xfc0 -1 + 1) >> 12 = 0x77593
  end_pfn - pfn = 0x77593 - 0x77593 = 0  <=== if (end_pfn > pfn) is FALSE

So, the max_nr_ranges that's counted by the kernel doesn't include
small memory regions - causing us to under-allocate the required space.
That causes the page fault crash that happens in a later code path
when preparing ELF headers.

This bug is not easy to reproduce on small machines that have few
CPUs, because the allocated page aligned ELF buffer has more free
space to cover those small memory regions' PT_LOAD headers.

Signed-off-by: Lee, Chun-Yi <jlee@suse.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: kexec@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/1443531537-29436-1-git-send-email-jlee@suse.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

x86/efi: Fix boot crash by mapping EFI memmap entries bottom-up at runtime, instead of top-down

[ Upstream commit a5caa209ba9c29c6421292e7879d2387a2ef39c9 ]

Beginning with UEFI v2.5 EFI_PROPERTIES_TABLE was introduced
that signals that the firmware PE/COFF loader supports splitting
code and data sections of PE/COFF images into separate EFI
memory map entries. This allows the kernel to map those regions
with strict memory protections, e.g. EFI_MEMORY_RO for code,
EFI_MEMORY_XP for data, etc.

Unfortunately, an unwritten requirement of this new feature is
that the regions need to be mapped with the same offsets
relative to each other as observed in the EFI memory map. If
this is not done crashes like this may occur,

  BUG: unable to handle kernel paging request at fffffffefe6086dd
  IP: [<fffffffefe6086dd>] 0xfffffffefe6086dd
  Call Trace:
   [<ffffffff8104c90e>] efi_call+0x7e/0x100
   [<ffffffff81602091>] ? virt_efi_set_variable+0x61/0x90
   [<ffffffff8104c583>] efi_delete_dummy_variable+0x63/0x70
   [<ffffffff81f4e4aa>] efi_enter_virtual_mode+0x383/0x392
   [<ffffffff81f37e1b>] start_kernel+0x38a/0x417
   [<ffffffff81f37495>] x86_64_start_reservations+0x2a/0x2c
   [<ffffffff81f37582>] x86_64_start_kernel+0xeb/0xef

Here 0xfffffffefe6086dd refers to an address the firmware
expects to be mapped but which the OS never claimed was mapped.
The issue is that included in these regions are relative
addresses to other regions which were emitted by the firmware
toolchain before the "splitting" of sections occurred at
runtime.

Needless to say, we don't satisfy this unwritten requirement on
x86_64 and instead map the EFI memory map entries in reverse
order. The above crash is almost certainly triggerable with any
kernel newer than v3.13 because that's when we rewrote the EFI
runtime region mapping code, in commit d2f7cbe7b26a ("x86/efi:
Runtime services virtual mapping"). For kernel versions before
v3.13 things may work by pure luck depending on the
fragmentation of the kernel virtual address space at the time we
map the EFI regions.

Instead of mapping the EFI memory map entries in reverse order,
where entry N has a higher virtual address than entry N+1, map
them in the same order as they appear in the EFI memory map to
preserve this relative offset between regions.

This patch has been kept as small as possible with the intention
that it should be applied aggressively to stable and
distribution kernels. It is very much a bugfix rather than
support for a new feature, since when EFI_PROPERTIES_TABLE is
enabled we must map things as outlined above to even boot - we
have no way of asking the firmware not to split the code/data
regions.

In fact, this patch doesn't even make use of the more strict
memory protections available in UEFI v2.5. That will come later.

Suggested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reported-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Cc: <stable@vger.kernel.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Chun-Yi <jlee@suse.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: James Bottomley <JBottomley@Odin.com>
Cc: Lee, Chun-Yi <jlee@suse.com>
Cc: Leif Lindholm <leif.lindholm@linaro.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Jones <pjones@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/1443218539-7610-2-git-send-email-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

Use WARN_ON_ONCE for missing X86_FEATURE_NRIPS

[ Upstream commit d2922422c48df93f3edff7d872ee4f3191fefb08 ]

The cpu feature flags are not ever going to change, so warning
everytime can cause a lot of kernel log spam
(in our case more than 10GB/hour).

The warning seems to only occur when nested virtualization is
enabled, so it's probably triggered by a KVM bug. This is a
sensible and safe change anyway, and the KVM bug fix might not
be suitable for stable releases anyway.

Cc: stable@vger.kernel.org
Signed-off-by: Dirk Mueller <dmueller@suse.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

x86/paravirt: Replace the paravirt nop with a bona fide empty function

[ Upstream commit fc57a7c68020dcf954428869eafd934c0ab1536f ]

PARAVIRT_ADJUST_EXCEPTION_FRAME generates this code (using nmi as an
example, trimmed for readability):

    ff 15 00 00 00 00       callq  *0x0(%rip)        # 2796 <nmi+0x6>
              2792: R_X86_64_PC32     pv_irq_ops+0x2c

That's a call through a function pointer to regular C function that
does nothing on native boots, but that function isn't protected
against kprobes, isn't marked notrace, and is certainly not
guaranteed to preserve any registers if the compiler is feeling
perverse.  This is bad news for a CLBR_NONE operation.

Of course, if everything works correctly, once paravirt ops are
patched, it gets nopped out, but what if we hit this code before
paravirt ops are patched in?  This can potentially cause breakage
that is very difficult to debug.

A more subtle failure is possible here, too: if _paravirt_nop uses
the stack at all (even just to push RBP), it will overwrite the "NMI
executing" variable if it's called in the NMI prologue.

The Xen case, perhaps surprisingly, is fine, because it's already
written in asm.

Fix all of the cases that default to paravirt_nop (including
adjust_exception_frame) with a big hammer: replace paravirt_nop with
an asm function that is just a ret instruction.

The Xen case may have other problems, so document them.

This is part of a fix for some random crashes that Sasha saw.

Reported-and-tested-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/8f5d2ba295f9d73751c33d97fda03e0495d9ade0.1442791737.git.luto@kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

x86/platform: Fix Geode LX timekeeping in the generic x86 build

[ Upstream commit 03da3ff1cfcd7774c8780d2547ba0d995f7dc03d ]

In 2007, commit 07190a08eef36 ("Mark TSC on GeodeLX reliable")
bypassed verification of the TSC on Geode LX. However, this code
(now in the check_system_tsc_reliable() function in
arch/x86/kernel/tsc.c) was only present if CONFIG_MGEODE_LX was
set.

OpenWRT has recently started building its generic Geode target
for Geode GX, not LX, to include support for additional
platforms. This broke the timekeeping on LX-based devices,
because the TSC wasn't marked as reliable:
https://dev.openwrt.org/ticket/20531

By adding a runtime check on is_geode_lx(), we can also include
the fix if CONFIG_MGEODEGX1 or CONFIG_X86_GENERIC are set, thus
fixing the problem.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Cc: Andres Salomon <dilinger@queued.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Marcelo Tosatti <marcelo@kvack.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1442409003.131189.87.camel@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

x86/apic: Serialize LVTT and TSC_DEADLINE writes

[ Upstream commit 5d7c631d926b59aa16f3c56eaeb83f1036c81dc7 ]

The APIC LVTT register is MMIO mapped but the TSC_DEADLINE register is an
MSR. The write to the TSC_DEADLINE MSR is not serializing, so it's not
guaranteed that the write to LVTT has reached the APIC before the
TSC_DEADLINE MSR is written. In such a case the write to the MSR is
ignored and as a consequence the local timer interrupt never fires.

The SDM decribes this issue for xAPIC and x2APIC modes. The
serialization methods recommended by the SDM differ.

xAPIC:
"1. Memory-mapped write to LVT Timer Register, setting bits 18:17 to 10b.
  2. WRMSR to the IA32_TSC_DEADLINE MSR a value much larger than current time-stamp counter.
  3. If RDMSR of the IA32_TSC_DEADLINE MSR returns zero, go to step 2.
  4. WRMSR to the IA32_TSC_DEADLINE MSR the desired deadline."

x2APIC:
"To allow for efficient access to the APIC registers in x2APIC mode,
  the serializing semantics of WRMSR are relaxed when writing to the
  APIC registers. Thus, system software should not use 'WRMSR to APIC
  registers in x2APIC mode' as a serializing instruction. Read and write
  accesses to the APIC registers will occur in program order. A WRMSR to
  an APIC register may complete before all preceding stores are globally
  visible; software can prevent this by inserting a serializing
  instruction, an SFENCE, or an MFENCE before the WRMSR."

The xAPIC method is to just wait for the memory mapped write to hit
the LVTT by checking whether the MSR write has reached the hardware.
There is no reason why a proper MFENCE after the memory mapped write would
not do the same. Andi Kleen confirmed that MFENCE is sufficient for the
xAPIC case as well.

Issue MFENCE before writing to the TSC_DEADLINE MSR. This can be done
unconditionally as all CPUs which have TSC_DEADLINE also have MFENCE
support.

[ tglx: Massaged the changelog ]

Signed-off-by: Shaohua Li <shli@fb.com>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Cc: <Kernel-team@fb.com>
Cc: <lenb@kernel.org>
Cc: <fenghua.yu@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: stable@vger.kernel.org #v3.7+
Link: http://lkml.kernel.org/r/20150909041352.GA2059853@devbig257.prn2.facebook.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

dmaengine: dw: properly read DWC_PARAMS register

[ Upstream commit 6bea0f6d1c47b07be88dfd93f013ae05fcb3d8bf ]

In case we have less than maximum allowed channels (8) and autoconfiguration is
enabled the DWC_PARAMS read is wrong because it uses different arithmetic to
what is needed for channel priority setup.

Re-do the caclulations properly. This now works on AVR32 board well.

Fixes: fed2574b3c9f (dw_dmac: introduce software emulation of LLP transfers)
Cc: yitian.bu@tangramtek.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

ARM: dts: fix usb pin control for imx-rex dts

[ Upstream commit 0af822110871400908d5b6f83a8908c45f881d8f ]

This fixes a duplicated pin control causing this error:

imx6q-pinctrl 20e0000.iomuxc: pin MX6Q_PAD_GPIO_1 already
requested by regulators:regulator@2; cannot claim for 2184000.usb
imx6q-pinctrl 20e0000.iomuxc: pin-137 (2184000.usb) status -22
imx6q-pinctrl 20e0000.iomuxc: could not request pin 137
(MX6Q_PAD_GPIO_1) from group usbotggrp on device 20e0000.iomuxc
imx_usb 2184000.usb: Error applying setting, reverse things
back
imx6q-pinctrl 20e0000.iomuxc: pin MX6Q_PAD_EIM_D31 already
requested by regulators:regulator@1; cannot claim for 2184200.usb
imx6q-pinctrl 20e0000.iomuxc: pin-52 (2184200.usb) status -22
imx6q-pinctrl 20e0000.iomuxc: could not request pin 52 (MX6Q_PAD_EIM_D31)
from group usbh1grp on device 20e0000.iomuxc
imx_usb 2184200.usb: Error applying setting, reverse things
back

Signed-off-by: Felipe F. Tonello <eu@felipetonello.com>
Fixes: e2047e33f2bd ("ARM: dts: add initial Rex Pro board support")
Cc: <stable@vger.kernel.org>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

ARM: dts: omap3-beagle: make i2c3, ddc and tfp410 gpio work again

[ Upstream commit 3a2fa775bd1d0579113666c1a2e37654a34018a0 ]

Let's fix pinmux address of gpio 170 used by tfp410 powerdown-gpio.

According to the OMAP35x Technical Reference Manual
  CONTROL_PADCONF_I2C3_SDA[15:0]  0x480021C4 mode0: i2c3_sda
  CONTROL_PADCONF_I2C3_SDA[31:16] 0x480021C4 mode4: gpio_170
the pinmux address of gpio 170 must be 0x480021C6.

The former wrong address broke i2c3 (used by hdmi ddc), resulting in
kernel message:
  omap_i2c 48060000.i2c: controller timed out

Fixes: 8cecf52befd7 ("ARM: omap3-beagle.dts: add display information")
Cc: stable@vger.kernel.org # v3.15+
Signed-off-by: Carl Frederik Werner <frederik@cfbw.eu>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

ARM: dts: omap5-uevm.dts: fix i2c5 pinctrl offsets

[ Upstream commit 1dbdad75074d16c3e3005180f81a01cdc04a7872 ]

The i2c5 pinctrl offsets are wrong. If the bootloader doesn't set the
pins up, communication with tca6424a doesn't work (controller timeouts)
and it is not possible to enable HDMI.

Fixes: 9be495c42609 ("ARM: dts: omap5-evm: Add I2c pinctrl data")
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

windfarm: decrement client count when unregistering

[ Upstream commit fe2b592173ff0274e70dc44d1d28c19bb995aa7c ]

wf_unregister_client() increments the client count when a client
unregisters. That is obviously incorrect. Decrement that client count
instead.

Fixes: 75722d3992f5 ("[PATCH] ppc64: Thermal control for SMU based machines")
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

ARM: 8429/1: disable GCC SRA optimization

[ Upstream commit a077224fd35b2f7fbc93f14cf67074fc792fbac2 ]

While working on the 32-bit ARM port of UEFI, I noticed a strange
corruption in the kernel log. The following snprintf() statement
(in drivers/firmware/efi/efi.c:efi_md_typeattr_format())

snprintf(pos, size, "|%3s|%2s|%2s|%2s|%3s|%2s|%2s|%2s|%2s]",

was producing the following output in the log:

|    |   |   |   |    |WB|WT|WC|UC]
|    |   |   |   |    |WB|WT|WC|UC]
|    |   |   |   |    |WB|WT|WC|UC]
|RUN|   |   |   |    |WB|WT|WC|UC]*
|RUN|   |   |   |    |WB|WT|WC|UC]*
|    |   |   |   |    |WB|WT|WC|UC]
|RUN|   |   |   |    |WB|WT|WC|UC]*
|    |   |   |   |    |WB|WT|WC|UC]
|RUN|   |   |   |    |   |   |   |UC]
|RUN|   |   |   |    |   |   |   |UC]

As it turns out, this is caused by incorrect code being emitted for
the string() function in lib/vsprintf.c. The following code

if (!(spec.flags & LEFT)) {
while (len < spec.field_width--) {
if (buf < end)
*buf = ' ';
++buf;
}
}
for (i = 0; i < len; ++i) {
if (buf < end)
*buf = *s;
++buf; ++s;
}
while (len < spec.field_width--) {
if (buf < end)
*buf = ' ';
++buf;
}

when called with len == 0, triggers an issue in the GCC SRA optimization
pass (Scalar Replacement of Aggregates), which handles promotion of signed
struct members incorrectly. This is a known but as yet unresolved issue.
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65932). In this particular
case, it is causing the second while loop to be executed erroneously a
single time, causing the additional space characters to be printed.

So disable the optimization by passing -fno-ipa-sra.

Cc: <stable@vger.kernel.org>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

ARM: fix Thumb2 signal handling when ARMv6 is enabled

[ Upstream commit 9b55613f42e8d40d5c9ccb8970bde6af4764b2ab ]

When a kernel is built covering ARMv6 to ARMv7, we omit to clear the
IT state when entering a signal handler.  This can cause the first
few instructions to be conditionally executed depending on the parent
context.

In any case, the original test for >= ARMv7 is broken - ARMv6 can have
Thumb-2 support as well, and an ARMv6T2 specific build would omit this
code too.

Relax the test back to ARMv6 or greater.  This results in us always
clearing the IT state bits in the PSR, even on CPUs where these bits
are reserved.  However, they're reserved for the IT state, so this
should cause no harm.

Cc: <stable@vger.kernel.org>
Fixes: d71e1352e240 ("Clear the IT state when invoking a Thumb-2 signal handler")
Acked-by: Tony Lindgren <tony@atomide.com>
Tested-by: H. Nikolaus Schaller <hns@goldelico.com>
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

hwmon: (nct6775) Swap STEP_UP_TIME and STEP_DOWN_TIME registers for most chips

[ Upstream commit 728d29400488d54974d3317fe8a232b45fdb42ee ]

The STEP_UP_TIME and STEP_DOWN_TIME registers are swapped for all chips but
NCT6775.

Reported-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Jean Delvare <jdelvare@suse.de>
Cc: stable@vger.kernel.org # v3.10+
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

sched: access local runqueue directly in single_task_running

[ Upstream commit 00cc1633816de8c95f337608a1ea64e228faf771 ]

Commit 2ee507c47293 ("sched: Add function single_task_running to let a task
check if it is the only task running on a cpu") referenced the current
runqueue with the smp_processor_id. When CONFIG_DEBUG_PREEMPT is enabled,
that is only allowed if preemption is disabled or the currrent task is
bound to the local cpu (e.g. kernel worker).

With commit f78195129963 ("kvm: add halt_poll_ns module parameter") KVM
calls single_task_running. If CONFIG_DEBUG_PREEMPT is enabled that
generates a lot of kernel messages.

To avoid adding preemption in that cases, as it would limit the usefulness,
we change single_task_running to access directly the cpu local runqueue.

Cc: Tim Chen <tim.c.chen@linux.intel.com>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: <stable@vger.kernel.org>
Fixes: 2ee507c472939db4b146d545352b8a7c79ef47f8
Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

watchdog: sunxi: fix activation of system reset

[ Upstream commit 0919e4445190da18496d31aac08b90828a47d45f ]

Commit f2147de33470 ("watchdog: sunxi: support parameterized compatible
strings") introduced a regression in sunxi_wdt_start(), by which
the system reset function of the watchdog is not enabled upon
starting the watchdog. As a result, the system is not reset when the
watchdog expires. Fix it.

Fixes: f2147de33470 ("watchdog: sunxi: support parameterized compatible strings")
Signed-off-by: Francesco Lavra <francescolavra.fl@gmail.com>
Acked-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
Cc: stable@vger.kernel.org
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

perf header: Fixup reading of HEADER_NRCPUS feature

[ Upstream commit caa470475d9b59eeff093ae650800d34612c4379 ]

The original patch introducing this header wrote the number of CPUs available
and online in one order and then swapped those values when reading, fix it.

Before:

  # perf record usleep 1
  # perf report --header-only | grep 'nrcpus $online\|avail$'
  # nrcpus online : 4
  # nrcpus avail : 4
  # echo 0 > /sys/devices/system/cpu/cpu2/online
  # perf record usleep 1
  # perf report --header-only | grep 'nrcpus $online\|avail$'
  # nrcpus online : 4
  # nrcpus avail : 3
  # echo 0 > /sys/devices/system/cpu/cpu1/online
  # perf record usleep 1
  # perf report --header-only | grep 'nrcpus $online\|avail$'
  # nrcpus online : 4
  # nrcpus avail : 2

After the fix, bringing back the CPUs online:

  # perf report --header-only | grep 'nrcpus $online\|avail$'
  # nrcpus online : 2
  # nrcpus avail : 4
  # echo 1 > /sys/devices/system/cpu/cpu2/online
  # perf record usleep 1
  # perf report --header-only | grep 'nrcpus $online\|avail$'
  # nrcpus online : 3
  # nrcpus avail : 4
  # echo 1 > /sys/devices/system/cpu/cpu1/online
  # perf record usleep 1
  # perf report --header-only | grep 'nrcpus $online\|avail$'
  # nrcpus online : 4
  # nrcpus avail : 4

Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: fbe96f29ce4b ("perf tools: Make perf.data more self-descriptive (v8)")
Link: http://lkml.kernel.org/r/20150911153323.GP23511@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

perf stat: Get correct cpu id for print_aggr

[ Upstream commit 601083cffb7cabdcc55b8195d732f0f7028570fa ]

print_aggr() fails to print per-core/per-socket statistics after commit
582ec0829b3d ("perf stat: Fix per-socket output bug for uncore events")
if events have differnt cpus. Because in print_aggr(), aggr_get_id needs
index (not cpu id) to find core/pkg id. Also, evsel cpu maps should be
used to get aggregated id.

Here is an example:

Counting events cycles,uncore_imc_0/cas_count_read/. (Uncore event has
cpumask 0,18)

  $ perf stat -e cycles,uncore_imc_0/cas_count_read/ -C0,18 --per-core sleep 2

Without this patch, it failes to get CPU 18 result.

   Performance counter stats for 'CPU(s) 0,18':

  S0-C0           1            7526851      cycles
  S0-C0           1               1.05 MiB  uncore_imc_0/cas_count_read/
  S1-C0           0      <not counted>      cycles
  S1-C0           0      <not counted> MiB  uncore_imc_0/cas_count_read/

With this patch, it can get both CPU0 and CPU18 result.

   Performance counter stats for 'CPU(s) 0,18':

  S0-C0           1            6327768      cycles
  S0-C0           1               0.47 MiB  uncore_imc_0/cas_count_read/
  S1-C0           1             330228      cycles
  S1-C0           1               0.29 MiB  uncore_imc_0/cas_count_read/

Signed-off-by: Kan Liang <kan.liang@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Stephane Eranian <eranian@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Fixes: 582ec0829b3d ("perf stat: Fix per-socket output bug for uncore events")
Link: http://lkml.kernel.org/r/1435820925-51091-1-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

perf report: Add support for srcfile sort key

[ Upstream commit 31191a85fb875cf123cea56bbfd34f4b941f3c79 ]

In some cases it's useful to characterize samples by file. This is
useful to get a higher level categorization, for example to map cost to
subsystems.

Add a srcfile sort key to perf report. It builds on top of the existing
srcline support.

Commiter notes:

E.g.:

  # perf record -F 10000 usleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.016 MB perf.data (13 samples) ]
  [root@zoo ~]# perf report -s srcfile --stdio
  # Total Lost Samples: 0
  #
  # Samples: 13  of event 'cycles'
  # Event count (approx.): 869878
  #
  # Overhead  Source File
  # ........  ...........
      60.99%  .
      20.62%  paravirt.h
      14.23%  rmap.c
       4.04%  signal.c
       0.11%  msr.h

  #

The first line is collecting all the files for which srcfiles couldn't somehow
get resolved to:

  # perf report -s srcfile,dso --stdio
  # Total Lost Samples: 0
  #
  # Samples: 13  of event 'cycles'
  # Event count (approx.): 869878
  #
  # Overhead  Source File  Shared Object
  # ........  ...........  ................
      40.97%  .            ld-2.20.so
      20.62%  paravirt.h   [kernel.vmlinux]
      20.02%  .            libc-2.20.so
      14.23%  rmap.c       [kernel.vmlinux]
       4.04%  signal.c     [kernel.vmlinux]
       0.11%  msr.h        [kernel.vmlinux]

  #

XXX: Investigate why that is not resolving on Fedora 21, Andi says he hasn't
     seen this on Fedora 22.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1438988064-21834-1-git-send-email-andi@firstfloor.org
[ Added column length update, from 0e65bdb3f90f ('perf hists: Update the column width for the "srcline" sort key') ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

perf tools: Fix copying of /proc/kcore

[ Upstream commit b5cabbcbd157a4bf5a92dfc85134999a3b55342d ]

A copy of /proc/kcore containing the kernel text can be made to the
buildid cache. e.g.

perf buildid-cache -v -k /proc/kcore

To workaround objdump limitations, a copy is also made when annotating
against /proc/kcore.

The copying process stops working from libelf about v1.62 onwards (the
problem was found with v1.63).

The cause is that a call to gelf_getphdr() in kcore__add_phdr() fails
because additional validation has been added to gelf_getphdr().

The use of gelf_getphdr() is a misguided attempt to get default
initialization of the Gelf_Phdr structure. That should not be
necessary because every member of the Gelf_Phdr structure is
subsequently assigned. So just remove the call to gelf_getphdr().

Similarly, a call to gelf_getehdr() in gelf_kcore__init() can be
removed also.

Committer notes:

Note to stable@kernel.org, from Adrian in the cover letter for this
patchkit:

The "Fix copying of /proc/kcore" problem goes back to v3.13 if you think
it is important enough for stable.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@kernel.org
Link: http://lkml.kernel.org/r/1443089122-19082-3-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

iser-target: remove command with state ISTATE_REMOVE

[ Upstream commit a4c15cd957cbd728f685645de7a150df5912591a ]

As documented in iscsit_sequence_cmd:
/*
* Existing callers for iscsit_sequence_cmd() will silently
* ignore commands with CMDSN_LOWER_THAN_EXP, so force this
* return for CMDSN_MAXCMDSN_OVERRUN as well..
*/

We need to silently finish a command when it's in ISTATE_REMOVE.
This fixes an teardown hang we were seeing where a mis-behaved
initiator (triggered by allocation error injections) sent us a
cmdsn which was lower than expected.

Signed-off-by: Jenny Derzhavetz <jennyf@mellanox.com>
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Cc: <stable@vger.kernel.org> # v3.10+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>