wrapfs-4.14.y.git
10 years agotracing/filter: Do not WARN on operand count going below zero
Steven Rostedt (Red Hat) [Thu, 25 Jun 2015 22:02:29 +0000 (18:02 -0400)]
tracing/filter: Do not WARN on operand count going below zero

[ Upstream commit b4875bbe7e68f139bd3383828ae8e994a0df6d28 ]

When testing the fix for the trace filter, I could not come up with
a scenario where the operand count goes below zero, so I added a
WARN_ON_ONCE(cnt < 0) to the logic. But there is legitimate case
that it can happen (although the filter would be wrong).

 # echo '>' > /sys/kernel/debug/events/ext4/ext4_truncate_exit/filter

That is, a single operation without any operands will hit the path
where the WARN_ON_ONCE() can trigger. Although this is harmless,
and the filter is reported as a error. But instead of spitting out
a warning to the kernel dmesg, just fail nicely and report it via
the proper channels.

Link: http://lkml.kernel.org/r/558C6082.90608@oracle.com
Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Cc: stable@vger.kernel.org # 2.6.33+
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agolibata: force disable trim for SuperSSpeed S238
Arne Fitzenreiter [Wed, 15 Jul 2015 11:54:37 +0000 (13:54 +0200)]
libata: force disable trim for SuperSSpeed S238

[ Upstream commit cda57b1b05cf7b8b99ab4b732bea0b05b6c015cc ]

This device loses blocks, often the partition table area, on trim.
Disable TRIM.
http://pcengines.ch/msata16a.htm

Signed-off-by: Arne Fitzenreiter <arne_f@ipfire.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agolibata: add ATA_HORKAGE_NOTRIM
Arne Fitzenreiter [Wed, 15 Jul 2015 11:54:36 +0000 (13:54 +0200)]
libata: add ATA_HORKAGE_NOTRIM

[ Upstream commit 71d126fd28de2d4d9b7b2088dbccd7ca62fad6e0 ]

Some devices lose data on TRIM whether queued or not.  This patch adds
a horkage to disable TRIM.

tj: Collapsed unnecessary if() nesting.

Signed-off-by: Arne Fitzenreiter <arne_f@ipfire.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agoevm: labeling pseudo filesystems exception
Mimi Zohar [Tue, 21 Apr 2015 17:59:31 +0000 (13:59 -0400)]
evm: labeling pseudo filesystems exception

[ Upstream commit 5101a1850bb7ccbf107929dee9af0cd2f400940f ]

To prevent offline stripping of existing file xattrs and relabeling of
them at runtime, EVM allows only newly created files to be labeled.  As
pseudo filesystems are not persistent, stripping of xattrs is not a
concern.

Some LSMs defer file labeling on pseudo filesystems.  This patch
permits the labeling of existing files on pseudo files systems.

Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agoKEYS: ensure we free the assoc array edit if edit is valid
Colin Ian King [Mon, 27 Jul 2015 14:23:43 +0000 (15:23 +0100)]
KEYS: ensure we free the assoc array edit if edit is valid

[ Upstream commit HEAD ]

commit ca4da5dd1f99fe9c59f1709fb43e818b18ad20e0 upstream.

__key_link_end is not freeing the associated array edit structure
and this leads to a 512 byte memory leak each time an identical
existing key is added with add_key().

The reason the add_key() system call returns okay is that
key_create_or_update() calls __key_link_begin() before checking to see
whether it can update a key directly rather than adding/replacing - which
it turns out it can.  Thus __key_link() is not called through
__key_instantiate_and_link() and __key_link_end() must cancel the edit.

CVE-2015-1333

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
(cherry picked from commit c9cd9b18dac801040ada16562dc579d5ac366d75)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agodrm: add a check for x/y in drm_mode_setcrtc
Zhao Junwang [Tue, 7 Jul 2015 09:08:35 +0000 (17:08 +0800)]
drm: add a check for x/y in drm_mode_setcrtc

[ Upstream commit 01447e9f04ba1c49a9534ae6a5a6f26c2bb05226 ]

legacy setcrtc ioctl does take a 32 bit value which might indeed
overflow

the checks of crtc_req->x > INT_MAX and crtc_req->y > INT_MAX aren't
needed any more with this

v2: -polish the annotation according to Daniel's comment

Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Zhao Junwang <zhjwpku@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agodrm/radeon: add a dpm quirk for Sapphire Radeon R9 270X 2GB GDDR5
Alex Deucher [Fri, 10 Jul 2015 01:08:17 +0000 (21:08 -0400)]
drm/radeon: add a dpm quirk for Sapphire Radeon R9 270X 2GB GDDR5

[ Upstream commit 5dfc71bc44d91d1620505c064fa22b0b3db58a9d ]

bug:
https://bugs.freedesktop.org/show_bug.cgi?id=76490

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agommc: block: Add missing mmc_blk_put() in power_ro_lock_show()
Tomas Winkler [Thu, 16 Jul 2015 13:50:45 +0000 (15:50 +0200)]
mmc: block: Add missing mmc_blk_put() in power_ro_lock_show()

[ Upstream commit 9098f84cced870f54d8c410dd2444cfa61467fa0 ]

Enclosing mmc_blk_put() is missing in power_ro_lock_show() sysfs handler,
let's add it.

Fixes: add710eaa886 ("mmc: boot partition ro lock support")
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agodm btree: silence lockdep lock inversion in dm_btree_del()
Joe Thornber [Fri, 3 Jul 2015 13:51:32 +0000 (14:51 +0100)]
dm btree: silence lockdep lock inversion in dm_btree_del()

[ Upstream commit 1c7518794a3647eb345d59ee52844e8a40405198 ]

Allocate memory using GFP_NOIO when deleting a btree.  dm_btree_del()
can be called via an ioctl and we don't want to recurse into the FS or
block layer.

Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agodm btree remove: fix bug in redistribute3
Dennis Yang [Fri, 26 Jun 2015 14:25:48 +0000 (15:25 +0100)]
dm btree remove: fix bug in redistribute3

[ Upstream commit 4c7e309340ff85072e96f529582d159002c36734 ]

redistribute3() shares entries out across 3 nodes.  Some entries were
being moved the wrong way, breaking the ordering.  This manifested as a
BUG() in dm-btree-remove.c:shift() when entries were removed from the
btree.

For additional context see:
https://www.redhat.com/archives/dm-devel/2015-May/msg00113.html

Signed-off-by: Dennis Yang <shinrairis@gmail.com>
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agousb: xhci: Bugfix for NULL pointer deference in xhci_endpoint_init() function
AMAN DEEP [Tue, 21 Jul 2015 14:20:27 +0000 (17:20 +0300)]
usb: xhci: Bugfix for NULL pointer deference in xhci_endpoint_init() function

[ Upstream commit 3496810663922617d4b706ef2780c279252ddd6a ]

virt_dev->num_cached_rings counts on freed ring and is not updated
correctly. In xhci_free_or_cache_endpoint_ring() function, the free ring
is added into cache and then num_rings_cache is incremented as below:
virt_dev->ring_cache[rings_cached] =
virt_dev->eps[ep_index].ring;
virt_dev->num_rings_cached++;
here, free ring pointer is added to a current index and then
index is incremented.
So current index always points to empty location in the ring cache.
For getting available free ring, current index should be decremented
first and then corresponding ring buffer value should be taken from ring
cache.

But In function xhci_endpoint_init(), the num_rings_cached index is
accessed before decrement.
virt_dev->eps[ep_index].new_ring =
virt_dev->ring_cache[virt_dev->num_rings_cached];
virt_dev->ring_cache[virt_dev->num_rings_cached] = NULL;
virt_dev->num_rings_cached--;
This is bug in manipulating the index of ring cache.
And it should be as below:
virt_dev->num_rings_cached--;
virt_dev->eps[ep_index].new_ring =
virt_dev->ring_cache[virt_dev->num_rings_cached];
virt_dev->ring_cache[virt_dev->num_rings_cached] = NULL;

Cc: <stable@vger.kernel.org>
Signed-off-by: Aman Deep <aman.deep@samsung.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agoUSB: serial: Destroy serial_minors IDR on module exit
Johannes Thumshirn [Wed, 8 Jul 2015 15:26:37 +0000 (17:26 +0200)]
USB: serial: Destroy serial_minors IDR on module exit

[ Upstream commit d23f47d4927fd2f61b3a754d83c7bcec215b5cfe ]

Destroy serial_minors IDR on module exit, reclaiming the allocated memory.

This was detected by the following semantic patch (written by Luis
Rodriguez <mcgrof@suse.com>)

<SmPL>
@ defines_module_init @
declarer name module_init, module_exit;
declarer name DEFINE_IDR;
identifier init;
@@

module_init(init);

@ defines_module_exit @
identifier exit;
@@

module_exit(exit);

@ declares_idr depends on defines_module_init && defines_module_exit @
identifier idr;
@@

DEFINE_IDR(idr);

@ on_exit_calls_destroy depends on declares_idr && defines_module_exit @
identifier declares_idr.idr, defines_module_exit.exit;
@@

exit(void)
{
 ...
 idr_destroy(&idr);
 ...
}

@ missing_module_idr_destroy depends on declares_idr && defines_module_exit && !on_exit_calls_destroy @
identifier declares_idr.idr, defines_module_exit.exit;
@@

exit(void)
{
 ...
 +idr_destroy(&idr);
}
</SmPL>

Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Cc: stable <stable@vger.kernel.org> # v3.11
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agoUSB: option: add 2020:4000 ID
Claudio Cappelli [Wed, 10 Jun 2015 18:38:30 +0000 (20:38 +0200)]
USB: option: add 2020:4000 ID

[ Upstream commit f6d7fb37f92622479ef6da604f27561f5045ba1e ]

Add device Olivetti Olicard 300 (Network Connect: MT6225) - IDs 2020:4000.

T:  Bus=01 Lev=02 Prnt=04 Port=00 Cnt=01 Dev#= 10 Spd=480 MxCh= 0
D:  Ver= 2.00 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs=  1
P:  Vendor=2020 ProdID=4000 Rev=03.00
S:  Manufacturer=Network Connect
S:  Product=MT6225
C:  #Ifs= 7 Cfg#= 1 Atr=a0 MxPwr=500mA
I:  If#= 0 Alt= 0 #EPs= 1 Cls=02(commc) Sub=0e Prot=00 Driver=cdc_mbim
I:  If#= 1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim
I:  If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=02 Prot=01 Driver=option
I:  If#= 3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
I:  If#= 4 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
I:  If#= 5 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
I:  If#= 6 Alt= 0 #EPs= 2 Cls=08(stor.) Sub=06 Prot=50 Driver=usb-storage

Signed-off-by: Claudio Cappelli <claudio.cappelli.linux@gmail.com>
Suggested-by: Lars Melin <larsm17@gmail.com>
Cc: stable <stable@vger.kernel.org>
[johan: amend commit message with devices info ]
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agoUSB: cp210x: add ID for Aruba Networks controllers
Peter Sanford [Fri, 26 Jun 2015 00:40:05 +0000 (17:40 -0700)]
USB: cp210x: add ID for Aruba Networks controllers

[ Upstream commit f98a7aa81eeeadcad25665c3501c236d531d4382 ]

Add the USB serial console device ID for Aruba Networks 7xxx series
controllers which have a USB port for their serial console.

Signed-off-by: Peter Sanford <peter@sanford.io>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agousb: musb: host: rely on port_mode to call musb_start()
Felipe Balbi [Tue, 2 Jun 2015 18:03:36 +0000 (13:03 -0500)]
usb: musb: host: rely on port_mode to call musb_start()

[ Upstream commit be9d39881fc4fa39a64b6eed6bab5d9ee5125344 ]

Currently, we're calling musb_start() twice for DRD ports
in some situations. This has been observed to cause enumeration
issues after suspend/resume cycles with AM335x.

In order to fix the problem, we just have to fix the check
on musb_has_gadget() so that it only returns true if
current mode is Host and ignore the fact that we have or
not a gadget driver loaded.

Fixes: ae44df2e21b5 (usb: musb: call musb_start() only once in OTG mode)
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: <stable@vger.kernel.org> # v3.11+
Tested-by: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agoUSB: devio: fix a condition in async_completed()
Dan Carpenter [Mon, 18 May 2015 12:29:51 +0000 (15:29 +0300)]
USB: devio: fix a condition in async_completed()

[ Upstream commit 83ed07c5db71bc02bd646d6eb60b48908235cdf9 ]

Static checkers complain that the current condition is never true.  It
seems pretty likely that it's a typo and "URB" was intended instead of
"USB".

Fixes: 3d97ff63f899 ('usbdevfs: Use scatter-gather lists for large bulk transfers')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agousb: dwc3: Reset the transfer resource index on SET_INTERFACE
John Youn [Mon, 17 Sep 2001 07:00:00 +0000 (00:00 -0700)]
usb: dwc3: Reset the transfer resource index on SET_INTERFACE

[ Upstream commit aebda618718157a69c0dc0adb978d69bc2b8723c ]

This fixes an issue introduced in commit b23c843992b6 (usb: dwc3:
gadget: fix DEPSTARTCFG for non-EP0 EPs) that made sure we would
only use DEPSTARTCFG once per SetConfig.

The trick is that we should use one DEPSTARTCFG per SetConfig *OR*
SetInterface. SetInterface was completely missed from the original
patch.

This problem became aparent after commit 76e838c9f776 (usb: dwc3:
gadget: return error if command sent to DEPCMD register fails)
added checking of the return status of device endpoint commands.

'Set Endpoint Transfer Resource' command was caught failing
occasionally. This is because the Transfer Resource
Index was not getting reset during a SET_INTERFACE request.

Finally, to fix the issue, was we have to do is make sure that
our start_config_issued flag gets reset whenever we receive a
SetInterface request.

To verify the problem (and its fix), all we have to do is run
test 9 from testusb with 'testusb -t 9 -s 2048 -a -c 5000'.

Tested-by: Huang Rui <ray.huang@amd.com>
Tested-by: Subbaraya Sundeep Bhatta <subbaraya.sundeep.bhatta@xilinx.com>
Fixes: b23c843992b6 (usb: dwc3: gadget: fix DEPSTARTCFG for non-EP0 EPs)
Cc: <stable@vger.kernel.org> # v3.2+
Signed-off-by: John Youn <johnyoun@synopsys.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agolibata: increase the timeout when setting transfer mode
Mikulas Patocka [Wed, 8 Jul 2015 17:06:12 +0000 (13:06 -0400)]
libata: increase the timeout when setting transfer mode

[ Upstream commit d531be2ca2f27cca5f041b6a140504999144a617 ]

I have a ST4000DM000 disk. If Linux is booted while the disk is spun down,
the command that sets transfer mode causes the disk to spin up. The
spin-up takes longer than the default 5s timeout, so the command fails and
timeout is reported.

Fix this by increasing the timeout to 15s, which is enough for the disk to
spin up.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agolibata: add ATA_HORKAGE_BROKEN_FPDMA_AA quirk for HP 250GB SATA disk VB0250EAVER
Aleksei Mamlin [Wed, 1 Jul 2015 10:48:30 +0000 (13:48 +0300)]
libata: add ATA_HORKAGE_BROKEN_FPDMA_AA quirk for HP 250GB SATA disk VB0250EAVER

[ Upstream commit 08c85d2a599d967ede38a847f5594447b6100642 ]

Enabling AA on HP 250GB SATA disk VB0250EAVER causes errors:

[    3.788362] ata3.00: failed to enable AA (error_mask=0x1)
[    3.789243] ata3.00: failed to enable AA (error_mask=0x1)

Add the ATA_HORKAGE_BROKEN_FPDMA_AA for this specific harddisk.

tj: Collected FPDMA_AA entries and updated comment.

Signed-off-by: Aleksei Mamlin <mamlinav@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agoASoC: imx-wm8962: Add a missing error check
Dan Carpenter [Wed, 10 Jun 2015 15:37:23 +0000 (18:37 +0300)]
ASoC: imx-wm8962: Add a missing error check

[ Upstream commit 474ff0ae23b834e9fc18374d14bb5f3e7b3828b4 ]

My static checker complains that:

sound/soc/fsl/imx-wm8962.c:196 imx_wm8962_probe() warn:
we tested 'ret' before and it was 'false'

The intent was that we use "ret" to check imx_audmux_v2_configure_port().

Fixes: 8de2ae2a7f1f ('ASoC: fsl: add imx-wm8962 machine driver')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Otherwise, Acked-by: Nicolin Chen <nicoleotsuka@gmail.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agoiio: adc: at91_adc: allow to use full range of startup time
Jan Leupold [Wed, 17 Jun 2015 16:21:36 +0000 (18:21 +0200)]
iio: adc: at91_adc: allow to use full range of startup time

[ Upstream commit 815983e9afcb21552e1b9d77e3755d394cbf5249 ]

The DT-Property "atmel,adc-startup-time" is stored in an u8 for a microsecond
value. When trying to increase the value of STARTUP in Register AT91_ADC_MR
some higher values can't be reached.

Change the type in function parameter and private structure field from u8 to
u32.

Signed-off-by: Jan Leupold <leupold@rsi-elektrotechnik.de>
[nicolas.ferre@atmel.com: change commit message, increase u16 to u32 for startup time]
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Acked-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agoiio: tmp006: Check channel info on write
Peter Meerwald [Sun, 21 Jun 2015 21:50:21 +0000 (23:50 +0200)]
iio: tmp006: Check channel info on write

[ Upstream commit f451957dafe712ab2cd4b7ca8ba9197798115fbe ]

only SAMP_FREQ is writable

Will lead to SAMP_FREQ being written by any attempt to write
to the other exported attributes and hence a rather unexpected
result!

Signed-off-by: Peter Meerwald <pmeerw@pmeerw.net>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agoiio: DAC: ad5624r_spi: fix bit shift of output data value
JM Friedt [Fri, 19 Jun 2015 12:48:06 +0000 (14:48 +0200)]
iio: DAC: ad5624r_spi: fix bit shift of output data value

[ Upstream commit 09f4dcc9443c4731b6669066bd1bc4312ef17bcc ]

The value sent on the SPI bus is shifted by an erroneous number of bits.
The shift value was already computed in the iio_chan_spec structure and
hence subtracting this argument to 16 yields an erroneous data position
in the SPI stream.

Signed-off-by: JM Friedt <jmfriedt@femto-st.fr>
Acked-by: Lars-Peter Clausen <lars@metafoo.de>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agoext4: replace open coded nofail allocation in ext4_free_blocks()
Michal Hocko [Sun, 5 Jul 2015 16:33:44 +0000 (12:33 -0400)]
ext4: replace open coded nofail allocation in ext4_free_blocks()

[ Upstream commit 7444a072c387a93ebee7066e8aee776954ab0e41 ]

ext4_free_blocks is looping around the allocation request and mimics
__GFP_NOFAIL behavior without any allocation fallback strategy. Let's
remove the open coded loop and replace it with __GFP_NOFAIL. Without the
flag the allocator has no way to find out never-fail requirement and
cannot help in any way.

Signed-off-by: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agoext4: correctly migrate a file with a hole at the beginning
Eryu Guan [Sat, 4 Jul 2015 04:03:44 +0000 (00:03 -0400)]
ext4: correctly migrate a file with a hole at the beginning

[ Upstream commit 8974fec7d72e3e02752fe0f27b4c3719c78d9a15 ]

Currently ext4_ind_migrate() doesn't correctly handle a file which
contains a hole at the beginning of the file.  This caused the migration
to be done incorrectly, and then if there is a subsequent following
delayed allocation write to the "hole", this would reclaim the same data
blocks again and results in fs corruption.

  # assmuing 4k block size ext4, with delalloc enabled
  # skip the first block and write to the second block
  xfs_io -fc "pwrite 4k 4k" -c "fsync" /mnt/ext4/testfile

  # converting to indirect-mapped file, which would move the data blocks
  # to the beginning of the file, but extent status cache still marks
  # that region as a hole
  chattr -e /mnt/ext4/testfile

  # delayed allocation writes to the "hole", reclaim the same data block
  # again, results in i_blocks corruption
  xfs_io -c "pwrite 0 4k" /mnt/ext4/testfile
  umount /mnt/ext4
  e2fsck -nf /dev/sda6
  ...
  Inode 53, i_blocks is 16, should be 8.  Fix? no
  ...

Signed-off-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agoext4: be more strict when migrating to non-extent based file
Eryu Guan [Sat, 4 Jul 2015 03:56:50 +0000 (23:56 -0400)]
ext4: be more strict when migrating to non-extent based file

[ Upstream commit d6f123a9297496ad0b6335fe881504c4b5b2a5e5 ]

Currently the check in ext4_ind_migrate() is not enough before doing the
real conversion:

a) delayed allocated extents could bypass the check on eh->eh_entries
   and eh->eh_depth

This can be demonstrated by this script

  xfs_io -fc "pwrite 0 4k" -c "pwrite 8k 4k" /mnt/ext4/testfile
  chattr -e /mnt/ext4/testfile

where testfile has two extents but still be converted to non-extent
based file format.

b) only extent length is checked but not the offset, which would result
   in data lose (delalloc) or fs corruption (nodelalloc), because
   non-extent based file only supports at most (12 + 2^10 + 2^20 + 2^30)
   blocks

This can be demostrated by

  xfs_io -fc "pwrite 5T 4k" /mnt/ext4/testfile
  chattr -e /mnt/ext4/testfile
  sync

If delalloc is enabled, dmesg prints
  EXT4-fs warning (device dm-4): ext4_block_to_path:105: block 1342177280 > max in inode 53
  EXT4-fs (dm-4): Delayed block allocation failed for inode 53 at logical offset 1342177280 with max blocks 1 with error 5
  EXT4-fs (dm-4): This should not happen!! Data will be lost

If delalloc is disabled, e2fsck -nf shows corruption
  Inode 53, i_size is 5497558142976, should be 4096.  Fix? no

Fix the two issues by

a) forcing all delayed allocation blocks to be allocated before checking
   eh->eh_depth and eh->eh_entries
b) limiting the last logical block of the extent is within direct map

Signed-off-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agoext4: fix reservation release on invalidatepage for delalloc fs
Lukas Czerner [Sat, 4 Jul 2015 01:13:55 +0000 (21:13 -0400)]
ext4: fix reservation release on invalidatepage for delalloc fs

[ Upstream commit 9705acd63b125dee8b15c705216d7186daea4625 ]

On delalloc enabled file system on invalidatepage operation
in ext4_da_page_release_reservation() we want to clear the delayed
buffer and remove the extent covering the delayed buffer from the extent
status tree.

However currently there is a bug where on the systems with page size >
block size we will always remove extents from the start of the page
regardless where the actual delayed buffers are positioned in the page.
This leads to the errors like this:

EXT4-fs warning (device loop0): ext4_da_release_space:1225:
ext4_da_release_space: ino 13, to_free 1 with only 0 reserved data
blocks

This however can cause data loss on writeback time if the file system is
in ENOSPC condition because we're releasing reservation for someones
else delayed buffer.

Fix this by only removing extents that corresponds to the part of the
page we want to invalidate.

This problem is reproducible by the following fio receipt (however I was
only able to reproduce it with fio-2.1 or older.

[global]
bs=8k
iodepth=1024
iodepth_batch=60
randrepeat=1
size=1m
directory=/mnt/test
numjobs=20
[job1]
ioengine=sync
bs=1k
direct=1
rw=randread
filename=file1:file2
[job2]
ioengine=libaio
rw=randwrite
direct=1
filename=file1:file2
[job3]
bs=1k
ioengine=posixaio
rw=randwrite
direct=1
filename=file1:file2
[job5]
bs=1k
ioengine=sync
rw=randread
filename=file1:file2
[job7]
ioengine=libaio
rw=randwrite
filename=file1:file2
[job8]
ioengine=posixaio
rw=randwrite
filename=file1:file2
[job10]
ioengine=mmap
rw=randwrite
bs=1k
filename=file1:file2
[job11]
ioengine=mmap
rw=randwrite
direct=1
filename=file1:file2

Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: stable@vger.kernel.org
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agospi: pl022: Specify 'num-cs' property as required in devicetree binding
Ezequiel Garcia [Mon, 11 May 2015 15:20:18 +0000 (12:20 -0300)]
spi: pl022: Specify 'num-cs' property as required in devicetree binding

[ Upstream commit ea6055c46eda1e19e02209814955e13f334bbe1b ]

Since commit 39a6ac11df65 ("spi/pl022: Devicetree support w/o platform data")
the 'num-cs' parameter cannot be passed through platform data when probing
with devicetree. Instead, it's a required devicetree property.

Fix the binding documentation so the property is properly specified.

Fixes: 39a6ac11df65 ("spi/pl022: Devicetree support w/o platform data")
Signed-off-by: Ezequiel Garcia <ezequiel@vanguardiasur.com.ar>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agoRevert "can: fix loss of CAN frames in raw_rcv"
Sasha Levin [Tue, 4 Aug 2015 17:32:39 +0000 (13:32 -0400)]
Revert "can: fix loss of CAN frames in raw_rcv"

This reverts commit c215cf258214858a5a6c3e63cd7ee78b92d210b2.

Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agoLinux 3.18.19 v3.18.19
Sasha Levin [Tue, 21 Jul 2015 01:13:10 +0000 (21:13 -0400)]
Linux 3.18.19

Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agox86/xen: allow privcmd hypercalls to be preempted
David Vrabel [Thu, 19 Feb 2015 15:23:17 +0000 (15:23 +0000)]
x86/xen: allow privcmd hypercalls to be preempted

[ Upstream commit db0e2aa47f0a45fc56592c25ad370f8836cbb128 ]

Hypercalls submitted by user space tools via the privcmd driver can
take a long time (potentially many 10s of seconds) if the hypercall
has many sub-operations.

A fully preemptible kernel may deschedule such as task in any upcall
called from a hypercall continuation.

However, in a kernel with voluntary or no preemption, hypercall
continuations in Xen allow event handlers to be run but the task
issuing the hypercall will not be descheduled until the hypercall is
complete and the ioctl returns to user space.  These long running
tasks may also trigger the kernel's soft lockup detection.

Add xen_preemptible_hcall_begin() and xen_preemptible_hcall_end() to
bracket hypercalls that may be preempted.  Use these in the privcmd
driver.

When returning from an upcall, call xen_maybe_preempt_hcall() which
adds a schedule point if if the current task was within a preemptible
hypercall.

Since _cond_resched() can move the task to a different CPU, clear and
set xen_in_preemptible_hcall around the call.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agortnl: restore notifications for deleted interfaces
Nicolas Dichtel [Thu, 16 Jul 2015 09:31:43 +0000 (11:31 +0200)]
rtnl: restore notifications for deleted interfaces

The commit 984ff7a3e060 is an upstream backport. In fact, it depends on
commit 395eea6ccf2b ("rtnetlink: delay RTM_DELLINK notification until after ndo_uninit()")
which has not been backported in 3.18.y.

Before commit 395eea6ccf2b, rollback_registered_many() uses rtmsg_ifinfo().
The call to this function is done with dev->reg_state set to
NETREG_UNREGISTERING, thus testing this reg_state in rtmsg_ifinfo() is
wrong.

This patch partially reverts commit 984ff7a3e060.

Fixes: 984ff7a3e060 ("rtnl/bond: don't send rtnl msg for unregistered iface")
Reported-by: Kristian Evensen <kristian.evensen@gmail.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agoInput: synaptics - add min/max quirk for Lenovo S540
Peter Hutterer [Mon, 8 Jun 2015 17:17:32 +0000 (10:17 -0700)]
Input: synaptics - add min/max quirk for Lenovo S540

[ Upstream commit 7f2ca8b55aeff1fe51ed3570200ef88a96060917 ]

https://bugzilla.redhat.com/show_bug.cgi?id=1223051#c2

Cc: stable@vger.kernel.org
Tested-by: tommy.gagnes@gmail.com
Signed-off-by: Peter Hutterer <peter.hutterer@who-t.net>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agoRevert "Input: synaptics - add min/max quirk for Lenovo S540"
Sasha Levin [Tue, 14 Jul 2015 18:58:30 +0000 (14:58 -0400)]
Revert "Input: synaptics - add min/max quirk for Lenovo S540"

This reverts commit e73268af3e137b1f69d0f82d80aa88e0e8c7cbbe.

Seems like an incorrect merge on my end, will pick the upstream
version again next.

Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agoRevert "nfs: take extra reference to fl->fl_file when running a LOCKU operation"
Sasha Levin [Mon, 13 Jul 2015 12:56:15 +0000 (08:56 -0400)]
Revert "nfs: take extra reference to fl->fl_file when running a LOCKU operation"

This reverts commit ed7f7f145ec1445a130513db9ad8f1547f77a578.

Reverting from stable tree as fix was found to be buggy. New fix pending.

Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agofs/ufs: restore s_lock mutex_init()
Fabian Frederick [Wed, 17 Jun 2015 16:15:45 +0000 (18:15 +0200)]
fs/ufs: restore s_lock mutex_init()

[ Upstream commit e4f95517f18271b1da36cfc5d700e46844396d6e ]

Add last missing line in commit "cdd9eefdf905"
("fs/ufs: restore s_lock mutex")

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agoufs: Fix possible deadlock when looking up directories
Jan Kara [Tue, 2 Jun 2015 09:26:34 +0000 (11:26 +0200)]
ufs: Fix possible deadlock when looking up directories

[ Upstream commit 514d748f69c97a51a2645eb198ac5c6218f22ff9 ]

Commit e4502c63f56aeca88 (ufs: deal with nfsd/iget races) made ufs
create inodes with I_NEW flag set. However ufs_mkdir() never cleared
this flag. Thus if someone ever tried to lookup the directory by inode
number, he would deadlock waiting for I_NEW to be cleared. Luckily this
mostly happens only if the filesystem is exported over NFS since
otherwise we have the inode attached to dentry and don't look it up by
inode number. In rare cases dentry can get freed without inode being
freed and then we'd hit the deadlock even without NFS export.

Fix the problem by clearing I_NEW before instantiating new directory
inode.

Fixes: e4502c63f56aeca887ced37f24e0def1ef11cec8
Reported-by: Fabian Frederick <fabf@skynet.be>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agoufs: Fix warning from unlock_new_inode()
Jan Kara [Mon, 1 Jun 2015 12:52:04 +0000 (14:52 +0200)]
ufs: Fix warning from unlock_new_inode()

[ Upstream commit 12ecbb4b1d765a5076920999298d9625439dbe58 ]

Commit e4502c63f56aeca88 (ufs: deal with nfsd/iget races) introduced
unlock_new_inode() call into ufs_add_nondir(). However that function
gets called also from ufs_link() which hands it already initialized
inode and thus unlock_new_inode() complains. The problem is harmless but
annoying.

Fix the problem by opencoding necessary stuff in ufs_link()

Fixes: e4502c63f56aeca887ced37f24e0def1ef11cec8
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agovfs: Ignore unlocked mounts in fs_fully_visible
Eric W. Biederman [Wed, 7 Jan 2015 14:10:09 +0000 (08:10 -0600)]
vfs: Ignore unlocked mounts in fs_fully_visible

[ Upstream commit c89d4319ae55186496c43b7a6e510aa1d09dd387 ]

commit ceeb0e5d39fcdf4dca2c997bf225c7fc49200b37 upstream.

Limit the mounts fs_fully_visible considers to locked mounts.
Unlocked can always be unmounted so considering them adds hassle
but no security benefit.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agoKVM: x86: properly restore LVT0
Radim Krčmář [Tue, 30 Jun 2015 20:19:17 +0000 (22:19 +0200)]
KVM: x86: properly restore LVT0

[ Upstream commit 58382447b9a9989da551a7b17e72756f6e238bb8 ]

commit db1385624c686fe99fe2d1b61a36e1537b915d08 upstream.

Legacy NMI watchdog didn't work after migration/resume, because
vapics_in_nmi_mode was left at 0.

Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agoKVM: s390: virtio-ccw: don't overwrite config space values
Cornelia Huck [Mon, 29 Jun 2015 14:44:01 +0000 (16:44 +0200)]
KVM: s390: virtio-ccw: don't overwrite config space values

[ Upstream commit 431dae778aea4eed31bd12e5ee82edc571cd4d70 ]

Eric noticed problems with vhost-scsi and virtio-ccw: vhost-scsi
complained about overwriting values in the config space, which
was triggered by a broken implementation of virtio-ccw's config
get/set routines. It was probably sheer luck that we did not hit
this before.

When writing a value to the config space, the WRITE_CONF ccw will
always write from the beginning of the config space up to and
including the value to be set. If the config space up to the value
has not yet been retrieved from the device, however, we'll end up
overwriting values. Keep track of the known config space and update
if needed to avoid this.

Moreover, READ_CONF will only read the number of bytes it has been
instructed to retrieve, so we must not copy more than that to the
buffer, or we might overwrite trailing values.

Reported-by: Eric Farman <farman@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Eric Farman <farman@linux.vnet.ibm.com>
Tested-by: Eric Farman <farman@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agoselinux: fix setting of security labels on NFS
J. Bruce Fields [Thu, 4 Jun 2015 19:57:25 +0000 (15:57 -0400)]
selinux: fix setting of security labels on NFS

[ Upstream commit 9fc2b4b436cff7d8403034676014f1be9d534942 ]

Before calling into the filesystem, vfs_setxattr calls
security_inode_setxattr, which ends up calling selinux_inode_setxattr in
our case.  That returns -EOPNOTSUPP whenever SBLABEL_MNT is not set.
SBLABEL_MNT was supposed to be set by sb_finish_set_opts, which sets it
only if selinux_is_sblabel_mnt returns true.

The selinux_is_sblabel_mnt logic was broken by eadcabc697e9 "SELinux: do
all flags twiddling in one place", which didn't take into the account
the SECURITY_FS_USE_NATIVE behavior that had been introduced for nfs
with eb9ae686507b "SELinux: Add new labeling type native labels".

This caused setxattr's of security labels over NFSv4.2 to fail.

Cc: stable@kernel.org # 3.13
Cc: Eric Paris <eparis@redhat.com>
Cc: David Quigley <dpquigl@davequigley.com>
Reported-by: Richard Chan <rc556677@outlook.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
[PM: added the stable dependency]
Signed-off-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agousb: gadget: f_fs: add extra check before unregister_gadget_item
Rui Miguel Silva [Wed, 20 May 2015 13:52:40 +0000 (14:52 +0100)]
usb: gadget: f_fs: add extra check before unregister_gadget_item

[ Upstream commit f14e9ad17f46051b02bffffac2036486097de19e ]

ffs_closed can race with configfs_rmdir which will call config_item_release, so
add an extra check to avoid calling the unregister_gadget_item with an null
gadget item.

Signed-off-by: Rui Miguel Silva <rui.silva@linaro.org>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agoperf/x86: Honor the architectural performance monitoring version
Palik, Imre [Mon, 8 Jun 2015 12:46:49 +0000 (14:46 +0200)]
perf/x86: Honor the architectural performance monitoring version

[ Upstream commit 2c33645d366d13b969d936b68b9f4875b1fdddea ]

Architectural performance monitoring, version 1, doesn't support fixed counters.

Currently, even if a hypervisor advertises support for architectural
performance monitoring version 1, perf may still try to use the fixed
counters, as the constraints are set up based on the CPU model.

This patch ensures that perf honors the architectural performance monitoring
version returned by CPUID, and it only uses the fixed counters for version 2
and above.

(Some of the ideas in this patch came from Peter Zijlstra.)

Signed-off-by: Imre Palik <imrep@amazon.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Anthony Liguori <aliguori@amazon.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1433767609-1039-1-git-send-email-imrep.amz@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agoperf: Fix ring_buffer_attach() RCU sync, again
Oleg Nesterov [Sat, 30 May 2015 20:04:25 +0000 (22:04 +0200)]
perf: Fix ring_buffer_attach() RCU sync, again

[ Upstream commit 2f993cf093643b98477c421fa2b9a98dcc940323 ]

While looking for other users of get_state/cond_sync. I Found
ring_buffer_attach() and it looks obviously buggy?

Don't we need to ensure that we have "synchronize" _between_
list_del() and list_add() ?

IOW. Suppose that ring_buffer_attach() preempts right_after
get_state_synchronize_rcu() and gp completes before spin_lock().

In this case cond_synchronize_rcu() does nothing and we reuse
->rb_entry without waiting for gp in between?

It also moves the ->rcu_pending check under "if (rb)", to make it
more readable imo.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dave@stgolabs.net
Cc: der.herr@hofr.at
Cc: josh@joshtriplett.org
Cc: tj@kernel.org
Fixes: b69cf53640da ("perf: Fix a race between ring_buffer_detach() and ring_buffer_attach()")
Link: http://lkml.kernel.org/r/20150530200425.GA15748@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agox86/boot: Fix overflow warning with 32-bit binutils
Borislav Petkov [Fri, 19 Jun 2015 11:49:06 +0000 (13:49 +0200)]
x86/boot: Fix overflow warning with 32-bit binutils

[ Upstream commit 04c17341b42699a5859a8afa05e64ba08a4e5235 ]

When building the kernel with 32-bit binutils built with support
only for the i386 target, we get the following warning:

  arch/x86/kernel/head_32.S:66: Warning: shift count out of range (32 is not between 0 and 31)

The problem is that in that case, binutils' internal type
representation is 32-bit wide and the shift range overflows.

In order to fix this, manipulate the shift expression which
creates the 4GiB constant to not overflow the shift count.

Suggested-by: Michael Matz <matz@suse.de>
Reported-and-tested-by: Enrico Mioso <mrkiko.rs@gmail.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agovfs: Remove incorrect debugging WARN in prepend_path
Eric W. Biederman [Sun, 24 May 2015 14:25:00 +0000 (09:25 -0500)]
vfs: Remove incorrect debugging WARN in prepend_path

[ Upstream commit 93e3bce6287e1fb3e60d3324ed08555b5bbafa89 ]

The warning message in prepend_path is unclear and outdated.  It was
added as a warning that the mechanism for generating names of pseudo
files had been removed from prepend_path and d_dname should be used
instead.  Unfortunately the warning reads like a general warning,
making it unclear what to do with it.

Remove the warning.  The transition it was added to warn about is long
over, and I added code several years ago which in rare cases causes
the warning to fire on legitimate code, and the warning is now firing
and scaring people for no good reason.

Cc: stable@vger.kernel.org
Reported-by: Ivan Delalande <colona@arista.com>
Reported-by: Omar Sandoval <osandov@osandov.com>
Fixes: f48cfddc6729e ("vfs: In d_path don't call d_dname on a mount point")
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agoKVM: x86: make vapics_in_nmi_mode atomic
Radim Krčmář [Wed, 1 Jul 2015 13:31:49 +0000 (15:31 +0200)]
KVM: x86: make vapics_in_nmi_mode atomic

[ Upstream commit 42720138b06301cc8a7ee8a495a6d021c4b6a9bc ]

Writes were a bit racy, but hard to turn into a bug at the same time.
(Particularly because modern Linux doesn't use this feature anymore.)

Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
[Actually the next patch makes it much, much easier to trigger the race
 so I'm including this one for stable@ as well. - Paolo]
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agoarm: KVM: force execution of HCPTR access on VM exit
Marc Zyngier [Mon, 16 Mar 2015 10:59:43 +0000 (10:59 +0000)]
arm: KVM: force execution of HCPTR access on VM exit

[ Upstream commit 85e84ba31039595995dae80b277378213602891b ]

On VM entry, we disable access to the VFP registers in order to
perform a lazy save/restore of these registers.

On VM exit, we restore access, test if we did enable them before,
and save/restore the guest/host registers if necessary. In this
sequence, the FPEXC register is always accessed, irrespective
of the trapping configuration.

If the guest didn't touch the VFP registers, then the HCPTR access
has now enabled such access, but we're missing a barrier to ensure
architectural execution of the new HCPTR configuration. If the HCPTR
access has been delayed/reordered, the subsequent access to FPEXC
will cause a trap, which we aren't prepared to handle at all.

The same condition exists when trapping to enable VFP for the guest.

The fix is to introduce a barrier after enabling VFP access. In the
vmexit case, it can be relaxed to only takes place if the guest hasn't
accessed its view of the VFP registers, making the access to FPEXC safe.

The set_hcptr macro is modified to deal with both vmenter/vmexit and
vmtrap operations, and now takes an optional label that is branched to
when the guest hasn't touched the VFP registers.

Reported-by: Vikram Sethi <vikrams@codeaurora.org>
Cc: stable@kernel.org # v3.9+
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agoiommu/amd: Handle large pages correctly in free_pagetable
Joerg Roedel [Thu, 18 Jun 2015 08:48:34 +0000 (10:48 +0200)]
iommu/amd: Handle large pages correctly in free_pagetable

[ Upstream commit 0b3fff54bc01e8e6064d222a33e6fa7adabd94cd ]

Make sure that we are skipping over large PTEs while walking
the page-table tree.

Cc: stable@kernel.org
Fixes: 5c34c403b723 ("iommu/amd: Fix memory leak in free_pagetable")
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agoarm64/kvm: Fix assembler compatibility of macros
Geoff Levand [Fri, 31 Oct 2014 23:06:47 +0000 (23:06 +0000)]
arm64/kvm: Fix assembler compatibility of macros

[ Upstream commit 286fb1cc32b11c18da3573a8c8c37a4f9da16e30 ]

Some of the macros defined in kvm_arm.h are useful in assembly files, but are
not compatible with the assembler.  Change any C language integer constant
definitions using appended U, UL, or ULL to the UL() preprocessor macro.  Also,
add a preprocessor include of the asm/memory.h file which defines the UL()
macro.

Fixes build errors like these when using kvm_arm.h in assembly
source files:

  Error: unexpected characters following instruction at operand 3 -- `and x0,x1,#((1U<<25)-1)'

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agoKVM: nSVM: Check for NRIPS support before updating control field
Bandan Das [Thu, 11 Jun 2015 06:05:33 +0000 (02:05 -0400)]
KVM: nSVM: Check for NRIPS support before updating control field

[ Upstream commit f104765b4f81fd74d69e0eb161e89096deade2db ]

If hardware doesn't support DecodeAssist - a feature that provides
more information about the intercept in the VMCB, KVM decodes the
instruction and then updates the next_rip vmcb control field.
However, NRIP support itself depends on cpuid Fn8000_000A_EDX[NRIPS].
Since skip_emulated_instruction() doesn't verify nrip support
before accepting control.next_rip as valid, avoid writing this
field if support isn't present.

Signed-off-by: Bandan Das <bsd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agoARM: clk-imx6q: refine sata's parent
Sebastien Szymanski [Wed, 20 May 2015 14:30:37 +0000 (16:30 +0200)]
ARM: clk-imx6q: refine sata's parent

[ Upstream commit 3a3ac04f0e0ef48c2d30ae6df7b8906421c2b35d ]

commit da946aeaeadcd24ff0cda9984c6fb8ed2bfd462a upstream.

According to IMX6D/Q RM, table 18-3, sata clock's parent is ahb, not ipg.

Signed-off-by: Sebastien Szymanski <sebastien.szymanski@armadeus.com>
Reviewed-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
[dirk.behme: Adjust moved file]
Signed-off-by: Dirk Behme <dirk.behme@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agoBtrfs: make xattr replace operations atomic
Filipe Manana [Sun, 9 Nov 2014 08:38:39 +0000 (08:38 +0000)]
Btrfs: make xattr replace operations atomic

[ Upstream commit 02590fd855d1690568b2fa439c942e933221b57a ]

commit 5f5bc6b1e2d5a6f827bc860ef2dc5b6f365d1339 upstream.

Replacing a xattr consists of doing a lookup for its existing value, delete
the current value from the respective leaf, release the search path and then
finally insert the new value. This leaves a time window where readers (getxattr,
listxattrs) won't see any value for the xattr. Xattrs are used to store ACLs,
so this has security implications.

This change also fixes 2 other existing issues which were:

*) Deleting the old xattr value without verifying first if the new xattr will
   fit in the existing leaf item (in case multiple xattrs are packed in the
   same item due to name hash collision);

*) Returning -EEXIST when the flag XATTR_CREATE is given and the xattr doesn't
   exist but we have have an existing item that packs muliple xattrs with
   the same name hash as the input xattr. In this case we should return ENOSPC.

A test case for xfstests follows soon.

Thanks to Alexandre Oliva for reporting the non-atomicity of the xattr replace
implementation.

Reported-by: Alexandre Oliva <oliva@gnu.org>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agox86/microcode/intel: Guard against stack overflow in the loader
Quentin Casasnovas [Tue, 3 Feb 2015 12:00:22 +0000 (13:00 +0100)]
x86/microcode/intel: Guard against stack overflow in the loader

[ Upstream commit f84598bd7c851f8b0bf8cd0d7c3be0d73c432ff4 ]

mc_saved_tmp is a static array allocated on the stack, we need to make
sure mc_saved_count stays within its bounds, otherwise we're overflowing
the stack in _save_mc(). A specially crafted microcode header could lead
to a kernel crash or potentially kernel execution.

Signed-off-by: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Link: http://lkml.kernel.org/r/1422964824-22056-1-git-send-email-quentin.casasnovas@oracle.com
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agonetfilter: nf_tables: allow to change chain policy without hook if it exists
Pablo Neira Ayuso [Tue, 17 Mar 2015 12:21:42 +0000 (13:21 +0100)]
netfilter: nf_tables: allow to change chain policy without hook if it exists

[ Upstream commit d6b6cb1d3e6f78d55c2d4043d77d0d8def3f3b99 ]

If there's an existing base chain, we have to allow to change the
default policy without indicating the hook information.

However, if the chain doesn't exists, we have to enforce the presence of
the hook attribute.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agonetfilter: nft_compat: set IP6T_F_PROTO flag if protocol is set
Pablo Neira Ayuso [Sat, 21 Mar 2015 18:25:05 +0000 (19:25 +0100)]
netfilter: nft_compat: set IP6T_F_PROTO flag if protocol is set

[ Upstream commit 749177ccc74f9c6d0f51bd78a15c652a2134aa11 ]

ip6tables extensions check for this flag to restrict match/target to a
given protocol. Without this flag set, SYNPROXY6 returns an error.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agonetfilter: Zero the tuple in nfnl_cthelper_parse_tuple()
Ian Wilson [Thu, 12 Mar 2015 09:37:58 +0000 (09:37 +0000)]
netfilter: Zero the tuple in nfnl_cthelper_parse_tuple()

[ Upstream commit 78146572b9cd20452da47951812f35b1ad4906be ]

nfnl_cthelper_parse_tuple() is called from nfnl_cthelper_new(),
nfnl_cthelper_get() and nfnl_cthelper_del().  In each case they pass
a pointer to an nf_conntrack_tuple data structure local variable:

    struct nf_conntrack_tuple tuple;
    ...
    ret = nfnl_cthelper_parse_tuple(&tuple, tb[NFCTH_TUPLE]);

The problem is that this local variable is not initialized, and
nfnl_cthelper_parse_tuple() only initializes two fields: src.l3num and
dst.protonum.  This leaves all other fields with undefined values
based on whatever is on the stack:

    tuple->src.l3num = ntohs(nla_get_be16(tb[NFCTH_TUPLE_L3PROTONUM]));
    tuple->dst.protonum = nla_get_u8(tb[NFCTH_TUPLE_L4PROTONUM]);

The symptom observed was that when the rpc and tns helpers were added
then traffic to port 1536 was being sent to user-space.

Signed-off-by: Ian Wilson <iwilson@brocade.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agosb_edac: Fix erroneous bytes->gigabytes conversion
Jim Snow [Tue, 18 Nov 2014 13:51:09 +0000 (14:51 +0100)]
sb_edac: Fix erroneous bytes->gigabytes conversion

[ Upstream commit ece9210859abb2cc0126f8adbfbbdee668d4906a ]

commit 8c009100295597f23978c224aec5751a365bc965 upstream.

Signed-off-by: Jim Snow <jim.snow@intel.com>
Signed-off-by: Lukasz Anaczkowski <lukasz.anaczkowski@intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@osg.samsung.com>
[ vlee: Backported to 3.14. Adjusted context. ]
Signed-off-by: Vinson Lee <vlee@twitter.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agotracing: Have filter check for balanced ops
Steven Rostedt [Mon, 15 Jun 2015 21:50:25 +0000 (17:50 -0400)]
tracing: Have filter check for balanced ops

[ Upstream commit 2cf30dc180cea808077f003c5116388183e54f9e ]

When the following filter is used it causes a warning to trigger:

 # cd /sys/kernel/debug/tracing
 # echo "((dev==1)blocks==2)" > events/ext4/ext4_truncate_exit/filter
-bash: echo: write error: Invalid argument
 # cat events/ext4/ext4_truncate_exit/filter
((dev==1)blocks==2)
^
parse_error: No error

 ------------[ cut here ]------------
 WARNING: CPU: 2 PID: 1223 at kernel/trace/trace_events_filter.c:1640 replace_preds+0x3c5/0x990()
 Modules linked in: bnep lockd grace bluetooth  ...
 CPU: 3 PID: 1223 Comm: bash Tainted: G        W       4.1.0-rc3-test+ #450
 Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v02.05 05/07/2012
  0000000000000668 ffff8800c106bc98 ffffffff816ed4f9 ffff88011ead0cf0
  0000000000000000 ffff8800c106bcd8 ffffffff8107fb07 ffffffff8136b46c
  ffff8800c7d81d48 ffff8800d4c2bc00 ffff8800d4d4f920 00000000ffffffea
 Call Trace:
  [<ffffffff816ed4f9>] dump_stack+0x4c/0x6e
  [<ffffffff8107fb07>] warn_slowpath_common+0x97/0xe0
  [<ffffffff8136b46c>] ? _kstrtoull+0x2c/0x80
  [<ffffffff8107fb6a>] warn_slowpath_null+0x1a/0x20
  [<ffffffff81159065>] replace_preds+0x3c5/0x990
  [<ffffffff811596b2>] create_filter+0x82/0xb0
  [<ffffffff81159944>] apply_event_filter+0xd4/0x180
  [<ffffffff81152bbf>] event_filter_write+0x8f/0x120
  [<ffffffff811db2a8>] __vfs_write+0x28/0xe0
  [<ffffffff811dda43>] ? __sb_start_write+0x53/0xf0
  [<ffffffff812e51e0>] ? security_file_permission+0x30/0xc0
  [<ffffffff811dc408>] vfs_write+0xb8/0x1b0
  [<ffffffff811dc72f>] SyS_write+0x4f/0xb0
  [<ffffffff816f5217>] system_call_fastpath+0x12/0x6a
 ---[ end trace e11028bd95818dcd ]---

Worse yet, reading the error message (the filter again) it says that
there was no error, when there clearly was. The issue is that the
code that checks the input does not check for balanced ops. That is,
having an op between a closed parenthesis and the next token.

This would only cause a warning, and fail out before doing any real
harm, but it should still not caues a warning, and the error reported
should work:

 # cd /sys/kernel/debug/tracing
 # echo "((dev==1)blocks==2)" > events/ext4/ext4_truncate_exit/filter
-bash: echo: write error: Invalid argument
 # cat events/ext4/ext4_truncate_exit/filter
((dev==1)blocks==2)
^
parse_error: Meaningless filter expression

And give no kernel warning.

Link: http://lkml.kernel.org/r/20150615175025.7e809215@gandalf.local.home
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: stable@vger.kernel.org # 2.6.31+
Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Tested-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agofs: Fix S_NOSEC handling
Jan Kara [Thu, 21 May 2015 14:05:52 +0000 (16:05 +0200)]
fs: Fix S_NOSEC handling

[ Upstream commit 2426f3910069ed47c0cc58559a6d088af7920201 ]

file_remove_suid() could mistakenly set S_NOSEC inode bit when root was
modifying the file. As a result following writes to the file by ordinary
user would avoid clearing suid or sgid bits.

Fix the bug by checking actual mode bits before setting S_NOSEC.

CC: stable@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agocan: fix loss of CAN frames in raw_rcv
Oliver Hartkopp [Sun, 21 Jun 2015 16:50:44 +0000 (18:50 +0200)]
can: fix loss of CAN frames in raw_rcv

[ Upstream commit 36c01245eb8046c16eee6431e7dbfbb302635fa8 ]

As reported by Manfred Schlaegl here

   http://marc.info/?l=linux-netdev&m=143482089824232&w=2

commit 514ac99c64b "can: fix multiple delivery of a single CAN frame for
overlapping CAN filters" requires the skb->tstamp to be set to check for
identical CAN skbs.

As net timestamping is influenced by several players (netstamp_needed and
netdev_tstamp_prequeue) Manfred missed a proper timestamp which leads to
CAN frame loss.

As skb timestamping became now mandatory for CAN related skbs this patch
makes sure that received CAN skbs always have a proper timestamp set.
Maybe there's a better solution in the future but this patch fixes the
CAN frame loss so far.

Reported-by: Manfred Schlaegl <manfred.schlaegl@gmx.at>
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agofs/ufs: restore s_lock mutex
Fabian Frederick [Wed, 10 Jun 2015 00:09:32 +0000 (10:09 +1000)]
fs/ufs: restore s_lock mutex

[ Upstream commit cdd9eefdf905e92e7fc6cc393314efe68dc6ff66 ]

Commit 0244756edc4b98c ("ufs: sb mutex merge + mutex_destroy") generated
deadlocks in read/write mode on mkdir.

This patch partially reverts it keeping fixes by Andrew Morton and
mutex_destroy()

[AV: fixed a missing bit in ufs_remount()]

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Reported-by: Ian Campbell <ian.campbell@citrix.com>
Suggested-by: Jan Kara <jack@suse.cz>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Evgeniy Dushistov <dushistov@mail.ru>
Cc: Alexey Khoroshilov <khoroshilov@ispras.ru>
Cc: Roger Pau Monne <roger.pau@citrix.com>
Cc: Ian Jackson <Ian.Jackson@eu.citrix.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agofs/ufs: revert "ufs: fix deadlocks introduced by sb mutex merge"
Fabian Frederick [Wed, 10 Jun 2015 00:09:32 +0000 (10:09 +1000)]
fs/ufs: revert "ufs: fix deadlocks introduced by sb mutex merge"

[ Upstream commit 13b987ea275840d74d9df9a44326632fab1894da ]

This reverts commit 9ef7db7f38d0 ("ufs: fix deadlocks introduced by sb
mutex merge") That patch tried to solve commit 0244756edc4b98c ("ufs: sb
mutex merge + mutex_destroy") which is itself partially reverted due to
multiple deadlocks.

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Suggested-by: Jan Kara <jack@suse.cz>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Evgeniy Dushistov <dushistov@mail.ru>
Cc: Alexey Khoroshilov <khoroshilov@ispras.ru>
Cc: Roger Pau Monne <roger.pau@citrix.com>
Cc: Ian Jackson <Ian.Jackson@eu.citrix.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agonetfilter: nfnetlink_cthelper: Remove 'const' and '&' to avoid warnings
Chen Gang [Wed, 24 Dec 2014 15:04:54 +0000 (23:04 +0800)]
netfilter: nfnetlink_cthelper: Remove 'const' and '&' to avoid warnings

[ Upstream commit b18c5d15e8714336365d9d51782d5b53afa0443c ]

The related code can be simplified, and also can avoid related warnings
(with allmodconfig under parisc):

    CC [M]  net/netfilter/nfnetlink_cthelper.o
  net/netfilter/nfnetlink_cthelper.c: In function ‘nfnl_cthelper_from_nlattr’:
  net/netfilter/nfnetlink_cthelper.c:97:9: warning: passing argument 1 o ‘memcpy’ discards ‘const’ qualifier from pointer target type [-Wdiscarded-array-qualifiers]
    memcpy(&help->data, nla_data(attr), help->helper->data_len);
           ^
  In file included from include/linux/string.h:17:0,
                   from include/uapi/linux/uuid.h:25,
                   from include/linux/uuid.h:23,
                   from include/linux/mod_devicetable.h:12,
                   from ./arch/parisc/include/asm/hardware.h:4,
                   from ./arch/parisc/include/asm/processor.h:15,
                   from ./arch/parisc/include/asm/spinlock.h:6,
                   from ./arch/parisc/include/asm/atomic.h:21,
                   from include/linux/atomic.h:4,
                   from ./arch/parisc/include/asm/bitops.h:12,
                   from include/linux/bitops.h:36,
                   from include/linux/kernel.h:10,
                   from include/linux/list.h:8,
                   from include/linux/module.h:9,
                   from net/netfilter/nfnetlink_cthelper.c:11:
  ./arch/parisc/include/asm/string.h:8:8: note: expected ‘void *’ but argument is of type ‘const char (*)[]’
   void * memcpy(void * dest,const void *src,size_t count);
          ^

Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@soleta.eu>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agoconfig: Enable NEED_DMA_MAP_STATE by default when SWIOTLB is selected
Konrad Rzeszutek Wilk [Fri, 17 Apr 2015 19:04:48 +0000 (15:04 -0400)]
config: Enable NEED_DMA_MAP_STATE by default when SWIOTLB is selected

[ Upstream commit a6dfa128ce5c414ab46b1d690f7a1b8decb8526d ]

A huge amount of NIC drivers use the DMA API, however if
compiled under 32-bit an very important part of the DMA API can
be ommitted leading to the drivers not working at all
(especially if used with 'swiotlb=force iommu=soft').

As Prashant Sreedharan explains it: "the driver [tg3] uses
DEFINE_DMA_UNMAP_ADDR(), dma_unmap_addr_set() to keep a copy of
the dma "mapping" and dma_unmap_addr() to get the "mapping"
value. On most of the platforms this is a no-op, but ... with
"iommu=soft and swiotlb=force" this house keeping is required,
... otherwise we pass 0 while calling pci_unmap_/pci_dma_sync_
instead of the DMA address."

As such enable this even when using 32-bit kernels.

Reported-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Prashant Sreedharan <prashant@broadcom.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michael Chan <mchan@broadcom.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: boris.ostrovsky@oracle.com
Cc: cascardo@linux.vnet.ibm.com
Cc: david.vrabel@citrix.com
Cc: sanjeevb@broadcom.com
Cc: siva.kallam@broadcom.com
Cc: vyasevich@gmail.com
Cc: xen-devel@lists.xensource.com
Link: http://lkml.kernel.org/r/20150417190448.GA9462@l.oracle.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agokprobes/x86: Return correct length in __copy_instruction()
Eugene Shatokhin [Tue, 17 Mar 2015 10:09:18 +0000 (19:09 +0900)]
kprobes/x86: Return correct length in __copy_instruction()

[ Upstream commit c80e5c0c23ce2282476fdc64c4b5e3d3a40723fd ]

On x86-64, __copy_instruction() always returns 0 (error) if the
instruction uses %rip-relative addressing. This is because
kernel_insn_init() is called the second time for 'insn' instance
in such cases and sets all its fields to 0.

Because of this, trying to place a kprobe on such instruction
will fail, register_kprobe() will return -EINVAL.

This patch fixes the problem.

Signed-off-by: Eugene Shatokhin <eugene.shatokhin@rosalab.ru>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Link: http://lkml.kernel.org/r/20150317100918.28349.94654.stgit@localhost.localdomain
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agomnt: Modify fs_fully_visible to deal with locked ro nodev and atime
Eric W. Biederman [Sat, 9 May 2015 04:49:47 +0000 (23:49 -0500)]
mnt: Modify fs_fully_visible to deal with locked ro nodev and atime

[ Upstream commit 8c6cf9cc829fcd0b179b59f7fe288941d0e31108 ]

Ignore an existing mount if the locked readonly, nodev or atime
attributes are less permissive than the desired attributes
of the new mount.

On success ensure the new mount locks all of the same readonly, nodev and
atime attributes as the old mount.

The nosuid and noexec attributes are not checked here as this change
is destined for stable and enforcing those attributes causes a
regression in lxc and libvirt-lxc where those applications will not
start and there are no known executables on sysfs or proc and no known
way to create exectuables without code modifications

Cc: stable@vger.kernel.org
Fixes: e51db73532955 ("userns: Better restrictions on when proc and sysfs can be mounted")
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agotty/serial: at91: RS485 mode: 0 is valid for delay_rts_after_send
Nicolas Ferre [Mon, 11 May 2015 11:00:31 +0000 (13:00 +0200)]
tty/serial: at91: RS485 mode: 0 is valid for delay_rts_after_send

[ Upstream commit 8687634b7908c42eb700e0469e110e02833611d1 ]

In RS485 mode, we may want to set the delay_rts_after_send value to 0.
In the datasheet, the 0 value is said to "disable" the Transmitter Timeguard but
this is exactly the expected behavior if we want no delay...

Moreover, if the value was set to non-zero value by device-tree or earlier
ioctl command, it was impossible to change it back to zero.

Reported-by: Sami Pietikäinen <Sami.Pietikainen@wapice.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Cc: stable@vger.kernel.org # 3.2+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agomnt: Refactor the logic for mounting sysfs and proc in a user namespace
Eric W. Biederman [Sat, 9 May 2015 04:22:29 +0000 (23:22 -0500)]
mnt: Refactor the logic for mounting sysfs and proc in a user namespace

[ Upstream commit 1b852bceb0d111e510d1a15826ecc4a19358d512 ]

Fresh mounts of proc and sysfs are a very special case that works very
much like a bind mount.  Unfortunately the current structure can not
preserve the MNT_LOCK... mount flags.  Therefore refactor the logic
into a form that can be modified to preserve those lock bits.

Add a new filesystem flag FS_USERNS_VISIBLE that requires some mount
of the filesystem be fully visible in the current mount namespace,
before the filesystem may be mounted.

Move the logic for calling fs_fully_visible from proc and sysfs into
fs/namespace.c where it has greater access to mount namespace state.

Cc: stable@vger.kernel.org
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agoLinux 3.18.18 v3.18.18
Sasha Levin [Fri, 10 Jul 2015 00:46:48 +0000 (20:46 -0400)]
Linux 3.18.18

Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agommc: sdhci-pxav3: do the mbus window configuration after enabling clocks
Thomas Petazzoni [Wed, 31 Dec 2014 10:54:10 +0000 (11:54 +0100)]
mmc: sdhci-pxav3: do the mbus window configuration after enabling clocks

[ upstream commit aa8165f914420f143476305a01894b017d3abe6b ]

In commit 5491ce3f79ee ("mmc: sdhci-pxav3: add support for the Armada
38x SDHCI controller"), the sdhci-pxav3 driver was extended to include
support for the SDHCI controller found in the Armada 38x
processor. This mainly involved adding some MBus window related
configuration.

However, this configuration is currently done too early in ->probe():
it is done before clocks are enabled, while this configuration
involves touching the registers of the controller, which will hang the
SoC if the clock is disabled. It wasn't noticed until now because the
bootloader typically leaves gatable clocks enabled, but in situations
where we have a deferred probe (due to a CD GPIO that cannot be taken,
for example), then the probe will be re-tried later, after a clock
disable has been done in the exit path of the failed probe attempt of
the device. This second probe() will hang the system due to the clock
being disabled.

This can for example be produced on Armada 385 GP, which has a CD GPIO
connected to an I2C PCA9555. If the driver for the PCA9555 is not
compiled into the kernel, then we will have the following sequence of
events:

  1. The SDHCI probes
  2. It does the MBus configuration (which works, because the clock is
     left enabled by the bootloader)
  3. It enables the clock
  4. It tries to get the CD GPIO, which fails due to the driver being
     missing, so -EPROBE_DEFER is returned.
  5. Before returning -EPROBE_DEFER, the driver cleans up what was
     done, which includes disabling the clock.
  6. Later on, the SDHCI probe is tried again.
  7. It does the MBus configuration, which hangs because the clock is
     no longer enabled.

This commit does the obvious fix of doing the MBus configuration after
the clock has been enabled by the driver.

Fixes: 5491ce3f79ee ("mmc: sdhci-pxav3: add support for the Armada 38x SDHCI controller")
Cc: <stable@vger.kernel.org> # v3.15+
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
[jogo: rebased onto 3.18.17]
Signed-off-by: Jonas Gorski <jogo@openwrt.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agosctp: Fix race between OOTB responce and route removal
Alexander Sverdlin [Mon, 29 Jun 2015 08:41:03 +0000 (10:41 +0200)]
sctp: Fix race between OOTB responce and route removal

[ Upstream commit 29c4afc4e98f4dc0ea9df22c631841f9c220b944 ]

There is NULL pointer dereference possible during statistics update if the route
used for OOTB responce is removed at unfortunate time. If the route exists when
we receive OOTB packet and we finally jump into sctp_packet_transmit() to send
ABORT, but in the meantime route is removed under our feet, we take "no_route"
path and try to update stats with IP_INC_STATS(sock_net(asoc->base.sk), ...).

But sctp_ootb_pkt_new() used to prepare responce packet doesn't call
sctp_transport_set_owner() and therefore there is no asoc associated with this
packet. Probably temporary asoc just for OOTB responces is overkill, so just
introduce a check like in all other places in sctp_packet_transmit(), where
"asoc" is dereferenced.

To reproduce this, one needs to
0. ensure that sctp module is loaded (otherwise ABORT is not generated)
1. remove default route on the machine
2. while true; do
     ip route del [interface-specific route]
     ip route add [interface-specific route]
   done
3. send enough OOTB packets (i.e. HB REQs) from another host to trigger ABORT
   responce

On x86_64 the crash looks like this:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
IP: [<ffffffffa05ec9ac>] sctp_packet_transmit+0x63c/0x730 [sctp]
PGD 0
Oops: 0000 [#1] PREEMPT SMP
Modules linked in: ...
CPU: 0 PID: 0 Comm: swapper/0 Tainted: G           O    4.0.5-1-ARCH #1
Hardware name: ...
task: ffffffff818124c0 ti: ffffffff81800000 task.ti: ffffffff81800000
RIP: 0010:[<ffffffffa05ec9ac>]  [<ffffffffa05ec9ac>] sctp_packet_transmit+0x63c/0x730 [sctp]
RSP: 0018:ffff880127c037b8  EFLAGS: 00010296
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00000015ff66b480
RDX: 00000015ff66b400 RSI: ffff880127c17200 RDI: ffff880123403700
RBP: ffff880127c03888 R08: 0000000000017200 R09: ffffffff814625af
R10: ffffea00047e4680 R11: 00000000ffffff80 R12: ffff8800b0d38a28
R13: ffff8800b0d38a28 R14: ffff8800b3e88000 R15: ffffffffa05f24e0
FS:  0000000000000000(0000) GS:ffff880127c00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000020 CR3: 00000000c855b000 CR4: 00000000000007f0
Stack:
 ffff880127c03910 ffff8800b0d38a28 ffffffff8189d240 ffff88011f91b400
 ffff880127c03828 ffffffffa05c94c5 0000000000000000 ffff8800baa1c520
 0000000000000000 0000000000000001 0000000000000000 0000000000000000
Call Trace:
 <IRQ>
 [<ffffffffa05c94c5>] ? sctp_sf_tabort_8_4_8.isra.20+0x85/0x140 [sctp]
 [<ffffffffa05d6b42>] ? sctp_transport_put+0x52/0x80 [sctp]
 [<ffffffffa05d0bfc>] sctp_do_sm+0xb8c/0x19a0 [sctp]
 [<ffffffff810b0e00>] ? trigger_load_balance+0x90/0x210
 [<ffffffff810e0329>] ? update_process_times+0x59/0x60
 [<ffffffff812c7a40>] ? timerqueue_add+0x60/0xb0
 [<ffffffff810e0549>] ? enqueue_hrtimer+0x29/0xa0
 [<ffffffff8101f599>] ? read_tsc+0x9/0x10
 [<ffffffff8116d4b5>] ? put_page+0x55/0x60
 [<ffffffff810ee1ad>] ? clockevents_program_event+0x6d/0x100
 [<ffffffff81462b68>] ? skb_free_head+0x58/0x80
 [<ffffffffa029a10b>] ? chksum_update+0x1b/0x27 [crc32c_generic]
 [<ffffffff81283f3e>] ? crypto_shash_update+0xce/0xf0
 [<ffffffffa05d3993>] sctp_endpoint_bh_rcv+0x113/0x280 [sctp]
 [<ffffffffa05dd4e6>] sctp_inq_push+0x46/0x60 [sctp]
 [<ffffffffa05ed7a0>] sctp_rcv+0x880/0x910 [sctp]
 [<ffffffffa05ecb50>] ? sctp_packet_transmit_chunk+0xb0/0xb0 [sctp]
 [<ffffffffa05ecb70>] ? sctp_csum_update+0x20/0x20 [sctp]
 [<ffffffff814b05a5>] ? ip_route_input_noref+0x235/0xd30
 [<ffffffff81051d6b>] ? ack_ioapic_level+0x7b/0x150
 [<ffffffff814b27be>] ip_local_deliver_finish+0xae/0x210
 [<ffffffff814b2e15>] ip_local_deliver+0x35/0x90
 [<ffffffff814b2a15>] ip_rcv_finish+0xf5/0x370
 [<ffffffff814b3128>] ip_rcv+0x2b8/0x3a0
 [<ffffffff81474193>] __netif_receive_skb_core+0x763/0xa50
 [<ffffffff81476c28>] __netif_receive_skb+0x18/0x60
 [<ffffffff81476cb0>] netif_receive_skb_internal+0x40/0xd0
 [<ffffffff814776c8>] napi_gro_receive+0xe8/0x120
 [<ffffffffa03946aa>] rtl8169_poll+0x2da/0x660 [r8169]
 [<ffffffff8147896a>] net_rx_action+0x21a/0x360
 [<ffffffff81078dc1>] __do_softirq+0xe1/0x2d0
 [<ffffffff8107912d>] irq_exit+0xad/0xb0
 [<ffffffff8157d158>] do_IRQ+0x58/0xf0
 [<ffffffff8157b06d>] common_interrupt+0x6d/0x6d
 <EOI>
 [<ffffffff810e1218>] ? hrtimer_start+0x18/0x20
 [<ffffffffa05d65f9>] ? sctp_transport_destroy_rcu+0x29/0x30 [sctp]
 [<ffffffff81020c50>] ? mwait_idle+0x60/0xa0
 [<ffffffff810216ef>] arch_cpu_idle+0xf/0x20
 [<ffffffff810b731c>] cpu_startup_entry+0x3ec/0x480
 [<ffffffff8156b365>] rest_init+0x85/0x90
 [<ffffffff818eb035>] start_kernel+0x48b/0x4ac
 [<ffffffff818ea120>] ? early_idt_handlers+0x120/0x120
 [<ffffffff818ea339>] x86_64_start_reservations+0x2a/0x2c
 [<ffffffff818ea49c>] x86_64_start_kernel+0x161/0x184
Code: 90 48 8b 80 b8 00 00 00 48 89 85 70 ff ff ff 48 83 bd 70 ff ff ff 00 0f 85 cd fa ff ff 48 89 df 31 db e8 18 63 e7 e0 48 8b 45 80 <48> 8b 40 20 48 8b 40 30 48 8b 80 68 01 00 00 65 48 ff 40 78 e9
RIP  [<ffffffffa05ec9ac>] sctp_packet_transmit+0x63c/0x730 [sctp]
 RSP <ffff880127c037b8>
CR2: 0000000000000020
---[ end trace 5aec7fd2dc983574 ]---
Kernel panic - not syncing: Fatal exception in interrupt
Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffff9fffffff)
drm_kms_helper: panic occurred, switching back to text console
---[ end Kernel panic - not syncing: Fatal exception in interrupt

Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agobnx2x: fix lockdep splat
Eric Dumazet [Fri, 26 Jun 2015 05:32:29 +0000 (07:32 +0200)]
bnx2x: fix lockdep splat

[ Upstream commit d53c66a5b80698620f7c9ba2372fff4017e987b8 ]

Michel reported following lockdep splat

[   44.718117] INFO: trying to register non-static key.
[   44.723081] the code is fine but needs lockdep annotation.
[   44.728559] turning off the locking correctness validator.
[   44.734036] CPU: 8 PID: 5483 Comm: ethtool Not tainted 4.1.0
[   44.770289] Call Trace:
[   44.772741]  [<ffffffff816eb1cd>] dump_stack+0x4c/0x65
[   44.777879]  [<ffffffff8111d921>] ? console_unlock+0x1f1/0x510
[   44.783708]  [<ffffffff811121f5>] __lock_acquire+0x1d05/0x1f10
[   44.789538]  [<ffffffff8111370a>] ? mark_held_locks+0x6a/0x90
[   44.795276]  [<ffffffff81113835>] ? trace_hardirqs_on_caller+0x105/0x1d0
[   44.801967]  [<ffffffff8111390d>] ? trace_hardirqs_on+0xd/0x10
[   44.807793]  [<ffffffff811330fa>] ? hrtimer_try_to_cancel+0x4a/0x250
[   44.814142]  [<ffffffff81112ba6>] lock_acquire+0xb6/0x290
[   44.819537]  [<ffffffff810d6675>] ? flush_work+0x5/0x280
[   44.824844]  [<ffffffff810d66ad>] flush_work+0x3d/0x280
[   44.830061]  [<ffffffff810d6675>] ? flush_work+0x5/0x280
[   44.835366]  [<ffffffff816f3c43>] ? schedule_hrtimeout_range+0x13/0x20
[   44.841889]  [<ffffffff8112ec9b>] ? usleep_range+0x4b/0x50
[   44.847365]  [<ffffffff8111370a>] ? mark_held_locks+0x6a/0x90
[   44.853102]  [<ffffffff810d8585>] ? __cancel_work_timer+0x105/0x1c0
[   44.859359]  [<ffffffff81113835>] ? trace_hardirqs_on_caller+0x105/0x1d0
[   44.866045]  [<ffffffff810d851f>] __cancel_work_timer+0x9f/0x1c0
[   44.872048]  [<ffffffffa0010982>] ? bnx2x_func_stop+0x42/0x90 [bnx2x]
[   44.878481]  [<ffffffff810d8670>] cancel_work_sync+0x10/0x20
[   44.884134]  [<ffffffffa00259e5>] bnx2x_chip_cleanup+0x245/0x730 [bnx2x]
[   44.890829]  [<ffffffff8110ce02>] ? up+0x32/0x50
[   44.895439]  [<ffffffff811306b5>] ? del_timer_sync+0x5/0xd0
[   44.901005]  [<ffffffffa005596d>] bnx2x_nic_unload+0x20d/0x8e0 [bnx2x]
[   44.907527]  [<ffffffff811f1aef>] ? might_fault+0x5f/0xb0
[   44.912921]  [<ffffffffa005851c>] bnx2x_reload_if_running+0x2c/0x50 [bnx2x]
[   44.919879]  [<ffffffffa005a3c5>] bnx2x_set_ringparam+0x2b5/0x460 [bnx2x]
[   44.926664]  [<ffffffff815d498b>] dev_ethtool+0x55b/0x1c40
[   44.932148]  [<ffffffff815dfdc7>] ? rtnl_lock+0x17/0x20
[   44.937364]  [<ffffffff815e7f8b>] dev_ioctl+0x17b/0x630
[   44.942582]  [<ffffffff815abf8d>] sock_do_ioctl+0x5d/0x70
[   44.947972]  [<ffffffff815ac013>] sock_ioctl+0x73/0x280
[   44.953192]  [<ffffffff8124c1c8>] do_vfs_ioctl+0x88/0x5b0
[   44.958587]  [<ffffffff8110d0b3>] ? up_read+0x23/0x40
[   44.963631]  [<ffffffff812584cc>] ? __fget_light+0x6c/0xa0
[   44.969105]  [<ffffffff8124c781>] SyS_ioctl+0x91/0xb0
[   44.974149]  [<ffffffff816f4dd7>] system_call_fastpath+0x12/0x6f

As bnx2x_init_ptp() is only called if bp->flags contains PTP_SUPPORTED,
we also need to guard bnx2x_stop_ptp() with same condition, otherwise
ptp_task workqueue is not initialized and kernel barfs on
cancel_work_sync()

Fixes: eeed018cbfa30 ("bnx2x: Add timestamping and PTP hardware clock support")
Reported-by: Michel Lespinasse <walken@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Michal Kalderon <Michal.Kalderon@qlogic.com>
Cc: Ariel Elior <Ariel.Elior@qlogic.com>
Cc: Yuval Mintz <Yuval.Mintz@qlogic.com>
Cc: David Decotigny <decot@google.com>
Acked-by: Sony Chacko <sony.chacko@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agonet: phy: fix phy link up when limiting speed via device tree
Mugunthan V N [Thu, 25 Jun 2015 16:51:02 +0000 (22:21 +0530)]
net: phy: fix phy link up when limiting speed via device tree

[ Upstream commit eb686231fce3770299760f24fdcf5ad041f44153 ]

When limiting phy link speed using "max-speed" to 100mbps or less on a
giga bit phy, phy never completes auto negotiation and phy state
machine is held in PHY_AN. Fixing this issue by comparing the giga
bit advertise though phydev->supported doesn't have it but phy has
BMSR_ESTATEN set. So that auto negotiation is restarted as old and
new advertise are different and link comes up fine.

Signed-off-by: Mugunthan V N <mugunthanvnm@ti.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agonet/mlx4_en: Wake TX queues only when there's enough room
Ido Shamay [Thu, 25 Jun 2015 08:29:42 +0000 (11:29 +0300)]
net/mlx4_en: Wake TX queues only when there's enough room

[ Upstream commit 488a9b48e398b157703766e2cd91ea45ac6997c5 ]

Indication of a single completed packet, marked by txbbs_skipped
being bigger then zero, in not enough in order to wake up a
stopped TX queue. The completed packet may contain a single TXBB,
while next packet to be sent (after the wake up) may have multiple
TXBBs (LSO/TSO packets for example), causing overflow in queue followed
by WQE corruption and TX queue timeout.
Instead, wake the stopped queue only when there's enough room for the
worst case (maximum sized WQE) packet that we should need to handle after
the queue is opened again.

Also created an helper routine - mlx4_en_is_tx_ring_full, which checks
if the current TX ring is full or not. It provides better code readability
and removes code duplication.

Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agotcp: Do not call tcp_fastopen_reset_cipher from interrupt context
Christoph Paasch [Thu, 18 Jun 2015 16:15:34 +0000 (09:15 -0700)]
tcp: Do not call tcp_fastopen_reset_cipher from interrupt context

[ Upstream commit dfea2aa654243f70dc53b8648d0bbdeec55a7df1 ]

tcp_fastopen_reset_cipher really cannot be called from interrupt
context. It allocates the tcp_fastopen_context with GFP_KERNEL and
calls crypto_alloc_cipher, which allocates all kind of stuff with
GFP_KERNEL.

Thus, we might sleep when the key-generation is triggered by an
incoming TFO cookie-request which would then happen in interrupt-
context, as shown by enabling CONFIG_DEBUG_ATOMIC_SLEEP:

[   36.001813] BUG: sleeping function called from invalid context at mm/slub.c:1266
[   36.003624] in_atomic(): 1, irqs_disabled(): 0, pid: 1016, name: packetdrill
[   36.004859] CPU: 1 PID: 1016 Comm: packetdrill Not tainted 4.1.0-rc7 #14
[   36.006085] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
[   36.008250]  00000000000004f2 ffff88007f8838a8 ffffffff8171d53a ffff880075a084a8
[   36.009630]  ffff880075a08000 ffff88007f8838c8 ffffffff810967d3 ffff88007f883928
[   36.011076]  0000000000000000 ffff88007f8838f8 ffffffff81096892 ffff88007f89be00
[   36.012494] Call Trace:
[   36.012953]  <IRQ>  [<ffffffff8171d53a>] dump_stack+0x4f/0x6d
[   36.014085]  [<ffffffff810967d3>] ___might_sleep+0x103/0x170
[   36.015117]  [<ffffffff81096892>] __might_sleep+0x52/0x90
[   36.016117]  [<ffffffff8118e887>] kmem_cache_alloc_trace+0x47/0x190
[   36.017266]  [<ffffffff81680d82>] ? tcp_fastopen_reset_cipher+0x42/0x130
[   36.018485]  [<ffffffff81680d82>] tcp_fastopen_reset_cipher+0x42/0x130
[   36.019679]  [<ffffffff81680f01>] tcp_fastopen_init_key_once+0x61/0x70
[   36.020884]  [<ffffffff81680f2c>] __tcp_fastopen_cookie_gen+0x1c/0x60
[   36.022058]  [<ffffffff816814ff>] tcp_try_fastopen+0x58f/0x730
[   36.023118]  [<ffffffff81671788>] tcp_conn_request+0x3e8/0x7b0
[   36.024185]  [<ffffffff810e3872>] ? __module_text_address+0x12/0x60
[   36.025327]  [<ffffffff8167b2e1>] tcp_v4_conn_request+0x51/0x60
[   36.026410]  [<ffffffff816727e0>] tcp_rcv_state_process+0x190/0xda0
[   36.027556]  [<ffffffff81661f97>] ? __inet_lookup_established+0x47/0x170
[   36.028784]  [<ffffffff8167c2ad>] tcp_v4_do_rcv+0x16d/0x3d0
[   36.029832]  [<ffffffff812e6806>] ? security_sock_rcv_skb+0x16/0x20
[   36.030936]  [<ffffffff8167cc8a>] tcp_v4_rcv+0x77a/0x7b0
[   36.031875]  [<ffffffff816af8c3>] ? iptable_filter_hook+0x33/0x70
[   36.032953]  [<ffffffff81657d22>] ip_local_deliver_finish+0x92/0x1f0
[   36.034065]  [<ffffffff81657f1a>] ip_local_deliver+0x9a/0xb0
[   36.035069]  [<ffffffff81657c90>] ? ip_rcv+0x3d0/0x3d0
[   36.035963]  [<ffffffff81657569>] ip_rcv_finish+0x119/0x330
[   36.036950]  [<ffffffff81657ba7>] ip_rcv+0x2e7/0x3d0
[   36.037847]  [<ffffffff81610652>] __netif_receive_skb_core+0x552/0x930
[   36.038994]  [<ffffffff81610a57>] __netif_receive_skb+0x27/0x70
[   36.040033]  [<ffffffff81610b72>] process_backlog+0xd2/0x1f0
[   36.041025]  [<ffffffff81611482>] net_rx_action+0x122/0x310
[   36.042007]  [<ffffffff81076743>] __do_softirq+0x103/0x2f0
[   36.042978]  [<ffffffff81723e3c>] do_softirq_own_stack+0x1c/0x30

This patch moves the call to tcp_fastopen_init_key_once to the places
where a listener socket creates its TFO-state, which always happens in
user-context (either from the setsockopt, or implicitly during the
listen()-call)

Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Fixes: 222e83d2e0ae ("tcp: switch tcp_fastopen key generation to net_get_random_once")
Signed-off-by: Christoph Paasch <cpaasch@apple.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agoneigh: do not modify unlinked entries
Julian Anastasov [Tue, 16 Jun 2015 19:56:39 +0000 (22:56 +0300)]
neigh: do not modify unlinked entries

[ Upstream commit 2c51a97f76d20ebf1f50fef908b986cb051fdff9 ]

The lockless lookups can return entry that is unlinked.
Sometimes they get reference before last neigh_cleanup_and_release,
sometimes they do not need reference. Later, any
modification attempts may result in the following problems:

1. entry is not destroyed immediately because neigh_update
can start the timer for dead entry, eg. on change to NUD_REACHABLE
state. As result, entry lives for some time but is invisible
and out of control.

2. __neigh_event_send can run in parallel with neigh_destroy
while refcnt=0 but if timer is started and expired refcnt can
reach 0 for second time leading to second neigh_destroy and
possible crash.

Thanks to Eric Dumazet and Ying Xue for their work and analyze
on the __neigh_event_send change.

Fixes: 767e97e1e0db ("neigh: RCU conversion of struct neighbour")
Fixes: a263b3093641 ("ipv4: Make neigh lookups directly in output packet path.")
Fixes: 6fd6ce2056de ("ipv6: Do not depend on rt->n in ip6_finish_output2().")
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agopacket: avoid out of bounds read in round robin fanout
Willem de Bruijn [Wed, 17 Jun 2015 19:59:34 +0000 (15:59 -0400)]
packet: avoid out of bounds read in round robin fanout

[ Upstream commit 468479e6043c84f5a65299cc07cb08a22a28c2b1 ]

PACKET_FANOUT_LB computes f->rr_cur such that it is modulo
f->num_members. It returns the old value unconditionally, but
f->num_members may have changed since the last store. Ensure
that the return value is always < num.

When modifying the logic, simplify it further by replacing the loop
with an unconditional atomic increment.

Fixes: dc99f600698d ("packet: Add fanout support.")
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agopacket: read num_members once in packet_rcv_fanout()
Eric Dumazet [Tue, 16 Jun 2015 14:59:11 +0000 (07:59 -0700)]
packet: read num_members once in packet_rcv_fanout()

[ Upstream commit f98f4514d07871da7a113dd9e3e330743fd70ae4 ]

We need to tell compiler it must not read f->num_members multiple
times. Otherwise testing if num is not zero is flaky, and we could
attempt an invalid divide by 0 in fanout_demux_cpu()

Note bug was present in packet_rcv_fanout_hash() and
packet_rcv_fanout_lb() but final 3.1 had a simple location
after commit 95ec3eb417115fb ("packet: Add 'cpu' fanout policy.")

Fixes: dc99f600698dc ("packet: Add fanout support.")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agobridge: fix br_stp_set_bridge_priority race conditions
Nikolay Aleksandrov [Mon, 15 Jun 2015 17:28:51 +0000 (20:28 +0300)]
bridge: fix br_stp_set_bridge_priority race conditions

[ Upstream commit 2dab80a8b486f02222a69daca6859519e05781d9 ]

After the ->set() spinlocks were removed br_stp_set_bridge_priority
was left running without any protection when used via sysfs. It can
race with port add/del and could result in use-after-free cases and
corrupted lists. Tested by running port add/del in a loop with stp
enabled while setting priority in a loop, crashes are easily
reproducible.
The spinlocks around sysfs ->set() were removed in commit:
14f98f258f19 ("bridge: range check STP parameters")
There's also a race condition in the netlink priority support that is
fixed by this change, but it was introduced recently and the fixes tag
covers it, just in case it's needed the commit is:
af615762e972 ("bridge: add ageing_time, stp_state, priority over netlink")

Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
Fixes: 14f98f258f19 ("bridge: range check STP parameters")
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agosctp: fix ASCONF list handling
Marcelo Ricardo Leitner [Fri, 12 Jun 2015 13:16:41 +0000 (10:16 -0300)]
sctp: fix ASCONF list handling

[ Upstream commit 2d45a02d0166caf2627fe91897c6ffc3b19514c4 ]

->auto_asconf_splist is per namespace and mangled by functions like
sctp_setsockopt_auto_asconf() which doesn't guarantee any serialization.

Also, the call to inet_sk_copy_descendant() was backuping
->auto_asconf_list through the copy but was not honoring
->do_auto_asconf, which could lead to list corruption if it was
different between both sockets.

This commit thus fixes the list handling by using ->addr_wq_lock
spinlock to protect the list. A special handling is done upon socket
creation and destruction for that. Error handlig on sctp_init_sock()
will never return an error after having initialized asconf, so
sctp_destroy_sock() can be called without addrq_wq_lock. The lock now
will be take on sctp_close_sock(), before locking the socket, so we
don't do it in inverse order compared to sctp_addr_wq_timeout_handler().

Instead of taking the lock on sctp_sock_migrate() for copying and
restoring the list values, it's preferred to avoid rewritting it by
implementing sctp_copy_descendant().

Issue was found with a test application that kept flipping sysctl
default_auto_asconf on and off, but one could trigger it by issuing
simultaneous setsockopt() calls on multiple sockets or by
creating/destroying sockets fast enough. This is only triggerable
locally.

Fixes: 9f7d653b67ae ("sctp: Add Auto-ASCONF support (core).")
Reported-by: Ji Jianwen <jiji@redhat.com>
Suggested-by: Neil Horman <nhorman@tuxdriver.com>
Suggested-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agonet: don't wait for order-3 page allocation
Shaohua Li [Thu, 11 Jun 2015 23:50:48 +0000 (16:50 -0700)]
net: don't wait for order-3 page allocation

[ Upstream commit fb05e7a89f500cfc06ae277bdc911b281928995d ]

We saw excessive direct memory compaction triggered by skb_page_frag_refill.
This causes performance issues and add latency. Commit 5640f7685831e0
introduces the order-3 allocation. According to the changelog, the order-3
allocation isn't a must-have but to improve performance. But direct memory
compaction has high overhead. The benefit of order-3 allocation can't
compensate the overhead of direct memory compaction.

This patch makes the order-3 page allocation atomic. If there is no memory
pressure and memory isn't fragmented, the alloction will still success, so we
don't sacrifice the order-3 benefit here. If the atomic allocation fails,
direct memory compaction will not be triggered, skb_page_frag_refill will
fallback to order-0 immediately, hence the direct memory compaction overhead is
avoided. In the allocation failure case, kswapd is waken up and doing
compaction, so chances are allocation could success next time.

alloc_skb_with_frags is the same.

The mellanox driver does similar thing, if this is accepted, we must fix
the driver too.

V3: fix the same issue in alloc_skb_with_frags as pointed out by Eric
V2: make the changelog clearer

Cc: Eric Dumazet <edumazet@google.com>
Cc: Chris Mason <clm@fb.com>
Cc: Debabrata Banerjee <dbavatar@gmail.com>
Signed-off-by: Shaohua Li <shli@fb.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agobridge: fix multicast router rlist endless loop
Nikolay Aleksandrov [Tue, 9 Jun 2015 17:23:57 +0000 (10:23 -0700)]
bridge: fix multicast router rlist endless loop

[ Upstream commit 1a040eaca1a22f8da8285ceda6b5e4a2cb704867 ]

Since the addition of sysfs multicast router support if one set
multicast_router to "2" more than once, then the port would be added to
the hlist every time and could end up linking to itself and thus causing an
endless loop for rlist walkers.
So to reproduce just do:
echo 2 > multicast_router; echo 2 > multicast_router;
in a bridge port and let some igmp traffic flow, for me it hangs up
in br_multicast_flood().
Fix this by adding a check in br_multicast_add_router() if the port is
already linked.
The reason this didn't happen before the addition of multicast_router
sysfs entries is because there's a !hlist_unhashed check that prevents
it.

Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
Fixes: 0909e11758bd ("bridge: Add multicast_router sysfs entries")
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agosparc: Use GFP_ATOMIC in ldc_alloc_exp_dring() as it can be called in softirq context
Sowmini Varadhan [Tue, 21 Apr 2015 14:30:41 +0000 (10:30 -0400)]
sparc: Use GFP_ATOMIC in ldc_alloc_exp_dring() as it can be called in softirq context

[ Upstream commit 671d773297969bebb1732e1cdc1ec03aa53c6be2 ]

Since it is possible for vnet_event_napi to end up doing
vnet_control_pkt_engine -> ... -> vnet_send_attr ->
vnet_port_alloc_tx_ring -> ldc_alloc_exp_dring -> kzalloc()
(i.e., in softirq context), kzalloc() should be called with
GFP_ATOMIC from ldc_alloc_exp_dring.

Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agohwmon: (mcp3021) Fix broken output scaling
Stevens, Nick [Wed, 1 Jul 2015 16:07:41 +0000 (16:07 +0000)]
hwmon: (mcp3021) Fix broken output scaling

[ Upstream commit 347d7e45bd09ce09cbc30d5cea9de377eb22f55c ]

The mcp3021 scaling code is dividing the VDD (full-scale) value in
millivolts by the A2D resolution to obtain the scaling factor. When VDD
is 3300mV (the standard value) and the resolution is 12-bit (4096
divisions), the result is a scale factor of 3300/4096, which is always
one.  Effectively, the raw A2D reading is always being returned because
no scaling is applied.

This patch fixes the issue and simplifies the register-to-volts
calculation, removing the unneeded "output_scale" struct member.

Signed-off-by: Nick Stevens <Nick.Stevens@digi.com>
Cc: stable@vger.kernel.org # v3.10+
[Guenter Roeck: Dropped unnecessary value check]
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agonfs: take extra reference to fl->fl_file when running a LOCKU operation
Jeff Layton [Tue, 30 Jun 2015 18:12:30 +0000 (14:12 -0400)]
nfs: take extra reference to fl->fl_file when running a LOCKU operation

[ Upstream commit db2efec0caba4f81a22d95a34da640b86c313c8e ]

Jean reported another crash, similar to the one fixed by feaff8e5b2cf:

    BUG: unable to handle kernel NULL pointer dereference at 0000000000000148
    IP: [<ffffffff8124ef7f>] locks_get_lock_context+0xf/0xa0
    PGD 0
    Oops: 0000 [#1] SMP
    Modules linked in: nfsv3 nfs_layout_flexfiles rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache vmw_vsock_vmci_transport vsock cfg80211 rfkill coretemp crct10dif_pclmul ppdev vmw_balloon crc32_pclmul crc32c_intel ghash_clmulni_intel pcspkr vmxnet3 parport_pc i2c_piix4 microcode serio_raw parport nfsd floppy vmw_vmci acpi_cpufreq auth_rpcgss shpchp nfs_acl lockd grace sunrpc vmwgfx drm_kms_helper ttm drm mptspi scsi_transport_spi mptscsih ata_generic mptbase i2c_core pata_acpi
    CPU: 0 PID: 329 Comm: kworker/0:1H Not tainted 4.1.0-rc7+ #2
    Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/30/2013
    Workqueue: rpciod rpc_async_schedule [sunrpc]
    30ec000
    RIP: 0010:[<ffffffff8124ef7f>]  [<ffffffff8124ef7f>] locks_get_lock_context+0xf/0xa0
    RSP: 0018:ffff8802330efc08  EFLAGS: 00010296
    RAX: ffff8802330efc58 RBX: ffff880097187c80 RCX: 0000000000000000
    RDX: 0000000000000000 RSI: 0000000000000002 RDI: 0000000000000000
    RBP: ffff8802330efc18 R08: ffff88023fc173d8 R09: 3038b7bf00000000
    R10: 00002f1a02000000 R11: 3038b7bf00000000 R12: 0000000000000000
    R13: 0000000000000000 R14: ffff8802337a2300 R15: 0000000000000020
    FS:  0000000000000000(0000) GS:ffff88023fc00000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000000148 CR3: 000000003680f000 CR4: 00000000001407f0
    Stack:
     ffff880097187c80 ffff880097187cd8 ffff8802330efc98 ffffffff81250281
     ffff8802330efc68 ffffffffa013e7df ffff8802330efc98 0000000000000246
     ffff8801f6901c00 ffff880233d2b8d8 ffff8802330efc58 ffff8802330efc58
    Call Trace:
     [<ffffffff81250281>] __posix_lock_file+0x31/0x5e0
     [<ffffffffa013e7df>] ? rpc_wake_up_task_queue_locked.part.35+0xcf/0x240 [sunrpc]
     [<ffffffff8125088b>] posix_lock_file_wait+0x3b/0xd0
     [<ffffffffa03890b2>] ? nfs41_wake_and_assign_slot+0x32/0x40 [nfsv4]
     [<ffffffffa0365808>] ? nfs41_sequence_done+0xd8/0x300 [nfsv4]
     [<ffffffffa0367525>] do_vfs_lock+0x35/0x40 [nfsv4]
     [<ffffffffa03690c1>] nfs4_locku_done+0x81/0x120 [nfsv4]
     [<ffffffffa013e310>] ? rpc_destroy_wait_queue+0x20/0x20 [sunrpc]
     [<ffffffffa013e310>] ? rpc_destroy_wait_queue+0x20/0x20 [sunrpc]
     [<ffffffffa013e33c>] rpc_exit_task+0x2c/0x90 [sunrpc]
     [<ffffffffa0134400>] ? call_refreshresult+0x170/0x170 [sunrpc]
     [<ffffffffa013ece4>] __rpc_execute+0x84/0x410 [sunrpc]
     [<ffffffffa013f085>] rpc_async_schedule+0x15/0x20 [sunrpc]
     [<ffffffff810add67>] process_one_work+0x147/0x400
     [<ffffffff810ae42b>] worker_thread+0x11b/0x460
     [<ffffffff810ae310>] ? rescuer_thread+0x2f0/0x2f0
     [<ffffffff810b35d9>] kthread+0xc9/0xe0
     [<ffffffff81010000>] ? perf_trace_xen_mmu_set_pmd+0xa0/0x160
     [<ffffffff810b3510>] ? kthread_create_on_node+0x170/0x170
     [<ffffffff8173c222>] ret_from_fork+0x42/0x70
     [<ffffffff810b3510>] ? kthread_create_on_node+0x170/0x170
    Code: a5 81 e8 85 75 e4 ff c6 05 31 ee aa 00 01 eb 98 66 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 54 49 89 fc 53 <48> 8b 9f 48 01 00 00 48 85 db 74 08 48 89 d8 5b 41 5c 5d c3 83
    RIP  [<ffffffff8124ef7f>] locks_get_lock_context+0xf/0xa0
     RSP <ffff8802330efc08>
    CR2: 0000000000000148
    ---[ end trace 64484f16250de7ef ]---

The problem is almost exactly the same as the one fixed by feaff8e5b2cf.
We must take a reference to the struct file when running the LOCKU
compound to prevent the final fput from running until the operation is
complete.

Reported-by: Jean Spector <jean@primarydata.com>
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Cc: stable@vger.kernel.org
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agonfs: take extra reference to fl->fl_file when running a setlk
Jeff Layton [Tue, 12 May 2015 19:48:10 +0000 (15:48 -0400)]
nfs: take extra reference to fl->fl_file when running a setlk

[ Upstream commit feaff8e5b2cfc3eae02cf65db7a400b0b9ffc596 ]

We had a report of a crash while stress testing the NFS client:

    BUG: unable to handle kernel NULL pointer dereference at 0000000000000150
    IP: [<ffffffff8127b698>] locks_get_lock_context+0x8/0x90
    PGD 0
    Oops: 0000 [#1] SMP
    Modules linked in: rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ebtable_nat ebtable_filter ebtable_broute bridge stp llc ebtables ip6table_security ip6table_mangle ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_raw ip6table_filter ip6_tables iptable_security iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_raw coretemp crct10dif_pclmul ppdev crc32_pclmul crc32c_intel ghash_clmulni_intel vmw_balloon serio_raw vmw_vmci i2c_piix4 shpchp parport_pc acpi_cpufreq parport nfsd auth_rpcgss nfs_acl lockd grace sunrpc vmwgfx drm_kms_helper ttm drm mptspi scsi_transport_spi mptscsih mptbase e1000 ata_generic pata_acpi
    CPU: 1 PID: 399 Comm: kworker/1:1H Not tainted 4.1.0-0.rc1.git0.1.fc23.x86_64 #1
    Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/30/2013
    Workqueue: rpciod rpc_async_schedule [sunrpc]
    task: ffff880036aea7c0 ti: ffff8800791f4000 task.ti: ffff8800791f4000
    RIP: 0010:[<ffffffff8127b698>]  [<ffffffff8127b698>] locks_get_lock_context+0x8/0x90
    RSP: 0018:ffff8800791f7c00  EFLAGS: 00010293
    RAX: ffff8800791f7c40 RBX: ffff88001f2ad8c0 RCX: ffffe8ffffc80305
    RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
    RBP: ffff8800791f7c88 R08: ffff88007fc971d8 R09: 279656d600000000
    R10: 0000034a01000000 R11: 279656d600000000 R12: ffff88001f2ad918
    R13: ffff88001f2ad8c0 R14: 0000000000000000 R15: 0000000100e73040
    FS:  0000000000000000(0000) GS:ffff88007fc80000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000000150 CR3: 0000000001c0b000 CR4: 00000000000407e0
    Stack:
     ffffffff8127c5b0 ffff8800791f7c18 ffffffffa0171e29 ffff8800791f7c58
     ffffffffa0171ef8 ffff8800791f7c78 0000000000000246 ffff88001ea0ba00
     ffff8800791f7c40 ffff8800791f7c40 00000000ff5d86a3 ffff8800791f7ca8
    Call Trace:
     [<ffffffff8127c5b0>] ? __posix_lock_file+0x40/0x760
     [<ffffffffa0171e29>] ? rpc_make_runnable+0x99/0xa0 [sunrpc]
     [<ffffffffa0171ef8>] ? rpc_wake_up_task_queue_locked.part.35+0xc8/0x250 [sunrpc]
     [<ffffffff8127cd3a>] posix_lock_file_wait+0x4a/0x120
     [<ffffffffa03e4f12>] ? nfs41_wake_and_assign_slot+0x32/0x40 [nfsv4]
     [<ffffffffa03bf108>] ? nfs41_sequence_done+0xd8/0x2d0 [nfsv4]
     [<ffffffffa03c116d>] do_vfs_lock+0x2d/0x30 [nfsv4]
     [<ffffffffa03c251d>] nfs4_lock_done+0x1ad/0x210 [nfsv4]
     [<ffffffffa0171a30>] ? __rpc_sleep_on_priority+0x390/0x390 [sunrpc]
     [<ffffffffa0171a30>] ? __rpc_sleep_on_priority+0x390/0x390 [sunrpc]
     [<ffffffffa0171a5c>] rpc_exit_task+0x2c/0xa0 [sunrpc]
     [<ffffffffa0167450>] ? call_refreshresult+0x150/0x150 [sunrpc]
     [<ffffffffa0172640>] __rpc_execute+0x90/0x460 [sunrpc]
     [<ffffffffa0172a25>] rpc_async_schedule+0x15/0x20 [sunrpc]
     [<ffffffff810baa1b>] process_one_work+0x1bb/0x410
     [<ffffffff810bacc3>] worker_thread+0x53/0x480
     [<ffffffff810bac70>] ? process_one_work+0x410/0x410
     [<ffffffff810bac70>] ? process_one_work+0x410/0x410
     [<ffffffff810c0b38>] kthread+0xd8/0xf0
     [<ffffffff810c0a60>] ? kthread_worker_fn+0x180/0x180
     [<ffffffff817a1aa2>] ret_from_fork+0x42/0x70
     [<ffffffff810c0a60>] ? kthread_worker_fn+0x180/0x180

Jean says:

"Running locktests with a large number of iterations resulted in a
 client crash.  The test run took a while and hasn't finished after close
 to 2 hours. The crash happened right after I gave up and killed the test
 (after 107m) with Ctrl+C."

The crash happened because a NULL inode pointer got passed into
locks_get_lock_context. The call chain indicates that file_inode(filp)
returned NULL, which means that f_inode was NULL. Since that's zeroed
out in __fput, that suggests that this filp pointer outlived the last
reference.

Looking at the code, that seems possible. We copy the struct file_lock
that's passed in, but if the task is signalled at an inopportune time we
can end up trying to use that file_lock in rpciod context after the process
that requested it has already returned (and possibly put its filp
reference).

Fix this by taking an extra reference to the filp when we allocate the
lock info, and put it in nfs4_lock_release.

Reported-by: Jean Spector <jean@primarydata.com>
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agofuse: initialize fc->release before calling it
Miklos Szeredi [Wed, 1 Jul 2015 14:25:55 +0000 (16:25 +0200)]
fuse: initialize fc->release before calling it

[ Upstream commit 0ad0b3255a08020eaf50e34ef0d6df5bdf5e09ed ]

fc->release is called from fuse_conn_put() which was used in the error
cleanup before fc->release was initialized.

[Jeremiah Mahler <jmmahler@gmail.com>: assign fc->release after calling
fuse_conn_init(fc) instead of before.]

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Fixes: a325f9b92273 ("fuse: update fuse_conn_init() and separate out fuse_conn_kill()")
Cc: <stable@vger.kernel.org> #v2.6.31+
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agorbd: use GFP_NOIO in rbd_obj_request_create()
Ilya Dryomov [Wed, 24 Jun 2015 14:24:33 +0000 (17:24 +0300)]
rbd: use GFP_NOIO in rbd_obj_request_create()

[ Upstream commit 5a60e87603c4c533492c515b7f62578189b03c9c ]

rbd_obj_request_create() is called on the main I/O path, so we need to
use GFP_NOIO to make sure allocation doesn't blow back on us.  Not all
callers need this, but I'm still hardcoding the flag inside rather than
making it a parameter because a) this is going to stable, and b) those
callers shouldn't really use rbd_obj_request_create() and will be fixed
in the future.

More memory allocation fixes will follow.

Cc: stable@vger.kernel.org # 3.10+
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Alex Elder <elder@linaro.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agonet: mvneta: disable IP checksum with jumbo frames for Armada 370
Simon Guinot [Tue, 30 Jun 2015 14:20:22 +0000 (16:20 +0200)]
net: mvneta: disable IP checksum with jumbo frames for Armada 370

[ Upstream commit b65657fc240ae6c1d2a1e62db9a0e61ac9631d7a ]

The Ethernet controller found in the Armada 370, 380 and 385 SoCs don't
support TCP/IP checksumming with frame sizes larger than 1600 bytes.

This patch fixes the issue by disabling the features NETIF_F_IP_CSUM and
NETIF_F_TSO for the Armada 370 and compatibles SoCs when the MTU is set
to a value greater than 1600 bytes.

Signed-off-by: Simon Guinot <simon.guinot@sequanux.org>
Fixes: c5aff18204da ("net: mvneta: driver for Marvell Armada 370/XP network unit")
Cc: <stable@vger.kernel.org> # v3.8+
Acked-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agoARM: mvebu: update Ethernet compatible string for Armada XP
Simon Guinot [Tue, 30 Jun 2015 14:20:21 +0000 (16:20 +0200)]
ARM: mvebu: update Ethernet compatible string for Armada XP

[ Upstream commit ea3b55fe83b5fcede82d183164b9d6831b26e33b ]

This patch updates the Ethernet DT nodes for Armada XP SoCs with the
compatible string "marvell,armada-xp-neta".

Signed-off-by: Simon Guinot <simon.guinot@sequanux.org>
Fixes: 77916519cba3 ("arm: mvebu: Armada XP MV78230 has only three Ethernet interfaces")
Cc: <stable@vger.kernel.org> # v3.8+
Acked-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Reviewed-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agonet: mvneta: introduce compatible string "marvell, armada-xp-neta"
Simon Guinot [Tue, 30 Jun 2015 14:20:20 +0000 (16:20 +0200)]
net: mvneta: introduce compatible string "marvell, armada-xp-neta"

[ Upstream commit f522a975a8101895a85354b9c143f41b8248e71a ]

The mvneta driver supports the Ethernet IP found in the Armada 370, XP,
380 and 385 SoCs. Since at least one more hardware feature is available
for the Armada XP SoCs then a way to identify them is needed.

This patch introduces a new compatible string "marvell,armada-xp-neta".

Signed-off-by: Simon Guinot <simon.guinot@sequanux.org>
Fixes: c5aff18204da ("net: mvneta: driver for Marvell Armada 370/XP network unit")
Cc: <stable@vger.kernel.org> # v3.8+
Acked-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Acked-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agodrm/radeon: SDMA fix hibernation (CI GPU family).
Jérôme Glisse [Fri, 19 Jun 2015 14:32:16 +0000 (10:32 -0400)]
drm/radeon: SDMA fix hibernation (CI GPU family).

[ Upstream commit 2ba8d1bb8f6b589037f7db1f01144fc80750e8f7 ]

In order for hibernation to reliably work we need to properly turn
off the SDMA block, sadly after numerous attemps i haven't not found
proper sequence for clean and full shutdown. So simply reset both
SDMA block, this makes hibernation works reliably on sea island GPU
family (CI)

Hibernation and suspend to ram were tested (several times) on :
Bonaire
Hawaii
Mullins
Kaveri
Kabini

Cc: stable@vger.kernel.org
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agodrm/radeon: compute ring fix hibernation (CI GPU family) v2.
Jérôme Glisse [Fri, 19 Jun 2015 14:32:15 +0000 (10:32 -0400)]
drm/radeon: compute ring fix hibernation (CI GPU family) v2.

[ Upstream commit 161569deaa03cf3c00ed63352006193f250b0648 ]

In order for hibernation to reliably work we need to cleanup more
thoroughly the compute ring. Hibernation is different from suspend
resume as when we resume from hibernation the hardware is first
fully initialize by regular kernel then freeze callback happens
(which correspond to a suspend inside the radeon kernel driver)
and turn off each of the block. It turns out we were not cleanly
shutting down the compute ring. This patch fix that.

Hibernation and suspend to ram were tested (several times) on :
Bonaire
Hawaii
Mullins
Kaveri
Kabini

Changed since v1:
  - Factor the ring stop logic into a function taking ring as arg.

Cc: stable@vger.kernel.org
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agoagp/intel: Fix typo in needs_ilk_vtd_wa()
Chris Wilson [Sun, 28 Jun 2015 13:18:16 +0000 (14:18 +0100)]
agp/intel: Fix typo in needs_ilk_vtd_wa()

[ Upstream commit 8b572a4200828b4e75cc22ed2f494b58d5372d65 ]

In needs_ilk_vtd_wa(), we pass in the GPU device but compared it against
the ids for the mobile GPU and the mobile host bridge. That latter is
impossible and so likely was just a typo for the desktop GPU device id
(which is also buggy).

Fixes commit da88a5f7f7d434e2cde1b3e19d952e6d84533662
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Wed Feb 13 09:31:53 2013 +0000

    drm/i915: Disable WC PTE updates to w/a buggy IOMMU on ILK

Reported-by: Ting-Wei Lan <lantw44@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91127
References: https://bugzilla.freedesktop.org/show_bug.cgi?id=60391
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: stable@vger.kernel.org
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agoALSA: hda - Fix the dock headphone output on Fujitsu Lifebook E780
Takashi Iwai [Mon, 29 Jun 2015 06:38:02 +0000 (08:38 +0200)]
ALSA: hda - Fix the dock headphone output on Fujitsu Lifebook E780

[ Upstream commit 4df3fd1700abbb53bd874143dfd1f9ac9e7cbf4b ]

Fujitsu Lifebook E780 sets the sequence number 0x0f to only only of
the two headphones, thus the driver tries to assign another as the
line-out, and this results in the inconsistent mapping between the
created jack ctl and the actual I/O.  Due to this, PulseAudio doesn't
handle it properly and gets the silent output.

The fix is to ignore the non-HP sequencer checks.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=99681
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agoALSA: hda - Add headset support to Acer Aspire V5
Takashi Iwai [Sat, 27 Jun 2015 08:21:13 +0000 (10:21 +0200)]
ALSA: hda - Add headset support to Acer Aspire V5

[ Upstream commit 7819717b11346b8a5420b223b46600e394049c66 ]

Acer Aspire V5 with ALC282 codec needs the similar quirk like Dell
laptops to support the headset mic.  The headset mic pin is 0x19 and
it's not exposed by BIOS, thus we need to fix the pincfg as well.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=96201
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agoclocksource: exynos_mct: Avoid blocking calls in the cpu hotplug notifier
Damian Eppel [Fri, 26 Jun 2015 13:23:04 +0000 (15:23 +0200)]
clocksource: exynos_mct: Avoid blocking calls in the cpu hotplug notifier

[ Upstream commit 56a94f13919c0db5958611b388e1581b4852f3c9 ]

Whilst testing cpu hotplug events on kernel configured with
DEBUG_PREEMPT and DEBUG_ATOMIC_SLEEP we get following BUG message,
caused by calling request_irq() and free_irq() in the context of
hotplug notification (which is in this case atomic context).

[   40.785859] CPU1: Software reset
[   40.786660] BUG: sleeping function called from invalid context at mm/slub.c:1241
[   40.786668] in_atomic(): 1, irqs_disabled(): 128, pid: 0, name: swapper/1
[   40.786678] Preemption disabled at:[<  (null)>]   (null)
[   40.786681]
[   40.786692] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.19.0-rc4-00024-g7dca860 #36
[   40.786698] Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
[   40.786728] [<c0014a00>] (unwind_backtrace) from [<c0011980>] (show_stack+0x10/0x14)
[   40.786747] [<c0011980>] (show_stack) from [<c0449ba0>] (dump_stack+0x70/0xbc)
[   40.786767] [<c0449ba0>] (dump_stack) from [<c00c6124>] (kmem_cache_alloc+0xd8/0x170)
[   40.786785] [<c00c6124>] (kmem_cache_alloc) from [<c005d6f8>] (request_threaded_irq+0x64/0x128)
[   40.786804] [<c005d6f8>] (request_threaded_irq) from [<c0350b8c>] (exynos4_local_timer_setup+0xc0/0x13c)
[   40.786820] [<c0350b8c>] (exynos4_local_timer_setup) from [<c0350ca8>] (exynos4_mct_cpu_notify+0x30/0xa8)
[   40.786838] [<c0350ca8>] (exynos4_mct_cpu_notify) from [<c003b330>] (notifier_call_chain+0x44/0x84)
[   40.786857] [<c003b330>] (notifier_call_chain) from [<c0022fd4>] (__cpu_notify+0x28/0x44)
[   40.786873] [<c0022fd4>] (__cpu_notify) from [<c0013714>] (secondary_start_kernel+0xec/0x150)
[   40.786886] [<c0013714>] (secondary_start_kernel) from [<40008764>] (0x40008764)

Interrupts cannot be requested/freed in the CPU_STARTING/CPU_DYING
notifications which run on the hotplugged cpu with interrupts and
preemption disabled.

To avoid the issue, request the interrupts for all possible cpus in
the boot code. The interrupts are marked NO_AUTOENABLE to avoid a racy
request_irq/disable_irq() sequence. The flag prevents the
request_irq() code from enabling the interrupt immediately.

The interrupt is then enabled in the CPU_STARTING notifier of the
hotplugged cpu and again disabled with disable_irq_nosync() in the
CPU_DYING notifier.

[ tglx: Massaged changelog to match the patch ]

Fixes: 7114cd749a12 ("clocksource: exynos_mct: use (request/free)_irq calls for local timer registration")
Reported-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Reviewed-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Tested-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Tested-by: Marcin Jabrzyk <m.jabrzyk@samsung.com>
Signed-off-by: Damian Eppel <d.eppel@samsung.com>
Cc: m.szyprowski@samsung.com
Cc: kyungmin.park@samsung.com
Cc: daniel.lezcano@linaro.org
Cc: kgene@kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/1435324984-7328-1-git-send-email-d.eppel@samsung.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
10 years agotracing/filter: Do not allow infix to exceed end of string
Steven Rostedt (Red Hat) [Thu, 25 Jun 2015 22:10:09 +0000 (18:10 -0400)]
tracing/filter: Do not allow infix to exceed end of string

[ Upstream commit 6b88f44e161b9ee2a803e5b2b1fbcf4e20e8b980 ]

While debugging a WARN_ON() for filtering, I found that it is possible
for the filter string to be referenced after its end. With the filter:

 # echo '>' > /sys/kernel/debug/events/ext4/ext4_truncate_exit/filter

The filter_parse() function can call infix_get_op() which calls
infix_advance() that updates the infix filter pointers for the cnt
and tail without checking if the filter is already at the end, which
will put the cnt to zero and the tail beyond the end. The loop then calls
infix_next() that has

ps->infix.cnt--;
return ps->infix.string[ps->infix.tail++];

The cnt will now be below zero, and the tail that is returned is
already passed the end of the filter string. So far the allocation
of the filter string usually has some buffer that is zeroed out, but
if the filter string is of the exact size of the allocated buffer
there's no guarantee that the charater after the nul terminating
character will be zero.

Luckily, only root can write to the filter.

Cc: stable@vger.kernel.org # 2.6.33+
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>