unionfs-3.12.y.git
11 years agodrm/radeon/px: fix module unload
Alex Deucher [Fri, 12 Sep 2014 22:00:53 +0000 (18:00 -0400)]
drm/radeon/px: fix module unload

commit 2e97140dd58cab8772bf77d73eabda213e45202d upstream.

Use the new vga_switcheroo_fini_domain_pm_ops function
to unregister the pm ops.

Based on a patch from:
Pali Rohár <pali.rohar@gmail.com>

bug:
https://bugzilla.kernel.org/show_bug.cgi?id=84431

Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Pali Rohár <pali.rohar@gmail.com>
Cc: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agodrm/nouveau/runpm: fix module unload
Alex Deucher [Fri, 12 Sep 2014 22:06:56 +0000 (18:06 -0400)]
drm/nouveau/runpm: fix module unload

commit 53beaa01e0fe8e4202f43485a03b32fcf5dfea74 upstream.

Use the new vga_switcheroo_fini_domain_pm_ops function
to unregister the pm ops.

Based on a patch from:
Pali Rohár <pali.rohar@gmail.com>

bug:
https://bugzilla.kernel.org/show_bug.cgi?id=84431

Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agovgaswitcheroo: add vga_switcheroo_fini_domain_pm_ops
Alex Deucher [Fri, 12 Sep 2014 21:51:29 +0000 (17:51 -0400)]
vgaswitcheroo: add vga_switcheroo_fini_domain_pm_ops

commit 766a53d059d1500c9755c8af017bd411bd8f1b20 upstream.

Drivers should call this on unload to unregister pmops.

Bug:
https://bugzilla.kernel.org/show_bug.cgi?id=84431

Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Pali Rohár <pali.rohar@gmail.com>
Cc: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoRevert "PCI: Don't scan random busses in pci_scan_bridge()"
Bjorn Helgaas [Fri, 19 Sep 2014 16:56:06 +0000 (10:56 -0600)]
Revert "PCI: Don't scan random busses in pci_scan_bridge()"

commit 7a0b33d4a45d30b9a838fba4efcd80b7b57d4d16 upstream.

This reverts commit fc1b253141b3 ("PCI: Don't scan random busses in
pci_scan_bridge()") because it breaks CardBus on some machines.

David tested a Dell Latitude D505 that worked like this prior to
fc1b253141b3:

    pci 0000:00:1e.0: PCI bridge to [bus 01]
    pci 0000:01:01.0: CardBus bridge to [bus 02-05]

Note that the 01:01.0 CardBus bridge has a bus number aperture of
[bus 02-05], but those buses are all outside the 00:1e.0 PCI bridge bus
number aperture, so accesses to buses 02-05 never reach CardBus.  This is
later patched up by yenta_fixup_parent_bridge(), which changes the
subordinate bus number of the 00:1e.0 PCI bridge:

    pci_bus 0000:01: Raising subordinate bus# of parent bus (#01) from #01 to #05

With fc1b253141b3, pci_scan_bridge() fails immediately when it notices that
we can't allocate a valid secondary bus number for the CardBus bridge, and
CardBus doesn't work at all:

    pci 0000:01:01.0: can't allocate child bus 01 from [bus 01]

I'd prefer to fix this by integrating the yenta_fixup_parent_bridge() logic
into pci_scan_bridge() so we fix the bus number apertures up front.  But
I don't think we can do that before v3.17, so I'm going to revert this to
avoid the problem while we're working on the long-term fix.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=83441
Link: http://lkml.kernel.org/r/1409303414-5196-1-git-send-email-david.henningsson@canonical.com
Reported-by: David Henningsson <david.henningsson@canonical.com>
Tested-by: David Henningsson <david.henningsson@canonical.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoPCI: Add pci_ignore_hotplug() to ignore hotplug events for a device
Bjorn Helgaas [Wed, 10 Sep 2014 19:45:01 +0000 (13:45 -0600)]
PCI: Add pci_ignore_hotplug() to ignore hotplug events for a device

commit b440bde74f043c8ec31081cb59c9a53ade954701 upstream.

Powering off a hot-pluggable device, e.g., with pci_set_power_state(D3cold),
normally generates a hot-remove event that unbinds the driver.

Some drivers expect to remain bound to a device even while they power it
off and back on again.  This can be dangerous, because if the device is
removed or replaced while it is powered off, the driver doesn't know that
anything changed.  But some drivers accept that risk.

Add pci_ignore_hotplug() for use by drivers that know their device cannot
be removed.  Using pci_ignore_hotplug() tells the PCI core that hot-plug
events for the device should be ignored.

The radeon and nouveau drivers use this to switch between a low-power,
integrated GPU and a higher-power, higher-performance discrete GPU.  They
power off the unused GPU, but they want to remain bound to it.

This is a reimplementation of f244d8b623da ("ACPIPHP / radeon / nouveau:
Fix VGA switcheroo problem related to hotplug") but extends it to work with
both acpiphp and pciehp.

This fixes a problem where systems with dual GPUs using the radeon drivers
become unusable, freezing every few seconds (see bugzillas below).  The
resume of the radeon device may also fail, e.g.,

This fixes problems on dual GPU systems where the radeon driver becomes
unusable because of problems while suspending the device, as in bug 79701:

    [drm] radeon: finishing device.
    radeon 0000:01:00.0: Userspace still has active objects !
    radeon 0000:01:00.0: ffff8800cb4ec288 ffff8800cb4ec000 16384 4294967297 force free
    ...
    WARNING: CPU: 0 PID: 67 at /home/apw/COD/linux/drivers/gpu/drm/radeon/radeon_gart.c:234 radeon_gart_unbind+0xd2/0xe0 [radeon]()
    trying to unbind memory from uninitialized GART !

or while resuming it, as in bug 77261:

    radeon 0000:01:00.0: ring 0 stalled for more than 10158msec
    radeon 0000:01:00.0: GPU lockup ...
    radeon 0000:01:00.0: GPU pci config reset
    pciehp 0000:00:01.0:pcie04: Card not present on Slot(1-1)
    radeon 0000:01:00.0: GPU reset succeeded, trying to resume
    *ERROR* radeon: dpm resume failed
    radeon 0000:01:00.0: Wait for MC idle timedout !

Link: https://bugzilla.kernel.org/show_bug.cgi?id=77261
Link: https://bugzilla.kernel.org/show_bug.cgi?id=79701
Reported-by: Shawn Starr <shawn.starr@rogers.com>
Reported-by: Jose P. <lbdkmjdf@sharklasers.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Rajat Jain <rajatxjain@gmail.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoarm: armv7: perf: fix armv7 ref-cycles error
Zhiqiang Zhang [Fri, 26 Sep 2014 07:44:22 +0000 (15:44 +0800)]
arm: armv7: perf: fix armv7 ref-cycles error

ref-cycles event is specially to Intel core, but can still used in arm
architecture with the wrong return value with 3.10 stable. this patch fix the
bug and make it return NOT SUPPORTED distinctly.

In upstream this bug has been fixed by other way, which changes more than one
file and more than 1000 lines. the primary commit is
6b7658ec8a100b608e59e3cde353434db51f5be0.  besides we can not simply
cherry-pick.

Signed-off-by: Zhiqiang Zhang <zhangzhiqiang.zhang@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com
Cc: Will Deacon <will.deacon@arm.com>
Cc: Christopher Covington <cov@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoperf: Fix a race condition in perf_remove_from_context()
Cong Wang [Tue, 2 Sep 2014 22:27:20 +0000 (15:27 -0700)]
perf: Fix a race condition in perf_remove_from_context()

commit 3577af70a2ce4853d58e57d832e687d739281479 upstream.

We saw a kernel soft lockup in perf_remove_from_context(),
it looks like the `perf` process, when exiting, could not go
out of the retry loop. Meanwhile, the target process was forking
a child. So either the target process should execute the smp
function call to deactive the event (if it was running) or it should
do a context switch which deactives the event.

It seems we optimize out a context switch in perf_event_context_sched_out(),
and what's more important, we still test an obsolete task pointer when
retrying, so no one actually would deactive that event in this situation.
Fix it directly by reloading the task pointer in perf_remove_from_context().

This should cure the above soft lockup.

Signed-off-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1409696840-843-1-git-send-email-xiyou.wangcong@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoFix unbalanced mutex in dma_pool_create().
Krzysztof Hałasa [Thu, 18 Sep 2014 13:12:02 +0000 (15:12 +0200)]
Fix unbalanced mutex in dma_pool_create().

commit 153a9f131f50420b7ce008c94f1c6374cbc460d7 upstream.

dma_pool_create() needs to unlock the mutex in error case.  The bug was
introduced in the 3.16 by commit cc6b664aa26d ("mm/dmapool.c: remove
redundant NULL check for dev in dma_pool_create()")/

Signed-off-by: Krzysztof Hałasa <khc@piap.pl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agospi: sirf: enable RX_IO_DMA_INT interrupt
Qipan Li [Tue, 2 Sep 2014 09:01:03 +0000 (17:01 +0800)]
spi: sirf: enable RX_IO_DMA_INT interrupt

commit f2a08b404691ef3e7be6ce81c185335cfc68b6db upstream.

in spi interrupt handler, we need check RX_IO_DMA status to ensure
rx fifo have received the specify count data.

if not set, the while statement in spi isr function will keep loop,
at last, make the kernel hang.

[The code is actually there in the interrupt handler but apparently it
needs the interrupt unmasking so the handler sees the status -- broonie]

Signed-off-by: Qipan Li <Qipan.Li@csr.com>
Signed-off-by: Barry Song <Baohua.Song@csr.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agospi: dw: Don't use devm_kzalloc in master->setup callback
Axel Lin [Sun, 31 Aug 2014 04:47:06 +0000 (12:47 +0800)]
spi: dw: Don't use devm_kzalloc in master->setup callback

commit a97c883a16da7e0691a3be5465926c92a8da4da6 upstream.

device_add() expects that any memory allocated via devm_* API is only
done in the device's probe function.

Fix below boot warning:
WARNING: CPU: 1 PID: 1 at drivers/base/dd.c:286 driver_probe_device+0x2b4/0x2f4()
Modules linked in:
CPU: 1 PID: 1 Comm: swapper/0 Not tainted 3.16.0-10474-g835c90b-dirty #160
[<c0016364>] (unwind_backtrace) from [<c001251c>] (show_stack+0x20/0x24)
[<c001251c>] (show_stack) from [<c04eaefc>] (dump_stack+0x7c/0x98)
[<c04eaefc>] (dump_stack) from [<c0023d4c>] (warn_slowpath_common+0x78/0x9c)
[<c0023d4c>] (warn_slowpath_common) from [<c0023d9c>] (warn_slowpath_null+0x2c/0x34)
[<c0023d9c>] (warn_slowpath_null) from [<c0302c60>] (driver_probe_device+0x2b4/0x2f4)
[<c0302c60>] (driver_probe_device) from [<c0302d90>] (__device_attach+0x50/0x54)
[<c0302d90>] (__device_attach) from [<c0300e60>] (bus_for_each_drv+0x54/0x9c)
[<c0300e60>] (bus_for_each_drv) from [<c0302958>] (device_attach+0x84/0x90)
[<c0302958>] (device_attach) from [<c0301f10>] (bus_probe_device+0x94/0xb8)
[<c0301f10>] (bus_probe_device) from [<c03000c0>] (device_add+0x434/0x4fc)
[<c03000c0>] (device_add) from [<c0342dd4>] (spi_add_device+0x98/0x164)
[<c0342dd4>] (spi_add_device) from [<c03444a4>] (spi_register_master+0x598/0x768)
[<c03444a4>] (spi_register_master) from [<c03446b4>] (devm_spi_register_master+0x40/0x80)
[<c03446b4>] (devm_spi_register_master) from [<c0346214>] (dw_spi_add_host+0x1a8/0x258)
[<c0346214>] (dw_spi_add_host) from [<c0346920>] (dw_spi_mmio_probe+0x1d4/0x294)
[<c0346920>] (dw_spi_mmio_probe) from [<c0304560>] (platform_drv_probe+0x3c/0x6c)
[<c0304560>] (platform_drv_probe) from [<c0302a98>] (driver_probe_device+0xec/0x2f4)
[<c0302a98>] (driver_probe_device) from [<c0302d3c>] (__driver_attach+0x9c/0xa0)
[<c0302d3c>] (__driver_attach) from [<c0300f0c>] (bus_for_each_dev+0x64/0x98)
[<c0300f0c>] (bus_for_each_dev) from [<c0302518>] (driver_attach+0x2c/0x30)
[<c0302518>] (driver_attach) from [<c0302134>] (bus_add_driver+0xdc/0x1f4)
[<c0302134>] (bus_add_driver) from [<c03035c8>] (driver_register+0x88/0x104)
[<c03035c8>] (driver_register) from [<c030445c>] (__platform_driver_register+0x58/0x6c)
[<c030445c>] (__platform_driver_register) from [<c0700f00>] (dw_spi_mmio_driver_init+0x18/0x20)
[<c0700f00>] (dw_spi_mmio_driver_init) from [<c0008914>] (do_one_initcall+0x90/0x1d4)
[<c0008914>] (do_one_initcall) from [<c06d7d90>] (kernel_init_freeable+0x178/0x248)
[<c06d7d90>] (kernel_init_freeable) from [<c04e687c>] (kernel_init+0x18/0xfc)
[<c04e687c>] (kernel_init) from [<c000ecd8>] (ret_from_fork+0x14/0x20)

Reported-by: Thor Thayer <tthayer@opensource.altera.com>
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agospi: fsl: Don't use devm_kzalloc in master->setup callback
Axel Lin [Sun, 31 Aug 2014 04:44:09 +0000 (12:44 +0800)]
spi: fsl: Don't use devm_kzalloc in master->setup callback

commit d9f26748128c73ec6bed2846ca52fb1c2edc3ced upstream.

device_add() expects that any memory allocated via devm_* API is only
done in the device's probe function.

Fix below boot warning:
[    3.092348] WARNING: at drivers/base/dd.c:286
[    3.096637] Modules linked in:
[    3.099697] CPU: 0 PID: 25 Comm: kworker/u2:1 Tainted: G W 3.16.1-s3k-drv-999-svn5771_knld-999 #158
[ 3.109610] Workqueue: deferwq deferred_probe_work_func
[    3.114736] task: c787f020 ti: c790c000 task.ti: c790c000
[    3.120062] NIP: c01df158 LR: c01df144 CTR: 00000000
[    3.124983] REGS: c790db30 TRAP: 0700   Tainted: G        W (3.16.1-s3k-drv-999-svn5771_knld-999)
[    3.134162] MSR: 00029032 <EE,ME,IR,DR,RI>  CR: 22002082 XER: 20000000
[    3.140703]
[    3.140703] GPR00: 00000001 c790dbe0 c787f020 00000044 00000054 00000308 c056da0e 20737069
[    3.140703] GPR08: 33323736 000ebfe0 00000308 000ebfdf 22002082 00000000 c046c5a0 c046c608
[    3.140703] GPR16: c046c614 c046c620 c046c62c c046c638 c046c648 c046c654 c046c68c c046c6c4
[    3.140703] GPR24: 00000000 00000000 00000003 c0401aa0 c0596638 c059662c c054e7a8 c7996800
[    3.170102] NIP [c01df158] driver_probe_device+0xf8/0x334
[    3.175431] LR [c01df144] driver_probe_device+0xe4/0x334
[    3.180633] Call Trace:
[    3.183093] [c790dbe0] [c01df144] driver_probe_device+0xe4/0x334 (unreliable)
[    3.190147] [c790dc10] [c01dd15c] bus_for_each_drv+0x7c/0xc0
[    3.195741] [c790dc40] [c01df5fc] device_attach+0xcc/0xf8
[    3.201076] [c790dc60] [c01dd6d4] bus_probe_device+0xb4/0xc4
[    3.206666] [c790dc80] [c01db9f8] device_add+0x270/0x564
[    3.211923] [c790dcc0] [c0219e84] spi_add_device+0xc0/0x190
[    3.217427] [c790dce0] [c021a79c] spi_register_master+0x720/0x834
[    3.223455] [c790dd40] [c021cb48] of_fsl_spi_probe+0x55c/0x614
[    3.229234] [c790dda0] [c01e0d2c] platform_drv_probe+0x30/0x74
[    3.234987] [c790ddb0] [c01df18c] driver_probe_device+0x12c/0x334
[    3.241008] [c790dde0] [c01dd15c] bus_for_each_drv+0x7c/0xc0
[    3.246602] [c790de10] [c01df5fc] device_attach+0xcc/0xf8
[    3.251937] [c790de30] [c01dd6d4] bus_probe_device+0xb4/0xc4
[    3.257536] [c790de50] [c01de9d8] deferred_probe_work_func+0x98/0xe0
[    3.263816] [c790de70] [c00305b8] process_one_work+0x18c/0x440
[    3.269577] [c790dea0] [c0030a00] worker_thread+0x194/0x67c
[    3.275105] [c790def0] [c0039198] kthread+0xd0/0xe4
[    3.279911] [c790df40] [c000c6d0] ret_from_kernel_thread+0x5c/0x64
[    3.285970] Instruction dump:
[    3.288900] 80de0000 419e01d0 3b7b0038 3c60c046 7f65db78 38635264 48211b99 813f00a0
[    3.296559] 381f00a0 7d290278 3169ffff 7c0b4910 <0f00000093df0044 7fe3fb78 4bfffd4d

Reported-by: leroy christophe <christophe.leroy@c-s.fr>
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Tested-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoIB/core: When marshaling uverbs path, clear unused fields
Matan Barak [Tue, 2 Sep 2014 12:32:34 +0000 (15:32 +0300)]
IB/core: When marshaling uverbs path, clear unused fields

commit a59c5850f09b4c2d6ad2fc47e5e1be8d654529d6 upstream.

When marsheling a user path to the kernel struct ib_sa_path, need
to zero smac, dmac and set the vlan id to the "no vlan" value.

Fixes: dd5f03beb4f7 ("IB/core: Ethernet L2 attributes in verbs/cm structures")
Reported-by: Aleksey Senin <alekseys@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoIB/mlx4: Don't duplicate the default RoCE GID
Moni Shoua [Thu, 21 Aug 2014 11:28:38 +0000 (14:28 +0300)]
IB/mlx4: Don't duplicate the default RoCE GID

commit f5c4834d9328c4ed9fe5dcbec6128d6da16db69a upstream.

When reading the IPv6 addresses from the net-device, make sure to
avoid adding a duplicate entry to the GID table because of equality
between the default GID we generate and the default IPv6 link-local
address of the device.

Fixes: acc4fccf4eff ("IB/mlx4: Make sure GID index 0 is always occupied")
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoIB/mlx4: Avoid null pointer dereference in mlx4_ib_scan_netdevs()
Moni Shoua [Thu, 21 Aug 2014 11:28:37 +0000 (14:28 +0300)]
IB/mlx4: Avoid null pointer dereference in mlx4_ib_scan_netdevs()

commit e381835cf1b8e3b2857277dbc3b77d8c5350f70a upstream.

When Ethernet netdev is not present for a port (e.g. when the link
layer type of the port is InfiniBand) it's possible to dereference a
null pointer when we do netdevice scanning.

To fix that, we move a section of code that needs to run only when
netdev is present to a proper if () statement.

Fixes: ad4885d279b6 ("IB/mlx4: Build the port IBoE GID table properly under bonding")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoIB/qib: Correct reference counting in debugfs qp_stats
Mike Marciniszyn [Fri, 19 Sep 2014 12:32:19 +0000 (08:32 -0400)]
IB/qib: Correct reference counting in debugfs qp_stats

commit 85cbb7c728bf39c45a9789b88c9471c0d7a58b0e upstream.

This particular reference count is not needed with the rcu protection,
and the current code leaks a reference count, causing a hang in
qib_qp_destroy().

Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoGFS2: fix d_splice_alias() misuses
Al Viro [Fri, 12 Sep 2014 19:56:04 +0000 (20:56 +0100)]
GFS2: fix d_splice_alias() misuses

commit cfb2f9d5c921e38b0f12bb26fed10b877664444d upstream.

Callers of d_splice_alias(dentry, inode) don't need iput(), neither
on success nor on failure.  Either the reference to inode is stored
in a previously negative dentry, or it's dropped.  In either case
inode reference the caller used to hold is consumed.

__gfs2_lookup() does iput() in case when d_splice_alias() has failed.
Double iput() if we ever hit that.  And gfs2_create_inode() ends up
not only with double iput(), but with link count dropped to zero - on
an inode it has just found in directory.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoRevert "hwrng: virtio - ensure reads happen after successful probe"
Amit Shah [Sat, 26 Jul 2014 22:05:01 +0000 (07:35 +0930)]
Revert "hwrng: virtio - ensure reads happen after successful probe"

commit eeec626366ffe558fc3d5685bd2b49a962acf57d upstream.

This reverts commit e052dbf554610e2104c5a7518c4d8374bed701bb.

Now that we use the virtio ->scan() function to register with the hwrng
core, we will not get read requests till probe is successfully finished.

So revert the workaround we had in place to refuse read requests while
we were not yet setup completely.

Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agovirtio: rng: delay hwrng_register() till driver is ready
Amit Shah [Sat, 26 Jul 2014 22:04:01 +0000 (07:34 +0930)]
virtio: rng: delay hwrng_register() till driver is ready

commit 5c06273401f2eb7b290cadbae18ee00f8f65e893 upstream.

Instead of calling hwrng_register() in the probe routing, call it in the
scan routine.  This ensures that when hwrng_register() is successful,
and it requests a few random bytes to seed the kernel's pool at init,
we're ready to service that request.

This will also enable us to remove the workaround added previously to
check whether probe was completed, and only then ask for data from the
host.  The revert follows in the next commit.

There's a slight behaviour change here on unsuccessful hwrng_register().
Previously, when hwrng_register() failed, the probe() routine would
fail, and the vqs would be torn down, and driver would be marked not
initialized.  Now, the vqs will remain initialized, driver would be
marked initialized as well, but won't be available in the list of RNGs
available to hwrng core.  To fix the failures, the procedure remains the
same, i.e. unload and re-load the module, and hope things succeed the
next time around.

Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoalarmtimer: Lock k_itimer during timer callback
Richard Larocque [Wed, 10 Sep 2014 01:31:05 +0000 (18:31 -0700)]
alarmtimer: Lock k_itimer during timer callback

commit 474e941bed9262f5fa2394f9a4a67e24499e5926 upstream.

Locks the k_itimer's it_lock member when handling the alarm timer's
expiry callback.

The regular posix timers defined in posix-timers.c have this lock held
during timout processing because their callbacks are routed through
posix_timer_fn().  The alarm timers follow a different path, so they
ought to grab the lock somewhere else.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Sharvil Nanavati <sharvil@google.com>
Signed-off-by: Richard Larocque <rlarocque@google.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoalarmtimer: Do not signal SIGEV_NONE timers
Richard Larocque [Wed, 10 Sep 2014 01:31:04 +0000 (18:31 -0700)]
alarmtimer: Do not signal SIGEV_NONE timers

commit 265b81d23a46c39df0a735a3af4238954b41a4c2 upstream.

Avoids sending a signal to alarm timers created with sigev_notify set to
SIGEV_NONE by checking for that special case in the timeout callback.

The regular posix timers avoid sending signals to SIGEV_NONE timers by
not scheduling any callbacks for them in the first place.  Although it
would be possible to do something similar for alarm timers, it's simpler
to handle this as a special case in the timeout.

Prior to this patch, the alarm timer would ignore the sigev_notify value
and try to deliver signals to the process anyway.  Even worse, the
sanity check for the value of sigev_signo is skipped when SIGEV_NONE was
specified, so the signal number could be bogus.  If sigev_signo was an
unitialized value (as it often would be if SIGEV_NONE is used), then
it's hard to predict which signal will be sent.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Sharvil Nanavati <sharvil@google.com>
Signed-off-by: Richard Larocque <rlarocque@google.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoalarmtimer: Return relative times in timer_gettime
Richard Larocque [Wed, 10 Sep 2014 01:31:03 +0000 (18:31 -0700)]
alarmtimer: Return relative times in timer_gettime

commit e86fea764991e00a03ff1e56409ec9cacdbda4c9 upstream.

Returns the time remaining for an alarm timer, rather than the time at
which it is scheduled to expire.  If the timer has already expired or it
is not currently scheduled, the it_value's members are set to zero.

This new behavior matches that of the other posix-timers and the POSIX
specifications.

This is a change in user-visible behavior, and may break existing
applications.  Hopefully, few users rely on the old incorrect behavior.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Sharvil Nanavati <sharvil@google.com>
Signed-off-by: Richard Larocque <rlarocque@google.com>
[jstultz: minor style tweak]
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoparisc: Only use -mfast-indirect-calls option for 32-bit kernel builds
John David Anglin [Tue, 23 Sep 2014 00:54:50 +0000 (20:54 -0400)]
parisc: Only use -mfast-indirect-calls option for 32-bit kernel builds

commit d26a7730b5874a5fa6779c62f4ad7c5065a94723 upstream.

In spite of what the GCC manual says, the -mfast-indirect-calls has
never been supported in the 64-bit parisc compiler. Indirect calls have
always been done using function descriptors irrespective of the
-mfast-indirect-calls option.

Recently, it was noticed that a function descriptor was always requested
when the -mfast-indirect-calls option was specified. This caused
problems when the option was used in  application code and doesn't make
any sense because the whole point of the option is to avoid using a
function descriptor for indirect calls.

Fixing this broke 64-bit kernel builds.

I will fix GCC but for now we need the attached change. This results in
the same kernel code as before.

Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoparisc: Implement new LWS CAS supporting 64 bit operations.
Guy Martin [Fri, 12 Sep 2014 16:02:34 +0000 (18:02 +0200)]
parisc: Implement new LWS CAS supporting 64 bit operations.

commit 89206491201cbd1571009b36292af781cef74c1b upstream.

The current LWS cas only works correctly for 32bit. The new LWS allows
for CAS operations of variable size.

Signed-off-by: Guy Martin <gmsoft@tuxicoman.be>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agodon't bugger nd->seq on set_root_rcu() from follow_dotdot_rcu()
Al Viro [Sun, 14 Sep 2014 01:55:46 +0000 (21:55 -0400)]
don't bugger nd->seq on set_root_rcu() from follow_dotdot_rcu()

commit 7bd88377d482e1eae3c5329b12e33cfd664fa6a9 upstream.

return the value instead, and have path_init() do the assignment.  Broken by
"vfs: Fix absolute RCU path walk failures due to uninitialized seq number",
which was Cc-stable with 2.6.38+ as destination.  This one should go where
it went.

To avoid dummy value returned in case when root is already set (it would do
no harm, actually, since the only caller that doesn't ignore the return value
is guaranteed to have nd->root *not* set, but it's more obvious that way),
lift the check into callers.  And do the same to set_root(), to keep them
in sync.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agotty/serial: at91: BUG: disable interrupts when !UART_ENABLE_MS()
Richard Genoud [Wed, 3 Sep 2014 16:09:26 +0000 (18:09 +0200)]
tty/serial: at91: BUG: disable interrupts when !UART_ENABLE_MS()

commit 35b675b9100fc38f58cb75b30e422ecd342519a8 upstream.

In set_termios(), interrupts where not disabled if UART_ENABLE_MS() was
false.

Tested on at91sam9g35.

Signed-off-by: Richard Genoud <richard.genoud@gmail.com>
Reviewed-by: Peter Hurley <peter@hurleysoftware.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agopowerpc: Add smp_mb()s to arch_spin_unlock_wait()
Michael Ellerman [Thu, 7 Aug 2014 05:36:18 +0000 (15:36 +1000)]
powerpc: Add smp_mb()s to arch_spin_unlock_wait()

commit 78e05b1421fa41ae8457701140933baa5e7d9479 upstream.

Similar to the previous commit which described why we need to add a
barrier to arch_spin_is_locked(), we have a similar problem with
spin_unlock_wait().

We need a barrier on entry to ensure any spinlock we have previously
taken is visibly locked prior to the load of lock->slock.

It's also not clear if spin_unlock_wait() is intended to have ACQUIRE
semantics. For now be conservative and add a barrier on exit to give it
ACQUIRE semantics.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agopowerpc: Add smp_mb() to arch_spin_is_locked()
Michael Ellerman [Thu, 7 Aug 2014 05:36:17 +0000 (15:36 +1000)]
powerpc: Add smp_mb() to arch_spin_is_locked()

commit 51d7d5205d3389a32859f9939f1093f267409929 upstream.

The kernel defines the function spin_is_locked(), which can be used to
check if a spinlock is currently locked.

Using spin_is_locked() on a lock you don't hold is obviously racy. That
is, even though you may observe that the lock is unlocked, it may become
locked at any time.

There is (at least) one exception to that, which is if two locks are
used as a pair, and the holder of each checks the status of the other
before doing any update.

Assuming *A and *B are two locks, and *COUNTER is a shared non-atomic
value:

The first CPU does:

spin_lock(*A)

if spin_is_locked(*B)
# nothing
else
smp_mb()
LOAD r = *COUNTER
r++
STORE *COUNTER = r

spin_unlock(*A)

And the second CPU does:

spin_lock(*B)

if spin_is_locked(*A)
# nothing
else
smp_mb()
LOAD r = *COUNTER
r++
STORE *COUNTER = r

spin_unlock(*B)

Although this is a strange locking construct, it should work.

It seems to be understood, but not documented, that spin_is_locked() is
not a memory barrier, so in the examples above and below the caller
inserts its own memory barrier before acting on the result of
spin_is_locked().

For now we assume spin_is_locked() is implemented as below, and we break
it out in our examples:

bool spin_is_locked(*LOCK) {
LOAD l = *LOCK
return l.locked
}

Our intuition is that there should be no problem even if the two code
sequences run simultaneously such as:

CPU 0 CPU 1
==================================================
spin_lock(*A) spin_lock(*B)
LOAD b = *B LOAD a = *A
if b.locked # true if a.locked # true
# nothing # nothing
spin_unlock(*A) spin_unlock(*B)

If one CPU gets the lock before the other then it will do the update and
the other CPU will back off:

CPU 0 CPU 1
==================================================
spin_lock(*A)
LOAD b = *B
spin_lock(*B)
if b.locked # false LOAD a = *A
else if a.locked # true
smp_mb() # nothing
LOAD r1 = *COUNTER spin_unlock(*B)
r1++
STORE *COUNTER = r1
spin_unlock(*A)

However in reality spin_lock() itself is not indivisible. On powerpc we
implement it as a load-and-reserve and store-conditional.

Ignoring the retry logic for the lost reservation case, it boils down to:
spin_lock(*LOCK) {
LOAD l = *LOCK
l.locked = true
STORE *LOCK = l
ACQUIRE_BARRIER
}

The ACQUIRE_BARRIER is required to give spin_lock() ACQUIRE semantics as
defined in memory-barriers.txt:

     This acts as a one-way permeable barrier.  It guarantees that all
     memory operations after the ACQUIRE operation will appear to happen
     after the ACQUIRE operation with respect to the other components of
     the system.

On modern powerpc systems we use lwsync for ACQUIRE_BARRIER. lwsync is
also know as "lightweight sync", or "sync 1".

As described in Power ISA v2.07 section B.2.1.1, in this scenario the
lwsync is not the barrier itself. It instead causes the LOAD of *LOCK to
act as the barrier, preventing any loads or stores in the locked region
from occurring prior to the load of *LOCK.

Whether this behaviour is in accordance with the definition of ACQUIRE
semantics in memory-barriers.txt is open to discussion, we may switch to
a different barrier in future.

What this means in practice is that the following can occur:

CPU 0 CPU 1
==================================================
LOAD a = *A  LOAD b = *B
a.locked = true b.locked = true
LOAD b = *B LOAD a = *A
STORE *A = a STORE *B = b
if b.locked # false if a.locked # false
else else
smp_mb() smp_mb()
LOAD r1 = *COUNTER LOAD r2 = *COUNTER
r1++ r2++
STORE *COUNTER = r1
STORE *COUNTER = r2 # Lost update
spin_unlock(*A) spin_unlock(*B)

That is, the load of *B can occur prior to the store that makes *A
visibly locked. And similarly for CPU 1. The result is both CPUs hold
their lock and believe the other lock is unlocked.

The easiest fix for this is to add a full memory barrier to the start of
spin_is_locked(), so adding to our previous definition would give us:

bool spin_is_locked(*LOCK) {
smp_mb()
LOAD l = *LOCK
return l.locked
}

The new barrier orders the store to the lock we are locking vs the load
of the other lock:

CPU 0 CPU 1
==================================================
LOAD a = *A  LOAD b = *B
a.locked = true b.locked = true
STORE *A = a STORE *B = b
smp_mb() smp_mb()
LOAD b = *B LOAD a = *A
if b.locked # true if a.locked # true
# nothing # nothing
spin_unlock(*A) spin_unlock(*B)

Although the above example is theoretical, there is code similar to this
example in sem_lock() in ipc/sem.c. This commit in addition to the next
commit appears to be a fix for crashes we are seeing in that code where
we believe this race happens in practice.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agopowerpc/perf: Fix ABIv2 kernel backtraces
Anton Blanchard [Tue, 26 Aug 2014 02:44:15 +0000 (12:44 +1000)]
powerpc/perf: Fix ABIv2 kernel backtraces

commit 85101af13bb854a6572fa540df7c7201958624b9 upstream.

ABIv2 kernels are failing to backtrace through the kernel. An example:

39.30%  readseek2_proce  [kernel.kallsyms]    [k] find_get_entry
            |
            --- find_get_entry
               __GI___libc_read

The problem is in valid_next_sp() where we check that the new stack
pointer is at least STACK_FRAME_OVERHEAD below the previous one.

ABIv1 has a minimum stack frame size of 112 bytes consisting of 48 bytes
and 64 bytes of parameter save area. ABIv2 changes that to 32 bytes
with no paramter save area.

STACK_FRAME_OVERHEAD is in theory the minimum stack frame size,
but we over 240 uses of it, some of which assume that it includes
space for the parameter area.

We need to work through all our stack defines and rationalise them
but let's fix perf now by creating STACK_FRAME_MIN_SIZE and using
in valid_next_sp(). This fixes the issue:

30.64%  readseek2_proce  [kernel.kallsyms]    [k] find_get_entry
            |
            --- find_get_entry
               pagecache_get_page
               generic_file_read_iter
               new_sync_read
               vfs_read
               sys_read
               syscall_exit
               __GI___libc_read

Reported-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoath9k_htc: fix random decryption failure
Johannes Stezenbach [Fri, 12 Sep 2014 20:36:51 +0000 (22:36 +0200)]
ath9k_htc: fix random decryption failure

commit d21ccfd0a60ea3dece3e1d142f52694abf87a0b1 upstream.

In v3.15 the driver stopped to accept network packets after successful
authentification, which could be worked around by passing the
nohwcrypt=1 module parameter.  This was not reproducible by
everyone, and showed random behaviour in some tests.
It was caused by an uninitialized variable introduced
in 4ed1a8d4a257 ("ath9k_htc: use ath9k_cmn_rx_accept") and
used in 341b29b9cd2f ("ath9k_htc: use ath9k_cmn_rx_skb_postprocess").

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=78581
Fixes: 341b29b9cd2f ("ath9k_htc: use ath9k_cmn_rx_skb_postprocess")
Signed-off-by: Johannes Stezenbach <js@sig21.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agobrcmfmac: handle IF event for P2P_DEVICE interface
Arend van Spriel [Fri, 12 Sep 2014 14:19:30 +0000 (16:19 +0200)]
brcmfmac: handle IF event for P2P_DEVICE interface

commit 87c4790330810fe5caf0172d9320cf24ef19cebe upstream.

The firmware notifies about interface changes through the IF event
which has a NO_IF flag that means host can ignore the event. This
behaviour was introduced in the driver by:

  commit 2ee8382fc6c763c76396a6aaff77a27089eed3aa
  Author: Arend van Spriel <arend@broadcom.com>
  Date:   Sat Aug 10 12:27:24 2013 +0200

      brcmfmac: ignore IF event if firmware indicates it

It turns out that the IF event for the P2P_DEVICE also has this
flag set, but the event should not be ignored in this scenario.
The mentioned commit caused a regression in 3.12 kernel in creation
of the P2P_DEVICE interface.

Reviewed-by: Hante Meuleman <meuleman@broadcom.com>
Reviewed-by: Franky (Zhenhui) Lin <frankyl@broadcom.com>
Reviewed-by: Daniel (Deognyoun) Kim <dekim@broadcom.com>
Reviewed-by: Pieter-Paul Giesberts <pieterpg@broadcom.com>
Signed-off-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agosched: Fix unreleased llc_shared_mask bit during CPU hotplug
Wanpeng Li [Wed, 24 Sep 2014 08:38:05 +0000 (16:38 +0800)]
sched: Fix unreleased llc_shared_mask bit during CPU hotplug

commit 03bd4e1f7265548832a76e7919a81f3137c44fd1 upstream.

The following bug can be triggered by hot adding and removing a large number of
xen domain0's vcpus repeatedly:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000004 IP: [..] find_busiest_group
PGD 5a9d5067 PUD 13067 PMD 0
Oops: 0000 [#3] SMP
[...]
Call Trace:
load_balance
? _raw_spin_unlock_irqrestore
idle_balance
__schedule
schedule
schedule_timeout
? lock_timer_base
schedule_timeout_uninterruptible
msleep
lock_device_hotplug_sysfs
online_store
dev_attr_store
sysfs_write_file
vfs_write
SyS_write
system_call_fastpath

Last level cache shared mask is built during CPU up and the
build_sched_domain() routine takes advantage of it to setup
the sched domain CPU topology.

However, llc_shared_mask is not released during CPU disable,
which leads to an invalid sched domainCPU topology.

This patch fix it by releasing the llc_shared_mask correctly
during CPU disable.

Yasuaki also reported that this can happen on real hardware:

  https://lkml.org/lkml/2014/7/22/1018

His case is here:

==
Here is an example on my system.
My system has 4 sockets and each socket has 15 cores and HT is
enabled. In this case, each core of sockes is numbered as
follows:

 | CPU#
Socket#0 | 0-14 , 60-74
Socket#1 | 15-29, 75-89
Socket#2 | 30-44, 90-104
Socket#3 | 45-59, 105-119

Then llc_shared_mask of CPU#30 has 0x3fff80000001fffc0000000.

It means that last level cache of Socket#2 is shared with
CPU#30-44 and 90-104.

When hot-removing socket#2 and #3, each core of sockets is
numbered as follows:

 | CPU#
Socket#0 | 0-14 , 60-74
Socket#1 | 15-29, 75-89

But llc_shared_mask is not cleared. So llc_shared_mask of CPU#30
remains having 0x3fff80000001fffc0000000.

After that, when hot-adding socket#2 and #3, each core of
sockets is numbered as follows:

 | CPU#
Socket#0 | 0-14 , 60-74
Socket#1 | 15-29, 75-89
Socket#2 | 30-59
Socket#3 | 90-119

Then llc_shared_mask of CPU#30 becomes
0x3fff8000fffffffc0000000. It means that last level cache of
Socket#2 is shared with CPU#30-59 and 90-104. So the mask has
the wrong value.

Signed-off-by: Wanpeng Li <wanpeng.li@linux.intel.com>
Tested-by: Linn Crosetto <linn@hp.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Toshi Kani <toshi.kani@hp.com>
Reviewed-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Steven Rostedt <srostedt@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1411547885-48165-1-git-send-email-wanpeng.li@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agomm: softdirty: keep bit when zapping file pte
Peter Feiner [Thu, 25 Sep 2014 23:05:29 +0000 (16:05 -0700)]
mm: softdirty: keep bit when zapping file pte

commit dbab31aa2ceec2d201966fa0b552f151310ba5f4 upstream.

This fixes the same bug as b43790eedd31 ("mm: softdirty: don't forget to
save file map softdiry bit on unmap") and 9aed8614af5a ("mm/memory.c:
don't forget to set softdirty on file mapped fault") where the return
value of pte_*mksoft_dirty was being ignored.

To be sure that no other pte/pmd "mk" function return values were being
ignored, I annotated the functions in arch/x86/include/asm/pgtable.h
with __must_check and rebuilt.

The userspace effect of this bug is that the softdirty mark might be
lost if a file mapped pte get zapped.

Signed-off-by: Peter Feiner <pfeiner@google.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Jamie Liu <jamieliu@google.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agofs/cachefiles: add missing \n to kerror conversions
Fabian Frederick [Thu, 25 Sep 2014 23:05:27 +0000 (16:05 -0700)]
fs/cachefiles: add missing \n to kerror conversions

commit 6ff66ac77aeaa9c13db28784e1c50c027a1f487b upstream.

Commit 0227d6abb378 ("fs/cachefiles: replace kerror by pr_err") didn't
include newline featuring in original kerror definition

Signed-off-by: Fabian Frederick <fabf@skynet.be>
Reported-by: David Howells <dhowells@redhat.com>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agomm, slab: initialize object alignment on cache creation
David Rientjes [Thu, 25 Sep 2014 23:05:20 +0000 (16:05 -0700)]
mm, slab: initialize object alignment on cache creation

commit d4a5fca592b9ab52b90bb261a90af3c8f53be011 upstream.

Since commit 4590685546a3 ("mm/sl[aou]b: Common alignment code"), the
"ralign" automatic variable in __kmem_cache_create() may be used as
uninitialized.

The proper alignment defaults to BYTES_PER_WORD and can be overridden by
SLAB_RED_ZONE or the alignment specified by the caller.

This fixes https://bugzilla.kernel.org/show_bug.cgi?id=85031

Signed-off-by: David Rientjes <rientjes@google.com>
Reported-by: Andrei Elovikov <a.elovikov@gmail.com>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoocfs2/dlm: do not get resource spinlock if lockres is new
Joseph Qi [Thu, 25 Sep 2014 23:05:16 +0000 (16:05 -0700)]
ocfs2/dlm: do not get resource spinlock if lockres is new

commit 5760a97c7143c208fa3a8f8cad0ed7dd672ebd28 upstream.

There is a deadlock case which reported by Guozhonghua:
  https://oss.oracle.com/pipermail/ocfs2-devel/2014-September/010079.html

This case is caused by &res->spinlock and &dlm->master_lock
misordering in different threads.

It was introduced by commit 8d400b81cc83 ("ocfs2/dlm: Clean up refmap
helpers").  Since lockres is new, it doesn't not require the
&res->spinlock.  So remove it.

Fixes: 8d400b81cc83 ("ocfs2/dlm: Clean up refmap helpers")
Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Reviewed-by: joyce.xue <xuejiufei@huawei.com>
Reported-by: Guozhonghua <guozhonghua@h3c.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agonilfs2: fix data loss with mmap()
Andreas Rohner [Thu, 25 Sep 2014 23:05:14 +0000 (16:05 -0700)]
nilfs2: fix data loss with mmap()

commit 56d7acc792c0d98f38f22058671ee715ff197023 upstream.

This bug leads to reproducible silent data loss, despite the use of
msync(), sync() and a clean unmount of the file system.  It is easily
reproducible with the following script:

  ----------------[BEGIN SCRIPT]--------------------
  mkfs.nilfs2 -f /dev/sdb
  mount /dev/sdb /mnt

  dd if=/dev/zero bs=1M count=30 of=/mnt/testfile

  umount /mnt
  mount /dev/sdb /mnt
  CHECKSUM_BEFORE="$(md5sum /mnt/testfile)"

  /root/mmaptest/mmaptest /mnt/testfile 30 10 5

  sync
  CHECKSUM_AFTER="$(md5sum /mnt/testfile)"
  umount /mnt
  mount /dev/sdb /mnt
  CHECKSUM_AFTER_REMOUNT="$(md5sum /mnt/testfile)"
  umount /mnt

  echo "BEFORE MMAP:\t$CHECKSUM_BEFORE"
  echo "AFTER MMAP:\t$CHECKSUM_AFTER"
  echo "AFTER REMOUNT:\t$CHECKSUM_AFTER_REMOUNT"
  ----------------[END SCRIPT]--------------------

The mmaptest tool looks something like this (very simplified, with
error checking removed):

  ----------------[BEGIN mmaptest]--------------------
  data = mmap(NULL, file_size - file_offset, PROT_READ | PROT_WRITE,
              MAP_SHARED, fd, file_offset);

  for (i = 0; i < write_count; ++i) {
        memcpy(data + i * 4096, buf, sizeof(buf));
        msync(data, file_size - file_offset, MS_SYNC))
  }
  ----------------[END mmaptest]--------------------

The output of the script looks something like this:

  BEFORE MMAP:    281ed1d5ae50e8419f9b978aab16de83  /mnt/testfile
  AFTER MMAP:     6604a1c31f10780331a6850371b3a313  /mnt/testfile
  AFTER REMOUNT:  281ed1d5ae50e8419f9b978aab16de83  /mnt/testfile

So it is clear, that the changes done using mmap() do not survive a
remount.  This can be reproduced a 100% of the time.  The problem was
introduced in commit 136e8770cd5d ("nilfs2: fix issue of
nilfs_set_page_dirty() for page at EOF boundary").

If the page was read with mpage_readpage() or mpage_readpages() for
example, then it has no buffers attached to it.  In that case
page_has_buffers(page) in nilfs_set_page_dirty() will be false.
Therefore nilfs_set_file_dirty() is never called and the pages are never
collected and never written to disk.

This patch fixes the problem by also calling nilfs_set_file_dirty() if the
page has no buffers attached to it.

[akpm@linux-foundation.org: s/PAGE_SHIFT/PAGE_CACHE_SHIFT/]
Signed-off-by: Andreas Rohner <andreas.rohner@gmx.net>
Tested-by: Andreas Rohner <andreas.rohner@gmx.net>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agofs/notify: don't show f_handle if exportfs_encode_inode_fh failed
Andrey Vagin [Tue, 9 Sep 2014 21:51:06 +0000 (14:51 -0700)]
fs/notify: don't show f_handle if exportfs_encode_inode_fh failed

commit 7e8824816bda16bb11ff5ff1e1212d642e57b0b3 upstream.

Currently we handle only ENOSPC.  In case of other errors the file_handle
variable isn't filled properly and we will show a part of stack.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agofsnotify/fdinfo: use named constants instead of hardcoded values
Andrey Vagin [Tue, 9 Sep 2014 21:51:04 +0000 (14:51 -0700)]
fsnotify/fdinfo: use named constants instead of hardcoded values

commit 1fc98d11cac6dd66342e5580cb2687e5b1e9a613 upstream.

MAX_HANDLE_SZ is equal to 128, but currently the size of pad is only 64
bytes, so exportfs_encode_inode_fh can return an error.

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agokcmp: fix standard comparison bug
Rasmus Villemoes [Tue, 9 Sep 2014 21:51:01 +0000 (14:51 -0700)]
kcmp: fix standard comparison bug

commit acbbe6fbb240a927ee1f5994f04d31267d422215 upstream.

The C operator <= defines a perfectly fine total ordering on the set of
values representable in a long.  However, unlike its namesake in the
integers, it is not translation invariant, meaning that we do not have
"b <= c" iff "a+b <= a+c" for all a,b,c.

This means that it is always wrong to try to boil down the relationship
between two longs to a question about the sign of their difference,
because the resulting relation [a LEQ b iff a-b <= 0] is neither
anti-symmetric or transitive.  The former is due to -LONG_MIN==LONG_MIN
(take any two a,b with a-b = LONG_MIN; then a LEQ b and b LEQ a, but a !=
b).  The latter can either be seen observing that x LEQ x+1 for all x,
implying x LEQ x+1 LEQ x+2 ...  LEQ x-1 LEQ x; or more directly with the
simple example a=LONG_MIN, b=0, c=1, for which a-b < 0, b-c < 0, but a-c >
0.

Note that it makes absolutely no difference that a transmogrying bijection
has been applied before the comparison is done.  In fact, had the
obfuscation not been done, one could probably not observe the bug
(assuming all values being compared always lie in one half of the address
space, the mathematical value of a-b is always representable in a long).
As it stands, one can easily obtain three file descriptors exhibiting the
non-transitivity of kcmp().

Side note 1: I can't see that ensuring the MSB of the multiplier is
set serves any purpose other than obfuscating the obfuscating code.

Side note 2:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <assert.h>
#include <sys/syscall.h>

enum kcmp_type {
        KCMP_FILE,
        KCMP_VM,
        KCMP_FILES,
        KCMP_FS,
        KCMP_SIGHAND,
        KCMP_IO,
        KCMP_SYSVSEM,
        KCMP_TYPES,
};
pid_t pid;

int kcmp(pid_t pid1, pid_t pid2, int type,
 unsigned long idx1, unsigned long idx2)
{
return syscall(SYS_kcmp, pid1, pid2, type, idx1, idx2);
}
int cmp_fd(int fd1, int fd2)
{
int c = kcmp(pid, pid, KCMP_FILE, fd1, fd2);
if (c < 0) {
perror("kcmp");
exit(1);
}
assert(0 <= c && c < 3);
return c;
}
int cmp_fdp(const void *a, const void *b)
{
static const int normalize[] = {0, -1, 1};
return normalize[cmp_fd(*(int*)a, *(int*)b)];
}
#define MAX 100 /* This is plenty; I've seen it trigger for MAX==3 */
int main(int argc, char *argv[])
{
int r, s, count = 0;
int REL[3] = {0,0,0};
int fd[MAX];
pid = getpid();
while (count < MAX) {
r = open("/dev/null", O_RDONLY);
if (r < 0)
break;
fd[count++] = r;
}
printf("opened %d file descriptors\n", count);
for (r = 0; r < count; ++r) {
for (s = r+1; s < count; ++s) {
REL[cmp_fd(fd[r], fd[s])]++;
}
}
printf("== %d\t< %d\t> %d\n", REL[0], REL[1], REL[2]);
qsort(fd, count, sizeof(fd[0]), cmp_fdp);
memset(REL, 0, sizeof(REL));

for (r = 0; r < count; ++r) {
for (s = r+1; s < count; ++s) {
REL[cmp_fd(fd[r], fd[s])]++;
}
}
printf("== %d\t< %d\t> %d\n", REL[0], REL[1], REL[2]);
return (REL[0] + REL[2] != 0);
}

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Reviewed-by: Cyrill Gorcunov <gorcunov@openvz.org>
"Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoeventpoll: fix uninitialized variable in epoll_ctl
Nicolas Iooss [Tue, 9 Sep 2014 21:50:51 +0000 (14:50 -0700)]
eventpoll: fix uninitialized variable in epoll_ctl

commit c680e41b3a2e944185c74bf60531e3d316d3ecc4 upstream.

When calling epoll_ctl with operation EPOLL_CTL_DEL, structure epds is
not initialized but ep_take_care_of_epollwakeup reads its event field.
When this unintialized field has EPOLLWAKEUP bit set, a capability check
is done for CAP_BLOCK_SUSPEND in ep_take_care_of_epollwakeup.  This
produces unexpected messages in the audit log, such as (on a system
running SELinux):

    type=AVC msg=audit(1408212798.866:410): avc:  denied
    { block_suspend } for  pid=7754 comm="dbus-daemon" capability=36
    scontext=unconfined_u:unconfined_r:unconfined_t
    tcontext=unconfined_u:unconfined_r:unconfined_t
    tclass=capability2 permissive=1

    type=SYSCALL msg=audit(1408212798.866:410): arch=c000003e syscall=233
    success=yes exit=0 a0=3 a1=2 a2=9 a3=7fffd4d66ec0 items=0 ppid=1
    pid=7754 auid=1000 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0
    fsgid=0 tty=(none) ses=3 comm="dbus-daemon"
    exe="/usr/bin/dbus-daemon"
    subj=unconfined_u:unconfined_r:unconfined_t key=(null)

("arch=c000003e syscall=233 a1=2" means "epoll_ctl(op=EPOLL_CTL_DEL)")

Remove use of epds in epoll_ctl when op == EPOLL_CTL_DEL.

Fixes: 4d7e30d98939 ("epoll: Add a flag, EPOLLWAKEUP, to prevent suspend while epoll events are ready")
Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Arve Hjønnevåg <arve@android.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agokernel/printk/printk.c: fix faulty logic in the case of recursive printk
Patrick Palka [Tue, 9 Sep 2014 21:50:48 +0000 (14:50 -0700)]
kernel/printk/printk.c: fix faulty logic in the case of recursive printk

commit 000a7d66ec30898f46869be01ab8205b056385d0 upstream.

We shouldn't set text_len in the code path that detects printk recursion
because text_len corresponds to the length of the string inside textbuf.
A few lines down from the line

    text_len = strlen(recursion_msg);

is the line

    text_len += vscnprintf(text + text_len, ...);

So if printk detects recursion, it sets text_len to 29 (the length of
recursion_msg) and logs an error.  Then the message supplied by the
caller of printk is stored inside textbuf but offset by 29 bytes.  This
means that the output of the recursive call to printk will contain 29
bytes of garbage in front of it.

This defect is caused by commit 458df9fd4815 ("printk: remove separate
printk_sched buffers and use printk buf instead") which turned the line

    text_len = vscnprintf(text, ...);

into

    text_len += vscnprintf(text + text_len, ...);

To fix this, this patch avoids setting text_len when logging the printk
recursion error.  This patch also marks unlikely() the branch leading up
to this code.

Fixes: 458df9fd4815b478 ("printk: remove separate printk_sched buffers and use printk buf instead")
Signed-off-by: Patrick Palka <patrick@parcs.ath.cx>
Reviewed-by: Petr Mladek <pmladek@suse.cz>
Reviewed-by: Jan Kara <jack@suse.cz>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoRevert "mac80211: disable uAPSD if all ACs are under ACM"
Johannes Berg [Mon, 25 Aug 2014 10:08:09 +0000 (12:08 +0200)]
Revert "mac80211: disable uAPSD if all ACs are under ACM"

commit bb512ad0732232f1d2693bb68f31a76bed8f22ae upstream.

This reverts commit 24aa11ab8ae03292d38ec0dbd9bc2ac49fe8a6dd.

That commit was wrong since it uses data that hasn't even been set
up yet, but might be a hold-over from a previous connection.

Additionally, it seems like a driver-specific workaround that
shouldn't have been in mac80211 to start with.

Fixes: 24aa11ab8ae0 ("mac80211: disable uAPSD if all ACs are under ACM")
Reviewed-by: Luciano Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoftrace: Update all ftrace_ops for a ftrace_hash_ops update
Steven Rostedt (Red Hat) [Mon, 18 Aug 2014 17:21:08 +0000 (13:21 -0400)]
ftrace: Update all ftrace_ops for a ftrace_hash_ops update

commit 84261912ebee41269004e8a9f3614ba38ef6b206 upstream.

When updating what an ftrace_ops traces, if it is registered (that is,
actively tracing), and that ftrace_ops uses the shared global_ops
local_hash, then we need to update all tracers that are active and
also share the global_ops' ftrace_hash_ops.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoftrace: Fix function_profiler and function tracer together
Steven Rostedt (Red Hat) [Fri, 15 Aug 2014 21:18:46 +0000 (17:18 -0400)]
ftrace: Fix function_profiler and function tracer together

commit 5f151b240192a1557119d5375af71efc26825bc8 upstream.

The latest rewrite of ftrace removed the separate ftrace_ops of
the function tracer and the function graph tracer and had them
share the same ftrace_ops. This simplified the accounting by removing
the multiple layers of functions called, where the global_ops func
would call a special list that would iterate over the other ops that
were registered within it (like function and function graph), which
itself was registered to the ftrace ops list of all functions
currently active. If that sounds confusing, the code that implemented
it was also confusing and its removal is a good thing.

The problem with this change was that it assumed that the function
and function graph tracer can never be used at the same time.
This is mostly true, but there is an exception. That is when the
function profiler uses the function graph tracer to profile.
The function profiler can be activated the same time as the function
tracer, and this breaks the assumption and the result is that ftrace
will crash (it detects the error and shuts itself down, it does not
cause a kernel oops).

To solve this issue, a previous change allowed the hash tables
for the functions traced by a ftrace_ops to be a pointer and let
multiple ftrace_ops share the same hash. This allows the function
and function_graph tracer to have separate ftrace_ops, but still
share the hash, which is what is done.

Now the function and function graph tracers have separate ftrace_ops
again, and the function tracer can be run while the function_profile
is active.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoftrace: Allow ftrace_ops to use the hashes from other ops
Steven Rostedt (Red Hat) [Fri, 15 Aug 2014 21:23:02 +0000 (17:23 -0400)]
ftrace: Allow ftrace_ops to use the hashes from other ops

commit 33b7f99cf003ca6c1d31c42b50e1100ad71aaec0 upstream.

Currently the top level debug file system function tracer shares its
ftrace_ops with the function graph tracer. This was thought to be fine
because the tracers are not used together, as one can only enable
function or function_graph tracer in the current_tracer file.

But that assumption proved to be incorrect. The function profiler
can use the function graph tracer when function tracing is enabled.
Since all function graph users uses the function tracing ftrace_ops
this causes a conflict and when a user enables both function profiling
as well as the function tracer it will crash ftrace and disable it.

The quick solution so far is to move them as separate ftrace_ops like
it was earlier. The problem though is to synchronize the functions that
are traced because both function and function_graph tracer are limited
by the selections made in the set_ftrace_filter and set_ftrace_notrace
files.

To handle this, a new structure is made called ftrace_ops_hash. This
structure will now hold the filter_hash and notrace_hash, and the
ftrace_ops will point to this structure. That will allow two ftrace_ops
to share the same hashes.

Since most ftrace_ops do not share the hashes, and to keep allocation
simple, the ftrace_ops structure will include both a pointer to the
ftrace_ops_hash called func_hash, as well as the structure itself,
called local_hash. When the ops are registered, the func_hash pointer
will be initialized to point to the local_hash within the ftrace_ops
structure. Some of the ftrace internal ftrace_ops will be initialized
statically. This will allow for the function and function_graph tracer
to have separate ops but still share the same hash tables that determine
what functions they trace.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agousb: dwc3: fix TRB completion when multiple TRBs are started
Felipe Balbi [Thu, 4 Sep 2014 15:28:10 +0000 (10:28 -0500)]
usb: dwc3: fix TRB completion when multiple TRBs are started

commit 0b93a4c838fa10370d72f86fe712426ac63804de upstream.

After commit 2ec2a8be (usb: dwc3: gadget:
always enable IOC on bulk/interrupt transfers)
we created a situation where it was possible to
hang a bulk/interrupt endpoint if we had more
than one pending request in our queue and they
were both started with a single Start Transfer
command.

The problems triggers because we had not enabled
Transfer In Progress event for those endpoints
and we were not able to process early giveback
of requests completed without LST bit set.

Fix the problem by finally enabling Xfer In Progress
event for all endpoint types, except control.

Fixes: 2ec2a8be (usb: dwc3: gadget: always
enable IOC on bulk/interrupt transfers)
Reported-by: Pratyush Anand <pratyush.anand@st.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agogenhd: fix leftover might_sleep() in blk_free_devt()
Jens Axboe [Tue, 16 Sep 2014 19:38:51 +0000 (13:38 -0600)]
genhd: fix leftover might_sleep() in blk_free_devt()

commit 46f341ffcfb5d8530f7d1e60f3be06cce6661b62 upstream.

Commit 2da78092 changed the locking from a mutex to a spinlock,
so we now longer sleep in this context. But there was a leftover
might_sleep() in there, which now triggers since we do the final
free from an RCU callback. Get rid of it.

Reported-by: Pontus Fuchs <pontus.fuchs@gmail.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agolockdep: Revert lockdep check in raw_seqcount_begin()
Trond Myklebust [Thu, 5 Jun 2014 15:31:01 +0000 (11:31 -0400)]
lockdep: Revert lockdep check in raw_seqcount_begin()

commit 22fdcf02f6e80d64a927f702dd9d631a927d87d4 upstream.

This commit reverts the addition of lockdep checking to raw_seqcount_begin
for the following reasons:

 1) It violates the naming convention that raw_* functions should not
    do lockdep checks (a convention that is also followed by the other
    raw_*_seqcount_begin functions).

 2) raw_seqcount_begin does not spin, so it can only be part of an ABBA
    deadlock in very special circumstances (for instance if a lock
    is held across the entire raw_seqcount_begin()+read_seqcount_retry()
    loop while also being taken inside the write_seqcount protected area).

 3) It is causing false positives with some existing callers, and there
    is no non-lockdep alternative for those callers to use.

None of the three existing callers (__d_lookup_rcu, netdev_get_name, and
the NFS state code) appear to use the function in a manner that is ABBA
deadlock prone.

Fixes: 1ca7d67cf5d5: seqcount: Add lockdep functionality to seqcount/seqlock
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: John Stultz <john.stultz@linaro.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Waiman Long <Waiman.Long@hp.com>
Cc: Stephen Boyd <sboyd@codeaurora.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/CAHQdGtRR6SvEhXiqWo24hoUh9AU9cL82Z8Z-d8-7u951F_d+5g@mail.gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agolockd: fix rpcbind crash on lockd startup failure
J. Bruce Fields [Fri, 29 Aug 2014 20:25:50 +0000 (16:25 -0400)]
lockd: fix rpcbind crash on lockd startup failure

commit 7c17705e77b12b20fb8afb7c1b15dcdb126c0c12 upstream.

Nikita Yuschenko reported that booting a kernel with init=/bin/sh and
then nfs mounting without portmap or rpcbind running using a busybox
mount resulted in:

  # mount -t nfs 10.30.130.21:/opt /mnt
  svc: failed to register lockdv1 RPC service (errno 111).
  lockd_up: makesock failed, error=-111
  Unable to handle kernel paging request for data at address 0x00000030
  Faulting instruction address: 0xc055e65c
  Oops: Kernel access of bad area, sig: 11 [#1]
  MPC85xx CDS
  Modules linked in:
  CPU: 0 PID: 1338 Comm: mount Not tainted 3.10.44.cge #117
  task: cf29cea0 ti: cf35c000 task.ti: cf35c000
  NIP: c055e65c LR: c0566490 CTR: c055e648
  REGS: cf35dad0 TRAP: 0300   Not tainted  (3.10.44.cge)
  MSR: 00029000 <CE,EE,ME>  CR: 22442488  XER: 20000000
  DEAR: 00000030, ESR: 00000000

  GPR00: c05606f4 cf35db80 cf29cea0 cf0ded80 cf0dedb8 00000001 1dec3086
  00000000
  GPR08: 00000000 c07b1640 00000007 1dec3086 22442482 100b9758 00000000
  10090ae8
  GPR16: 00000000 000186a5 00000000 00000000 100c3018 bfa46edc 100b0000
  bfa46ef0
  GPR24: cf386ae0 c07834f0 00000000 c0565f88 00000001 cf0dedb8 00000000
  cf0ded80
  NIP [c055e65c] call_start+0x14/0x34
  LR [c0566490] __rpc_execute+0x70/0x250
  Call Trace:
  [cf35db80] [00000080] 0x80 (unreliable)
  [cf35dbb0] [c05606f4] rpc_run_task+0x9c/0xc4
  [cf35dbc0] [c0560840] rpc_call_sync+0x50/0xb8
  [cf35dbf0] [c056ee90] rpcb_register_call+0x54/0x84
  [cf35dc10] [c056f24c] rpcb_register+0xf8/0x10c
  [cf35dc70] [c0569e18] svc_unregister.isra.23+0x100/0x108
  [cf35dc90] [c0569e38] svc_rpcb_cleanup+0x18/0x30
  [cf35dca0] [c0198c5c] lockd_up+0x1dc/0x2e0
  [cf35dcd0] [c0195348] nlmclnt_init+0x2c/0xc8
  [cf35dcf0] [c015bb5c] nfs_start_lockd+0x98/0xec
  [cf35dd20] [c015ce6c] nfs_create_server+0x1e8/0x3f4
  [cf35dd90] [c0171590] nfs3_create_server+0x10/0x44
  [cf35dda0] [c016528c] nfs_try_mount+0x158/0x1e4
  [cf35de20] [c01670d0] nfs_fs_mount+0x434/0x8c8
  [cf35de70] [c00cd3bc] mount_fs+0x20/0xbc
  [cf35de90] [c00e4f88] vfs_kern_mount+0x50/0x104
  [cf35dec0] [c00e6e0c] do_mount+0x1d0/0x8e0
  [cf35df10] [c00e75ac] SyS_mount+0x90/0xd0
  [cf35df40] [c000ccf4] ret_from_syscall+0x0/0x3c

The addition of svc_shutdown_net() resulted in two calls to
svc_rpcb_cleanup(); the second is no longer necessary and crashes when
it calls rpcb_register_call with clnt=NULL.

Reported-by: Nikita Yushchenko <nyushchenko@dev.rtsoft.ru>
Fixes: 679b033df484 "lockd: ensure we tear down any live sockets when socket creation fails during lockd_up"
Acked-by: Jeff Layton <jlayton@primarydata.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agortlwifi: rtl8192cu: Add new ID
Larry Finger [Sun, 24 Aug 2014 22:49:43 +0000 (17:49 -0500)]
rtlwifi: rtl8192cu: Add new ID

commit c66517165610b911e4c6d268f28d8c640832dbd1 upstream.

The Sitecom WLA-2102 adapter uses this driver.

Reported-by: Nico Baggus <nico-linux@noci.xs4all.nl>
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Cc: Nico Baggus <nico-linux@noci.xs4all.nl>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoregulatory: add NUL to alpha2
Eliad Peller [Wed, 11 Jun 2014 07:23:35 +0000 (10:23 +0300)]
regulatory: add NUL to alpha2

commit a5fe8e7695dc3f547e955ad2b662e3e72969e506 upstream.

alpha2 is defined as 2-chars array, but is used in multiple
places as string (e.g. with nla_put_string calls), which
might leak kernel data.

Solve it by simply adding an extra char for the NULL
terminator, making such operations safe.

Signed-off-by: Eliad Peller <eliadx.peller@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agopercpu: perform tlb flush after pcpu_map_pages() failure
Tejun Heo [Fri, 15 Aug 2014 20:06:10 +0000 (16:06 -0400)]
percpu: perform tlb flush after pcpu_map_pages() failure

commit 849f5169097e1ba35b90ac9df76b5bb6f9c0aabd upstream.

If pcpu_map_pages() fails midway, it unmaps the already mapped pages.
Currently, it doesn't flush tlb after the partial unmapping.  This may
be okay in most cases as the established mapping hasn't been used at
that point but it can go wrong and when it goes wrong it'd be
extremely difficult to track down.

Flush tlb after the partial unmapping.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agopercpu: fix pcpu_alloc_pages() failure path
Tejun Heo [Fri, 15 Aug 2014 20:06:06 +0000 (16:06 -0400)]
percpu: fix pcpu_alloc_pages() failure path

commit f0d279654dea22b7a6ad34b9334aee80cda62cde upstream.

When pcpu_alloc_pages() fails midway, pcpu_free_pages() is invoked to
free what has already been allocated.  The invocation is across the
whole requested range and pcpu_free_pages() will try to free all
non-NULL pages; unfortunately, this is incorrect as
pcpu_get_pages_and_bitmap(), unlike what its comment suggests, doesn't
clear the pages array and thus the array may have entries from the
previous invocations making the partial failure path free incorrect
pages.

Fix it by open-coding the partial freeing of the already allocated
pages.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agopercpu: free percpu allocation info for uniprocessor system
Honggang Li [Tue, 12 Aug 2014 13:36:15 +0000 (21:36 +0800)]
percpu: free percpu allocation info for uniprocessor system

commit 3189eddbcafcc4d827f7f19facbeddec4424eba8 upstream.

Currently, only SMP system free the percpu allocation info.
Uniprocessor system should free it too. For example, one x86 UML
virtual machine with 256MB memory, UML kernel wastes one page memory.

Signed-off-by: Honggang Li <enjoymindful@gmail.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agogpio: Fix potential NULL handler data in chained irqchip handler
Jarkko Nikula [Tue, 16 Sep 2014 13:23:15 +0000 (16:23 +0300)]
gpio: Fix potential NULL handler data in chained irqchip handler

commit 324b0398781e7afb846378dd2d8a4374faaf236b upstream.

There is possibility with misconfigured pins that interrupt occurs instantly
after setting irq_set_chained_handler() in gpiochip_set_chained_irqchip().
Now if handler gets called before irq_set_handler_data() the handler gets
NULL handler data.

Fix this by moving irq_set_handler_data() call before
irq_set_chained_handler() in gpiochip_set_chained_irqchip().

Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoata_piix: Add Device IDs for Intel 9 Series PCH
James Ralston [Wed, 27 Aug 2014 21:31:58 +0000 (14:31 -0700)]
ata_piix: Add Device IDs for Intel 9 Series PCH

commit 6cad1376954e591c3c41500c4e586e183e7ffe6d upstream.

This patch adds the IDE mode SATA Device IDs for the Intel 9 Series PCH.

Signed-off-by: James Ralston <james.d.ralston@intel.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agohwmon: (ds1621) Update zbits after conversion rate change
Robert Coulson [Thu, 28 Aug 2014 17:45:43 +0000 (10:45 -0700)]
hwmon: (ds1621) Update zbits after conversion rate change

commit 39c627a084475e8a690a4a9e7601410ca173ddd2 upstream.

After the conversion rate is changed, the zbits are not updated,
but should be, since they are used later in the set_temp function.

Fixes: a50d9a4d9ad3 ("hwmon: (ds1621) Fix temperature rounding operations")
Reported-by: Murat Ilsever <murat.ilsever@gmail.com>
Signed-off-by: Robert Coulson <rob.coulson@gmail.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoInput: i8042 - add nomux quirk for Avatar AVIU-145A6
Hans de Goede [Thu, 11 Sep 2014 17:10:26 +0000 (10:10 -0700)]
Input: i8042 - add nomux quirk for Avatar AVIU-145A6

commit d2682118f4bb3ceb835f91c1a694407a31bb7378 upstream.

The sys_vendor / product_name are somewhat generic unfortunately, so this
may lead to some false positives. But nomux usually does no harm, where as
not having it clearly is causing problems on the Avatar AVIU-145A6.

https://bugzilla.kernel.org/show_bug.cgi?id=77391

Reported-by: Hugo P <saurosii@gmail.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoInput: i8042 - add Fujitsu U574 to no_timeout dmi table
Hans de Goede [Wed, 10 Sep 2014 20:53:37 +0000 (13:53 -0700)]
Input: i8042 - add Fujitsu U574 to no_timeout dmi table

commit cc18a69c92d0972bc2fc5a047ee3be1e8398171b upstream.

https://bugzilla.kernel.org/show_bug.cgi?id=69731

Reported-by: Jason Robinson <mail@jasonrobinson.me>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoInput: atkbd - do not try 'deactivate' keyboard on any LG laptops
Dmitry Torokhov [Wed, 10 Sep 2014 20:50:37 +0000 (13:50 -0700)]
Input: atkbd - do not try 'deactivate' keyboard on any LG laptops

commit c01206796139e2b1feb7539bc72174fef1c6dc6e upstream.

We are getting more and more reports about LG laptops not having
functioning keyboard if we try to deactivate keyboard during probe.
Given that having keyboard deactivated is merely "nice to have"
instead of a hard requirement for probing, let's disable it on all
LG boxes instead of trying to hunt down particular models.

This change is prompted by patches trying to add "LG Electronics"/"ROCKY"
and "LG Electronics"/"LW60-F27B" to the DMI list.

https://bugzilla.kernel.org/show_bug.cgi?id=77051

Reported-by: Jaime Velasco Juan <jsagarribay@gmail.com>
Reported-by: Georgios Tsalikis <georgios@tsalikis.net>
Tested-by: Jaime Velasco Juan <jsagarribay@gmail.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoInput: elantech - fix detection of touchpad on ASUS s301l
Hans de Goede [Mon, 8 Sep 2014 21:39:52 +0000 (14:39 -0700)]
Input: elantech - fix detection of touchpad on ASUS s301l

commit 271329b3c798b2102120f5df829071c211ef00ed upstream.

Adjust Elantech signature validation to account fo rnewer models of
touchpads.

Reported-and-tested-by: Màrius Monton <marius.monton@gmail.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoInput: synaptics - add support for ForcePads
Dmitry Torokhov [Sat, 30 Aug 2014 20:51:06 +0000 (13:51 -0700)]
Input: synaptics - add support for ForcePads

commit 5715fc764f7753d464dbe094b5ef9cffa6e479a4 upstream.

ForcePads are found on HP EliteBook 1040 laptops. They lack any kind of
physical buttons, instead they generate primary button click when user
presses somewhat hard on the surface of the touchpad. Unfortunately they
also report primary button click whenever there are 2 or more contacts
on the pad, messing up all multi-finger gestures (2-finger scrolling,
multi-finger tapping, etc). To cope with this behavior we introduce a
delay (currently 50 msecs) in reporting primary press in case more
contacts appear.

Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoInput: serport - add compat handling for SPIOCSTYPE ioctl
John Sung [Tue, 9 Sep 2014 17:06:51 +0000 (10:06 -0700)]
Input: serport - add compat handling for SPIOCSTYPE ioctl

commit a80d8b02751060a178bb1f7a6b7a93645a7a308b upstream.

When running a 32-bit inputattach utility in a 64-bit system, there will be
error code "inputattach: can't set device type". This is caused by the
serport device driver not supporting compat_ioctl, so that SPIOCSTYPE ioctl
fails.

Signed-off-by: John Sung <penmount.touch@gmail.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agodm crypt: fix access beyond the end of allocated space
Mikulas Patocka [Thu, 28 Aug 2014 15:09:31 +0000 (11:09 -0400)]
dm crypt: fix access beyond the end of allocated space

commit d49ec52ff6ddcda178fc2476a109cf1bd1fa19ed upstream.

The DM crypt target accesses memory beyond allocated space resulting in
a crash on 32 bit x86 systems.

This bug is very old (it dates back to 2.6.25 commit 3a7f6c990ad04 "dm
crypt: use async crypto").  However, this bug was masked by the fact
that kmalloc rounds the size up to the next power of two.  This bug
wasn't exposed until 3.17-rc1 commit 298a9fa08a ("dm crypt: use per-bio
data").  By switching to using per-bio data there was no longer any
padding beyond the end of a dm-crypt allocated memory block.

To minimize allocation overhead dm-crypt puts several structures into one
block allocated with kmalloc.  The block holds struct ablkcipher_request,
cipher-specific scratch pad (crypto_ablkcipher_reqsize(any_tfm(cc))),
struct dm_crypt_request and an initialization vector.

The variable dmreq_start is set to offset of struct dm_crypt_request
within this memory block.  dm-crypt allocates the block with this size:
cc->dmreq_start + sizeof(struct dm_crypt_request) + cc->iv_size.

When accessing the initialization vector, dm-crypt uses the function
iv_of_dmreq, which performs this calculation: ALIGN((unsigned long)(dmreq
+ 1), crypto_ablkcipher_alignmask(any_tfm(cc)) + 1).

dm-crypt allocated "cc->iv_size" bytes beyond the end of dm_crypt_request
structure.  However, when dm-crypt accesses the initialization vector, it
takes a pointer to the end of dm_crypt_request, aligns it, and then uses
it as the initialization vector.  If the end of dm_crypt_request is not
aligned on a crypto_ablkcipher_alignmask(any_tfm(cc)) boundary the
alignment causes the initialization vector to point beyond the allocated
space.

Fix this bug by calculating the variable iv_size_padding and adding it
to the allocated size.

Also correct the alignment of dm_crypt_request.  struct dm_crypt_request
is specific to dm-crypt (it isn't used by the crypto subsystem at all),
so it is aligned on __alignof__(struct dm_crypt_request).

Also align per_bio_data_size on ARCH_KMALLOC_MINALIGN, so that it is
aligned as if the block was allocated with kmalloc.

Reported-by: Krzysztof Kolasa <kkolasa@winsoft.pl>
Tested-by: Milan Broz <gmazyland@gmail.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agodm cache: fix race causing dirty blocks to be marked as clean
Anssi Hannula [Fri, 5 Sep 2014 00:11:28 +0000 (03:11 +0300)]
dm cache: fix race causing dirty blocks to be marked as clean

commit 40aa978eccec61347cd47b97c598df49acde8be5 upstream.

When a writeback or a promotion of a block is completed, the cell of
that block is removed from the prison, the block is marked as clean, and
the clear_dirty() callback of the cache policy is called.

Unfortunately, performing those actions in this order allows an incoming
new write bio for that block to come in before clearing the dirty status
is completed and therefore possibly causing one of these two scenarios:

Scenario A:

Thread 1                      Thread 2
cell_defer()                  .
- cell removed from prison    .
- detained bios queued        .
.                             incoming write bio
.                             remapped to cache
.                             set_dirty() called,
.                               but block already dirty
.                               => it does nothing
clear_dirty()                 .
- block marked clean          .
- policy clear_dirty() called .

Result: Block is marked clean even though it is actually dirty. No
writeback will occur.

Scenario B:

Thread 1                      Thread 2
cell_defer()                  .
- cell removed from prison    .
- detained bios queued        .
clear_dirty()                 .
- block marked clean          .
.                             incoming write bio
.                             remapped to cache
.                             set_dirty() called
.                             - block marked dirty
.                             - policy set_dirty() called
- policy clear_dirty() called .

Result: Block is properly marked as dirty, but policy thinks it is clean
and therefore never asks us to writeback it.
This case is visible in "dmsetup status" dirty block count (which
normally decreases to 0 on a quiet device).

Fix these issues by calling clear_dirty() before calling cell_defer().
Incoming bios for that block will then be detained in the cell and
released only after clear_dirty() has completed, so the race will not
occur.

Found by inspecting the code after noticing spurious dirty counts
(scenario B).

Signed-off-by: Anssi Hannula <anssi.hannula@iki.fi>
Acked-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoblock: Fix dev_t minor allocation lifetime
Keith Busch [Tue, 26 Aug 2014 15:05:36 +0000 (09:05 -0600)]
block: Fix dev_t minor allocation lifetime

commit 2da78092dda13f1efd26edbbf99a567776913750 upstream.

Releases the dev_t minor when all references are closed to prevent
another device from acquiring the same major/minor.

Since the partition's release may be invoked from call_rcu's soft-irq
context, the ext_dev_idr's mutex had to be replaced with a spinlock so
as not so sleep.

Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agofutex: Unlock hb->lock in futex_wait_requeue_pi() error path
Thomas Gleixner [Thu, 11 Sep 2014 21:44:35 +0000 (23:44 +0200)]
futex: Unlock hb->lock in futex_wait_requeue_pi() error path

commit 13c42c2f43b19aab3195f2d357db00d1e885eaa8 upstream.

futex_wait_requeue_pi() calls futex_wait_setup(). If
futex_wait_setup() succeeds it returns with hb->lock held and
preemption disabled. Now the sanity check after this does:

        if (match_futex(&q.key, &key2)) {
    ret = -EINVAL;
goto out_put_keys;
}

which releases the keys but does not release hb->lock.

So we happily return to user space with hb->lock held and therefor
preemption disabled.

Unlock hb->lock before taking the exit route.

Reported-by: Dave "Trinity" Jones <davej@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Darren Hart <dvhart@linux.intel.com>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1409112318500.4178@nanos
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoworkqueue: apply __WQ_ORDERED to create_singlethread_workqueue()
Tejun Heo [Fri, 12 Sep 2014 19:14:30 +0000 (04:14 +0900)]
workqueue: apply __WQ_ORDERED to create_singlethread_workqueue()

commit e09c2c295468476a239d13324ce9042ec4de05eb upstream.

create_singlethread_workqueue() is a compat interface for single
threaded workqueue which maps to ordered workqueue w/ rescuer in the
current implementation.  create_singlethread_workqueue() currently
implemented by invoking alloc_workqueue() w/ appropriate parameters.

8719dceae2f9 ("workqueue: reject adjusting max_active or applying
attrs to ordered workqueues") introduced __WQ_ORDERED to protect
ordered workqueues against dynamic attribute changes which can break
ordering guarantees but forgot to apply it to
create_singlethread_workqueue().  This in itself is okay as nobody
currently uses dynamic attribute change on workqueues created with
create_singlethread_workqueue().

However, 4c16bd327c ("workqueue: implement NUMA affinity for unbound
workqueues") broke singlethreaded guarantee for ordered workqueues
through allocating a separate pool_workqueue on each NUMA node by
default.  A later change 8a2b75384444 ("workqueue: fix ordered
workqueues in NUMA setups") fixed it by allocating only one global
pool_workqueue if __WQ_ORDERED is set.

Combined, the __WQ_ORDERED omission in create_singlethread_workqueue()
became critical breaking its single threadedness and ordering
guarantee.

Let's make create_singlethread_workqueue() wrap
alloc_ordered_workqueue() instead so that it inherits __WQ_ORDERED and
can implicitly track future ordered_workqueue changes.

v2: I missed that __WQ_ORDERED now protects against pwq splitting
    across NUMA nodes and incorrectly described the patch as a
    nice-to-have fix to protect against future dynamic attribute
    usages.  Oleg pointed out that this is actually a critical
    breakage due to 8a2b75384444 ("workqueue: fix ordered workqueues
    in NUMA setups").

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Mike Anderson <mike.anderson@us.ibm.com>
Cc: Oleg Nesterov <onestero@redhat.com>
Cc: Gustavo Luiz Duarte <gduarte@redhat.com>
Cc: Tomas Henzl <thenzl@redhat.com>
Fixes: 4c16bd327c ("workqueue: implement NUMA affinity for unbound workqueues")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoiwlwifi: mvm: set MAC_FILTER_IN_BEACON correctly for STA/P2P client
Luciano Coelho [Wed, 30 Jul 2014 12:04:08 +0000 (15:04 +0300)]
iwlwifi: mvm: set MAC_FILTER_IN_BEACON correctly for STA/P2P client

commit 7c8b3bc688bdcfd7789efbd17da4d85b104ad351 upstream.

In commit cad3f08c (iwlwifi: mvm: enable MAC_FILTER_IN_BEACON when
forced_assoc_off is set) the code to set the MAC_FILTER_IN_BEACON flag
was accidentally moved to the main block of the if statement, while it
should be in the else block instead.  Move it to the right place.

Fixes: cad3f08c23de ("iwlwifi: mvm: enable MAC_FILTER_IN_BEACON when forced_assoc_off is set")
Reviewed-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Luciano Coelho <luciano.coelho@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoiwlwifi: mvm: treat EAPOLs like mgmt frames wrt rate
Eyal Shapira [Tue, 2 Sep 2014 15:39:21 +0000 (18:39 +0300)]
iwlwifi: mvm: treat EAPOLs like mgmt frames wrt rate

commit aa11bbf3df026d6b1c6b528bef634fd9de7c2619 upstream.

Using the LQ table which is initially set according to
the rssi could lead to EAPOLs being sent in high legacy
rates like 54mbps.
It's better to avoid sending EAPOLs in high rates as it reduces
the chances of a successful 4-Way handshake.
Avoid this and treat them like other mgmt frames which would
initially get sent at the basic rate.

Signed-off-by: Eyal Shapira <eyalx.shapira@intel.com>
Reviewed-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoiwlwifi: increase DEFAULT_MAX_TX_POWER
Eliad Peller [Tue, 26 Aug 2014 08:23:11 +0000 (11:23 +0300)]
iwlwifi: increase DEFAULT_MAX_TX_POWER

commit 22d059a5c7c5de61e53c88e30b65e55fbfd91e91 upstream.

The chip is able to transmit up to 22dBm, so set
the constant appropriately.

Signed-off-by: Eliad Peller <eliadx.peller@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoiwlwifi: mvm: fix endianity issues with Smart Fifo commands
Emmanuel Grumbach [Thu, 31 Jul 2014 11:32:37 +0000 (14:32 +0300)]
iwlwifi: mvm: fix endianity issues with Smart Fifo commands

commit 86974bff066dd8b98be46d7c7d3aba89034f0833 upstream.

This code was broken on big endian systems. Sparse didn't
catch the bug since the firmware command was not tagged as
little endian.
Fix the bug for big endian systems and tag the field in the
firmware command to prevent such issues in the future.

Fixes: 1f3b0ff8ec ("iwlwifi: mvm: Add Smart FIFO support")
Reviewed-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoRevert "iwlwifi: dvm: don't enable CTS to self"
Emmanuel Grumbach [Sun, 31 Aug 2014 19:11:11 +0000 (22:11 +0300)]
Revert "iwlwifi: dvm: don't enable CTS to self"

commit f47f46d7b09cf1d09e4b44b6cc4dd7d68a08028c upstream.

This reverts commit 43d826ca5979927131685cc2092c7ce862cb91cd.

This commit caused packet loss.

Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoSCSI: libiscsi: fix potential buffer overrun in __iscsi_conn_send_pdu
Mike Christie [Wed, 3 Sep 2014 05:00:39 +0000 (00:00 -0500)]
SCSI: libiscsi: fix potential buffer overrun in __iscsi_conn_send_pdu

commit db9bfd64b14a3a8f1868d2164518fdeab1b26ad1 upstream.

This patches fixes a potential buffer overrun in __iscsi_conn_send_pdu.
This function is used by iscsi drivers and userspace to send iscsi PDUs/
commands. For login commands, we have a set buffer size. For all other
commands we do not support data buffers.

This was reported by Dan Carpenter here:
http://www.spinics.net/lists/linux-scsi/msg66838.html

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoNFC: microread: Potential overflows in microread_target_discovered()
Dan Carpenter [Mon, 1 Sep 2014 17:27:29 +0000 (20:27 +0300)]
NFC: microread: Potential overflows in microread_target_discovered()

commit d07f1e8600ccb885c8f4143402b8912f7d827bcb upstream.

Smatch says that skb->data is untrusted so we need to check to make sure
that the memcpy() doesn't overflow.

Fixes: cfad1ba87150 ('NFC: Initial support for Inside Secure microread')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoiscsi-target: Fix memory corruption in iscsit_logout_post_handler_diffcid
Nicholas Bellinger [Wed, 17 Sep 2014 18:45:17 +0000 (11:45 -0700)]
iscsi-target: Fix memory corruption in iscsit_logout_post_handler_diffcid

commit b53b0d99d6fbf7d44330395349a895521cfdbc96 upstream.

This patch fixes a bug in iscsit_logout_post_handler_diffcid() where
a pointer used as storage for list_for_each_entry() was incorrectly
being used to determine if no matching entry had been found.

This patch changes iscsit_logout_post_handler_diffcid() to key off
bool conn_found to determine if the function needs to exit early.

Reported-by: Joern Engel <joern@logfs.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoiscsi-target: avoid NULL pointer in iscsi_copy_param_list failure
Joern Engel [Tue, 2 Sep 2014 21:49:54 +0000 (17:49 -0400)]
iscsi-target: avoid NULL pointer in iscsi_copy_param_list failure

commit 8ae757d09c45102b347a1bc2867f54ffc1ab8fda upstream.

In iscsi_copy_param_list() a failed iscsi_param_list memory allocation
currently invokes iscsi_release_param_list() to cleanup, and will promptly
trigger a NULL pointer dereference.

Instead, go ahead and return for the first iscsi_copy_param_list()
failure case.

Found by coverity.

Signed-off-by: Joern Engel <joern@logfs.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agotarget: Fix inverted logic in SE_DEV_ALUA_SUPPORT_STATE_STORE
Sebastian Herbszt [Sun, 31 Aug 2014 22:17:53 +0000 (00:17 +0200)]
target: Fix inverted logic in SE_DEV_ALUA_SUPPORT_STATE_STORE

commit 1f0b030c45c781f9fe568e5e2a813d6c8567a051 upstream.

Fix inverted logic in SE_DEV_ALUA_SUPPORT_STATE_STORE for setting
the supported ALUA access states via configfs, originally introduced
in commit b0a382c5.

A value of 1 should enable the support, not disable it.

Signed-off-by: Sebastian Herbszt <herbszt@gmx.de>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoTarget/iser: Don't put isert_conn inside disconnected handler
Sagi Grimberg [Wed, 2 Jul 2014 13:19:25 +0000 (16:19 +0300)]
Target/iser: Don't put isert_conn inside disconnected handler

commit 0fc4ea701fcf5bc51ace4e288af5be741465f776 upstream.

disconnected_handler is invoked on several CM events (such
as DISCONNECTED, DEVICE_REMOVAL, TIMEWAIT_EXIT...). Since
multiple  events can occur while before isert_free_conn is
invoked, we might put all isert_conn references and free
the connection too early.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoTarget/iser: Get isert_conn reference once got to connected_handler
Sagi Grimberg [Wed, 2 Jul 2014 13:19:24 +0000 (16:19 +0300)]
Target/iser: Get isert_conn reference once got to connected_handler

commit c2f88b17a1d97ca4ecd96cc22333a7a4f1407d39 upstream.

In case the connection didn't reach connected state, disconnected
handler will never be invoked thus the second kref_put on
isert_conn will be missing.

Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoiio: adc: at91: don't use the last converted data register
Ludovic Desroches [Thu, 9 Oct 2014 14:02:00 +0000 (15:02 +0100)]
iio: adc: at91: don't use the last converted data register

commit d4f51956ac8ad302db9b0c4e4232775b1baa7b44 upstream.

If touchscreen mode is enabled and a conversion is requested on another
channel, the result in the last converted data register can be a
touchscreen relative value. Starting a conversion involves to do a
conversion for all active channel. It starts with ADC channels and ends
with touchscreen channels. Then if ADC_LCD register is not read quickly,
its content may be a touchscreen conversion.
To remove this temporal constraint, the conversion value is taken from
the channel data register.

Signed-off-by: Ludovic Desroches <ludovic.desroches@atmel.com>
Acked-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoiio: adc: xilinx-xadc: assign auxiliary channels address correctly
Subbaraya Sundeep Bhatta [Sun, 9 Nov 2014 09:55:00 +0000 (09:55 +0000)]
iio: adc: xilinx-xadc: assign auxiliary channels address correctly

commit 1887e724e2b6df06847522fe9dc2ab53639516cc upstream.

This patch fixes incorrect logic for assigning address
to auxiliary channels of xilinx xadc.

Signed-off-by: Subbaraya Sundeep Bhatta <sbhatta@xilinx.com>
Acked-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoiio:inkern: fix overwritten -EPROBE_DEFER in of_iio_channel_get_by_name
Johannes Pointner [Mon, 25 Aug 2014 08:04:00 +0000 (09:04 +0100)]
iio:inkern: fix overwritten -EPROBE_DEFER in of_iio_channel_get_by_name

commit 872687f626e033b4ddfaec1e410057cfc6636d77 upstream.

Fixes: a2c12493ed7e ('iio: of_iio_channel_get_by_name() returns non-null pointers for error legs')
which improperly assumes that of_iio_channel_get_by_name must always
return NULL and thus now hides -EPROBE_DEFER.

Signed-off-by: Johannes Pointner <johannes.pointner@br-automation.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoiio:magnetometer: bugfix magnetometers gain values
Denis CIOCCA [Thu, 9 Oct 2014 12:55:00 +0000 (13:55 +0100)]
iio:magnetometer: bugfix magnetometers gain values

commit a31d0928999fbf33b3a6042e8bcb7b7f7e07d094 upstream.

This patch fix gains values. The first driver was designed using
engineering samples, in mass production the values are changed.

Signed-off-by: Denis Ciocca <denis.ciocca@st.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoiio: adc: ad_sigma_delta: Fix indio_dev->trig assignment
Srinivas Pandruvada [Fri, 22 Aug 2014 20:48:00 +0000 (21:48 +0100)]
iio: adc: ad_sigma_delta: Fix indio_dev->trig assignment

commit 9e5846be33277802c0c76e5c12825d0e4d27f639 upstream.

This can result in wrong reference count for trigger device, call
iio_trigger_get to increment reference.
Refer to http://www.spinics.net/lists/linux-iio/msg13669.html for discussion
with Jonathan.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Acked-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoiio: st_sensors: Fix indio_dev->trig assignment
Srinivas Pandruvada [Fri, 22 Aug 2014 20:48:00 +0000 (21:48 +0100)]
iio: st_sensors: Fix indio_dev->trig assignment

commit f0e84acd7056e6d7ade551c6439531606ae30a46 upstream.

This can result in wrong reference count for trigger device, call
iio_trigger_get to increment reference.
Refer to http://www.spinics.net/lists/linux-iio/msg13669.html for discussion
with Jonathan.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoiio: meter: ade7758: Fix indio_dev->trig assignment
Srinivas Pandruvada [Fri, 22 Aug 2014 20:48:00 +0000 (21:48 +0100)]
iio: meter: ade7758: Fix indio_dev->trig assignment

commit 0495081179212b758775df752e657ea71dcae020 upstream.

This can result in wrong reference count for trigger device, call
iio_trigger_get to increment reference.
Refer to http://www.spinics.net/lists/linux-iio/msg13669.html for discussion
with Jonathan.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Acked-by: Lars-Peter Clausen <lars@metafoo.de>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoiio: inv_mpu6050: Fix indio_dev->trig assignment
Srinivas Pandruvada [Fri, 22 Aug 2014 20:48:00 +0000 (21:48 +0100)]
iio: inv_mpu6050: Fix indio_dev->trig assignment

commit b07e3b3850b2e1f09c19f54d3ed7210d9f529e2c upstream.

This can result in wrong reference count for trigger device, call
iio_trigger_get to increment reference.
Refer to http://www.spinics.net/lists/linux-iio/msg13669.html for discussion
with Jonathan.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoiio: gyro: itg3200: Fix indio_dev->trig assignment
Srinivas Pandruvada [Fri, 22 Aug 2014 20:48:00 +0000 (21:48 +0100)]
iio: gyro: itg3200: Fix indio_dev->trig assignment

commit 0b4dce2ee694a991ef38203ec5ff91a738518cb3 upstream.

This can result in wrong reference count for trigger device, call
iio_trigger_get to increment reference.
Refer to http://www.spinics.net/lists/linux-iio/msg13669.html for discussion
with Jonathan.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoiio: hid_sensor_hub: Fix indio_dev->trig assignment
Srinivas Pandruvada [Fri, 22 Aug 2014 20:48:00 +0000 (21:48 +0100)]
iio: hid_sensor_hub: Fix indio_dev->trig assignment

commit 55a6f9ddfdea0d2d343cd1b39baf8aa752664b6e upstream.

This can result in wrong reference count for trigger device, call
iio_trigger_get to increment reference.
Refer to http://www.spinics.net/lists/linux-iio/msg13669.html for discussion
with Jonathan.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoiio: accel: bma180: Fix indio_dev->trig assignment
Srinivas Pandruvada [Fri, 22 Aug 2014 20:48:00 +0000 (21:48 +0100)]
iio: accel: bma180: Fix indio_dev->trig assignment

commit 0668a4e4d297328ce08b44d91d160537596115e2 upstream.

This can result in wrong reference count for trigger device, call
iio_trigger_get to increment reference.
Refer to http://www.spinics.net/lists/linux-iio/msg13669.html for discussion
with Jonathan.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoiio:trigger: modify return value for iio_trigger_get
Srinivas Pandruvada [Fri, 22 Aug 2014 20:48:00 +0000 (21:48 +0100)]
iio:trigger: modify return value for iio_trigger_get

commit f153566570fb9e32c2f59182883f4f66048788fb upstream.

Instead of a void function, return the trigger pointer.

Whilst not in of itself a fix, this makes the following set of
7 fixes cleaner than they would otherwise be.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoSMB3: Fix oops when creating symlinks on smb3
Steve French [Mon, 15 Sep 2014 04:27:09 +0000 (23:27 -0500)]
SMB3: Fix oops when creating symlinks on smb3

commit da80659d4aa758dc6935b10ec64513f0b67bc969 upstream.

We were not checking for symlink support properly for SMB2/SMB3
mounts so could oops when mounted with mfsymlinks when try
to create symlink when mfsymlinks on smb2/smb3 mounts

Signed-off-by: Steve French <smfrench@gmail.com>
CC: Sachin Prabhu <sprabhu@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoASoC: davinci-mcasp: Correct rx format unit configuration
Peter Ujfalusi [Thu, 4 Sep 2014 07:52:53 +0000 (10:52 +0300)]
ASoC: davinci-mcasp: Correct rx format unit configuration

commit fe0a29e163a5d045c73faab682a8dac71c2f8012 upstream.

In case of capture we should not use rotation. The reverse and mask is
enough to get the data align correctly from the bus to MCU:
Format   data from bus    after reverse (XRBUF)
S16_LE:  |LSB|MSB|xxx|xxx|  |xxx|xxx|MSB|LSB|
S24_3LE: |LSB|DAT|MSB|xxx|  |xxx|MSB|DAT|LSB|
S24_LE:  |LSB|DAT|MSB|xxx|  |xxx|MSB|DAT|LSB|
S32_LE:  |LSB|DAT|DAT|MSB|  |MSB|DAT|DAT|LSB|

With this patch all supported formats will work for playback and capture.

Reported-by: Jyri Sarha <jsarha@ti.com> (broken S24_3LE capture)
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoshmem: fix nlink for rename overwrite directory
Miklos Szeredi [Wed, 24 Sep 2014 15:56:17 +0000 (17:56 +0200)]
shmem: fix nlink for rename overwrite directory

commit b928095b0a7cff7fb9fcf4c706348ceb8ab2c295 upstream.

If overwriting an empty directory with rename, then need to drop the extra
nlink.

Test prog:

#include <stdio.h>
#include <fcntl.h>
#include <err.h>
#include <sys/stat.h>

int main(void)
{
const char *test_dir1 = "test-dir1";
const char *test_dir2 = "test-dir2";
int res;
int fd;
struct stat statbuf;

res = mkdir(test_dir1, 0777);
if (res == -1)
err(1, "mkdir(\"%s\")", test_dir1);

res = mkdir(test_dir2, 0777);
if (res == -1)
err(1, "mkdir(\"%s\")", test_dir2);

fd = open(test_dir2, O_RDONLY);
if (fd == -1)
err(1, "open(\"%s\")", test_dir2);

res = rename(test_dir1, test_dir2);
if (res == -1)
err(1, "rename(\"%s\", \"%s\")", test_dir1, test_dir2);

res = fstat(fd, &statbuf);
if (res == -1)
err(1, "fstat(%i)", fd);

if (statbuf.st_nlink != 0) {
fprintf(stderr, "nlink is %lu, should be 0\n", statbuf.st_nlink);
return 1;
}

return 0;
}

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agox86/kaslr: Avoid the setup_data area when picking location
Kees Cook [Thu, 11 Sep 2014 16:19:31 +0000 (09:19 -0700)]
x86/kaslr: Avoid the setup_data area when picking location

commit 0cacbfbeb5077b63d5d3cf6df88b14ac12ad584b upstream.

The KASLR location-choosing logic needs to avoid the setup_data
list memory areas as well. Without this, it would be possible to
have the ASLR position stomp on the memory, ultimately causing
the boot to fail.

Signed-off-by: Kees Cook <keescook@chromium.org>
Tested-by: Baoquan He <bhe@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20140911161931.GA12001@www.outflux.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agox86 early_ioremap: Increase FIX_BTMAPS_SLOTS to 8
Dave Young [Tue, 26 Aug 2014 09:06:41 +0000 (17:06 +0800)]
x86 early_ioremap: Increase FIX_BTMAPS_SLOTS to 8

commit 3eddc69ffeba092d288c386646bfa5ec0fce25fd upstream.

3.16 kernel boot fail with earlyprintk=efi, it keeps scrolling at the
bottom line of screen.

Bisected, the first bad commit is below:
commit 86dfc6f339886559d80ee0d4bd20fe5ee90450f0
Author: Lv Zheng <lv.zheng@intel.com>
Date:   Fri Apr 4 12:38:57 2014 +0800

    ACPICA: Tables: Fix table checksums verification before installation.

I did some debugging by enabling both serial and efi earlyprintk, below is
some debug dmesg, seems early_ioremap fails in scroll up function due to
no free slot, see below dmesg output:

  WARNING: CPU: 0 PID: 0 at mm/early_ioremap.c:116 __early_ioremap+0x90/0x1c4()
  __early_ioremap(ed00c80000000c80) not found slot
  Modules linked in:
  CPU: 0 PID: 0 Comm: swapper Not tainted 3.17.0-rc1+ #204
  Hardware name: Hewlett-Packard HP Z420 Workstation/1589, BIOS J61 v03.15 05/09/2013
  Call Trace:
    dump_stack+0x4e/0x7a
    warn_slowpath_common+0x75/0x8e
    ? __early_ioremap+0x90/0x1c4
    warn_slowpath_fmt+0x47/0x49
    __early_ioremap+0x90/0x1c4
    ? sprintf+0x46/0x48
    early_ioremap+0x13/0x15
    early_efi_map+0x24/0x26
    early_efi_scroll_up+0x6d/0xc0
    early_efi_write+0x1b0/0x214
    call_console_drivers.constprop.21+0x73/0x7e
    console_unlock+0x151/0x3b2
    ? vprintk_emit+0x49f/0x532
    vprintk_emit+0x521/0x532
    ? console_unlock+0x383/0x3b2
    printk+0x4f/0x51
    acpi_os_vprintf+0x2b/0x2d
    acpi_os_printf+0x43/0x45
    acpi_info+0x5c/0x63
    ? __acpi_map_table+0x13/0x18
    ? acpi_os_map_iomem+0x21/0x147
    acpi_tb_print_table_header+0x177/0x186
    acpi_tb_install_table_with_override+0x4b/0x62
    acpi_tb_install_standard_table+0xd9/0x215
    ? early_ioremap+0x13/0x15
    ? __acpi_map_table+0x13/0x18
    acpi_tb_parse_root_table+0x16e/0x1b4
    acpi_initialize_tables+0x57/0x59
    acpi_table_init+0x50/0xce
    acpi_boot_table_init+0x1e/0x85
    setup_arch+0x9b7/0xcc4
    start_kernel+0x94/0x42d
    ? early_idt_handlers+0x120/0x120
    x86_64_start_reservations+0x2a/0x2c
    x86_64_start_kernel+0xf3/0x100

Quote reply from Lv.zheng about the early ioremap slot usage in this case:

"""
In early_efi_scroll_up(), 2 mapping entries will be used for the src/dst screen buffer.
In drivers/acpi/acpica/tbutils.c, we've improved the early table loading code in acpi_tb_parse_root_table().
We now need 2 mapping entries:
1. One mapping entry is used for RSDT table mapping. Each RSDT entry contains an address for another ACPI table.
2. For each entry in RSDP, we need another mapping entry to map the table to perform necessary check/override before installing it.

When acpi_tb_parse_root_table() prints something through EFI earlyprintk console, we'll have 4 mapping entries used.
The current 4 slots setting of early_ioremap() seems to be too small for such a use case.
"""

Thus increase the slot to 8 in this patch to fix this issue.
boot-time mappings become 512 page with this patch.

Signed-off-by: Dave Young <dyoung@redhat.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agox86/xen: don't copy bogus duplicate entries into kernel page tables
Stefan Bader [Tue, 2 Sep 2014 10:16:01 +0000 (11:16 +0100)]
x86/xen: don't copy bogus duplicate entries into kernel page tables

commit 0b5a50635fc916cf46e3de0b819a61fc3f17e7ee upstream.

When RANDOMIZE_BASE (KASLR) is enabled; or the sum of all loaded
modules exceeds 512 MiB, then loading modules fails with a warning
(and hence a vmalloc allocation failure) because the PTEs for the
newly-allocated vmalloc address space are not zero.

  WARNING: CPU: 0 PID: 494 at linux/mm/vmalloc.c:128
           vmap_page_range_noflush+0x2a1/0x360()

This is caused by xen_setup_kernel_pagetables() copying
level2_kernel_pgt into level2_fixmap_pgt, overwriting many non-present
entries.

Without KASLR, the normal kernel image size only covers the first half
of level2_kernel_pgt and module space starts after that.

L4[511]->level3_kernel_pgt[510]->level2_kernel_pgt[  0..255]->kernel
                                                  [256..511]->module
                          [511]->level2_fixmap_pgt[  0..505]->module

This allows 512 MiB of of module vmalloc space to be used before
having to use the corrupted level2_fixmap_pgt entries.

With KASLR enabled, the kernel image uses the full PUD range of 1G and
module space starts in the level2_fixmap_pgt. So basically:

L4[511]->level3_kernel_pgt[510]->level2_kernel_pgt[0..511]->kernel
                          [511]->level2_fixmap_pgt[0..505]->module

And now no module vmalloc space can be used without using the corrupt
level2_fixmap_pgt entries.

Fix this by properly converting the level2_fixmap_pgt entries to MFNs,
and setting level1_fixmap_pgt as read-only.

A number of comments were also using the the wrong L3 offset for
level2_kernel_pgt.  These have been corrected.

Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoxen/manage: Always freeze/thaw processes when suspend/resuming
Ross Lagerwall [Mon, 18 Aug 2014 09:41:36 +0000 (10:41 +0100)]
xen/manage: Always freeze/thaw processes when suspend/resuming

commit 61a734d305e16944b42730ef582a7171dc733321 upstream.

Always freeze processes when suspending and thaw processes when resuming
to prevent a race noticeable with HVM guests.

This prevents a deadlock where the khubd kthread (which is designed to
be freezable) acquires a usb device lock and then tries to allocate
memory which requires the disk which hasn't been resumed yet.
Meanwhile, the xenwatch thread deadlocks waiting for the usb device
lock.

Freezing processes fixes this because the khubd thread is only thawed
after the xenwatch thread finishes resuming all the devices.

Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoKVM: s390/mm: Fix guest storage key corruption in ptep_set_access_flags
Christian Borntraeger [Thu, 28 Aug 2014 21:44:57 +0000 (23:44 +0200)]
KVM: s390/mm: Fix guest storage key corruption in ptep_set_access_flags

commit 1951497d90d6754201af3e65241a06f9ef6755cd upstream.

commit 0944fe3f4a32 ("s390/mm: implement software referenced bits")
triggered another paging/storage key corruption. There is an
unhandled invalid->valid pte change where we have to set the real
storage key from the pgste.
When doing paging a guest page might be swapcache or swap and when
faulted in it might be read-only and due to a parallel scan old.
An do_wp_page will make it writeable and young. Due to software
reference tracking this page was invalid and now becomes valid.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>