Vlad Yasevich [Mon, 16 Nov 2015 20:43:44 +0000 (15:43 -0500)]
vlan: Fix untag operations of stacked vlans with REORDER_HEADER off
[ Upstream commit
a6e18ff111701b4ff6947605bfbe9594ec42a6e8 ]
When we have multiple stacked vlan devices all of which have
turned off REORDER_HEADER flag, the untag operation does not
locate the ethernet addresses correctly for nested vlans.
The reason is that in case of REORDER_HEADER flag being off,
the outer vlan headers are put back and the mac_len is adjusted
to account for the presense of the header. Then, the subsequent
untag operation, for the next level vlan, always use VLAN_ETH_HLEN
to locate the begining of the ethernet header and that ends up
being a multiple of 4 bytes short of the actuall beginning
of the mac header (the multiple depending on the how many vlan
encapsulations ethere are).
As a reslult, if there are multiple levles of vlan devices
with REODER_HEADER being off, the recevied packets end up
being dropped.
To solve this, we use skb->mac_len as the offset. The value
is always set on receive path and starts out as a ETH_HLEN.
The value is also updated when the vlan header manupations occur
so we know it will be correct.
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Sergei Shtylyov [Thu, 3 Dec 2015 22:45:40 +0000 (01:45 +0300)]
sh_eth: fix kernel oops in skb_put()
[ Upstream commit
248be83dcb3feb3f6332eb3d010a016402138484 ]
In a low memory situation the following kernel oops occurs:
Unable to handle kernel NULL pointer dereference at virtual address
00000050
pgd =
8490c000
[
00000050] *pgd=
4651e831, *pte=
00000000, *ppte=
00000000
Internal error: Oops: 17 [#1] PREEMPT ARM
Modules linked in:
CPU: 0 Not tainted (3.4-at16 #9)
PC is at skb_put+0x10/0x98
LR is at sh_eth_poll+0x2c8/0xa10
pc : [<
8035f780>] lr : [<
8028bf50>] psr:
60000113
sp :
84eb1a90 ip :
84eb1ac8 fp :
84eb1ac4
r10:
0000003f r9 :
000005ea r8 :
00000000
r7 :
00000000 r6 :
940453b0 r5 :
00030000 r4 :
9381b180
r3 :
00000000 r2 :
00000000 r1 :
000005ea r0 :
00000000
Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user
Control:
10c53c7d Table:
4248c059 DAC:
00000015
Process klogd (pid: 2046, stack limit = 0x84eb02e8)
[...]
This is because netdev_alloc_skb() fails and 'mdp->rx_skbuff[entry]' is left
NULL but sh_eth_rx() later uses it without checking. Add such check...
Reported-by: Yasushi SHOJI <yashi@atmark-techno.com>
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Hannes Frederic Sowa [Mon, 14 Dec 2015 21:03:39 +0000 (22:03 +0100)]
net: add validation for the socket syscall protocol argument
[ Upstream commit
79462ad02e861803b3840cc782248c7359451cd9 ]
郭永刚 reported that one could simply crash the kernel as root by
using a simple program:
int socket_fd;
struct sockaddr_in addr;
addr.sin_port = 0;
addr.sin_addr.s_addr = INADDR_ANY;
addr.sin_family = 10;
socket_fd = socket(10,3,0x40000000);
connect(socket_fd , &addr,16);
AF_INET, AF_INET6 sockets actually only support 8-bit protocol
identifiers. inet_sock's skc_protocol field thus is sized accordingly,
thus larger protocol identifiers simply cut off the higher bits and
store a zero in the protocol fields.
This could lead to e.g. NULL function pointer because as a result of
the cut off inet_num is zero and we call down to inet_autobind, which
is NULL for raw sockets.
kernel: Call Trace:
kernel: [<
ffffffff816db90e>] ? inet_autobind+0x2e/0x70
kernel: [<
ffffffff816db9a4>] inet_dgram_connect+0x54/0x80
kernel: [<
ffffffff81645069>] SYSC_connect+0xd9/0x110
kernel: [<
ffffffff810ac51b>] ? ptrace_notify+0x5b/0x80
kernel: [<
ffffffff810236d8>] ? syscall_trace_enter_phase2+0x108/0x200
kernel: [<
ffffffff81645e0e>] SyS_connect+0xe/0x10
kernel: [<
ffffffff81779515>] tracesys_phase2+0x84/0x89
I found no particular commit which introduced this problem.
CVE: CVE-2015-8543
Cc: Cong Wang <cwang@twopensource.com>
Reported-by: 郭永刚 <guoyonggang@360.cn>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Eric Dumazet [Wed, 9 Dec 2015 15:25:06 +0000 (07:25 -0800)]
ipv6: sctp: clone options to avoid use after free
[ Upstream commit
9470e24f35ab81574da54e69df90c1eb4a96b43f ]
SCTP is lacking proper np->opt cloning at accept() time.
TCP and DCCP use ipv6_dup_options() helper, do the same
in SCTP.
We might later factorize this code in a common helper to avoid
future mistakes.
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Marcelo Ricardo Leitner [Fri, 4 Dec 2015 17:14:04 +0000 (15:14 -0200)]
sctp: update the netstamp_needed counter when copying sockets
[ Upstream commit
01ce63c90170283a9855d1db4fe81934dddce648 ]
Dmitry Vyukov reported that SCTP was triggering a WARN on socket destroy
related to disabling sock timestamp.
When SCTP accepts an association or peel one off, it copies sock flags
but forgot to call net_enable_timestamp() if a packet timestamping flag
was copied, leading to extra calls to net_disable_timestamp() whenever
such clones were closed.
The fix is to call net_enable_timestamp() whenever we copy a sock with
that flag on, like tcp does.
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Marcelo Ricardo Leitner [Fri, 4 Dec 2015 17:14:03 +0000 (15:14 -0200)]
sctp: use the same clock as if sock source timestamps were on
[ Upstream commit
cb5e173ed7c03a0d4630ce68a95a186cce3cc872 ]
SCTP echoes a cookie o INIT ACK chunks that contains a timestamp, for
detecting stale cookies. This cookie is echoed back to the server by the
client and then that timestamp is checked.
Thing is, if the listening socket is using packet timestamping, the
cookie is encoded with ktime_get() value and checked against
ktime_get_real(), as done by __net_timestamp().
The fix is to sctp also use ktime_get_real(), so we can compare bananas
with bananas later no matter if packet timestamping was enabled or not.
Fixes: 52db882f3fc2 ("net: sctp: migrate cookie life from timeval to ktime")
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pavel Machek [Fri, 4 Dec 2015 08:50:00 +0000 (09:50 +0100)]
atl1c: Improve driver not to do order 4 GFP_ATOMIC allocation
[ Upstream commit
f2a3771ae8aca879c32336c76ad05a017629bae2 ]
atl1c driver is doing order-4 allocation with GFP_ATOMIC
priority. That often breaks networking after resume. Switch to
GFP_KERNEL. Still not ideal, but should be significantly better.
atl1c_setup_ring_resources() is called from .open() function, and
already uses GFP_KERNEL, so this change is safe.
Signed-off-by: Pavel Machek <pavel@ucw.cz>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Nicolas Dichtel [Thu, 3 Dec 2015 16:21:50 +0000 (17:21 +0100)]
gre6: allow to update all parameters via rtnl
[ Upstream commit
6a61d4dbf4f54b5683e0f1e58d873cecca7cb977 ]
Parameters were updated only if the kernel was unable to find the tunnel
with the new parameters, ie only if core pamareters were updated (keys,
addr, link, type).
Now it's possible to update ttl, hoplimit, flowinfo and flags.
Fixes: c12b395a4664 ("gre: Support GRE over IPv6")
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Ben Hutchings [Wed, 18 Nov 2015 02:01:21 +0000 (02:01 +0000)]
usb: Use the USB_SS_MULT() macro to decode burst multiplier for log message
commit
5377adb092664d336ac212499961cac5e8728794 upstream.
usb_parse_ss_endpoint_companion() now decodes the burst multiplier
correctly in order to check that it's <= 3, but still uses the wrong
expression if warning that it's > 3.
Fixes: ff30cbc8da42 ("usb: Use the USB_SS_MULT() macro to get the ...")
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Alexey Khoroshilov [Fri, 20 Nov 2015 21:36:44 +0000 (00:36 +0300)]
USB: whci-hcd: add check for dma mapping error
commit
f9fa1887dcf26bd346665a6ae3d3f53dec54cba1 upstream.
qset_fill_page_list() do not check for dma mapping errors.
Found by Linux Driver Verification project (linuxtesting.org).
Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Alan Stern [Thu, 10 Dec 2015 20:27:21 +0000 (15:27 -0500)]
USB: add quirk for devices with broken LPM
commit
ad87e03213b552a5c33d5e1e7a19a73768397010 upstream.
Some USB device / host controller combinations seem to have problems
with Link Power Management. For example, Steinar found that his xHCI
controller wouldn't handle bandwidth calculations correctly for two
video cards simultaneously when LPM was enabled, even though the bus
had plenty of bandwidth available.
This patch introduces a new quirk flag for devices that should remain
disabled for LPM, and creates quirk entries for Steinar's devices.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Reported-by: Steinar H. Gunderson <sgunderson@bigfoot.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Konstantin Shkolnyy [Tue, 10 Nov 2015 22:40:13 +0000 (16:40 -0600)]
USB: cp210x: Remove CP2110 ID from compatibility list
commit
7c90e610b60cd1ed6abafd806acfaedccbbe52d1 upstream.
CP2110 ID (0x10c4, 0xea80) doesn't belong here because it's a HID
and completely different from CP210x devices.
Signed-off-by: Konstantin Shkolnyy <konstantin.shkolnyy@gmail.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jonas Jonsson [Sun, 22 Nov 2015 10:47:18 +0000 (11:47 +0100)]
USB: serial: Another Infineon flash loader USB ID
commit
a0e80fbd56b4573de997c9a088a33abbc1121400 upstream.
The flash loader has been seen on a Telit UE910 modem. The flash loader
is a bit special, it presents both an ACM and CDC Data interface but
only the latter is useful. Unless a magic string is sent to the device
it will disappear and the regular modem device appears instead.
Signed-off-by: Jonas Jonsson <jonas@ludd.ltu.se>
Tested-by: Daniele Palmas <dnlplm@gmail.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jonas Jonsson [Sun, 22 Nov 2015 10:47:17 +0000 (11:47 +0100)]
USB: cdc_acm: Ignore Infineon Flash Loader utility
commit
f33a7f72e5fc033daccbb8d4753d7c5c41a4d67b upstream.
Some modems, such as the Telit UE910, are using an Infineon Flash Loader
utility. It has two interfaces, 2/2/0 (Abstract Modem) and 10/0/0 (CDC
Data). The latter can be used as a serial interface to upgrade the
firmware of the modem. However, that isn't possible when the cdc-acm
driver takes control of the device.
The following is an explanation of the behaviour by Daniele Palmas during
discussion on linux-usb.
"This is what happens when the device is turned on (without modifying
the drivers):
[155492.352031] usb 1-3: new high-speed USB device number 27 using ehci-pci
[155492.485429] usb 1-3: config 1 interface 0 altsetting 0 endpoint 0x81 has an invalid bInterval 255, changing to 11
[155492.485436] usb 1-3: New USB device found, idVendor=058b, idProduct=0041
[155492.485439] usb 1-3: New USB device strings: Mfr=0, Product=0, SerialNumber=0
[155492.485952] cdc_acm 1-3:1.0: ttyACM0: USB ACM device
This is the flashing device that is caught by the cdc-acm driver. Once
the ttyACM appears, the application starts sending a magic string
(simple write on the file descriptor) to keep the device in flashing
mode. If this magic string is not properly received in a certain time
interval, the modem goes on in normal operative mode:
[155493.748094] usb 1-3: USB disconnect, device number 27
[155494.916025] usb 1-3: new high-speed USB device number 28 using ehci-pci
[155495.059978] usb 1-3: New USB device found, idVendor=1bc7, idProduct=0021
[155495.059983] usb 1-3: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[155495.059986] usb 1-3: Product: 6 CDC-ACM + 1 CDC-ECM
[155495.059989] usb 1-3: Manufacturer: Telit
[155495.059992] usb 1-3: SerialNumber:
359658044004697
[155495.138958] cdc_acm 1-3:1.0: ttyACM0: USB ACM device
[155495.140832] cdc_acm 1-3:1.2: ttyACM1: USB ACM device
[155495.142827] cdc_acm 1-3:1.4: ttyACM2: USB ACM device
[155495.144462] cdc_acm 1-3:1.6: ttyACM3: USB ACM device
[155495.145967] cdc_acm 1-3:1.8: ttyACM4: USB ACM device
[155495.147588] cdc_acm 1-3:1.10: ttyACM5: USB ACM device
[155495.154322] cdc_ether 1-3:1.12 wwan0: register 'cdc_ether' at usb-0000:00:1a.7-3, Mobile Broadband Network Device, 00:00:11:12:13:14
Using the cdc-acm driver, the string, though being sent in the same way
than using the usb-serial-simple driver (I can confirm that the data is
passing properly since I used an hw usb sniffer), does not make the
device to stay in flashing mode."
Signed-off-by: Jonas Jonsson <jonas@ludd.ltu.se>
Tested-by: Daniele Palmas <dnlplm@gmail.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Junxiao Bi [Fri, 20 Nov 2015 23:57:30 +0000 (15:57 -0800)]
ocfs2: fix umask ignored issue
commit
8f1eb48758aacf6c1ffce18179295adbf3bd7640 upstream.
New created file's mode is not masked with umask, and this makes umask not
work for ocfs2 volume.
Fixes: 702e5bc ("ocfs2: use generic posix ACL infrastructure")
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Gang He <ghe@suse.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jeff Layton [Wed, 25 Nov 2015 18:50:11 +0000 (13:50 -0500)]
nfs: if we have no valid attrs, then don't declare the attribute cache valid
commit
c812012f9ca7cf89c9e1a1cd512e6c3b5be04b85 upstream.
If we pass in an empty nfs_fattr struct to nfs_update_inode, it will
(correctly) not update any of the attributes, but it then clears the
NFS_INO_INVALID_ATTR flag, which indicates that the attributes are
up to date. Don't clear the flag if the fattr struct has no valid
attrs to apply.
Reviewed-by: Steve French <steve.french@primarydata.com>
Signed-off-by: Jeff Layton <jeff.layton@primarydata.com>
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Benjamin Coddington [Fri, 20 Nov 2015 14:56:20 +0000 (09:56 -0500)]
nfs4: start callback_ident at idr 1
commit
c68a027c05709330fe5b2f50c50d5fa02124b5d8 upstream.
If clp->cl_cb_ident is zero, then nfs_cb_idr_remove_locked() skips removing
it when the nfs_client is freed. A decoding or server bug can then find
and try to put that first nfs_client which would lead to a crash.
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Fixes: d6870312659d ("nfs4client: convert to idr_alloc()")
Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Stefan Richter [Tue, 3 Nov 2015 00:46:21 +0000 (01:46 +0100)]
firewire: ohci: fix JMicron JMB38x IT context discovery
commit
100ceb66d5c40cc0c7018e06a9474302470be73c upstream.
Reported by Clifford and Craig for JMicron OHCI-1394 + SDHCI combo
controllers: Often or even most of the time, the controller is
initialized with the message "added OHCI v1.10 device as card 0, 4 IR +
0 IT contexts, quirks 0x10". With 0 isochronous transmit DMA contexts
(IT contexts), applications like audio output are impossible.
However, OHCI-1394 demands that at least 4 IT contexts are implemented
by the link layer controller, and indeed JMicron JMB38x do implement
four of them. Only their IsoXmitIntMask register is unreliable at early
access.
With my own JMB381 single function controller I found:
- I can reproduce the problem with a lower probability than Craig's.
- If I put a loop around the section which clears and reads
IsoXmitIntMask, then either the first or the second attempt will
return the correct initial mask of 0x0000000f. I never encountered
a case of needing more than a second attempt.
- Consequently, if I put a dummy reg_read(...IsoXmitIntMaskSet)
before the first write, the subsequent read will return the correct
result.
- If I merely ignore a wrong read result and force the known real
result, later isochronous transmit DMA usage works just fine.
So let's just fix this chip bug up by the latter method. Tested with
JMB381 on kernel 3.13 and 4.3.
Since OHCI-1394 generally requires 4 IT contexts at a minium, this
workaround is simply applied whenever the initial read of IsoXmitIntMask
returns 0, regardless whether it's a JMicron chip or not. I never heard
of this issue together with any other chip though.
I am not 100% sure that this fix works on the OHCI-1394 part of JMB380
and JMB388 combo controllers exactly the same as on the JMB381 single-
function controller, but so far I haven't had a chance to let an owner
of a combo chip run a patched kernel.
Strangely enough, IsoRecvIntMask is always reported correctly, even
though it is probed right before IsoXmitIntMask.
Reported-by: Clifford Dunn
Reported-by: Craig Moore <craig.moore@qenos.com>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Daeho Jeong [Sun, 18 Oct 2015 21:02:56 +0000 (17:02 -0400)]
ext4, jbd2: ensure entering into panic after recording an error in superblock
commit
4327ba52afd03fc4b5afa0ee1d774c9c5b0e85c5 upstream.
If a EXT4 filesystem utilizes JBD2 journaling and an error occurs, the
journaling will be aborted first and the error number will be recorded
into JBD2 superblock and, finally, the system will enter into the
panic state in "errors=panic" option. But, in the rare case, this
sequence is little twisted like the below figure and it will happen
that the system enters into panic state, which means the system reset
in mobile environment, before completion of recording an error in the
journal superblock. In this case, e2fsck cannot recognize that the
filesystem failure occurred in the previous run and the corruption
wouldn't be fixed.
Task A Task B
ext4_handle_error()
-> jbd2_journal_abort()
-> __journal_abort_soft()
-> __jbd2_journal_abort_hard()
| -> journal->j_flags |= JBD2_ABORT;
|
| __ext4_abort()
| -> jbd2_journal_abort()
| | -> __journal_abort_soft()
| | -> if (journal->j_flags & JBD2_ABORT)
| | return;
| -> panic()
|
-> jbd2_journal_update_sb_errno()
Tested-by: Hobin Woo <hobin.woo@samsung.com>
Signed-off-by: Daeho Jeong <daeho.jeong@samsung.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Lukas Czerner [Sun, 18 Oct 2015 02:57:06 +0000 (22:57 -0400)]
ext4: fix potential use after free in __ext4_journal_stop
commit
6934da9238da947628be83635e365df41064b09b upstream.
There is a use-after-free possibility in __ext4_journal_stop() in the
case that we free the handle in the first jbd2_journal_stop() because
we're referencing handle->h_err afterwards. This was introduced in
9705acd63b125dee8b15c705216d7186daea4625 and it is wrong. Fix it by
storing the handle->h_err value beforehand and avoid referencing
potentially freed handle.
Fixes: 9705acd63b125dee8b15c705216d7186daea4625
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Filipe Manana [Mon, 9 Nov 2015 00:33:58 +0000 (00:33 +0000)]
Btrfs: fix race leading to BUG_ON when running delalloc for nodatacow
commit
1d512cb77bdbda80f0dd0620a3b260d697fd581d upstream.
If we are using the NO_HOLES feature, we have a tiny time window when
running delalloc for a nodatacow inode where we can race with a concurrent
link or xattr add operation leading to a BUG_ON.
This happens because at run_delalloc_nocow() we end up casting a leaf item
of type BTRFS_INODE_[REF|EXTREF]_KEY or of type BTRFS_XATTR_ITEM_KEY to a
file extent item (struct btrfs_file_extent_item) and then analyse its
extent type field, which won't match any of the expected extent types
(values BTRFS_FILE_EXTENT_[REG|PREALLOC|INLINE]) and therefore trigger an
explicit BUG_ON(1).
The following sequence diagram shows how the race happens when running a
no-cow dellaloc range [4K, 8K[ for inode 257 and we have the following
neighbour leafs:
Leaf X (has N items) Leaf Y
[ ... (257 INODE_ITEM 0) (257 INODE_REF 256) ] [ (257 EXTENT_DATA 8192), ... ]
slot N - 2 slot N - 1 slot 0
(Note the implicit hole for inode 257 regarding the [0, 8K[ range)
CPU 1 CPU 2
run_dealloc_nocow()
btrfs_lookup_file_extent()
--> searches for a key with value
(257 EXTENT_DATA 4096) in the
fs/subvol tree
--> returns us a path with
path->nodes[0] == leaf X and
path->slots[0] == N
because path->slots[0] is >=
btrfs_header_nritems(leaf X), it
calls btrfs_next_leaf()
btrfs_next_leaf()
--> releases the path
hard link added to our inode,
with key (257 INODE_REF 500)
added to the end of leaf X,
so leaf X now has N + 1 keys
--> searches for the key
(257 INODE_REF 256), because
it was the last key in leaf X
before it released the path,
with path->keep_locks set to 1
--> ends up at leaf X again and
it verifies that the key
(257 INODE_REF 256) is no longer
the last key in the leaf, so it
returns with path->nodes[0] ==
leaf X and path->slots[0] == N,
pointing to the new item with
key (257 INODE_REF 500)
the loop iteration of run_dealloc_nocow()
does not break out the loop and continues
because the key referenced in the path
at path->nodes[0] and path->slots[0] is
for inode 257, its type is < BTRFS_EXTENT_DATA_KEY
and its offset (500) is less then our delalloc
range's end (8192)
the item pointed by the path, an inode reference item,
is (incorrectly) interpreted as a file extent item and
we get an invalid extent type, leading to the BUG_ON(1):
if (extent_type == BTRFS_FILE_EXTENT_REG ||
extent_type == BTRFS_FILE_EXTENT_PREALLOC) {
(...)
} else if (extent_type == BTRFS_FILE_EXTENT_INLINE) {
(...)
} else {
BUG_ON(1)
}
The same can happen if a xattr is added concurrently and ends up having
a key with an offset smaller then the delalloc's range end.
So fix this by skipping keys with a type smaller than
BTRFS_EXTENT_DATA_KEY.
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Filipe Manana [Fri, 6 Nov 2015 13:33:33 +0000 (13:33 +0000)]
Btrfs: fix race leading to incorrect item deletion when dropping extents
commit
aeafbf8486c9e2bd53f5cc3c10c0b7fd7149d69c upstream.
While running a stress test I got the following warning triggered:
[191627.672810] ------------[ cut here ]------------
[191627.673949] WARNING: CPU: 8 PID: 8447 at fs/btrfs/file.c:779 __btrfs_drop_extents+0x391/0xa50 [btrfs]()
(...)
[191627.701485] Call Trace:
[191627.702037] [<
ffffffff8145f077>] dump_stack+0x4f/0x7b
[191627.702992] [<
ffffffff81095de5>] ? console_unlock+0x356/0x3a2
[191627.704091] [<
ffffffff8104b3b0>] warn_slowpath_common+0xa1/0xbb
[191627.705380] [<
ffffffffa0664499>] ? __btrfs_drop_extents+0x391/0xa50 [btrfs]
[191627.706637] [<
ffffffff8104b46d>] warn_slowpath_null+0x1a/0x1c
[191627.707789] [<
ffffffffa0664499>] __btrfs_drop_extents+0x391/0xa50 [btrfs]
[191627.709155] [<
ffffffff8115663c>] ? cache_alloc_debugcheck_after.isra.32+0x171/0x1d0
[191627.712444] [<
ffffffff81155007>] ? kmemleak_alloc_recursive.constprop.40+0x16/0x18
[191627.714162] [<
ffffffffa06570c9>] insert_reserved_file_extent.constprop.40+0x83/0x24e [btrfs]
[191627.715887] [<
ffffffffa065422b>] ? start_transaction+0x3bb/0x610 [btrfs]
[191627.717287] [<
ffffffffa065b604>] btrfs_finish_ordered_io+0x273/0x4e2 [btrfs]
[191627.728865] [<
ffffffffa065b888>] finish_ordered_fn+0x15/0x17 [btrfs]
[191627.730045] [<
ffffffffa067d688>] normal_work_helper+0x14c/0x32c [btrfs]
[191627.731256] [<
ffffffffa067d96a>] btrfs_endio_write_helper+0x12/0x14 [btrfs]
[191627.732661] [<
ffffffff81061119>] process_one_work+0x24c/0x4ae
[191627.733822] [<
ffffffff810615b0>] worker_thread+0x206/0x2c2
[191627.734857] [<
ffffffff810613aa>] ? process_scheduled_works+0x2f/0x2f
[191627.736052] [<
ffffffff810613aa>] ? process_scheduled_works+0x2f/0x2f
[191627.737349] [<
ffffffff810669a6>] kthread+0xef/0xf7
[191627.738267] [<
ffffffff810f3b3a>] ? time_hardirqs_on+0x15/0x28
[191627.739330] [<
ffffffff810668b7>] ? __kthread_parkme+0xad/0xad
[191627.741976] [<
ffffffff81465592>] ret_from_fork+0x42/0x70
[191627.743080] [<
ffffffff810668b7>] ? __kthread_parkme+0xad/0xad
[191627.744206] ---[ end trace
bbfddacb7aaada8d ]---
$ cat -n fs/btrfs/file.c
691 int __btrfs_drop_extents(struct btrfs_trans_handle *trans,
(...)
758 btrfs_item_key_to_cpu(leaf, &key, path->slots[0]);
759 if (key.objectid > ino ||
760 key.type > BTRFS_EXTENT_DATA_KEY || key.offset >= end)
761 break;
762
763 fi = btrfs_item_ptr(leaf, path->slots[0],
764 struct btrfs_file_extent_item);
765 extent_type = btrfs_file_extent_type(leaf, fi);
766
767 if (extent_type == BTRFS_FILE_EXTENT_REG ||
768 extent_type == BTRFS_FILE_EXTENT_PREALLOC) {
(...)
774 } else if (extent_type == BTRFS_FILE_EXTENT_INLINE) {
(...)
778 } else {
779 WARN_ON(1);
780 extent_end = search_start;
781 }
(...)
This happened because the item we were processing did not match a file
extent item (its key type != BTRFS_EXTENT_DATA_KEY), and even on this
case we cast the item to a struct btrfs_file_extent_item pointer and
then find a type field value that does not match any of the expected
values (BTRFS_FILE_EXTENT_[REG|PREALLOC|INLINE]). This scenario happens
due to a tiny time window where a race can happen as exemplified below.
For example, consider the following scenario where we're using the
NO_HOLES feature and we have the following two neighbour leafs:
Leaf X (has N items) Leaf Y
[ ... (257 INODE_ITEM 0) (257 INODE_REF 256) ] [ (257 EXTENT_DATA 8192), ... ]
slot N - 2 slot N - 1 slot 0
Our inode 257 has an implicit hole in the range [0, 8K[ (implicit rather
than explicit because NO_HOLES is enabled). Now if our inode has an
ordered extent for the range [4K, 8K[ that is finishing, the following
can happen:
CPU 1 CPU 2
btrfs_finish_ordered_io()
insert_reserved_file_extent()
__btrfs_drop_extents()
Searches for the key
(257 EXTENT_DATA 4096) through
btrfs_lookup_file_extent()
Key not found and we get a path where
path->nodes[0] == leaf X and
path->slots[0] == N
Because path->slots[0] is >=
btrfs_header_nritems(leaf X), we call
btrfs_next_leaf()
btrfs_next_leaf() releases the path
inserts key
(257 INODE_REF 4096)
at the end of leaf X,
leaf X now has N + 1 keys,
and the new key is at
slot N
btrfs_next_leaf() searches for
key (257 INODE_REF 256), with
path->keep_locks set to 1,
because it was the last key it
saw in leaf X
finds it in leaf X again and
notices it's no longer the last
key of the leaf, so it returns 0
with path->nodes[0] == leaf X and
path->slots[0] == N (which is now
< btrfs_header_nritems(leaf X)),
pointing to the new key
(257 INODE_REF 4096)
__btrfs_drop_extents() casts the
item at path->nodes[0], slot
path->slots[0], to a struct
btrfs_file_extent_item - it does
not skip keys for the target
inode with a type less than
BTRFS_EXTENT_DATA_KEY
(BTRFS_INODE_REF_KEY < BTRFS_EXTENT_DATA_KEY)
sees a bogus value for the type
field triggering the WARN_ON in
the trace shown above, and sets
extent_end = search_start (4096)
does the if-then-else logic to
fixup 0 length extent items created
by a past bug from hole punching:
if (extent_end == key.offset &&
extent_end >= search_start)
goto delete_extent_item;
that evaluates to true and it ends
up deleting the key pointed to by
path->slots[0], (257 INODE_REF 4096),
from leaf X
The same could happen for example for a xattr that ends up having a key
with an offset value that matches search_start (very unlikely but not
impossible).
So fix this by ensuring that keys smaller than BTRFS_EXTENT_DATA_KEY are
skipped, never casted to struct btrfs_file_extent_item and never deleted
by accident. Also protect against the unexpected case of getting a key
for a lower inode number by skipping that key and issuing a warning.
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Eric Dumazet [Tue, 1 Dec 2015 15:20:07 +0000 (07:20 -0800)]
ipv6: sctp: implement sctp_v6_destroy_sock()
[ Upstream commit
602dd62dfbda3e63a2d6a3cbde953ebe82bf5087 ]
Dmitry Vyukov reported a memory leak using IPV6 SCTP sockets.
We need to call inet6_destroy_sock() to properly release
inet6 specific fields.
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Michal Kubeček [Tue, 24 Nov 2015 14:07:11 +0000 (15:07 +0100)]
ipv6: distinguish frag queues by device for multicast and link-local packets
[ Upstream commit
264640fc2c5f4f913db5c73fa3eb1ead2c45e9d7 ]
If a fragmented multicast packet is received on an ethernet device which
has an active macvlan on top of it, each fragment is duplicated and
received both on the underlying device and the macvlan. If some
fragments for macvlan are processed before the whole packet for the
underlying device is reassembled, the "overlapping fragments" test in
ip6_frag_queue() discards the whole fragment queue.
To resolve this, add device ifindex to the search key and require it to
match reassembling multicast packets and packets to link-local
addresses.
Note: similar patch has been already submitted by Yoshifuji Hideaki in
http://patchwork.ozlabs.org/patch/220979/
but got lost and forgotten for some reason.
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Aaro Koskinen [Sat, 21 Nov 2015 23:08:54 +0000 (01:08 +0200)]
broadcom: fix PHY_ID_BCM5481 entry in the id table
[ Upstream commit
3c25a860d17b7378822f35d8c9141db9507e3beb ]
Commit
fcb26ec5b18d ("broadcom: move all PHY_ID's to header")
updated broadcom_tbl to use PHY_IDs, but incorrectly replaced 0x0143bca0
with PHY_ID_BCM5482 (making a duplicate entry, and completely omitting
the original). Fix that.
Fixes: fcb26ec5b18d ("broadcom: move all PHY_ID's to header")
Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Nikolay Aleksandrov [Fri, 20 Nov 2015 12:54:20 +0000 (13:54 +0100)]
net: ip6mr: fix static mfc/dev leaks on table destruction
[ Upstream commit
4c6980462f32b4f282c5d8e5f7ea8070e2937725 ]
Similar to ipv4, when destroying an mrt table the static mfc entries and
the static devices are kept, which leads to devices that can never be
destroyed (because of refcnt taken) and leaked memory. Make sure that
everything is cleaned up on netns destruction.
Fixes: 8229efdaef1e ("netns: ip6mr: enable namespace support in ipv6 multicast forwarding code")
CC: Benjamin Thery <benjamin.thery@bull.net>
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Reviewed-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Nikolay Aleksandrov [Fri, 20 Nov 2015 12:54:19 +0000 (13:54 +0100)]
net: ipmr: fix static mfc/dev leaks on table destruction
[ Upstream commit
0e615e9601a15efeeb8942cf7cd4dadba0c8c5a7 ]
When destroying an mrt table the static mfc entries and the static
devices are kept, which leads to devices that can never be destroyed
(because of refcnt taken) and leaked memory, for example:
unreferenced object 0xffff880034c144c0 (size 192):
comm "mfc-broken", pid 4777, jiffies
4320349055 (age 46001.964s)
hex dump (first 32 bytes):
98 53 f0 34 00 88 ff ff 98 53 f0 34 00 88 ff ff .S.4.....S.4....
ef 0a 0a 14 01 02 03 04 00 00 00 00 01 00 00 00 ................
backtrace:
[<
ffffffff815c1b9e>] kmemleak_alloc+0x4e/0xb0
[<
ffffffff811ea6e0>] kmem_cache_alloc+0x190/0x300
[<
ffffffff815931cb>] ip_mroute_setsockopt+0x5cb/0x910
[<
ffffffff8153d575>] do_ip_setsockopt.isra.11+0x105/0xff0
[<
ffffffff8153e490>] ip_setsockopt+0x30/0xa0
[<
ffffffff81564e13>] raw_setsockopt+0x33/0x90
[<
ffffffff814d1e14>] sock_common_setsockopt+0x14/0x20
[<
ffffffff814d0b51>] SyS_setsockopt+0x71/0xc0
[<
ffffffff815cdbf6>] entry_SYSCALL_64_fastpath+0x16/0x7a
[<
ffffffffffffffff>] 0xffffffffffffffff
Make sure that everything is cleaned on netns destruction.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Reviewed-by: Cong Wang <cwang@twopensource.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Daniel Borkmann [Thu, 19 Nov 2015 23:11:56 +0000 (00:11 +0100)]
net, scm: fix PaX detected msg_controllen overflow in scm_detach_fds
[ Upstream commit
6900317f5eff0a7070c5936e5383f589e0de7a09 ]
David and HacKurx reported a following/similar size overflow triggered
in a grsecurity kernel, thanks to PaX's gcc size overflow plugin:
(Already fixed in later grsecurity versions by Brad and PaX Team.)
[ 1002.296137] PAX: size overflow detected in function scm_detach_fds net/core/scm.c:314
cicus.202_127 min, count: 4, decl: msg_controllen; num: 0; context: msghdr;
[ 1002.296145] CPU: 0 PID: 3685 Comm: scm_rights_recv Not tainted 4.2.3-grsec+ #7
[ 1002.296149] Hardware name: Apple Inc. MacBookAir5,1/Mac-
66F35F19FE2A0D05, [...]
[ 1002.296153]
ffffffff81c27366 0000000000000000 ffffffff81c27375 ffffc90007843aa8
[ 1002.296162]
ffffffff818129ba 0000000000000000 ffffffff81c27366 ffffc90007843ad8
[ 1002.296169]
ffffffff8121f838 fffffffffffffffc fffffffffffffffc ffffc90007843e60
[ 1002.296176] Call Trace:
[ 1002.296190] [<
ffffffff818129ba>] dump_stack+0x45/0x57
[ 1002.296200] [<
ffffffff8121f838>] report_size_overflow+0x38/0x60
[ 1002.296209] [<
ffffffff816a979e>] scm_detach_fds+0x2ce/0x300
[ 1002.296220] [<
ffffffff81791899>] unix_stream_read_generic+0x609/0x930
[ 1002.296228] [<
ffffffff81791c9f>] unix_stream_recvmsg+0x4f/0x60
[ 1002.296236] [<
ffffffff8178dc00>] ? unix_set_peek_off+0x50/0x50
[ 1002.296243] [<
ffffffff8168fac7>] sock_recvmsg+0x47/0x60
[ 1002.296248] [<
ffffffff81691522>] ___sys_recvmsg+0xe2/0x1e0
[ 1002.296257] [<
ffffffff81693496>] __sys_recvmsg+0x46/0x80
[ 1002.296263] [<
ffffffff816934fc>] SyS_recvmsg+0x2c/0x40
[ 1002.296271] [<
ffffffff8181a3ab>] entry_SYSCALL_64_fastpath+0x12/0x85
Further investigation showed that this can happen when an *odd* number of
fds are being passed over AF_UNIX sockets.
In these cases CMSG_LEN(i * sizeof(int)) and CMSG_SPACE(i * sizeof(int)),
where i is the number of successfully passed fds, differ by 4 bytes due
to the extra CMSG_ALIGN() padding in CMSG_SPACE() to an 8 byte boundary
on 64 bit. The padding is used to align subsequent cmsg headers in the
control buffer.
When the control buffer passed in from the receiver side *lacks* these 4
bytes (e.g. due to buggy/wrong API usage), then msg->msg_controllen will
overflow in scm_detach_fds():
int cmlen = CMSG_LEN(i * sizeof(int)); <--- cmlen w/o tail-padding
err = put_user(SOL_SOCKET, &cm->cmsg_level);
if (!err)
err = put_user(SCM_RIGHTS, &cm->cmsg_type);
if (!err)
err = put_user(cmlen, &cm->cmsg_len);
if (!err) {
cmlen = CMSG_SPACE(i * sizeof(int)); <--- cmlen w/ 4 byte extra tail-padding
msg->msg_control += cmlen;
msg->msg_controllen -= cmlen; <--- iff no tail-padding space here ...
} ... wrap-around
F.e. it will wrap to a length of
18446744073709551612 bytes in case the
receiver passed in msg->msg_controllen of 20 bytes, and the sender
properly transferred 1 fd to the receiver, so that its CMSG_LEN results
in 20 bytes and CMSG_SPACE in 24 bytes.
In case of MSG_CMSG_COMPAT (scm_detach_fds_compat()), I haven't seen an
issue in my tests as alignment seems always on 4 byte boundary. Same
should be in case of native 32 bit, where we end up with 4 byte boundaries
as well.
In practice, passing msg->msg_controllen of 20 to recvmsg() while receiving
a single fd would mean that on successful return, msg->msg_controllen is
being set by the kernel to 24 bytes instead, thus more than the input
buffer advertised. It could f.e. become an issue if such application later
on zeroes or copies the control buffer based on the returned msg->msg_controllen
elsewhere.
Maximum number of fds we can send is a hard upper limit SCM_MAX_FD (253).
Going over the code, it seems like msg->msg_controllen is not being read
after scm_detach_fds() in scm_recv() anymore by the kernel, good!
Relevant recvmsg() handler are unix_dgram_recvmsg() (unix_seqpacket_recvmsg())
and unix_stream_recvmsg(). Both return back to their recvmsg() caller,
and ___sys_recvmsg() places the updated length, that is, new msg_control -
old msg_control pointer into msg->msg_controllen (hence the 24 bytes seen
in the example).
Long time ago, Wei Yongjun fixed something related in commit
1ac70e7ad24a
("[NET]: Fix function put_cmsg() which may cause usr application memory
overflow").
RFC3542, section 20.2. says:
The fields shown as "XX" are possible padding, between the cmsghdr
structure and the data, and between the data and the next cmsghdr
structure, if required by the implementation. While sending an
application may or may not include padding at the end of last
ancillary data in msg_controllen and implementations must accept both
as valid. On receiving a portable application must provide space for
padding at the end of the last ancillary data as implementations may
copy out the padding at the end of the control message buffer and
include it in the received msg_controllen. When recvmsg() is called
if msg_controllen is too small for all the ancillary data items
including any trailing padding after the last item an implementation
may set MSG_CTRUNC.
Since we didn't place MSG_CTRUNC for already quite a long time, just do
the same as in
1ac70e7ad24a to avoid an overflow.
Btw, even man-page author got this wrong :/ See
db939c9b26e9 ("cmsg.3: Fix
error in SCM_RIGHTS code sample"). Some people must have copied this (?),
thus it got triggered in the wild (reported several times during boot by
David and HacKurx).
No Fixes tag this time as pre 2002 (that is, pre history tree).
Reported-by: David Sterba <dave@jikos.cz>
Reported-by: HacKurx <hackurx@gmail.com>
Cc: PaX Team <pageexec@freemail.hu>
Cc: Emese Revfy <re.emese@gmail.com>
Cc: Brad Spengler <spender@grsecurity.net>
Cc: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Cc: Eric Dumazet <edumazet@google.com>
Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Eric Dumazet [Thu, 26 Nov 2015 16:18:14 +0000 (08:18 -0800)]
tcp: initialize tp->copied_seq in case of cross SYN connection
[ Upstream commit
142a2e7ece8d8ac0e818eb2c91f99ca894730e2a ]
Dmitry provided a syzkaller (http://github.com/google/syzkaller)
generated program that triggers the WARNING at
net/ipv4/tcp.c:1729 in tcp_recvmsg() :
WARN_ON(tp->copied_seq != tp->rcv_nxt &&
!(flags & (MSG_PEEK | MSG_TRUNC)));
His program is specifically attempting a Cross SYN TCP exchange,
that we support (for the pleasure of hackers ?), but it looks we
lack proper tcp->copied_seq initialization.
Thanks again Dmitry for your report and testings.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Tested-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Eric Dumazet [Wed, 18 Nov 2015 20:40:13 +0000 (12:40 -0800)]
tcp: md5: fix lockdep annotation
[ Upstream commit
1b8e6a01e19f001e9f93b39c32387961c91ed3cc ]
When a passive TCP is created, we eventually call tcp_md5_do_add()
with sk pointing to the child. It is not owner by the user yet (we
will add this socket into listener accept queue a bit later anyway)
But we do own the spinlock, so amend the lockdep annotation to avoid
following splat :
[ 8451.090932] net/ipv4/tcp_ipv4.c:923 suspicious rcu_dereference_protected() usage!
[ 8451.090932]
[ 8451.090932] other info that might help us debug this:
[ 8451.090932]
[ 8451.090934]
[ 8451.090934] rcu_scheduler_active = 1, debug_locks = 1
[ 8451.090936] 3 locks held by socket_sockopt_/214795:
[ 8451.090936] #0: (rcu_read_lock){.+.+..}, at: [<
ffffffff855c6ac1>] __netif_receive_skb_core+0x151/0xe90
[ 8451.090947] #1: (rcu_read_lock){.+.+..}, at: [<
ffffffff85618143>] ip_local_deliver_finish+0x43/0x2b0
[ 8451.090952] #2: (slock-AF_INET){+.-...}, at: [<
ffffffff855acda5>] sk_clone_lock+0x1c5/0x500
[ 8451.090958]
[ 8451.090958] stack backtrace:
[ 8451.090960] CPU: 7 PID: 214795 Comm: socket_sockopt_
[ 8451.091215] Call Trace:
[ 8451.091216] <IRQ> [<
ffffffff856fb29c>] dump_stack+0x55/0x76
[ 8451.091229] [<
ffffffff85123b5b>] lockdep_rcu_suspicious+0xeb/0x110
[ 8451.091235] [<
ffffffff8564544f>] tcp_md5_do_add+0x1bf/0x1e0
[ 8451.091239] [<
ffffffff85645751>] tcp_v4_syn_recv_sock+0x1f1/0x4c0
[ 8451.091242] [<
ffffffff85642b27>] ? tcp_v4_md5_hash_skb+0x167/0x190
[ 8451.091246] [<
ffffffff85647c78>] tcp_check_req+0x3c8/0x500
[ 8451.091249] [<
ffffffff856451ae>] ? tcp_v4_inbound_md5_hash+0x11e/0x190
[ 8451.091253] [<
ffffffff85647170>] tcp_v4_rcv+0x3c0/0x9f0
[ 8451.091256] [<
ffffffff85618143>] ? ip_local_deliver_finish+0x43/0x2b0
[ 8451.091260] [<
ffffffff856181b6>] ip_local_deliver_finish+0xb6/0x2b0
[ 8451.091263] [<
ffffffff85618143>] ? ip_local_deliver_finish+0x43/0x2b0
[ 8451.091267] [<
ffffffff85618d38>] ip_local_deliver+0x48/0x80
[ 8451.091270] [<
ffffffff85618510>] ip_rcv_finish+0x160/0x700
[ 8451.091273] [<
ffffffff8561900e>] ip_rcv+0x29e/0x3d0
[ 8451.091277] [<
ffffffff855c74b7>] __netif_receive_skb_core+0xb47/0xe90
Fixes: a8afca0329988 ("tcp: md5: protects md5sig_info with RCU")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Bjørn Mork [Wed, 18 Nov 2015 20:13:07 +0000 (21:13 +0100)]
net: qmi_wwan: add XS Stick W100-2 from 4G Systems
[ Upstream commit
68242a5a1e2edce39b069385cbafb82304eac0f1 ]
Thomas reports
"
4gsystems sells two total different LTE-surfsticks under the same name.
..
The newer version of XS Stick W100 is from "omega"
..
Under windows the driver switches to the same ID, and uses MI03\6 for
network and MI01\6 for modem.
..
echo "1c9e 9b01" > /sys/bus/usb/drivers/qmi_wwan/new_id
echo "1c9e 9b01" > /sys/bus/usb-serial/drivers/option1/new_id
T: Bus=01 Lev=01 Prnt=01 Port=03 Cnt=01 Dev#= 4 Spd=480 MxCh= 0
D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1
P: Vendor=1c9e ProdID=9b01 Rev=02.32
S: Manufacturer=USB Modem
S: Product=USB Modem
S: SerialNumber=
C: #Ifs= 5 Cfg#= 1 Atr=80 MxPwr=500mA
I: If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option
I: If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=option
I: If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=option
I: If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan
I: If#= 4 Alt= 0 #EPs= 2 Cls=08(stor.) Sub=06 Prot=50 Driver=usb-storage
Now all important things are there:
wwp0s29f7u2i3 (net), ttyUSB2 (at), cdc-wdm0 (qmi), ttyUSB1 (at)
There is also ttyUSB0, but it is not usable, at least not for at.
The device works well with qmi and ModemManager-NetworkManager.
"
Reported-by: Thomas Schäfer <tschaefer@t-online.de>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Neil Horman [Mon, 16 Nov 2015 18:09:10 +0000 (13:09 -0500)]
snmp: Remove duplicate OUTMCAST stat increment
[ Upstream commit
41033f029e393a64e81966cbe34d66c6cf8a2e7e ]
the OUTMCAST stat is double incremented, getting bumped once in the mcast code
itself, and again in the common ip output path. Remove the mcast bump, as its
not needed
Validated by the reporter, with good results
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Reported-by: Claus Jensen <claus.jensen@microsemi.com>
CC: Claus Jensen <claus.jensen@microsemi.com>
CC: David Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jason A. Donenfeld [Thu, 12 Nov 2015 16:35:58 +0000 (17:35 +0100)]
ip_tunnel: disable preemption when updating per-cpu tstats
[ Upstream commit
b4fe85f9c9146f60457e9512fb6055e69e6a7a65 ]
Drivers like vxlan use the recently introduced
udp_tunnel_xmit_skb/udp_tunnel6_xmit_skb APIs. udp_tunnel6_xmit_skb
makes use of ip6tunnel_xmit, and ip6tunnel_xmit, after sending the
packet, updates the struct stats using the usual
u64_stats_update_begin/end calls on this_cpu_ptr(dev->tstats).
udp_tunnel_xmit_skb makes use of iptunnel_xmit, which doesn't touch
tstats, so drivers like vxlan, immediately after, call
iptunnel_xmit_stats, which does the same thing - calls
u64_stats_update_begin/end on this_cpu_ptr(dev->tstats).
While vxlan is probably fine (I don't know?), calling a similar function
from, say, an unbound workqueue, on a fully preemptable kernel causes
real issues:
[ 188.434537] BUG: using smp_processor_id() in preemptible [
00000000] code: kworker/u8:0/6
[ 188.435579] caller is debug_smp_processor_id+0x17/0x20
[ 188.435583] CPU: 0 PID: 6 Comm: kworker/u8:0 Not tainted 4.2.6 #2
[ 188.435607] Call Trace:
[ 188.435611] [<
ffffffff8234e936>] dump_stack+0x4f/0x7b
[ 188.435615] [<
ffffffff81915f3d>] check_preemption_disabled+0x19d/0x1c0
[ 188.435619] [<
ffffffff81915f77>] debug_smp_processor_id+0x17/0x20
The solution would be to protect the whole
this_cpu_ptr(dev->tstats)/u64_stats_update_begin/end blocks with
disabling preemption and then reenabling it.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
lucien [Thu, 12 Nov 2015 05:07:07 +0000 (13:07 +0800)]
sctp: translate host order to network order when setting a hmacid
[ Upstream commit
ed5a377d87dc4c87fb3e1f7f698cba38cd893103 ]
now sctp auth cannot work well when setting a hmacid manually, which
is caused by that we didn't use the network order for hmacid, so fix
it by adding the transformation in sctp_auth_ep_set_hmacs.
even we set hmacid with the network order in userspace, it still
can't work, because of this condition in sctp_auth_ep_set_hmacs():
if (id > SCTP_AUTH_HMAC_ID_MAX)
return -EOPNOTSUPP;
so this wasn't working before and thus it won't break compatibility.
Fixes: 65b07e5d0d09 ("[SCTP]: API updates to suport SCTP-AUTH extensions.")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Daniel Borkmann [Wed, 11 Nov 2015 22:25:43 +0000 (23:25 +0100)]
packet: infer protocol from ethernet header if unset
[ Upstream commit
c72219b75fde768efccf7666342282fab7f9e4e7 ]
In case no struct sockaddr_ll has been passed to packet
socket's sendmsg() when doing a TX_RING flush run, then
skb->protocol is set to po->num instead, which is the protocol
passed via socket(2)/bind(2).
Applications only xmitting can go the path of allocating the
socket as socket(PF_PACKET, <mode>, 0) and do a bind(2) on the
TX_RING with sll_protocol of 0. That way, register_prot_hook()
is neither called on creation nor on bind time, which saves
cycles when there's no interest in capturing anyway.
That leaves us however with po->num 0 instead and therefore
the TX_RING flush run sets skb->protocol to 0 as well. Eric
reported that this leads to problems when using tools like
trafgen over bonding device. I.e. the bonding's hash function
could invoke the kernel's flow dissector, which depends on
skb->protocol being properly set. In the current situation, all
the traffic is then directed to a single slave.
Fix it up by inferring skb->protocol from the Ethernet header
when not set and we have ARPHRD_ETHER device type. This is only
done in case of SOCK_RAW and where we have a dev->hard_header_len
length. In case of ARPHRD_ETHER devices, this is guaranteed to
cover ETH_HLEN, and therefore being accessed on the skb after
the skb_store_bits().
Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Daniel Borkmann [Wed, 11 Nov 2015 22:25:41 +0000 (23:25 +0100)]
packet: always probe for transport header
[ Upstream commit
8fd6c80d9dd938ca338c70698533a7e304752846 ]
We concluded that the skb_probe_transport_header() should better be
called unconditionally. Avoiding the call into the flow dissector has
also not really much to do with the direct xmit mode.
While it seems that only virtio_net code makes use of GSO from non
RX/TX ring packet socket paths, we should probe for a transport header
nevertheless before they hit devices.
Reference: http://thread.gmane.org/gmane.linux.network/386173/
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Daniel Borkmann [Wed, 11 Nov 2015 22:25:40 +0000 (23:25 +0100)]
packet: do skb_probe_transport_header when we actually have data
[ Upstream commit
efdfa2f7848f64517008136fb41f53c4a1faf93a ]
In tpacket_fill_skb() commit
c1aad275b029 ("packet: set transport
header before doing xmit") and later on
40893fd0fd4e ("net: switch
to use skb_probe_transport_header()") was probing for a transport
header on the skb from a ring buffer slot, but at a time, where
the skb has _not even_ been filled with data yet. So that call into
the flow dissector is pretty useless. Lets do it after we've set
up the skb frags.
Fixes: c1aad275b029 ("packet: set transport header before doing xmit")
Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Kamal Mostafa [Wed, 11 Nov 2015 22:24:27 +0000 (14:24 -0800)]
tools/net: Use include/uapi with __EXPORTED_HEADERS__
[ Upstream commit
d7475de58575c904818efa369c82e88c6648ce2e ]
Use the local uapi headers to keep in sync with "recently" added #define's
(e.g. SKF_AD_VLAN_TPID). Refactored CFLAGS, and bpf_asm doesn't need -I.
Fixes: 3f356385e8a4 ("filter: bpf_asm: add minimal bpf asm tool")
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Rainer Weikusat [Fri, 20 Nov 2015 22:07:23 +0000 (22:07 +0000)]
unix: avoid use-after-free in ep_remove_wait_queue
[ Upstream commit
7d267278a9ece963d77eefec61630223fce08c6c ]
Rainer Weikusat <rweikusat@mobileactivedefense.com> writes:
An AF_UNIX datagram socket being the client in an n:1 association with
some server socket is only allowed to send messages to the server if the
receive queue of this socket contains at most sk_max_ack_backlog
datagrams. This implies that prospective writers might be forced to go
to sleep despite none of the message presently enqueued on the server
receive queue were sent by them. In order to ensure that these will be
woken up once space becomes again available, the present unix_dgram_poll
routine does a second sock_poll_wait call with the peer_wait wait queue
of the server socket as queue argument (unix_dgram_recvmsg does a wake
up on this queue after a datagram was received). This is inherently
problematic because the server socket is only guaranteed to remain alive
for as long as the client still holds a reference to it. In case the
connection is dissolved via connect or by the dead peer detection logic
in unix_dgram_sendmsg, the server socket may be freed despite "the
polling mechanism" (in particular, epoll) still has a pointer to the
corresponding peer_wait queue. There's no way to forcibly deregister a
wait queue with epoll.
Based on an idea by Jason Baron, the patch below changes the code such
that a wait_queue_t belonging to the client socket is enqueued on the
peer_wait queue of the server whenever the peer receive queue full
condition is detected by either a sendmsg or a poll. A wake up on the
peer queue is then relayed to the ordinary wait queue of the client
socket via wake function. The connection to the peer wait queue is again
dissolved if either a wake up is about to be relayed or the client
socket reconnects or a dead peer is detected or the client socket is
itself closed. This enables removing the second sock_poll_wait from
unix_dgram_poll, thus avoiding the use-after-free, while still ensuring
that no blocked writer sleeps forever.
Signed-off-by: Rainer Weikusat <rweikusat@mobileactivedefense.com>
Fixes: ec0d215f9420 ("af_unix: fix 'poll for write'/connected DGRAM sockets")
Reviewed-by: Jason Baron <jbaron@akamai.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Greg Kroah-Hartman [Wed, 9 Dec 2015 18:43:38 +0000 (13:43 -0500)]
Linux 3.14.58
Clemens Ladisch [Sun, 15 Nov 2015 21:39:08 +0000 (22:39 +0100)]
ALSA: usb-audio: work around CH345 input SysEx corruption
commit
a91e627e3f0ed820b11d86cdc04df38f65f33a70 upstream.
One of the many faults of the QinHeng CH345 USB MIDI interface chip is
that it does not handle received SysEx messages correctly -- every second
event packet has a wrong code index number, which is the one from the last
seen message, instead of 4. For example, the two messages "FE F0 01 02 03
04 05 06 07 08 09 0A 0B 0C 0D 0E F7" result in the following event
packets:
correct: CH345:
0F FE 00 00 0F FE 00 00
04 F0 01 02 04 F0 01 02
04 03 04 05 0F 03 04 05
04 06 07 08 04 06 07 08
04 09 0A 0B 0F 09 0A 0B
04 0C 0D 0E 04 0C 0D 0E
05 F7 00 00 05 F7 00 00
A class-compliant driver must interpret an event packet with CIN 15 as
having a single data byte, so the other two bytes would be ignored. The
message received by the host would then be missing two bytes out of six;
in this example, "F0 01 02 03 06 07 08 09 0C 0D 0E F7".
These corrupted SysEx event packages contain only data bytes, while the
CH345 uses event packets with a correct CIN value only for messages with
a status byte, so it is possible to distinguish between these two cases by
checking for the presence of this status byte.
(Other bugs in the CH345's input handling, such as the corruption resulting
from running status, cannot be worked around.)
Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Clemens Ladisch [Sun, 15 Nov 2015 21:38:29 +0000 (22:38 +0100)]
ALSA: usb-audio: prevent CH345 multiport output SysEx corruption
commit
1ca8b201309d842642f221db7f02f71c0af5be2d upstream.
The CH345 USB MIDI chip has two output ports. However, they are
multiplexed through one pin, and the number of ports cannot be reduced
even for hardware that implements only one connector, so for those
devices, data sent to either port ends up on the same hardware output.
This becomes a problem when both ports are used at the same time, as
longer MIDI commands (such as SysEx messages) are likely to be
interrupted by messages from the other port, and thus to get lost.
It would not be possible for the driver to detect how many ports the
device actually has, except that in practice, _all_ devices built with
the CH345 have only one port. So we can just ignore the device's
descriptors, and hardcode one output port.
Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Clemens Ladisch [Sun, 15 Nov 2015 21:37:44 +0000 (22:37 +0100)]
ALSA: usb-audio: add packet size quirk for the Medeli DD305
commit
98d362becb6621bebdda7ed0eac7ad7ec6c37898 upstream.
Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Bjørn Mork [Wed, 18 Nov 2015 20:12:33 +0000 (21:12 +0100)]
USB: option: add XS Stick W100-2 from 4G Systems
commit
638148e20c7f8f6e95017fdc13bce8549a6925e0 upstream.
Thomas reports
"
4gsystems sells two total different LTE-surfsticks under the same name.
..
The newer version of XS Stick W100 is from "omega"
..
Under windows the driver switches to the same ID, and uses MI03\6 for
network and MI01\6 for modem.
..
echo "1c9e 9b01" > /sys/bus/usb/drivers/qmi_wwan/new_id
echo "1c9e 9b01" > /sys/bus/usb-serial/drivers/option1/new_id
T: Bus=01 Lev=01 Prnt=01 Port=03 Cnt=01 Dev#= 4 Spd=480 MxCh= 0
D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1
P: Vendor=1c9e ProdID=9b01 Rev=02.32
S: Manufacturer=USB Modem
S: Product=USB Modem
S: SerialNumber=
C: #Ifs= 5 Cfg#= 1 Atr=80 MxPwr=500mA
I: If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option
I: If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=option
I: If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=option
I: If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan
I: If#= 4 Alt= 0 #EPs= 2 Cls=08(stor.) Sub=06 Prot=50 Driver=usb-storage
Now all important things are there:
wwp0s29f7u2i3 (net), ttyUSB2 (at), cdc-wdm0 (qmi), ttyUSB1 (at)
There is also ttyUSB0, but it is not usable, at least not for at.
The device works well with qmi and ModemManager-NetworkManager.
"
Reported-by: Thomas Schäfer <tschaefer@t-online.de>
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Aleksander Morgado [Wed, 11 Nov 2015 18:51:40 +0000 (19:51 +0100)]
USB: serial: option: add support for Novatel MiFi USB620L
commit
e07af133c3e2716db25e3e1e1d9f10c2088e9c1a upstream.
Also known as Verizon U620L.
The device is modeswitched from 1410:9020 to 1410:9022 by selecting the
4th USB configuration:
$ sudo usb_modeswitch –v 0x1410 –p 0x9020 –u 4
This configuration provides a ECM interface as well as TTYs ('Enterprise
Mode' according to the U620 Linux integration guide).
Signed-off-by: Aleksander Morgado <aleksander@aleksander.es>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
David Woodhouse [Sat, 14 Nov 2015 16:49:30 +0000 (16:49 +0000)]
USB: ti_usb_3410_5052: Add Honeywell HGI80 ID
commit
1bcb49e663f88bccee35b8688e6a3da2bea31fd4 upstream.
The Honeywell HGI80 is a wireless interface to the evohome connected
thermostat. It uses a TI 3410 USB-serial port.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Uwe Kleine-König [Fri, 23 Oct 2015 07:53:50 +0000 (09:53 +0200)]
usb: musb: core: fix order of arguments to ulpi write callback
commit
705e63d2b29c8bbf091119084544d353bda70393 upstream.
There is a bit of a mess in the order of arguments to the ulpi write
callback. There is
int ulpi_write(struct ulpi *ulpi, u8 addr, u8 val)
in drivers/usb/common/ulpi.c;
struct usb_phy_io_ops {
...
int (*write)(struct usb_phy *x, u32 val, u32 reg);
}
in include/linux/usb/phy.h.
The callback registered by the musb driver has to comply to the latter,
but up to now had "offset" first which effectively made the function
broken for correct users. So flip the order and while at it also
switch to the parameter names of struct usb_phy_io_ops's write.
Fixes: ffb865b1e460 ("usb: musb: add ulpi access operations")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jiri Slaby [Mon, 2 Nov 2015 09:27:00 +0000 (10:27 +0100)]
usblp: do not set TASK_INTERRUPTIBLE before lock
commit
19cd80a214821f4b558560ebd76bfb2c38b4f3d8 upstream.
It is not permitted to set task state before lock. usblp_wwait sets
the state to TASK_INTERRUPTIBLE and calls mutex_lock_interruptible.
Upon return from that function, the state will be TASK_RUNNING again.
This is clearly a bug and a warning is generated with LOCKDEP too:
WARNING: CPU: 1 PID: 5109 at kernel/sched/core.c:7404 __might_sleep+0x7d/0x90()
do not call blocking ops when !TASK_RUNNING; state=1 set at [<
ffffffffa0c588d0>] usblp_wwait+0xa0/0x310 [usblp]
Modules linked in: ...
CPU: 1 PID: 5109 Comm: captmon Tainted: G W 4.2.5-0.gef2823b-default #1
Hardware name: LENOVO 23252SG/23252SG, BIOS G2ET33WW (1.13 ) 07/24/2012
ffffffff81a4edce ffff880236ec7ba8 ffffffff81716651 0000000000000000
ffff880236ec7bf8 ffff880236ec7be8 ffffffff8106e146 0000000000000282
ffffffff81a50119 000000000000028b 0000000000000000 ffff8802dab7c508
Call Trace:
...
[<
ffffffff8106e1c6>] warn_slowpath_fmt+0x46/0x50
[<
ffffffff8109a8bd>] __might_sleep+0x7d/0x90
[<
ffffffff8171b20f>] mutex_lock_interruptible_nested+0x2f/0x4b0
[<
ffffffffa0c588fc>] usblp_wwait+0xcc/0x310 [usblp]
[<
ffffffffa0c58bb2>] usblp_write+0x72/0x350 [usblp]
[<
ffffffff8121ed98>] __vfs_write+0x28/0xf0
...
Commit
7f477358e2384c54b190cc3b6ce28277050a041b (usblp: Implement the
ENOSPC convention) moved the set prior locking. So move it back after
the lock.
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Fixes: 7f477358e2 ("usblp: Implement the ENOSPC convention")
Acked-By: Pete Zaitcev <zaitcev@yahoo.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Robin Murphy [Thu, 22 Oct 2015 14:41:52 +0000 (15:41 +0100)]
arm64: Fix compat register mappings
commit
5accd17d0eb523350c9ef754d655e379c9bb93b3 upstream.
For reasons not entirely apparent, but now enshrined in history, the
architectural mapping of AArch32 banked registers to AArch64 registers
actually orders SP_<mode> and LR_<mode> backwards compared to the
intuitive r13/r14 order, for all modes except FIQ.
Fix the compat_<reg>_<mode> macros accordingly, in the hope of avoiding
subtle bugs with KVM and AArch32 guests.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Mirza Krak [Tue, 10 Nov 2015 13:59:34 +0000 (14:59 +0100)]
can: sja1000: clear interrupts on start
commit
7cecd9ab80f43972c056dc068338f7bcc407b71c upstream.
According to SJA1000 data sheet error-warning (EI) interrupt is not
cleared by setting the controller in to reset-mode.
Then if we have the following case:
- system is suspended (echo mem > /sys/power/state) and SJA1000 is left
in operating state
- A bus error condition occurs which activates EI interrupt, system is
still suspended which means EI interrupt will be not be handled nor
cleared.
If the above two events occur, on resume there is no way to return the
SJA1000 to operating state, except to cycle power to it.
By simply reading the IR register on start we will clear any previous
conditions that could be present.
Signed-off-by: Mirza Krak <mirza.krak@hostmobility.com>
Reported-by: Christian Magnusson <Christian.Magnusson@semcon.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Dmitry Tunin [Fri, 16 Oct 2015 08:45:26 +0000 (11:45 +0300)]
Bluetooth: ath3k: Add support of AR3012 0cf3:817b device
commit
18e0afab8ce3f1230ce3fef52b2e73374fd9c0e7 upstream.
T: Bus=04 Lev=02 Prnt=02 Port=04 Cnt=01 Dev#= 3 Spd=12 MxCh= 0
D: Ver= 1.10 Cls=e0(wlcon) Sub=01 Prot=01 MxPS=64 #Cfgs= 1
P: Vendor=0cf3 ProdID=817b Rev=00.02
C: #Ifs= 2 Cfg#= 1 Atr=e0 MxPwr=100mA
I: If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
I: If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
BugLink: https://bugs.launchpad.net/bugs/1506615
Signed-off-by: Dmitry Tunin <hanipouspilot@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Dmitry Tunin [Mon, 5 Oct 2015 16:29:33 +0000 (19:29 +0300)]
Bluetooth: ath3k: Add new AR3012 0930:021c id
commit
cd355ff071cd37e7197eccf9216770b2b29369f7 upstream.
This adapter works with the existing linux-firmware.
T: Bus=01 Lev=01 Prnt=01 Port=03 Cnt=02 Dev#= 3 Spd=12 MxCh= 0
D: Ver= 1.10 Cls=e0(wlcon) Sub=01 Prot=01 MxPS=64 #Cfgs= 1
P: Vendor=0930 ProdID=021c Rev=00.01
C: #Ifs= 2 Cfg#= 1 Atr=e0 MxPwr=100mA
I: If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
I: If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
BugLink: https://bugs.launchpad.net/bugs/1502781
Signed-off-by: Dmitry Tunin <hanipouspilot@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
David Herrmann [Mon, 7 Sep 2015 10:05:41 +0000 (12:05 +0200)]
Bluetooth: hidp: fix device disconnect on idle timeout
commit
660f0fc07d21114549c1862e67e78b1cf0c90c29 upstream.
The HIDP specs define an idle-timeout which automatically disconnects a
device. This has always been implemented in the HIDP layer and forced a
synchronous shutdown of the hidp-scheduler. This works just fine, but
lacks a forced disconnect on the underlying l2cap channels. This has been
broken since:
commit
5205185d461d5902325e457ca80bd421127b7308
Author: David Herrmann <dh.herrmann@gmail.com>
Date: Sat Apr 6 20:28:47 2013 +0200
Bluetooth: hidp: remove old session-management
The old session-management always forced an l2cap error on the ctrl/intr
channels when shutting down. The new session-management skips this, as we
don't want to enforce channel policy on the caller. In other words, if
user-space removes an HIDP device, the underlying channels (which are
*owned* and *referenced* by user-space) are still left active. User-space
needs to call shutdown(2) or close(2) to release them.
Unfortunately, this does not work with idle-timeouts. There is no way to
signal user-space that the HIDP layer has been stopped. The API simply
does not support any event-passing except for poll(2). Hence, we restore
old behavior and force EUNATCH on the sockets if the HIDP layer is
disconnected due to idle-timeouts (behavior of explicit disconnects
remains unmodified). User-space can still call
getsockopt(..., SO_ERROR, ...)
..to retrieve the EUNATCH error and clear sk_err. Hence, the channels can
still be re-used (which nobody does so far, though). Therefore, the API
still supports the new behavior, but with this patch it's also compatible
to the old implicit channel shutdown.
Reported-by: Mark Haun <haunma@keteu.org>
Reported-by: Luiz Augusto von Dentz <luiz.dentz@gmail.com>
Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Larry Finger [Mon, 19 Oct 2015 03:14:48 +0000 (22:14 -0500)]
staging: rtl8712: Add device ID for Sitecom WLA2100
commit
1e6e63283691a2a9048a35d9c6c59cf0abd342e4 upstream.
This adds the USB ID for the Sitecom WLA2100. The Windows 10 inf file
was checked to verify that the addition is correct.
Reported-by: Frans van de Wiel <fvdw@fvdw.eu>
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Cc: Frans van de Wiel <fvdw@fvdw.eu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Dan Carpenter [Mon, 21 Sep 2015 16:19:53 +0000 (19:19 +0300)]
mwifiex: fix mwifiex_rdeeprom_read()
commit
1f9c6e1bc1ba5f8a10fcd6e99d170954d7c6d382 upstream.
There were several bugs here.
1) The done label was in the wrong place so we didn't copy any
information out when there was no command given.
2) We were using PAGE_SIZE as the size of the buffer instead of
"PAGE_SIZE - pos".
3) snprintf() returns the number of characters that would have been
printed if there were enough space. If there was not enough space
(and we had fixed the memory corruption bug #2) then it would result
in an information leak when we do simple_read_from_buffer(). I've
changed it to use scnprintf() instead.
I also removed the initialization at the start of the function, because
I thought it made the code a little more clear.
Fixes: 5e6e3a92b9a4 ('wireless: mwifiex: initial commit for Marvell mwifiex driver')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Amitkumar Karwar <akarwar@marvell.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Maxime Ripard [Fri, 25 Sep 2015 16:09:35 +0000 (18:09 +0200)]
net: mvneta: Fix CPU_MAP registers initialisation
commit
2502d0ef272da7058ef303b849a2c8dc324c2e2e upstream.
The CPU_MAP register is duplicated for each CPUs at different addresses,
each instance being at a different address.
However, the code so far was using CONFIG_NR_CPUS to initialise the CPU_MAP
registers for each registers, while the SoCs embed at most 4 CPUs.
This is especially an issue with multi_v7_defconfig, where CONFIG_NR_CPUS
is currently set to 16, resulting in writes to registers that are not
CPU_MAP.
Fixes: c5aff18204da ("net: mvneta: driver for Marvell Armada 370/XP network unit")
Signed-off-by: Maxime Ripard <maxime.ripard@free-electrons.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Johannes Berg [Fri, 28 Aug 2015 08:52:53 +0000 (10:52 +0200)]
mac80211: fix driver RSSI event calculations
commit
8ec6d97871f37e4743678ea4a455bd59580aa0f4 upstream.
The ifmgd->ave_beacon_signal value cannot be taken as is for
comparisons, it must be divided by since it's represented
like that for better accuracy of the EWMA calculations. This
would lead to invalid driver RSSI events. Fix the used value.
Fixes: 615f7b9bb1f8 ("mac80211: add driver RSSI threshold events")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Andrew Cooper [Wed, 3 Jun 2015 09:31:14 +0000 (10:31 +0100)]
x86/cpu: Fix SMAP check in PVOPS environments
commit
581b7f158fe0383b492acd1ce3fb4e99d4e57808 upstream.
There appears to be no formal statement of what pv_irq_ops.save_fl() is
supposed to return precisely. Native returns the full flags, while lguest and
Xen only return the Interrupt Flag, and both have comments by the
implementations stating that only the Interrupt Flag is looked at. This may
have been true when initially implemented, but no longer is.
To make matters worse, the Xen PVOP leaves the upper bits undefined, making
the BUG_ON() undefined behaviour. Experimentally, this now trips for 32bit PV
guests on Broadwell hardware. The BUG_ON() is consistent for an individual
build, but not consistent for all builds. It has also been a sitting timebomb
since SMAP support was introduced.
Use native_save_fl() instead, which will obtain an accurate view of the AC
flag.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Tested-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: <lguest@lists.ozlabs.org>
Cc: Xen-devel <xen-devel@lists.xen.org>
Link: http://lkml.kernel.org/r/1433323874-6927-1-git-send-email-andrew.cooper3@citrix.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Borislav Petkov [Thu, 5 Nov 2015 15:57:56 +0000 (16:57 +0100)]
x86/cpu: Call verify_cpu() after having entered long mode too
commit
04633df0c43d710e5f696b06539c100898678235 upstream.
When we get loaded by a 64-bit bootloader, kernel entry point is
startup_64 in head_64.S. We don't trust any and all bootloaders because
some will fiddle with CPU configuration so we go ahead and massage each
CPU into sanity again.
For example, some dell BIOSes have this XD disable feature which set
IA32_MISC_ENABLE[34] and disable NX. This might be some dumb workaround
for other OSes but Linux sure doesn't need it.
A similar thing is present in the Surface 3 firmware - see
https://bugzilla.kernel.org/show_bug.cgi?id=106051 - which sets this bit
only on the BSP:
# rdmsr -a 0x1a0
400850089
850089
850089
850089
I know, right?!
There's not even an off switch in there.
So fix all those cases by sanitizing the 64-bit entry point too. For
that, make verify_cpu() callable in 64-bit mode also.
Requested-and-debugged-by: "H. Peter Anvin" <hpa@zytor.com>
Reported-and-tested-by: Bastien Nocera <bugzilla@hadess.net>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1446739076-21303-1-git-send-email-bp@alien8.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Krzysztof Mazur [Fri, 6 Nov 2015 13:18:36 +0000 (14:18 +0100)]
x86/setup: Fix low identity map for >= 2GB kernel range
commit
68accac392d859d24adcf1be3a90e41f978bd54c upstream.
The commit
f5f3497cad8c extended the low identity mapping. However, if
the kernel uses more than 2 GB (VMSPLIT_2G_OPT or VMSPLIT_1G memory
split), the normal memory mapping is overwritten by the low identity
mapping causing a crash. To avoid overwritting, limit the low identity
map to cover only memory before kernel range (PAGE_OFFSET).
Fixes: f5f3497cad8c "x86/setup: Extend low identity map to cover whole kernel range
Signed-off-by: Krzysztof Mazur <krzysiek@podlesie.net>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Matt Fleming <matt.fleming@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Link: http://lkml.kernel.org/r/1446815916-22105-1-git-send-email-krzysiek@podlesie.net
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Paolo Bonzini [Wed, 14 Oct 2015 11:30:45 +0000 (13:30 +0200)]
x86/setup: Extend low identity map to cover whole kernel range
commit
f5f3497cad8c8416a74b9aaceb127908755d020a upstream.
On 32-bit systems, the initial_page_table is reused by
efi_call_phys_prolog as an identity map to call
SetVirtualAddressMap. efi_call_phys_prolog takes care of
converting the current CPU's GDT to a physical address too.
For PAE kernels the identity mapping is achieved by aliasing the
first PDPE for the kernel memory mapping into the first PDPE
of initial_page_table. This makes the EFI stub's trick "just work".
However, for non-PAE kernels there is no guarantee that the identity
mapping in the initial_page_table extends as far as the GDT; in this
case, accesses to the GDT will cause a page fault (which quickly becomes
a triple fault). Fix this by copying the kernel mappings from
swapper_pg_dir to initial_page_table twice, both at PAGE_OFFSET and at
identity mapping.
For some reason, this is only reproducible with QEMU's dynamic translation
mode, and not for example with KVM. However, even under KVM one can clearly
see that the page table is bogus:
$ qemu-system-i386 -pflash OVMF.fd -M q35 vmlinuz0 -s -S -daemonize
$ gdb
(gdb) target remote localhost:1234
(gdb) hb *0x02858f6f
Hardware assisted breakpoint 1 at 0x2858f6f
(gdb) c
Continuing.
Breakpoint 1, 0x02858f6f in ?? ()
(gdb) monitor info registers
...
GDT=
0724e000 000000ff
IDT=
fffbb000 000007ff
CR0=
0005003b CR2=
ff896000 CR3=
032b7000 CR4=
00000690
...
The page directory is sane:
(gdb) x/4wx 0x32b7000
0x32b7000: 0x03398063 0x03399063 0x0339a063 0x0339b063
(gdb) x/4wx 0x3398000
0x3398000: 0x00000163 0x00001163 0x00002163 0x00003163
(gdb) x/4wx 0x3399000
0x3399000: 0x00400003 0x00401003 0x00402003 0x00403003
but our particular page directory entry is empty:
(gdb) x/1wx 0x32b7000 + (0x724e000 >> 22) * 4
0x32b7070: 0x00000000
[ It appears that you can skate past this issue if you don't receive
any interrupts while the bogus GDT pointer is loaded, or if you avoid
reloading the segment registers in general.
Andy Lutomirski provides some additional insight:
"AFAICT it's entirely permissible for the GDTR and/or LDT
descriptor to point to unmapped memory. Any attempt to use them
(segment loads, interrupts, IRET, etc) will try to access that memory
as if the access came from CPL 0 and, if the access fails, will
generate a valid page fault with CR2 pointing into the GDT or
LDT."
Up until commit
23a0d4e8fa6d ("efi: Disable interrupts around EFI
calls, not in the epilog/prolog calls") interrupts were disabled
around the prolog and epilog calls, and the functional GDT was
re-installed before interrupts were re-enabled.
Which explains why no one has hit this issue until now. ]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reported-by: Laszlo Ersek <lersek@redhat.com>
Cc: <stable@vger.kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
[ Updated changelog. ]
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Peter Ujfalusi [Wed, 14 Oct 2015 11:42:43 +0000 (14:42 +0300)]
ARM: common: edma: Fix channel parameter for irq callbacks
commit
696d8b70c09dd421c4d037fab04341e5b30585cf upstream.
In case when the interrupt happened for the second eDMA the channel
number was incorrectly passed to the client driver.
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Florian Fainelli [Sat, 3 Oct 2015 20:03:47 +0000 (13:03 -0700)]
ARM: orion: Fix DSA platform device after mvmdio conversion
commit
d836ace65ee98d7079bc3c5afdbcc0e27dca20a3 upstream.
DSA expects the host_dev pointer to be the device structure associated
with the MDIO bus controller driver. First commit breaking that was
c3a07134e6aa ("mv643xx_eth: convert to use the Marvell Orion MDIO
driver"), and then, it got completely under the radar for a while.
Reported-by: Frans van de Wiel <fvdw@fvdw.eu>
Fixes: c3a07134e6aa ("mv643xx_eth: convert to use the Marvell Orion MDIO driver")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Marek Szyprowski [Fri, 28 Aug 2015 08:42:09 +0000 (09:42 +0100)]
ARM: 8427/1: dma-mapping: add support for offset parameter in dma_mmap()
commit
7e31210349e9e03a9a4dff31ab5f2bc83e8e84f5 upstream.
IOMMU-based dma_mmap() implementation lacked proper support for offset
parameter used in mmap call (it always assumed that mapping starts from
offset zero). This patch adds support for offset parameter to IOMMU-based
implementation.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Marek Szyprowski [Fri, 28 Aug 2015 08:41:39 +0000 (09:41 +0100)]
ARM: 8426/1: dma-mapping: add missing range check in dma_mmap()
commit
371f0f085f629fc0f66695f572373ca4445a67ad upstream.
dma_mmap() function in IOMMU-based dma-mapping implementation lacked
a check for valid range of mmap parameters (offset and buffer size), what
might have caused access beyond the allocated buffer. This patch fixes
this issue.
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Sasha Levin [Tue, 8 Sep 2015 14:53:40 +0000 (10:53 -0400)]
RDS: verify the underlying transport exists before creating a connection
[ Upstream commit
74e98eb085889b0d2d4908f59f6e00026063014f ]
There was no verification that an underlying transport exists when creating
a connection, this would cause dereferencing a NULL ptr.
It might happen on sockets that weren't properly bound before attempting to
send a message, which will cause a NULL ptr deref:
[135546.047719] kasan: GPF could be caused by NULL-ptr deref or user memory accessgeneral protection fault: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC KASAN
[135546.051270] Modules linked in:
[135546.051781] CPU: 4 PID: 15650 Comm: trinity-c4 Not tainted
4.2.0-next-20150902-sasha-00041-gbaa1222-dirty #2527
[135546.053217] task:
ffff8800835bc000 ti:
ffff8800bc708000 task.ti:
ffff8800bc708000
[135546.054291] RIP: __rds_conn_create (net/rds/connection.c:194)
[135546.055666] RSP: 0018:
ffff8800bc70fab0 EFLAGS:
00010202
[135546.056457] RAX:
dffffc0000000000 RBX:
0000000000000f2c RCX:
ffff8800835bc000
[135546.057494] RDX:
0000000000000007 RSI:
ffff8800835bccd8 RDI:
0000000000000038
[135546.058530] RBP:
ffff8800bc70fb18 R08:
0000000000000001 R09:
0000000000000000
[135546.059556] R10:
ffffed014d7a3a23 R11:
ffffed014d7a3a21 R12:
0000000000000000
[135546.060614] R13:
0000000000000001 R14:
ffff8801ec3d0000 R15:
0000000000000000
[135546.061668] FS:
00007faad4ffb700(0000) GS:
ffff880252000000(0000) knlGS:
0000000000000000
[135546.062836] CS: 0010 DS: 0000 ES: 0000 CR0:
000000008005003b
[135546.063682] CR2:
000000000000846a CR3:
000000009d137000 CR4:
00000000000006a0
[135546.064723] Stack:
[135546.065048]
ffffffffafe2055c ffffffffafe23fc1 ffffed00493097bf ffff8801ec3d0008
[135546.066247]
0000000000000000 00000000000000d0 0000000000000000 ac194a24c0586342
[135546.067438]
1ffff100178e1f78 ffff880320581b00 ffff8800bc70fdd0 ffff880320581b00
[135546.068629] Call Trace:
[135546.069028] ? __rds_conn_create (include/linux/rcupdate.h:856 net/rds/connection.c:134)
[135546.069989] ? rds_message_copy_from_user (net/rds/message.c:298)
[135546.071021] rds_conn_create_outgoing (net/rds/connection.c:278)
[135546.071981] rds_sendmsg (net/rds/send.c:1058)
[135546.072858] ? perf_trace_lock (include/trace/events/lock.h:38)
[135546.073744] ? lockdep_init (kernel/locking/lockdep.c:3298)
[135546.074577] ? rds_send_drop_to (net/rds/send.c:976)
[135546.075508] ? __might_fault (./arch/x86/include/asm/current.h:14 mm/memory.c:3795)
[135546.076349] ? __might_fault (mm/memory.c:3795)
[135546.077179] ? rds_send_drop_to (net/rds/send.c:976)
[135546.078114] sock_sendmsg (net/socket.c:611 net/socket.c:620)
[135546.078856] SYSC_sendto (net/socket.c:1657)
[135546.079596] ? SYSC_connect (net/socket.c:1628)
[135546.080510] ? trace_dump_stack (kernel/trace/trace.c:1926)
[135546.081397] ? ring_buffer_unlock_commit (kernel/trace/ring_buffer.c:2479 kernel/trace/ring_buffer.c:2558 kernel/trace/ring_buffer.c:2674)
[135546.082390] ? trace_buffer_unlock_commit (kernel/trace/trace.c:1749)
[135546.083410] ? trace_event_raw_event_sys_enter (include/trace/events/syscalls.h:16)
[135546.084481] ? do_audit_syscall_entry (include/trace/events/syscalls.h:16)
[135546.085438] ? trace_buffer_unlock_commit (kernel/trace/trace.c:1749)
[135546.085515] rds_ib_laddr_check(): addr 36.74.25.172 ret -99 node type -1
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jason Wang [Wed, 5 Aug 2015 02:34:04 +0000 (10:34 +0800)]
virtio-net: drop NETIF_F_FRAGLIST
[ Upstream commit
48900cb6af4282fa0fb6ff4d72a81aa3dadb5c39 ]
virtio declares support for NETIF_F_FRAGLIST, but assumes
that there are at most MAX_SKB_FRAGS + 2 fragments which isn't
always true with a fraglist.
A longer fraglist in the skb will make the call to skb_to_sgvec overflow
the sg array, leading to memory corruption.
Drop NETIF_F_FRAGLIST so we only get what we can handle.
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Eric Dumazet [Tue, 10 Nov 2015 01:51:23 +0000 (17:51 -0800)]
net: fix a race in dst_release()
[ Upstream commit
d69bbf88c8d0b367cf3e3a052f6daadf630ee566 ]
Only cpu seeing dst refcount going to 0 can safely
dereference dst->flags.
Otherwise an other cpu might already have freed the dst.
Fixes: 27b75c95f10d ("net: avoid RCU for NOCACHE dst")
Reported-by: Greg Thelen <gthelen@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Francesco Ruggeri [Thu, 5 Nov 2015 16:16:14 +0000 (08:16 -0800)]
packet: race condition in packet_bind
[ Upstream commit
30f7ea1c2b5f5fb7462c5ae44fe2e40cb2d6a474 ]
There is a race conditions between packet_notifier and packet_bind{_spkt}.
It happens if packet_notifier(NETDEV_UNREGISTER) executes between the
time packet_bind{_spkt} takes a reference on the new netdevice and the
time packet_do_bind sets po->ifindex.
In this case the notification can be missed.
If this happens during a dev_change_net_namespace this can result in the
netdevice to be moved to the new namespace while the packet_sock in the
old namespace still holds a reference on it. When the netdevice is later
deleted in the new namespace the deletion hangs since the packet_sock
is not found in the new namespace' &net->packet.sklist.
It can be reproduced with the script below.
This patch makes packet_do_bind check again for the presence of the
netdevice in the packet_sock's namespace after the synchronize_net
in unregister_prot_hook.
More in general it also uses the rcu lock for the duration of the bind
to stop dev_change_net_namespace/rollback_registered_many from
going past the synchronize_net following unlist_netdevice, so that
no NETDEV_UNREGISTER notifications can happen on the new netdevice
while the bind is executing. In order to do this some code from
packet_bind{_spkt} is consolidated into packet_do_dev.
import socket, os, time, sys
proto=7
realDev='em1'
vlanId=400
if len(sys.argv) > 1:
vlanId=int(sys.argv[1])
dev='vlan%d' % vlanId
os.system('taskset -p 0x10 %d' % os.getpid())
s = socket.socket(socket.PF_PACKET, socket.SOCK_RAW, proto)
os.system('ip link add link %s name %s type vlan id %d' %
(realDev, dev, vlanId))
os.system('ip netns add dummy')
pid=os.fork()
if pid == 0:
# dev should be moved while packet_do_bind is in synchronize net
os.system('taskset -p 0x20000 %d' % os.getpid())
os.system('ip link set %s netns dummy' % dev)
os.system('ip netns exec dummy ip link del %s' % dev)
s.close()
sys.exit(0)
time.sleep(.004)
try:
s.bind(('%s' % dev, proto+1))
except:
print 'Could not bind socket'
s.close()
os.system('ip netns del dummy')
sys.exit(0)
os.waitpid(pid, 0)
s.close()
os.system('ip netns del dummy')
sys.exit(0)
Signed-off-by: Francesco Ruggeri <fruggeri@arista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Eric Dumazet [Mon, 2 Nov 2015 15:50:07 +0000 (07:50 -0800)]
net: avoid NULL deref in inet_ctl_sock_destroy()
[ Upstream commit
8fa677d2706d325d71dab91bf6e6512c05214e37 ]
Under low memory conditions, tcp_sk_init() and icmp_sk_init()
can both iterate on all possible cpus and call inet_ctl_sock_destroy(),
with eventual NULL pointer.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Ani Sinha [Fri, 30 Oct 2015 23:54:31 +0000 (16:54 -0700)]
ipmr: fix possible race resulting from improper usage of IP_INC_STATS_BH() in preemptible context.
[ Upstream commit
44f49dd8b5a606870a1f21101522a0f9c4414784 ]
Fixes the following kernel BUG :
BUG: using __this_cpu_add() in preemptible [
00000000] code: bash/2758
caller is __this_cpu_preempt_check+0x13/0x15
CPU: 0 PID: 2758 Comm: bash Tainted: P O 3.18.19 #2
ffffffff8170eaca ffff880110d1b788 ffffffff81482b2a 0000000000000000
0000000000000000 ffff880110d1b7b8 ffffffff812010ae ffff880007cab800
ffff88001a060800 ffff88013a899108 ffff880108b84240 ffff880110d1b7c8
Call Trace:
[<
ffffffff81482b2a>] dump_stack+0x52/0x80
[<
ffffffff812010ae>] check_preemption_disabled+0xce/0xe1
[<
ffffffff812010d4>] __this_cpu_preempt_check+0x13/0x15
[<
ffffffff81419d60>] ipmr_queue_xmit+0x647/0x70c
[<
ffffffff8141a154>] ip_mr_forward+0x32f/0x34e
[<
ffffffff8141af76>] ip_mroute_setsockopt+0xe03/0x108c
[<
ffffffff810553fc>] ? get_parent_ip+0x11/0x42
[<
ffffffff810e6974>] ? pollwake+0x4d/0x51
[<
ffffffff81058ac0>] ? default_wake_function+0x0/0xf
[<
ffffffff810553fc>] ? get_parent_ip+0x11/0x42
[<
ffffffff810613d9>] ? __wake_up_common+0x45/0x77
[<
ffffffff81486ea9>] ? _raw_spin_unlock_irqrestore+0x1d/0x32
[<
ffffffff810618bc>] ? __wake_up_sync_key+0x4a/0x53
[<
ffffffff8139a519>] ? sock_def_readable+0x71/0x75
[<
ffffffff813dd226>] do_ip_setsockopt+0x9d/0xb55
[<
ffffffff81429818>] ? unix_seqpacket_sendmsg+0x3f/0x41
[<
ffffffff813963fe>] ? sock_sendmsg+0x6d/0x86
[<
ffffffff813959d4>] ? sockfd_lookup_light+0x12/0x5d
[<
ffffffff8139650a>] ? SyS_sendto+0xf3/0x11b
[<
ffffffff810d5738>] ? new_sync_read+0x82/0xaa
[<
ffffffff813ddd19>] compat_ip_setsockopt+0x3b/0x99
[<
ffffffff813fb24a>] compat_raw_setsockopt+0x11/0x32
[<
ffffffff81399052>] compat_sock_common_setsockopt+0x18/0x1f
[<
ffffffff813c4d05>] compat_SyS_setsockopt+0x1a9/0x1cf
[<
ffffffff813c4149>] compat_SyS_socketcall+0x180/0x1e3
[<
ffffffff81488ea1>] cstar_dispatch+0x7/0x1e
Signed-off-by: Ani Sinha <ani@arista.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Phil Reid [Fri, 30 Oct 2015 08:43:55 +0000 (16:43 +0800)]
stmmac: Correctly report PTP capabilities.
[ Upstream commit
e6dbe1eb2db0d7a14991c06278dd3030c45fb825 ]
priv->hwts_*_en indicate if timestamping is enabled/disabled at run
time. But priv->dma_cap.time_stamp and priv->dma_cap.atime_stamp
indicates HW is support for PTPv1/PTPv2.
Signed-off-by: Phil Reid <preid@electromag.com.au>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Carol L Soto [Tue, 27 Oct 2015 15:36:20 +0000 (17:36 +0200)]
net/mlx4: Copy/set only sizeof struct mlx4_eqe bytes
[ Upstream commit
c02b05011fadf8e409e41910217ca689f2fc9d91 ]
When doing memcpy/memset of EQEs, we should use sizeof struct
mlx4_eqe as the base size and not caps.eqe_size which could be bigger.
If caps.eqe_size is bigger than the struct mlx4_eqe then we corrupt
data in the master context.
When using a 64 byte stride, the memcpy copied over 63 bytes to the
slave_eq structure. This resulted in copying over the entire eqe of
interest, including its ownership bit -- and also 31 bytes of garbage
into the next WQE in the slave EQ -- which did NOT include the ownership
bit (and therefore had no impact).
However, once the stride is increased to 128, we are overwriting the
ownership bits of *three* eqes in the slave_eq struct. This results
in an incorrect ownership bit for those eqes, which causes the eq to
seem to be full. The issue therefore surfaced only once 128-byte EQEs
started being used in SRIOV and (overarchitectures that have 128/256
byte cache-lines such as PPC) - e.g after commit
77507aa249ae
"net/mlx4_core: Enable CQE/EQE stride support".
Fixes: 08ff32352d6f ('mlx4: 64-byte CQE/EQE support')
Signed-off-by: Carol L Soto <clsoto@linux.vnet.ibm.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Sowmini Varadhan [Mon, 26 Oct 2015 16:46:37 +0000 (12:46 -0400)]
RDS-TCP: Recover correctly from pskb_pull()/pksb_trim() failure in rds_tcp_data_recv
[ Upstream commit
8ce675ff39b9958d1c10f86cf58e357efaafc856 ]
Either of pskb_pull() or pskb_trim() may fail under low memory conditions.
If rds_tcp_data_recv() ignores such failures, the application will
receive corrupted data because the skb has not been correctly
carved to the RDS datagram size.
Avoid this by handling pskb_pull/pskb_trim failure in the same
manner as the skb_clone failure: bail out of rds_tcp_data_recv(), and
retry via the deferred call to rds_send_worker() that gets set up on
ENOMEM from rds_tcp_read_sock()
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Guillaume Nault [Thu, 22 Oct 2015 14:57:10 +0000 (16:57 +0200)]
ppp: fix pppoe_dev deletion condition in pppoe_release()
[ Upstream commit
1acea4f6ce1b1c0941438aca75dd2e5c6b09db60 ]
We can't rely on PPPOX_ZOMBIE to decide whether to clear po->pppoe_dev.
PPPOX_ZOMBIE can be set by pppoe_disc_rcv() even when po->pppoe_dev is
NULL. So we have no guarantee that (sk->sk_state & PPPOX_ZOMBIE) implies
(po->pppoe_dev != NULL).
Since we're releasing a PPPoE socket, we want to release the pppoe_dev
if it exists and reset sk_state to PPPOX_DEAD, no matter the previous
value of sk_state. So we can just check for po->pppoe_dev and avoid any
assumption on sk->sk_state.
Fixes: 2b018d57ff18 ("pppoe: drop PPPOX_ZOMBIEs in pppoe_release")
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jason Wang [Fri, 23 Oct 2015 04:57:05 +0000 (00:57 -0400)]
macvtap: unbreak receiving of gro skb with frag list
[ Upstream commit
f23d538bc24a83c16127c2eb82c9cf1adc2b5149 ]
We don't have fraglist support in TAP_FEATURES. This will lead
software segmentation of gro skb with frag list. Fixes by having
frag list support in TAP_FEATURES.
With this patch single session of netperf receiving were restored from
about 5Gb/s to about 12Gb/s on mlx4.
Fixes
a567dd6252 ("macvtap: simplify usage of tap_features")
Cc: Vlad Yasevich <vyasevic@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Dan Carpenter [Mon, 19 Oct 2015 10:16:49 +0000 (13:16 +0300)]
irda: precedence bug in irlmp_seq_hb_idx()
[ Upstream commit
50010c20597d14667eff0fdb628309986f195230 ]
This is decrementing the pointer, instead of the value stored in the
pointer. KASan detects it as an out of bounds reference.
Reported-by: "Berry Cheng 程君(成淼)" <chengmiao.cj@alibaba-inc.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Greg Kroah-Hartman [Mon, 9 Nov 2015 20:50:58 +0000 (12:50 -0800)]
Linux 3.14.57
Dan Carpenter [Thu, 5 Feb 2015 07:37:33 +0000 (10:37 +0300)]
vhost/scsi: potential memory corruption
commit
59c816c1f24df0204e01851431d3bab3eb76719c upstream.
This code in vhost_scsi_make_tpg() is confusing because we limit "tpgt"
to UINT_MAX but the data type of "tpg->tport_tpgt" and that is a u16.
I looked at the context and it turns out that in
vhost_scsi_set_endpoint(), "tpg->tport_tpgt" is used as an offset into
the vs_tpg[] array which has VHOST_SCSI_MAX_TARGET (256) elements so
anything higher than 255 then it is invalid. I have made that the limit
now.
In vhost_scsi_send_evt() we mask away values higher than 255, but now
that the limit has changed, we don't need the mask.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Cc: Ray Yang <ray.yang@mediatek.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Greg Kroah-Hartman [Fri, 6 Nov 2015 19:07:07 +0000 (11:07 -0800)]
xen: fix backport of previous kexec patch
Fixes the backport of
0b34a166f291d255755be46e43ed5497cdd194f2 upstream
Commit
0b34a166f291d255755be46e43ed5497cdd194f2 "x86/xen: Support
kexec/kdump in HVM guests by doing a soft reset" has been added to the
4.2-stable tree" needed to correct the CONFIG variable, as
CONFIG_KEXEC_CORE only showed up in 4.3.
Reported-by: David Vrabel <david.vrabel@citrix.com>
Reported-by: Luis Henriques <luis.henriques@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Soeren Grunewald [Thu, 11 Jun 2015 07:25:04 +0000 (09:25 +0200)]
serial: 8250_pci: Add support for 12 port Exar boards
commit
be32c0cf0462c36f482b5ddcff1d8371be1e183c upstream.
The Exar XR17V358 can also be combined with a XR17V354 chip to act as a
single 12 port chip. This works the same way as the combining two XR17V358
chips. But the reported device id then is 0x4358.
Signed-off-by: Soeren Grunewald <soeren.grunewald@desy.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Soeren Grunewald [Tue, 28 Apr 2015 14:29:49 +0000 (16:29 +0200)]
serial: 8250_pci: Add support for 16 port Exar boards
commit
96a5d18bc1338786fecac73599f1681f59a59a8e upstream.
The Exar XR17V358 chip usually provides only 8 ports. But two chips can be
combined to act as a single 16 port chip. Therefor one chip is configured
as master the second as slave by connecting the mode pin to VCC (master)
or GND (slave).
Then the master chip is reporting a different device-id depending on
whether a slave is detected or not. The UARTs 8-15 are addressed from
0x2000-0x3fff. So the offset of 0x400 from UART to UART can be used to
address all 16 ports as before.
See: https://www.exar.com/common/content/document.ashx?id=1587 page 11
Signed-off-by: Soeren Grunewald <soeren.grunewald@desy.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Roman Gushchin [Fri, 30 Oct 2015 23:53:50 +0000 (10:53 +1100)]
md/raid5: fix locking in handle_stripe_clean_event()
commit
b8a9d66d043ffac116100775a469f05f5158c16f upstream.
After commit
566c09c53455 ("raid5: relieve lock contention in get_active_stripe()")
__find_stripe() is called under conf->hash_locks + hash.
But handle_stripe_clean_event() calls remove_hash() under
conf->device_lock.
Under some cirscumstances the hash chain can be circuited,
and we get an infinite loop with disabled interrupts and locked hash
lock in __find_stripe(). This leads to hard lockup on multiple CPUs
and following system crash.
I was able to reproduce this behavior on raid6 over 6 ssd disks.
The devices_handle_discard_safely option should be set to enable trim
support. The following script was used:
for i in `seq 1 32`; do
dd if=/dev/zero of=large$i bs=10M count=100 &
done
neilb: original was against a 3.x kernel. I forward-ported
to 4.3-rc. This verison is suitable for any kernel since
Commit:
59fc630b8b5f ("RAID5: batch adjacent full stripe write")
(v4.1+). I'll post a version for earlier kernels to stable.
Signed-off-by: Roman Gushchin <klamm@yandex-team.ru>
Fixes: 566c09c53455 ("raid5: relieve lock contention in get_active_stripe()")
Signed-off-by: NeilBrown <neilb@suse.com>
Cc: Shaohua Li <shli@kernel.org>
Cc: <stable@vger.kernel.org> # 3.13 - 4.2
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Doron Tsur [Sun, 11 Oct 2015 12:58:17 +0000 (15:58 +0300)]
IB/cm: Fix rb-tree duplicate free and use-after-free
commit
0ca81a2840f77855bbad1b9f172c545c4dc9e6a4 upstream.
ib_send_cm_sidr_rep could sometimes erase the node from the sidr
(depending on errors in the process). Since ib_send_cm_sidr_rep is
called both from cm_sidr_req_handler and cm_destroy_id, cm_id_priv
could be either erased from the rb_tree twice or not erased at all.
Fixing that by making sure it's erased only once before freeing
cm_id_priv.
Fixes: a977049dacde ('[PATCH] IB: Add the kernel CM implementation')
Signed-off-by: Doron Tsur <doront@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Dāvis Mosāns [Fri, 21 Aug 2015 04:29:22 +0000 (07:29 +0300)]
mvsas: Fix NULL pointer dereference in mvs_slot_task_free
commit
2280521719e81919283b82902ac24058f87dfc1b upstream.
When pci_pool_alloc fails in mvs_task_prep then task->lldd_task stays
NULL but it's later used in mvs_abort_task as slot which is passed
to mvs_slot_task_free causing NULL pointer dereference.
Just return from mvs_slot_task_free when passed with NULL slot.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=101891
Signed-off-by: Dāvis Mosāns <davispuh@gmail.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: James Bottomley <JBottomley@Odin.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
NeilBrown [Sat, 31 Oct 2015 00:00:56 +0000 (11:00 +1100)]
Revert "md: allow a partially recovered device to be hot-added to an array."
commit
d01552a76d71f9879af448e9142389ee9be6e95b upstream.
This reverts commit
7eb418851f3278de67126ea0c427641ab4792c57.
This commit is poorly justified, I can find not discusison in email,
and it clearly causes a problem.
If a device which is being recovered fails and is subsequently
re-added to an array, there could easily have been changes to the
array *before* the point where the recovery was up to. So the
recovery must start again from the beginning.
If a spare is being recovered and fails, then when it is re-added we
really should do a bitmap-based recovery up to the recovery-offset,
and then a full recovery from there. Before this reversion, we only
did the "full recovery from there" which is not corect. After this
reversion with will do a full recovery from the start, which is safer
but not ideal.
It will be left to a future patch to arrange the two different styles
of recovery.
Reported-and-tested-by: Nate Dailey <nate.dailey@stratus.com>
Signed-off-by: NeilBrown <neilb@suse.com>
Fixes: 7eb418851f32 ("md: allow a partially recovered device to be hot-added to an array.")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jes Sorensen [Tue, 20 Oct 2015 16:09:13 +0000 (12:09 -0400)]
md/raid10: submit_bio_wait() returns 0 on success
commit
681ab4696062f5aa939c9e04d058732306a97176 upstream.
This was introduced with
9e882242c6193ae6f416f2d8d8db0d9126bd996b
which changed the return value of submit_bio_wait() to return != 0 on
error, but didn't update the caller accordingly.
Fixes: 9e882242c6 ("block: Add submit_bio_wait(), remove from md")
Reported-by: Bill Kuzeja <William.Kuzeja@stratus.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jes Sorensen [Tue, 20 Oct 2015 16:09:12 +0000 (12:09 -0400)]
md/raid1: submit_bio_wait() returns 0 on success
commit
203d27b0226a05202438ddb39ef0ef1acb14a759 upstream.
This was introduced with
9e882242c6193ae6f416f2d8d8db0d9126bd996b
which changed the return value of submit_bio_wait() to return != 0 on
error, but didn't update the caller accordingly.
Fixes: 9e882242c6 ("block: Add submit_bio_wait(), remove from md")
Reported-by: Bill Kuzeja <William.Kuzeja@stratus.com>
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Herbert Xu [Mon, 19 Oct 2015 10:23:57 +0000 (18:23 +0800)]
crypto: api - Only abort operations on fatal signal
commit
3fc89adb9fa4beff31374a4bf50b3d099d88ae83 upstream.
Currently a number of Crypto API operations may fail when a signal
occurs. This causes nasty problems as the caller of those operations
are often not in a good position to restart the operation.
In fact there is currently no need for those operations to be
interrupted by user signals at all. All we need is for them to
be killable.
This patch replaces the relevant calls of signal_pending with
fatal_signal_pending, and wait_for_completion_interruptible with
wait_for_completion_killable, respectively.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Peter Zijlstra [Thu, 20 Aug 2015 01:04:59 +0000 (10:34 +0930)]
module: Fix locking in symbol_put_addr()
commit
275d7d44d802ef271a42dc87ac091a495ba72fc5 upstream.
Poma (on the way to another bug) reported an assertion triggering:
[<
ffffffff81150529>] module_assert_mutex_or_preempt+0x49/0x90
[<
ffffffff81150822>] __module_address+0x32/0x150
[<
ffffffff81150956>] __module_text_address+0x16/0x70
[<
ffffffff81150f19>] symbol_put_addr+0x29/0x40
[<
ffffffffa04b77ad>] dvb_frontend_detach+0x7d/0x90 [dvb_core]
Laura Abbott <labbott@redhat.com> produced a patch which lead us to
inspect symbol_put_addr(). This function has a comment claiming it
doesn't need to disable preemption around the module lookup
because it holds a reference to the module it wants to find, which
therefore cannot go away.
This is wrong (and a false optimization too, preempt_disable() is really
rather cheap, and I doubt any of this is on uber critical paths,
otherwise it would've retained a pointer to the actual module anyway and
avoided the second lookup).
While its true that the module cannot go away while we hold a reference
on it, the data structure we do the lookup in very much _CAN_ change
while we do the lookup. Therefore fix the comment and add the
required preempt_disable().
Reported-by: poma <pomidorabelisima@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Fixes: a6e6abd575fc ("module: remove module_text_address()")
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cathy Avery [Fri, 2 Oct 2015 13:35:01 +0000 (09:35 -0400)]
xen-blkfront: check for null drvdata in blkback_changed (XenbusStateClosing)
commit
a54c8f0f2d7df525ff997e2afe71866a1a013064 upstream.
xen-blkfront will crash if the check to talk_to_blkback()
in blkback_changed()(XenbusStateInitWait) returns an error.
The driver data is freed and info is set to NULL. Later during
the close process via talk_to_blkback's call to xenbus_dev_fatal()
the null pointer is passed to and dereference in blkfront_closing.
Signed-off-by: Cathy Avery <cathy.avery@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Laura Abbott [Mon, 12 Oct 2015 08:30:13 +0000 (11:30 +0300)]
xhci: Add spurious wakeup quirk for LynxPoint-LP controllers
commit
fd7cd061adcf5f7503515ba52b6a724642a839c8 upstream.
We received several reports of systems rebooting and powering on
after an attempted shutdown. Testing showed that setting
XHCI_SPURIOUS_WAKEUP quirk in addition to the XHCI_SPURIOUS_REBOOT
quirk allowed the system to shutdown as expected for LynxPoint-LP
xHCI controllers. Set the quirk back.
Note that the quirk was originally introduced for LynxPoint and
LynxPoint-LP just for this same reason. See:
commit
638298dc66ea ("xhci: Fix spurious wakeups after S5 on Haswell")
It was later limited to only concern HP machines as it caused
regression on some machines, see both bug and commit:
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=66171
commit
6962d914f317 ("xhci: Limit the spurious wakeup fix only to HP machines")
Later it was discovered that the powering on after shutdown
was limited to LynxPoint-LP (Haswell-ULT) and that some non-LP HP
machine suffered from spontaneous resume from S3 (which should
not be related to the SPURIOUS_WAKEUP quirk at all). An attempt
to fix this then removed the SPURIOUS_WAKEUP flag usage completely.
commit
b45abacde3d5 ("xhci: no switching back on non-ULT Haswell")
Current understanding is that LynxPoint-LP (Haswell ULT) machines
need the SPURIOUS_WAKEUP quirk, otherwise they will restart, and
plain Lynxpoint (Haswell) machines may _not_ have the quirk
set otherwise they again will restart.
Signed-off-by: Laura Abbott <labbott@fedoraproject.org>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Oliver Neukum <oneukum@suse.com>
[Added more history to commit message -Mathias]
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Mathias Nyman [Mon, 12 Oct 2015 08:30:12 +0000 (11:30 +0300)]
xhci: handle no ping response error properly
commit
3b4739b8951d650becbcd855d7d6f18ac98a9a85 upstream.
If a host fails to wake up a isochronous SuperSpeed device from U1/U2
in time for a isoch transfer it will generate a "No ping response error"
Host will then move to the next transfer descriptor.
Handle this case in the same way as missed service errors, tag the
current TD as skipped and handle it on the next transfer event.
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Mike Snitzer [Thu, 22 Oct 2015 14:56:40 +0000 (10:56 -0400)]
dm btree: fix leak of bufio-backed block in btree_split_beneath error path
commit
4dcb8b57df3593dcb20481d9d6cf79d1dc1534be upstream.
btree_split_beneath()'s error path had an outstanding FIXME that speaks
directly to the potential for _not_ cleaning up a previously allocated
bufio-backed block.
Fix this by releasing the previously allocated bufio block using
unlock_block().
Reported-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Acked-by: Joe Thornber <thornber@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Joe Thornber [Wed, 21 Oct 2015 17:36:49 +0000 (18:36 +0100)]
dm btree remove: fix a bug when rebalancing nodes after removal
commit
2871c69e025e8bc507651d5a9cf81a8a7da9d24b upstream.
Commit
4c7e309340ff ("dm btree remove: fix bug in redistribute3") wasn't
a complete fix for redistribute3().
The redistribute3 function takes 3 btree nodes and shares out the entries
evenly between them. If the three nodes in total contained
(MAX_ENTRIES * 3) - 1 entries between them then this was erroneously getting
rebalanced as (MAX_ENTRIES - 1) on the left and right, and (MAX_ENTRIES + 1) in
the center.
Fix this issue by being more careful about calculating the target number
of entries for the left and right nodes.
Unit tested in userspace using this program:
https://github.com/jthornber/redistribute3-test/blob/master/redistribute3_t.c
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Will Deacon [Wed, 28 Oct 2015 16:56:13 +0000 (16:56 +0000)]
Revert "ARM64: unwind: Fix PC calculation"
commit
9702970c7bd3e2d6fecb642a190269131d4ac16c upstream.
This reverts commit
e306dfd06fcb44d21c80acb8e5a88d55f3d1cf63.
With this patch applied, we were the only architecture making this sort
of adjustment to the PC calculation in the unwinder. This causes
problems for ftrace, where the PC values are matched against the
contents of the stack frames in the callchain and fail to match any
records after the address adjustment.
Whilst there has been some effort to change ftrace to workaround this,
those patches are not yet ready for mainline and, since we're the odd
architecture in this regard, let's just step in line with other
architectures (like arch/arm/) for now.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Ilya Dryomov [Sun, 11 Oct 2015 17:38:00 +0000 (19:38 +0200)]
rbd: prevent kernel stack blow up on rbd map
commit
6d69bb536bac0d403d83db1ca841444981b280cd upstream.
Mapping an image with a long parent chain (e.g. image foo, whose parent
is bar, whose parent is baz, etc) currently leads to a kernel stack
overflow, due to the following recursion in the reply path:
rbd_osd_req_callback()
rbd_obj_request_complete()
rbd_img_obj_callback()
rbd_img_parent_read_callback()
rbd_obj_request_complete()
...
Limit the parent chain to 16 images, which is ~5K worth of stack. When
the above recursion is eliminated, this limit can be lifted.
Fixes: http://tracker.ceph.com/issues/12538
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
[idryomov@gmail.com: backport to 3.14: rbd_dev->opts, context]
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Ilya Dryomov [Sun, 11 Oct 2015 17:38:00 +0000 (19:38 +0200)]
rbd: don't leak parent_spec in rbd_dev_probe_parent()
commit
1f2c6651f69c14d0d3a9cfbda44ea101b02160ba upstream.
Currently we leak parent_spec and trigger a "parent reference
underflow" warning if rbd_dev_create() in rbd_dev_probe_parent() fails.
The problem is we take the !parent out_err branch and that only drops
refcounts; parent_spec that would've been freed had we called
rbd_dev_unparent() remains and triggers rbd_warn() in
rbd_dev_parent_put() - at that point we have parent_spec != NULL and
parent_ref == 0, so counter ends up being -1 after the decrement.
Redo rbd_dev_probe_parent() to fix this.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
[idryomov@gmail.com: backport to < 4.2: rbd_dev->opts]
Reviewed-by: Alex Elder <elder@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Ronny Hegewald [Thu, 15 Oct 2015 18:50:46 +0000 (18:50 +0000)]
rbd: require stable pages if message data CRCs are enabled
commit
bae818ee1577c27356093901a0ea48f672eda514 upstream.
rbd requires stable pages, as it performs a crc of the page data before
they are send to the OSDs.
But since kernel 3.9 (patch
1d1d1a767206fbe5d4c69493b7e6d2a8d08cc0a0
"mm: only enforce stable page writes if the backing device requires
it") it is not assumed anymore that block devices require stable pages.
This patch sets the necessary flag to get stable pages back for rbd.
In a ceph installation that provides multiple ext4 formatted rbd
devices "bad crc" messages appeared regularly (ca 1 message every 1-2
minutes on every OSD that provided the data for the rbd) in the
OSD-logs before this patch. After this patch this messages are pretty
much gone (only ca 1-2 / month / OSD).
Signed-off-by: Ronny Hegewald <Ronny.Hegewald@online.de>
[idryomov@gmail.com: require stable pages only in crc case, changelog]
[idryomov@gmail.com: backport to 3.9-3.17: context]
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Alexandre Belloni [Wed, 7 Oct 2015 11:10:54 +0000 (13:10 +0200)]
iio: mxs-lradc: Fix temperature offset
commit
b94e22805a2224061bb263a82b72e09544a5fbb3 upstream.
0° Kelvin is actually −273.15°C, not -272.15°C. Fix the temperature offset.
Also improve the comment explaining the calculation.
Reported-by: Janusz Użycki <j.uzycki@elpromaelectronics.com>
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Acked-by: Stefan Wahren <stefan.wahren@i2se.com>
Acked-by: Marek Vasut <marex@denx.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>