Jiri Pirko [Sat, 8 Jun 2013 13:00:54 +0000 (15:00 +0200)]
team: move add to port list before port enablement
[ Upstream commit 72df935d985c1575ed44ad2c8c653b28147993fa ]
team_port_enable() adds the port to port_hashlist. A reader sees the port
in team_get_port_by_index_rcu() and returns it, but
team_get_first_port_txable_rcu() tries to go through port_list, where the
port is not inserted yet -> NULL pointer dereference.
Fix this by reordering port_list and port_hashlist insertion.
Panic is easily triggerable when txing packets and adding/removing port
in a loop.
Introduced by commit 3d249d4c "net: introduce ethernet teaming device"
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jiri Pirko [Sat, 8 Jun 2013 13:00:53 +0000 (15:00 +0200)]
team: check return value of team_get_port_by_index_rcu() for NULL
[ Upstream commit 76c455decbbad31de21c727edb184a963f42b40b ]
team_get_port_by_index_rcu() might return NULL due to a race between port
removal and the skb tx path. Panic is easily triggerable when txing packets
and adding/removing port in a loop.
Introduced by commit 3d249d4ca "net: introduce ethernet teaming device"
and commit 753f993911b "team: introduce random mode" (for random mode)
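The NULL check named in the subject line can be pictured with a small user-space model. This is only a sketch: `struct team_model`, `get_port_by_index()` and `transmit_on()` are hypothetical stand-ins rather than the driver's code, and a fixed array replaces the RCU-protected hash list.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the team driver's port table. */
struct port { int id; };

struct team_model {
    struct port *ports[4];   /* NULL means "no port at this index" */
};

/* Lookup that may legitimately return NULL while a port is being removed. */
static struct port *get_port_by_index(const struct team_model *t, size_t idx)
{
    return idx < 4 ? t->ports[idx] : NULL;
}

/* Returns the id of the port used for tx, or -1 if the packet is dropped. */
static int transmit_on(const struct team_model *t, size_t idx)
{
    struct port *p = get_port_by_index(t, idx);

    if (p == NULL)      /* the check the subject line asks for */
        return -1;
    return p->id;
}
```

Dropping the packet on a NULL lookup is an assumption of this model; the message itself only says the return value has to be checked.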
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jason Wang [Sat, 8 Jun 2013 06:17:41 +0000 (14:17 +0800)]
tuntap: set SOCK_ZEROCOPY flag during open
[ Upstream commit 19a6afb23e5d323e1245baa4e62755492b2f1200 ]
Commit 54f968d6efdbf7dec36faa44fc11f01b0e4d1990
(tuntap: move socket to tun_file) forgets to set SOCK_ZEROCOPY flag, which will
prevent vhost_net from doing zerocopy w/ tap. This patch fixes this by setting
it during file open.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Daniel Borkmann [Thu, 6 Jun 2013 13:53:47 +0000 (15:53 +0200)]
net: sctp: fix NULL pointer dereference in socket destruction
[ Upstream commit 1abd165ed757db1afdefaac0a4bc8a70f97d258c ]
While stress testing sctp sockets, I hit the following panic:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
IP: [<ffffffffa0490c4e>] sctp_endpoint_free+0xe/0x40 [sctp]
PGD 7cead067 PUD 7ce76067 PMD 0
Oops: 0000 [#1] SMP
Modules linked in: sctp(F) libcrc32c(F) [...]
CPU: 7 PID: 2950 Comm: acc Tainted: GF 3.10.0-rc2+ #1
Hardware name: Dell Inc. PowerEdge T410/0H19HD, BIOS 1.6.3 02/01/2011
task: ffff88007ce0e0c0 ti: ffff88007b568000 task.ti: ffff88007b568000
RIP: 0010:[<ffffffffa0490c4e>] [<ffffffffa0490c4e>] sctp_endpoint_free+0xe/0x40 [sctp]
RSP: 0018:ffff88007b569e08 EFLAGS: 00010292
RAX: 0000000000000000 RBX: ffff88007db78a00 RCX: dead000000200200
RDX: ffffffffa049fdb0 RSI: ffff8800379baf38 RDI: 0000000000000000
RBP: ffff88007b569e18 R08: ffff88007c230da0 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: ffff880077990d00 R14: 0000000000000084 R15: ffff88007db78a00
FS: 00007fc18ab61700(0000) GS:ffff88007fc60000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000020 CR3: 000000007cf9d000 CR4: 00000000000007e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Stack:
ffff88007b569e38 ffff88007db78a00 ffff88007b569e38 ffffffffa049fded
ffffffff81abf0c0 ffff88007db78a00 ffff88007b569e58 ffffffff8145b60e
0000000000000000 0000000000000000 ffff88007b569eb8 ffffffff814df36e
Call Trace:
[<ffffffffa049fded>] sctp_destroy_sock+0x3d/0x80 [sctp]
[<ffffffff8145b60e>] sk_common_release+0x1e/0xf0
[<ffffffff814df36e>] inet_create+0x2ae/0x350
[<ffffffff81455a6f>] __sock_create+0x11f/0x240
[<ffffffff81455bf0>] sock_create+0x30/0x40
[<ffffffff8145696c>] SyS_socket+0x4c/0xc0
[<ffffffff815403be>] ? do_page_fault+0xe/0x10
[<ffffffff8153cb32>] ? page_fault+0x22/0x30
[<ffffffff81544e02>] system_call_fastpath+0x16/0x1b
Code: 0c c9 c3 66 2e 0f 1f 84 00 00 00 00 00 e8 fb fe ff ff c9 c3 66 0f
1f 84 00 00 00 00 00 55 48 89 e5 53 48 83 ec 08 66 66 66 66 90 <48>
8b 47 20 48 89 fb c6 47 1c 01 c6 40 12 07 e8 9e 68 01 00 48
RIP [<ffffffffa0490c4e>] sctp_endpoint_free+0xe/0x40 [sctp]
RSP <ffff88007b569e08>
CR2: 0000000000000020
---[ end trace e0d71ec1108c1dd9 ]---
I did not hit this with the lksctp-tools functional tests, but with a
small, multi-threaded test program, that heavily allocates, binds,
listens and waits in accept on sctp sockets, and then randomly kills
some of them (no need for an actual client in this case to hit this).
Then, again, allocating, binding, etc, and then killing child processes.
This panic then only occurs when ``echo 1 > /proc/sys/net/sctp/auth_enable''
is set. The cause for that is actually very simple: in sctp_endpoint_init()
we enter the path of sctp_auth_init_hmacs(). There, we try to allocate
our crypto transforms through crypto_alloc_hash(). In our scenario,
it then can happen that crypto_alloc_hash() fails with -EINTR from
crypto_larval_wait(), thus we bail out and release the socket via
sk_common_release(), sctp_destroy_sock() and hit the NULL pointer
dereference as soon as we try to access members in the endpoint during
sctp_endpoint_free(), since endpoint at that time is still NULL. Now,
if we have that case, we do not need to do any cleanup work and just
leave the destruction handler.
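The early exit described above can be sketched in user space as follows. The types and names (`struct endpoint_model`, `struct sock_model`, `destroy_sock_model()`) are hypothetical and far simpler than the sctp structures; the sketch only shows cleanup being skipped when the endpoint was never set up.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the socket and its endpoint. */
struct endpoint_model { int freed; };
struct sock_model { struct endpoint_model *ep; };

/* Returns 1 if an endpoint was released, 0 if there was nothing to do. */
static int destroy_sock_model(struct sock_model *sk)
{
    /* Initialisation failed before the endpoint existed: no cleanup work. */
    if (sk->ep == NULL)
        return 0;

    sk->ep->freed = 1;   /* stands in for the real endpoint teardown */
    sk->ep = NULL;
    return 1;
}
```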
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jason Wang [Wed, 5 Jun 2013 07:40:46 +0000 (15:40 +0800)]
vhost_net: clear msg.control for non-zerocopy case during tx
[ Upstream commit 4364d5f96eed7994a2c625bd9216656e55fba0cb ]
When we decide not to use zero-copy, msg.control should be set to NULL; otherwise
macvtap/tap may set zerocopy callbacks which may decrease the kref of ubufs
wrongly.
The bug was introduced by commit cedb9bdce099206290a2bdd02ce47a7b253b6a84
(vhost-net: skip head management if no outstanding).
This solves the following warnings:
WARNING: at include/linux/kref.h:47 handle_tx+0x477/0x4b0 [vhost_net]()
Modules linked in: vhost_net macvtap macvlan tun nfsd exportfs bridge stp llc openvswitch kvm_amd kvm bnx2 megaraid_sas [last unloaded: tun]
CPU: 5 PID: 8670 Comm: vhost-8668 Not tainted 3.10.0-rc2+ #1566
Hardware name: Dell Inc. PowerEdge R715/00XHKG, BIOS 1.5.2 04/19/2011
ffffffffa0198323 ffff88007c9ebd08 ffffffff81796b73 ffff88007c9ebd48
ffffffff8103d66b 000000007b773e20 ffff8800779f0000 ffff8800779f43f0
ffff8800779f8418 000000000000015c 0000000000000062 ffff88007c9ebd58
Call Trace:
[<ffffffff81796b73>] dump_stack+0x19/0x1e
[<ffffffff8103d66b>] warn_slowpath_common+0x6b/0xa0
[<ffffffff8103d6b5>] warn_slowpath_null+0x15/0x20
[<ffffffffa0197627>] handle_tx+0x477/0x4b0 [vhost_net]
[<ffffffffa0197690>] handle_tx_kick+0x10/0x20 [vhost_net]
[<ffffffffa019541e>] vhost_worker+0xfe/0x1a0 [vhost_net]
[<ffffffffa0195320>] ? vhost_attach_cgroups_work+0x30/0x30 [vhost_net]
[<ffffffffa0195320>] ? vhost_attach_cgroups_work+0x30/0x30 [vhost_net]
[<ffffffff81061f46>] kthread+0xc6/0xd0
[<ffffffff81061e80>] ? kthread_freezable_should_stop+0x70/0x70
[<ffffffff817a1aec>] ret_from_fork+0x7c/0xb0
[<ffffffff81061e80>] ? kthread_freezable_should_stop+0x70/0x70
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Eric Dumazet [Tue, 4 Jun 2013 07:11:48 +0000 (07:11 +0000)]
net_sched: htb: do not mix 1ns and 64ns time units
[ Upstream commit 5343a7f8be11951cb3095b91e8e4eb506cfacc0f ]
commit 56b765b79 ("htb: improved accuracy at high rates") added another
regression for low rates, because it mixes 1ns and 64ns time units.
So the maximum delay (mbuffer) was not 60 seconds, but 937 ms.
Let's convert all time fields to 1ns as 64bit arches are becoming the
norm.
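One plausible reading of the 937 ms figure is that a 60 second limit expressed in 64 ns ticks was later interpreted as nanoseconds, which shrinks it by a factor of 64. The sketch below only reproduces that arithmetic; the names `ns_to_ticks()` and `misread_limit_ms()` are illustrative and not taken from the scheduler code.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL
#define TICK_SHIFT   6   /* assumption: one coarse tick is 64 ns (1 << 6) */

/* Convert nanoseconds to 64 ns ticks. */
static uint64_t ns_to_ticks(uint64_t ns)
{
    return ns >> TICK_SHIFT;
}

/* A 60 s limit, stored as ticks but misread as nanoseconds, in whole ms. */
static uint64_t misread_limit_ms(void)
{
    uint64_t ticks = ns_to_ticks(60 * NSEC_PER_SEC);   /* 937,500,000 */

    return ticks / (NSEC_PER_SEC / 1000);              /* ns per ms */
}
```

Under these assumptions 60 s / 64 = 937.5 ms, which truncates to the 937 ms quoted in the message.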
Reported-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Tested-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Gao feng [Sun, 2 Jun 2013 22:16:21 +0000 (22:16 +0000)]
ipv6: assign rt6_info to inet6_ifaddr in init_loopback
[ Upstream commit 534c877928a16ae5f9776436a497109639bf67dc ]
Commit 25fb6ca4ed9cad72f14f61629b68dc03c0d9713f
"net IPv6 : Fix broken IPv6 routing table after loopback down-up"
forgot to assign rt6_info to the inet6_ifaddr.
When the net device is disabled, the rt6_info which was allocated
in init_loopback will not be destroyed in __ipv6_ifa_notify.
This will trigger the warning message below:
[23527.916091] unregister_netdevice: waiting for tap0 to become free. Usage count = 1
Reported-by: Arkadiusz Miskiewicz <a.miskiewicz@gmail.com>
Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Eric Dumazet [Sun, 2 Jun 2013 13:55:05 +0000 (13:55 +0000)]
net_sched: restore "overhead xxx" handling
[ Upstream commit 01cb71d2d47b78354358e4bb938bb06323e17498 ]
commit 56b765b79 ("htb: improved accuracy at high rates")
broke the "overhead xxx" handling, as well as the "linklayer atm"
attribute.
tc class add ... htb rate X ceil Y linklayer atm overhead 10
This patch restores the "overhead xxx" handling, for htb, tbf
and act_police.
The "linklayer atm" thing needs a separate fix.
Reported-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Vimalkumar <j.vimal@gmail.com>
Cc: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Eric Dumazet [Wed, 29 May 2013 09:06:27 +0000 (09:06 +0000)]
net: force a reload of first item in hlist_nulls_for_each_entry_rcu
[ Upstream commit c87a124a5d5e8cf8e21c4363c3372bcaf53ea190 ]
Roman Gushchin discovered that udp4_lib_lookup2() was not reloading the
first item in the rcu protected list, in case the loop was restarted.
This produced soft lockups as in https://lkml.org/lkml/2013/4/16/37
rcu_dereference(X)/ACCESS_ONCE(X) seem to not work as intended if X is
ptr->field:
In some cases, gcc caches the value of ptr->field in a register.
Use a barrier() to disallow such caching, as documented in
Documentation/atomic_ops.txt line 114
Thanks a lot to Roman for providing analysis and numerous patches.
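A compiler barrier placed in front of the list walk is one way to picture the fix. The sketch below is a single-threaded user-space model, assuming a GCC or Clang style inline-asm barrier; `struct bucket`, `lookup()` and the `max_restarts` parameter are hypothetical, and the restart condition is artificial (the real loop restarts for reasons not modelled here).

```c
#include <stddef.h>

/* GCC/Clang compiler barrier: forbids caching memory values across it. */
#define barrier() __asm__ __volatile__("" ::: "memory")

struct node { int key; struct node *next; };
struct bucket { struct node *first; };

/* Walk the bucket, restarting up to max_restarts times on a miss. */
static struct node *lookup(struct bucket *b, int key, int max_restarts)
{
    struct node *n;
    int restarts = 0;

begin:
    barrier();   /* force b->first to be re-read from memory on each restart */
    for (n = b->first; n != NULL; n = n->next) {
        if (n->key == key)
            return n;
    }
    if (restarts++ < max_restarts)
        goto begin;
    return NULL;
}
```

Without the barrier, a compiler would be free to keep `b->first` in a register across the `goto`, which is the caching the message describes.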
Diagnosed-by: Roman Gushchin <klamm@yandex-team.ru>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Boris Zhmurov <zhmurov@yandex-team.ru>
Signed-off-by: Roman Gushchin <klamm@yandex-team.ru>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pravin B Shelar [Thu, 30 May 2013 06:45:27 +0000 (06:45 +0000)]
udp6: Fix udp fragmentation for tunnel traffic.
[ Upstream commit 1e2bd517c108816220f262d7954b697af03b5f9c ]
udp6 over a GRE tunnel does not work after the GRE TSO changes. The GRE
TSO handler passes the inner packet but keeps track of the outer header
start in SKB_GSO_CB(skb)->mac_offset. udp6 fragmentation needs to
take care of the outer header, which starts at the mac_offset, while
adding the fragment header.
This bug was introduced by commit 68c3316311 (GRE: Add TCP
segmentation offload for GRE).
Reported-by: Dmitry Kravkov <dkravkov@gmail.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Tested-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jason Wang [Tue, 28 May 2013 18:32:11 +0000 (18:32 +0000)]
tuntap: forbid changing mq flag for persistent device
[ Upstream commit 8e6d91ae0917bf934ed86411148f79d904728d51 ]
We currently allow changing the mq flag (IFF_MULTI_QUEUE) for a persistent
device. This will result in a mismatch between the number of queues in netdev and
tuntap. This is because we only allocate a 1q netdevice when IFF_MULTI_QUEUE was
not specified, so when we set IFF_MULTI_QUEUE and try to attach more queues
later, netif_set_real_num_tx_queues() may fail, which results in a single-queue
netdevice with multiple sockets attached.
Solve this by disallowing changing the mq flag for a persistent device.
Bug was introduced by commit edfb6a148ce62e5e19354a1dcd9a34e00815c2a1
(tuntap: reduce memory using of queues).
Reported-by: Sriram Narasimhan <sriram.narasimhan@hp.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Federico Vaga [Tue, 28 May 2013 05:02:44 +0000 (05:02 +0000)]
net/core/sock.c: add missing VSOCK string in af_family_*_key_strings
[ Upstream commit 456db6a4d495f40777da6f1f32f62f13026f52db ]
The three arrays of strings: af_family_key_strings,
af_family_slock_key_strings and af_family_clock_key_strings are missing
VSOCK's string.
Signed-off-by: Federico Vaga <federico.vaga@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Andy Lutomirski [Wed, 22 May 2013 21:07:44 +0000 (14:07 -0700)]
net: Block MSG_CMSG_COMPAT in send(m)msg and recv(m)msg
[ Upstream commits 1be374a0518a288147c6a7398792583200a67261 and
a7526eb5d06b0084ef12d7b168d008fcf516caab ]
MSG_CMSG_COMPAT is (AFAIK) not intended to be part of the API --
it's a hack that steals a bit to indicate to other networking code
that a compat entry was used. So don't allow it from a non-compat
syscall.
This prevents an oops when running this code:
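#include <err.h>
#include <stdio.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/mman.h>
#include <sys/socket.h>
#include <sys/syscall.h>
/*
 * TASK_SIZE_MAX, COMPAT_MSGHDR_SIZE and MSG_CMSG_COMPAT are kernel-side
 * names that this message does not define; user-space headers may not
 * provide them, so they have to be supplied for the kernel under test.
 */
int main()
{
    int s;
    struct sockaddr_in addr;
    struct msghdr *hdr;
    /* Map the last page below the top of the user address space. */
    char *highpage = mmap((void*)(TASK_SIZE_MAX - 4096), 4096,
                          PROT_READ | PROT_WRITE,
                          MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    if (highpage == MAP_FAILED)
        err(1, "mmap");
    s = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
    if (s == -1)
        err(1, "socket");
    addr.sin_family = AF_INET;
    addr.sin_port = htons(1);
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    if (connect(s, (struct sockaddr*)&addr, sizeof(addr)) != 0)
        err(1, "connect");
    /* Place a compat-sized msghdr so that it ends exactly at the page end. */
    void *evil = highpage + 4096 - COMPAT_MSGHDR_SIZE;
    printf("Evil address is %p\n", evil);
    /* Pass the compat flag through a non-compat syscall entry. */
    if (syscall(__NR_sendmmsg, s, evil, 1, MSG_CMSG_COMPAT) < 0)
        err(1, "sendmmsg");
    return 0;
}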
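The rule stated above (do not accept MSG_CMSG_COMPAT from a non-compat syscall) amounts to a flag test at the native entry points. The following is a sketch of that test only: `MODEL_MSG_CMSG_COMPAT` is a stand-in constant chosen for illustration rather than a value taken from a header, `native_entry_check()` is a hypothetical helper, and returning -EINVAL is an assumption about how the rejection is reported.

```c
#include <errno.h>

/* Assumption: illustrative stand-in for the kernel-internal flag bit. */
#define MODEL_MSG_CMSG_COMPAT 0x80000000u

/* Check applied on a non-compat entry point before any message parsing. */
static int native_entry_check(unsigned int flags)
{
    if (flags & MODEL_MSG_CMSG_COMPAT)
        return -EINVAL;   /* user space may not claim to be a compat caller */
    return 0;
}
```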
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Michal Kubecek [Tue, 28 May 2013 06:26:49 +0000 (08:26 +0200)]
ipv4: fix redirect handling for TCP packets
[ Upstream commit f96ef988cc603487c03a6de07807b06cbe641829 ]
Unlike ipv4_redirect() and ipv4_sk_redirect(), ip_do_redirect()
doesn't call __build_flow_key() directly but via
ip_rt_build_flow_key() wrapper. This leads to __build_flow_key()
getting pointer to IPv4 header of the ICMP redirect packet
rather than pointer to the embedded IPv4 header of the packet
initiating the redirect.
As a result, handling of ICMP redirects initiated by TCP packets
is broken. Issue was introduced by 4895c771c ("ipv4: Add FIB nexthop exceptions.")
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Giuseppe CAVALLARO [Sun, 26 May 2013 21:31:28 +0000 (21:31 +0000)]
net: phy: fix a bug when verify the EEE support
[ Upstream commit 9a9c56cb34e65000d1f0a4b7553399bfcf7c5a52 ]
phy_init_eee() has to exit with an error when the
local device and its link partner both do not support EEE.
This patch fixes the check that verifies this.
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Eric Dumazet [Fri, 24 May 2013 05:49:58 +0000 (05:49 +0000)]
ip_tunnel: fix kernel panic with icmp_dest_unreach
[ Upstream commit a622260254ee481747cceaaa8609985b29a31565 ]
Daniel Petre reported crashes in icmp_dst_unreach() with following call
graph:
Daniel found a similar problem mentioned in
http://lkml.indiana.edu/hypermail/linux/kernel/1007.0/00961.html
And indeed this is the root cause : skb->cb[] contains data fooling IP
stack.
We must clear IPCB in ip_tunnel_xmit() sooner in case dst_link_failure()
is called. Or else skb->cb[] might contain garbage from GSO segmentation
layer.
A similar fix was tested on linux-3.9, but gre code was refactored in
linux-3.10. I'll send patches for stable kernels as well.
Many thanks to Daniel for providing reports, patches and testing !
Reported-by: Daniel Petre <daniel.petre@rcs-rds.ro>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Eric Dumazet [Thu, 23 May 2013 07:44:20 +0000 (07:44 +0000)]
tcp: xps: fix reordering issues
[ Upstream commit 547669d483e5783d722772af1483fa474da7caf9 ]
commit 3853b5841c01a ("xps: Improvements in TX queue selection")
introduced ooo_okay flag, but the condition to set it is slightly wrong.
In our traces, we have seen ACK packets being received out of order,
and RST packets sent in response.
We should test if we have any packets still in host queue.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Timo Teräs [Wed, 22 May 2013 01:40:47 +0000 (01:40 +0000)]
xfrm: properly handle invalid states as an error
[ Upstream commit 497574c72c9922cf20c12aed15313c389f722fa0 ]
The error exit path needs err explicitly set. Otherwise it
returns success and the only caller, xfrm_output_resume(),
would oops in the skb_dst(skb)->ops dereference as skb_dst(skb) is
NULL.
Bug introduced in commit bb65a9cb (xfrm: removes a superfluous
check and add a statistic).
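The pattern at issue, an error label that returns whatever `err` happens to hold, can be sketched as below. Everything here is a hypothetical model (`enum state_model`, `output_one_model()`), and -EINVAL is an assumed error code; the point is only that the failure branch must assign `err` before jumping.

```c
#include <errno.h>

/* Hypothetical, simplified model of a state that may be unusable. */
enum state_model { STATE_VALID, STATE_EXPIRED };

/* Returns 0 and counts the packet on success, a negative errno otherwise. */
static int output_one_model(enum state_model st, int *sent)
{
    int err = 0;

    if (st != STATE_VALID) {
        err = -EINVAL;   /* without this assignment the function returns 0 */
        goto error;
    }

    *sent += 1;
    return 0;

error:
    return err;
}
```

A caller that sees 0 will carry on as if the packet were ready for the next step, which is how the missing assignment turns into the oops described above.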
Signed-off-by: Timo Teräs <timo.teras@iki.fi>
Cc: Li RongQing <roy.qing.li@gmail.com>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Nandita Dukkipati [Tue, 21 May 2013 15:12:07 +0000 (15:12 +0000)]
tcp: bug fix in proportional rate reduction.
[ Upstream commit 35f079ebbc860dcd1cca70890c9c8d59c1145525 ]
This patch is a fix for a bug triggering newly_acked_sacked < 0
in tcp_ack().
The bug is triggered by sacked_out decreasing relative to prior_sacked,
but packets_out remaining the same as prior_packets. This is because the
snapshot of prior_packets is taken after tcp_sacktag_write_queue() while
prior_sacked is captured before tcp_sacktag_write_queue(). The problem
is: tcp_sacktag_write_queue (tcp_match_skb_to_sack() -> tcp_fragment)
adjusts the pcount for packets_out and sacked_out (MSS change or other
reason). As a result, this delta in pcount is reflected in
(prior_sacked - sacked_out) but not in (prior_packets - packets_out).
This patch does the following:
1) initializes prior_packets at the start of tcp_ack() so as to
capture the delta in packets_out created by tcp_fragment.
2) introduces a new "previous_packets_out" variable that snapshots
packets_out right before tcp_clean_rtx_queue, so pkts_acked can be
correctly computed as before.
3) Computes pkts_acked using previous_packets_out, and computes
newly_acked_sacked using prior_packets.
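The effect of the snapshot ordering can be checked with made-up numbers. The sketch assumes newly_acked_sacked is formed from the two differences named above, (prior_packets - packets_out) and (prior_sacked - sacked_out); `struct snap` and the function below are illustrative, not the kernel's code.

```c
/* Counters as they stand after the SACK processing step. */
struct snap { int packets_out; int sacked_out; };

/* Assumed form: delta in packets_out minus the drop in sacked_out. */
static int newly_acked_sacked(int prior_packets, int prior_sacked,
                              const struct snap *now)
{
    return (prior_packets - now->packets_out)
         - (prior_sacked - now->sacked_out);
}
```

Take packets_out = 10 and sacked_out = 4 at the start, and suppose a pcount adjustment lowers both by one (to 9 and 3) with nothing newly acknowledged. With prior_packets snapshotted late (9) the result is (9 - 9) - (4 - 3) = -1; with it snapshotted at the start (10) the result is (10 - 9) - (4 - 3) = 0.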
Signed-off-by: Nandita Dukkipati <nanditad@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
stephen hemminger [Mon, 20 May 2013 06:54:43 +0000 (06:54 +0000)]
8139cp: reset BQL when ring tx ring cleared
[ Upstream commit 98962baad72fd6d393bf39dbb7c2076532c363c6 ]
This patch cures transmit timeouts with DHCP observed
while running under KVM. When the transmit ring is cleaned out,
the Byte Queue Limit values need to be reset.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Francois Romieu [Sat, 18 May 2013 01:24:46 +0000 (01:24 +0000)]
r8169: fix offloaded tx checksum for small packets.
[ Upstream commit b423e9ae49d78ea3f53b131c8d5a6087aed16fd6 ]
8168evl offloaded checksums are wrong since commit
e5195c1f31f399289347e043d6abf3ffa80f0005 ("r8169: fix 8168evl frame padding.")
pads small packets to 60 bytes (without ethernet checksum). Typical symptoms
appear as UDP checksums which are wrong by the count of added bytes.
It isn't worth compensating. Let the driver checksum.
Due to the skb length changes, TSO code is moved before the Tx descriptor gets
written.
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Tested-by: Holger Hoffstätte <holger.hoffstaette@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Paul Moore [Fri, 17 May 2013 09:08:50 +0000 (09:08 +0000)]
netlabel: improve domain mapping validation
[ Upstream commit 6b21e1b77d1a3d58ebfd513264c885695e8a0ba5 ]
The net/netlabel/netlabel_domainhash.c:netlbl_domhsh_add() function
does not properly validate new domain hash entries resulting in
potential problems when an administrator attempts to add an invalid
entry. One such problem, as reported by Vlad Halilov, is a kernel
BUG (found in netlabel_domainhash.c:netlbl_domhsh_audit_add()) when
adding an IPv6 outbound mapping with a CIPSO configuration.
This patch corrects this problem by adding the necessary validation
code to netlbl_domhsh_add() via the newly created
netlbl_domhsh_validate() function.
Ideally this patch should also be pushed to the currently active
-stable trees.
Reported-by: Vlad Halilov <vlad.halilov@gmail.com>
Signed-off-by: Paul Moore <pmoore@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Eric Dumazet [Fri, 17 May 2013 04:53:13 +0000 (04:53 +0000)]
ipv6: fix possible crashes in ip6_cork_release()
[ Upstream commit 284041ef21fdf2e0d216ab6b787bc9072b4eb58a ]
commit 0178b695fd6b4 ("ipv6: Copy cork options in ip6_append_data")
added some code duplication and bad error recovery, leading to potential
crash in ip6_cork_release() as kfree() could be called with garbage.
Use kzalloc() to make sure this won't happen.
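Why zeroed allocation helps can be shown with a user-space analogue, using calloc() and free() in place of kzalloc() and kfree(). `struct opts_model` and its field names are hypothetical; the sketch assumes, as on mainstream platforms, that all-zero bytes represent a null pointer.

```c
#include <stdlib.h>

/* Hypothetical container with optional, separately allocated parts. */
struct opts_model {
    void *opt_a;
    void *opt_b;
};

/* Zeroed allocation: every optional pointer starts out as NULL. */
static struct opts_model *alloc_opts(void)
{
    return calloc(1, sizeof(struct opts_model));
}

/* Safe even if setup stopped early, because unset pointers are NULL. */
static void release_opts(struct opts_model *o)
{
    if (o == NULL)
        return;
    free(o->opt_a);   /* free(NULL) is a no-op */
    free(o->opt_b);
    free(o);
}
```

With an uninitialised allocation, the two inner free() calls could receive garbage if an error path ran before the fields were filled in.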
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Sridhar Samudrala [Fri, 17 May 2013 06:39:07 +0000 (06:39 +0000)]
vxlan: Update vxlan fdb 'used' field after each usage
[ Upstream commit 014be2c8eac3381e202f684c1f35ae184a8b152b ]
Fix some instances where vxlan fdb 'used' field is not updated after the entry
is used.
v2: rename vxlan_find_mac() as __vxlan_find_mac() and create a new vxlan_find_mac()
that also updates ->used field.
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Wei Yongjun [Thu, 16 May 2013 22:25:34 +0000 (22:25 +0000)]
gianfar: add missing iounmap() on error in gianfar_ptp_probe()
[ Upstream commit e5f5e380e0f3bb11f04ca5bc66a551e58e0ad26e ]
Add the missing iounmap() before return from gianfar_ptp_probe()
in the error handling case.
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Eric Dumazet [Mon, 13 May 2013 02:24:11 +0000 (02:24 +0000)]
net/802/mrp: fix lockdep splat
[ Upstream commit faff57a92ba1d7247c5e86ecea2886d2c9d54507 ]
commit fb745e9a037895 ("net/802/mrp: fix possible race condition when
calling mrp_pdu_queue()") introduced a lockdep splat.
[ 19.735147] =================================
[ 19.735235] [ INFO: inconsistent lock state ]
[ 19.735324] 3.9.2-build-0063 #4 Not tainted
[ 19.735412] ---------------------------------
[ 19.735500] inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
[ 19.735592] rmmod/1840 [HC0[0]:SC0[0]:HE1:SE1] takes:
[ 19.735682] (&(&app->lock)->rlock#2){+.?...}, at: [<f862bb5b>] mrp_uninit_applicant+0x69/0xba [mrp]
app->lock is normally taken under softirq context, so disable BH to
avoid the splat.
Reported-by: Denys Fedoryshchenko <denys@visp.net.lb>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: David Ward <david.ward@ll.mit.edu>
Cc: Cong Wang <amwang@redhat.com>
Tested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Eric Dumazet [Mon, 13 May 2013 21:25:52 +0000 (21:25 +0000)]
tcp: fix tcp_md5_hash_skb_data()
[ Upstream commit 54d27fcb338bd9c42d1dfc5a39e18f6f9d373c2e ]
TCP md5 communications fail [1] for some devices, because sg/crypto code
assumes page offsets are below PAGE_SIZE.
This was discovered using the mlx4 driver [2], but I suspect loopback
might trigger the same bug now that we use order-3 pages in tcp_sendmsg().
[1] Failure is giving following messages.
huh, entered softirq 3 NET_RX ffffffff806ad230 preempt_count 00000100, exited with 00000101?
[2] mlx4 driver uses order-2 pages to allocate RX frags
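With multi-page (higher-order) allocations a fragment's offset can exceed one page, so code that assumes offset < PAGE_SIZE needs the position normalised first. The sketch below shows that arithmetic with a page index standing in for a page pointer; `MODEL_PAGE_SHIFT`, `struct frag_pos` and `normalize()` are hypothetical names, and a 4096-byte page is assumed.

```c
#include <stddef.h>

#define MODEL_PAGE_SHIFT 12                            /* assumption: 4 KiB pages */
#define MODEL_PAGE_SIZE  ((size_t)1 << MODEL_PAGE_SHIFT)

struct frag_pos {
    size_t page_index;   /* which page of the compound allocation */
    size_t offset;       /* offset inside that page, always < page size */
};

/* Split an offset that may span pages into (page, in-page offset). */
static struct frag_pos normalize(size_t first_page, size_t offset)
{
    struct frag_pos pos;

    pos.page_index = first_page + (offset >> MODEL_PAGE_SHIFT);
    pos.offset = offset & (MODEL_PAGE_SIZE - 1);
    return pos;
}
```

For example, an offset of 9000 bytes from page 5 lands in page 7 at in-page offset 808, since 9000 = 2 * 4096 + 808.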
Reported-by: Matt Schnall <mischnal@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Bernhard Beck <bbeck@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Frank Li [Tue, 7 May 2013 14:08:44 +0000 (14:08 +0000)]
net: fec: fix kernel oops when plug/unplug cable many times
[ Upstream commits 54309fa60b5f57b90c1842176f6045e665d21142 and
3169134478a9638baf0dbb4fdca5a0718cbe8e27 ]
Steps to reproduce:
1. flood ping from other machine
ping -f -s 41000 IP
2. run below script
while [ 1 ]; do ethtool -s eth0 autoneg off;
sleep 3;ethtool -s eth0 autoneg on; sleep 4; done;
You can see an oops within one hour.
The reason is that fec_restart clears the BD while NAPI may still use it.
The solution is to disable NAPI and stop xmit when resetting the BD.
Disabling NAPI may sleep, so fec_restart can't be called in
atomic context.
Signed-off-by: Frank Li <Frank.Li@freescale.com>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Tested-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Paul Gortmaker [Wed, 19 Jun 2013 15:15:26 +0000 (11:15 -0400)]
x86: Fix section mismatch on load_ucode_ap
commit 949785996ec2250fa958fc3a924e5186e9a8fa2c upstream.
We are in the process of removing all the __cpuinit annotations.
While working on making that change, an existing problem was
made evident:
WARNING: arch/x86/kernel/built-in.o(.text+0x198f2): Section mismatch
in reference from the function cpu_init() to the function
.init.text:load_ucode_ap() The function cpu_init() references
the function __init load_ucode_ap(). This is often because cpu_init
lacks a __init annotation or the annotation of load_ucode_ap is wrong.
This now appears because in my working tree, cpu_init() is no longer
tagged as __cpuinit, and so the audit picks up the mismatch. The 2nd
hypothesis from the audit is the correct one, as there was an incorrect
__init tag on the prototype in the header (but __cpuinit was used on
the function itself.)
The audit is telling us that the prototype's __init annotation took
effect and the function did land in the .init.text section. Checking
with objdump on a mainline tree that still has __cpuinit shows that
the __cpuinit on the function takes precedence over the __init on the
prototype, but that won't be true once we make __cpuinit a no-op.
Even though we are removing __cpuinit, we temporarily align both
the function and the prototype on __cpuinit so that the changeset
can be applied to stable trees if desired.
[ hpa: build fix only, no object code change ]
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Link: http://lkml.kernel.org/r/1371654926-11729-1-git-send-email-paul.gortmaker@windriver.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Randy Dunlap [Tue, 18 Jun 2013 19:33:40 +0000 (12:33 -0700)]
x86: fix build error and kconfig for ia32_emulation and binfmt
commit d1603990ea626668c78527376d9ec084d634202d upstream.
Fix kconfig warning and build errors on x86_64 by selecting BINFMT_ELF
when COMPAT_BINFMT_ELF is being selected.
warning: (IA32_EMULATION) selects COMPAT_BINFMT_ELF which has unmet direct dependencies (COMPAT && BINFMT_ELF)
fs/built-in.o: In function `elf_core_dump':
compat_binfmt_elf.c:(.text+0x3e093): undefined reference to `elf_core_extra_phdrs'
compat_binfmt_elf.c:(.text+0x3ebcd): undefined reference to `elf_core_extra_data_size'
compat_binfmt_elf.c:(.text+0x3eddd): undefined reference to `elf_core_write_extra_phdrs'
compat_binfmt_elf.c:(.text+0x3f004): undefined reference to `elf_core_write_extra_data'
[ hpa: This was sent to me for -next but it is a low risk build fix ]
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Link: http://lkml.kernel.org/r/51C0B614.5000708@infradead.org
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Yinghai Lu [Thu, 13 Jun 2013 20:17:01 +0000 (13:17 -0700)]
x86, mtrr: Fix original mtrr range get for mtrr_cleanup
commit d8d386c10630d8f7837700f4c466443d49e12cc0 upstream.
Joshua reported: Commit cd7b304dfaf1 (x86, range: fix missing merge
during add range) broke mtrr cleanup on his setup in 3.9.5.
The corresponding commit in upstream is fbe06b7bae7c.
*BAD*gran_size: 64K chunk_size: 16M num_reg: 6 lose cover RAM: -0G
https://bugzilla.kernel.org/show_bug.cgi?id=59491
So it rejects new var mtrr layout.
It turns out we have some problem with initial mtrr range retrieval.
The current sequence is:
x86_get_mtrr_mem_range
==> bunches of add_range_with_merge
==> bunches of subtract_range
==> clean_sort_range
add_range_with_merge for [0,1M)
sort_range()
add_range_with_merge could have blank slots, so we can not just
sort only; that would leave the final result with an extra blank slot at the head.
So move the add_range_with_merge call for [0,1M) earlier; with that we
can avoid an extra clean_sort_range call.
Reported-by: Joshua Covington <joshuacov@googlemail.com>
Tested-by: Joshua Covington <joshuacov@googlemail.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/1371154622-8929-2-git-send-email-yinghai@kernel.org
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Yinghai Lu [Thu, 13 Jun 2013 20:17:02 +0000 (13:17 -0700)]
range: Do not add new blank slot with add_range_with_merge
commit 0541881502a1276149889fe468662ff6a8fc8f6d upstream.
Joshua reported: Commit cd7b304dfaf1 (x86, range: fix missing merge
during add range) broke mtrr cleanup on his setup in 3.9.5.
The corresponding commit in upstream is fbe06b7bae7c.
The reason is that add_range_with_merge could generate a blank slot.
We can avoid that by first searching for the new expanded start/end, so
that the new range includes all connected ranges in the range array.
At last, add the new expanded start/end to the range array.
Also move up the rest of the array so that no new blank slot is added
to the range array.
-v2: move left array to avoid enhance add_range()
-v3: include fix from Joshua about memmove declaring when
DYN_DEBUG is used.
Reported-by: Joshua Covington <joshuacov@googlemail.com>
Tested-by: Joshua Covington <joshuacov@googlemail.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/1371154622-8929-3-git-send-email-yinghai@kernel.org
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
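Illustration for the add_range_with_merge change above: a minimal userspace sketch, not the kernel source. It assumes half-open [start, end) ranges and a plain array; the helper names mirror the ones in the message, but the bodies are written here for illustration only. Each connected entry is absorbed into the pending range and the remaining entries are moved up, so no blank slot is left behind.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct range { uint64_t start, end; };  /* half-open: [start, end) */

/* Append [start, end) if there is room; returns the new entry count. */
static int add_range(struct range *r, int az, int nr,
                     uint64_t start, uint64_t end)
{
    if (start >= end || nr >= az)
        return nr;
    r[nr].start = start;
    r[nr].end = end;
    return nr + 1;
}

/* Merge [start, end) with every overlapping or adjacent entry. */
static int add_range_with_merge(struct range *r, int az, int nr,
                                uint64_t start, uint64_t end)
{
    assert(nr >= 0 && nr <= az);
    if (start >= end)
        return nr;

    for (int i = 0; i < nr; i++) {
        uint64_t lo, hi;

        if (!r[i].end)
            continue;                       /* skip an existing blank slot */
        lo = r[i].start > start ? r[i].start : start;
        hi = r[i].end < end ? r[i].end : end;
        if (lo > hi)
            continue;                       /* not connected */

        /* Absorb r[i] into the pending range. */
        if (r[i].start < start)
            start = r[i].start;
        if (r[i].end > end)
            end = r[i].end;

        /* Move the remaining entries up so no blank slot is left. */
        memmove(&r[i], &r[i + 1], (size_t)(nr - (i + 1)) * sizeof(r[i]));
        nr--;
        r[nr].start = 0;
        r[nr].end = 0;
        i--;                                /* re-examine the moved entry */
    }

    /* Add the expanded range last. */
    return add_range(r, az, nr, start, end);
}
```

Adding [10, 20) to an array holding [0, 10) and [20, 30) leaves a single entry [0, 30) in slot 0, with all blank slots at the tail.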
Jerome Glisse [Wed, 19 Jun 2013 14:02:28 +0000 (10:02 -0400)]
drm/radeon: update lockup tracking when scheduling in empty ring
commit
8444d5c69549aa0f0b574cc608742d4669e1cc01 upstream.
There might be an issue with lockup detection when scheduling on an
empty ring that has been sitting idle for a while. Thus update
the lockup tracking data when scheduling new work in an empty ring.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Tested-by: Andy Lutomirski <luto@amacapital.net>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jerome Glisse [Thu, 6 Jun 2013 16:41:17 +0000 (12:41 -0400)]
drm/radeon: do not try to uselessly update virtual memory pagetable
commit
3813f5ca9ab7a00e80a17aab34f155453c66c78a upstream.
If a buffer is never bound to a virtual memory pagetable then don't try
to unbind it. Only drawback is that we don't update the pagetable when
unbinding the ib pool buffer which is fine because it only happens at
suspend or module unload/shutdown.
Fixes spurious messages about buffers without VM mappings. E.g.:
radeon 0000:01:00.0: bo ffff88020afac400 don't has a mapping in vm ffff88021ca2b900
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Laurent Pinchart [Wed, 19 Jun 2013 01:14:20 +0000 (03:14 +0200)]
drm/prime: Honor requested file flags when exporting a buffer
commit
ebc0bad4a05ad63979e8bc115cea3b8abdf814c7 upstream.
The DRM PRIME API passes file flags to the driver for the exported
buffer. Honor them instead of hardcoding 0600.
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Rafael J. Wysocki [Tue, 18 Jun 2013 22:44:45 +0000 (00:44 +0200)]
ACPI / PM: Fix error code path for power resources initialization
commit
6ee22e9d59151550a55d370b14109bdae8b58bda upstream.
Commit
781d737 (ACPI: Drop power resources driver) introduced a
bug in the power resources initialization error code path causing
a NULL pointer to be referenced in acpi_release_power_resource()
if there's an error triggering a jump to the 'err' label in
acpi_add_power_resource(). This happens because the list_node
field of struct acpi_power_resource has not been initialized yet
at this point and doing a list_del() on it is a bad idea.
To prevent this problem from occurring, initialize the list_node
field of struct acpi_power_resource upfront.
Reported-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Tested-by: Lan Tianyu <tianyu.lan@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
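Illustration for the power resources fix above: a minimal userspace sketch of why the list node has to be initialized before any error path can unlink it. The list helpers and struct power_resource are simplified stand-ins written for this note, not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

static void init_list_head(struct list_head *h)
{
    h->next = h;
    h->prev = h;
}

/* Unlink an entry; both neighbours are dereferenced, so NULL would crash. */
static void list_unlink(struct list_head *e)
{
    assert(e->next != NULL && e->prev != NULL);
    e->next->prev = e->prev;
    e->prev->next = e->next;
    init_list_head(e);
}

/* Hypothetical stand-in for struct acpi_power_resource. */
struct power_resource {
    struct list_head list_node;
    int ready;
};

/* Release path: always unlinks the node, as the error label would. */
static void release_resource(struct power_resource *res)
{
    list_unlink(&res->list_node);
}

/* Returns 0 on success, -1 on a simulated failure. */
static int add_resource(struct power_resource *res, int fail)
{
    init_list_head(&res->list_node);    /* upfront, before anything can fail */
    if (fail) {
        release_resource(res);          /* safe: node points at itself */
        return -1;
    }
    res->ready = 1;
    return 0;
}
```

With the node initialized first, unlinking it on the error path is a harmless no-op instead of a NULL pointer dereference.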
Rafael J. Wysocki [Sat, 15 Jun 2013 22:38:30 +0000 (00:38 +0200)]
ACPI / dock: Take ACPI scan lock in write_undock()
commit
8112006f41fd76ddf4988f8ddd904563db85613c upstream.
Since commit
3757b94 (ACPI / hotplug: Fix concurrency issues and
memory leaks) acpi_bus_scan() and acpi_bus_trim() must always be
called under acpi_scan_lock, but currently the following scenario
violating that requirement is possible:
write_undock()
handle_eject_request()
hotplug_dock_devices()
dock_remove_acpi_device()
acpi_bus_trim()
Fix that by making write_undock() acquire acpi_scan_lock before
calling handle_eject_request() as appropriate (begin_undock() is
under the lock too in analogy with acpi_dock_deferred_cb()).
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Toshi Kani <toshi.kani@hp.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Mika Westerberg [Mon, 20 May 2013 15:41:45 +0000 (15:41 +0000)]
ACPI / resources: call acpi_get_override_irq() only for legacy IRQ resources
commit
204ebc0aa30a7115f300cac39fbb7eeb66524881 upstream.
acpi_get_override_irq() was added because there was a problem with
buggy BIOSes passing wrong IRQ() resource for the RTC IRQ. The
commit that added the workaround was
61fd47e0c8476 (ACPI: fix two
IRQ8 issues in IOAPIC mode).
With ACPI 5 enumerated devices there are typically one or more
extended IRQ resources per device (and these IRQs can be shared).
However, the acpi_get_override_irq() workaround forces all IRQs in
range 0 - 15 (the legacy ISA IRQs) to be edge triggered, active high
as can be seen from the dmesg below:
ACPI: IRQ 6 override to edge, high
ACPI: IRQ 7 override to edge, high
ACPI: IRQ 7 override to edge, high
ACPI: IRQ 13 override to edge, high
Also /proc/interrupts for the I2C controllers (INT33C2 and INT33C3) shows
the same thing:
7: 4 0 0 0 IO-APIC-edge INT33C2:00, INT33C3:00
The _CRS method for INT33C2 (and INT33C3) device returns the following
resource:
Interrupt (ResourceConsumer, Level, ActiveLow, Shared,,, )
{
0x00000007,
}
which states that this is supposed to be level triggered, active low,
shared IRQ instead.
Fix this by making sure that acpi_get_override_irq() gets only called
when we are dealing with legacy IRQ() or IRQNoFlags() descriptors.
While we are there, correct pr_warning() to print the right triggering
value.
This change turns out to be necessary to make DMA work correctly on
systems based on the Intel Lynxpoint PCH (Platform Controller Hub).
[rjw: Changelog]
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Zhanghaoyu (A) [Fri, 14 Jun 2013 07:36:13 +0000 (07:36 +0000)]
KVM: x86: remove vcpu's CPL check in host-invoked XCR set
commit
764bcbc5a6d7a2f3e75c9f0e4caa984e2926e346 upstream.
The __kvm_set_xcr function does the CPL check when setting an XCR.
__kvm_set_xcr is called in two flows: one is invoked by the guest, with the
call stack shown below,
handle_xsetbv(or xsetbv_interception)
kvm_set_xcr
__kvm_set_xcr
the other one is invoked by host, for example during system reset:
kvm_arch_vcpu_ioctl
kvm_vcpu_ioctl_x86_set_xcrs
__kvm_set_xcr
The former does need the CPL check, but the latter does not.
Signed-off-by: Zhang Haoyu <haoyu.zhang@huawei.com>
[Tweaks to commit message. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Igor Mammedov [Mon, 10 Jun 2013 16:31:11 +0000 (18:31 +0200)]
x86: kvmclock: zero initialize pvclock shared memory area
commit
07868fc6aaf57847b0f3a3d53086b7556eb83f4a upstream.
The kernel might hang in pvclock_clocksource_read() because
uninitialized memory might contain an odd version value in the
following loop:
do {
version = __pvclock_read_cycles(src, &ret, &flags);
} while ((src->version & 1) || version != src->version);
if secondary kvmclock is accessed before it's registered with kvm.
Clear garbage in pvclock shared memory area right after it's
allocated to avoid this issue.
Ref: https://bugzilla.kernel.org/show_bug.cgi?id=59521
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
[See BZ for analysis. We may want a different fix for 3.11, but
this is the safest for now - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
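Illustration for the kvmclock fix above: a minimal userspace sketch of a version-protected read, assuming a simplified record with only a version and a value. The struct and function names are hypothetical, memory barriers are omitted, and a spin bound is added purely so the stuck case can be observed instead of hanging.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the shared time record. */
struct time_info {
    uint32_t version;   /* odd while the writer is mid-update */
    uint64_t value;
};

/*
 * Returns 0 and stores the value once a stable snapshot is seen,
 * or -1 if none is seen within max_spins attempts. The loop quoted
 * in the message has no such bound, hence the hang.
 */
static int read_stable(const volatile struct time_info *src,
                       uint64_t *out, unsigned max_spins)
{
    assert(src != NULL && out != NULL);
    for (unsigned i = 0; i < max_spins; i++) {
        uint32_t version = src->version;
        uint64_t value = src->value;

        if (!(version & 1) && version == src->version) {
            *out = value;
            return 0;
        }
    }
    return -1;
}
```

A record filled with 0xff bytes has an odd version and never yields a snapshot, while a zeroed record (version 0, even) is readable at once.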
Ben Hutchings [Sun, 16 Jun 2013 20:27:12 +0000 (21:27 +0100)]
x86/efi: Fix dummy variable buffer allocation
commit
b8cb62f82103083a6e8fa5470bfe634a2c06514d upstream.
1. Check for allocation failure
2. Clear the buffer contents, as they may actually be written to flash
3. Don't leak the buffer
Compile-tested only.
[ Tested successfully on my buggy ASUS machine - Matt ]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Christian Lamparter [Fri, 22 Feb 2013 00:30:44 +0000 (01:30 +0100)]
carl9170: fix frame drop and WARN due to minstrel_ht change
commit
5f34608fa2acbfef5a06d0072a978c9943c28a2d upstream.
With "mac80211/minstrel_ht: add support for using CCK rates"
minstrel_ht selects legacy CCK rates as viable rates for
outgoing frames which might be sent as part of an A-MPDU
[IEEE80211_TX_CTL_AMPDU is set].
This behavior triggered the following WARN_ON in the driver:
> WARNING: at carl9170/tx.c:995 carl9170_op_tx+0x1dd/0x6fd
The driver assumed that the rate control algorithm made a
mistake and dropped the frame.
This patch removes the noisy warning altogether and allows
said A-MPDU frames with CCK sample and/or fallback rates to
be transmitted seamlessly.
Signed-off-by: Christian Lamparter <chunkeey@googlemail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Thomas Bogendoerfer [Fri, 14 Jun 2013 07:05:41 +0000 (09:05 +0200)]
parisc: provide pci_mmap_page_range() for parisc
commit
2cc7138f4347df939ce03f313e3d87794bab36f8 upstream.
pci_mmap_page_range() is needed for X11-server support on C8000 with ATI
FireGL card.
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Thomas Bogendoerfer [Sun, 9 Jun 2013 21:00:21 +0000 (23:00 +0200)]
parisc: fix serial ports on C8000 workstation
commit
9a66d1869d90f13fbaf83dcce5b1aeec86fbc699 upstream.
The C8000 workstation (64 bit kernel only) has a somewhat different
serial port configuration than other models.
Thomas Bogendoerfer sent a patch to fix this in September 2010, which
was now minimally modified by me.
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Helge Deller [Wed, 5 Jun 2013 20:50:01 +0000 (20:50 +0000)]
parisc: fix kernel BUG at arch/parisc/include/asm/mmzone.h:50 (part 2)
commit
91ea8207168793b365322be3c90a4ee9e8b03ed4 upstream.
Make sure that we really return -1 (instead of 0x00ff) as node id for
page frame numbers which are not physically available.
This finally fixes the kernel panic when running
cat /proc/kpageflags /proc/kpagecount.
Theoretically this patch now limits the number of physical memory ranges
to 127 instead of 254, but currently we have MAX_PHYSMEM_RANGES
hardcoded to 8 which is sufficient for all existing parisc machines.
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
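Illustration for the node id fix above: a small sketch of why a table of unsigned bytes cannot hand back -1. The table names, sizes and contents are made up for this note; only the signed versus unsigned behaviour is the point.

```c
#include <assert.h>

#define NID_TABLE_SIZE 4    /* hypothetical size */

/* Same byte pattern stored two ways: 0xff marks "not available". */
static const unsigned char nid_map_unsigned[NID_TABLE_SIZE] = { 0, 0xff, 1, 0xff };
static const signed char nid_map_signed[NID_TABLE_SIZE] = { 0, -1, 1, -1 };

/* With unsigned storage the marker is read back as 255 (0x00ff). */
static int pfn_to_nid_unsigned(unsigned idx)
{
    assert(idx < NID_TABLE_SIZE);
    return nid_map_unsigned[idx];
}

/* With signed storage the marker is sign-extended to -1. */
static int pfn_to_nid_signed(unsigned idx)
{
    assert(idx < NID_TABLE_SIZE);
    return nid_map_signed[idx];
}
```

The price of signed storage is that the largest representable id drops to 127, which matches the reduced limit mentioned in the message.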
Helge Deller [Thu, 30 May 2013 21:06:39 +0000 (21:06 +0000)]
parisc: parport0: fix this legacy no-device port driver!
commit
4edb38695d9a3cd62739f8595e21f36f0aabf4c2 upstream.
Fix the above kernel error from parport_announce_port() on 32bit GSC
machines (e.g. B160L). The parport driver requires now a pointer to the
device struct.
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Chen Gang [Thu, 30 May 2013 01:18:43 +0000 (01:18 +0000)]
parisc: kernel: using strlcpy() instead of strcpy()
commit
ea99b1adf22abd62bdcf14b1c9a0a4d3664eefd8 upstream.
'boot_args' is an input argument, and 'boot_command_line' has a fixed length.
So use strlcpy() instead of strcpy() to avoid memory overflow.
Signed-off-by: Chen Gang <gang.chen@asianux.com>
Acked-by: Kyle McMartin <kyle@mcmartin.ca>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
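Illustration for the strlcpy() change above: a minimal userspace sketch of a strlcpy-style bounded copy. The helper name bounded_copy is made up here (strlcpy itself is not available in every C library); it truncates to the destination size, always NUL-terminates, and returns the source length so truncation can be detected.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy at most size - 1 bytes and NUL-terminate; returns strlen(src). */
static size_t bounded_copy(char *dst, const char *src, size_t size)
{
    size_t len;

    assert(dst != NULL && src != NULL);
    len = strlen(src);
    if (size > 0) {
        size_t n = len < size - 1 ? len : size - 1;

        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return len;
}
```

A return value greater than or equal to the destination size means the input did not fit and was cut short rather than written past the end of the buffer.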
Paul Bolle [Wed, 29 May 2013 09:56:58 +0000 (09:56 +0000)]
parisc: rename "CONFIG_PA7100" to "CONFIG_PA7000"
commit
766039022a480ede847659daaa78772bdcc598ae upstream.
There's a Makefile line setting cflags for CONFIG_PA7100. But that
Kconfig macro doesn't exist. There is a Kconfig symbol PA7000, which
covers both PA7000 and PA7100 processors. So let's use the corresponding
Kconfig macro.
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Helge Deller [Tue, 28 May 2013 20:35:54 +0000 (20:35 +0000)]
parisc: fix kernel BUG at arch/parisc/include/asm/mmzone.h:50
commit
ae249b5fa27f9fba25aa59664d4338efc2dd2394 upstream.
With CONFIG_DISCONTIGMEM=y and multiple physical memory areas,
cat /proc/kpageflags triggers this kernel bug:
kernel BUG at arch/parisc/include/asm/mmzone.h:50!
CPU: 2 PID: 7848 Comm: cat Tainted: G D W 3.10.0-rc3-64bit #44
IAOQ[0]: kpageflags_read+0x128/0x238
IAOQ[1]: kpageflags_read+0x12c/0x238
RP(r2): proc_reg_read+0xbc/0x130
Backtrace:
[<00000000402ca2d4>] proc_reg_read+0xbc/0x130
[<0000000040235bcc>] vfs_read+0xc4/0x1d0
[<0000000040235f0c>] SyS_read+0x94/0xf0
[<0000000040105fc0>] syscall_exit+0x0/0x14
kpageflags_read() walks through the whole memory, even if some memory
areas are physically not available. So, we should better not BUG on an
unavailable pfn in pfn_to_nid() but just return the expected value -1 or
0.
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Chen Gang [Mon, 27 May 2013 04:57:09 +0000 (04:57 +0000)]
parisc: memory overflow, 'name' length is too short for using
commit
3f108de96ba449a8df3d7e3c053bf890fee2cb95 upstream.
'path.bc[i]' can be assigned by PCI_SLOT(), which can be '> 10', so sizeof(6
* "%u:" + "%u" + '\0') may be 21.
Since the 'name' length is 20, this may overflow the buffer.
And since 'path.bc[i]' is an 'unsigned char' for printing, we can be sure
the required length of 'name' can never exceed 28.
So to keep things simple, we can use 28 instead of 20 directly, and need
not think about whether 'path.bc[i]' can be '> 100'.
Signed-off-by: Chen Gang <gang.chen@asianux.com>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
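Illustration for the 'name' length fix above: the arithmetic can be checked with snprintf(). This sketch assumes the layout described in the message (six "%u:" fields followed by one "%u"); the exact format string in the driver is not quoted there, so treat it as an assumption, and the helper name is made up.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Bytes needed, including the trailing NUL, to print seven fields
 * as six "%u:" plus a final "%u" when every field holds v.
 */
static int path_name_bytes(unsigned char v)
{
    unsigned u = v;
    int n = snprintf(NULL, 0, "%u:%u:%u:%u:%u:%u:%u", u, u, u, u, u, u, u);

    assert(n > 0);
    return n + 1;
}
```

Two-digit fields need 6 * 3 + 2 + 1 = 21 bytes, already more than the old 20, and the worst case of 255 in every field needs 6 * 4 + 3 + 1 = 28 bytes.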
Helge Deller [Fri, 24 May 2013 21:27:35 +0000 (21:27 +0000)]
parisc: fix irq stack on UP and SMP
commit
d96b51ec14650b490ab98e738bcc02309396e5bc upstream.
The logic to detect if the irq stack was already in use with
raw_spin_trylock() is wrong, because it will generate a "trylock failure
on UP" error message with CONFIG_SMP=n and CONFIG_DEBUG_SPINLOCK=y.
arch_spin_trylock() can't be used either since in the CONFIG_SMP=n case
no atomic protection is given and we are reentrant here. A mutex didn't
work either and brings more overhead by turning off interrupts.
So, let's use the fastest path for parisc which is the ldcw instruction.
Counting how often the irq stack was used is pretty useless, so just
drop this piece of code.
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
John David Anglin [Mon, 20 May 2013 16:42:53 +0000 (16:42 +0000)]
parisc: make interrupt and interruption stack allocation reentrant
commit
b63a2bbc0b9b106a93e11952ab057e2408f2eb02 upstream.
The get_stack_use_cr30 and get_stack_use_r30 macros allocate a stack
frame for external interrupts and interruptions requiring a stack frame.
They are currently not reentrant in that they save register context
before the stack is set or adjusted.
I have observed a number of system crashes where there was clear
evidence of stack corruption during interrupt processing, and as a
result register corruption. Some interruptions can still occur during
interruption processing, however external interrupts are disabled and
data TLB misses don't occur for absolute accesses. So, it's not entirely
clear what triggers this issue. Also, if an interruption occurs when
Q=0, it is generally not possible to recover as the shadowed registers
are not copied.
The attached patch reworks the get_stack_use_cr30 and get_stack_use_r30
macros to allocate stack before doing register saves. The new code is a
couple of instructions shorter than the old implementation. Thus, it's
an improvement even if it doesn't fully resolve the stack corruption
issue. Based on limited testing, it improves SMP system stability.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Helge Deller [Sat, 18 May 2013 19:35:44 +0000 (19:35 +0000)]
parisc: show number of FPE and unaligned access handler calls in /proc/interrupts
commit
d0c3be806a3fe7f4abdb0f7e7287addb55e73f35 upstream.
Show the number of floating point assist and unaligned access fixup
handler calls in the /proc/interrupts file.
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Helge Deller [Thu, 16 May 2013 20:42:39 +0000 (20:42 +0000)]
parisc: add rp5470 entry to machine database
commit
949451b9b19da5e998778eb78929aafc73b5c227 upstream.
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Helge Deller [Fri, 10 May 2013 21:24:01 +0000 (21:24 +0000)]
parisc: implement irq stacks - part 2 (v2)
commit
416821d3d68164909b2cbcf398e4ba0797f5f8a2 upstream.
This patch fixes a few build issues which were introduced with the last
irq stack patch, e.g. the combination of stack overflow check and irq
stack.
Furthermore we now do proper locking and change the irq bh handler
to use the irq stack as well.
In /proc/interrupts one now can monitor how huge the irq stack has grown
and how often it was preferred over the kernel stack.
IRQ stacks are now enabled by default just to make sure that we do not
overflow the kernel stack by accident.
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Zhao Hongjiang [Tue, 7 May 2013 23:18:12 +0000 (16:18 -0700)]
parisc: remove the second argument of kmap_atomic()
commit
1ab4ce762370b82870834899e49c08129d7ae271 upstream.
kmap_atomic() requires only one argument now.
Signed-off-by: Zhao Hongjiang <zhaohongjiang@huawei.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Cc: Helge Deller <deller@gmx.de>
Cc: Rolf Eike Beer <eike-kernel@sf-tec.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Helge Deller [Tue, 7 May 2013 21:42:47 +0000 (21:42 +0000)]
parisc: tlb flush counting fix for SMP and UP
commit
0fc537d1d655cdae69b489dbba46ad617cfc1373 upstream.
Fix up build error on UP and correctly show the number of function call
(ipi) irqs.
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Helge Deller [Mon, 6 May 2013 19:20:26 +0000 (19:20 +0000)]
parisc: more irq statistics in /proc/interrupts
commit
cd85d5514d5c4d7e78abac923fc032457d0c5091 upstream.
Add framework and initial values for more fine grained statistics in
/proc/interrupts.
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Helge Deller [Tue, 7 May 2013 20:25:42 +0000 (20:25 +0000)]
parisc: implement irq stacks
commit
200c880420a2c02a0899120ce52d801fad705b90 upstream.
Default kernel stack size on parisc is 16k. During tests we found that the
kernel stack can easily grow beyond 13k, which leaves 3k left for irq
processing.
This patch adds the possibility to activate an additional stack of 16k per CPU
which is being used during irq processing. This implementation does not yet
use this irq stack for the irq bh handler.
The assembler code for call_on_stack was heavily cleaned up by John
David Anglin.
Signed-off-by: Helge Deller <deller@gmx.de>
CC: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Helge Deller [Tue, 7 May 2013 19:28:52 +0000 (19:28 +0000)]
parisc: add kernel stack overflow check
commit
9372450cc22d185f708e5cc3557cf991be4b7dc5 upstream.
Add the CONFIG_DEBUG_STACKOVERFLOW config option to enable checks to
detect kernel stack overflows.
Stack overflows can not be detected reliably since we do not want to
introduce too much overhead.
Instead, during irq processing in do_cpu_irq_mask() we check kernel
stack usage of the interrupted kernel process. Kernel threads can be
easily detected by checking the value of space register 7 (sr7) which
is zero when running inside the kernel.
Since THREAD_SIZE is 16k and PAGE_SIZE is 4k, reduce the alignment of
the init thread to the lower value (PAGE_SIZE) in the kernel
vmlinux.ld.S linker script.
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Hante Meuleman [Mon, 27 May 2013 19:09:53 +0000 (21:09 +0200)]
brcmfmac: Turn off ARP offloading when configured for AP.
commit
b3657453f16a7b84eab9b93bb9a9a2901ffc70af upstream.
ARP offloading should only be used in STA or P2P client mode. It
is currently configured once at init. When being configured for AP,
ARP offloading should be turned off, and when AP mode is left it can
be turned back on.
Reviewed-by: Arend Van Spriel <arend@broadcom.com>
Signed-off-by: Hante Meuleman <meuleman@broadcom.com>
Signed-off-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Larry Finger [Thu, 30 May 2013 23:05:55 +0000 (18:05 -0500)]
rtlwifi: rtl8192cu: Fix problem in connecting to WEP or WPA(1) networks
commit
5b8df24e22e0b00b599cb9ae63dbb96e1959be30 upstream.
Driver rtl8192cu can connect to WPA2 networks, but fails for any other
encryption method. The cause is a failure to set the rate control data
blocks. These changes fix https://bugzilla.redhat.com/show_bug.cgi?id=952793
and https://bugzilla.redhat.com/show_bug.cgi?id=761525.
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Chris Metcalf [Sat, 15 Jun 2013 20:47:47 +0000 (16:47 -0400)]
tilepro: work around module link error with gcc 4.7
commit
3cb3f839d306443f3d1e79b0bde1a2ad2c12b555 upstream.
gcc 4.7.x is emitting calls to __ffsdi2 where previously
it used to inline the appropriate ctz instructions.
While this needs to be fixed in gcc, it's also easy to avoid
having it cause build failures when building with those
compilers by exporting __ffsdi2 to modules.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Lai Jiangshan [Mon, 3 Jun 2013 09:17:15 +0000 (17:17 +0800)]
clk: remove notifier from list before freeing it
commit
72b5322f11ff0abf6a52b3007486656578d2c982 upstream.
@cn stays in @clk_notifier_list after it is freed, which causes
memory corruption.
For example, suppose @clk is registered (first), unregistered (first),
registered (second), unregistered (second).
The freed @cn will be used when @clk is registered (second),
and the bug will trigger when @clk is unregistered (second):
[ 517.040000] clk_notif_dbg clk_notif_dbg.1: clk_notifier_unregister()
[ 517.040000] Unable to handle kernel paging request at virtual address 00df3008
[ 517.050000] pgd = ed858000
[ 517.050000] [00df3008] *pgd=00000000
[ 517.060000] Internal error: Oops: 5 [#1] PREEMPT SMP ARM
[ 517.060000] Modules linked in: clk_notif_dbg(O-) [last unloaded: clk_notif_dbg]
[ 517.060000] CPU: 1 PID: 499 Comm: modprobe Tainted: G O 3.10.0-rc3-00119-ga93cb29-dirty #85
[ 517.060000] task: ee1e0180 ti: ee3e6000 task.ti: ee3e6000
[ 517.060000] PC is at srcu_readers_seq_idx+0x48/0x84
[ 517.060000] LR is at srcu_readers_seq_idx+0x60/0x84
[ 517.060000] pc : [<c0052720>] lr : [<c0052738>] psr: 80070013
[ 517.060000] sp : ee3e7d48 ip : 00000000 fp : ee3e7d6c
[ 517.060000] r10: 00000000 r9 : ee3e6000 r8 : 00000000
[ 517.060000] r7 : ed84fe4c r6 : c068ec90 r5 : c068e430 r4 : 00000000
[ 517.060000] r3 : 00df3000 r2 : 00000000 r1 : 00000002 r0 : 00000000
[ 517.060000] Flags: Nzcv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user
[ 517.060000] Control: 18c5387d Table: 2d85804a DAC: 00000015
[ 517.060000] Process modprobe (pid: 499, stack limit = 0xee3e6238)
[ 517.060000] Stack: (0xee3e7d48 to 0xee3e8000)
....
[ 517.060000] [<c0052720>] (srcu_readers_seq_idx+0x48/0x84) from [<c0052790>] (try_check_zero+0x34/0xfc)
[ 517.060000] [<c0052790>] (try_check_zero+0x34/0xfc) from [<c00528b0>] (srcu_advance_batches+0x58/0x114)
[ 517.060000] [<c00528b0>] (srcu_advance_batches+0x58/0x114) from [<c0052c30>] (__synchronize_srcu+0x114/0x1ac)
[ 517.060000] [<c0052c30>] (__synchronize_srcu+0x114/0x1ac) from [<c0052d14>] (synchronize_srcu+0x2c/0x34)
[ 517.060000] [<c0052d14>] (synchronize_srcu+0x2c/0x34) from [<c0053a08>] (srcu_notifier_chain_unregister+0x68/0x74)
[ 517.060000] [<c0053a08>] (srcu_notifier_chain_unregister+0x68/0x74) from [<c0375a78>] (clk_notifier_unregister+0x7c/0xc0)
[ 517.060000] [<c0375a78>] (clk_notifier_unregister+0x7c/0xc0) from [<bf008034>] (clk_notif_dbg_remove+0x34/0x9c [clk_notif_dbg])
[ 517.060000] [<bf008034>] (clk_notif_dbg_remove+0x34/0x9c [clk_notif_dbg]) from [<c02bb974>] (platform_drv_remove+0x24/0x28)
[ 517.060000] [<c02bb974>] (platform_drv_remove+0x24/0x28) from [<c02b9bf8>] (__device_release_driver+0x8c/0xd4)
[ 517.060000] [<c02b9bf8>] (__device_release_driver+0x8c/0xd4) from [<c02ba680>] (driver_detach+0x9c/0xc4)
[ 517.060000] [<c02ba680>] (driver_detach+0x9c/0xc4) from [<c02b99c4>] (bus_remove_driver+0xcc/0xfc)
[ 517.060000] [<c02b99c4>] (bus_remove_driver+0xcc/0xfc) from [<c02bace4>] (driver_unregister+0x54/0x78)
[ 517.060000] [<c02bace4>] (driver_unregister+0x54/0x78) from [<c02bbb44>] (platform_driver_unregister+0x1c/0x20)
[ 517.060000] [<c02bbb44>] (platform_driver_unregister+0x1c/0x20) from [<bf0081f8>] (clk_notif_dbg_driver_exit+0x14/0x1c [clk_notif_dbg])
[ 517.060000] [<bf0081f8>] (clk_notif_dbg_driver_exit+0x14/0x1c [clk_notif_dbg]) from [<c00835e4>] (SyS_delete_module+0x200/0x28c)
[ 517.060000] [<c00835e4>] (SyS_delete_module+0x200/0x28c) from [<c000edc0>] (ret_fast_syscall+0x0/0x48)
[ 517.060000] Code: e5973004 e7911102 e0833001 e2881002 (e7933101)
Reported-by: Sören Brinkmann <soren.brinkmann@xilinx.com>
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Tested-by: Sören Brinkmann <soren.brinkmann@xilinx.com>
Signed-off-by: Mike Turquette <mturquette@linaro.org>
[mturquette@linaro.org: shortened $SUBJECT]
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
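Illustration for the clk notifier fix above: a minimal userspace sketch of the ordering, using a singly linked list instead of the kernel's list and SRCU machinery. All names here are hypothetical. The entry is unlinked from the global list first and only then freed, so a later registration never walks over freed memory.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-clock notifier record. */
struct notifier {
    int clk_id;
    struct notifier *next;
};

static struct notifier *notifier_list;

/* Allocate a record and link it at the head of the global list. */
static struct notifier *notifier_register(int clk_id)
{
    struct notifier *cn = calloc(1, sizeof(*cn));

    if (cn == NULL)
        return NULL;
    cn->clk_id = clk_id;
    cn->next = notifier_list;
    notifier_list = cn;
    return cn;
}

/* Returns 0 if a record was removed, -1 if none matched. */
static int notifier_unregister(int clk_id)
{
    for (struct notifier **pp = &notifier_list; *pp != NULL; pp = &(*pp)->next) {
        if ((*pp)->clk_id == clk_id) {
            struct notifier *cn = *pp;

            *pp = cn->next;     /* remove from the list first ... */
            free(cn);           /* ... and only then free it */
            return 0;
        }
    }
    return -1;
}
```

Running register, unregister, register, unregister on the same id leaves the list empty each time, which is the sequence that crashed before the fix.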
Clemens Ladisch [Sat, 15 Jun 2013 09:21:09 +0000 (11:21 +0200)]
ALSA: usb-audio: work around Android accessory firmware bug
commit
342cda29343a6272c630f94ed56810a76740251b upstream.
When the Android firmware enables the audio interfaces in accessory
mode, it always declares in the control interface's baInterfaceNr array
that interfaces 0 and 1 belong to the audio function. However, the
accessory interface itself, if also enabled, already is at index 0 and
shifts the actual audio interface numbers to 1 and 2, which prevents the
PCM streaming interface from being seen by the host driver.
To get the PCM interface to work, detect when the descriptors
point to the (for this driver useless) accessory interface, and redirect
to the correct one.
Reported-by: Jeremy Rosen <jeremy.rosen@openwide.fr>
Tested-by: Jeremy Rosen <jeremy.rosen@openwide.fr>
Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Takashi Iwai [Mon, 17 Jun 2013 08:19:49 +0000 (10:19 +0200)]
ALSA: hda - Fix pin configurations for MacBook Air 4,2
commit
6ab982e8cf8e5760da407ccdc4abc815bea23179 upstream.
MacBook Air 4,2 requires the whole default pin configuration table to
be overridden by the driver, as usual, as Apple's machines don't set
up properly after boot. Otherwise the mic won't work, and other ill
effects may occur.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=59381
Reported-and-tested-by: Peter John Hartman <peterjohnhartman@gmail.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Takashi Iwai [Mon, 17 Jun 2013 08:25:02 +0000 (10:25 +0200)]
ALSA: usb-audio: Fix invalid volume resolution for Logitech HD Webcam c310
commit
36691e1be6ec551eef4a5225f126a281f8c051c2 upstream.
Just like the previous fix for the Logitech HD Webcam c270 in commit
11e7064f35bb87da8f427d1aa4bbd8b7473a3993, the c310 model also requires the
same workaround to avoid the kernel warning.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=59741
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jed Davis [Thu, 20 Jun 2013 03:07:14 +0000 (04:07 +0100)]
perf: arm64: Record the user-mode PC in the call chain.
commit
abc41254181e901ef5eda2c884ca6cd88a186b6d upstream.
With this change, we no longer lose the innermost entry in the user-mode
part of the call chain. See also the x86 port, which includes the ip,
and the corresponding change in arch/arm.
Signed-off-by: Jed Davis <jld@mozilla.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Gregory CLEMENT [Mon, 10 Jun 2013 17:05:51 +0000 (18:05 +0100)]
ARM: 7754/1: Fix the CPU ID and the mask associated to the PJ4B
commit
049be07053ebbf0ee8543caea23ae7bdf0765bb2 upstream.
This commit fixes the ID and mask for the PJ4B which was too
restrictive and didn't match the CPU of the Armada 370 SoC.
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jon Medhurst [Fri, 7 Jun 2013 09:35:35 +0000 (10:35 +0100)]
ARM: 7752/1: errata: LoUIS bit field in CLIDR register is incorrect
commit
691557941af4c12bd307ad81a4d9fa9c7743ac28 upstream.
On Cortex-A9 before version r1p0, the LoUIS bit field of the CLIDR
register returns zero when it should return one. This leads to cache
maintenance operations which rely on this value to not function as
intended, causing data corruption.
The workaround for this erratum is to detect affected CPUs and correct
the LoUIS value read.
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
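Illustration for the LoUIS erratum above: a small sketch of correcting the field after reading CLIDR. It relies on LoUIS occupying bits [23:21] of CLIDR; how an affected CPU is detected is left to the caller as a flag, and the function names are made up for this note rather than taken from the kernel.

```c
#include <assert.h>
#include <stdint.h>

#define CLIDR_LOUIS_SHIFT 21
#define CLIDR_LOUIS_MASK  (UINT32_C(7) << CLIDR_LOUIS_SHIFT)

/* Extract the LoUIS field from a CLIDR value. */
static unsigned clidr_louis(uint32_t clidr)
{
    return (unsigned)((clidr & CLIDR_LOUIS_MASK) >> CLIDR_LOUIS_SHIFT);
}

/*
 * On an affected CPU (decided by the caller), a LoUIS of zero is
 * replaced by one; all other bits are passed through unchanged.
 */
static uint32_t clidr_fixup(uint32_t clidr, int affected)
{
    if (affected && clidr_louis(clidr) == 0)
        clidr |= UINT32_C(1) << CLIDR_LOUIS_SHIFT;
    return clidr;
}
```

Cache maintenance code that loops up to LoUIS then sees one level to operate on instead of zero.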
Greg Kroah-Hartman [Thu, 20 Jun 2013 19:01:46 +0000 (12:01 -0700)]
Linux 3.9.7
Nicolas Schichan [Thu, 6 Jun 2013 17:00:46 +0000 (19:00 +0200)]
ARM: Kirkwood: handle mv88f6282 cpu in __kirkwood_variant().
commit
4089fe95bfed295c8ad38251d5fe02b6b0ba684c upstream.
MPP_F6281_MASK would previously be returned when on mv88f6282,
which would disallow some valid MPP configurations.
Commit
830f8b91 (arm: plat-orion: fix printing of "MPP config
unavailable on this hardware") made this problem visible as an invalid
MPP configuration is now correctly detected and not applied.
Signed-off-by: Nicolas Schichan <nschichan@freebox.fr>
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Nithin Sujir [Wed, 12 Jun 2013 18:08:59 +0000 (11:08 -0700)]
tg3: Wait for boot code to finish after power on
commit
df465abfe06f7dc4f33f4a96d17f096e9e8ac917 upstream.
Some systems that don't need wake-on-lan may choose to power down the
chip on system standby. Upon resume, the power on causes the boot code
to start up and initialize the hardware. On one new platform, this is
causing the device to go into a bad state due to a race between the
driver and boot code, once every several hundred resumes. The same race
exists on open since we come up from a power on.
This patch adds a wait for boot code signature at the beginning of
tg3_init_hw() which is common to both cases. If there has not been a
power-off or the boot code has already completed, the signature will be
present and poll_fw() returns immediately. Also return immediately if
the device does not have firmware.
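The wait described above can be sketched as a bounded polling loop. This is a toy model under stated assumptions, not the tg3 code: struct nic_model, read_mailbox(), wait_for_boot_code() and the signature value are all hypothetical names chosen for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define BOOT_DONE_SIGNATURE	0x600dc0deu	/* hypothetical value */

/* Toy device: the signature appears after `ready_after` mailbox reads. */
struct nic_model {
	bool has_firmware;
	unsigned int ready_after;
	unsigned int reads;
};

static uint32_t read_mailbox(struct nic_model *nic)
{
	return nic->reads++ >= nic->ready_after ? BOOT_DONE_SIGNATURE : 0;
}

/* Returns 0 once the boot code has finished, -1 on timeout. */
int wait_for_boot_code(struct nic_model *nic, unsigned int max_polls)
{
	unsigned int i;

	if (!nic->has_firmware)
		return 0;	/* nothing to wait for */

	for (i = 0; i < max_polls; i++) {
		if (read_mailbox(nic) == BOOT_DONE_SIGNATURE)
			return 0;	/* first poll succeeds if already booted */
		/* a real driver would delay between polls here */
	}
	return -1;
}
```

When there has been no power-off, the signature is already present and the loop exits on its first iteration, so the common path costs a single read.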
Signed-off-by: Nithin Nayak Sujir <nsujir@broadcom.com>
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Johan Hovold [Mon, 10 Jun 2013 16:29:39 +0000 (18:29 +0200)]
USB: spcp8x5: fix device initialisation at open
commit 5e4211f1c47560c36a8b3d4544dfd866dcf7ccd0 upstream.
Do not use uninitialised termios data to determine when to configure the
device at open.
Signed-off-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Johan Hovold [Mon, 10 Jun 2013 16:29:37 +0000 (18:29 +0200)]
USB: f81232: fix device initialisation at open
commit 21886725d58e92188159731c7c1aac803dd6b9dc upstream.
Do not use uninitialised termios data to determine when to configure the
device at open.
This also prevents stack data from leaking to userspace.
Signed-off-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Johan Hovold [Mon, 10 Jun 2013 16:29:38 +0000 (18:29 +0200)]
USB: pl2303: fix device initialisation at open
commit 2d8f4447b58bba5f8cb895c07690434c02307eaf upstream.
Do not use uninitialised termios data to determine when to configure the
device at open.
This also prevents stack data from leaking to userspace in the OOM error
path.
Signed-off-by: Johan Hovold <jhovold@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Alexander Shishkin [Tue, 11 Jun 2013 10:41:48 +0000 (13:41 +0300)]
usb: chipidea: fix id change handling
commit 0c3f3dc68bb6e6950e8cd7851e7778c550e8dfb4 upstream.
Re-enable chipidea irq even if there's no role changing to do. This is
a problem since b183c19f ("USB: chipidea: re-order irq handling to avoid
unhandled irqs"); when it manifests, chipidea irq gets disabled for good.
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Benjamin Herrenschmidt [Sat, 15 Jun 2013 02:13:40 +0000 (12:13 +1000)]
powerpc: Fix missing/delayed calls to irq_work
commit 230b3034793247f61e6a0b08c44cf415f6d92981 upstream.
When replaying interrupts (as a result of the interrupt occurring
while soft-disabled), in the case of the decrementer, we are exclusively
testing for a pending timer target. However we also use decrementer
interrupts to trigger the new "irq_work", which in this case would
be missed.
This changes the logic to force a replay in both cases of a timer
boundary reached and a decrementer interrupt having actually occurred
while disabled. The former test is still useful to catch cases where
a CPU having been hard-disabled for a long time completely misses the
interrupt due to a decrementer rollover.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Tested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Paul Mackerras [Fri, 14 Jun 2013 10:07:41 +0000 (20:07 +1000)]
powerpc: Fix emulation of illegal instructions on PowerNV platform
commit bf593907f7236e95698a76b7c7a2bbf8b1165327 upstream.
Normally, the kernel emulates a few instructions that are unimplemented
on some processors (e.g. the old dcba instruction), or privileged (e.g.
mfpvr). The emulation of unimplemented instructions is currently not
working on the PowerNV platform. The reason is that on these machines,
unimplemented and illegal instructions cause a hypervisor emulation
assist interrupt, rather than a program interrupt as on older CPUs.
Our vector for the emulation assist interrupt just calls
program_check_exception() directly, without setting the bit in SRR1
that indicates an illegal instruction interrupt. This fixes it by
making the emulation assist interrupt set that bit before calling
program_check_exception(). With this, old programs that use no-longer
implemented instructions such as dcba now work again.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Michael Ellerman [Thu, 13 Jun 2013 11:04:56 +0000 (21:04 +1000)]
powerpc: Fix stack overflow crash in resume_kernel when ftracing
commit 0e37739b1c96d65e6433998454985de994383019 upstream.
It's possible for us to crash when running with ftrace enabled, eg:
Bad kernel stack pointer bffffd12 at c00000000000a454
cpu 0x3: Vector: 300 (Data Access) at [c00000000ffe3d40]
pc: c00000000000a454: resume_kernel+0x34/0x60
lr: c00000000000335c: performance_monitor_common+0x15c/0x180
sp: bffffd12
msr: 8000000000001032
dar: bffffd12
dsisr: 42000000
If we look at current's stack (paca->__current->stack) we see it is
equal to c0000002ecab0000. Our stack is 16K, and comparing to
paca->kstack (c0000002ecab3e30) we can see that we have overflowed our
kernel stack. This leads to us writing over our struct thread_info, and
in this case we have corrupted thread_info->flags and set
_TIF_EMULATE_STACK_STORE.
Dumping the stack we see:
3:mon> t c0000002ecab0000
[c0000002ecab0000] c00000000002131c .performance_monitor_exception+0x5c/0x70
[c0000002ecab0080] c00000000000335c performance_monitor_common+0x15c/0x180
--- Exception: f01 (Performance Monitor) at c0000000000fb2ec .trace_hardirqs_off+0x1c/0x30
[c0000002ecab0370] c00000000016fdb0 .trace_graph_entry+0xb0/0x280 (unreliable)
[c0000002ecab0410] c00000000003d038 .prepare_ftrace_return+0x98/0x130
[c0000002ecab04b0] c00000000000a920 .ftrace_graph_caller+0x14/0x28
[c0000002ecab0520] c0000000000d6b58 .idle_cpu+0x18/0x90
[c0000002ecab05a0] c00000000000a934 .return_to_handler+0x0/0x34
[c0000002ecab0620] c00000000001e660 .timer_interrupt+0x160/0x300
[c0000002ecab06d0] c0000000000025dc decrementer_common+0x15c/0x180
--- Exception: 901 (Decrementer) at c0000000000104d4 .arch_local_irq_restore+0x74/0xa0
[c0000002ecab09c0] c0000000000fe044 .trace_hardirqs_on+0x14/0x30 (unreliable)
[c0000002ecab0fb0] c00000000016fe3c .trace_graph_entry+0x13c/0x280
[c0000002ecab1050] c00000000003d038 .prepare_ftrace_return+0x98/0x130
[c0000002ecab10f0] c00000000000a920 .ftrace_graph_caller+0x14/0x28
[c0000002ecab1160] c0000000000161f0 .__ppc64_runlatch_on+0x10/0x40
[c0000002ecab11d0] c00000000000a934 .return_to_handler+0x0/0x34
--- Exception: 901 (Decrementer) at c0000000000104d4 .arch_local_irq_restore+0x74/0xa0
... and so on
__ppc64_runlatch_on() is called from RUNLATCH_ON in the exception entry
path. At that point the irq state is not consistent, ie. interrupts are
hard disabled (by the exception entry), but the paca soft-enabled flag
may be out of sync.
This leads to the local_irq_restore() in trace_graph_entry() actually
enabling interrupts, which we do not want. Because we have not yet
reprogrammed the decrementer we immediately take another decrementer
exception, and recurse.
The fix is twofold. Firstly make sure we call DISABLE_INTS before
calling RUNLATCH_ON. The badly named DISABLE_INTS actually reconciles
the irq state in the paca with the hardware, making it safe again to
call local_irq_save/restore().
Although that should be sufficient to fix the bug, we also mark the
runlatch routines as notrace. They are called very early in the
exception entry and we are asking for trouble tracing them. They are
also fairly uninteresting and tracing them just adds unnecessary
overhead.
[ This regression was introduced by fe1952fc0afb9a2e4c79f103c08aef5d13db1873
"powerpc: Rework runlatch code" by myself --BenH ]
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Matthew Garrett [Sat, 1 Jun 2013 20:06:20 +0000 (16:06 -0400)]
Modify UEFI anti-bricking code
commit f8b8404337de4e2466e2e1139ea68b1f8295974f upstream.
This patch reworks the UEFI anti-bricking code, including an effective
reversion of cc5a080c and 31ff2f20. It turns out that calling
QueryVariableInfo() from boot services results in some firmware
implementations jumping to physical addresses even after entering virtual
mode, so until we have 1:1 mappings for UEFI runtime space this isn't
going to work so well.
Reverting these gets us back to the situation where we'd refuse to create
variables on some systems because they classify deleted variables as "used"
until the firmware triggers a garbage collection run, which they won't do
until they reach a lower threshold. This results in it being impossible to
install a bootloader, which is unhelpful.
Feedback from Samsung indicates that the firmware doesn't need more than
5KB of storage space for its own purposes, so that seems like a reasonable
threshold. However, there's still no guarantee that a platform will attempt
garbage collection merely because it drops below this threshold. It seems
that this is often only triggered if an attempt to write generates a
genuine EFI_OUT_OF_RESOURCES error. We can force that by attempting to
create a variable larger than the remaining space. This should fail, but if
it somehow succeeds we can then immediately delete it.
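The threshold check and the forced garbage collection can be sketched with a toy model of the variable store. Everything here is an assumption made for illustration: struct var_store, store_write(), store_delete() and may_create_variable() are hypothetical names, and the model simply assumes that the firmware reclaims deleted variables whenever a write fails for lack of space.

```c
#include <stdbool.h>
#include <stddef.h>

#define RESERVED_BYTES	5120	/* the 5KB the firmware may need */

/* Toy variable store: deleted variables still count as "used" until
 * the (modelled) firmware runs garbage collection. */
struct var_store {
	size_t total;
	size_t live;	/* bytes held by live variables */
	size_t garbage;	/* bytes held by deleted, uncollected variables */
};

static size_t remaining(const struct var_store *s)
{
	return s->total - s->live - s->garbage;
}

/* Modelled firmware write: an out-of-resources failure triggers GC. */
static bool store_write(struct var_store *s, size_t size)
{
	if (size > remaining(s)) {
		s->garbage = 0;
		return false;	/* stands in for EFI_OUT_OF_RESOURCES */
	}
	s->live += size;
	return true;
}

static void store_delete(struct var_store *s, size_t size)
{
	s->live -= size;
	s->garbage += size;
}

static bool fits_with_reserve(const struct var_store *s, size_t size)
{
	return size <= remaining(s) && remaining(s) - size >= RESERVED_BYTES;
}

/* Decide whether a new variable of `size` bytes may be created. */
bool may_create_variable(struct var_store *s, size_t size)
{
	size_t dummy;

	if (fits_with_reserve(s, size))
		return true;

	/* Ask for more than is left so the write genuinely fails. */
	dummy = remaining(s) + 1;
	if (store_write(s, dummy))
		store_delete(s, dummy);	/* unexpectedly succeeded: clean up */

	return fits_with_reserve(s, size);
}
```

In this model the oversized write always fails, so the delete branch is never taken; it is kept only to mirror the "if it somehow succeeds" case described above.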
I've tested this on the UEFI machines I have available, but I don't have
a Samsung and so can't verify that it avoids the bricking problem.
Signed-off-by: Matthew Garrett <matthew.garrett@nebula.com>
Signed-off-by: Lee, Chun-Y <jlee@suse.com> [ dummy variable cleanup ]
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Sage Weil [Mon, 25 Mar 2013 17:26:30 +0000 (10:26 -0700)]
libceph: wrap auth methods in a mutex
commit e9966076cdd952e19f2dd4854cd719be0d7cbebc upstream.
The auth code is called from a variety of contexts, including the mon_client
(protected by the monc's mutex) and the messenger callbacks (currently
protected by nothing). Avoid chaos by protecting all auth state with a
mutex. Nothing is blocking, so this should be simple and lightweight.
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Sage Weil [Mon, 25 Mar 2013 17:26:14 +0000 (10:26 -0700)]
libceph: wrap auth ops in wrapper functions
commit 27859f9773e4a0b2042435b13400ee2c891a61f4 upstream.
Use wrapper functions that check whether the auth op exists so that callers
do not need a bunch of conditional checks. Simplifies the external
interface.
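The wrapper pattern can be sketched as below. This is a minimal illustration rather than the libceph code: the struct layout, the sample ops table and the function names auth_is_authenticated() and auth_reset() are hypothetical.

```c
#include <stddef.h>

struct auth_client;

/* Table of optional operations; any entry may be NULL. */
struct auth_client_ops {
	int (*is_authenticated)(struct auth_client *ac);
	void (*reset)(struct auth_client *ac);
};

struct auth_client {
	const struct auth_client_ops *ops;
	int authenticated;
};

/* Wrappers: callers never test the function pointers themselves. */
int auth_is_authenticated(struct auth_client *ac)
{
	if (ac->ops && ac->ops->is_authenticated)
		return ac->ops->is_authenticated(ac);
	return 0;
}

void auth_reset(struct auth_client *ac)
{
	if (ac->ops && ac->ops->reset)
		ac->ops->reset(ac);
}

/* Sample protocol that implements only one of the two ops. */
static int sample_is_authenticated(struct auth_client *ac)
{
	return ac->authenticated;
}

const struct auth_client_ops sample_ops = {
	.is_authenticated = sample_is_authenticated,
	.reset = NULL,
};
```

A caller can invoke auth_reset() on a client whose protocol has no reset op and nothing happens, which is what removes the conditional checks from the call sites.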
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Sage Weil [Mon, 25 Mar 2013 17:26:01 +0000 (10:26 -0700)]
libceph: add update_authorizer auth method
commit 0bed9b5c523d577378b6f83eab5835fe30c27208 upstream.
Currently the messenger calls out to a get_authorizer con op, which will
create a new authorizer if it doesn't yet have one. In the meantime, when
we rotate our service keys, the authorizer doesn't get updated. Eventually
it will be rejected by the server on a new connection attempt and get
invalidated, and we will then rebuild a new authorizer, but this is not
ideal.
Instead, if we do have an authorizer, call a new update_authorizer op that
will verify that the current authorizer is using the latest secret. If it
is not, we will build a new one that does. This avoids the transient
failure.
This fixes one part of the sorry sequence of events for bug
http://tracker.ceph.com/issues/4282
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Sage Weil [Mon, 25 Mar 2013 17:25:49 +0000 (10:25 -0700)]
libceph: fix authorizer invalidation
commit 4b8e8b5d78b8322351d44487c1b76f7e9d3412bc upstream.
We were invalidating the authorizer by removing the ticket handler
entirely. This was effective in inducing us to request a new authorizer,
but in the meantime it meant that any authorizer we generated would get a
new and initialized handler with secret_id=0, which would always be
rejected by the server side with a confusing error message:
auth: could not find secret_id=0
cephx: verify_authorizer could not get service secret for service osd secret_id=0
Instead, simply clear the validity field. This will still induce the auth
code to request a new secret, but will let us continue to use the old
ticket in the meantime. The messenger code will probably continue to fail,
but the exponential backoff will kick in, and eventually we will get a
new (hopefully more valid) ticket from the mon and be able to continue.
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Sage Weil [Mon, 25 Mar 2013 16:30:13 +0000 (09:30 -0700)]
libceph: clear messenger auth_retry flag when we authenticate
commit 20e55c4cc758e4dccdfd92ae8e9588dd624b2cd7 upstream.
We maintain a counter of failed auth attempts to allow us to retry once
before failing. However, if the second attempt succeeds, the flag isn't
cleared, which makes us think auth failed again later when the connection
resets for other reasons (like a socket error).
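The retry accounting can be sketched as a small state model. It is an illustration under assumptions, not the messenger code: struct con_model and the two functions are hypothetical, and the real messenger tracks this state alongside much else.

```c
#include <stdbool.h>

/* Hypothetical connection state: counts consecutive failed auth attempts. */
struct con_model {
	int auth_retry;
};

/* Called when the peer rejects our authorizer.
 * Returns true if we should retry, false if we should give up. */
bool con_auth_failed(struct con_model *con)
{
	con->auth_retry++;
	return con->auth_retry < 2;	/* one retry is allowed */
}

/* Called when authentication succeeds. */
void con_auth_succeeded(struct con_model *con)
{
	con->auth_retry = 0;	/* forget earlier failures */
}
```

Without the reset in con_auth_succeeded(), a failure followed by a successful retry would leave the count at one, and the next unrelated failure would be treated as the second in a row.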
This is one part of the sorry sequence of events in bug
http://tracker.ceph.com/issues/4282
Signed-off-by: Sage Weil <sage@inktank.com>
Reviewed-by: Alex Elder <elder@inktank.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Ben Skeggs [Mon, 3 Jun 2013 06:40:14 +0000 (16:40 +1000)]
drm/nv50/kms: use dac loadval from vbios, where it's available
commit d40ee48acde16894fb3b241d7e896d5fa84e0f10 upstream.
Regression from merging the old nv50/nvd9 code together, and may be
needed to fully fix fdo#64904.
The value is ignored completely by the hardware starting from nva3.
Reported-by: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Ben Skeggs [Mon, 3 Jun 2013 06:07:06 +0000 (16:07 +1000)]
drm/nv50/disp: force dac power state during load detect
commit ea9197cc323839ef3d5280c0453b2c622caa6bc7 upstream.
fdo#64904
Reported-by: Gerhard Bräunlich <wippbox@gmx.net>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Kees Cook [Wed, 5 Jun 2013 18:47:18 +0000 (11:47 -0700)]
x86: Fix typo in kexec register clearing
commit c8a22d19dd238ede87aa0ac4f7dbea8da039b9c1 upstream.
Fixes a typo in register clearing code. Thanks to PaX Team for fixing
this originally, and James Troup for pointing it out.
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: http://lkml.kernel.org/r/20130605184718.GA8396@www.outflux.net
Cc: PaX Team <pageexec@freemail.hu>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Yinghai Lu [Fri, 31 May 2013 15:53:07 +0000 (08:53 -0700)]
x86: Fix adjust_range_size_mask calling position
commit 7de3d66b1387ddf5a37d9689e5eb8510fb75c765 upstream.
Commit 8d57470d ("x86, mm: setup page table in top-down")
causes a kernel panic when booting with mem=2G.
[mem 0x00000000-0x000fffff] page 4k
[mem 0x7fe00000-0x7fffffff] page 1G
[mem 0x7c000000-0x7fdfffff] page 1G
[mem 0x00100000-0x001fffff] page 4k
[mem 0x00200000-0x7bffffff] page 2M
The last entry is not what we want; we should have
[mem 0x00200000-0x3fffffff] page 2M
[mem 0x40000000-0x7bffffff] page 1G
Actually we merge the contiguous ranges with the same page size too early.
In this case, before merging we have
[mem 0x00200000-0x3fffffff] page 2M
[mem 0x40000000-0x7bffffff] page 2M
After merging them, we get
[mem 0x00200000-0x7bffffff] page 2M
even though we could use a 1G page to map
[mem 0x40000000-0x7bffffff]
That causes a problem, because we have already mapped
[mem 0x7fe00000-0x7fffffff] page 1G
[mem 0x7c000000-0x7fdfffff] page 1G
with 1G pages; that is, [0x40000000-0x7fffffff] is already mapped with a 1G page.
During phys_pud_init() for [0x40000000-0x7bffffff], it will not
reuse the existing pud page; it allocates a new one and then tries to use
2M pages to map the range instead, as page_size_mask does not include
PG_LEVEL_1G. In the end [0x7c000000-0x7fffffff] is left unmapped, because
the loop in phys_pmd_init() stops mapping at 0x7bffffff.
That is the right behavior: it maps the exact range with the exact page size
that we ask for, and we would have to call it explicitly to map
[0x7c000000-0x7fffffff] before or after mapping 0x40000000-0x7bffffff.
In any case, we need to make sure each range's page_size_mask is correct and
consistent after split_mem_range().
Fix that by calling adjust_range_size_mask() before merging ranges
with the same page size.
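Why the order of the two steps matters can be sketched with a simplified model. This is not the kernel code: struct range_model, region_is_ram(), adjust_masks() and merge_ranges() are hypothetical, only the 1G upgrade is modelled, and the RAM layout (everything in [1M, 2G) is RAM, the area below 1M is not) is an assumption made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

#define SZ_1G	0x40000000ull

enum { PAGE_2M = 1 << 0, PAGE_1G = 1 << 1 };

struct range_model {
	uint64_t start, end;	/* [start, end) */
	unsigned int mask;	/* allowed page sizes */
};

/* Assumed layout for mem=2G: only [1M, 2G) counts as RAM. */
static bool region_is_ram(uint64_t start, uint64_t end)
{
	return start >= 0x100000ull && end <= 0x80000000ull;
}

/* Allow 1G pages where the enclosing 1G-aligned region is all RAM. */
void adjust_masks(struct range_model *r, int n)
{
	for (int i = 0; i < n; i++) {
		uint64_t s = r[i].start & ~(SZ_1G - 1);
		uint64_t e = (r[i].end + SZ_1G - 1) & ~(SZ_1G - 1);

		if (!(r[i].mask & PAGE_1G) && region_is_ram(s, e))
			r[i].mask |= PAGE_1G;
	}
}

/* Merge adjacent ranges that share a mask; returns the new count. */
int merge_ranges(struct range_model *r, int n)
{
	int out = 0;

	for (int i = 1; i < n; i++) {
		if (r[out].end == r[i].start && r[out].mask == r[i].mask)
			r[out].end = r[i].end;
		else
			r[++out] = r[i];
	}
	return n ? out + 1 : 0;
}
```

Starting from the two 2M ranges [0x00200000-0x3fffffff] and [0x40000000-0x7bffffff], adjusting first upgrades only the second range to 1G, so the merge leaves them separate. Merging first yields a single range whose enclosing 1G-aligned region includes non-RAM below 1M, so it can never be upgraded.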
-v2: update change log.
-v3: add more explanation of why [0x7c000000-0x7fffffff] is not mapped, and
why that causes a panic.
Bisected-by: "Xie, ChanglongX" <changlongx.xie@intel.com>
Bisected-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reported-and-tested-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/1370015587-20835-1-git-send-email-yinghai@kernel.org
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Naoya Horiguchi [Wed, 12 Jun 2013 21:05:04 +0000 (14:05 -0700)]
mm: migration: add migrate_entry_wait_huge()
commit 30dad30922ccc733cfdbfe232090cf674dc374dc upstream.
When we have a page fault for an address which is backed by a hugepage
under migration, the kernel can't wait correctly and busy-loops on the
hugepage fault until the migration finishes. As a result, users who try
to kick hugepage migration (via soft offlining, for example) occasionally
experience long delays or soft lockups.
This is because pte_offset_map_lock() can't get a correct migration entry
or a correct page table lock for hugepage. This patch introduces
migration_entry_wait_huge() to solve this.
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Tomasz Stanislawski [Wed, 12 Jun 2013 21:05:02 +0000 (14:05 -0700)]
mm/page_alloc.c: fix watermark check in __zone_watermark_ok()
commit 026b08147923142e925a7d0aaa39038055ae0156 upstream.
The watermark check consists of two sub-checks. The first one is:
if (free_pages <= min + lowmem_reserve)
	return false;
The check ensures that there is a minimal amount of RAM in the zone. If
CMA is used then free_pages is reduced by the number of free pages
in CMA prior to the above-mentioned check.
if (!(alloc_flags & ALLOC_CMA))
	free_pages -= zone_page_state(z, NR_FREE_CMA_PAGES);
This prevents the zone from being drained of pages available for
non-movable allocations.
The second check prevents the zone from getting too fragmented.
for (o = 0; o < order; o++) {
	free_pages -= z->free_area[o].nr_free << o;
	min >>= 1;
	if (free_pages <= min)
		return false;
}
The field z->free_area[o].nr_free is equal to the number of free pages
including free CMA pages. Therefore the CMA pages are subtracted twice.
This may cause a false positive fail of __zone_watermark_ok() if the CMA
area gets strongly fragmented. In such a case there are many 0-order
free pages located in CMA. Those pages are subtracted twice therefore
they will quickly drain free_pages during the check against
fragmentation. The test fails even though there are many free non-cma
pages in the zone.
This patch fixes this issue by subtracting CMA pages only for the purpose of
the (free_pages <= min + lowmem_reserve) check.
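The double subtraction and its fix can be reproduced with a simplified model of the two sub-checks. This is a sketch, not the kernel function: struct zone_model and zone_watermark_ok_model() are hypothetical, the `legacy` flag selects the old behaviour for comparison, and other adjustments made by the real __zone_watermark_ok() are left out.

```c
#include <stdbool.h>

#define MODEL_MAX_ORDER	11

struct zone_model {
	long nr_free[MODEL_MAX_ORDER];	/* free blocks per order, CMA included */
	long free_pages;		/* total free pages, CMA included */
	long free_cma_pages;
	long lowmem_reserve;
};

bool zone_watermark_ok_model(const struct zone_model *z, int order, long min,
			     bool alloc_cma, bool legacy)
{
	long free_pages = z->free_pages;
	long free_cma = alloc_cma ? 0 : z->free_cma_pages;
	int o;

	if (legacy) {
		/* Old behaviour: CMA pages are removed from free_pages for
		 * good, then counted again via nr_free[] in the loop below. */
		free_pages -= free_cma;
		free_cma = 0;
	}

	/* First sub-check: CMA pages are discounted here only. */
	if (free_pages - free_cma <= min + z->lowmem_reserve)
		return false;

	/* Second sub-check: fragmentation. */
	for (o = 0; o < order; o++) {
		free_pages -= z->nr_free[o] << o;
		min >>= 1;
		if (free_pages <= min)
			return false;
	}
	return true;
}
```

For a zone with 1000 free pages, of which 600 are order-0 CMA pages and 400 sit in order-2 blocks, an order-2 request with min = 100 fails in the legacy variant (free_pages drops to -200 after order 0) and passes in the fixed one.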
Laura said:
We were observing allocation failures of higher order pages (order 5 =
128K typically) under tight memory conditions resulting in driver
failure. The output from the page allocation failure showed plenty of
free pages of the appropriate order/type/zone and mostly CMA pages in
the lower orders.
For full disclosure, we still observed some page allocation failures
even after applying the patch but the number was drastically reduced and
those failures were attributed to fragmentation/other system issues.
Signed-off-by: Tomasz Stanislawski <t.stanislaws@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Tested-by: Laura Abbott <lauraa@codeaurora.org>
Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Acked-by: Minchan Kim <minchan@kernel.org>
Cc: Mel Gorman <mel@csn.ul.ie>
Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
NeilBrown [Wed, 12 Jun 2013 01:01:22 +0000 (11:01 +1000)]
md/raid1,raid10: use freeze_array in place of raise_barrier in various places.
commit e2d59925221cd562e07fee38ec8839f7209ae603 upstream.
Various places in raid1 and raid10 are calling raise_barrier when they
really should call freeze_array.
The former is only intended to be called from "make_request".
The latter has extra checks for 'nr_queued' and makes a call to
flush_pending_writes(), so it is safe to call it from within the
management thread.
Using raise_barrier will sometimes deadlock. Using freeze_array
should not.
As 'freeze_array' currently expects one request to be pending (in
handle_read_error - the only previous caller), we need to pass
it the number of pending requests (extra) to ignore.
The deadlock was made particularly noticeable by commits
050b66152f87c7 (raid10) and 6b740b8d79252f13 (raid1) which
appeared in 3.4, so the fix is appropriate for any -stable
kernel since then.
This patch probably won't apply directly to some early kernels and
will need to be applied by hand.
Reported-by: Alexander Lyakas <alex.bolshoy@gmail.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
H. Peter Anvin [Wed, 12 Jun 2013 14:37:43 +0000 (07:37 -0700)]
md/raid1,5,10: Disable WRITE SAME until a recovery strategy is in place
commit 5026d7a9b2f3eb1f9bda66c18ac6bc3036ec9020 upstream.
There are cases where the kernel will believe that the WRITE SAME
command is supported by a block device which does not, in fact,
support WRITE SAME. This currently happens for SATA drives behind a
SAS controller, but there are probably a hundred other ways that can
happen, including drive firmware bugs.
After receiving an error for WRITE SAME the block layer will retry the
request as a plain write of zeroes, but mdraid will consider the
failure as fatal and consider the drive failed. This has the effect
that all the mirrors containing a specific set of data are each
offlined in very rapid succession resulting in data loss.
However, just bouncing the request back up to the block layer isn't
ideal either, because the whole initial request-retry sequence should
be inside the write bitmap fence, which probably means that md needs
to do its own conversion of WRITE SAME to write zero.
Until the failure scenario has been sorted out, disable WRITE SAME for
raid1, raid5, and raid10.
[neilb: added raid5]
This patch is appropriate for any -stable since 3.7 when write_same
support was added.
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Alex Lyakas [Tue, 4 Jun 2013 17:42:21 +0000 (20:42 +0300)]
md/raid1: consider WRITE as successful only if at least one non-Faulty and non-rebuilding drive completed it.
commit 3056e3aec8d8ba61a0710fb78b2d562600aa2ea7 upstream.
Without that fix, the following scenario could happen:
- RAID1 with drives A and B; drive B was freshly-added and is rebuilding
- Drive A fails
- WRITE request arrives to the array. It is failed by drive A, so
r1_bio is marked as R1BIO_WriteError, but the rebuilding drive B
succeeds in writing it, so the same r1_bio is marked as
R1BIO_Uptodate.
- r1_bio arrives to handle_write_finished, badblocks are disabled,
md_error()->error() does nothing because we don't fail the last drive
of raid1
- raid_end_bio_io() calls call_bio_endio()
- As a result, in call_bio_endio():
if (!test_bit(R1BIO_Uptodate, &r1_bio->state))
	clear_bit(BIO_UPTODATE, &bio->bi_flags);
this code doesn't clear the BIO_UPTODATE flag, and the whole master
WRITE succeeds, back to the upper layer.
So we returned success to the upper layer, even though we had written
the data onto the rebuilding drive only. But when we want to read the
data back, we would not read from the rebuilding drive, so this data
is lost.
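The rule stated in the subject line can be sketched as follows. This is a simplified model, not the raid1 code: struct mirror_model and both functions are hypothetical, and `rebuilding` stands in for a drive that is not yet in sync.

```c
#include <stdbool.h>

struct mirror_model {
	bool faulty;
	bool rebuilding;
};

/* A completed write only counts if the drive can later serve reads. */
bool write_counts_as_success(const struct mirror_model *m)
{
	return !m->faulty && !m->rebuilding;
}

/* write_ok[i] says whether drive i completed the write without error. */
bool master_write_succeeded(const struct mirror_model *mirrors,
			    const bool *write_ok, int n)
{
	bool uptodate = false;

	for (int i = 0; i < n; i++)
		if (write_ok[i] && write_counts_as_success(&mirrors[i]))
			uptodate = true;
	return uptodate;
}
```

In the scenario above, drive A fails the write and drive B completes it while rebuilding, so no drive qualifies and the master WRITE is reported as failed instead of succeeding.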
[neilb - applied identical change to raid10 as well]
This bug can result in lost data, so it is suitable for any
-stable kernel.
Signed-off-by: Alex Lyakas <alex@zadarastorage.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Rafael Aquini [Wed, 12 Jun 2013 21:04:49 +0000 (14:04 -0700)]
swap: avoid read_swap_cache_async() race to deadlock while waiting on discard I/O completion
commit cbab0e4eec299e9059199ebe6daf48730be46d2b upstream.
read_swap_cache_async() can race against get_swap_page(), and stumble
across a SWAP_HAS_CACHE entry in the swap map whose page wasn't brought
into the swapcache yet.
This swap_map state is expected to be transitory, but the
actual placement of discard at scan_swap_map() inserts a wait for I/O
completion, thus making the thread at read_swap_cache_async() loop
around its -EEXIST case while the other end at get_swap_page() is
scheduled away at scan_swap_map(). This can leave the system deadlocked
if the I/O completion happens to be waiting on the CPU waitqueue where
read_swap_cache_async() is busy looping and !CONFIG_PREEMPT.
This patch introduces a cond_resched() call so that the aforementioned
read_swap_cache_async() busy loop yields the CPU when necessary,
thus avoiding the subtle race window.
Signed-off-by: Rafael Aquini <aquini@redhat.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Shaohua Li <shli@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Daniel Vetter [Mon, 10 Jun 2013 07:47:58 +0000 (09:47 +0200)]
drm/i915: prefer VBT modes for SVDO-LVDS over EDID
commit c3456fb3e4712d0448592af3c5d644c9472cd3c1 upstream.
In
commit 53d3b4d7778daf15900867336c85d3f8dd70600c
Author: Egbert Eich <eich@suse.de>
Date: Tue Jun 4 17:13:21 2013 +0200
drm/i915/sdvo: Use &intel_sdvo->ddc instead of intel_sdvo->i2c for DDC
Egbert Eich fixed a long-standing bug where we simply used a
non-working i2c controller to read the EDID for SDVO-LVDS panels.
Unfortunately some machines seem to not be able to cope with the mode
provided in the EDID. Specifically they seem to not be able to cope
with a 4x pixel multiplier instead of a 2x one, which seems to have
been worked around by slightly changing the panel's native mode in the
VBT so that the dotclock is just barely above 50MHz.
Since it took forever to notice the breakage it's fairly safe to
assume that at least for SDVO-LVDS panels the VBT contains fairly sane
data. So just switch around the order and use VBT modes first.
v2: Also add EDID modes just in case, and spell Egbert correctly.
v3: Elaborate a bit more about what's going on on Chris' machine.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65524
Reported-and-tested-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Egbert Eich <eich@suse.de>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Luciano Coelho [Fri, 10 May 2013 07:19:38 +0000 (10:19 +0300)]
wl12xx: fix minimum required firmware version for wl127x multirole
commit 60c28cf18f970e1c1bd40d615596eeab6efbd9d7 upstream.
There was a typo in commit 8675f9 (wlcore/wl12xx/wl18xx: verify
multi-role and single-role fw versions), which was causing the
multirole firmware for wl127x (WiLink6) to be rejected. The actual
minimum version needed for wl127x multirole is 6.5.7.0.42.
Reported-by: Levi Pearson <levipearson@gmail.com>
Reported-by: Michael Scott <hashcode0f@gmail.com>
Signed-off-by: Luciano Coelho <coelho@ti.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Andrey Vagin [Wed, 12 Jun 2013 21:04:42 +0000 (14:04 -0700)]
memcg: don't initialize kmem-cache destroying work for root caches
commit f101a9464bfbda42730b54a66f926d75ed2cd31e upstream.
struct memcg_cache_params has a union. Different parts of this union
are used for root and non-root caches. A part with destroying work is
used only for non-root caches.
BUG: unable to handle kernel paging request at 0000000fffffffe0
IP: kmem_cache_alloc+0x41/0x1f0
Modules linked in: netlink_diag af_packet_diag udp_diag tcp_diag inet_diag unix_diag ip6table_filter ip6_tables i2c_piix4 virtio_net virtio_balloon microcode i2c_core pcspkr floppy
CPU: 0 PID: 1929 Comm: lt-vzctl Tainted: G D 3.10.0-rc1+ #2
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
RIP: kmem_cache_alloc+0x41/0x1f0
Call Trace:
getname_flags.part.34+0x30/0x140
getname+0x38/0x60
do_sys_open+0xc5/0x1e0
SyS_open+0x22/0x30
system_call_fastpath+0x16/0x1b
Code: f4 53 48 83 ec 18 8b 05 8e 53 b7 00 4c 8b 4d 08 21 f0 a8 10 74 0d 4c 89 4d c0 e8 1b 76 4a 00 4c 8b 4d c0 e9 92 00 00 00 4d 89 f5 <4d> 8b 45 00 65 4c 03 04 25 48 cd 00 00 49 8b 50 08 4d 8b 38 49
RIP [<ffffffff8116b641>] kmem_cache_alloc+0x41/0x1f0
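The hazard behind the union can be sketched with a cut-down structure. This is an illustration only: struct cache_params_model, its members and init_cache_params() are hypothetical and much smaller than the kernel's struct memcg_cache_params.

```c
#include <stdbool.h>
#include <string.h>

struct work_model {
	void (*func)(struct work_model *work);
};

struct cache_params_model {
	bool is_root_cache;
	union {
		/* root caches: table of per-cgroup child caches */
		void *child_caches[2];
		/* non-root caches: owner and deferred-destroy work */
		struct {
			void *owner;
			struct work_model destroy;
		};
	};
};

static void destroy_work_fn(struct work_model *work)
{
	(void)work;	/* placeholder body for the sketch */
}

void init_cache_params(struct cache_params_model *p, bool is_root)
{
	memset(p, 0, sizeof(*p));
	p->is_root_cache = is_root;
	/* The destroy work shares storage with child_caches[], so it is
	 * set up only for non-root caches. */
	if (!is_root)
		p->destroy.func = destroy_work_fn;
}
```

Initializing `destroy` unconditionally would write a function pointer into the storage that a root cache reads as a child cache pointer, which is the kind of bogus pointer dereferenced in the trace above.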
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Glauber Costa <glommer@parallels.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: Michal Hocko <mhocko@suse.cz>
Cc: Li Zefan <lizefan@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Stephen M. Cameron [Wed, 12 Jun 2013 21:04:47 +0000 (14:04 -0700)]
cciss: fix broken mutex usage in ioctl
commit 03f47e888daf56c8e9046c674719a0bcc644eed5 upstream.
If a new logical drive is added and the CCISS_REGNEWD ioctl is invoked
(as is normal with the Array Configuration Utility) the process will
hang as below. It attempts to acquire the same mutex twice, once in
do_ioctl() and once in cciss_unlocked_open(). The BKL was recursive,
the mutex isn't.
Linux version 3.10.0-rc2 (scameron@localhost.localdomain) (gcc version 4.4.7
20120313 (Red Hat 4.4.7-3) (GCC) ) #1 SMP Fri May 24 14:32:12 CDT 2013
[...]
acu D 0000000000000001 0 3246 3191 0x00000080
Call Trace:
schedule+0x29/0x70
schedule_preempt_disabled+0xe/0x10
__mutex_lock_slowpath+0x17b/0x220
mutex_lock+0x2b/0x50
cciss_unlocked_open+0x2f/0x110 [cciss]
__blkdev_get+0xd3/0x470
blkdev_get+0x5c/0x1e0
register_disk+0x182/0x1a0
add_disk+0x17c/0x310
cciss_add_disk+0x13a/0x170 [cciss]
cciss_update_drive_info+0x39b/0x480 [cciss]
rebuild_lun_table+0x258/0x370 [cciss]
cciss_ioctl+0x34f/0x470 [cciss]
do_ioctl+0x49/0x70 [cciss]
__blkdev_driver_ioctl+0x28/0x30
blkdev_ioctl+0x200/0x7b0
block_ioctl+0x3c/0x40
do_vfs_ioctl+0x89/0x350
SyS_ioctl+0xa1/0xb0
system_call_fastpath+0x16/0x1b
This mutex usage was added into the ioctl path when the big kernel lock
was removed. As it turns out, these paths are all thread safe anyway
(or can easily be made so) and we don't want ioctl() to be single
threaded in any case.
Signed-off-by: Stephen M. Cameron <scameron@beardog.cce.hp.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Mike Miller <mike.miller@hp.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>