Arnaldo Carvalho de Melo [Tue, 18 Sep 2018 19:08:02 +0000 (16:08 -0300)]
perf python: Use -Wno-redundant-decls to build with PYTHON=python3
[ Upstream commit
05a2f54679861deb188750ba2a70187000b2c71f ]
When building in ClearLinux using 'make PYTHON=python3' with gcc 8.2.1
it fails with:
GEN /tmp/build/perf/python/perf.so
In file included from /usr/include/python3.7m/Python.h:126,
from /git/linux/tools/perf/util/python.c:2:
/usr/include/python3.7m/import.h:58:24: error: redundant redeclaration of ‘_PyImport_AddModuleObject’ [-Werror=redundant-decls]
PyAPI_FUNC(PyObject *) _PyImport_AddModuleObject(PyObject *, PyObject *);
^~~~~~~~~~~~~~~~~~~~~~~~~
/usr/include/python3.7m/import.h:47:24: note: previous declaration of ‘_PyImport_AddModuleObject’ was here
PyAPI_FUNC(PyObject *) _PyImport_AddModuleObject(PyObject *name,
^~~~~~~~~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
error: command 'gcc' failed with exit status 1
And indeed there is a redundant declaration in that Python.h file, one
with parameter names and the other without, so just add
-Wno-error=redundant-decls to the python setup instructions.
Now perf builds with gcc in ClearLinux with the following Dockerfile:
# docker.io/acmel/linux-perf-tools-build-clearlinux:latest
FROM docker.io/clearlinux:latest
MAINTAINER Arnaldo Carvalho de Melo <acme@kernel.org>
RUN swupd update && \
swupd bundle-add sysadmin-basic-dev
RUN mkdir -m 777 -p /git /tmp/build/perf /tmp/build/objtool /tmp/build/linux && \
groupadd -r perfbuilder && \
useradd -m -r -g perfbuilder perfbuilder && \
chown -R perfbuilder.perfbuilder /tmp/build/ /git/
USER perfbuilder
COPY rx_and_build.sh /
ENV EXTRA_MAKE_ARGS=PYTHON=python3
ENTRYPOINT ["/rx_and_build.sh"]
Now to figure out why the build fails with clang, that is present in the
above container as detected by the rx_and_build.sh script:
clang version 6.0.1 (tags/RELEASE_601/final)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /usr/sbin
make: Entering directory '/git/linux/tools/perf'
BUILD: Doing 'make -j4' parallel build
HOSTCC /tmp/build/perf/fixdep.o
HOSTLD /tmp/build/perf/fixdep-in.o
LINK /tmp/build/perf/fixdep
Auto-detecting system features:
... dwarf: [ OFF ]
... dwarf_getlocations: [ OFF ]
... glibc: [ OFF ]
... gtk2: [ OFF ]
... libaudit: [ OFF ]
... libbfd: [ OFF ]
... libelf: [ OFF ]
... libnuma: [ OFF ]
... numa_num_possible_cpus: [ OFF ]
... libperl: [ OFF ]
... libpython: [ OFF ]
... libslang: [ OFF ]
... libcrypto: [ OFF ]
... libunwind: [ OFF ]
... libdw-dwarf-unwind: [ OFF ]
... zlib: [ OFF ]
... lzma: [ OFF ]
... get_cpuid: [ OFF ]
... bpf: [ OFF ]
Makefile.config:331: *** No gnu/libc-version.h found, please install glibc-dev[el]. Stop.
make[1]: *** [Makefile.perf:206: sub-make] Error 2
make: *** [Makefile:70: all] Error 2
make: Leaving directory '/git/linux/tools/perf'
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thiago Macieira <thiago.macieira@intel.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-c3khb9ac86s00qxzjrueomme@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Sascha Hauer [Wed, 12 Sep 2018 06:23:01 +0000 (08:23 +0200)]
ARM: dts: imx53-qsb: disable 1.2GHz OPP
[ Upstream commit
eea96566c189c77e5272585984eb2729881a2f1d ]
The maximum CPU frequency for the i.MX53 QSB is 1GHz, so disable the
1.2GHz OPP. This makes the board work again with configs that have
cpufreq enabled like imx_v6_v7_defconfig on which the board stopped
working with the addition of cpufreq-dt support.
Fixes: 791f416608 ("ARM: dts: imx53: add cpufreq-dt support")
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Paul Burton [Mon, 20 Aug 2018 22:36:17 +0000 (15:36 -0700)]
compiler.h: Allow arch-specific asm/compiler.h
[ Upstream commit
04f264d3a8b0eb25d378127bd78c3c9a0261c828 ]
We have a need to override the definition of
barrier_before_unreachable() for MIPS, which means we either need to add
architecture-specific code into linux/compiler-gcc.h or we need to allow
the architecture to provide a header that can define the macro before
the generic definition. The latter seems like the better approach.
A straightforward approach to the per-arch header is to make use of
asm-generic to provide a default empty header & adjust architectures
which don't need anything specific to make use of that by adding the
header to generic-y. Unfortunately this doesn't work so well due to
commit
28128c61e08e ("kconfig.h: Include compiler types to avoid missed
struct attributes") which caused linux/compiler_types.h to be included
in the compilation of every C file via the -include linux/kconfig.h flag
in c_flags.
Because the -include flag is present for all C files we compile, we need
the architecture-provided header to be present before any C files are
compiled. If any C files can be compiled prior to the asm-generic header
wrappers being generated then we hit a build failure due to missing
header. Such cases do exist - one pointed out by the kbuild test robot
is the compilation of arch/ia64/kernel/nr-irqs.c, which occurs as part
of the archprepare target [1].
This leaves us with a few options:
1) Use generic-y & fix any build failures we find by enforcing
ordering such that the asm-generic target occurs before any C
compilation, such that linux/compiler_types.h can always include
the generated asm-generic wrapper which in turn includes the empty
asm-generic header. This would rely on us finding all the
problematic cases - I don't know for sure that the ia64 issue is
the only one.
2) Add an actual empty header to each architecture, so that we don't
need the generated asm-generic wrapper. This seems messy.
3) Give up & add #ifdef CONFIG_MIPS or similar to
linux/compiler_types.h. This seems messy too.
4) Include the arch header only when it's actually needed, removing
the need for the asm-generic wrapper for all other architectures.
This patch allows us to use approach 4, by including an asm/compiler.h
header from linux/compiler_types.h after the inclusion of the
compiler-specific linux/compiler-*.h header(s). We do this
conditionally, only when CONFIG_HAVE_ARCH_COMPILER_H is selected, in
order to avoid the need for asm-generic wrappers & the associated build
ordering issue described above. The asm/compiler.h header is included
after the generic linux/compiler-*.h header(s) for consistency with the
way linux/compiler-intel.h & linux/compiler-clang.h are included after
the linux/compiler-gcc.h header that they override.
[1] https://lists.01.org/pipermail/kbuild-all/2018-August/051175.html
Signed-off-by: Paul Burton <paul.burton@mips.com>
Reviewed-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Patchwork: https://patchwork.linux-mips.org/patch/20269/
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: James Hogan <jhogan@kernel.org>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-arch@vger.kernel.org
Cc: linux-kbuild@vger.kernel.org
Cc: linux-mips@linux-mips.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
Hans de Goede [Wed, 19 Sep 2018 18:52:45 +0000 (20:52 +0200)]
HID: i2c-hid: Remove RESEND_REPORT_DESCR quirk and its handling
[ Upstream commit
afbb1169ed5b58cfca017e368b53e019cf285853 ]
Commit
52cf93e63ee6 ("HID: i2c-hid: Don't reset device upon system resume")
removes the need for the RESEND_REPORT_DESCR quirk for Raydium devices, but
kept it for the SIS device id 10FB touchscreens, as the author of that
commit could not determine if the quirk is still necessary there.
I've tested suspend/resume on a Toshiba Click Mini L9W-B which is the
device for which this quirk was added in the first place and with the
"Don't reset device upon system resume" fix the quirk is no longer
necessary, so this commit removes it.
Note even better I also had some other devices with SIS touchscreens which
suspend/resume issues, where the RESEND_REPORT_DESCR quirk did not help.
I've also tested these devices with the "Don't reset device upon system
resume" fix and I'm happy to report that that fix also fixes touchscreen
resume on the following devices:
Asus T100HA
Asus T200TA
Peaq C1010
Cc: Kai-Heng Feng <kai.heng.feng@canonical.com>
Acked-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Doron Roberts-Kedes [Thu, 26 Jul 2018 14:59:36 +0000 (07:59 -0700)]
tls: Fix improper revert in zerocopy_from_iter
[ Upstream commit
2da19ed3e4a87db16c0f69039da9f17a9596c350 ]
The current code is problematic because the iov_iter is reverted and
never advanced in the non-error case. This patch skips the revert in the
non-error case. This patch also fixes the amount by which the iov_iter
is reverted. Currently, iov_iter is reverted by size, which can be
greater than the amount by which the iter was actually advanced.
Instead, only revert by the amount that the iter was advanced.
Fixes: 4718799817c5 ("tls: Fix zerocopy_from_iter iov handling")
Signed-off-by: Doron Roberts-Kedes <doronrk@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Milian Wolff [Wed, 26 Sep 2018 13:52:05 +0000 (15:52 +0200)]
perf report: Don't try to map ip to invalid map
[ Upstream commit
ff4ce2885af8f9e8e99864d78dbeb4673f089c76 ]
Fixes a crash when the report encounters an address that could not be
associated with an mmaped region:
#0 0x00005555557bdc4a in callchain_srcline (ip=<error reading variable: Cannot access memory at address 0x38>, sym=0x0, map=0x0) at util/machine.c:2329
#1 unwind_entry (entry=entry@entry=0x7fffffff9180, arg=arg@entry=0x7ffff5642498) at util/machine.c:2329
#2 0x00005555558370af in entry (arg=0x7ffff5642498, cb=0x5555557bdb50 <unwind_entry>, thread=<optimized out>, ip=
18446744073709551615) at util/unwind-libunwind-local.c:586
#3 get_entries (ui=ui@entry=0x7fffffff9620, cb=0x5555557bdb50 <unwind_entry>, arg=0x7ffff5642498, max_stack=<optimized out>) at util/unwind-libunwind-local.c:703
#4 0x0000555555837192 in _unwind__get_entries (cb=<optimized out>, arg=<optimized out>, thread=<optimized out>, data=<optimized out>, max_stack=<optimized out>) at util/unwind-libunwind-local.c:725
#5 0x00005555557c310f in thread__resolve_callchain_unwind (max_stack=127, sample=0x7fffffff9830, evsel=0x555555c7b3b0, cursor=0x7ffff5642498, thread=0x555555c7f6f0) at util/machine.c:2351
#6 thread__resolve_callchain (thread=0x555555c7f6f0, cursor=0x7ffff5642498, evsel=0x555555c7b3b0, sample=0x7fffffff9830, parent=0x7fffffff97b8, root_al=0x7fffffff9750, max_stack=127) at util/machine.c:2378
#7 0x00005555557ba4ee in sample__resolve_callchain (sample=<optimized out>, cursor=<optimized out>, parent=parent@entry=0x7fffffff97b8, evsel=<optimized out>, al=al@entry=0x7fffffff9750,
max_stack=<optimized out>) at util/callchain.c:1085
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Tested-by: Sandipan Das <sandipan@linux.ibm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Fixes: 2a9d5050dc84 ("perf script: Show correct offsets for DWARF-based unwinding")
Link: http://lkml.kernel.org/r/20180926135207.30263-1-milian.wolff@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Daniel Mack [Mon, 8 Oct 2018 20:03:57 +0000 (22:03 +0200)]
libertas: call into generic suspend code before turning off power
[ Upstream commit
4f666675cdff0b986195413215eb062b7da6586f ]
When powering down a SDIO connected card during suspend, make sure to call
into the generic lbs_suspend() function before pulling the plug. This will
make sure the card is successfully deregistered from the system to avoid
communication to the card starving out.
Fixes: 7444a8092906 ("libertas: fix suspend and resume for SDIO connected cards")
Signed-off-by: Daniel Mack <daniel@zonque.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Anders Roxell [Thu, 26 Jul 2018 22:27:21 +0000 (00:27 +0200)]
clk: mvebu: armada-37xx-periph: Remove unused var num_parents
[ Upstream commit
8927c27b32703e28041ae19bf25ea53461be83a1 ]
When building armada-37xx-periph, num_parents isn't used in function
clk_pm_cpu_get_parent:
drivers/clk/mvebu/armada-37xx-periph.c: In function ‘clk_pm_cpu_get_parent’:
drivers/clk/mvebu/armada-37xx-periph.c:419:6: warning: unused variable ‘num_parents’ [-Wunused-variable]
int num_parents = clk_hw_get_num_parents(hw);
^~~~~~~~~~~
Remove the declaration of num_parents to dispose the warning.
Fixes: 616bf80d381d ("clk: mvebu: armada-37xx-periph: Fix wrong return value in get_parent")
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Dan Carpenter [Wed, 19 Sep 2018 10:35:53 +0000 (13:35 +0300)]
x86/paravirt: Fix some warning messages
[ Upstream commit
571d0563c8881595f4ab027aef9ed1c55e3e7b7c ]
The first argument to WARN_ONCE() is a condition.
Fixes: 5800dc5c19f3 ("x86/paravirt: Fix spectre-v2 mitigations for paravirt guests")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Alok Kataria <akataria@vmware.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: virtualization@lists.linux-foundation.org
Cc: kernel-janitors@vger.kernel.org
Link: https://lkml.kernel.org/r/20180919103553.GD9238@mwanda
Signed-off-by: Sasha Levin <sashal@kernel.org>
Anshuman Khandual [Fri, 5 Oct 2018 22:51:51 +0000 (15:51 -0700)]
mm/migrate.c: split only transparent huge pages when allocation fails
[ Upstream commit
e6112fc300702f96374f34368513d57795fc6d23 ]
split_huge_page_to_list() fails on HugeTLB pages. I was experimenting
with moving 32MB contig HugeTLB pages on arm64 (with a debug patch
applied) and hit the following stack trace when the kernel crashed.
[ 3732.462797] Call trace:
[ 3732.462835] split_huge_page_to_list+0x3b0/0x858
[ 3732.462913] migrate_pages+0x728/0xc20
[ 3732.462999] soft_offline_page+0x448/0x8b0
[ 3732.463097] __arm64_sys_madvise+0x724/0x850
[ 3732.463197] el0_svc_handler+0x74/0x110
[ 3732.463297] el0_svc+0x8/0xc
[ 3732.463347] Code:
d1000400 f90b0e60 f2fbd5a2 a94982a1 (
f9000420)
When unmap_and_move[_huge_page]() fails due to lack of memory, the
splitting should happen only for transparent huge pages not for HugeTLB
pages. PageTransHuge() returns true for both THP and HugeTLB pages.
Hence the conditonal check should test PagesHuge() flag to make sure that
given pages is not a HugeTLB one.
Link: http://lkml.kernel.org/r/1537798495-4996-1-git-send-email-anshuman.khandual@arm.com
Fixes: 94723aafb9 ("mm: unclutter THP migration")
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Zi Yan <zi.yan@cs.rutgers.edu>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
YueHaibing [Fri, 5 Oct 2018 22:51:44 +0000 (15:51 -0700)]
mm/gup_benchmark: fix unsigned comparison to zero in __gup_benchmark_ioctl
[ Upstream commit
51896864579d5a3349740847083f4db5c6487164 ]
get_user_pages_fast() will return negative value if no pages were pinned,
then be converted to a unsigned, which is compared to zero, giving the
wrong result.
Link: http://lkml.kernel.org/r/20180921095015.26088-1-yuehaibing@huawei.com
Fixes: 09e35a4a1ca8 ("mm/gup_benchmark: handle gup failures")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Larry Chen [Fri, 5 Oct 2018 22:51:37 +0000 (15:51 -0700)]
ocfs2: fix crash in ocfs2_duplicate_clusters_by_page()
[ Upstream commit
69eb7765b9c6902444c89c54e7043242faf981e5 ]
ocfs2_duplicate_clusters_by_page() may crash if one of the extent's pages
is dirty. When a page has not been written back, it is still in dirty
state. If ocfs2_duplicate_clusters_by_page() is called against the dirty
page, the crash happens.
To fix this bug, we can just unlock the page and wait until the page until
its not dirty.
The following is the backtrace:
kernel BUG at /root/code/ocfs2/refcounttree.c:2961!
[exception RIP: ocfs2_duplicate_clusters_by_page+822]
__ocfs2_move_extent+0x80/0x450 [ocfs2]
? __ocfs2_claim_clusters+0x130/0x250 [ocfs2]
ocfs2_defrag_extent+0x5b8/0x5e0 [ocfs2]
__ocfs2_move_extents_range+0x2a4/0x470 [ocfs2]
ocfs2_move_extents+0x180/0x3b0 [ocfs2]
? ocfs2_wait_for_recovery+0x13/0x70 [ocfs2]
ocfs2_ioctl_move_extents+0x133/0x2d0 [ocfs2]
ocfs2_ioctl+0x253/0x640 [ocfs2]
do_vfs_ioctl+0x90/0x5f0
SyS_ioctl+0x74/0x80
do_syscall_64+0x74/0x140
entry_SYSCALL_64_after_hwframe+0x3d/0xa2
Once we find the page is dirty, we do not wait until it's clean, rather we
use write_one_page() to write it back
Link: http://lkml.kernel.org/r/20180829074740.9438-1-lchen@suse.com
[lchen@suse.com: update comments]
Link: http://lkml.kernel.org/r/20180830075041.14879-1-lchen@suse.com
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Larry Chen <lchen@suse.com>
Acked-by: Changwei Ge <ge.changwei@h3c.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <jiangqi903@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Wenwen Wang [Fri, 5 Oct 2018 15:59:36 +0000 (10:59 -0500)]
yam: fix a missing-check bug
[ Upstream commit
0781168e23a2fc8dceb989f11fc5b39b3ccacc35 ]
In yam_ioctl(), the concrete ioctl command is firstly copied from the
user-space buffer 'ifr->ifr_data' to 'ioctl_cmd' and checked through the
following switch statement. If the command is not as expected, an error
code EINVAL is returned. In the following execution the buffer
'ifr->ifr_data' is copied again in the cases of the switch statement to
specific data structures according to what kind of ioctl command is
requested. However, after the second copy, no re-check is enforced on the
newly-copied command. Given that the buffer 'ifr->ifr_data' is in the user
space, a malicious user can race to change the command between the two
copies. This way, the attacker can inject inconsistent data and cause
undefined behavior.
This patch adds a re-check in each case of the switch statement if there is
a second copy in that case, to re-check whether the command obtained in the
second copy is the same as the one in the first copy. If not, an error code
EINVAL will be returned.
Signed-off-by: Wenwen Wang <wang6495@umn.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Wenwen Wang [Fri, 5 Oct 2018 13:48:27 +0000 (08:48 -0500)]
net: cxgb3_main: fix a missing-check bug
[ Upstream commit
2c05d88818ab6571816b93edce4d53703870d7ae ]
In cxgb_extension_ioctl(), the command of the ioctl is firstly copied from
the user-space buffer 'useraddr' to 'cmd' and checked through the
switch statement. If the command is not as expected, an error code
EOPNOTSUPP is returned. In the following execution, i.e., the cases of the
switch statement, the whole buffer of 'useraddr' is copied again to a
specific data structure, according to what kind of command is requested.
However, after the second copy, there is no re-check on the newly-copied
command. Given that the buffer 'useraddr' is in the user space, a malicious
user can race to change the command between the two copies. By doing so,
the attacker can supply malicious data to the kernel and cause undefined
behavior.
This patch adds a re-check in each case of the switch statement if there is
a second copy in that case, to re-check whether the command obtained in the
second copy is the same as the one in the first copy. If not, an error code
EINVAL is returned.
Signed-off-by: Wenwen Wang <wang6495@umn.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Srikar Dronamraju [Fri, 28 Sep 2018 03:47:32 +0000 (09:17 +0530)]
powerpc/numa: Skip onlining a offline node in kdump path
[ Upstream commit
ac1788cc7da4ce54edcfd2e499afdb0a23d5c41d ]
With commit
2ea626306810 ("powerpc/topology: Get topology for shared
processors at boot"), kdump kernel on shared LPAR may crash.
The necessary conditions are
- Shared LPAR with at least 2 nodes having memory and CPUs.
- Memory requirement for kdump kernel must be met by the first N-1
nodes where there are at least N nodes with memory and CPUs.
Example numactl of such a machine.
$ numactl -H
available: 5 nodes (0,2,5-7)
node 0 cpus:
node 0 size: 0 MB
node 0 free: 0 MB
node 2 cpus:
node 2 size: 255 MB
node 2 free: 189 MB
node 5 cpus: 24 25 26 27 28 29 30 31
node 5 size: 4095 MB
node 5 free: 4024 MB
node 6 cpus: 0 1 2 3 4 5 6 7 16 17 18 19 20 21 22 23
node 6 size: 6353 MB
node 6 free: 5998 MB
node 7 cpus: 8 9 10 11 12 13 14 15 32 33 34 35 36 37 38 39
node 7 size: 7640 MB
node 7 free: 7164 MB
node distances:
node 0 2 5 6 7
0: 10 40 40 40 40
2: 40 10 40 40 40
5: 40 40 10 40 40
6: 40 40 40 10 20
7: 40 40 40 20 10
Steps to reproduce.
1. Load / start kdump service.
2. Trigger a kdump (for example : echo c > /proc/sysrq-trigger)
When booting a kdump kernel with 2048M:
kexec: Starting switchover sequence.
I'm in purgatory
Using 1TB segments
hash-mmu: Initializing hash mmu with SLB
Linux version 4.19.0-rc5-master+ (srikar@linux-xxu6) (gcc version 4.8.5 (SUSE Linux)) #1 SMP Thu Sep 27 19:45:00 IST 2018
Found initrd at 0xc000000009e70000:0xc00000000ae554b4
Using pSeries machine description
-----------------------------------------------------
ppc64_pft_size = 0x1e
phys_mem_size = 0x88000000
dcache_bsize = 0x80
icache_bsize = 0x80
cpu_features = 0x000000ff8f5d91a7
possible = 0x0000fbffcf5fb1a7
always = 0x0000006f8b5c91a1
cpu_user_features = 0xdc0065c2 0xef000000
mmu_features = 0x7c006001
firmware_features = 0x00000007c45bfc57
htab_hash_mask = 0x7fffff
physical_start = 0x8000000
-----------------------------------------------------
numa: NODE_DATA [mem 0x87d5e300-0x87d67fff]
numa: NODE_DATA(0) on node 6
numa: NODE_DATA [mem 0x87d54600-0x87d5e2ff]
Top of RAM: 0x88000000, Total RAM: 0x88000000
Memory hole size: 0MB
Zone ranges:
DMA [mem 0x0000000000000000-0x0000000087ffffff]
DMA32 empty
Normal empty
Movable zone start for each node
Early memory node ranges
node 6: [mem 0x0000000000000000-0x0000000087ffffff]
Could not find start_pfn for node 0
Initmem setup node 0 [mem 0x0000000000000000-0x0000000000000000]
On node 0 totalpages: 0
Initmem setup node 6 [mem 0x0000000000000000-0x0000000087ffffff]
On node 6 totalpages: 34816
Unable to handle kernel paging request for data at address 0x00000060
Faulting instruction address: 0xc000000008703a54
Oops: Kernel access of bad area, sig: 11 [#1]
LE SMP NR_CPUS=2048 NUMA pSeries
Modules linked in:
CPU: 11 PID: 1 Comm: swapper/11 Not tainted 4.19.0-rc5-master+ #1
NIP:
c000000008703a54 LR:
c000000008703a38 CTR:
0000000000000000
REGS:
c00000000b673440 TRAP: 0380 Not tainted (4.19.0-rc5-master+)
MSR:
8000000002009033 <SF,VEC,EE,ME,IR,DR,RI,LE> CR:
24022022 XER:
20000002
CFAR:
c0000000086fc238 IRQMASK: 0
GPR00:
c000000008703a38 c00000000b6736c0 c000000009281900 0000000000000000
GPR04:
0000000000000000 0000000000000000 fffffffffffff001 c00000000b660080
GPR08:
0000000000000000 0000000000000000 0000000000000000 0000000000000220
GPR12:
0000000000002200 c000000009e51400 0000000000000000 0000000000000008
GPR16:
0000000000000000 c000000008c152e8 c000000008c152a8 0000000000000000
GPR20:
c000000009422fd8 c000000009412fd8 c000000009426040 0000000000000008
GPR24:
0000000000000000 0000000000000000 c000000009168bc8 c000000009168c78
GPR28:
c00000000b126410 0000000000000000 c00000000916a0b8 c00000000b126400
NIP [
c000000008703a54] bus_add_device+0x84/0x1e0
LR [
c000000008703a38] bus_add_device+0x68/0x1e0
Call Trace:
[
c00000000b6736c0] [
c000000008703a38] bus_add_device+0x68/0x1e0 (unreliable)
[
c00000000b673740] [
c000000008700194] device_add+0x454/0x7c0
[
c00000000b673800] [
c00000000872e660] __register_one_node+0xb0/0x240
[
c00000000b673860] [
c00000000839a6bc] __try_online_node+0x12c/0x180
[
c00000000b673900] [
c00000000839b978] try_online_node+0x58/0x90
[
c00000000b673930] [
c0000000080846d8] find_and_online_cpu_nid+0x158/0x190
[
c00000000b673a10] [
c0000000080848a0] numa_update_cpu_topology+0x190/0x580
[
c00000000b673c00] [
c000000008d3f2e4] smp_cpus_done+0x94/0x108
[
c00000000b673c70] [
c000000008d5c00c] smp_init+0x174/0x19c
[
c00000000b673d00] [
c000000008d346b8] kernel_init_freeable+0x1e0/0x450
[
c00000000b673dc0] [
c0000000080102e8] kernel_init+0x28/0x160
[
c00000000b673e30] [
c00000000800b65c] ret_from_kernel_thread+0x5c/0x80
Instruction dump:
60000000 60000000 e89e0020 7fe3fb78 4bff87d5 60000000 7c7d1b79 4082008c
e8bf0050 e93e0098 3b9f0010 2fa50000 <
e8690060>
38630018 419e0114 7f84e378
---[ end trace
593577668c2daa65 ]---
However a regular kernel with 4096M (2048 gets reserved for crash
kernel) boots properly.
Unlike regular kernels, which mark all available nodes as online,
kdump kernel only marks just enough nodes as online and marks the rest
as offline at boot. However kdump kernel boots with all available
CPUs. With Commit
2ea626306810 ("powerpc/topology: Get topology for
shared processors at boot"), all CPUs are onlined on their respective
nodes at boot time. try_online_node() tries to online the offline
nodes but fails as all needed subsystems are not yet initialized.
As part of fix, detect and skip early onlining of a offline node.
Fixes: 2ea626306810 ("powerpc/topology: Get topology for shared processors at boot")
Reported-by: Pavithra Prakash <pavrampu@in.ibm.com>
Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Tested-by: Hari Bathini <hbathini@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Davide Caratti [Wed, 3 Oct 2018 13:20:58 +0000 (15:20 +0200)]
be2net: don't flip hw_features when VXLANs are added/deleted
[ Upstream commit
2d52527e80c2dc0c5f43f50adf183781262ec565 ]
the be2net implementation of .ndo_tunnel_{add,del}() changes the value of
NETIF_F_GSO_UDP_TUNNEL bit in 'features' and 'hw_features', but it forgets
to call netdev_features_change(). Moreover, ethtool setting for that bit
can potentially be reverted after a tunnel is added or removed.
GSO already does software segmentation when 'hw_enc_features' is 0, even
if VXLAN offload is turned on. In addition, commit
096de2f83ebc ("benet:
stricter vxlan offloading check in be_features_check") avoids hardware
segmentation of non-VXLAN tunneled packets, or VXLAN packets having wrong
destination port. So, it's safe to avoid flipping the above feature on
addition/deletion of VXLAN tunnels.
Fixes: 630f4b70567f ("be2net: Export tunnel offloads only when a VxLAN tunnel is created")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Shirish S [Mon, 24 Sep 2018 13:31:47 +0000 (19:01 +0530)]
drm/amd/display: Signal hw_done() after waiting for flip_done()
[ Upstream commit
987bf116445db5d63a5c2ed94c4479687d9c9973 ]
In amdgpu_dm_commit_tail(), wait until flip_done() is signaled before
we signal hw_done().
[Why]
This is to temporarily address a paging error that occurs when a
nonblocking commit contends with another commit, particularly in a
mirrored display configuration where at least 2 CRTCs are updated.
The error occurs in drm_atomic_helper_wait_for_flip_done(), when we
attempt to access the contents of new_crtc_state->commit.
Here's the sequence for a mirrored 2 display setup (irrelevant steps
left out for clarity):
**THREAD 1** | **THREAD 2**
|
Initialize atomic state for flip |
|
Queue worker |
...
| Do work for flip
|
| Signal hw_done() on CRTC 1
| Signal hw_done() on CRTC 2
|
| Wait for flip_done() on CRTC 1
<---- **PREEMPTED BY THREAD 1**
Initialize atomic state for cursor |
update (1) |
|
Do cursor update work on both CRTCs |
|
Clear atomic state (2) |
**DONE** |
...
|
| Wait for flip_done() on CRTC 2
| *ERROR*
|
The issue starts with (1). When the atomic state is initialized, the
current CRTC states are duplicated to be the new_crtc_states, and
referenced to be the old_crtc_states. (The new_crtc_states are to be
filled with update data.)
Some things to note:
* Due to the mirrored configuration, the cursor updates on both CRTCs.
* At this point, the pflip IRQ has already been handled, and flip_done
signaled on all CRTCs. The cursor commit can therefore continue.
* The old_crtc_states used by the cursor update are the **same states**
as the new_crtc_states used by the flip worker.
At (2), the old_crtc_state is freed (*), and the cursor commit
completes. We then context switch back to the flip worker, where we
attempt to access the new_crtc_state->commit object. This is
problematic, as this state has already been freed.
(*) Technically, 'state->crtcs[i].state' is freed, which was made to
reference old_crtc_state in drm_atomic_helper_swap_state()
[How]
By moving hw_done() after wait_for_flip_done(), we're guaranteed that
the new_crtc_state (from the flip worker's perspective) still exists.
This is because any other commit will be blocked, waiting for the
hw_done() signal.
Note that both the i915 and imx drivers have this sequence flipped
already, masking this problem.
Signed-off-by: Shirish S <shirish.s@amd.com>
Signed-off-by: Leo Li <sunpeng.li@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Guenter Roeck [Tue, 2 Oct 2018 21:48:49 +0000 (14:48 -0700)]
locking/ww_mutex: Fix runtime warning in the WW mutex selftest
[ Upstream commit
e4a02ed2aaf447fa849e3254bfdb3b9b01e1e520 ]
If CONFIG_WW_MUTEX_SELFTEST=y is enabled, booting an image
in an arm64 virtual machine results in the following
traceback if 8 CPUs are enabled:
DEBUG_LOCKS_WARN_ON(__owner_task(owner) != current)
WARNING: CPU: 2 PID: 537 at kernel/locking/mutex.c:1033 __mutex_unlock_slowpath+0x1a8/0x2e0
...
Call trace:
__mutex_unlock_slowpath()
ww_mutex_unlock()
test_cycle_work()
process_one_work()
worker_thread()
kthread()
ret_from_fork()
If requesting b_mutex fails with -EDEADLK, the error variable
is reassigned to the return value from calling ww_mutex_lock
on a_mutex again. If this call fails, a_mutex is not locked.
It is, however, unconditionally unlocked subsequently, causing
the reported warning. Fix the problem by using two error variables.
With this change, the selftest still fails as follows:
cyclic deadlock not resolved, ret[7/8] = -35
However, the traceback is gone.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Fixes: d1b42b800e5d0 ("locking/ww_mutex: Add kselftests for resolving ww_mutex cyclic deadlocks")
Link: http://lkml.kernel.org/r/1538516929-9734-1-git-send-email-linux@roeck-us.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Guenter Roeck [Tue, 2 Oct 2018 04:42:37 +0000 (21:42 -0700)]
Revert "serial: 8250_dw: Fix runtime PM handling"
[ Upstream commit
beeeac43b6fae5f5eaf707b6fcc2bf1e09deb785 ]
This reverts commit
d76c74387e1c978b6c5524a146ab0f3f72206f98.
While commit
d76c74387e1c ("serial: 8250_dw: Fix runtime PM handling")
fixes runtime PM handling when using kgdb, it introduces a traceback for
everyone else.
BUG: sleeping function called from invalid context at
/mnt/host/source/src/third_party/kernel/next/drivers/base/power/runtime.c:1034
in_atomic(): 1, irqs_disabled(): 1, pid: 1, name: swapper/0
7 locks held by swapper/0/1:
#0:
000000005ec5bc72 (&dev->mutex){....}, at: __driver_attach+0xb5/0x12b
#1:
000000005d5fa9e5 (&dev->mutex){....}, at: __device_attach+0x3e/0x15b
#2:
0000000047e93286 (serial_mutex){+.+.}, at: serial8250_register_8250_port+0x51/0x8bb
#3:
000000003b328f07 (port_mutex){+.+.}, at: uart_add_one_port+0xab/0x8b0
#4:
00000000fa313d4d (&port->mutex){+.+.}, at: uart_add_one_port+0xcc/0x8b0
#5:
00000000090983ca (console_lock){+.+.}, at: vprintk_emit+0xdb/0x217
#6:
00000000c743e583 (console_owner){-...}, at: console_unlock+0x211/0x60f
irq event stamp: 735222
__down_trylock_console_sem+0x4a/0x84
console_unlock+0x338/0x60f
__do_softirq+0x4a4/0x50d
irq_exit+0x64/0xe2
CPU: 2 PID: 1 Comm: swapper/0 Not tainted 4.19.0-rc5 #6
Hardware name: Google Caroline/Caroline, BIOS Google_Caroline.7820.286.0 03/15/2017
Call Trace:
dump_stack+0x7d/0xbd
___might_sleep+0x238/0x259
__pm_runtime_resume+0x4e/0xa4
? serial8250_rpm_get+0x2e/0x44
serial8250_console_write+0x44/0x301
? lock_acquire+0x1b8/0x1fa
console_unlock+0x577/0x60f
vprintk_emit+0x1f0/0x217
printk+0x52/0x6e
register_console+0x43b/0x524
uart_add_one_port+0x672/0x8b0
? set_io_from_upio+0x150/0x162
serial8250_register_8250_port+0x825/0x8bb
dw8250_probe+0x80c/0x8b0
? dw8250_serial_inq+0x8e/0x8e
? dw8250_check_lcr+0x108/0x108
? dw8250_runtime_resume+0x5b/0x5b
? dw8250_serial_outq+0xa1/0xa1
? dw8250_remove+0x115/0x115
platform_drv_probe+0x76/0xc5
really_probe+0x1f1/0x3ee
? driver_allows_async_probing+0x5d/0x5d
driver_probe_device+0xd6/0x112
? driver_allows_async_probing+0x5d/0x5d
bus_for_each_drv+0xbe/0xe5
__device_attach+0xdd/0x15b
bus_probe_device+0x5a/0x10b
device_add+0x501/0x894
? _raw_write_unlock+0x27/0x3a
platform_device_add+0x224/0x2b7
mfd_add_device+0x718/0x75b
? __kmalloc+0x144/0x16a
? mfd_add_devices+0x38/0xdb
mfd_add_devices+0x9b/0xdb
intel_lpss_probe+0x7d4/0x8ee
intel_lpss_pci_probe+0xac/0xd4
pci_device_probe+0x101/0x18e
...
Revert the offending patch until a more comprehensive solution
is available.
Cc: Tony Lindgren <tony@atomide.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Phil Edworthy <phil.edworthy@renesas.com>
Fixes: d76c74387e1c ("serial: 8250_dw: Fix runtime PM handling")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Atish Patra [Tue, 11 Sep 2018 18:30:18 +0000 (11:30 -0700)]
RISCV: Fix end PFN for low memory
[ Upstream commit
ef1f2258748b675422ca0107e5bfb9ceeac675de ]
Use memblock_end_of_DRAM which provides correct last low memory
PFN. Without that, DMA32 region becomes empty resulting in zero
pages being allocated for DMA32.
This patch is based on earlier patch from palmer which never
merged into 4.19. I just edited the commit text to make more
sense.
Signed-off-by: Atish Patra <atish.patra@wdc.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Maciej W. Rozycki [Tue, 2 Oct 2018 13:23:45 +0000 (14:23 +0100)]
declance: Fix continuation with the adapter identification message
[ Upstream commit
fe3a83af6a50199bf250fa331e94216912f79395 ]
Fix a commit
4bcc595ccd80 ("printk: reinstate KERN_CONT for printing
continuation lines") regression with the `declance' driver, which caused
the adapter identification message to be split between two lines, e.g.:
declance.c: v0.011 by Linux MIPS DECstation task force
tc6: PMAD-AA
, addr = 08:00:2b:1b:2a:6a, irq = 14
tc6: registered as eth0.
Address that properly, by printing identification with a single call,
making the messages now look like:
declance.c: v0.011 by Linux MIPS DECstation task force
tc6: PMAD-AA, addr = 08:00:2b:1b:2a:6a, irq = 14
tc6: registered as eth0.
Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
Fixes: 4bcc595ccd80 ("printk: reinstate KERN_CONT for printing continuation lines")
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Rickard x Andersson [Tue, 2 Oct 2018 12:49:32 +0000 (14:49 +0200)]
net: fec: fix rare tx timeout
[ Upstream commit
657ade07df72847f591ccdb36bd9b91ed0edbac3 ]
During certain heavy network loads TX could time out
with TX ring dump.
TX is sometimes never restarted after reaching
"tx_stop_threshold" because function "fec_enet_tx_queue"
only tests the first queue.
In addition the TX timeout callback function failed to
recover because it also operated only on the first queue.
Signed-off-by: Rickard x Andersson <rickaran@axis.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Mika Westerberg [Mon, 24 Sep 2018 10:20:45 +0000 (13:20 +0300)]
thunderbolt: Initialize after IOMMUs
[ Upstream commit
eafa717bc145963c944bb0a64d16add683861b35 ]
If IOMMU is enabled and Thunderbolt driver is built into the kernel
image, it will be probed before IOMMUs are attached to the PCI bus.
Because of this DMA mappings the driver does will not go through IOMMU
and start failing right after IOMMUs are enabled.
For this reason move the Thunderbolt driver initialization happen at
rootfs level.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Mika Westerberg [Mon, 24 Sep 2018 10:20:44 +0000 (13:20 +0300)]
thunderbolt: Do not handle ICM events after domain is stopped
[ Upstream commit
86da809dda64a63fc27e05a215475325c3aaae92 ]
If there is a long chain of devices connected when the driver is loaded
ICM sends device connected event for each and those are put to tb->wq
for later processing. Now if the driver gets unloaded in the middle, so
that the work queue is not yet empty it gets flushed by tb_domain_stop().
However, by that time the root switch is already removed so the driver
crashes when it tries to dereference it in ICM event handling callbacks.
Fix this by checking whether the root switch is already removed. If it
is we know that the domain is stopped and we should merely skip handling
the event.
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Natarajan, Janakarajan [Thu, 27 Sep 2018 15:51:55 +0000 (15:51 +0000)]
perf/x86/amd/uncore: Set ThreadMask and SliceMask for L3 Cache perf events
[ Upstream commit
d7cbbe49a9304520181fb8c9272d1327deec8453 ]
In Family 17h, some L3 Cache Performance events require the ThreadMask
and SliceMask to be set. For other events, these fields do not affect
the count either way.
Set ThreadMask and SliceMask to 0xFF and 0xF respectively.
Signed-off-by: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H . Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Suravee <Suravee.Suthikulpanit@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/Message-ID:
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Kan Liang [Fri, 21 Sep 2018 14:07:06 +0000 (07:07 -0700)]
perf/x86/intel/uncore: Fix PCI BDF address of M3UPI on SKX
[ Upstream commit
9d92cfeaf5215158d26d2991be7f7ff865cb98f3 ]
The counters on M3UPI Link 0 and Link 3 don't count properly, and writing
0 to these counters may causes system crash on some machines.
The PCI BDF addresses of the M3UPI in the current code are incorrect.
The correct addresses should be:
D18:F1 0x204D
D18:F2 0x204E
D18:F5 0x204D
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Fixes: cd34cd97b7b4 ("perf/x86/intel/uncore: Add Skylake server uncore support")
Link: http://lkml.kernel.org/r/1537538826-55489-1-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Jiri Olsa [Sun, 23 Sep 2018 16:13:43 +0000 (18:13 +0200)]
perf/ring_buffer: Prevent concurent ring buffer access
[ Upstream commit
cd6fb677ce7e460c25bdd66f689734102ec7d642 ]
Some of the scheduling tracepoints allow the perf_tp_event
code to write to ring buffer under different cpu than the
code is running on.
This results in corrupted ring buffer data demonstrated in
following perf commands:
# perf record -e 'sched:sched_switch,sched:sched_wakeup' perf bench sched messaging
# Running 'sched/messaging' benchmark:
# 20 sender and receiver processes per group
# 10 groups == 400 processes run
Total time: 0.383 [sec]
[ perf record: Woken up 8 times to write data ]
0x42b890 [0]: failed to process type: -
1765585640
[ perf record: Captured and wrote 4.825 MB perf.data (29669 samples) ]
# perf report --stdio
0x42b890 [0]: failed to process type: -
1765585640
The reason for the corruption are some of the scheduling tracepoints,
that have __perf_task dfined and thus allow to store data to another
cpu ring buffer:
sched_waking
sched_wakeup
sched_wakeup_new
sched_stat_wait
sched_stat_sleep
sched_stat_iowait
sched_stat_blocked
The perf_tp_event function first store samples for current cpu
related events defined for tracepoint:
hlist_for_each_entry_rcu(event, head, hlist_entry)
perf_swevent_event(event, count, &data, regs);
And then iterates events of the 'task' and store the sample
for any task's event that passes tracepoint checks:
ctx = rcu_dereference(task->perf_event_ctxp[perf_sw_context]);
list_for_each_entry_rcu(event, &ctx->event_list, event_entry) {
if (event->attr.type != PERF_TYPE_TRACEPOINT)
continue;
if (event->attr.config != entry->type)
continue;
perf_swevent_event(event, count, &data, regs);
}
Above code can race with same code running on another cpu,
ending up with 2 cpus trying to store under the same ring
buffer, which is specifically not allowed.
This patch prevents the problem, by allowing only events with the same
current cpu to receive the event.
NOTE: this requires the use of (per-task-)per-cpu buffers for this
feature to work; perf-record does this.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
[peterz: small edits to Changelog]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andrew Vagin <avagin@openvz.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Fixes: e6dab5ffab59 ("perf/trace: Add ability to set a target task for events")
Link: http://lkml.kernel.org/r/20180923161343.GB15054@krava
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Masayoshi Mizuma [Mon, 10 Sep 2018 14:47:50 +0000 (10:47 -0400)]
perf/x86/intel/uncore: Use boot_cpu_data.phys_proc_id instead of hardcorded physical package ID 0
[ Upstream commit
6265adb9726098b7f4f7ca70bc51992b25fdd9d6 ]
Physical package id 0 doesn't always exist, we should use
boot_cpu_data.phys_proc_id here.
Signed-off-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masayoshi Mizuma <msys.mizuma@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/20180910144750.6782-1-msys.mizuma@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Peter Zijlstra [Tue, 25 Sep 2018 15:58:35 +0000 (17:58 +0200)]
perf/core: Fix perf_pmu_unregister() locking
[ Upstream commit
a9f9772114c8b07ae75bcb3654bd017461248095 ]
When we unregister a PMU, we fail to serialize the @pmu_idr properly.
Fix that by doing the entire thing under pmu_lock.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Fixes: 2e80a82a49c4 ("perf: Dynamic pmu types")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Liran Alon [Fri, 14 Sep 2018 00:25:54 +0000 (03:25 +0300)]
KVM: nVMX: Fix emulation of VM_ENTRY_LOAD_BNDCFGS
[ Upstream commit
62cf9bd8118c4009f02c477ef78c723f49e53e16 ]
L2 IA32_BNDCFGS should be updated with vmcs12->guest_bndcfgs only
when VM_ENTRY_LOAD_BNDCFGS is specified in vmcs12->vm_entry_controls.
Otherwise, L2 IA32_BNDCFGS should be set to vmcs01->guest_bndcfgs which
is L1 IA32_BNDCFGS.
Reviewed-by: Nikita Leshchenko <nikita.leshchenko@oracle.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Liran Alon [Fri, 14 Sep 2018 00:25:53 +0000 (03:25 +0300)]
KVM: x86: Do not use kvm_x86_ops->mpx_supported() directly
[ Upstream commit
503234b3fdcaa578395c07e393ea3e5d13958824 ]
Commit
a87036add092 ("KVM: x86: disable MPX if host did not enable
MPX XSAVE features") introduced kvm_mpx_supported() to return true
iff MPX is enabled in the host.
However, that commit seems to have missed replacing some calls to
kvm_x86_ops->mpx_supported() to kvm_mpx_supported().
Complete original commit by replacing remaining calls to
kvm_mpx_supported().
Fixes: a87036add092 ("KVM: x86: disable MPX if host did not enable
MPX XSAVE features")
Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Liran Alon [Fri, 14 Sep 2018 00:25:52 +0000 (03:25 +0300)]
KVM: nVMX: Do not expose MPX VMX controls when guest MPX disabled
[ Upstream commit
5f76f6f5ff96587af5acd5930f7d9fea81e0d1a8 ]
Before this commit, KVM exposes MPX VMX controls to L1 guest only based
on if KVM and host processor supports MPX virtualization.
However, these controls should be exposed to guest only in case guest
vCPU supports MPX.
Without this change, a L1 guest running with kernel which don't have
commit
691bd4340bef ("kvm: vmx: allow host to access guest
MSR_IA32_BNDCFGS") asserts in QEMU on the following:
qemu-kvm: error: failed to set MSR 0xd90 to 0x0
qemu-kvm: .../qemu-2.10.0/target/i386/kvm.c:1801 kvm_put_msrs:
Assertion 'ret == cpu->kvm_msr_buf->nmsrs failed'
This is because L1 KVM kvm_init_msr_list() will see that
vmx_mpx_supported() (As it only checks MPX VMX controls support) and
therefore KVM_GET_MSR_INDEX_LIST IOCTL will include MSR_IA32_BNDCFGS.
However, later when L1 will attempt to set this MSR via KVM_SET_MSRS
IOCTL, it will fail because !guest_cpuid_has_mpx(vcpu).
Therefore, fix the issue by exposing MPX VMX controls to L1 guest only
when vCPU supports MPX.
Fixes: 36be0b9deb23 ("KVM: x86: Add nested virtualization support for MPX")
Reported-by: Eyal Moscovici <eyal.moscovici@oracle.com>
Reviewed-by: Nikita Leshchenko <nikita.leshchenko@oracle.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Yu Zhao [Thu, 27 Sep 2018 23:05:04 +0000 (17:05 -0600)]
cfg80211: fix use-after-free in reg_process_hint()
[ Upstream commit
1db58529454742f67ebd96e3588315e880b72837 ]
reg_process_hint_country_ie() can free regulatory_request and return
REG_REQ_ALREADY_SET. We shouldn't use regulatory_request after it's
called. KASAN error was observed when this happens.
BUG: KASAN: use-after-free in reg_process_hint+0x839/0x8aa [cfg80211]
Read of size 4 at addr
ffff8800c430d434 by task kworker/1:3/89
<snipped>
Workqueue: events reg_todo [cfg80211]
Call Trace:
dump_stack+0xc1/0x10c
? _atomic_dec_and_lock+0x1ad/0x1ad
? _raw_spin_lock_irqsave+0xa0/0xd2
print_address_description+0x86/0x26f
? reg_process_hint+0x839/0x8aa [cfg80211]
kasan_report+0x241/0x29b
reg_process_hint+0x839/0x8aa [cfg80211]
reg_todo+0x204/0x5b9 [cfg80211]
process_one_work+0x55f/0x8d0
? worker_detach_from_pool+0x1b5/0x1b5
? _raw_spin_unlock_irq+0x65/0xdd
? _raw_spin_unlock_irqrestore+0xf3/0xf3
worker_thread+0x5dd/0x841
? kthread_parkme+0x1d/0x1d
kthread+0x270/0x285
? pr_cont_work+0xe3/0xe3
? rcu_read_unlock_sched_notrace+0xca/0xca
ret_from_fork+0x22/0x40
Allocated by task 2718:
set_track+0x63/0xfa
__kmalloc+0x119/0x1ac
regulatory_hint_country_ie+0x38/0x329 [cfg80211]
__cfg80211_connect_result+0x854/0xadd [cfg80211]
cfg80211_rx_assoc_resp+0x3bc/0x4f0 [cfg80211]
smsc95xx v1.0.6
ieee80211_sta_rx_queued_mgmt+0x1803/0x7ed5 [mac80211]
ieee80211_iface_work+0x411/0x696 [mac80211]
process_one_work+0x55f/0x8d0
worker_thread+0x5dd/0x841
kthread+0x270/0x285
ret_from_fork+0x22/0x40
Freed by task 89:
set_track+0x63/0xfa
kasan_slab_free+0x6a/0x87
kfree+0xdc/0x470
reg_process_hint+0x31e/0x8aa [cfg80211]
reg_todo+0x204/0x5b9 [cfg80211]
process_one_work+0x55f/0x8d0
worker_thread+0x5dd/0x841
kthread+0x270/0x285
ret_from_fork+0x22/0x40
<snipped>
Signed-off-by: Yu Zhao <yuzhao@google.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Florian Fainelli [Fri, 28 Sep 2018 23:18:56 +0000 (16:18 -0700)]
smsc95xx: Check for Wake-on-LAN modes
[ Upstream commit
c530c471ba37bdd9fe1c7185b01455c00ae606fb ]
The driver does not check for Wake-on-LAN modes specified by an user,
but will conditionally set the device as wake-up enabled or not based on
that, which could be a very confusing user experience.
Fixes: e0e474a83c18 ("smsc95xx: add wol magic packet support")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Florian Fainelli [Fri, 28 Sep 2018 23:18:55 +0000 (16:18 -0700)]
smsc75xx: Check for Wake-on-LAN modes
[ Upstream commit
9c734b2769a73eea2e9e9767c0e0bf839ff23679 ]
The driver does not check for Wake-on-LAN modes specified by an user,
but will conditionally set the device as wake-up enabled or not based on
that, which could be a very confusing user experience.
Fixes: 6c636503260d ("smsc75xx: add wol magic packet support")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Florian Fainelli [Fri, 28 Sep 2018 23:18:54 +0000 (16:18 -0700)]
r8152: Check for supported Wake-on-LAN Modes
[ Upstream commit
f2750df1548bd8a2b060eb609fc43ca82811af4c ]
The driver does not check for Wake-on-LAN modes specified by an user,
but will conditionally set the device as wake-up enabled or not based on
that, which could be a very confusing user experience.
Fixes: 21ff2e8976b1 ("r8152: support WOL")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Florian Fainelli [Fri, 28 Sep 2018 23:18:53 +0000 (16:18 -0700)]
sr9800: Check for supported Wake-on-LAN modes
[ Upstream commit
c5cb93e994ffb43b7b3b1ff10b9f928f54574a36 ]
The driver currently silently accepts unsupported Wake-on-LAN modes
(other than WAKE_PHY or WAKE_MAGIC) without reporting that to the user,
which is confusing.
Fixes: 19a38d8e0aa3 ("USB2NET : SR9800 : One chip USB2.0 USB2NET SR9800 Device Driver Support")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Florian Fainelli [Fri, 28 Sep 2018 23:18:52 +0000 (16:18 -0700)]
lan78xx: Check for supported Wake-on-LAN modes
[ Upstream commit
eb9ad088f96653a26b340f7c447c44cf023d5cdc ]
The driver supports a fair amount of Wake-on-LAN modes, but is not
checking that the user specified one that is supported.
Fixes: 55d7de9de6c3 ("Microchip's LAN7800 family USB 2/3 to 10/100/1000 Ethernet device driver")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Woojung Huh <Woojung.Huh@Microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Florian Fainelli [Fri, 28 Sep 2018 23:18:51 +0000 (16:18 -0700)]
ax88179_178a: Check for supported Wake-on-LAN modes
[ Upstream commit
5ba6b4aa9a410c5e2c6417df52b5e2118ea9b467 ]
The driver currently silently accepts unsupported Wake-on-LAN modes
(other than WAKE_PHY or WAKE_MAGIC) without reporting that to the user,
which is confusing.
Fixes: e2ca90c276e1 ("ax88179_178a: ASIX AX88179_178A USB 3.0/2.0 to gigabit ethernet adapter driver")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Florian Fainelli [Fri, 28 Sep 2018 23:18:50 +0000 (16:18 -0700)]
asix: Check for supported Wake-on-LAN modes
[ Upstream commit
c4ce446e33d7a0e978256ac6fea4c80e59d9de5f ]
The driver currently silently accepts unsupported Wake-on-LAN modes
(other than WAKE_PHY or WAKE_MAGIC) without reporting that to the user,
which is confusing.
Fixes: 2e55cc7210fe ("[PATCH] USB: usbnet (3/9) module for ASIX Ethernet adapters")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Florian Westphal [Mon, 24 Sep 2018 12:10:04 +0000 (14:10 +0200)]
netfilter: avoid erronous array bounds warning
[ Upstream commit
421c119f558761556afca6a62ad183bc2d8659e0 ]
Unfortunately some versions of gcc emit following warning:
$ make net/xfrm/xfrm_output.o
linux/compiler.h:252:20: warning: array subscript is above array bounds [-Warray-bounds]
hook_head = rcu_dereference(net->nf.hooks_arp[hook]);
^~~~~~~~~~~~~~~~~~~~~
xfrm_output_resume passes skb_dst(skb)->ops->family as its 'pf' arg so compiler
can't know that we'll never access hooks_arp[].
(NFPROTO_IPV4 or NFPROTO_IPV6 are only possible cases).
Avoid this by adding an explicit WARN_ON_ONCE() check.
This patch has no effect if the family is a compile-time constant as gcc
will remove the switch() construct entirely.
Reported-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Taehee Yoo [Thu, 30 Aug 2018 08:56:52 +0000 (17:56 +0900)]
netfilter: nft_set_rbtree: add missing rb_erase() in GC routine
[ Upstream commit
a13f814a67b12a2f29d1decf4b4f4e700658a517 ]
The nft_set_gc_batch_check() checks whether gc buffer is full.
If gc buffer is full, gc buffer is released by
the nft_set_gc_batch_complete() internally.
In case of rbtree, the rb_erase() should be called before calling the
nft_set_gc_batch_complete(). therefore the rb_erase() should
be called before calling the nft_set_gc_batch_check() too.
test commands:
table ip filter {
set set1 {
type ipv4_addr; flags interval, timeout;
gc-interval 10s;
timeout 1s;
elements = {
1-2,
3-4,
5-6,
...
10000-10001,
}
}
}
%nft -f test.nft
splat looks like:
[ 430.273885] kasan: GPF could be caused by NULL-ptr deref or user memory access
[ 430.282158] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
[ 430.283116] CPU: 1 PID: 190 Comm: kworker/1:2 Tainted: G B 4.18.0+ #7
[ 430.283116] Workqueue: events_power_efficient nft_rbtree_gc [nf_tables_set]
[ 430.313559] RIP: 0010:rb_next+0x81/0x130
[ 430.313559] Code: 08 49 bd 00 00 00 00 00 fc ff df 48 bb 00 00 00 00 00 fc ff df 48 85 c0 75 05 eb 58 48 89 d4
[ 430.313559] RSP: 0018:
ffff88010cdb7680 EFLAGS:
00010207
[ 430.313559] RAX:
0000000000b84854 RBX:
dffffc0000000000 RCX:
ffffffff83f01973
[ 430.313559] RDX:
000000000017090c RSI:
0000000000000008 RDI:
0000000000b84864
[ 430.313559] RBP:
ffff8801060d4588 R08:
fffffbfff09bc349 R09:
fffffbfff09bc349
[ 430.313559] R10:
0000000000000001 R11:
fffffbfff09bc348 R12:
ffff880100f081a8
[ 430.313559] R13:
dffffc0000000000 R14:
ffff880100ff8688 R15:
dffffc0000000000
[ 430.313559] FS:
0000000000000000(0000) GS:
ffff88011b400000(0000) knlGS:
0000000000000000
[ 430.313559] CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
[ 430.313559] CR2:
0000000001551008 CR3:
000000005dc16000 CR4:
00000000001006e0
[ 430.313559] Call Trace:
[ 430.313559] nft_rbtree_gc+0x112/0x5c0 [nf_tables_set]
[ 430.313559] process_one_work+0xc13/0x1ec0
[ 430.313559] ? _raw_spin_unlock_irq+0x29/0x40
[ 430.313559] ? pwq_dec_nr_in_flight+0x3c0/0x3c0
[ 430.313559] ? set_load_weight+0x270/0x270
[ 430.313559] ? __switch_to_asm+0x34/0x70
[ 430.313559] ? __switch_to_asm+0x40/0x70
[ 430.313559] ? __switch_to_asm+0x34/0x70
[ 430.313559] ? __switch_to_asm+0x34/0x70
[ 430.313559] ? __switch_to_asm+0x40/0x70
[ 430.313559] ? __switch_to_asm+0x34/0x70
[ 430.313559] ? __switch_to_asm+0x40/0x70
[ 430.313559] ? __switch_to_asm+0x34/0x70
[ 430.313559] ? __switch_to_asm+0x34/0x70
[ 430.313559] ? __switch_to_asm+0x40/0x70
[ 430.313559] ? __switch_to_asm+0x34/0x70
[ 430.313559] ? __schedule+0x6d3/0x1f50
[ 430.313559] ? find_held_lock+0x39/0x1c0
[ 430.313559] ? __sched_text_start+0x8/0x8
[ 430.313559] ? cyc2ns_read_end+0x10/0x10
[ 430.313559] ? save_trace+0x300/0x300
[ 430.313559] ? sched_clock_local+0xd4/0x140
[ 430.313559] ? find_held_lock+0x39/0x1c0
[ 430.313559] ? worker_thread+0x353/0x1120
[ 430.313559] ? worker_thread+0x353/0x1120
[ 430.313559] ? lock_contended+0xe70/0xe70
[ 430.313559] ? __lock_acquire+0x4500/0x4500
[ 430.535635] ? do_raw_spin_unlock+0xa5/0x330
[ 430.535635] ? do_raw_spin_trylock+0x101/0x1a0
[ 430.535635] ? do_raw_spin_lock+0x1f0/0x1f0
[ 430.535635] ? _raw_spin_lock_irq+0x10/0x70
[ 430.535635] worker_thread+0x15d/0x1120
[ ... ]
Fixes: 8d8540c4f5e0 ("netfilter: nft_set_rbtree: add timeout support")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
David Howells [Thu, 27 Sep 2018 14:13:09 +0000 (15:13 +0100)]
rxrpc: Fix error distribution
[ Upstream commit
f334430316e7fd37c4821ebec627e27714bb5d76 ]
Fix error distribution by immediately delivering the errors to all the
affected calls rather than deferring them to a worker thread. The problem
with the latter is that retries and things can happen in the meantime when we
want to stop that sooner.
To this end:
(1) Stop the error distributor from removing calls from the error_targets
list so that peer->lock isn't needed to synchronise against other adds
and removals.
(2) Require the peer's error_targets list to be accessed with RCU, thereby
avoiding the need to take peer->lock over distribution.
(3) Don't attempt to affect a call's state if it is already marked complete.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
David Howells [Thu, 27 Sep 2018 14:13:09 +0000 (15:13 +0100)]
rxrpc: Fix transport sockopts to get IPv4 errors on an IPv6 socket
[ Upstream commit
37a675e768d7606fe8a53e0c459c9b53e121ac20 ]
It seems that enabling IPV6_RECVERR on an IPv6 socket doesn't also turn on
IP_RECVERR, so neither local errors nor ICMP-transported remote errors from
IPv4 peer addresses are returned to the AF_RXRPC protocol.
Make the sockopt setting code in rxrpc_open_socket() fall through from the
AF_INET6 case to the AF_INET case to turn on all the AF_INET options too in
the AF_INET6 case.
Fixes: f2aeed3a591f ("rxrpc: Fix error reception on AF_INET6 sockets")
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
David Howells [Thu, 27 Sep 2018 14:13:08 +0000 (15:13 +0100)]
rxrpc: Fix RTT gathering
[ Upstream commit
b604dd9883f783a94020d772e4fe03160f455372 ]
Fix RTT information gathering in AF_RXRPC by the following means:
(1) Enable Rx timestamping on the transport socket with SO_TIMESTAMPNS.
(2) If the sk_buff doesn't have a timestamp set when rxrpc_data_ready()
collects it, set it at that point.
(3) Allow ACKs to be requested on the last packet of a client call, but
not a service call. We need to be careful lest we undo:
bf7d620abf22c321208a4da4f435e7af52551a21
Author: David Howells <dhowells@redhat.com>
Date: Thu Oct 6 08:11:51 2016 +0100
rxrpc: Don't request an ACK on the last DATA packet of a call's Tx phase
but that only really applies to service calls that we're handling,
since the client side gets to send the final ACK (or not).
(4) When about to transmit an ACK or DATA packet, record the Tx timestamp
before only; don't update the timestamp afterwards.
(5) Switch the ordering between recording the serial and recording the
timestamp to always set the serial number first. The serial number
shouldn't be seen referenced by an ACK packet until we've transmitted
the packet bearing it - so in the Rx path, we don't need the timestamp
until we've checked the serial number.
Fixes: cf1a6474f807 ("rxrpc: Add per-peer RTT tracker")
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
David Howells [Thu, 27 Sep 2018 14:13:08 +0000 (15:13 +0100)]
rxrpc: Fix checks as to whether we should set up a new call
[ Upstream commit
dc71db34e4f3c06b8277c8f3c2ff014610607a8c ]
There's a check in rxrpc_data_ready() that's checking the CLIENT_INITIATED
flag in the packet type field rather than in the packet flags field.
Fix this by creating a pair of helper functions to check whether the packet
is going to the client or to the server and use them generally.
Fixes: 248f219cb8bc ("rxrpc: Rewrite the data and ack handling code")
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Nilesh Javali [Thu, 27 Sep 2018 12:15:35 +0000 (05:15 -0700)]
scsi: qedi: Initialize the stats mutex lock
[ Upstream commit
3cc5746e5ad7688e274e193fa71278d98aa52759 ]
Fix kernel NULL pointer dereference,
Call Trace:
[<
ffffffff9b7658e6>] __mutex_lock_slowpath+0xa6/0x1d0
[<
ffffffff9b764cef>] mutex_lock+0x1f/0x2f
[<
ffffffffc061b5e1>] qedi_get_protocol_tlv_data+0x61/0x450 [qedi]
[<
ffffffff9b1f9d8e>] ? map_vm_area+0x2e/0x40
[<
ffffffff9b1fc370>] ? __vmalloc_node_range+0x170/0x280
[<
ffffffffc0b81c3d>] ? qed_mfw_process_tlv_req+0x27d/0xbd0 [qed]
[<
ffffffffc0b6461b>] qed_mfw_fill_tlv_data+0x4b/0xb0 [qed]
[<
ffffffffc0b81c59>] qed_mfw_process_tlv_req+0x299/0xbd0 [qed]
[<
ffffffff9b02a59e>] ? __switch_to+0xce/0x580
[<
ffffffffc0b61e5b>] qed_slowpath_task+0x5b/0x80 [qed]
Signed-off-by: Nilesh Javali <nilesh.javali@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Masashi Honma [Tue, 25 Sep 2018 02:15:01 +0000 (11:15 +0900)]
nl80211: Fix possible Spectre-v1 for CQM RSSI thresholds
[ Upstream commit
1222a16014888ed9733c11e221730d4a8196222b ]
Use array_index_nospec() to sanitize i with respect to speculation.
Note that the user doesn't control i directly, but can make it out
of bounds by not finding a threshold in the array.
Signed-off-by: Masashi Honma <masashi.honma@gmail.com>
[add note about user control, as explained by Masashi]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Nathan Chancellor [Mon, 24 Sep 2018 21:42:12 +0000 (14:42 -0700)]
qed: Avoid implicit enum conversion in qed_iwarp_parse_rx_pkt
[ Upstream commit
77f2d753819b7d50c16abfb778caf1fe075faed0 ]
Clang warns when one enumerated type is implicitly converted to another.
drivers/net/ethernet/qlogic/qed/qed_iwarp.c:1713:25: warning: implicit
conversion from enumeration type 'enum tcp_ip_version' to different
enumeration type 'enum qed_tcp_ip_version' [-Wenum-conversion]
cm_info->ip_version = TCP_IPV4;
~ ^~~~~~~~
drivers/net/ethernet/qlogic/qed/qed_iwarp.c:1733:25: warning: implicit
conversion from enumeration type 'enum tcp_ip_version' to different
enumeration type 'enum qed_tcp_ip_version' [-Wenum-conversion]
cm_info->ip_version = TCP_IPV6;
~ ^~~~~~~~
2 warnings generated.
Use the appropriate values from the expected type, qed_tcp_ip_version:
TCP_IPV4 = QED_TCP_IPV4 = 0
TCP_IPV6 = QED_TCP_IPV6 = 1
Link: https://github.com/ClangBuiltLinux/linux/issues/125
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Nathan Chancellor [Mon, 24 Sep 2018 22:17:03 +0000 (15:17 -0700)]
qed: Avoid constant logical operation warning in qed_vf_pf_acquire
[ Upstream commit
1c492a9d55ba99079210ed901dd8a5423f980487 ]
Clang warns when a constant is used in a boolean context as it thinks a
bitwise operation may have been intended.
drivers/net/ethernet/qlogic/qed/qed_vf.c:415:27: warning: use of logical
'&&' with constant operand [-Wconstant-logical-operand]
if (!p_iov->b_pre_fp_hsi &&
^
drivers/net/ethernet/qlogic/qed/qed_vf.c:415:27: note: use '&' for a
bitwise operation
if (!p_iov->b_pre_fp_hsi &&
^~
&
drivers/net/ethernet/qlogic/qed/qed_vf.c:415:27: note: remove constant
to silence this warning
if (!p_iov->b_pre_fp_hsi &&
~^~
1 warning generated.
This has been here since commit
1fe614d10f45 ("qed: Relax VF firmware
requirements") and I am not entirely sure why since 0 isn't a special
case. Just remove the statement causing Clang to warn since it isn't
required.
Link: https://github.com/ClangBuiltLinux/linux/issues/126
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Nathan Chancellor [Mon, 24 Sep 2018 21:34:53 +0000 (14:34 -0700)]
qed: Avoid implicit enum conversion in qed_roce_mode_to_flavor
[ Upstream commit
d3a315795b4ce8b105a64a90699103121bde04a8 ]
Clang warns when one enumerated type is implicitly converted to another.
drivers/net/ethernet/qlogic/qed/qed_roce.c:153:12: warning: implicit
conversion from enumeration type 'enum roce_mode' to different
enumeration type 'enum roce_flavor' [-Wenum-conversion]
flavor = ROCE_V2_IPV6;
~ ^~~~~~~~~~~~
drivers/net/ethernet/qlogic/qed/qed_roce.c:156:12: warning: implicit
conversion from enumeration type 'enum roce_mode' to different
enumeration type 'enum roce_flavor' [-Wenum-conversion]
flavor = MAX_ROCE_MODE;
~ ^~~~~~~~~~~~~
2 warnings generated.
Use the appropriate values from the expected type, roce_flavor:
ROCE_V2_IPV6 = RROCE_IPV6 = 2
MAX_ROCE_MODE = MAX_ROCE_FLAVOR = 3
While we're add it, ditch the local variable flavor, we can just return
the value directly from the switch statement.
Link: https://github.com/ClangBuiltLinux/linux/issues/125
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Nathan Chancellor [Mon, 24 Sep 2018 21:05:27 +0000 (14:05 -0700)]
qed: Fix mask parameter in qed_vf_prep_tunn_req_tlv
[ Upstream commit
db803f36e56f23b5a2266807e190d1dc11554d54 ]
Clang complains when one enumerated type is implicitly converted to
another.
drivers/net/ethernet/qlogic/qed/qed_vf.c:686:6: warning: implicit
conversion from enumeration type 'enum qed_tunn_mode' to different
enumeration type 'enum qed_tunn_clss' [-Wenum-conversion]
QED_MODE_L2GENEVE_TUNN,
^~~~~~~~~~~~~~~~~~~~~~
Update mask's parameter to expect qed_tunn_mode, which is what was
intended.
Link: https://github.com/ClangBuiltLinux/linux/issues/125
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Nathan Chancellor [Mon, 24 Sep 2018 20:53:34 +0000 (13:53 -0700)]
qed: Avoid implicit enum conversion in qed_set_tunn_cls_info
[ Upstream commit
a898fba32229efd5e6b6154f83fa86a7145156b9 ]
Clang warns when one enumerated type is implicitly converted to another.
drivers/net/ethernet/qlogic/qed/qed_sp_commands.c:163:25: warning:
implicit conversion from enumeration type 'enum tunnel_clss' to
different enumeration type 'enum qed_tunn_clss' [-Wenum-conversion]
p_tun->vxlan.tun_cls = type;
~ ^~~~
drivers/net/ethernet/qlogic/qed/qed_sp_commands.c:165:26: warning:
implicit conversion from enumeration type 'enum tunnel_clss' to
different enumeration type 'enum qed_tunn_clss' [-Wenum-conversion]
p_tun->l2_gre.tun_cls = type;
~ ^~~~
drivers/net/ethernet/qlogic/qed/qed_sp_commands.c:167:26: warning:
implicit conversion from enumeration type 'enum tunnel_clss' to
different enumeration type 'enum qed_tunn_clss' [-Wenum-conversion]
p_tun->ip_gre.tun_cls = type;
~ ^~~~
drivers/net/ethernet/qlogic/qed/qed_sp_commands.c:169:29: warning:
implicit conversion from enumeration type 'enum tunnel_clss' to
different enumeration type 'enum qed_tunn_clss' [-Wenum-conversion]
p_tun->l2_geneve.tun_cls = type;
~ ^~~~
drivers/net/ethernet/qlogic/qed/qed_sp_commands.c:171:29: warning:
implicit conversion from enumeration type 'enum tunnel_clss' to
different enumeration type 'enum qed_tunn_clss' [-Wenum-conversion]
p_tun->ip_geneve.tun_cls = type;
~ ^~~~
5 warnings generated.
Avoid this by changing type to an int.
Link: https://github.com/ClangBuiltLinux/linux/issues/125
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Lubomir Rintel [Wed, 26 Sep 2018 16:11:22 +0000 (18:11 +0200)]
pxa168fb: prepare the clock
[ Upstream commit
d85536cde91fcfed6fb8d983783bd2b92c843939 ]
Add missing prepare/unprepare operations for fbi->clk,
this fixes following kernel warning:
------------[ cut here ]------------
WARNING: CPU: 0 PID: 1 at drivers/clk/clk.c:874 clk_core_enable+0x2c/0x1b0
Enabling unprepared disp0_clk
Modules linked in:
CPU: 0 PID: 1 Comm: swapper Not tainted
4.18.0-rc8-00032-g02b43ddd4f21-dirty #25
Hardware name: Marvell MMP2 (Device Tree Support)
[<
c010f7cc>] (unwind_backtrace) from [<
c010cc6c>] (show_stack+0x10/0x14)
[<
c010cc6c>] (show_stack) from [<
c011dab4>] (__warn+0xd8/0xf0)
[<
c011dab4>] (__warn) from [<
c011db10>] (warn_slowpath_fmt+0x44/0x6c)
[<
c011db10>] (warn_slowpath_fmt) from [<
c043898c>] (clk_core_enable+0x2c/0x1b0)
[<
c043898c>] (clk_core_enable) from [<
c0439ec8>] (clk_core_enable_lock+0x18/0x2c)
[<
c0439ec8>] (clk_core_enable_lock) from [<
c0436698>] (pxa168fb_probe+0x464/0x6ac)
[<
c0436698>] (pxa168fb_probe) from [<
c04779a0>] (platform_drv_probe+0x48/0x94)
[<
c04779a0>] (platform_drv_probe) from [<
c0475bec>] (driver_probe_device+0x328/0x470)
[<
c0475bec>] (driver_probe_device) from [<
c0475de4>] (__driver_attach+0xb0/0x124)
[<
c0475de4>] (__driver_attach) from [<
c0473c38>] (bus_for_each_dev+0x64/0xa0)
[<
c0473c38>] (bus_for_each_dev) from [<
c0474ee0>] (bus_add_driver+0x1b8/0x230)
[<
c0474ee0>] (bus_add_driver) from [<
c0476a20>] (driver_register+0xac/0xf0)
[<
c0476a20>] (driver_register) from [<
c0102dd4>] (do_one_initcall+0xb8/0x1f0)
[<
c0102dd4>] (do_one_initcall) from [<
c0b010a0>] (kernel_init_freeable+0x294/0x2e0)
[<
c0b010a0>] (kernel_init_freeable) from [<
c07e9eb8>] (kernel_init+0x8/0x10c)
[<
c07e9eb8>] (kernel_init) from [<
c01010e8>] (ret_from_fork+0x14/0x2c)
Exception stack(0xd008bfb0 to 0xd008bff8)
bfa0:
00000000 00000000 00000000 00000000
bfc0:
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
bfe0:
00000000 00000000 00000000 00000000 00000013 00000000
---[ end trace
c0af40f9e2ed7cb4 ]---
Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
[b.zolnierkie: enhance patch description a bit]
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Matias Karhumaa [Wed, 26 Sep 2018 06:13:46 +0000 (09:13 +0300)]
Bluetooth: SMP: fix crash in unpairing
[ Upstream commit
cb28c306b93b71f2741ce1a5a66289db26715f4d ]
In case unpair_device() was called through mgmt interface at the same time
when pairing was in progress, Bluetooth kernel module crash was seen.
[ 600.351225] general protection fault: 0000 [#1] SMP PTI
[ 600.351235] CPU: 1 PID: 11096 Comm: btmgmt Tainted: G OE 4.19.0-rc1+ #1
[ 600.351238] Hardware name: Dell Inc. Latitude E5440/08RCYC, BIOS A18 05/14/2017
[ 600.351272] RIP: 0010:smp_chan_destroy.isra.10+0xce/0x2c0 [bluetooth]
[ 600.351276] Code: c0 0f 84 b4 01 00 00 80 78 28 04 0f 84 53 01 00 00 4d 85 ed 0f 85 ab 00 00 00 48 8b 08 48 8b 50 08 be 10 00 00 00 48 89 51 08 <48> 89 0a 48 b9 00 02 00 00 00 00 ad de 48 89 48 08 48 8b 83 00 01
[ 600.351279] RSP: 0018:
ffffa9be839b3b50 EFLAGS:
00010246
[ 600.351282] RAX:
ffff9c999ac565a0 RBX:
ffff9c9996e98c00 RCX:
ffff9c999aa28b60
[ 600.351285] RDX:
dead000000000200 RSI:
0000000000000010 RDI:
ffff9c999e403500
[ 600.351287] RBP:
ffffa9be839b3b70 R08:
0000000000000000 R09:
ffffffff92a25c00
[ 600.351290] R10:
ffffa9be839b3ae8 R11:
0000000000000001 R12:
ffff9c995375b800
[ 600.351292] R13:
0000000000000000 R14:
ffff9c99619a5000 R15:
ffff9c9962a01c00
[ 600.351295] FS:
00007fb2be27c700(0000) GS:
ffff9c999e880000(0000) knlGS:
0000000000000000
[ 600.351298] CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
[ 600.351300] CR2:
00007fb2bdadbad0 CR3:
000000041c328001 CR4:
00000000001606e0
[ 600.351302] Call Trace:
[ 600.351325] smp_failure+0x4f/0x70 [bluetooth]
[ 600.351345] smp_cancel_pairing+0x74/0x80 [bluetooth]
[ 600.351370] unpair_device+0x1c1/0x330 [bluetooth]
[ 600.351399] hci_sock_sendmsg+0x960/0x9f0 [bluetooth]
[ 600.351409] ? apparmor_socket_sendmsg+0x1e/0x20
[ 600.351417] sock_sendmsg+0x3e/0x50
[ 600.351422] sock_write_iter+0x85/0xf0
[ 600.351429] do_iter_readv_writev+0x12b/0x1b0
[ 600.351434] do_iter_write+0x87/0x1a0
[ 600.351439] vfs_writev+0x98/0x110
[ 600.351443] ? ep_poll+0x16d/0x3d0
[ 600.351447] ? ep_modify+0x73/0x170
[ 600.351451] do_writev+0x61/0xf0
[ 600.351455] ? do_writev+0x61/0xf0
[ 600.351460] __x64_sys_writev+0x1c/0x20
[ 600.351465] do_syscall_64+0x5a/0x110
[ 600.351471] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 600.351474] RIP: 0033:0x7fb2bdb62fe0
[ 600.351477] Code: 73 01 c3 48 8b 0d b8 6e 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d 69 c7 2c 00 00 75 10 b8 14 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 de 80 01 00 48 89 04 24
[ 600.351479] RSP: 002b:
00007ffe062cb8f8 EFLAGS:
00000246 ORIG_RAX:
0000000000000014
[ 600.351484] RAX:
ffffffffffffffda RBX:
000000000255b3d0 RCX:
00007fb2bdb62fe0
[ 600.351487] RDX:
0000000000000001 RSI:
00007ffe062cb920 RDI:
0000000000000004
[ 600.351490] RBP:
00007ffe062cb920 R08:
000000000255bd80 R09:
0000000000000000
[ 600.351494] R10:
0000000000000353 R11:
0000000000000246 R12:
0000000000000001
[ 600.351497] R13:
00007ffe062cbbe0 R14:
0000000000000000 R15:
0000000000000000
[ 600.351501] Modules linked in: algif_hash algif_skcipher af_alg cmac ipt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo iptable_nat nf_nat_ipv4 xt_addrtype iptable_filter ip_tables xt_conntrack x_tables nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c br_netfilter bridge stp llc overlay arc4 nls_iso8859_1 dm_crypt intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp dell_laptop kvm_intel crct10dif_pclmul dell_smm_hwmon crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd cryptd glue_helper intel_cstate intel_rapl_perf uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common videodev media hid_multitouch input_leds joydev serio_raw dell_wmi snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic dell_smbios dcdbas sparse_keymap
[ 600.351569] snd_hda_intel btusb snd_hda_codec btrtl btbcm btintel snd_hda_core bluetooth(OE) snd_hwdep snd_pcm iwlmvm ecdh_generic wmi_bmof dell_wmi_descriptor snd_seq_midi mac80211 snd_seq_midi_event lpc_ich iwlwifi snd_rawmidi snd_seq snd_seq_device snd_timer cfg80211 snd soundcore mei_me mei dell_rbtn dell_smo8800 mac_hid parport_pc ppdev lp parport autofs4 hid_generic usbhid hid i915 nouveau kvmgt vfio_mdev mdev vfio_iommu_type1 vfio kvm irqbypass i2c_algo_bit ttm drm_kms_helper syscopyarea sysfillrect sysimgblt mxm_wmi psmouse ahci sdhci_pci cqhci libahci fb_sys_fops sdhci drm e1000e video wmi
[ 600.351637] ---[ end trace
e49e9f1df09c94fb ]---
[ 600.351664] RIP: 0010:smp_chan_destroy.isra.10+0xce/0x2c0 [bluetooth]
[ 600.351666] Code: c0 0f 84 b4 01 00 00 80 78 28 04 0f 84 53 01 00 00 4d 85 ed 0f 85 ab 00 00 00 48 8b 08 48 8b 50 08 be 10 00 00 00 48 89 51 08 <48> 89 0a 48 b9 00 02 00 00 00 00 ad de 48 89 48 08 48 8b 83 00 01
[ 600.351669] RSP: 0018:
ffffa9be839b3b50 EFLAGS:
00010246
[ 600.351672] RAX:
ffff9c999ac565a0 RBX:
ffff9c9996e98c00 RCX:
ffff9c999aa28b60
[ 600.351674] RDX:
dead000000000200 RSI:
0000000000000010 RDI:
ffff9c999e403500
[ 600.351676] RBP:
ffffa9be839b3b70 R08:
0000000000000000 R09:
ffffffff92a25c00
[ 600.351679] R10:
ffffa9be839b3ae8 R11:
0000000000000001 R12:
ffff9c995375b800
[ 600.351681] R13:
0000000000000000 R14:
ffff9c99619a5000 R15:
ffff9c9962a01c00
[ 600.351684] FS:
00007fb2be27c700(0000) GS:
ffff9c999e880000(0000) knlGS:
0000000000000000
[ 600.351686] CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
[ 600.351689] CR2:
00007fb2bdadbad0 CR3:
000000041c328001 CR4:
00000000001606e0
Crash happened because list_del_rcu() was called twice for smp->ltk. This
was possible if unpair_device was called right after ltk was generated
but before keys were distributed.
In this commit smp_cancel_pairing was refactored to cancel pairing if it
is in progress and otherwise just removes keys. Once keys are removed from
rcu list, pointers to smp context's keys are set to NULL to make sure
removed list items are not accessed later.
This commit also adjusts the functionality of mgmt unpair_device() little
bit. Previously pairing was canceled only if pairing was in state that
keys were already generated. With this commit unpair_device() cancels
pairing already in earlier states.
Bug was found by fuzzing kernel SMP implementation using Synopsys
Defensics.
Reported-by: Pekka Oikarainen <pekka.oikarainen@synopsys.com>
Signed-off-by: Matias Karhumaa <matias.karhumaa@gmail.com>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Martin Willi [Tue, 25 Sep 2018 07:51:02 +0000 (09:51 +0200)]
mac80211_hwsim: do not omit multicast announce of first added radio
[ Upstream commit
28ef8b49a338dc1844e86b7954cfffc7dfa2660a ]
The allocation of hwsim radio identifiers uses a post-increment from 0,
so the first radio has idx 0. This idx is explicitly excluded from
multicast announcements ever since, but it is unclear why.
Drop that idx check and announce the first radio as well. This makes
userspace happy if it relies on these events.
Signed-off-by: Martin Willi <martin@strongswan.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Martin Willi [Tue, 25 Sep 2018 07:41:14 +0000 (09:41 +0200)]
mac80211_hwsim: fix race in radio destruction from netlink notifier
[ Upstream commit
f1c47eb61d52379de5747d02bb36be20d7a2d0d3 ]
The asynchronous destruction from a work-queue of radios tagged with
destroy-on-close may race with the owning namespace about to exit,
resulting in potential use-after-free of that namespace.
Instead of using a work-queue, move radios about to destroy to a
temporary list, which can be worked on synchronously after releasing
the lock. This should be safe to do from the netlink socket notifier,
as the namespace is guaranteed to not get released.
Signed-off-by: Martin Willi <martin@strongswan.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Martin Willi [Tue, 25 Sep 2018 07:41:13 +0000 (09:41 +0200)]
mac80211_hwsim: fix locking when iterating radios during ns exit
[ Upstream commit
628980e5c8f038f730582c6ee50b7410741cd96e ]
The cleanup of radios during namespace exit has recently been reworked
to directly delete a radio while temporarily releasing the spinlock,
fixing a race condition between the work-queue execution and namespace
exits. However, the temporary unlock allows unsafe modifications on the
iterated list, resulting in a potential crash when continuing the
iteration of additional radios.
Move radios about to destroy to a temporary list, and clean that up
after releasing the spinlock once iteration is complete.
Fixes: 8cfd36a0b53a ("mac80211_hwsim: fix use-after-free bug in hwsim_exit_net")
Signed-off-by: Martin Willi <martin@strongswan.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Masashi Honma [Tue, 25 Sep 2018 02:15:00 +0000 (11:15 +0900)]
nl80211: Fix possible Spectre-v1 for NL80211_TXRATE_HT
[ Upstream commit
30fe6d50eb088783c8729c7d930f65296b2b3fa7 ]
Use array_index_nospec() to sanitize ridx with respect to speculation.
Signed-off-by: Masashi Honma <masashi.honma@gmail.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Zhao Qiang [Thu, 1 Feb 2018 06:54:32 +0000 (14:54 +0800)]
soc: fsl: qe: Fix copy/paste bug in ucc_get_tdm_sync_shift()
[ Upstream commit
96fc74333f84cfdf8d434c6c07254e215e2aad00 ]
There is a copy and paste bug so we accidentally use the RX_ shift when
we're in TX_ mode.
Fixes: bb8b2062aff3 ("fsl/qe: setup clock source for TDM mode")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Zhao Qiang <qiang.zhao@nxp.com>
Signed-off-by: Li Yang <leoyang.li@nxp.com>
(cherry picked from commit
3cb31b634052ed458922e0c8e2b4b093d7fb60b9)
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Alexandre Belloni [Thu, 23 Aug 2018 21:36:00 +0000 (23:36 +0200)]
soc: fsl: qbman: qman: avoid allocating from non existing gen_pool
[ Upstream commit
64e9e22e68512da8df3c9a7430f07621e48db3c2 ]
If the qman driver didn't probe, calling qman_alloc_fqid_range,
qman_alloc_pool_range or qman_alloc_cgrid_range (as done in dpaa_eth) will
pass a NULL pointer to gen_pool_alloc, leading to a NULL pointer
dereference.
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Reviewed-by: Roy Pledge <roy.pledge@nxp.com>
Signed-off-by: Li Yang <leoyang.li@nxp.com>
(cherry picked from commit
f72487a2788aa70c3aee1d0ebd5470de9bac953a)
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Michal Simek [Tue, 25 Sep 2018 06:32:50 +0000 (08:32 +0200)]
net: macb: Clean 64b dma addresses if they are not detected
[ Upstream commit
e1e5d8a9fe737d94ccc0ccbaf0c97f69a8f3e000 ]
Clear ADDR64 dma bit in DMACFG register in case that HW_DMA_CAP_64B is
not detected on 64bit system.
The issue was observed when bootloader(u-boot) does not check macb
feature at DCFG6 register (DAW64_OFFSET) and enabling 64bit dma support
by default. Then macb driver is reading DMACFG register back and only
adding 64bit dma configuration but not cleaning it out.
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Florian Fainelli [Thu, 20 Sep 2018 00:14:01 +0000 (17:14 -0700)]
ARM: dts: BCM63xx: Fix incorrect interrupt specifiers
[ Upstream commit
3ab97942d0213b6583a5408630a8cbbfbf54730f ]
A number of our interrupts were incorrectly specified, fix both the PPI
and SPI interrupts to be correct.
Fixes: b5762cacc411 ("ARM: bcm63138: add NAND DT support")
Fixes: 46d4bca0445a ("ARM: BCM63XX: add BCM63138 minimal Device Tree")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Steve Capper [Fri, 21 Sep 2018 15:34:04 +0000 (16:34 +0100)]
arm64: hugetlb: Fix handling of young ptes
[ Upstream commit
469ed9d823b7d240d6b9574f061ded7c3834c167 ]
In the contiguous bit hugetlb break-before-make code we assume that all
hugetlb pages are young.
In fact, remove_migration_pte is able to place an old hugetlb pte so
this assumption is not valid.
This patch fixes the contiguous hugetlb scanning code to preserve young
ptes.
Fixes: d8bdcff28764 ("arm64: hugetlb: Add break-before-make logic for contiguous entries")
Signed-off-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
zhong jiang [Wed, 19 Sep 2018 12:21:11 +0000 (20:21 +0800)]
netfilter: conntrack: get rid of double sizeof
[ Upstream commit
346fa83d10934cf206e2fd0f514bf8ce186f08fe ]
sizeof(sizeof()) is quite strange and does not seem to be what
is wanted here.
The issue is detected with the help of Coccinelle.
Fixes: 39215846740a ("netfilter: conntrack: remove nlattr_size pointer from l4proto trackers")
Signed-off-by: zhong jiang <zhongjiang@huawei.com>
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
David Ahern [Mon, 17 Sep 2018 15:20:36 +0000 (08:20 -0700)]
netfilter: bridge: Don't sabotage nf_hook calls from an l3mdev
[ Upstream commit
a173f066c7cfc031acb8f541708041e009fc9812 ]
For starters, the bridge netfilter code registers operations that
are invoked any time nh_hook is called. Specifically, ip_sabotage_in
watches for nested calls for NF_INET_PRE_ROUTING when a bridge is in
the stack.
Packet wise, the bridge netfilter hook runs first. br_nf_pre_routing
allocates nf_bridge, sets in_prerouting to 1 and calls NF_HOOK for
NF_INET_PRE_ROUTING. It's finish function, br_nf_pre_routing_finish,
then resets in_prerouting flag to 0 and the packet continues up the
stack. The packet eventually makes it to the VRF driver and it invokes
nf_hook for NF_INET_PRE_ROUTING in case any rules have been added against
the vrf device.
Because of the registered operations the call to nf_hook causes
ip_sabotage_in to be invoked. That function sees the nf_bridge on the
skb and that in_prerouting is not set. Thinking it is an invalid nested
call it steals (drops) the packet.
Update ip_sabotage_in to recognize that the bridge or one of its upper
devices (e.g., vlan) can be enslaved to a VRF (L3 master device) and
allow the packet to go through the nf_hook a second time.
Fixes: 73e20b761acf ("net: vrf: Add support for PREROUTING rules on vrf device")
Reported-by: D'Souza, Nelson <ndsouza@ciena.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Hans Verkuil [Mon, 27 Aug 2018 12:28:50 +0000 (14:28 +0200)]
drm/i2c: tda9950: set MAX_RETRIES for errors only
[ Upstream commit
e0dccce1193f87597548d0db6ecc942fb92c04cd ]
The CEC_TX_STATUS_MAX_RETRIES should be set for errors only to
prevent the CEC framework from retrying the transmit. If the
transmit was successful, then don't set this flag.
Found by running 'cec-compliance -A' on a beaglebone box.
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Colin Ian King [Sun, 27 May 2018 21:42:55 +0000 (22:42 +0100)]
drm/i2c: tda9950: fix timeout counter check
[ Upstream commit
d98627d1360d55e3b28f702caca8b6342c4a4e45 ]
Currently the check to see if the timeout has reached zero is incorrect
and the check is instead checking if the timeout is non-zero and not
zero, hence it will break out of the loop on the first iteration and
the msleep is never executed. Fix this by breaking from the loop when
timeout is zero.
Detected by CoverityScan, CID#
1469404 ("Logically Dead Code")
Fixes: f0316f93897c ("drm/i2c: tda9950: add CEC driver")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Sean Tranchetti [Wed, 19 Sep 2018 19:54:56 +0000 (13:54 -0600)]
xfrm: validate template mode
[ Upstream commit
32bf94fb5c2ec4ec842152d0e5937cd4bb6738fa ]
XFRM mode parameters passed as part of the user templates
in the IP_XFRM_POLICY are never properly validated. Passing
values other than valid XFRM modes can cause stack-out-of-bounds
reads to occur later in the XFRM processing:
[ 140.535608] ================================================================
[ 140.543058] BUG: KASAN: stack-out-of-bounds in xfrm_state_find+0x17e4/0x1cc4
[ 140.550306] Read of size 4 at addr
ffffffc0238a7a58 by task repro/5148
[ 140.557369]
[ 140.558927] Call trace:
[ 140.558936] dump_backtrace+0x0/0x388
[ 140.558940] show_stack+0x24/0x30
[ 140.558946] __dump_stack+0x24/0x2c
[ 140.558949] dump_stack+0x8c/0xd0
[ 140.558956] print_address_description+0x74/0x234
[ 140.558960] kasan_report+0x240/0x264
[ 140.558963] __asan_report_load4_noabort+0x2c/0x38
[ 140.558967] xfrm_state_find+0x17e4/0x1cc4
[ 140.558971] xfrm_resolve_and_create_bundle+0x40c/0x1fb8
[ 140.558975] xfrm_lookup+0x238/0x1444
[ 140.558977] xfrm_lookup_route+0x48/0x11c
[ 140.558984] ip_route_output_flow+0x88/0xc4
[ 140.558991] raw_sendmsg+0xa74/0x266c
[ 140.558996] inet_sendmsg+0x258/0x3b0
[ 140.559002] sock_sendmsg+0xbc/0xec
[ 140.559005] SyS_sendto+0x3a8/0x5a8
[ 140.559008] el0_svc_naked+0x34/0x38
[ 140.559009]
[ 140.592245] page dumped because: kasan: bad access detected
[ 140.597981] page_owner info is not active (free page?)
[ 140.603267]
[ 140.653503] ================================================================
Signed-off-by: Sean Tranchetti <stranche@codeaurora.org>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Thomas Petazzoni [Thu, 13 Sep 2018 15:48:08 +0000 (16:48 +0100)]
ARM: 8799/1: mm: fix pci_ioremap_io() offset check
[ Upstream commit
3a58ac65e2d7969bcdf1b6acb70fa4d12a88e53e ]
IO_SPACE_LIMIT is the ending address of the PCI IO space, i.e
something like 0xfffff (and not 0x100000).
Therefore, when offset = 0xf0000 is passed as argument, this function
fails even though the offset + SZ_64K fits below the
IO_SPACE_LIMIT. This makes the last chunk of 64 KB of the I/O space
not usable as it cannot be mapped.
This patch fixes that by substracing 1 to offset + SZ_64K, so that we
compare the addrss of the last byte of the I/O space against
IO_SPACE_LIMIT instead of the address of the first byte of what is
after the I/O space.
Fixes: c2794437091a4 ("ARM: Add fixed PCI i/o mapping")
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Steffen Klassert [Tue, 11 Sep 2018 08:31:15 +0000 (10:31 +0200)]
xfrm: Fix NULL pointer dereference when skb_dst_force clears the dst_entry.
[ Upstream commit
9e1437937807b0122e8da1ca8765be2adca9aee6 ]
Since commit
222d7dbd258d ("net: prevent dst uses after free")
skb_dst_force() might clear the dst_entry attached to the skb.
The xfrm code don't expect this to happen, so we crash with
a NULL pointer dereference in this case. Fix it by checking
skb_dst(skb) for NULL after skb_dst_force() and drop the packet
in cast the dst_entry was cleared.
Fixes: 222d7dbd258d ("net: prevent dst uses after free")
Reported-by: Tobias Hommel <netdev-list@genoetigt.de>
Reported-by: Kristian Evensen <kristian.evensen@gmail.com>
Reported-by: Wolfgang Walter <linux@stwm.de>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Yuan-Chi Pang [Thu, 6 Sep 2018 08:57:48 +0000 (16:57 +0800)]
mac80211: fix TX status reporting for ieee80211s
[ Upstream commit
c42055105785580563535e6d3143cad95c7ac7ee ]
TX status reporting to ieee80211s is through ieee80211s_update_metric.
There are two problems about ieee80211s_update_metric:
1. The purpose is to estimate the fail probability
to a specific link. No need to restrict to data frame.
2. Current implementation does not work if wireless driver does not
pass tx_status with skb.
Fix this by removing ieee80211_is_data condition, passing
ieee80211_tx_status directly to ieee80211s_update_metric, and
putting it in both __ieee80211_tx_status and ieee80211_tx_status_ext.
Signed-off-by: Yuan-Chi Pang <fu3mo6goo@gmail.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Johannes Berg [Wed, 5 Sep 2018 11:34:02 +0000 (13:34 +0200)]
mac80211: TDLS: fix skb queue/priority assignment
[ Upstream commit
cb59bc14e830028d2244861216df038165d7625d ]
If the TDLS setup happens over a connection to an AP that
doesn't have QoS, we nevertheless assign a non-zero TID
(skb->priority) and queue mapping, which may confuse us or
drivers later.
Fix it by just assigning the special skb->priority and then
using ieee80211_select_queue() just like other data frames
would go through.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Jouni Malinen [Wed, 5 Sep 2018 15:52:22 +0000 (18:52 +0300)]
cfg80211: Address some corner cases in scan result channel updating
[ Upstream commit
119f94a6fefcc76d47075b83d2b73d04c895df78 ]
cfg80211_get_bss_channel() is used to update the RX channel based on the
available frame payload information (channel number from DSSS Parameter
Set element or HT Operation element). This is needed on 2.4 GHz channels
where frames may be received on neighboring channels due to overlapping
frequency range.
This might of some use on the 5 GHz band in some corner cases, but
things are more complex there since there is no n:1 or 1:n mapping
between channel numbers and frequencies due to multiple different
starting frequencies in different operating classes. This could result
in ieee80211_channel_to_frequency() returning incorrect frequency and
ieee80211_get_channel() returning incorrect channel information (or
indication of no match). In the previous implementation, this could
result in some scan results being dropped completely, e.g., for the 4.9
GHz channels. That prevented connection to such BSSs.
Fix this by using the driver-provided channel pointer if
ieee80211_get_channel() does not find matching channel data for the
channel number in the frame payload and if the scan is done with 5 MHz
or 10 MHz channel bandwidth. While doing this, also add comments
describing what the function is trying to achieve to make it easier to
understand what happens here and why.
Signed-off-by: Jouni Malinen <jouni@codeaurora.org>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Bob Copeland [Wed, 5 Sep 2018 10:22:59 +0000 (06:22 -0400)]
mac80211: fix pending queue hang due to TX_DROP
[ Upstream commit
6eae4a6c2be387fec41b0d2782c4fffb57159498 ]
In our environment running lots of mesh nodes, we are seeing the
pending queue hang periodically, with the debugfs queues file showing
lines such as:
00: 0x00000000/348
i.e. there are a large number of frames but no stop reason set.
One way this could happen is if queue processing from the pending
tasklet exited early without processing all frames, and without having
some future event (incoming frame, stop reason flag, ...) to reschedule
it.
Exactly this can occur today if ieee80211_tx() returns false due to
packet drops or power-save buffering in the tx handlers. In the
past, this function would return true in such cases, and the change
to false doesn't seem to be intentional. Fix this case by reverting
to the previous behavior.
Fixes: bb42f2d13ffc ("mac80211: Move reorder-sensitive TX handlers to after TXQ dequeue")
Signed-off-by: Bob Copeland <bobcopeland@fb.com>
Acked-by: Toke Høiland-Jørgensen <toke@toke.dk>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Andrei Otcheretianski [Wed, 5 Sep 2018 05:06:12 +0000 (08:06 +0300)]
cfg80211: reg: Init wiphy_idx in regulatory_hint_core()
[ Upstream commit
24f33e64fcd0d50a4b1a8e5b41bd0257aa66b0e8 ]
Core regulatory hints didn't set wiphy_idx to WIPHY_IDX_INVALID. Since
the regulatory request is zeroed, wiphy_idx was always implicitly set to
0. This resulted in updating only phy #0.
Fix that.
Fixes: 806a9e39670b ("cfg80211: make regulatory_request use wiphy_idx instead of wiphy")
Signed-off-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
[add fixes tag]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Andrei Otcheretianski [Wed, 5 Sep 2018 05:06:13 +0000 (08:06 +0300)]
mac80211: Always report TX status
[ Upstream commit
8682250b3c1b75a45feb7452bc413d004cfe3778 ]
If a frame is dropped for any reason, mac80211 wouldn't report the TX
status back to user space.
As the user space may rely on the TX_STATUS to kick its state
machines, resends etc, it's better to just report this frame as not
acked instead.
Signed-off-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Sowmini Varadhan [Mon, 3 Sep 2018 11:36:53 +0000 (04:36 -0700)]
xfrm: reset crypto_done when iterating over multiple input xfrms
[ Upstream commit
782710e333a526780d65918d669cb96646983ba2 ]
We only support one offloaded xfrm (we do not have devices that
can handle more than one offload), so reset crypto_done in
xfrm_input() when iterating over multiple transforms in xfrm_input,
so that we can invoke the appropriate x->type->input for the
non-offloaded transforms
Fixes: d77e38e612a0 ("xfrm: Add an IPsec hardware offloading API")
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Sowmini Varadhan [Mon, 3 Sep 2018 11:36:52 +0000 (04:36 -0700)]
xfrm: reset transport header back to network header after all input transforms ahave been applied
[ Upstream commit
bfc0698bebcb16d19ecfc89574ad4d696955e5d3 ]
A policy may have been set up with multiple transforms (e.g., ESP
and ipcomp). In this situation, the ingress IPsec processing
iterates in xfrm_input() and applies each transform in turn,
processing the nexthdr to find any additional xfrm that may apply.
This patch resets the transport header back to network header
only after the last transformation so that subsequent xfrms
can find the correct transport header.
Fixes: 7785bba299a8 ("esp: Add a software GRO codepath")
Suggested-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Thadeu Lima de Souza Cascardo [Fri, 31 Aug 2018 11:38:49 +0000 (08:38 -0300)]
xfrm6: call kfree_skb when skb is toobig
[ Upstream commit
215ab0f021c9fea3c18b75e7d522400ee6a49990 ]
After commit
d6990976af7c5d8f55903bfb4289b6fb030bf754 ("vti6: fix PMTU caching
and reporting on xmit"), some too big skbs might be potentially passed down to
__xfrm6_output, causing it to fail to transmit but not free the skb, causing a
leak of skb, and consequentially a leak of dst references.
After running pmtu.sh, that shows as failure to unregister devices in a namespace:
[ 311.397671] unregister_netdevice: waiting for veth_b to become free. Usage count = 1
The fix is to call kfree_skb in case of transmit failures.
Fixes: dd767856a36e ("xfrm6: Don't call icmpv6_send on local error")
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Steffen Klassert [Wed, 1 Aug 2018 11:45:11 +0000 (13:45 +0200)]
xfrm: Validate address prefix lengths in the xfrm selector.
[ Upstream commit
07bf7908950a8b14e81aa1807e3c667eab39287a ]
We don't validate the address prefix lengths in the xfrm
selector we got from userspace. This can lead to undefined
behaviour in the address matching functions if the prefix
is too big for the given address family. Fix this by checking
the prefixes and refuse SA/policy insertation when a prefix
is invalid.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: Air Icy <icytxw@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Greg Kroah-Hartman [Sat, 20 Oct 2018 07:47:10 +0000 (09:47 +0200)]
Linux 4.18.16
Alexey Brodkin [Sun, 16 Sep 2018 20:47:57 +0000 (23:47 +0300)]
ARC: build: Don't set CROSS_COMPILE in arch's Makefile
commit
40660f1fcee8d524a60b5101538e42b1f39f106d upstream.
There's not much sense in doing that because if user or
his build-system didn't set CROSS_COMPILE we still may
very well make incorrect guess.
But as it turned out setting CROSS_COMPILE is not as harmless
as one may think: with recent changes that implemented automatic
discovery of __host__ gcc features unconditional setup of
CROSS_COMPILE leads to failures on execution of "make xxx_defconfig"
with absent cross-compiler, for more info see [1].
Set CROSS_COMPILE as well gets in the way if we want only to build
.dtb's (again with absent cross-compiler which is not really needed
for building .dtb's), see [2].
Note, we had to change LIBGCC assignment type from ":=" to "="
so that is is resolved on its usage, otherwise if it is resolved
at declaration time with missing CROSS_COMPILE we're getting this
error message from host GCC:
| gcc: error: unrecognized command line option -mmedium-calls
| gcc: error: unrecognized command line option -mno-sdata
[1] http://lists.infradead.org/pipermail/linux-snps-arc/2018-September/004308.html
[2] http://lists.infradead.org/pipermail/linux-snps-arc/2018-September/004320.html
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Rob Herring <robh@kernel.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Alexey Brodkin [Thu, 13 Sep 2018 20:24:28 +0000 (23:24 +0300)]
ARC: build: Get rid of toolchain check
commit
615f64458ad890ef94abc879a66d8b27236e733a upstream.
This check is very naive: we simply test if GCC invoked without
"-mcpu=XXX" has ARC700 define set. In that case we think that GCC
was built with "--with-cpu=arc700" and has libgcc built for ARC700.
Otherwise if ARC700 is not defined we think that everythng was built
for ARCv2.
But in reality our life is much more interesting.
1. Regardless of GCC configuration (i.e. what we pass in "--with-cpu"
it may generate code for any ARC core).
2. libgcc might be built with explicitly specified "--mcpu=YYY"
That's exactly what happens in case of multilibbed toolchains:
- GCC is configured with default settings
- All the libs built for many different CPU flavors
I.e. that check gets in the way of usage of multilibbed
toolchains. And even non-multilibbed toolchains are affected.
OpenEmbedded also builds GCC without "--with-cpu" because
each and every target component later is compiled with explicitly
set "-mcpu=ZZZ".
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Linus Torvalds [Fri, 12 Oct 2018 22:22:59 +0000 (15:22 -0700)]
mremap: properly flush TLB before releasing the page
commit
eb66ae030829605d61fbef1909ce310e29f78821 upstream.
Jann Horn points out that our TLB flushing was subtly wrong for the
mremap() case. What makes mremap() special is that we don't follow the
usual "add page to list of pages to be freed, then flush tlb, and then
free pages". No, mremap() obviously just _moves_ the page from one page
table location to another.
That matters, because mremap() thus doesn't directly control the
lifetime of the moved page with a freelist: instead, the lifetime of the
page is controlled by the page table locking, that serializes access to
the entry.
As a result, we need to flush the TLB not just before releasing the lock
for the source location (to avoid any concurrent accesses to the entry),
but also before we release the destination page table lock (to avoid the
TLB being flushed after somebody else has already done something to that
page).
This also makes the whole "need_flush" logic unnecessary, since we now
always end up flushing the TLB for every valid entry.
Reported-and-tested-by: Jann Horn <jannh@google.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Tested-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Greg Kroah-Hartman [Thu, 18 Oct 2018 12:22:54 +0000 (14:22 +0200)]
Revert "vfs: fix freeze protection in mnt_want_write_file() for overlayfs"
This reverts commit
5e1002ab5c9bde81a0c1eed12f243987e98f7bd0 which was
commit
a6795a585929d94ca3e931bc8518f8deb8bbe627 upstream.
Turns out this causes problems and was to fix a patch only in the 4.19
and newer tree.
Reported-by: Amir Goldstein <amir73il@gmail.com>
Cc: Miklos Szeredi <mszeredi@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Kairui Song [Thu, 27 Sep 2018 12:38:45 +0000 (20:38 +0800)]
x86/boot: Fix kexec booting failure in the SEV bit detection code
[ Upstream commit
bdec8d7fa55e6f5314ed72e5a0b435d90ff90548 ]
Commit
1958b5fc4010 ("x86/boot: Add early boot support when running with SEV active")
can occasionally cause system resets when kexec-ing a second kernel even
if SEV is not active.
That's because get_sev_encryption_bit() uses 32-bit rIP-relative
addressing to read the value of enc_bit - a variable which caches a
previously detected encryption bit position - but kexec may allocate
the early boot code to a higher location, beyond the 32-bit addressing
limit.
In this case, garbage will be read and get_sev_encryption_bit() will
return the wrong value, leading to accessing memory with the wrong
encryption setting.
Therefore, remove enc_bit, and thus get rid of the need to do 32-bit
rIP-relative addressing in the first place.
[ bp: massage commit message heavily. ]
Fixes: 1958b5fc4010 ("x86/boot: Add early boot support when running with SEV active")
Suggested-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Kairui Song <kasong@redhat.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Cc: linux-kernel@vger.kernel.org
Cc: tglx@linutronix.de
Cc: mingo@redhat.com
Cc: hpa@zytor.com
Cc: brijesh.singh@amd.com
Cc: kexec@lists.infradead.org
Cc: dyoung@redhat.com
Cc: bhe@redhat.com
Cc: ghook@redhat.com
Link: https://lkml.kernel.org/r/20180927123845.32052-1-kasong@redhat.com
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Arindam Nath [Tue, 18 Sep 2018 10:10:58 +0000 (15:40 +0530)]
iommu/amd: Return devid as alias for ACPI HID devices
[ Upstream commit
5ebb1bc2d63d90dd204169e21fd7a0b4bb8c776e ]
ACPI HID devices do not actually have an alias for
them in the IVRS. But dev_data->alias is still used
for indexing into the IOMMU device table for devices
being handled by the IOMMU. So for ACPI HID devices,
we simply return the corresponding devid as an alias,
as parsed from IVRS table.
Signed-off-by: Arindam Nath <arindam.nath@amd.com>
Fixes: 2bf9a0a12749 ('iommu/amd: Add iommu support for ACPI HID devices')
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Srikar Dronamraju [Tue, 25 Sep 2018 12:25:15 +0000 (17:55 +0530)]
powerpc/numa: Use associativity if VPHN hcall is successful
[ Upstream commit
2483ef056f6e42f61cd266452e2841165dfe1b5c ]
Currently associativity is used to lookup node-id even if the
preceding VPHN hcall failed. However this can cause CPU to be made
part of the wrong node, (most likely to be node 0). This is because
VPHN is not enabled on KVM guests.
With
2ea6263 ("powerpc/topology: Get topology for shared processors at
boot"), associativity is used to set to the wrong node. Hence KVM
guest topology is broken.
For example : A 4 node KVM guest before would have reported.
[root@localhost ~]# numactl -H
available: 4 nodes (0-3)
node 0 cpus: 0 1 2 3
node 0 size: 1746 MB
node 0 free: 1604 MB
node 1 cpus: 4 5 6 7
node 1 size: 2044 MB
node 1 free: 1765 MB
node 2 cpus: 8 9 10 11
node 2 size: 2044 MB
node 2 free: 1837 MB
node 3 cpus: 12 13 14 15
node 3 size: 2044 MB
node 3 free: 1903 MB
node distances:
node 0 1 2 3
0: 10 40 40 40
1: 40 10 40 40
2: 40 40 10 40
3: 40 40 40 10
Would now report:
[root@localhost ~]# numactl -H
available: 4 nodes (0-3)
node 0 cpus: 0 2 3 4 5 6 7 8 9 10 11 12 13 14 15
node 0 size: 1746 MB
node 0 free: 1244 MB
node 1 cpus:
node 1 size: 2044 MB
node 1 free: 2032 MB
node 2 cpus: 1
node 2 size: 2044 MB
node 2 free: 2028 MB
node 3 cpus:
node 3 size: 2044 MB
node 3 free: 2032 MB
node distances:
node 0 1 2 3
0: 10 40 40 40
1: 40 10 40 40
2: 40 40 10 40
3: 40 40 40 10
Fix this by skipping associativity lookup if the VPHN hcall failed.
Fixes: 2ea626306810 ("powerpc/topology: Get topology for shared processors at boot")
Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Michael Neuling [Tue, 25 Sep 2018 09:36:47 +0000 (19:36 +1000)]
powerpc/tm: Avoid possible userspace r1 corruption on reclaim
[ Upstream commit
96dc89d526ef77604376f06220e3d2931a0bfd58 ]
Current we store the userspace r1 to PACATMSCRATCH before finally
saving it to the thread struct.
In theory an exception could be taken here (like a machine check or
SLB miss) that could write PACATMSCRATCH and hence corrupt the
userspace r1. The SLB fault currently doesn't touch PACATMSCRATCH, but
others do.
We've never actually seen this happen but it's theoretically
possible. Either way, the code is fragile as it is.
This patch saves r1 to the kernel stack (which can't fault) before we
turn MSR[RI] back on. PACATMSCRATCH is still used but only with
MSR[RI] off. We then copy r1 from the kernel stack to the thread
struct once we have MSR[RI] back on.
Suggested-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Michael Neuling [Mon, 24 Sep 2018 07:27:04 +0000 (17:27 +1000)]
powerpc/tm: Fix userspace r13 corruption
[ Upstream commit
cf13435b730a502e814c63c84d93db131e563f5f ]
When we treclaim we store the userspace checkpointed r13 to a scratch
SPR and then later save the scratch SPR to the user thread struct.
Unfortunately, this doesn't work as accessing the user thread struct
can take an SLB fault and the SLB fault handler will write the same
scratch SPRG that now contains the userspace r13.
To fix this, we store r13 to the kernel stack (which can't fault)
before we access the user thread struct.
Found by running P8 guest + powervm + disable_1tb_segments + TM. Seen
as a random userspace segfault with r13 looking like a kernel address.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Reviewed-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Daniel Kurtz [Sat, 22 Sep 2018 19:58:26 +0000 (13:58 -0600)]
pinctrl/amd: poll InterruptEnable bits in amd_gpio_irq_set_type
[ Upstream commit
b85bfa246efd24ea3fdb5ee949c28e3110c6d299 ]
>From the AMD BKDG, if WAKE_INT_MASTER_REG.MaskStsEn is set, a software
write to the debounce registers of *any* gpio will block wake/interrupt
status generation for *all* gpios for a length of time that depends on
WAKE_INT_MASTER_REG.MaskStsLength[11:0]. During this period the Interrupt
Delivery bit (INTERRUPT_ENABLE) will read as 0.
In commit
4c1de0414a1340 ("pinctrl/amd: poll InterruptEnable bits in
enable_irq") we tried to fix this same "gpio Interrupts are blocked
immediately after writing debounce registers" problem, but incorrectly
assumed it only affected the gpio whose debounce was being configured
and not ALL gpios.
To solve this for all gpios, we move the polling loop from
amd_gpio_irq_enable() to amd_gpio_irq_set_type(), while holding the gpio
spinlock. This ensures that another gpio operation (e.g.
amd_gpio_irq_unmask()) can read a temporarily disabled IRQ and
incorrectly disable it while trying to modify some other register bits.
Fixes: 4c1de0414a1340 pinctrl/amd: poll InterruptEnable bits in enable_irq
Signed-off-by: Daniel Kurtz <djkurtz@chromium.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Heiko Stuebner [Mon, 27 Aug 2018 10:56:24 +0000 (12:56 +0200)]
iommu/rockchip: Free irqs in shutdown handler
[ Upstream commit
74bc2abca7603c956d1e331e8b9bee7b874c1eec ]
In the iommu's shutdown handler we disable runtime-pm which could
result in the irq-handler running unclocked and since commit
3fc7c5c0cff3 ("iommu/rockchip: Handle errors returned from PM framework")
we warn about that fact.
This can cause warnings on shutdown on some Rockchip machines, so
free the irqs in the shutdown handler before we disable runtime-pm.
Reported-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
Fixes: 3fc7c5c0cff3 ("iommu/rockchip: Handle errors returned from PM framework")
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Tested-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
James Cowgill [Thu, 6 Sep 2018 21:57:56 +0000 (22:57 +0100)]
RISC-V: include linux/ftrace.h in asm-prototypes.h
[ Upstream commit
57a489786de9ec37d6e25ef1305dc337047f0236 ]
Building a riscv kernel with CONFIG_FUNCTION_TRACER and
CONFIG_MODVERSIONS enabled results in these two warnings:
MODPOST vmlinux.o
WARNING: EXPORT symbol "return_to_handler" [vmlinux] version generation failed, symbol will not be versioned.
WARNING: EXPORT symbol "_mcount" [vmlinux] version generation failed, symbol will not be versioned.
When exporting symbols from an assembly file, the MODVERSIONS code
requires their prototypes to be defined in asm-prototypes.h (see
scripts/Makefile.build). Since both of these symbols have prototypes
defined in linux/ftrace.h, include this header from RISC-V's
asm-prototypes.h.
Reported-by: Karsten Merker <merker@debian.org>
Signed-off-by: James Cowgill <jcowgill@debian.org>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Selvin Xavier [Fri, 21 Sep 2018 05:33:00 +0000 (22:33 -0700)]
RDMA/bnxt_re: Fix system crash during RDMA resource initialization
[ Upstream commit
de5c95d0f518537f59ee5aef762abc46f868c377 ]
bnxt_re_ib_reg acquires and releases the rtnl lock whenever it accesses
the L2 driver.
The following sequence can trigger a crash
Acquires the rtnl_lock ->
Registers roce driver callback with L2 driver ->
release the rtnl lock
bnxt_re acquires the rtnl_lock ->
Request for MSIx vectors ->
release the rtnl_lock
Issue happens when bnxt_re proceeds with remaining part of initialization
and L2 driver invokes bnxt_ulp_irq_stop as a part of bnxt_open_nic.
The crash is in bnxt_qplib_nq_stop_irq as the NQ structures are
not initialized yet,
<snip>
[ 3551.726647] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 3551.726656] IP: [<
ffffffffc0840ee9>] bnxt_qplib_nq_stop_irq+0x59/0xb0 [bnxt_re]
[ 3551.726674] PGD 0
[ 3551.726679] Oops: 0002 1 SMP
...
[ 3551.726822] Hardware name: Dell Inc. PowerEdge R720/08RW36, BIOS 2.4.3 07/09/2014
[ 3551.726826] task:
ffff97e30eec5ee0 ti:
ffff97e3173bc000 task.ti:
ffff97e3173bc000
[ 3551.726829] RIP: 0010:[<
ffffffffc0840ee9>] [<
ffffffffc0840ee9>]
bnxt_qplib_nq_stop_irq+0x59/0xb0 [bnxt_re]
...
[ 3551.726872] Call Trace:
[ 3551.726886] [<
ffffffffc082cb9e>] bnxt_re_stop_irq+0x4e/0x70 [bnxt_re]
[ 3551.726899] [<
ffffffffc07d6a53>] bnxt_ulp_irq_stop+0x43/0x70 [bnxt_en]
[ 3551.726908] [<
ffffffffc07c82f4>] bnxt_reserve_rings+0x174/0x1e0 [bnxt_en]
[ 3551.726917] [<
ffffffffc07cafd8>] __bnxt_open_nic+0x368/0x9a0 [bnxt_en]
[ 3551.726925] [<
ffffffffc07cb62b>] bnxt_open_nic+0x1b/0x50 [bnxt_en]
[ 3551.726934] [<
ffffffffc07cc62f>] bnxt_setup_mq_tc+0x11f/0x260 [bnxt_en]
[ 3551.726943] [<
ffffffffc07d5f58>] bnxt_dcbnl_ieee_setets+0xb8/0x1f0 [bnxt_en]
[ 3551.726954] [<
ffffffff890f983a>] dcbnl_ieee_set+0x9a/0x250
[ 3551.726966] [<
ffffffff88fd6d21>] ? __alloc_skb+0xa1/0x2d0
[ 3551.726972] [<
ffffffff890f72fa>] dcb_doit+0x13a/0x210
[ 3551.726981] [<
ffffffff89003ff7>] rtnetlink_rcv_msg+0xa7/0x260
[ 3551.726989] [<
ffffffff88ffdb00>] ? rtnl_unicast+0x20/0x30
[ 3551.726996] [<
ffffffff88bf9dc8>] ? __kmalloc_node_track_caller+0x58/0x290
[ 3551.727002] [<
ffffffff890f7326>] ? dcb_doit+0x166/0x210
[ 3551.727007] [<
ffffffff88fd6d0d>] ? __alloc_skb+0x8d/0x2d0
[ 3551.727012] [<
ffffffff89003f50>] ? rtnl_newlink+0x880/0x880
...
[ 3551.727104] [<
ffffffff8911f7d5>] system_call_fastpath+0x1c/0x21
...
[ 3551.727164] RIP [<
ffffffffc0840ee9>] bnxt_qplib_nq_stop_irq+0x59/0xb0 [bnxt_re]
[ 3551.727175] RSP <
ffff97e3173bf788>
[ 3551.727177] CR2:
0000000000000000
Avoid this inconsistent state and system crash by acquiring
the rtnl lock for the entire duration of device initialization.
Re-factor the code to remove the rtnl lock from the individual function
and acquire and release it from the caller.
Fixes: 1ac5a4047975 ("RDMA/bnxt_re: Add bnxt_re RoCE driver")
Fixes: 6e04b1035689 ("RDMA/bnxt_re: Fix broken RoCE driver due to recent L2 driver changes")
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Tao Ren [Wed, 19 Sep 2018 22:13:31 +0000 (15:13 -0700)]
clocksource/drivers/fttmr010: Fix set_next_event handler
[ Upstream commit
4451d3f59f2a6f95e5d205c2d04ea072955d080d ]
Currently, the aspeed MATCH1 register is updated to <current_count -
cycles> in set_next_event handler, with the assumption that COUNT
register value is preserved when the timer is disabled and it continues
decrementing after the timer is enabled. But the assumption is wrong:
RELOAD register is loaded into COUNT register when the aspeed timer is
enabled, which means the next event may be delayed because timer
interrupt won't be generated until <0xFFFFFFFF - current_count +
cycles>.
The problem can be fixed by updating RELOAD register to <cycles>, and
COUNT register will be re-loaded when the timer is enabled and interrupt
is generated when COUNT register overflows.
The test result on Facebook Backpack-CMM BMC hardware (AST2500) shows
the issue is fixed: without the patch, usleep(100) suspends the process
for several milliseconds (and sometimes even over 40 milliseconds);
after applying the fix, usleep(100) takes averagely 240 microseconds to
return under the same workload level.
Signed-off-by: Tao Ren <taoren@fb.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Tested-by: Lei YU <mine260309@gmail.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Nathan Chancellor [Fri, 21 Sep 2018 09:44:12 +0000 (02:44 -0700)]
net/mlx4: Use cpumask_available for eq->affinity_mask
[ Upstream commit
8ac1ee6f4d62e781e3b3fd8b9c42b70371427669 ]
Clang warns that the address of a pointer will always evaluated as true
in a boolean context:
drivers/net/ethernet/mellanox/mlx4/eq.c:243:11: warning: address of
array 'eq->affinity_mask' will always evaluate to 'true'
[-Wpointer-bool-conversion]
if (!eq->affinity_mask || cpumask_empty(eq->affinity_mask))
~~~~~^~~~~~~~~~~~~
1 warning generated.
Use cpumask_available, introduced in commit
f7e30f01a9e2 ("cpumask: Add
helper cpumask_available()"), which does the proper checking and avoids
this warning.
Link: https://github.com/ClangBuiltLinux/linux/issues/86
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
John Fastabend [Tue, 18 Sep 2018 16:01:54 +0000 (09:01 -0700)]
bpf: test_maps, only support ESTABLISHED socks
[ Upstream commit
5028027844cfc6168e39650abecd817ba64c9d98 ]
Ensure that sockets added to a sock{map|hash} that is not in the
ESTABLISHED state is rejected.
Fixes: 1aa12bdf1bfb ("bpf: sockmap, add sock close() hook to remove socks")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
John Fastabend [Tue, 18 Sep 2018 16:01:49 +0000 (09:01 -0700)]
bpf: sockmap, fix transition through disconnect without close
[ Upstream commit
b05545e15e1ff1d6a6a8593971275f9cc3e6b92b ]
It is possible (via shutdown()) for TCP socks to go trough TCP_CLOSE
state via tcp_disconnect() without actually calling tcp_close which
would then call our bpf_tcp_close() callback. Because of this a user
could disconnect a socket then put it in a LISTEN state which would
break our assumptions about sockets always being ESTABLISHED state.
To resolve this rely on the unhash hook, which is called in the
disconnect case, to remove the sock from the sockmap.
Reported-by: Eric Dumazet <edumazet@google.com>
Fixes: 1aa12bdf1bfb ("bpf: sockmap, add sock close() hook to remove socks")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
John Fastabend [Tue, 18 Sep 2018 16:01:44 +0000 (09:01 -0700)]
bpf: sockmap only allow ESTABLISHED sock state
[ Upstream commit
5607fff303636d48b88414c6be353d9fed700af2 ]
After this patch we only allow socks that are in ESTABLISHED state or
are being added via a sock_ops event that is transitioning into an
ESTABLISHED state. By allowing sock_ops events we allow users to
manage sockmaps directly from sock ops programs. The two supported
sock_ops ops are BPF_SOCK_OPS_PASSIVE_ESTABLISHED_CB and
BPF_SOCK_OPS_ACTIVE_ESTABLISHED_CB.
Similar to TLS ULP this ensures sk_user_data is correct.
Reported-by: Eric Dumazet <edumazet@google.com>
Fixes: 1aa12bdf1bfb ("bpf: sockmap, add sock close() hook to remove socks")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>