Erez Zadok [Fri, 13 May 2022 23:42:26 +0000 (19:42 -0400)]
Wrapfs: add ->bmap support
Signed-off-by: Aditya Sonavane <aditya.sonavane@stonybrook.edu>
Signed-off-by: Erez Zadok <ezk@cs.stonybrook.edu>
Erez Zadok [Sat, 7 May 2022 22:12:52 +0000 (18:12 -0400)]
wrapfs: copyright year update
Erez Zadok [Sat, 7 May 2022 20:48:45 +0000 (16:48 -0400)]
wrapfs: change http URLs to https
Erez Zadok [Mon, 16 Dec 2019 00:02:15 +0000 (19:02 -0500)]
merge changes
Erez Zadok [Sat, 14 Dec 2019 05:06:57 +0000 (00:06 -0500)]
wrapfs: update copyright
Erez Zadok [Sat, 14 Dec 2019 04:22:20 +0000 (23:22 -0500)]
wrapfs: remove unused variable
Erez Zadok [Fri, 29 Mar 2019 20:46:28 +0000 (16:46 -0400)]
update copyright
Erez Zadok [Fri, 29 Mar 2019 20:40:35 +0000 (16:40 -0400)]
fix spell error
Erez Zadok [Sun, 19 Feb 2017 00:11:23 +0000 (19:11 -0500)]
Wrapfs: ->iget fixes
Change where we igrab/iput to ensure we always hold a valid lower_inode.
Return ENOMEM (not EACCES) if iget5_locked returns NULL.
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Erez Zadok [Sat, 18 Feb 2017 20:20:33 +0000 (15:20 -0500)]
Wrapfs: update copyrights for 2017
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Erez Zadok [Sun, 22 May 2016 05:27:13 +0000 (01:27 -0400)]
Wrapfs: support NFS exports
Based on patch from Sandeep Joshi <sanjos100@gmail.com>.
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Erez Zadok [Sun, 22 May 2016 05:27:13 +0000 (01:27 -0400)]
Wrapfs: use d_splice_alias
Refactor interpose code to allow lookup to use d_splice_alias.
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Erez Zadok [Mon, 14 Dec 2015 23:38:16 +0000 (18:38 -0500)]
Wrapfs: update nlinks after rename
Signed-off-by: Logeswari P Viswanath <logeswari.pv@hpe.com>
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Erez Zadok [Wed, 4 Nov 2015 06:24:29 +0000 (01:24 -0500)]
Wrapfs: update copyright year to 2015
Erez Zadok [Wed, 4 Nov 2015 06:24:29 +0000 (01:24 -0500)]
Wrapfs: use vfs xattr helpers
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Erez Zadok [Sat, 16 Aug 2014 03:22:26 +0000 (23:22 -0400)]
Wrapfs: properly copy meta-data after AIO operations from lower inode
Signed-off-by: Mengyang Li <li.mengyang@stonybrook.edu>
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Erez Zadok [Mon, 11 Aug 2014 22:59:28 +0000 (18:59 -0400)]
Wrapfs: leave placeholders for updating upper inode after AIO
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Erez Zadok [Sun, 10 Aug 2014 06:25:45 +0000 (02:25 -0400)]
Wrapfs: protect lower_file by ref-count during aio operation
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Signed-off-by: Mengyang Li <li.mengyang@stonybrook.edu>
Erez Zadok [Thu, 26 Jun 2014 03:22:55 +0000 (23:22 -0400)]
Wrapfs: fix ->llseek to update upper and lower offsets
Fixes bug: xfstests generic/257. f_pos consistently is required by and
only by dir_ops->wrapfs_readdir, main_ops is not affected.
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Signed-off-by: Mengyang Li <li.mengyang@stonybrook.edu>
Erez Zadok [Thu, 26 Jun 2014 03:22:55 +0000 (23:22 -0400)]
Wrapfs: support extended attributes (xattr) operations
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Signed-off-by: Mengyang Li <li.mengyang@stonybrook.edu>
Erez Zadok [Sat, 21 Jun 2014 00:28:12 +0000 (20:28 -0400)]
Wrapfs: support asynchronous-IO (AIO) operations
Signed-off-by: Li Mengyang <li.mengyang@stonybrook.edu>
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Erez Zadok [Sat, 21 Jun 2014 00:28:12 +0000 (20:28 -0400)]
Wrapfs: support direct-IO (DIO) operations
Signed-off-by: Li Mengyang <li.mengyang@stonybrook.edu>
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Erez Zadok [Thu, 15 May 2014 04:16:05 +0000 (00:16 -0400)]
Wrapfs: implement vm_ops->page_mkwrite
Some file systems (e.g., ext4) require it. Reported by Ted Ts'o.
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Erez Zadok [Thu, 3 Apr 2014 18:12:09 +0000 (14:12 -0400)]
Wrapfs: update documentation
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Erez Zadok [Thu, 3 Apr 2014 18:12:00 +0000 (14:12 -0400)]
Wrapfs: update maintainers
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Erez Zadok [Tue, 21 Jan 2014 08:23:10 +0000 (03:23 -0500)]
Wrapfs: update documentation
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Erez Zadok [Tue, 21 Jan 2014 06:17:55 +0000 (01:17 -0500)]
Wrapfs: 2014 Copyright update
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Erez Zadok [Wed, 5 Jun 2013 05:44:38 +0000 (01:44 -0400)]
Wrapfs: copy lower inode attributes in ->ioctl
Some ioctls (e.g., EXT2_IOC_SETFLAGS) can change inode attributes, so copy
them from lower inode.
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Erez Zadok [Wed, 5 Jun 2013 05:43:38 +0000 (01:43 -0400)]
Wrapfs: remove unnecessary call to vm_unmap in ->mmap
Code is unnecessary and causes deadlocks in newer kernels.
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Erez Zadok [Mon, 3 Jun 2013 01:55:33 +0000 (21:55 -0400)]
patch copyright-2013.patch
Erez Zadok [Sun, 26 May 2013 03:23:42 +0000 (23:23 -0400)]
Wrapfs: use d_make_root
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Erez Zadok [Tue, 31 Jan 2012 09:40:19 +0000 (04:40 -0500)]
Wrapfs: use mode_t
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Erez Zadok [Mon, 30 Jan 2012 01:34:27 +0000 (20:34 -0500)]
Wrapfs: use set_nlink()
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Erez Zadok [Fri, 9 Sep 2011 04:47:49 +0000 (00:47 -0400)]
Wrapfs: drop our dentry in ->rmdir
Also clear nlinks on our inode.
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Erez Zadok [Tue, 6 Sep 2011 04:10:32 +0000 (00:10 -0400)]
Wrapfs: use d_alloc_root
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Erez Zadok [Tue, 6 Sep 2011 04:10:31 +0000 (00:10 -0400)]
Wrapfs: use d_set_d_op
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Erez Zadok [Tue, 6 Sep 2011 04:10:30 +0000 (00:10 -0400)]
Wrapfs: use updated vfs_path_lookup prototype
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Erez Zadok [Tue, 6 Sep 2011 04:10:30 +0000 (00:10 -0400)]
Wrapfs: ->fsync updates for new prototype
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Erez Zadok [Tue, 6 Sep 2011 04:10:29 +0000 (00:10 -0400)]
Wrapfs: support LOOKUP_RCU in ->d_revalidate
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Erez Zadok [Tue, 6 Sep 2011 04:10:28 +0000 (00:10 -0400)]
Wrapfs: new ->permission prototype and fixes.
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Erez Zadok [Mon, 2 May 2011 06:00:02 +0000 (02:00 -0400)]
Wrapfs: lookup fixes
Don't use lookup_one_len any longer (doesn't work for NFS).
Initialize lower wrapfs_dentry_info so lower_path is NULL.
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Erez Zadok [Fri, 18 Mar 2011 17:14:28 +0000 (13:14 -0400)]
Wrapfs: remove extra debug in rmdir
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Erez Zadok [Fri, 18 Mar 2011 16:38:01 +0000 (12:38 -0400)]
Wrapfs: checkpatch fixes
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Erez Zadok [Fri, 18 Mar 2011 04:45:17 +0000 (00:45 -0400)]
Wrapfs: port to 2.6.39
Remove lock/unlock_kernel in ->fasync.
Convert from ->get_sb to ->mount op.
Remove include to smp_lock.h, added sched.h.
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Erez Zadok [Fri, 18 Mar 2011 03:21:55 +0000 (23:21 -0400)]
Wrapfs: copyright update for 2011
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Erez Zadok [Fri, 18 Mar 2011 03:21:55 +0000 (23:21 -0400)]
Wrapfs: better handling of NFS silly-renamed files
In ->unlink, if we try to unlink an NFS silly-renamed file, NFS returns
-EBUSY. We have to treat it as a success and return 0 to the VFS. NFS will
remove silly-deleted files later on anyway.
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Erez Zadok [Fri, 18 Mar 2011 03:21:55 +0000 (23:21 -0400)]
Wrapfs: update parent directory inode size in inode ops
After ->unlink, ->rmdir, and ->rename, we need to copy the (possibly
changed) inode size of the parent directory(ies) where the operation took
place.
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Erez Zadok [Fri, 18 Mar 2011 03:21:55 +0000 (23:21 -0400)]
Wrapfs: remove unnecessary calls to copy lower inode->n_links
Removed from ->create, ->symlink, and ->mknod.
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Erez Zadok [Tue, 8 Mar 2011 04:20:33 +0000 (23:20 -0500)]
Wrapfs: ->setattr fixes
Call inode_change_ok on our inode, not lower.
Don't copy inode sizes (VFS does it).
Pass lower file in struct iattr passed to notify_change on lower inode.
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Erez Zadok [Sun, 6 Mar 2011 21:23:16 +0000 (16:23 -0500)]
Wrapfs: update ->permission prototye and code for new iperm flag
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Erez Zadok [Fri, 12 Nov 2010 23:15:05 +0000 (18:15 -0500)]
Wrapfs: handle maxbytes properly
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Erez Zadok [Sat, 11 Sep 2010 19:49:33 +0000 (15:49 -0400)]
Wrapfs: support ->unlocked_ioctl and ->compat_ioctl
Old ->ioctl was split into ->unlocked_ioctl and ->compat_ioctl. Compat
version doesn't need to lock_kernel any longer.
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Erez Zadok [Wed, 11 Aug 2010 03:50:14 +0000 (23:50 -0400)]
Wrapfs: new vfs_statfs and ->evict_inode prototypes
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Erez Zadok [Sat, 7 Aug 2010 03:37:29 +0000 (23:37 -0400)]
Wrapfs: update ->fsync prototype
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Erez Zadok [Wed, 21 Apr 2010 01:22:02 +0000 (21:22 -0400)]
Wrapfs: update documentation
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Erez Zadok [Tue, 20 Apr 2010 19:32:09 +0000 (15:32 -0400)]
Wrapfs: include slab.h
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Erez Zadok [Tue, 20 Apr 2010 19:26:02 +0000 (15:26 -0400)]
Wrapfs: avoid an extra path_get/put pair in wrapfs_open
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Erez Zadok [Fri, 26 Feb 2010 08:18:04 +0000 (03:18 -0500)]
Wrapfs: decrement nd_path on follow_link error
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Erez Zadok [Tue, 5 Jan 2010 09:27:00 +0000 (04:27 -0500)]
Wrapfs: don't mention kernel version in modload message
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Erez Zadok [Tue, 5 Jan 2010 09:28:11 +0000 (04:28 -0500)]
Wrapfs/VFS: remove init_lower_nd and unexport release_lower_nd
Only wrapfs_create used it, and it is unnecessary to init a completely new
nameidata for the lower file system.
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Erez Zadok [Tue, 5 Jan 2010 01:45:06 +0000 (20:45 -0500)]
Kconfig: hook to configure Wrapfs
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Erez Zadok [Tue, 5 Jan 2010 01:45:06 +0000 (20:45 -0500)]
Makefile: hook to compile Wrapfs
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Erez Zadok [Tue, 5 Jan 2010 01:45:06 +0000 (20:45 -0500)]
VFS: export release_open_intent symbol
Needed to release the resources of the lower nameidata structures that we
create and pass to lower file systems (e.g., when calling vfs_create).
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Erez Zadok [Tue, 5 Jan 2010 01:45:06 +0000 (20:45 -0500)]
Wrapfs: file system magic number
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Erez Zadok [Tue, 5 Jan 2010 01:45:06 +0000 (20:45 -0500)]
Wrapfs: Kconfig options
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Erez Zadok [Tue, 5 Jan 2010 01:45:06 +0000 (20:45 -0500)]
Wrapfs: main Makefile
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Erez Zadok [Tue, 5 Jan 2010 01:45:06 +0000 (20:45 -0500)]
Wrapfs: vm_ops operations
Includes necessary address_space workaround ops.
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Erez Zadok [Tue, 5 Jan 2010 01:45:06 +0000 (20:45 -0500)]
Wrapfs: mount-time and module-linkage functions
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Erez Zadok [Tue, 5 Jan 2010 01:45:06 +0000 (20:45 -0500)]
Wrapfs: lookup-related functions
Main lookup function, nameidata helpers, and stacking-interposition
functions.
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Erez Zadok [Tue, 5 Jan 2010 01:45:06 +0000 (20:45 -0500)]
Wrapfs: file operations
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Erez Zadok [Tue, 5 Jan 2010 01:45:06 +0000 (20:45 -0500)]
Wrapfs: dentry operations
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Erez Zadok [Tue, 5 Jan 2010 01:45:06 +0000 (20:45 -0500)]
Wrapfs: inode operations
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Erez Zadok [Tue, 5 Jan 2010 01:45:06 +0000 (20:45 -0500)]
Wrapfs: superblock operations
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Erez Zadok [Tue, 5 Jan 2010 01:45:06 +0000 (20:45 -0500)]
Wrapfs: main header file
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Erez Zadok [Tue, 5 Jan 2010 01:45:06 +0000 (20:45 -0500)]
Wrapfs: Maintainers
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Erez Zadok [Tue, 5 Jan 2010 01:45:06 +0000 (20:45 -0500)]
Documentation: index entry for Wrapfs
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Erez Zadok [Tue, 5 Jan 2010 01:45:06 +0000 (20:45 -0500)]
Wrapfs: introduction and usage documentation
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Zefan Li [Wed, 26 Oct 2016 15:15:47 +0000 (23:15 +0800)]
Linux 3.4.113
Arnaldo Carvalho de Melo [Mon, 14 Mar 2016 12:56:35 +0000 (09:56 -0300)]
net: Fix use after free in the recvmmsg exit path
commit
34b88a68f26a75e4fded796f1a49c40f82234b7d upstream.
The syzkaller fuzzer hit the following use-after-free:
Call Trace:
[<
ffffffff8175ea0e>] __asan_report_load8_noabort+0x3e/0x40 mm/kasan/report.c:295
[<
ffffffff851cc31a>] __sys_recvmmsg+0x6fa/0x7f0 net/socket.c:2261
[< inline >] SYSC_recvmmsg net/socket.c:2281
[<
ffffffff851cc57f>] SyS_recvmmsg+0x16f/0x180 net/socket.c:2270
[<
ffffffff86332bb6>] entry_SYSCALL_64_fastpath+0x16/0x7a
arch/x86/entry/entry_64.S:185
And, as Dmitry rightly assessed, that is because we can drop the
reference and then touch it when the underlying recvmsg calls return
some packets and then hit an error, which will make recvmmsg to set
sock->sk->sk_err, oops, fix it.
Reported-and-Tested-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Kostya Serebryany <kcc@google.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Fixes: a2e2725541fa ("net: Introduce recvmmsg socket syscall")
http://lkml.kernel.org/r/
20160122211644.GC2470@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Zefan Li <lizefan@huawei.com>
Michal Hocko [Sun, 16 Oct 2016 09:55:00 +0000 (11:55 +0200)]
mm, gup: close FOLL MAP_PRIVATE race
commit
19be0eaffa3ac7d8eb6784ad9bdbc7d67ed8e619 upstream.
faultin_page drops FOLL_WRITE after the page fault handler did the CoW
and then we retry follow_page_mask to get our CoWed page. This is racy,
however because the page might have been unmapped by that time and so
we would have to do a page fault again, this time without CoW. This
would cause the page cache corruption for FOLL_FORCE on MAP_PRIVATE
read only mappings with obvious consequences.
This is an ancient bug that was actually already fixed once by Linus
eleven years ago in commit
4ceb5db9757a ("Fix get_user_pages() race
for write access") but that was then undone due to problems on s390
by commit
f33ea7f404e5 ("fix get_user_pages bug") because s390 didn't
have proper dirty pte tracking until
abf09bed3cce ("s390/mm: implement
software dirty bits"). This wasn't a problem at the time as pointed out
by Hugh Dickins because madvise relied on mmap_sem for write up until
0a27a14a6292 ("mm: madvise avoid exclusive mmap_sem") but since then we
can race with madvise which can unmap the fresh COWed page or with KSM
and corrupt the content of the shared page.
This patch is based on the Linus' approach to not clear FOLL_WRITE after
the CoW page fault (aka VM_FAULT_WRITE) but instead introduces FOLL_COW
to note this fact. The flag is then rechecked during follow_pfn_pte to
enforce the page fault again if we do not see the CoWed page. Linus was
suggesting to check pte_dirty again as s390 is OK now. But that would
make backporting to some old kernels harder. So instead let's just make
sure that vm_normal_page sees a pure anonymous page.
This would guarantee we are seeing a real CoW page. Introduce
can_follow_write_pte which checks both pte_write and falls back to
PageAnon on forced write faults which passed CoW already. Thanks to Hugh
to point out that a special care has to be taken for KSM pages because
our COWed page might have been merged with a KSM one and keep its
PageAnon flag.
Fixes: 0a27a14a6292 ("mm: madvise avoid exclusive mmap_sem")
Reported-by: Phil "not Paul" Oester <kernel@linuxace.com>
Disclosed-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Michal Hocko <mhocko@suse.com>
[bwh: Backported to 3.2:
- Adjust filename, context, indentation
- The 'no_page' exit path in follow_page() is different, so open-code the
cleanup
- Delete a now-unused label]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Zefan Li <lizefan@huawei.com>
Nikolay Aleksandrov [Tue, 17 Nov 2015 14:49:06 +0000 (15:49 +0100)]
net/core: revert "net: fix __netdev_update_features return.." and add comment
commit
17b85d29e82cc3c874a497a8bc5764d6a2b043e2 upstream.
This reverts commit
00ee59271777 ("net: fix __netdev_update_features return
on ndo_set_features failure")
and adds a comment explaining why it's okay to return a value other than
0 upon error. Some drivers might actually change flags and return an
error so it's better to fire a spurious notification rather than miss
these.
CC: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Zefan Li <lizefan@huawei.com>
Paul Bolle [Thu, 18 Feb 2016 20:29:08 +0000 (21:29 +0100)]
ser_gigaset: use container_of() instead of detour
commit
8d2c3ab4445640957d136caa3629857d63544a2a upstream.
The purpose of gigaset_device_release() is to kfree() the struct
ser_cardstate that contains our struct device. This is done via a bit of
a detour. First we make our struct device's driver_data point to the
container of our struct ser_cardstate (which is a struct cardstate). In
gigaset_device_release() we then retrieve that driver_data again. And
after that we finally kfree() the struct ser_cardstate that was saved in
the struct cardstate.
All of this can be achieved much easier by using container_of() to get
from our struct device to its container, struct ser_cardstate. Do so.
Note that at the time the detour was implemented commit
b8b2c7d845d5
("base/platform: assert that dev_pm_domain callbacks are called
unconditionally") had just entered the tree. That commit disconnected
our platform_device and our platform_driver. These were reconnected
again in v4.5-rc2 through commit
25cad69f21f5 ("base/platform: Fix
platform drivers with no probe callback"). And one of the consequences
of that fix was that it broke the detour via driver_data. That's because
it made __device_release_driver() stop being a NOP for our struct device
and actually do stuff again. One of the things it now does, is setting
our driver_data to NULL. That, in turn, makes it impossible for
gigaset_device_release() to get to our struct cardstate. Which has the
net effect of leaking a struct ser_cardstate at every call of this
driver's tty close() operation. So using container_of() has the
additional benefit of actually working.
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Tested-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Acked-by: Tilman Schmidt <tilman@imap.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Zefan Li <lizefan@huawei.com>
Tilman Schmidt [Tue, 15 Dec 2015 17:11:31 +0000 (18:11 +0100)]
ser_gigaset: remove unnecessary kfree() calls from release method
commit
8aeb3c3d655e22d3aa5ba49f313157bd27354bb4 upstream.
device->platform_data and platform_device->resource are never used
and remain NULL through their entire life. Drops the kfree() calls
for them from the device release method.
Signed-off-by: Tilman Schmidt <tilman@imap.cc>
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Zefan Li <lizefan@huawei.com>
Konrad Rzeszutek Wilk [Thu, 11 Feb 2016 21:10:24 +0000 (16:10 -0500)]
xen/pciback: Save the number of MSI-X entries to be copied later.
commit
d159457b84395927b5a52adb72f748dd089ad5e5 upstream.
Commit
8135cf8b092723dbfcc611fe6fdcb3a36c9951c5 (xen/pciback: Save
xen_pci_op commands before processing it) broke enabling MSI-X because
it would never copy the resulting vectors into the response. The
number of vectors requested was being overwritten by the return value
(typically zero for success).
Save the number of vectors before processing the op, so the correct
number of vectors are copied afterwards.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Zefan Li <lizefan@huawei.com>
Tetsuo Handa [Fri, 5 Feb 2016 23:36:30 +0000 (15:36 -0800)]
mm, vmstat: fix wrong WQ sleep when memory reclaim doesn't make any progress
commit
564e81a57f9788b1475127012e0fd44e9049e342 upstream.
Jan Stancek has reported that system occasionally hanging after "oom01"
testcase from LTP triggers OOM. Guessing from a result that there is a
kworker thread doing memory allocation and the values between "Node 0
Normal free:" and "Node 0 Normal:" differs when hanging, vmstat is not
up-to-date for some reason.
According to commit
373ccbe59270 ("mm, vmstat: allow WQ concurrency to
discover memory reclaim doesn't make any progress"), it meant to force
the kworker thread to take a short sleep, but it by error used
schedule_timeout(1). We missed that schedule_timeout() in state
TASK_RUNNING doesn't do anything.
Fix it by using schedule_timeout_uninterruptible(1) which forces the
kworker thread to take a short sleep in order to make sure that vmstat
is up-to-date.
Fixes: 373ccbe59270 ("mm, vmstat: allow WQ concurrency to discover memory reclaim doesn't make any progress")
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reported-by: Jan Stancek <jstancek@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Cristopher Lameter <clameter@sgi.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Arkadiusz Miskiewicz <arekm@maven.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Zefan Li <lizefan@huawei.com>
John Stultz [Thu, 11 Jun 2015 22:54:55 +0000 (15:54 -0700)]
time: Prevent early expiry of hrtimers[CLOCK_REALTIME] at the leap second edge
commit
833f32d763028c1bb371c64f457788b933773b3e upstream.
Currently, leapsecond adjustments are done at tick time. As a result,
the leapsecond was applied at the first timer tick *after* the
leapsecond (~1-10ms late depending on HZ), rather then exactly on the
second edge.
This was in part historical from back when we were always tick based,
but correcting this since has been avoided since it adds extra
conditional checks in the gettime fastpath, which has performance
overhead.
However, it was recently pointed out that ABS_TIME CLOCK_REALTIME
timers set for right after the leapsecond could fire a second early,
since some timers may be expired before we trigger the timekeeping
timer, which then applies the leapsecond.
This isn't quite as bad as it sounds, since behaviorally it is similar
to what is possible w/ ntpd made leapsecond adjustments done w/o using
the kernel discipline. Where due to latencies, timers may fire just
prior to the settimeofday call. (Also, one should note that all
applications using CLOCK_REALTIME timers should always be careful,
since they are prone to quirks from settimeofday() disturbances.)
However, the purpose of having the kernel do the leap adjustment is to
avoid such latencies, so I think this is worth fixing.
So in order to properly keep those timers from firing a second early,
this patch modifies the ntp and timekeeping logic so that we keep
enough state so that the update_base_offsets_now accessor, which
provides the hrtimer core the current time, can check and apply the
leapsecond adjustment on the second edge. This prevents the hrtimer
core from expiring timers too early.
This patch does not modify any other time read path, so no additional
overhead is incurred. However, this also means that the leap-second
continues to be applied at tick time for all other read-paths.
Apologies to Richard Cochran, who pushed for similar changes years
ago, which I resisted due to the concerns about the performance
overhead.
While I suspect this isn't extremely critical, folks who care about
strict leap-second correctness will likely want to watch
this. Potentially a -stable candidate eventually.
Originally-suggested-by: Richard Cochran <richardcochran@gmail.com>
Reported-by: Daniel Bristot de Oliveira <bristot@redhat.com>
Reported-by: Prarit Bhargava <prarit@redhat.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jiri Bohac <jbohac@suse.cz>
Cc: Shuah Khan <shuahkh@osg.samsung.com>
Cc: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/1434063297-28657-4-git-send-email-john.stultz@linaro.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
[Yadi: Move do_adjtimex to timekeeping.c and solve context issues]
Signed-off-by: Hu <yadi.hu@windriver.com>
Signed-off-by: Zefan Li <lizefan@huawei.com>
Eric Dumazet [Sun, 10 Jul 2016 08:04:02 +0000 (10:04 +0200)]
tcp: make challenge acks less predictable
commit
75ff39ccc1bd5d3c455b6822ab09e533c551f758 upstream.
Yue Cao claims that current host rate limiting of challenge ACKS
(RFC 5961) could leak enough information to allow a patient attacker
to hijack TCP sessions. He will soon provide details in an academic
paper.
This patch increases the default limit from 100 to 1000, and adds
some randomization so that the attacker can no longer hijack
sessions without spending a considerable amount of probes.
Based on initial analysis and patch from Linus.
Note that we also have per socket rate limiting, so it is tempting
to remove the host limit in the future.
v2: randomize the count of challenge acks per second, not the period.
Fixes: 282f23c6ee34 ("tcp: implement RFC 5961 3.2")
Reported-by: Yue Cao <ycao009@ucr.edu>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[lizf: Backported to 3.4:
- adjust context
- use ACCESS_ONCE instead WRITE_ONCE/READ_ONCE
- open-code prandom_u32_max()]
Signed-off-by: Zefan Li <lizefan@huawei.com>
Zefan Li [Sun, 9 Oct 2016 11:23:12 +0000 (19:23 +0800)]
Revert "USB: Add OTG PET device to TPL"
This reverts commit
97fa724b23c3dd22e9c0979ad0e9d260cc6d545d.
Conflicts:
drivers/usb/core/quirks.c
Signed-off-by: Zefan Li <lizefan@huawei.com>
Zefan Li [Sun, 9 Oct 2016 11:20:47 +0000 (19:20 +0800)]
Revert "USB: Add device quirk for ASUS T100 Base Station keyboard"
This reverts commit
eea5a87d270e8d6925063019c3b0f3ff61fcb49a.
Conflicts:
drivers/usb/core/quirks.c
include/linux/usb/quirks.h
Signed-off-by: Zefan Li <lizefan@huawei.com>
Zefan Li [Sun, 9 Oct 2016 11:06:49 +0000 (19:06 +0800)]
Fix incomplete backport of commit
0f792cf949a0
Signed-off-by: Zefan Li <lizefan@huawei.com>
Zefan Li [Sun, 9 Oct 2016 11:00:59 +0000 (19:00 +0800)]
Fix incomplete backport of commit
423f04d63cf4
Signed-off-by: Zefan Li <lizefan@huawei.com>
Nicolas Dichtel [Wed, 5 Sep 2012 02:12:42 +0000 (02:12 +0000)]
ipv6: fix handling of blackhole and prohibit routes
commit
ef2c7d7b59708d54213c7556a82d14de9a7e4475 upstream.
When adding a blackhole or a prohibit route, they were handling like classic
routes. Moreover, it was only possible to add this kind of routes by specifying
an interface.
Bug already reported here:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=498498
Before the patch:
$ ip route add blackhole 2001::1/128
RTNETLINK answers: No such device
$ ip route add blackhole 2001::1/128 dev eth0
$ ip -6 route | grep 2001
2001::1 dev eth0 metric 1024
After:
$ ip route add blackhole 2001::1/128
$ ip -6 route | grep 2001
blackhole 2001::1 dev lo metric 1024 error -22
v2: wrong patch
v3: add a field fc_type in struct fib6_config to store RTN_* type
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Zefan Li <lizefan@huawei.com>
Michal Kubeček [Mon, 9 Sep 2013 19:45:04 +0000 (21:45 +0200)]
ipv6: don't call fib6_run_gc() until routing is ready
commit
2c861cc65ef4604011a0082e4dcdba2819aa191a upstream.
When loading the ipv6 module, ndisc_init() is called before
ip6_route_init(). As the former registers a handler calling
fib6_run_gc(), this opens a window to run the garbage collector
before necessary data structures are initialized. If a network
device is initialized in this window, adding MAC address to it
triggers a NETDEV_CHANGEADDR event, leading to a crash in
fib6_clean_all().
Take the event handler registration out of ndisc_init() into a
separate function ndisc_late_init() and move it after
ip6_route_init().
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Zefan Li <lizefan@huawei.com>
Michal Kubeček [Thu, 1 Aug 2013 08:04:24 +0000 (10:04 +0200)]
ipv6: update ip6_rt_last_gc every time GC is run
commit
49a18d86f66d33a20144ecb5a34bba0d1856b260 upstream.
As pointed out by Eric Dumazet, net->ipv6.ip6_rt_last_gc should
hold the last time garbage collector was run so that we should
update it whenever fib6_run_gc() calls fib6_clean_all(), not only
if we got there from ip6_dst_gc().
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Zefan Li <lizefan@huawei.com>
Karl Heiss [Thu, 24 Sep 2015 16:15:07 +0000 (12:15 -0400)]
sctp: Prevent soft lockup when sctp_accept() is called during a timeout event
commit
635682a14427d241bab7bbdeebb48a7d7b91638e upstream.
A case can occur when sctp_accept() is called by the user during
a heartbeat timeout event after the 4-way handshake. Since
sctp_assoc_migrate() changes both assoc->base.sk and assoc->ep, the
bh_sock_lock in sctp_generate_heartbeat_event() will be taken with
the listening socket but released with the new association socket.
The result is a deadlock on any future attempts to take the listening
socket lock.
Note that this race can occur with other SCTP timeouts that take
the bh_lock_sock() in the event sctp_accept() is called.
BUG: soft lockup - CPU#9 stuck for 67s! [swapper:0]
...
RIP: 0010:[<
ffffffff8152d48e>] [<
ffffffff8152d48e>] _spin_lock+0x1e/0x30
RSP: 0018:
ffff880028323b20 EFLAGS:
00000206
RAX:
0000000000000002 RBX:
ffff880028323b20 RCX:
0000000000000000
RDX:
0000000000000000 RSI:
ffff880028323be0 RDI:
ffff8804632c4b48
RBP:
ffffffff8100bb93 R08:
0000000000000000 R09:
0000000000000000
R10:
ffff880610662280 R11:
0000000000000100 R12:
ffff880028323aa0
R13:
ffff8804383c3880 R14:
ffff880028323a90 R15:
ffffffff81534225
FS:
0000000000000000(0000) GS:
ffff880028320000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0:
000000008005003b
CR2:
00000000006df528 CR3:
0000000001a85000 CR4:
00000000000006e0
DR0:
0000000000000000 DR1:
0000000000000000 DR2:
0000000000000000
DR3:
0000000000000000 DR6:
00000000ffff0ff0 DR7:
0000000000000400
Process swapper (pid: 0, threadinfo
ffff880616b70000, task
ffff880616b6cab0)
Stack:
ffff880028323c40 ffffffffa01c2582 ffff880614cfb020 0000000000000000
<d>
0100000000000000 00000014383a6c44 ffff8804383c3880 ffff880614e93c00
<d>
ffff880614e93c00 0000000000000000 ffff8804632c4b00 ffff8804383c38b8
Call Trace:
<IRQ>
[<
ffffffffa01c2582>] ? sctp_rcv+0x492/0xa10 [sctp]
[<
ffffffff8148c559>] ? nf_iterate+0x69/0xb0
[<
ffffffff814974a0>] ? ip_local_deliver_finish+0x0/0x2d0
[<
ffffffff8148c716>] ? nf_hook_slow+0x76/0x120
[<
ffffffff814974a0>] ? ip_local_deliver_finish+0x0/0x2d0
[<
ffffffff8149757d>] ? ip_local_deliver_finish+0xdd/0x2d0
[<
ffffffff81497808>] ? ip_local_deliver+0x98/0xa0
[<
ffffffff81496ccd>] ? ip_rcv_finish+0x12d/0x440
[<
ffffffff81497255>] ? ip_rcv+0x275/0x350
[<
ffffffff8145cfeb>] ? __netif_receive_skb+0x4ab/0x750
...
With lockdep debugging:
=====================================
[ BUG: bad unlock balance detected! ]
-------------------------------------
CslRx/12087 is trying to release lock (slock-AF_INET) at:
[<
ffffffffa01bcae0>] sctp_generate_timeout_event+0x40/0xe0 [sctp]
but there are no more locks to release!
other info that might help us debug this:
2 locks held by CslRx/12087:
#0: (&asoc->timers[i]){+.-...}, at: [<
ffffffff8108ce1f>] run_timer_softirq+0x16f/0x3e0
#1: (slock-AF_INET){+.-...}, at: [<
ffffffffa01bcac3>] sctp_generate_timeout_event+0x23/0xe0 [sctp]
Ensure the socket taken is also the same one that is released by
saving a copy of the socket before entering the timeout event
critical section.
Signed-off-by: Karl Heiss <kheiss@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.2:
- Net namespaces are not used
- Keep using sctp_bh_{,un}lock_sock()
- Adjust context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Zefan Li <lizefan@huawei.com>
Dave Airlie [Thu, 20 Aug 2015 00:13:55 +0000 (10:13 +1000)]
drm/radeon: fix hotplug race at startup
commit
7f98ca454ad373fc1b76be804fa7138ff68c1d27 upstream.
We apparantly get a hotplug irq before we've initialised
modesetting,
[drm] Loading R100 Microcode
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<
c125f56f>] __mutex_lock_slowpath+0x23/0x91
*pde =
00000000
Oops: 0002 [#1]
Modules linked in: radeon(+) drm_kms_helper ttm drm i2c_algo_bit backlight pcspkr psmouse evdev sr_mod input_leds led_class cdrom sg parport_pc parport floppy intel_agp intel_gtt lpc_ich acpi_cpufreq processor button mfd_core agpgart uhci_hcd ehci_hcd rng_core snd_intel8x0 snd_ac97_codec ac97_bus snd_pcm usbcore usb_common i2c_i801 i2c_core snd_timer snd soundcore thermal_sys
CPU: 0 PID: 15 Comm: kworker/0:1 Not tainted
4.2.0-rc7-00015-gbf67402 #111
Hardware name: MicroLink /D850MV , BIOS MV85010A.86A.0067.P24.
0304081124 04/08/2003
Workqueue: events radeon_hotplug_work_func [radeon]
task:
f6ca5900 ti:
f6d3e000 task.ti:
f6d3e000
EIP: 0060:[<
c125f56f>] EFLAGS:
00010282 CPU: 0
EIP is at __mutex_lock_slowpath+0x23/0x91
EAX:
00000000 EBX:
f5e900fc ECX:
00000000 EDX:
fffffffe
ESI:
f6ca5900 EDI:
f5e90100 EBP:
f5e90000 ESP:
f6d3ff0c
DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
CR0:
8005003b CR2:
00000000 CR3:
36f61000 CR4:
000006d0
Stack:
f5e90100 00000000 c103c4c1 f6d2a5a0 f5e900fc f6df394c c125f162 f8b0faca
f6d2a5a0 c138ca00 f6df394c f7395600 c1034741 00d40000 00000000 f6d2a5a0
c138ca00 f6d2a5b8 c138ca10 c1034b58 00000001 f6d40000 f6ca5900 f6d0c940
Call Trace:
[<
c103c4c1>] ? dequeue_task_fair+0xa4/0xb7
[<
c125f162>] ? mutex_lock+0x9/0xa
[<
f8b0faca>] ? radeon_hotplug_work_func+0x17/0x57 [radeon]
[<
c1034741>] ? process_one_work+0xfc/0x194
[<
c1034b58>] ? worker_thread+0x18d/0x218
[<
c10349cb>] ? rescuer_thread+0x1d5/0x1d5
[<
c103742a>] ? kthread+0x7b/0x80
[<
c12601c0>] ? ret_from_kernel_thread+0x20/0x30
[<
c10373af>] ? init_completion+0x18/0x18
Code: 42 08 e8 8e a6 dd ff c3 57 56 53 83 ec 0c 8b 35 48 f7 37 c1 8b 10 4a 74 1a 89 c3 8d 78 04 8b 40 08 89 63
Reported-and-Tested-by: Meelis Roos <mroos@linux.ee>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Cc: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Zefan Li <lizefan@huawei.com>
Eric Dumazet [Wed, 30 Dec 2015 13:51:12 +0000 (08:51 -0500)]
udp: properly support MSG_PEEK with truncated buffers
commit
197c949e7798fbf28cfadc69d9ca0c2abbf93191 upstream.
Backport of this upstream commit into stable kernels :
89c22d8c3b27 ("net: Fix skb csum races when peeking")
exposed a bug in udp stack vs MSG_PEEK support, when user provides
a buffer smaller than skb payload.
In this case,
skb_copy_and_csum_datagram_iovec(skb, sizeof(struct udphdr),
msg->msg_iov);
returns -EFAULT.
This bug does not happen in upstream kernels since Al Viro did a great
job to replace this into :
skb_copy_and_csum_datagram_msg(skb, sizeof(struct udphdr), msg);
This variant is safe vs short buffers.
For the time being, instead reverting Herbert Xu patch and add back
skb->ip_summed invalid changes, simply store the result of
udp_lib_checksum_complete() so that we avoid computing the checksum a
second time, and avoid the problematic
skb_copy_and_csum_datagram_iovec() call.
This patch can be applied on recent kernels as it avoids a double
checksumming, then backported to stable kernels as a bug fix.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
[lizf: Backported to 3.4: adjust context]
Signed-off-by: Zefan Li <lizefan@huawei.com>
Herbert Xu [Mon, 13 Jul 2015 12:01:42 +0000 (20:01 +0800)]
net: Fix skb csum races when peeking
[ Upstream commit
89c22d8c3b278212eef6a8cc66b570bc840a6f5a ]
When we calculate the checksum on the recv path, we store the
result in the skb as an optimisation in case we need the checksum
again down the line.
This is in fact bogus for the MSG_PEEK case as this is done without
any locking. So multiple threads can peek and then store the result
to the same skb, potentially resulting in bogus skb states.
This patch fixes this by only storing the result if the skb is not
shared. This preserves the optimisations for the few cases where
it can be done safely due to locking or other reasons, e.g., SIOCINQ.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Zefan Li <lizefan@huawei.com>
Ben Hutchings [Wed, 23 Dec 2015 13:25:54 +0000 (13:25 +0000)]
USB: ti_usb_3410_502: Fix ID table size
Commit
35a2fbc941ac ("USB: serial: ti_usb_3410_5052: new device id for
Abbot strip port cable") failed to update the size of the
ti_id_table_3410 array. This doesn't need to be fixed upstream
following commit
d7ece6515e12 ("USB: ti_usb_3410_5052: remove
vendor/product module parameters") but should be fixed in stable
branches older than 3.12.
Backports of commit
c9d09dc7ad10 ("USB: serial: ti_usb_3410_5052: add
Abbott strip port ID to combined table as well.") similarly failed to
update the size of the ti_id_table_combined array.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Zefan Li <lizefan@huawei.com>
Eric Dumazet [Wed, 1 May 2013 05:24:03 +0000 (05:24 +0000)]
af_unix: fix a fatal race with bit fields
commit
60bc851ae59bfe99be6ee89d6bc50008c85ec75d upstream.
Using bit fields is dangerous on ppc64/sparc64, as the compiler [1]
uses 64bit instructions to manipulate them.
If the 64bit word includes any atomic_t or spinlock_t, we can lose
critical concurrent changes.
This is happening in af_unix, where unix_sk(sk)->gc_candidate/
gc_maybe_cycle/lock share the same 64bit word.
This leads to fatal deadlock, as one/several cpus spin forever
on a spinlock that will never be available again.
A safer way would be to use a long to store flags.
This way we are sure compiler/arch wont do bad things.
As we own unix_gc_lock spinlock when clearing or setting bits,
we can use the non atomic __set_bit()/__clear_bit().
recursion_level can share the same 64bit location with the spinlock,
as it is set only with this spinlock held.
[1] bug fixed in gcc-4.8.0 :
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52080
Reported-by: Ambrose Feinstein <ambrose@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: hejianet <hejianet@gmail.com>
Signed-off-by: Zefan Li <lizefan@huawei.com>