unionfs-3.15.y.git
11 years agoUnionfs: Check remount options for being NULL
Adrian Brunyate [Thu, 26 Apr 2007 22:51:16 +0000 (18:51 -0400)]
Unionfs: Check remount options for being NULL

Signed-off-by: Adrian Brunyate <abrunyate@yahoo.com>
Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu>
11 years agoUnionfs: Accept MS_SILENT during remount
Adrian Brunyate [Thu, 26 Apr 2007 22:51:09 +0000 (18:51 -0400)]
Unionfs: Accept MS_SILENT during remount

[jsipek: whitespace cleanup]
Signed-off-by: Adrian Brunyate <abrunyate@yahoo.com>
Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu>
11 years agoUnionfs: Don't inline do_remount_{add,del,mode}_option
Erez Zadok [Fri, 23 Mar 2007 17:10:46 +0000 (13:10 -0400)]
Unionfs: Don't inline do_remount_{add,del,mode}_option

gcc4 decided to inline do_remount_{add,del,mode}_option creating an 600 byte
stack abuser on a x86_64 test box.

Reported by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu>

Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu>
11 years agoUnionfs: Added several BUG_ONs to assert dentry validity
Erez Zadok [Fri, 23 Mar 2007 17:07:14 +0000 (13:07 -0400)]
Unionfs: Added several BUG_ONs to assert dentry validity

This should help catch races between the VFS and the branch-management code.

Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu>
11 years agoUnionfs: Properly handle stale inodes passed to unionfs_permission
Erez Zadok [Fri, 23 Mar 2007 00:17:48 +0000 (20:17 -0400)]
Unionfs: Properly handle stale inodes passed to unionfs_permission

Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu>
11 years agoUnionfs: Pass lowernd to lower ->revalidate function
Erez Zadok [Fri, 23 Mar 2007 00:09:07 +0000 (20:09 -0400)]
Unionfs: Pass lowernd to lower ->revalidate function

Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu>
11 years agoUnionfs: vfsmount reference counting fixes
Erez Zadok [Fri, 23 Mar 2007 00:08:30 +0000 (20:08 -0400)]
Unionfs: vfsmount reference counting fixes

Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu>
11 years agoUnionfs: unionfs_create needs to revalidate the dentry
Erez Zadok [Fri, 23 Mar 2007 00:01:56 +0000 (20:01 -0400)]
Unionfs: unionfs_create needs to revalidate the dentry

We have to read-lock the superblock rwsem, and we have to revalidate the
parent dentry and this one.  A branch-management operation could have taken
place, mid-way through a VFS operation that eventually reaches
unionfs_create().  So we have to ensure consistency, just as we do with the
file operations.

Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu>
11 years agoUnionfs: Decrement totalopens counter on error in unionfs_open
Erez Zadok [Thu, 22 Mar 2007 23:47:43 +0000 (19:47 -0400)]
Unionfs: Decrement totalopens counter on error in unionfs_open

Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu>
11 years agoUnionfs: Document unionfs_d_release locking
Josef 'Jeff' Sipek [Sun, 18 Mar 2007 20:30:45 +0000 (16:30 -0400)]
Unionfs: Document unionfs_d_release locking

Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu>
11 years agoUnionfs: Remove the older incgen ioctl
Josef 'Jeff' Sipek [Sun, 18 Mar 2007 20:22:03 +0000 (16:22 -0400)]
Unionfs: Remove the older incgen ioctl

The new remount code now has the "incgen" functionality.

Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu>
11 years agoUnionfs: Grab the unionfs sb private data lock around branch info users
Erez Zadok [Thu, 22 Mar 2007 23:32:40 +0000 (19:32 -0400)]
Unionfs: Grab the unionfs sb private data lock around branch info users

Locking/concurrency/race fixes.  Use the unionfs superblock rwsem, and grab
the read lock around every op that uses branch-related information, such as
branch counters.  Grab the write rwsem lock in operations which attempt to
change branch information, such as when adding/deleting branches.  This
will, for example, cause branch-management remount commands (which are
infrequent) to block a bit until all in-progress file operations on open
files are done.

Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
[jsipek: whitespace fixes & more locks/unlocks]
Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu>
11 years agoUnionfs: Rewrite unionfs_d_revalidate
Erez Zadok [Tue, 20 Mar 2007 01:02:55 +0000 (21:02 -0400)]
Unionfs: Rewrite unionfs_d_revalidate

Rewrite unionfs_d_revalidate code to avoid stack-unfriendly recursion: split
into a call to revalidate just one dentry, and an interative driver function
to revalidate an entire dentry-parent chain.

Fix vfsmount ref leaks which prevented lower f/s from being unmounted after
generation increment, esp. during heavy loads.

Fix one deadlock between revalidation code and VFS.

Better documentation of what the code does.

Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
[jsipek: compile & whitespace fixes]
Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu>
11 years agoUnionfs: Introduce unionfs_mnt{get,put}
Erez Zadok [Tue, 20 Mar 2007 00:57:49 +0000 (20:57 -0400)]
Unionfs: Introduce unionfs_mnt{get,put}

Helper inline functions to perform Unionfs's mntget/put ops on lower
branches.

Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
[jsipek: cleanup branching in unionfs_mnt{get,put} and compile fixes]
Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu>
11 years agoUnionfs: Bulk of branch-management remount code
Erez Zadok [Tue, 20 Mar 2007 00:44:12 +0000 (20:44 -0400)]
Unionfs: Bulk of branch-management remount code

Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu>
11 years agoUnionfs: Introduce branch-id code
Erez Zadok [Thu, 22 Mar 2007 23:43:29 +0000 (19:43 -0400)]
Unionfs: Introduce branch-id code

Each branch gets a unique ID, which helps during branch additions,
deletions, and changes, to locate where branches were moved to, and perform
proper reference-counting.  This is useful even if the same directory was
added more than once to union.

Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu>
11 years agoUnionfs: Actually verify if dentry's info node is locked
Erez Zadok [Mon, 9 Apr 2007 13:57:51 +0000 (09:57 -0400)]
Unionfs: Actually verify if dentry's info node is locked

Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu>
11 years agoUnionfs: Provide more helpful info on branch leaks during unmount
Erez Zadok [Mon, 19 Mar 2007 07:01:49 +0000 (03:01 -0400)]
Unionfs: Provide more helpful info on branch leaks during unmount

Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
[jsipek: no need to take a read lock on the superblock private data]
Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu>
11 years agoUnionfs: Rename unionfs_data sbcount field to more appropriate open_files
Erez Zadok [Mon, 19 Mar 2007 04:46:03 +0000 (00:46 -0400)]
Unionfs: Rename unionfs_data sbcount field to more appropriate open_files

Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu>
11 years agoUnionfs: Proper comment on rwsem field
Josef 'Jeff' Sipek [Sun, 18 Mar 2007 23:37:51 +0000 (19:37 -0400)]
Unionfs: Proper comment on rwsem field

Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu>
11 years agoUnionfs: Documentation updates for branch-management
Erez Zadok [Fri, 23 Mar 2007 21:17:07 +0000 (17:17 -0400)]
Unionfs: Documentation updates for branch-management

Describe dynamic branch-management introduced by subsequent patches.

Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu>
11 years agofs: Export drop_pagecache_sb symbol
Erez Zadok [Sun, 18 Mar 2007 23:24:29 +0000 (19:24 -0400)]
fs: Export drop_pagecache_sb symbol

Signed-off-by: Erez Zadok <ezkcs.sunysb.edu>
Signed-off-by: Josef 'Jeff' Sipek <jsipekcs.sunysb.edu>
11 years agofs: Introduce path{get,put}
Erez Zadok [Sun, 18 Mar 2007 23:23:52 +0000 (19:23 -0400)]
fs: Introduce path{get,put}

Export drop_pagecache_sb symbol (for branch-management).

Signed-off-by: Erez Zadok <ezkcs.sunysb.edu>
Signed-off-by: Josef 'Jeff' Sipek <jsipekcs.sunysb.edu>
11 years ago[PATCH] Unionfs: sioq not __exit
Randy Dunlap [Mon, 19 Mar 2007 23:15:22 +0000 (19:15 -0400)]
[PATCH] Unionfs: sioq not __exit

stop_sioq() is called from both __init and __exit functions, so it
shouldn't be marked __exit.

Reported on the kernelnewbies mailing list, but no patch offered there.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu>
11 years agoMAINTAINERS: Unionfs URL update
Josef 'Jeff' Sipek [Sat, 17 Mar 2007 23:42:25 +0000 (19:42 -0400)]
MAINTAINERS: Unionfs URL update

Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu>
11 years agofs/unionfs/: Fix unlocking in error paths
Erez Zadok [Sun, 4 Mar 2007 23:02:11 +0000 (18:02 -0500)]
fs/unionfs/: Fix unlocking in error paths

Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu>
11 years agofs/unionfs/: Fix dentry leak in copyup_named_dentry
Erez Zadok [Sun, 4 Mar 2007 20:21:14 +0000 (15:21 -0500)]
fs/unionfs/: Fix dentry leak in copyup_named_dentry

When we chmod a directory on a readonly branch, and have to copy it up, we
forget to dput(). If this was a file, it gets dput indirectly through other
functions we call, but not if it was a directory.

Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu>
11 years agofs/unionfs/: mntput in __cleanup_dentry
Erez Zadok [Fri, 2 Mar 2007 18:10:56 +0000 (13:10 -0500)]
fs/unionfs/: mntput in __cleanup_dentry

This fixes a mnt refleak which occured during copyup when directory
hierarchy was recreated on a writable branch.

Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu>
11 years agofs/unionfs/: Fix copyup_deleted_file dentry leak
Erez Zadok [Sun, 4 Mar 2007 05:41:51 +0000 (00:41 -0500)]
fs/unionfs/: Fix copyup_deleted_file dentry leak

Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu>
11 years agofs/unionfs/: Miscellaneous coding style fixes
Josef 'Jeff' Sipek [Fri, 2 Mar 2007 18:14:01 +0000 (13:14 -0500)]
fs/unionfs/: Miscellaneous coding style fixes

Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu>
11 years agofs/unionfs/: Check return value of d_path
Josef 'Jeff' Sipek [Tue, 27 Feb 2007 19:27:25 +0000 (14:27 -0500)]
fs/unionfs/: Check return value of d_path

Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu>
11 years agofs/unionfs/: Use SEEK_{SET,CUR} instead of hardcoded values
Josef 'Jeff' Sipek [Sun, 25 Feb 2007 22:32:46 +0000 (17:32 -0500)]
fs/unionfs/: Use SEEK_{SET,CUR} instead of hardcoded values

Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu>
11 years agofs/unionfs/: Remove alloc_filldir_node
Josef 'Jeff' Sipek [Sun, 25 Feb 2007 22:30:34 +0000 (17:30 -0500)]
fs/unionfs/: Remove alloc_filldir_node

alloc_filldir_node was used only once. Additionally, all the arguments
passed to it were ignored wasting stack space for no reason whatsoever.

Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu>
11 years agofs/unionfs/: Rename unionfs_d_revalidate_wrap
Josef 'Jeff' Sipek [Sat, 24 Feb 2007 21:47:45 +0000 (16:47 -0500)]
fs/unionfs/: Rename unionfs_d_revalidate_wrap

Follow the convention of "foo" calling "__foo".

Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu>
11 years agofs/unionfs/: Several small cleanups in unionfs_interpose
Josef 'Jeff' Sipek [Sat, 24 Feb 2007 21:46:08 +0000 (16:46 -0500)]
fs/unionfs/: Several small cleanups in unionfs_interpose

1) No need to lock the inode - lockdep was complaining about potential
circular dependency

2) No need to use temporary variable for iunique() inode number

3) Removed unneeded comment

Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu>
11 years agofs/unionfs/: Don't grab dentry private data mutex in unionfs_d_release
Josef 'Jeff' Sipek [Fri, 23 Feb 2007 06:04:09 +0000 (01:04 -0500)]
fs/unionfs/: Don't grab dentry private data mutex in unionfs_d_release

Grabbing the UNIONFS_D(dentry)->lock is completely unnecessary and there are
no other references; we are about to free the object anyway. Additionally,
grabbing the mutex produces warning when the slab object is reused - as it
was freed while there still was a reference to it.

Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu>
11 years agofs/unionfs/: Fix a memory leak in unionfs_read_super
Erez Zadok [Fri, 2 Mar 2007 18:35:33 +0000 (13:35 -0500)]
fs/unionfs/: Fix a memory leak in unionfs_read_super

Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu>
11 years agofs/unionfs: Fix a memory leak & null pointer dereference
Erez Zadok [Fri, 2 Mar 2007 18:00:57 +0000 (13:00 -0500)]
fs/unionfs: Fix a memory leak & null pointer dereference

Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu>
11 years ago[PATCH] UNION_FS must depend on SLAB
Adrian Bunk [Wed, 21 Feb 2007 06:30:10 +0000 (01:30 -0500)]
[PATCH] UNION_FS must depend on SLAB

On Sat, Feb 17, 2007 at 09:51:46PM -0800, Andrew Morton wrote:
>...
> Changes since 2.6.20-mm1:
>...
>  git-unionfs.patch
>...
>  git trees
>...

<--  snip  -->

...
  CC      fs/unionfs/copyup.o
/home/bunk/linux/kernel-2.6/linux-2.6.20-mm2/fs/unionfs/copyup.c: In function 'create_parents_named':
/home/bunk/linux/kernel-2.6/linux-2.6.20-mm2/fs/unionfs/copyup.c:620: error: 'malloc_sizes' undeclared (first use in this function)
/home/bunk/linux/kernel-2.6/linux-2.6.20-mm2/fs/unionfs/copyup.c:620: error: (Each undeclared identifier is reported only once
/home/bunk/linux/kernel-2.6/linux-2.6.20-mm2/fs/unionfs/copyup.c:620: error: for each function it appears in.)
make[3]: *** [fs/unionfs/copyup.o] Error 1

<--  snip  -->

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu>
11 years agofs/unionfs/: Remove unused structure members & macros
Josef 'Jeff' Sipek [Sat, 17 Feb 2007 08:25:15 +0000 (03:25 -0500)]
fs/unionfs/: Remove unused structure members & macros

This patch removes:
- hidden_mnt pointer from struct unionfs_data
- mount_flag from struct unionfs_sb_info
- mount_flag related macros

Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu>
11 years agoUnionfs: Documentation update
Erez Zadok [Mon, 12 Feb 2007 17:36:38 +0000 (12:36 -0500)]
Unionfs: Documentation update

Be little gentler & updated the URLs

Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu>
11 years agofs/: Move eCryptfs & Unionfs config options into a sub-menu
Josef 'Jeff' Sipek [Tue, 20 Feb 2007 04:35:32 +0000 (23:35 -0500)]
fs/: Move eCryptfs & Unionfs config options into a sub-menu

Using The Misc filesystems sub-menu for layered/stackable filesystems only
makes it harder for users to find eCryptfs/Unionfs.

Additionally, the menu can be easily turned into a menuconfig, which could
be used to turn on any VFS/VM functionality required by layered filesystems
(there is none at the moment).

Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu>
Signed-off-by: Michael Halcrow <mhalcrow@us.ibm.com>
11 years agofs/unionfs/: Use __roundup_pow_of_two instead of custom rounding code
Josef 'Jeff' Sipek [Thu, 1 Feb 2007 17:14:03 +0000 (12:14 -0500)]
fs/unionfs/: Use __roundup_pow_of_two instead of custom rounding code

Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu>
11 years agofs/unionfs/: Don't duplicate the struct nameidata
Josef 'Jeff' Sipek [Sun, 28 Jan 2007 19:20:19 +0000 (14:20 -0500)]
fs/unionfs/: Don't duplicate the struct nameidata

The only fields that we have to watch out for are the dentry and vfsmount.
Additionally, this makes Unionfs gentler on the stack as nameidata is rather
large.

Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu>
11 years agofs/unionfs/: Andrew Morton's comments
Josef 'Jeff' Sipek [Sun, 28 Jan 2007 19:20:49 +0000 (14:20 -0500)]
fs/unionfs/: Andrew Morton's comments

- rename {,un}lock_dentry to unionfs_{,un}lock_dentry
- few minor coding style fixes
- removed prototypes from .c files
- replaced dbstart macros etc with static inlines
- replaced UNIONFS_D(d)->sem semaphore with a mutex
- renamed sioq struct workqueue to superio_workqueue
- made unionfs_get_nlinks and alloc_whname not inlined

Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu>
11 years agofs/unionfs/: possible cleanups
Adrian Bunk [Thu, 25 Jan 2007 08:15:59 +0000 (03:15 -0500)]
fs/unionfs/: possible cleanups

This patch contains the following possible cleanups:
- every function should #include the headers containing the prototypes
  of it's global functions
- static functions in C files shouldn't be marked "inline", gcc should
  know best when to inline them
- make needlessly global code static
- #if 0 the following unused global function:
  - stale_inode.c: is_stale_inode()

Signed-off-by: Adrian Bunk <bunk@stusta.de>
[removed stale inode related fixes as stale_inode.c is gone]
Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu>
11 years agofs/unionfs/: Remove stale_inode.c
Josef 'Jeff' Sipek [Sun, 28 Jan 2007 20:52:43 +0000 (15:52 -0500)]
fs/unionfs/: Remove stale_inode.c

The stale inode operations were heavily based on bad inode operations. This
patch removes stale_inode.c and converts all users of stale_inode_ops to
bad_inode_ops as there seems to be no reason to return ESTALE instead of
EIO.

This is the more appropriate than porting the bad_inode.c fix (commit
be6aab0e9fa6d3c6d75aa1e38ac972d8b4ee82b8) to stale_inode.c.

Signed-off-by: Josef 'Jeff' Sipek <jsipek@cs.sunysb.edu>
11 years agoUnionfs: Extended Attributes support
Josef "Jeff" Sipek [Sun, 28 Jan 2007 20:11:23 +0000 (15:11 -0500)]
Unionfs: Extended Attributes support

Extended attribute support.

Signed-off-by: Josef "Jeff" Sipek <jsipek@cs.sunysb.edu>
Signed-off-by: David Quigley <dquigley@fsl.cs.sunysb.edu>
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
11 years agoUnionfs: Kconfig and Makefile
Josef "Jeff" Sipek [Sun, 28 Jan 2007 20:10:20 +0000 (15:10 -0500)]
Unionfs: Kconfig and Makefile

This patch contains the changes to fs Kconfig file, Makefiles, and Maintainers
file for Unionfs.

Signed-off-by: Josef "Jeff" Sipek <jsipek@cs.sunysb.edu>
Signed-off-by: David Quigley <dquigley@fsl.cs.sunysb.edu>
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
11 years agoUnionfs: Unlink
Josef "Jeff" Sipek [Sun, 21 Jan 2007 23:47:57 +0000 (18:47 -0500)]
Unionfs: Unlink

This patch provides unlink functionality for Unionfs.

Signed-off-by: Josef "Jeff" Sipek <jsipek@cs.sunysb.edu>
Signed-off-by: David Quigley <dquigley@fsl.cs.sunysb.edu>
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
11 years agoUnionfs: Include file
Josef "Jeff" Sipek [Sun, 21 Jan 2007 23:47:43 +0000 (18:47 -0500)]
Unionfs: Include file

Global include file - can be included from userspace by utilities.

Signed-off-by: Josef "Jeff" Sipek <jsipek@cs.sunysb.edu>
Signed-off-by: David Quigley <dquigley@fsl.cs.sunysb.edu>
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
11 years agoUnionfs: Internal include file
Josef "Jeff" Sipek [Sun, 28 Jan 2007 20:05:29 +0000 (15:05 -0500)]
Unionfs: Internal include file

This patch contains an internal Unionfs include file. The include file is
specific to kernel code only, and therefore is separate from
include/linux/unionfs.h.

Signed-off-by: Josef "Jeff" Sipek <jsipek@cs.sunysb.edu>
Signed-off-by: David Quigley <dquigley@fsl.cs.sunysb.edu>
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
11 years agoUnionfs: Helper macros/inlines
Josef "Jeff" Sipek [Sun, 21 Jan 2007 23:46:56 +0000 (18:46 -0500)]
Unionfs: Helper macros/inlines

This patch contains many macros and inline functions used thoughout Unionfs.

Signed-off-by: Josef "Jeff" Sipek <jsipek@cs.sunysb.edu>
Signed-off-by: David Quigley <dquigley@fsl.cs.sunysb.edu>
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
11 years agoUnionfs: Handling of stale inodes
Josef "Jeff" Sipek [Sun, 28 Jan 2007 20:05:07 +0000 (15:05 -0500)]
Unionfs: Handling of stale inodes

Provides nicer handling of stale inodes.

Signed-off-by: Josef "Jeff" Sipek <jsipek@cs.sunysb.edu>
Signed-off-by: David Quigley <dquigley@fsl.cs.sunysb.edu>
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
11 years agoUnionfs: Superblock operations
Josef "Jeff" Sipek [Sun, 21 Jan 2007 23:46:47 +0000 (18:46 -0500)]
Unionfs: Superblock operations

This patch contains the superblock operations for Unionfs.

Signed-off-by: Josef "Jeff" Sipek <jsipek@cs.sunysb.edu>
Signed-off-by: David Quigley <dquigley@fsl.cs.sunysb.edu>
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
11 years agoUnionfs: Miscellaneous helper functions
Josef "Jeff" Sipek [Sun, 21 Jan 2007 23:46:39 +0000 (18:46 -0500)]
Unionfs: Miscellaneous helper functions

This patch contains miscellaneous helper functions used thoughout Unionfs.

Signed-off-by: Josef "Jeff" Sipek <jsipek@cs.sunysb.edu>
Signed-off-by: David Quigley <dquigley@fsl.cs.sunysb.edu>
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
11 years agoUnionfs: Privileged operations workqueue
Josef "Jeff" Sipek [Sun, 21 Jan 2007 23:46:11 +0000 (18:46 -0500)]
Unionfs: Privileged operations workqueue

Workqueue & helper functions used to perform privileged operations on
behalf of the user process.

Signed-off-by: Josef "Jeff" Sipek <jsipek@cs.sunysb.edu>
Signed-off-by: David Quigley <dquigley@fsl.cs.sunysb.edu>
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
11 years agoUnionfs: Rename
Josef "Jeff" Sipek [Sun, 21 Jan 2007 23:45:56 +0000 (18:45 -0500)]
Unionfs: Rename

This patch provides rename functionality for Unionfs.

Signed-off-by: Josef "Jeff" Sipek <jsipek@cs.sunysb.edu>
Signed-off-by: David Quigley <dquigley@fsl.cs.sunysb.edu>
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
11 years agoUnionfs: Readdir state
Josef "Jeff" Sipek [Sun, 21 Jan 2007 23:45:48 +0000 (18:45 -0500)]
Unionfs: Readdir state

This file contains the routines for maintaining readdir state.

Signed-off-by: Josef "Jeff" Sipek <jsipek@cs.sunysb.edu>
Signed-off-by: David Quigley <dquigley@fsl.cs.sunysb.edu>
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
11 years agoUnionfs: Main module functions
Josef "Jeff" Sipek [Sun, 21 Jan 2007 23:45:28 +0000 (18:45 -0500)]
Unionfs: Main module functions

Module init & cleanup code, as well as interposition functions.

Signed-off-by: Josef "Jeff" Sipek <jsipek@cs.sunysb.edu>
Signed-off-by: David Quigley <dquigley@fsl.cs.sunysb.edu>
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
11 years agoUnionfs: Lookup helper functions
Josef "Jeff" Sipek [Sun, 21 Jan 2007 23:45:13 +0000 (18:45 -0500)]
Unionfs: Lookup helper functions

This patch provides helper functions for the lookup operations in Unionfs.

Signed-off-by: Josef "Jeff" Sipek <jsipek@cs.sunysb.edu>
Signed-off-by: David Quigley <dquigley@fsl.cs.sunysb.edu>
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
11 years agoUnionfs: Inode operations
Josef "Jeff" Sipek [Sun, 21 Jan 2007 23:44:54 +0000 (18:44 -0500)]
Unionfs: Inode operations

This patch provides the inode operations for Unionfs.

Signed-off-by: Josef "Jeff" Sipek <jsipek@cs.sunysb.edu>
Signed-off-by: David Quigley <dquigley@fsl.cs.sunysb.edu>
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
11 years agoUnionfs: Directory manipulation helper functions
Josef "Jeff" Sipek [Sun, 21 Jan 2007 23:44:35 +0000 (18:44 -0500)]
Unionfs: Directory manipulation helper functions

This patch contains directory manipulation helper functions.

Signed-off-by: Josef "Jeff" Sipek <jsipek@cs.sunysb.edu>
Signed-off-by: David Quigley <dquigley@fsl.cs.sunysb.edu>
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
11 years agoUnionfs: Directory file operations
Josef "Jeff" Sipek [Sun, 21 Jan 2007 23:44:24 +0000 (18:44 -0500)]
Unionfs: Directory file operations

This patch provides directory file operations.

Signed-off-by: Josef "Jeff" Sipek <jsipek@cs.sunysb.edu>
Signed-off-by: David Quigley <dquigley@fsl.cs.sunysb.edu>
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
11 years agoUnionfs: File operations
Josef "Jeff" Sipek [Sun, 21 Jan 2007 23:44:05 +0000 (18:44 -0500)]
Unionfs: File operations

This patch provides the file operations for Unionfs.

Signed-off-by: Josef "Jeff" Sipek <jsipek@cs.sunysb.edu>
Signed-off-by: David Quigley <dquigley@fsl.cs.sunysb.edu>
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
11 years agoUnionfs: Dentry operations
Josef "Jeff" Sipek [Sun, 28 Jan 2007 20:05:21 +0000 (15:05 -0500)]
Unionfs: Dentry operations

This patch contains the dentry operations for Unionfs.

Signed-off-by: Josef "Jeff" Sipek <jsipek@cs.sunysb.edu>
Signed-off-by: David Quigley <dquigley@fsl.cs.sunysb.edu>
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
11 years agoUnionfs: Copyup Functionality
Josef "Jeff" Sipek [Sun, 21 Jan 2007 23:42:32 +0000 (18:42 -0500)]
Unionfs: Copyup Functionality

This patch contains the functions used to perform copyup operations in unionfs.

Signed-off-by: Josef "Jeff" Sipek <jsipek@cs.sunysb.edu>
Signed-off-by: David Quigley <dquigley@fsl.cs.sunysb.edu>
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
11 years agoUnionfs: Common file operations
Josef "Jeff" Sipek [Sun, 21 Jan 2007 23:42:16 +0000 (18:42 -0500)]
Unionfs: Common file operations

This patch contains helper functions used through the rest of the code which
pertains to files.

Signed-off-by: Josef "Jeff" Sipek <jsipek@cs.sunysb.edu>
Signed-off-by: David Quigley <dquigley@fsl.cs.sunysb.edu>
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
11 years agoUnionfs: Branch management functionality
Josef "Jeff" Sipek [Sun, 21 Jan 2007 23:42:01 +0000 (18:42 -0500)]
Unionfs: Branch management functionality

This patch contains the ioctls to increase the union generation and to query
which branch a file exists on.

Signed-off-by: Josef "Jeff" Sipek <jsipek@cs.sunysb.edu>
Signed-off-by: David Quigley <dquigley@fsl.cs.sunysb.edu>
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
11 years agoUnionfs: Documentation
Josef "Jeff" Sipek [Tue, 20 Feb 2007 04:31:35 +0000 (23:31 -0500)]
Unionfs: Documentation

This patch contains documentation for Unionfs. You will find several files
outlining basic unification concepts and rename semantics.

Signed-off-by: Josef "Jeff" Sipek <jsipek@cs.sunysb.edu>
Signed-off-by: David Quigley <dquigley@fsl.cs.sunysb.edu>
Signed-off-by: Erez Zadok <ezk@cs.sunysb.edu>
11 years agoLinux 3.15.10 v3.15.10
Greg Kroah-Hartman [Thu, 14 Aug 2014 01:52:00 +0000 (09:52 +0800)]
Linux 3.15.10

11 years agoxfs: log vector rounding leaks log space
Dave Chinner [Mon, 19 May 2014 22:18:09 +0000 (08:18 +1000)]
xfs: log vector rounding leaks log space

commit 110dc24ad2ae4e9b94b08632fe1eb2fcdff83045 upstream.

The addition of direct formatting of log items into the CIL
linear buffer added alignment restrictions that the start of each
vector needed to be 64 bit aligned. Hence padding was added in
xlog_finish_iovec() to round up the vector length to ensure the next
vector started with the correct alignment.

This adds a small number of bytes to the size of
the linear buffer that is otherwise unused. The issue is that we
then use the linear buffer size to determine the log space used by
the log item, and this includes the unused space. Hence when we
account for space used by the log item, it's more than is actually
written into the iclogs, and hence we slowly leak this space.

This results on log hangs when reserving space, with threads getting
stuck with these stack traces:

Call Trace:
[<ffffffff81d15989>] schedule+0x29/0x70
[<ffffffff8150d3a2>] xlog_grant_head_wait+0xa2/0x1a0
[<ffffffff8150d55d>] xlog_grant_head_check+0xbd/0x140
[<ffffffff8150ee33>] xfs_log_reserve+0x103/0x220
[<ffffffff814b7f05>] xfs_trans_reserve+0x2f5/0x310
.....

The 4 bytes is significant. Brain Foster did all the hard work in
tracking down a reproducable leak to inode chunk allocation (it went
away with the ikeep mount option). His rough numbers were that
creating 50,000 inodes leaked 11 log blocks. This turns out to be
roughly 800 inode chunks or 1600 inode cluster buffers. That
works out at roughly 4 bytes per cluster buffer logged, and at that
I started looking for a 4 byte leak in the buffer logging code.

What I found was that a struct xfs_buf_log_format structure for an
inode cluster buffer is 28 bytes in length. This gets rounded up to
32 bytes, but the vector length remains 28 bytes. Hence the CIL
ticket reservation is decremented by 32 bytes (via lv->lv_buf_len)
for that vector rather than 28 bytes which are written into the log.

The fix for this problem is to separately track the bytes used by
the log vectors in the item and use that instead of the buffer
length when accounting for the log space that will be used by the
formatted log item.

Again, thanks to Brian Foster for doing all the hard work and long
hours to isolate this leak and make finding the bug relatively
simple.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
Cc: Bill <billstuff2001@sbcglobal.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoarch/sparc/math-emu/math_32.c: drop stray break operator
Andrey Utkin [Mon, 4 Aug 2014 20:47:41 +0000 (23:47 +0300)]
arch/sparc/math-emu/math_32.c: drop stray break operator

[ Upstream commit 093758e3daede29cb4ce6aedb111becf9d4bfc57 ]

This commit is a guesswork, but it seems to make sense to drop this
break, as otherwise the following line is never executed and becomes
dead code. And that following line actually saves the result of
local calculation by the pointer given in function argument. So the
proposed change makes sense if this code in the whole makes sense (but I
am unable to analyze it in the whole).

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=81641
Reported-by: David Binderman <dcb314@hotmail.com>
Signed-off-by: Andrey Utkin <andrey.krieger.utkin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agosparc64: ldc_connect() should not return EINVAL when handshake is in progress.
Sowmini Varadhan [Fri, 1 Aug 2014 13:50:40 +0000 (09:50 -0400)]
sparc64: ldc_connect() should not return EINVAL when handshake is in progress.

[ Upstream commit 4ec1b01029b4facb651b8ef70bc20a4be4cebc63 ]

The LDC handshake could have been asynchronously triggered
after ldc_bind() enables the ldc_rx() receive interrupt-handler
(and thus intercepts incoming control packets)
and before vio_port_up() calls ldc_connect(). If that is the case,
ldc_connect() should return 0 and let the state-machine
progress.

Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Karl Volz <karl.volz@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agosunsab: Fix detection of BREAK on sunsab serial console
Christopher Alexander Tobias Schulze [Sun, 3 Aug 2014 14:01:53 +0000 (16:01 +0200)]
sunsab: Fix detection of BREAK on sunsab serial console

[ Upstream commit fe418231b195c205701c0cc550a03f6c9758fd9e ]

Fix detection of BREAK on sunsab serial console: BREAK detection was only
performed when there were also serial characters received simultaneously.
To handle all BREAKs correctly, the check for BREAK and the corresponding
call to uart_handle_break() must also be done if count == 0, therefore
duplicate this code fragment and pull it out of the loop over the received
characters.

Patch applies to 3.16-rc6.

Signed-off-by: Christopher Alexander Tobias Schulze <cat.schulze@alice-dsl.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agobbc-i2c: Fix BBC I2C envctrl on SunBlade 2000
Christopher Alexander Tobias Schulze [Sun, 3 Aug 2014 13:44:52 +0000 (15:44 +0200)]
bbc-i2c: Fix BBC I2C envctrl on SunBlade 2000

[ Upstream commit 5cdceab3d5e02eb69ea0f5d8fa9181800baf6f77 ]

Fix regression in bbc i2c temperature and fan control on some Sun systems
that causes the driver to refuse to load due to the bbc_i2c_bussel resource not
being present on the (second) i2c bus where the temperature sensors and fan
control are located. (The check for the number of resources was removed when
the driver was ported to a pure OF driver in mid 2008.)

Signed-off-by: Christopher Alexander Tobias Schulze <cat.schulze@alice-dsl.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agosparc64: Guard against flushing openfirmware mappings.
David S. Miller [Tue, 5 Aug 2014 03:07:37 +0000 (20:07 -0700)]
sparc64: Guard against flushing openfirmware mappings.

[ Upstream commit 4ca9a23765da3260058db3431faf5b4efd8cf926 ]

Based almost entirely upon a patch by Christopher Alexander Tobias
Schulze.

In commit db64fe02258f1507e13fe5212a989922323685ce ("mm: rewrite vmap
layer") lazy VMAP tlb flushing was added to the vmalloc layer.  This
causes problems on sparc64.

Sparc64 has two VMAP mapped regions and they are not contiguous with
eachother.  First we have the malloc mapping area, then another
unrelated region, then the vmalloc region.

This "another unrelated region" is where the firmware is mapped.

If the lazy TLB flushing logic in the vmalloc code triggers after
we've had both a module unload and a vfree or similar, it will pass an
address range that goes from somewhere inside the malloc region to
somewhere inside the vmalloc region, and thus covering the
openfirmware area entirely.

The sparc64 kernel learns about openfirmware's dynamic mappings in
this region early in the boot, and then services TLB misses in this
area.  But openfirmware has some locked TLB entries which are not
mentioned in those dynamic mappings and we should thus not disturb
them.

These huge lazy TLB flush ranges causes those openfirmware locked TLB
entries to be removed, resulting in all kinds of problems including
hard hangs and crashes during reboot/reset.

Besides causing problems like this, such huge TLB flush ranges are
also incredibly inefficient.  A plea has been made with the author of
the VMAP lazy TLB flushing code, but for now we'll put a safety guard
into our flush_tlb_kernel_range() implementation.

Since the implementation has become non-trivial, stop defining it as a
macro and instead make it a function in a C source file.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agosparc64: Do not insert non-valid PTEs into the TSB hash table.
David S. Miller [Mon, 4 Aug 2014 23:34:01 +0000 (16:34 -0700)]
sparc64: Do not insert non-valid PTEs into the TSB hash table.

[ Upstream commit 18f38132528c3e603c66ea464727b29e9bbcb91b ]

The assumption was that update_mmu_cache() (and the equivalent for PMDs) would
only be called when the PTE being installed will be accessible by the user.

This is not true for code paths originating from remove_migration_pte().

There are dire consequences for placing a non-valid PTE into the TSB.  The TLB
miss frramework assumes thatwhen a TSB entry matches we can just load it into
the TLB and return from the TLB miss trap.

So if a non-valid PTE is in there, we will deadlock taking the TLB miss over
and over, never satisfying the miss.

Just exit early from update_mmu_cache() and friends in this situation.

Based upon a report and patch from Christopher Alexander Tobias Schulze.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agosparc: Hook up renameat2 syscall.
David S. Miller [Tue, 22 Jul 2014 05:27:56 +0000 (22:27 -0700)]
sparc: Hook up renameat2 syscall.

[ Upstream commit 26053926feb1c16ade9c30bc7443bf28d829d08e ]

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agosctp: fix possible seqlock seadlock in sctp_packet_transmit()
Eric Dumazet [Tue, 5 Aug 2014 14:49:52 +0000 (16:49 +0200)]
sctp: fix possible seqlock seadlock in sctp_packet_transmit()

[ Upstream commit 757efd32d5ce31f67193cc0e6a56e4dffcc42fb1 ]

Dave reported following splat, caused by improper use of
IP_INC_STATS_BH() in process context.

BUG: using __this_cpu_add() in preemptible [00000000] code: trinity-c117/14551
caller is __this_cpu_preempt_check+0x13/0x20
CPU: 3 PID: 14551 Comm: trinity-c117 Not tainted 3.16.0+ #33
 ffffffff9ec898f0 0000000047ea7e23 ffff88022d32f7f0 ffffffff9e7ee207
 0000000000000003 ffff88022d32f818 ffffffff9e397eaa ffff88023ee70b40
 ffff88022d32f970 ffff8801c026d580 ffff88022d32f828 ffffffff9e397ee3
Call Trace:
 [<ffffffff9e7ee207>] dump_stack+0x4e/0x7a
 [<ffffffff9e397eaa>] check_preemption_disabled+0xfa/0x100
 [<ffffffff9e397ee3>] __this_cpu_preempt_check+0x13/0x20
 [<ffffffffc0839872>] sctp_packet_transmit+0x692/0x710 [sctp]
 [<ffffffffc082a7f2>] sctp_outq_flush+0x2a2/0xc30 [sctp]
 [<ffffffff9e0d985c>] ? mark_held_locks+0x7c/0xb0
 [<ffffffff9e7f8c6d>] ? _raw_spin_unlock_irqrestore+0x5d/0x80
 [<ffffffffc082b99a>] sctp_outq_uncork+0x1a/0x20 [sctp]
 [<ffffffffc081e112>] sctp_cmd_interpreter.isra.23+0x1142/0x13f0 [sctp]
 [<ffffffffc081c86b>] sctp_do_sm+0xdb/0x330 [sctp]
 [<ffffffff9e0b8f1b>] ? preempt_count_sub+0xab/0x100
 [<ffffffffc083b350>] ? sctp_cname+0x70/0x70 [sctp]
 [<ffffffffc08389ca>] sctp_primitive_ASSOCIATE+0x3a/0x50 [sctp]
 [<ffffffffc083358f>] sctp_sendmsg+0x88f/0xe30 [sctp]
 [<ffffffff9e0d673a>] ? lock_release_holdtime.part.28+0x9a/0x160
 [<ffffffff9e0d62ce>] ? put_lock_stats.isra.27+0xe/0x30
 [<ffffffff9e73b624>] inet_sendmsg+0x104/0x220
 [<ffffffff9e73b525>] ? inet_sendmsg+0x5/0x220
 [<ffffffff9e68ac4e>] sock_sendmsg+0x9e/0xe0
 [<ffffffff9e1c0c09>] ? might_fault+0xb9/0xc0
 [<ffffffff9e1c0bae>] ? might_fault+0x5e/0xc0
 [<ffffffff9e68b234>] SYSC_sendto+0x124/0x1c0
 [<ffffffff9e0136b0>] ? syscall_trace_enter+0x250/0x330
 [<ffffffff9e68c3ce>] SyS_sendto+0xe/0x10
 [<ffffffff9e7f9be4>] tracesys+0xdd/0xe2

This is a followup of commits f1d8cba61c3c4b ("inet: fix possible
seqlock deadlocks") and 7f88c6b23afbd315 ("ipv6: fix possible seqlock
deadlock in ip6_finish_output2")

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Reported-by: Dave Jones <davej@redhat.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agobatman-adv: Fix out-of-order fragmentation support
Sven Eckelmann [Mon, 26 May 2014 15:21:39 +0000 (17:21 +0200)]
batman-adv: Fix out-of-order fragmentation support

[ Upstream commit d9124268d84a836f14a6ead54ff9d8eee4c43be5 ]

batadv_frag_insert_packet was unable to handle out-of-order packets because it
dropped them directly. This is caused by the way the fragmentation lists is
checked for the correct place to insert a fragmentation entry.

The fragmentation code keeps the fragments in lists. The fragmentation entries
are kept in descending order of sequence number. The list is traversed and each
entry is compared with the new fragment. If the current entry has a smaller
sequence number than the new fragment then the new one has to be inserted
before the current entry. This ensures that the list is still in descending
order.

An out-of-order packet with a smaller sequence number than all entries in the
list still has to be added to the end of the list. The used hlist has no
information about the last entry in the list inside hlist_head and thus the
last entry has to be calculated differently. Currently the code assumes that
the iterator variable of hlist_for_each_entry can be used for this purpose
after the hlist_for_each_entry finished. This is obviously wrong because the
iterator variable is always NULL when the list was completely traversed.

Instead the information about the last entry has to be stored in a different
variable.

This problem was introduced in 610bfc6bc99bc83680d190ebc69359a05fc7f605
("batman-adv: Receive fragmented packets and merge").

Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoiovec: make sure the caller actually wants anything in memcpy_fromiovecend
Sasha Levin [Fri, 1 Aug 2014 03:00:35 +0000 (23:00 -0400)]
iovec: make sure the caller actually wants anything in memcpy_fromiovecend

[ Upstream commit 06ebb06d49486676272a3c030bfeef4bd969a8e6 ]

Check for cases when the caller requests 0 bytes instead of running off
and dereferencing potentially invalid iovecs.

Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agonet: Correctly set segment mac_len in skb_segment().
Vlad Yasevich [Thu, 31 Jul 2014 14:33:06 +0000 (10:33 -0400)]
net: Correctly set segment mac_len in skb_segment().

[ Upstream commit fcdfe3a7fa4cb74391d42b6a26dc07c20dab1d82 ]

When performing segmentation, the mac_len value is copied right
out of the original skb.  However, this value is not always set correctly
(like when the packet is VLAN-tagged) and we'll end up copying a bad
value.

One way to demonstrate this is to configure a VM which tags
packets internally and turn off VLAN acceleration on the forwarding
bridge port.  The packets show up corrupt like this:
16:18:24.985548 52:54:00:ab:be:25 > 52:54:00:26:ce:a3, ethertype 802.1Q
(0x8100), length 1518: vlan 100, p 0, ethertype 0x05e0,
        0x0000:  8cdb 1c7c 8cdb 0064 4006 b59d 0a00 6402 ...|...d@.....d.
        0x0010:  0a00 6401 9e0d b441 0a5e 64ec 0330 14fa ..d....A.^d..0..
        0x0020:  29e3 01c9 f871 0000 0101 080a 000a e833)....q.........3
        0x0030:  000f 8c75 6e65 7470 6572 6600 6e65 7470 ...unetperf.netp
        0x0040:  6572 6600 6e65 7470 6572 6600 6e65 7470 erf.netperf.netp
        0x0050:  6572 6600 6e65 7470 6572 6600 6e65 7470 erf.netperf.netp
        0x0060:  6572 6600 6e65 7470 6572 6600 6e65 7470 erf.netperf.netp
        ...

This also leads to awful throughput as GSO packets are dropped and
cause retransmissions.

The solution is to set the mac_len using the values already available
in then new skb.  We've already adjusted all of the header offset, so we
might as well correctly figure out the mac_len using skb_reset_mac_len().
After this change, packets are segmented correctly and performance
is restored.

CC: Eric Dumazet <edumazet@google.com>
Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agomacvlan: Initialize vlan_features to turn on offload support.
Vlad Yasevich [Thu, 31 Jul 2014 14:30:25 +0000 (10:30 -0400)]
macvlan: Initialize vlan_features to turn on offload support.

[ Upstream commit 081e83a78db9b0ae1f5eabc2dedecc865f509b98 ]

Macvlan devices do not initialize vlan_features.  As a result,
any vlan devices configured on top of macvlans perform very poorly.
Initialize vlan_features based on the vlan features of the lower-level
device.

Signed-off-by: Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agonet: sctp: inherit auth_capable on INIT collisions
Daniel Borkmann [Tue, 22 Jul 2014 13:22:45 +0000 (15:22 +0200)]
net: sctp: inherit auth_capable on INIT collisions

[ Upstream commit 1be9a950c646c9092fb3618197f7b6bfb50e82aa ]

Jason reported an oops caused by SCTP on his ARM machine with
SCTP authentication enabled:

Internal error: Oops: 17 [#1] ARM
CPU: 0 PID: 104 Comm: sctp-test Not tainted 3.13.0-68744-g3632f30c9b20-dirty #1
task: c6eefa40 ti: c6f52000 task.ti: c6f52000
PC is at sctp_auth_calculate_hmac+0xc4/0x10c
LR is at sg_init_table+0x20/0x38
pc : [<c024bb80>]    lr : [<c00f32dc>]    psr: 40000013
sp : c6f538e8  ip : 00000000  fp : c6f53924
r10: c6f50d80  r9 : 00000000  r8 : 00010000
r7 : 00000000  r6 : c7be4000  r5 : 00000000  r4 : c6f56254
r3 : c00c8170  r2 : 00000001  r1 : 00000008  r0 : c6f1e660
Flags: nZcv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
Control: 0005397f  Table: 06f28000  DAC: 00000015
Process sctp-test (pid: 104, stack limit = 0xc6f521c0)
Stack: (0xc6f538e8 to 0xc6f54000)
[...]
Backtrace:
[<c024babc>] (sctp_auth_calculate_hmac+0x0/0x10c) from [<c0249af8>] (sctp_packet_transmit+0x33c/0x5c8)
[<c02497bc>] (sctp_packet_transmit+0x0/0x5c8) from [<c023e96c>] (sctp_outq_flush+0x7fc/0x844)
[<c023e170>] (sctp_outq_flush+0x0/0x844) from [<c023ef78>] (sctp_outq_uncork+0x24/0x28)
[<c023ef54>] (sctp_outq_uncork+0x0/0x28) from [<c0234364>] (sctp_side_effects+0x1134/0x1220)
[<c0233230>] (sctp_side_effects+0x0/0x1220) from [<c02330b0>] (sctp_do_sm+0xac/0xd4)
[<c0233004>] (sctp_do_sm+0x0/0xd4) from [<c023675c>] (sctp_assoc_bh_rcv+0x118/0x160)
[<c0236644>] (sctp_assoc_bh_rcv+0x0/0x160) from [<c023d5bc>] (sctp_inq_push+0x6c/0x74)
[<c023d550>] (sctp_inq_push+0x0/0x74) from [<c024a6b0>] (sctp_rcv+0x7d8/0x888)

While we already had various kind of bugs in that area
ec0223ec48a9 ("net: sctp: fix sctp_sf_do_5_1D_ce to verify if
we/peer is AUTH capable") and b14878ccb7fa ("net: sctp: cache
auth_enable per endpoint"), this one is a bit of a different
kind.

Giving a bit more background on why SCTP authentication is
needed can be found in RFC4895:

  SCTP uses 32-bit verification tags to protect itself against
  blind attackers. These values are not changed during the
  lifetime of an SCTP association.

  Looking at new SCTP extensions, there is the need to have a
  method of proving that an SCTP chunk(s) was really sent by
  the original peer that started the association and not by a
  malicious attacker.

To cause this bug, we're triggering an INIT collision between
peers; normal SCTP handshake where both sides intent to
authenticate packets contains RANDOM; CHUNKS; HMAC-ALGO
parameters that are being negotiated among peers:

  ---------- INIT[RANDOM; CHUNKS; HMAC-ALGO] ---------->
  <------- INIT-ACK[RANDOM; CHUNKS; HMAC-ALGO] ---------
  -------------------- COOKIE-ECHO -------------------->
  <-------------------- COOKIE-ACK ---------------------

RFC4895 says that each endpoint therefore knows its own random
number and the peer's random number *after* the association
has been established. The local and peer's random number along
with the shared key are then part of the secret used for
calculating the HMAC in the AUTH chunk.

Now, in our scenario, we have 2 threads with 1 non-blocking
SEQ_PACKET socket each, setting up common shared SCTP_AUTH_KEY
and SCTP_AUTH_ACTIVE_KEY properly, and each of them calling
sctp_bindx(3), listen(2) and connect(2) against each other,
thus the handshake looks similar to this, e.g.:

  ---------- INIT[RANDOM; CHUNKS; HMAC-ALGO] ---------->
  <------- INIT-ACK[RANDOM; CHUNKS; HMAC-ALGO] ---------
  <--------- INIT[RANDOM; CHUNKS; HMAC-ALGO] -----------
  -------- INIT-ACK[RANDOM; CHUNKS; HMAC-ALGO] -------->
  ...

Since such collisions can also happen with verification tags,
the RFC4895 for AUTH rather vaguely says under section 6.1:

  In case of INIT collision, the rules governing the handling
  of this Random Number follow the same pattern as those for
  the Verification Tag, as explained in Section 5.2.4 of
  RFC 2960 [5]. Therefore, each endpoint knows its own Random
  Number and the peer's Random Number after the association
  has been established.

In RFC2960, section 5.2.4, we're eventually hitting Action B:

  B) In this case, both sides may be attempting to start an
     association at about the same time but the peer endpoint
     started its INIT after responding to the local endpoint's
     INIT. Thus it may have picked a new Verification Tag not
     being aware of the previous Tag it had sent this endpoint.
     The endpoint should stay in or enter the ESTABLISHED
     state but it MUST update its peer's Verification Tag from
     the State Cookie, stop any init or cookie timers that may
     running and send a COOKIE ACK.

In other words, the handling of the Random parameter is the
same as behavior for the Verification Tag as described in
Action B of section 5.2.4.

Looking at the code, we exactly hit the sctp_sf_do_dupcook_b()
case which triggers an SCTP_CMD_UPDATE_ASSOC command to the
side effect interpreter, and in fact it properly copies over
peer_{random, hmacs, chunks} parameters from the newly created
association to update the existing one.

Also, the old asoc_shared_key is being released and based on
the new params, sctp_auth_asoc_init_active_key() updated.
However, the issue observed in this case is that the previous
asoc->peer.auth_capable was 0, and has *not* been updated, so
that instead of creating a new secret, we're doing an early
return from the function sctp_auth_asoc_init_active_key()
leaving asoc->asoc_shared_key as NULL. However, we now have to
authenticate chunks from the updated chunk list (e.g. COOKIE-ACK).

That in fact causes the server side when responding with ...

  <------------------ AUTH; COOKIE-ACK -----------------

... to trigger a NULL pointer dereference, since in
sctp_packet_transmit(), it discovers that an AUTH chunk is
being queued for xmit, and thus it calls sctp_auth_calculate_hmac().

Since the asoc->active_key_id is still inherited from the
endpoint, and the same as encoded into the chunk, it uses
asoc->asoc_shared_key, which is still NULL, as an asoc_key
and dereferences it in ...

  crypto_hash_setkey(desc.tfm, &asoc_key->data[0], asoc_key->len)

... causing an oops. All this happens because sctp_make_cookie_ack()
called with the *new* association has the peer.auth_capable=1
and therefore marks the chunk with auth=1 after checking
sctp_auth_send_cid(), but it is *actually* sent later on over
the then *updated* association's transport that didn't initialize
its shared key due to peer.auth_capable=0. Since control chunks
in that case are not sent by the temporary association which
are scheduled for deletion, they are issued for xmit via
SCTP_CMD_REPLY in the interpreter with the context of the
*updated* association. peer.auth_capable was 0 in the updated
association (which went from COOKIE_WAIT into ESTABLISHED state),
since all previous processing that performed sctp_process_init()
was being done on temporary associations, that we eventually
throw away each time.

The correct fix is to update to the new peer.auth_capable
value as well in the collision case via sctp_assoc_update(),
so that in case the collision migrated from 0 -> 1,
sctp_auth_asoc_init_active_key() can properly recalculate
the secret. This therefore fixes the observed server panic.

Fixes: 730fc3d05cd4 ("[SCTP]: Implete SCTP-AUTH parameter processing")
Reported-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Tested-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agobna: fix performance regression
Ivan Vecera [Tue, 29 Jul 2014 14:29:30 +0000 (16:29 +0200)]
bna: fix performance regression

[ Upstream commit c36c9d50cc6af5c5bfcc195f21b73f55520c15f9 ]

The recent commit "e29aa33 bna: Enable Multi Buffer RX" is causing
a performance regression. It does not properly update 'cmpl' pointer
at the end of the loop in NAPI handler bnad_cq_process(). The result is
only one packet / per NAPI-schedule is processed.

Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agotcp: Fix integer-overflow in TCP vegas
Christoph Paasch [Tue, 29 Jul 2014 11:40:57 +0000 (13:40 +0200)]
tcp: Fix integer-overflow in TCP vegas

[ Upstream commit 1f74e613ded11517db90b2bd57e9464d9e0fb161 ]

In vegas we do a multiplication of the cwnd and the rtt. This
may overflow and thus their result is stored in a u64. However, we first
need to cast the cwnd so that actually 64-bit arithmetic is done.

Then, we need to do do_div to allow this to be used on 32-bit arches.

Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David Laight <David.Laight@ACULAB.COM>
Cc: Doug Leith <doug.leith@nuim.ie>
Fixes: 8d3a564da34e (tcp: tcp_vegas cong avoid fix)
Signed-off-by: Christoph Paasch <christoph.paasch@uclouvain.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoip: make IP identifiers less predictable
Eric Dumazet [Sat, 26 Jul 2014 06:58:10 +0000 (08:58 +0200)]
ip: make IP identifiers less predictable

[ Upstream commit 04ca6973f7c1a0d8537f2d9906a0cf8e69886d75 ]

In "Counting Packets Sent Between Arbitrary Internet Hosts", Jeffrey and
Jedidiah describe ways exploiting linux IP identifier generation to
infer whether two machines are exchanging packets.

With commit 73f156a6e8c1 ("inetpeer: get rid of ip_id_count"), we
changed IP id generation, but this does not really prevent this
side-channel technique.

This patch adds a random amount of perturbation so that IP identifiers
for a given destination [1] are no longer monotonically increasing after
an idle period.

Note that prandom_u32_max(1) returns 0, so if generator is used at most
once per jiffy, this patch inserts no hole in the ID suite and do not
increase collision probability.

This is jiffies based, so in the worst case (HZ=1000), the id can
rollover after ~65 seconds of idle time, which should be fine.

We also change the hash used in __ip_select_ident() to not only hash
on daddr, but also saddr and protocol, so that ICMP probes can not be
used to infer information for other protocols.

For IPv6, adds saddr into the hash as well, but not nexthdr.

If I ping the patched target, we can see ID are now hard to predict.

21:57:11.008086 IP (...)
    A > target: ICMP echo request, seq 1, length 64
21:57:11.010752 IP (... id 2081 ...)
    target > A: ICMP echo reply, seq 1, length 64

21:57:12.013133 IP (...)
    A > target: ICMP echo request, seq 2, length 64
21:57:12.015737 IP (... id 3039 ...)
    target > A: ICMP echo reply, seq 2, length 64

21:57:13.016580 IP (...)
    A > target: ICMP echo request, seq 3, length 64
21:57:13.019251 IP (... id 3437 ...)
    target > A: ICMP echo reply, seq 3, length 64

[1] TCP sessions uses a per flow ID generator not changed by this patch.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Jeffrey Knockel <jeffk@cs.unm.edu>
Reported-by: Jedidiah R. Crandall <crandall@cs.unm.edu>
Cc: Willy Tarreau <w@1wt.eu>
Cc: Hannes Frederic Sowa <hannes@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoinetpeer: get rid of ip_id_count
Eric Dumazet [Mon, 2 Jun 2014 12:26:03 +0000 (05:26 -0700)]
inetpeer: get rid of ip_id_count

[ Upstream commit 73f156a6e8c1074ac6327e0abd1169e95eb66463 ]

Ideally, we would need to generate IP ID using a per destination IP
generator.

linux kernels used inet_peer cache for this purpose, but this had a huge
cost on servers disabling MTU discovery.

1) each inet_peer struct consumes 192 bytes

2) inetpeer cache uses a binary tree of inet_peer structs,
   with a nominal size of ~66000 elements under load.

3) lookups in this tree are hitting a lot of cache lines, as tree depth
   is about 20.

4) If server deals with many tcp flows, we have a high probability of
   not finding the inet_peer, allocating a fresh one, inserting it in
   the tree with same initial ip_id_count, (cf secure_ip_id())

5) We garbage collect inet_peer aggressively.

IP ID generation do not have to be 'perfect'

Goal is trying to avoid duplicates in a short period of time,
so that reassembly units have a chance to complete reassembly of
fragments belonging to one message before receiving other fragments
with a recycled ID.

We simply use an array of generators, and a Jenkin hash using the dst IP
as a key.

ipv6_select_ident() is put back into net/ipv6/ip6_output.c where it
belongs (it is only used from this file)

secure_ip_id() and secure_ipv6_id() no longer are needed.

Rename ip_select_ident_more() to ip_select_ident_segs() to avoid
unnecessary decrement/increment of the number of segments.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agotcp: Fix integer-overflows in TCP veno
Christoph Paasch [Tue, 29 Jul 2014 10:07:27 +0000 (12:07 +0200)]
tcp: Fix integer-overflows in TCP veno

[ Upstream commit 45a07695bc64b3ab5d6d2215f9677e5b8c05a7d0 ]

In veno we do a multiplication of the cwnd and the rtt. This
may overflow and thus their result is stored in a u64. However, we first
need to cast the cwnd so that actually 64-bit arithmetic is done.

A first attempt at fixing 76f1017757aa0 ([TCP]: TCP Veno congestion
control) was made by 159131149c2 (tcp: Overflow bug in Vegas), but it
failed to add the required cast in tcp_veno_cong_avoid().

Fixes: 76f1017757aa0 ([TCP]: TCP Veno congestion control)
Signed-off-by: Christoph Paasch <christoph.paasch@uclouvain.be>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoip_tunnel(ipv4): fix tunnels with "local any remote $remote_ip"
Dmitry Popov [Mon, 28 Jul 2014 23:07:52 +0000 (03:07 +0400)]
ip_tunnel(ipv4): fix tunnels with "local any remote $remote_ip"

[ Upstream commit 95cb5745983c222867cc9ac593aebb2ad67d72c0 ]

Ipv4 tunnels created with "local any remote $ip" didn't work properly since
7d442fab0 (ipv4: Cache dst in tunnels). 99% of packets sent via those tunnels
had src addr = 0.0.0.0. That was because only dst_entry was cached, although
fl4.saddr has to be cached too. Every time ip_tunnel_xmit used cached dst_entry
(tunnel_rtable_get returned non-NULL), fl4.saddr was initialized with
tnl_params->saddr (= 0 in our case), and wasn't changed until iptunnel_xmit().

This patch adds saddr to ip_tunnel->dst_cache, fixing this issue.

Reported-by: Sergey Popov <pinkbyte@gentoo.org>
Signed-off-by: Dmitry Popov <ixaphire@qrator.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agonet: phy: re-apply PHY fixups during phy_register_device
Florian Fainelli [Mon, 28 Jul 2014 23:28:07 +0000 (16:28 -0700)]
net: phy: re-apply PHY fixups during phy_register_device

[ Upstream commit d92f5dec6325079c550889883af51db1b77d5623 ]

Commit 87aa9f9c61ad ("net: phy: consolidate PHY reset in phy_init_hw()")
moved the call to phy_scan_fixups() in phy_init_hw() after a software
reset is performed.

By the time phy_init_hw() is called in phy_device_register(), no driver
has been bound to this PHY yet, so all the checks in phy_init_hw()
against the PHY driver and the PHY driver's config_init function will
return 0. We will therefore never call phy_scan_fixups() as we should.

Fix this by calling phy_scan_fixups() and check for its return value to
restore the intended functionality.

This broke PHY drivers which do register an early PHY fixup callback to
intercept the PHY probing and do things like changing the 32-bits unique
PHY identifier when a pseudo-PHY address has been used, as well as
board-specific PHY fixups that need to be applied during driver probe
time.

Reported-by: Hauke Merthens <hauke-m@hauke-m.de>
Reported-by: Jonas Gorski <jogo@openwrt.org>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agonet: sendmsg: fix NULL pointer dereference
Andrey Ryabinin [Sat, 26 Jul 2014 17:26:58 +0000 (21:26 +0400)]
net: sendmsg: fix NULL pointer dereference

[ Upstream commit 40eea803c6b2cfaab092f053248cbeab3f368412 ]

Sasha's report:
> While fuzzing with trinity inside a KVM tools guest running the latest -next
> kernel with the KASAN patchset, I've stumbled on the following spew:
>
> [ 4448.949424] ==================================================================
> [ 4448.951737] AddressSanitizer: user-memory-access on address 0
> [ 4448.952988] Read of size 2 by thread T19638:
> [ 4448.954510] CPU: 28 PID: 19638 Comm: trinity-c76 Not tainted 3.16.0-rc4-next-20140711-sasha-00046-g07d3099-dirty #813
> [ 4448.956823]  ffff88046d86ca40 0000000000000000 ffff880082f37e78 ffff880082f37a40
> [ 4448.958233]  ffffffffb6e47068 ffff880082f37a68 ffff880082f37a58 ffffffffb242708d
> [ 4448.959552]  0000000000000000 ffff880082f37a88 ffffffffb24255b1 0000000000000000
> [ 4448.961266] Call Trace:
> [ 4448.963158] dump_stack (lib/dump_stack.c:52)
> [ 4448.964244] kasan_report_user_access (mm/kasan/report.c:184)
> [ 4448.965507] __asan_load2 (mm/kasan/kasan.c:352)
> [ 4448.966482] ? netlink_sendmsg (net/netlink/af_netlink.c:2339)
> [ 4448.967541] netlink_sendmsg (net/netlink/af_netlink.c:2339)
> [ 4448.968537] ? get_parent_ip (kernel/sched/core.c:2555)
> [ 4448.970103] sock_sendmsg (net/socket.c:654)
> [ 4448.971584] ? might_fault (mm/memory.c:3741)
> [ 4448.972526] ? might_fault (./arch/x86/include/asm/current.h:14 mm/memory.c:3740)
> [ 4448.973596] ? verify_iovec (net/core/iovec.c:64)
> [ 4448.974522] ___sys_sendmsg (net/socket.c:2096)
> [ 4448.975797] ? put_lock_stats.isra.13 (./arch/x86/include/asm/preempt.h:98 kernel/locking/lockdep.c:254)
> [ 4448.977030] ? lock_release_holdtime (kernel/locking/lockdep.c:273)
> [ 4448.978197] ? lock_release_non_nested (kernel/locking/lockdep.c:3434 (discriminator 1))
> [ 4448.979346] ? check_chain_key (kernel/locking/lockdep.c:2188)
> [ 4448.980535] __sys_sendmmsg (net/socket.c:2181)
> [ 4448.981592] ? trace_hardirqs_on_caller (kernel/locking/lockdep.c:2600)
> [ 4448.982773] ? trace_hardirqs_on (kernel/locking/lockdep.c:2607)
> [ 4448.984458] ? syscall_trace_enter (arch/x86/kernel/ptrace.c:1500 (discriminator 2))
> [ 4448.985621] ? trace_hardirqs_on_caller (kernel/locking/lockdep.c:2600)
> [ 4448.986754] SyS_sendmmsg (net/socket.c:2201)
> [ 4448.987708] tracesys (arch/x86/kernel/entry_64.S:542)
> [ 4448.988929] ==================================================================

This reports means that we've come to netlink_sendmsg() with msg->msg_name == NULL and msg->msg_namelen > 0.

After this report there was no usual "Unable to handle kernel NULL pointer dereference"
and this gave me a clue that address 0 is mapped and contains valid socket address structure in it.

This bug was introduced in f3d3342602f8bcbf37d7c46641cb9bca7618eb1c
(net: rework recvmsg handler msg_name and msg_namelen logic).
Commit message states that:
"Set msg->msg_name = NULL if user specified a NULL in msg_name but had a
 non-null msg_namelen in verify_iovec/verify_compat_iovec. This doesn't
 affect sendto as it would bail out earlier while trying to copy-in the
 address."
But in fact this affects sendto when address 0 is mapped and contains
socket address structure in it. In such case copy-in address will succeed,
verify_iovec() function will successfully exit with msg->msg_namelen > 0
and msg->msg_name == NULL.

This patch fixes it by setting msg_namelen to 0 if msg_name == NULL.

Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Eric Dumazet <edumazet@google.com>
Cc: <stable@vger.kernel.org>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Andrey Ryabinin <a.ryabinin@samsung.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agobnx2x: fix crash during TSO tunneling
Dmitry Kravkov [Thu, 24 Jul 2014 15:54:47 +0000 (18:54 +0300)]
bnx2x: fix crash during TSO tunneling

[ Upstream commit fe26566d8a05151ba1dce75081f6270f73ec4ae1 ]

When TSO packet is transmitted additional BD w/o mapping is used
to describe the packed. The BD needs special handling in tx
completion.

kernel: Call Trace:
kernel: <IRQ>  [<ffffffff815e19ba>] dump_stack+0x19/0x1b
kernel: [<ffffffff8105dee1>] warn_slowpath_common+0x61/0x80
kernel: [<ffffffff8105df5c>] warn_slowpath_fmt+0x5c/0x80
kernel: [<ffffffff814a8c0d>] ? find_iova+0x4d/0x90
kernel: [<ffffffff814ab0e2>] intel_unmap_page.part.36+0x142/0x160
kernel: [<ffffffff814ad0e6>] intel_unmap_page+0x26/0x30
kernel: [<ffffffffa01f55d7>] bnx2x_free_tx_pkt+0x157/0x2b0 [bnx2x]
kernel: [<ffffffffa01f8dac>] bnx2x_tx_int+0xac/0x220 [bnx2x]
kernel: [<ffffffff8101a0d9>] ? read_tsc+0x9/0x20
kernel: [<ffffffffa01f8fdb>] bnx2x_poll+0xbb/0x3c0 [bnx2x]
kernel: [<ffffffff814d041a>] net_rx_action+0x15a/0x250
kernel: [<ffffffff81067047>] __do_softirq+0xf7/0x290
kernel: [<ffffffff815f3a5c>] call_softirq+0x1c/0x30
kernel: [<ffffffff81014d25>] do_softirq+0x55/0x90
kernel: [<ffffffff810673e5>] irq_exit+0x115/0x120
kernel: [<ffffffff815f4358>] do_IRQ+0x58/0xf0
kernel: [<ffffffff815e94ad>] common_interrupt+0x6d/0x6d
kernel: <EOI>  [<ffffffff810bbff7>] ? clockevents_notify+0x127/0x140
kernel: [<ffffffff814834df>] ? cpuidle_enter_state+0x4f/0xc0
kernel: [<ffffffff81483615>] cpuidle_idle_call+0xc5/0x200
kernel: [<ffffffff8101bc7e>] arch_cpu_idle+0xe/0x30
kernel: [<ffffffff810b4725>] cpu_startup_entry+0xf5/0x290
kernel: [<ffffffff815cfee1>] start_secondary+0x265/0x27b
kernel: ---[ end trace 11aa7726f18d7e80 ]---

Fixes: a848ade408b ("bnx2x: add CSUM and TSO support for encapsulation protocols")
Reported-by: Yulong Pei <ypei@redhat.com>
Cc: Michal Schmidt <mschmidt@redhat.com>
Signed-off-by: Dmitry Kravkov <Dmitry.Kravkov@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoxfrm: Fix installation of AH IPsec SAs
Tobias Brunner [Thu, 26 Jun 2014 13:12:45 +0000 (15:12 +0200)]
xfrm: Fix installation of AH IPsec SAs

[ Upstream commit a0e5ef53aac8e5049f9344857d8ec5237d31e58b ]

The SPI check introduced in ea9884b3acf3311c8a11db67bfab21773f6f82ba
was intended for IPComp SAs but actually prevented AH SAs from getting
installed (depending on the SPI).

Fixes: ea9884b3acf3 ("xfrm: check user specified spi for IPComp")
Cc: Fan Du <fan.du@windriver.com>
Signed-off-by: Tobias Brunner <tobias@strongswan.org>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoxfrm: Fix refcount imbalance in xfrm_lookup
Steffen Klassert [Wed, 18 Jun 2014 10:34:21 +0000 (12:34 +0200)]
xfrm: Fix refcount imbalance in xfrm_lookup

[ Upstream commit b7eea4545ea775df957460f58eb56085a8892856 ]

xfrm_lookup must return a dst_entry with a refcount for the caller.
Git commit 1a1ccc96abb ("xfrm: Remove caching of xfrm_policy_sk_bundles")
removed this refcount for the socket policy case accidentally.
This patch restores it and sets DST_NOCACHE flag to make sure
that the dst_entry is freed when the refcount becomes null.

Fixes: 1a1ccc96abb ("xfrm: Remove caching of xfrm_policy_sk_bundles")
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agonet: bcmgenet: correctly pad short packets
Florian Fainelli [Tue, 22 Jul 2014 18:01:52 +0000 (11:01 -0700)]
net: bcmgenet: correctly pad short packets

[ Upstream commit 474ea9cafc459976827a477f2c30eaf6313cb7c1 ]

Packets shorter than ETH_ZLEN were not padded with zeroes, hence leaking
potentially sensitive information. This bug has been present since the
driver got accepted in commit 1c1008c793fa46703a2fee469f4235e1c7984333
("net: bcmgenet: add main driver file").

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agoLinux 3.15.9 v3.15.9
Greg Kroah-Hartman [Thu, 7 Aug 2014 23:53:58 +0000 (16:53 -0700)]
Linux 3.15.9

11 years agox86/espfix/xen: Fix allocation of pages for paravirt page tables
Boris Ostrovsky [Wed, 9 Jul 2014 17:18:18 +0000 (13:18 -0400)]
x86/espfix/xen: Fix allocation of pages for paravirt page tables

commit 8762e5092828c4dc0f49da5a47a644c670df77f3 upstream.

init_espfix_ap() is currently off by one level when informing hypervisor
that allocated pages will be used for ministacks' page tables.

The most immediate effect of this on a PV guest is that if
'stack_page = __get_free_page()' returns a non-zeroed-out page the hypervisor
will refuse to use it for a page table (which it shouldn't be anyway). This will
result in warnings by both Xen and Linux.

More importantly, a subsequent write to that page (again, by a PV guest) is
likely to result in fatal page fault.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: http://lkml.kernel.org/r/1404926298-5565-1-git-send-email-boris.ostrovsky@oracle.com
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
11 years agolib/btree.c: fix leak of whole btree nodes
Minfei Huang [Wed, 4 Jun 2014 23:11:53 +0000 (16:11 -0700)]
lib/btree.c: fix leak of whole btree nodes

commit c75b53af2f0043aff500af0a6f878497bef41bca upstream.

I use btree from 3.14-rc2 in my own module.  When the btree module is
removed, a warning arises:

 kmem_cache_destroy btree_node: Slab cache still has objects
 CPU: 13 PID: 9150 Comm: rmmod Tainted: GF          O 3.14.0-rc2 #1
 Hardware name: Inspur NF5270M3/NF5270M3, BIOS CHEETAH_2.1.3 09/10/2013
 Call Trace:
   dump_stack+0x49/0x5d
   kmem_cache_destroy+0xcf/0xe0
   btree_module_exit+0x10/0x12 [btree]
   SyS_delete_module+0x198/0x1f0
   system_call_fastpath+0x16/0x1b

The cause is that it doesn't release the last btree node, when height = 1
and fill = 1.

[akpm@linux-foundation.org: remove unneeded test of NULL]
Signed-off-by: Minfei Huang <huangminfei@ucloud.cn>
Cc: Joern Engel <joern@logfs.org>
Cc: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>