exec: make argv/envp memory visible to oom-killer
author Oleg Nesterov <oleg@redhat.com>
Tue, 14 Dec 2010 22:40:11 +0000 (23:40 +0100)
committer Andi Kleen <ak@linux.intel.com>
Tue, 14 Dec 2010 22:40:11 +0000 (23:40 +0100)
commit c9b14e575935b3540e08f4467cf6588be2669710
tree 8eb17d161bcbf305fbdbd9b0ca1c8e6974f39e71
parent f79f5c7c8c2fd31a9e00586aa5e5ed443af88cef

commit 3c77f845722158206a7209c45ccddc264d19319c upstream.

Brad Spengler published a local memory-allocation DoS that
evades the OOM-killer (though not the virtual memory RLIMIT):
http://www.grsecurity.net/~spender/64bit_dos.c

execve()->copy_strings() can allocate a lot of memory, but
this is not visible to the oom-killer: nobody can see the nascent
bprm->mm and take it into account.

With this patch get_arg_page() increments current's MM_ANONPAGES
counter every time we allocate a new page for argv/envp. When
do_execve() succeeds or fails, we change this counter back.
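
The accounting described above can be sketched as a small userspace
model. This is not the kernel code: struct mm_model, struct bprm_model,
the vma_pages field and acct_arg_size_model() are hypothetical stand-ins
chosen for illustration, and locking is omitted. It only shows the idea
of charging the delta between the pages now in use and the pages already
charged, so that passing 0 at the end undoes the whole charge.

```c
#include <assert.h>

/* Hypothetical userspace stand-in for the caller's mm. */
struct mm_model {
	long anon_pages;		/* models the MM_ANONPAGES counter */
};

/* Hypothetical stand-in for the per-exec state. */
struct bprm_model {
	unsigned long vma_pages;	/* arg pages charged so far */
};

/*
 * Charge (or uncharge) the difference between the requested page
 * count and what has already been charged for this exec.
 */
static void acct_arg_size_model(struct mm_model *mm,
				struct bprm_model *bprm,
				unsigned long pages)
{
	long diff = (long)pages - (long)bprm->vma_pages;

	if (!mm || !diff)
		return;

	bprm->vma_pages = pages;
	mm->anon_pages += diff;

	/* The model never uncharges more than it charged. */
	assert(mm->anon_pages >= 0);
}
```

In this model, each new argv/envp page would call
acct_arg_size_model() with the updated page count, and the exec
success and failure paths would both call it with 0.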

Technically this is not 100% correct: we can't know if the new
page is swapped out and turn MM_ANONPAGES into MM_SWAPENTS, but
I don't think this really matters, and everything becomes correct
once exec changes ->mm or fails.

Compared to upstream:

Before the 2.6.36 kernel, the oom-killer's badness() takes
mm->total_vm into account and nothing else, so
acct_arg_size() has to adjust this counter too.
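
Why the backport must also touch total_vm can be illustrated with a
deliberately reduced model. All names here (struct mm_counters,
badness_model(), charge_arg_pages_model()) are hypothetical, and the
score function keeps only the single memory input named above; it is
not the real badness() heuristic.

```c
/* Hypothetical model of the two counters involved. */
struct mm_counters {
	long anon_pages;	/* models MM_ANONPAGES */
	long total_vm;		/* models mm->total_vm */
};

/* Reduced score: the only memory input is total_vm. */
static long badness_model(const struct mm_counters *mm)
{
	return mm->total_vm;
}

/*
 * Adjust both counters by the same delta, so that the charge is
 * visible to a score that reads total_vm alone.
 */
static void charge_arg_pages_model(struct mm_counters *mm, long diff)
{
	mm->anon_pages += diff;
	mm->total_vm += diff;
}
```

If only anon_pages were adjusted in this model, badness_model() would
return the same value before and after the charge.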

Reported-by: Brad Spengler <spender@grsecurity.net>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Reviewed-and-discussed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
fs/exec.c
include/linux/binfmts.h