The problem exists in kernel versions 2.6.17 through 2.6.24.1.
First of all, let us analyze what causes the overflow:
vmsplice() links a file descriptor (which must be a pipe) with a region of user memory. The feature is implemented by the do_vmsplice() function in fs/splice.c, which defines two arrays:
struct page *pages[PIPE_BUFFERS];
struct partial_page partial[PIPE_BUFFERS];
In the affected versions PIPE_BUFFERS is defined as 16. These two arrays are passed to the get_iovec_page_array() function.
Taking the 2.6.22.14 source code as an example, see the get_iovec_page_array() function, which begins at line 1565 of fs/splice.c.
In that function we see:
error = get_user(len, &iov->iov_len);
if (unlikely(!len))
    break;
Here len is only checked for being nonzero, and its value is entirely under the user's control.
npages = (off + len + PAGE_SIZE - 1) >> PAGE_SHIFT;
if (npages > PIPE_BUFFERS - buffers)
    npages = PIPE_BUFFERS - buffers;
error = get_user_pages(current, current->mm,
        (unsigned long) base, npages, 0, 0, &pages[buffers], NULL);
The value of npages is calculated from len. If we set len to UINT32_MAX, the sum off + len + PAGE_SIZE - 1 wraps around (an integer wrap) and npages becomes 0, which the code does not expect. Let us now look at what happens when get_user_pages() receives this unexpected npages value. get_user_pages() is used to pin user-space pages in memory and to obtain pointers to their page structures (struct page). However, the do { } while () loop that processes the pages inside get_user_pages() ends with:
    len--;
} while (len && start < vma->vm_end);
If len is 0 on entry (as it is in our case), the loop body still runs at least once and len is decremented to -1. The loop then keeps faulting in pages until it reaches an address with no valid mapping, at which point it stops and returns. By then it may have stored more pointers in the pages array than the array has room for. In other words, get_user_pages() overflows the pages array by writing more than PIPE_BUFFERS (16) pointers into it. However, the array whose overrun the exploit actually relies on is the partial array.
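The wrap and the runaway loop described above can be reproduced with a small user-space model. This is only a sketch under stated assumptions: npages_for() and pages_pinned() are hypothetical helper names, the arithmetic is done in 32 bits as on a 32-bit kernel, PAGE_SHIFT is assumed to be 12, and the mapped parameter stands in for the start < vma->vm_end bound of the real loop.

```c
#include <stdint.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1u << PAGE_SHIFT)

/* Model of the npages computation in 32-bit unsigned arithmetic.
 * Unsigned overflow wraps modulo 2^32, which is well defined in C. */
static uint32_t npages_for(uint32_t off, uint32_t len)
{
    uint32_t sum = off + len + PAGE_SIZE - 1;   /* may wrap around */
    return sum >> PAGE_SHIFT;
}

/* Toy model of a do { } while (len) loop: returns how many pages are
 * "pinned" when len pages are requested and mapped pages are available.
 * A request of 0 never reaches 0 again after the first decrement. */
static int pages_pinned(int len, int mapped)
{
    int pinned = 0;

    if (mapped <= 0)
        return 0;
    do {
        pinned++;   /* one struct page pointer would be stored here */
        len--;      /* 0 becomes -1 on the first pass */
    } while (len && pinned < mapped);
    return pinned;
}
```

With off = 0 and len = UINT32_MAX, npages_for() returns 0, and pages_pinned(0, 46) then walks all 46 mapped pages instead of stopping at once, while a sane request such as pages_pinned(3, 46) stops after 3.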
The partial array defined in do_vmsplice() is likewise passed to get_iovec_page_array(). It describes which portion of each page needs to be written to the pipe. After get_user_pages() returns, the following loop runs:
for (i = 0; i < error; i++) {
    const int plen = min_t(size_t, len, PAGE_SIZE - off);
    partial[buffers].offset = off;
    partial[buffers].len = plen;
    off = 0;
    len -= plen;
    buffers++;
}
In this case, since whole pages are being written, the calculated offset is zero and the length is PAGE_SIZE (4096). The return value of get_user_pages(), stored in error, is the number of pages that were mapped, which in the overflow case is 46. The partial array is likewise declared with only 16 elements, so this loop overflows it as well.
Both arrays are declared in vmsplice_to_pipe() (the function called do_vmsplice() above; it carries the name vmsplice_to_pipe() in later kernels). The partial array is placed in memory below pages, so when partial overflows, the loop above overwrites the pages array in turn. The contents of pages are therefore overwritten with zeros, replacing the pointers to page structures that were stored there before.
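The fill loop can be modelled in user space to count how many entries land beyond a 16-element array. This is a sketch, not the kernel code: entries_past_end() is a hypothetical helper, the partial_page structure is reduced to the two fields the loop writes, and the output buffer is deliberately sized by the caller so that nothing is actually overrun.

```c
#include <stddef.h>

#define PIPE_BUFFERS 16
#define PAGE_SIZE    4096u

/* Simplified stand-in: only the two fields the loop writes. */
struct partial_page {
    unsigned int offset;
    unsigned int len;
};

/* Run the fill loop for error pinned pages into out, which the caller
 * must size for at least error entries (unlike the kernel's array).
 * Returns how many entries fall past a PIPE_BUFFERS-sized array. */
static int entries_past_end(struct partial_page *out, int error,
                            size_t off, size_t len)
{
    int buffers = 0;

    for (int i = 0; i < error; i++) {
        size_t room = PAGE_SIZE - off;
        unsigned int plen = (unsigned int)(len < room ? len : room);

        out[buffers].offset = (unsigned int)off;
        out[buffers].len = plen;
        off = 0;
        len -= plen;
        buffers++;
    }
    return buffers > PIPE_BUFFERS ? buffers - PIPE_BUFFERS : 0;
}
```

Called with error = 46, off = 0 and a very large len, every entry gets offset 0 and length 4096, and 30 of the 46 entries fall outside a 16-element array.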
Once this is done, control returns to vmsplice_to_pipe(); the overflow is not large enough to overwrite the return address. Next comes the call to splice_to_pipe(), which appears to end at once, but this is where something interesting happens. Near the beginning of that function there is a test:
if (!pipe->readers) {
    send_sig(SIGPIPE, current, 0);
    if (!ret)
        ret = -EPIPE;
    break;
}
If we look at the exploit code, we see:
if (pipe(pi) < 0) die("pipe", errno);
close(pi[0]);
Before calling vmsplice(), the exploit closes the read end of the pipe. splice_to_pipe() therefore bails out immediately; on its way out, however, it performs the following steps:
while (page_nr < spd_pages)
    page_cache_release(spd->pages[page_nr++]);
As we know, get_user_pages() pins the relevant pages in memory so that the kernel can access them; the two lines above are cleanup code that, before returning, releases the pinned pages that are no longer needed. In our case, however, the contents of the pages array have been overwritten with zeros. One would expect the next thing to happen to be a kernel oops, because the array no longer holds valid addresses. The exploit, however, uses a few small tricks, such as specific mmap() calls, to place structures of its own choosing at the bottom of the address space.
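The "no readers" condition that the exploit sets up can be observed safely from user space. The sketch below uses an ordinary write() rather than vmsplice(), and errno_after_write_without_reader() is a hypothetical helper name; it only illustrates that a pipe whose read end is closed rejects data with EPIPE.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* Write one byte to a pipe whose read end is already closed.
 * Returns the resulting errno (EPIPE is expected), 0 if the write
 * unexpectedly succeeded, or -1 if the pipe could not be created.
 * Note: this changes the process-wide SIGPIPE disposition. */
static int errno_after_write_without_reader(void)
{
    int pi[2];
    char byte = 'x';
    int err = 0;

    if (pipe(pi) < 0)
        return -1;
    close(pi[0]);                 /* no readers are left */
    signal(SIGPIPE, SIG_IGN);     /* get EPIPE instead of being killed */
    if (write(pi[1], &byte, 1) < 0)
        err = errno;
    close(pi[1]);
    return err;
}
```

Ignoring SIGPIPE is a choice made for the demonstration only; with the default disposition the process would be terminated by the signal before it could inspect errno.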
When running in kernel mode, dereferencing a user-space pointer directly can cause many problems, but it does work. If the address is valid and the memory behind it is present, the access simply succeeds. The kernel believes it is reading struct page structures through these pointers and receives no error; what it actually reads is data crafted by the exploit.
Under normal circumstances the kernel manages each page individually. In some cases, however, several pages are grouped into a collection known as a "compound page". This happens when the kernel needs a piece of memory larger than a single page; when such an allocation is made, a group of compound pages is handed to the caller. What makes them special is that, when they are released, the group has to be split up rather than freed as ordinary single pages. Compound pages therefore have a property that ordinary pages lack: when they are released, a destructor is called.
Let us look at how the exploit sets up the page structures in low memory:
pages[0]->flags = 1 << PG_compound;
pages[0]->private = (unsigned long) pages[0];
pages[0]->count = 1;
pages[1]->lru.next = (long) kernel_code;
When the kernel looks at the page structure located at address 0 in user space, it finds that the structure describes a compound page. The destructor pointer (stored in lru.next of the second page structure) points to kernel_code(), a function defined earlier in the exploit code. Because count is set to 1, page_cache_release() (which decrements the count by 1) brings it to zero, meaning no references remain; since the page looks like part of a compound page, the destructor is called. At that point the arbitrary code stored at kernel_code runs in kernel mode.
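The release path described above can be imitated with a toy model in user space. Everything here is an assumption made for illustration: struct toy_page is a heavily simplified stand-in for struct page, the bit number chosen for PG_compound is hypothetical, lru.next is flattened into an integer field named lru_next, and toy_put_page() and fake_kernel_code() are invented names. It is not the kernel's implementation.

```c
#include <stdint.h>

#define PG_compound 14   /* hypothetical bit number for this sketch */

/* Hypothetical, heavily simplified stand-in for struct page. */
struct toy_page {
    unsigned long flags;
    int count;
    uintptr_t private;    /* address of the head page of the group */
    uintptr_t lru_next;   /* in page[1]: address of the destructor */
};

typedef void (*dtor_fn)(struct toy_page *);

static int dtor_calls;

/* Stands in for the exploit's kernel_code(); it only counts calls. */
static void fake_kernel_code(struct toy_page *head)
{
    (void)head;
    dtor_calls++;
}

/* Toy release path: drop one reference and, if the count reaches zero
 * on a compound page, call the destructor found in the second page
 * structure of the group. */
static void toy_put_page(struct toy_page *page)
{
    if (page->flags & (1UL << PG_compound)) {
        struct toy_page *head = (struct toy_page *)page->private;

        if (--head->count == 0) {
            dtor_fn dtor = (dtor_fn)head[1].lru_next;
            dtor(head);
        }
    }
}
```

Setting up two toy_page structures in the same way as the four exploit lines above, with fake_kernel_code in place of kernel_code, and calling toy_put_page() on the first one invokes the destructor exactly once.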



