We begin this part with the basics of an overflow. The sections and parts that follow cover memory, registers and shellcode, giving us an adequate understanding of these areas and of how they help us understand buffer overflows and ways to exploit them. The types of buffer overflow are then described in detail, followed by how to redirect an overflow to code of our choosing.
What is a buffer overflow?
Buffer overflows are a common vulnerability on all platforms, but are by far the most commonly exploited bug on Linux/Unix operating systems. A buffer overflow occurs when something is filled beyond its capacity. Imagine pouring more water into a container than it can hold: the water spills over and creates a mess. A similar situation applies to computer programs, where a fixed amount of space is allocated to store data during execution. If too much data is written into that space, known as the buffer, it overflows; hence the name buffer overflow.
A buffer overflow, or buffer overrun, occurs when a program allows input to be written beyond the end of an allocated buffer. A memory block allocated to store data is meant to hold data only up to its limit and no more. Any data beyond that limit spills into adjacent memory and can overwrite critical areas, which can give an attacker the ability to alter the execution flow of the program. Controlling the flow of execution lets the attacker run code of their choosing.
These buffer overflows arise from programming errors: the developer does not enforce any limit on the size of input the program can handle. C and C++ are the two most popular languages in which buffer overflows occur. They allow direct access to application memory without automatic bounds checks, which generally improves performance. Higher-level languages such as C# and Visual Basic have checks in place and do not normally give direct memory access, at some cost to performance.
Imagine what we could do with this vulnerability: add our own account, remotely control a machine, execute another program, and so on. Within the privileges of the vulnerable program we can do almost anything, often without the user even realizing that their machine has been compromised. This is the power an attacker gains by overflowing a buffer and exploiting the vulnerability, and it explains why this specific vulnerability is so popular and attracts so much interest.
Buffer overflows are all about memory. If memory were fully protected, buffer overflows could not take place and would be a thing of the past. Before examining the types of overflow in detail, a good understanding of how memory works is vital in order to appreciate the beauty of the overflow.
Memory is just an area of storage numbered with addresses. 32-bit Intel x86 processors use a 32-bit addressing scheme, which means that there are 2^32 = 4,294,967,296 addresses available. Every program, when loaded, is assigned a 4 GB virtual address space, which can give the program more memory than is physically present on the machine. The mapping from virtual to physical addresses is handled by the memory management unit (MMU), which on x86 is part of the processor, working in conjunction with the operating system. Besides translating addresses and giving each program a large address space, the MMU also provides memory protection and reduces memory fragmentation.
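On Windows XP, for example, the boot.ini entry below (shown truncated) carries the /3GB switch, which raises the user-mode share of the 4 GB virtual address space from 2 GB to 3 GB:
Windows XP Professional" /fastdetect /NoExecute=OptIn /3GB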
When a process is loaded into memory, it is broken down into sections. The program itself has three segments: .data, .bss and .text. The .data and .bss segments are reserved for global variables and are writable in memory. The .data segment contains initialised static data, while .bss contains uninitialised data; the size of both is fixed and does not change at runtime. An initialised variable would look like int a = 0; whereas an uninitialised variable would be int a;. The .text segment holds the actual program code and is mapped as read-only in memory. Lastly, the stack and heap are set up. The stack stores local variables and function call information such as return addresses. The heap stores dynamically allocated data.
The stack and the heap are regions of memory that we will look at in depth in the next parts.
One more important detail about memory that needs mentioning is the byte ordering of 4-byte words. Intel systems use little-endian ordering, which means that the least significant byte comes first. So if we want to use the address 0x7C82D9FB, we need to supply its bytes in the order FB D9 82 7C (written \xFB\xD9\x82\x7C in a string). In other words, a 4-byte memory address is stored in reverse byte order.
To be continued in the next parts.
by Anwar Mohamed