I am working towards the Offensive Security Certified Professional (OSCP) certification in May, and today I am writing up my notes on buffer overflows. Let’s get right into it!
IA-32 CPU Primer
System organisation basics:
Control Unit: The brain of the CPU 🠒 Gets instructions, parses them, retrieves and stores data in memory
Execution Unit: The body of the CPU 🠒 Carries out the execution
Registers: Internal memory locations used as ‘variables’
Flags: Indicators of ‘events’ that occur during execution (for example, a result of zero or a carry)
CPU registers are small, fast storage locations inside the CPU itself, used for CPU calculations and execution. We will be focusing on the General Purpose Registers and EIP.
General Purpose Registers:
EAX: Accumulator register – used for storing operands and result data
EBX: Base Register – Pointer to Data
ECX: Counter Register – Loop Operations
EDX: Data Register – I/O Pointer
ESI: Source Index – Data Pointer for memory operations
EDI: Destination Index – Data Pointer for memory operations
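ESP: Stack Pointer Register – Points to the top of the stack
EBP: Stack Base Pointer Register – Points to the base of the current stack frame
EIP (Instruction Pointer):
- Contains the memory address of the next instruction to be executed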
- The focus of almost all shellcoding, exploit research, etc.
- Control of EIP grants control of execution
- By extension, control of the program and the underlying system.
Most RCE exploits consist of creative ways to place arbitrary values into EIP.
x86 Assembly Primer
What is assembly?
- Low-level programming language
- Used to communicate with the microprocessor directly
- Specific to the processor family (Intel, ARM, MIPS, etc)
- Almost one-to-one correspondence with machine language
x86 (32-bit):
- 32-bit CPU registers
- Can address a theoretical maximum of 4 GB of RAM (in practice, typically less than 3.5 GB is usable)
x64 (64-bit):
- An extension of the original 32-bit x86 architecture for 64-bit processors
- 64-bit CPU registers can address a much wider range of memory: theoretically up to 17,179,869,184 GB!
[label] mnemonic [operands] [; comment]
Label: Used to represent either an identifier or a constant
Mnemonic: Identifies the purpose of the statement. It is not required if a line contains only a label or a comment.
Operands: Specifies the data to be manipulated. Most instructions take two operands.
Comment: For developer notes. The text is ignored by the assembler.
Intel syntax: the destination operand comes first, then the source. For example, MOV EAX, EBX copies the value of EBX into EAX.
Win32 Process Memory
When an application is initialised, Windows creates the process and assigns virtual memory to it
What is a Stack?
- Located in RAM
- Used for local variables, function calls, function parameters, and other temporary data
- Last-in, first-out (LIFO) structure
- Grows from higher to lower address
Every time a function is called, the function's parameters, the saved value of the stack base pointer (EBP), and the return address (the saved value of EIP) are pushed onto the stack. When the function returns, the saved value of EIP is popped off the stack and loaded back into EIP.
The stack GROWS down: the top of the stack is at the lowest address currently in use. Each stack entry is 32 bits (4 bytes) wide, so ESP moves in steps of 4.
- Assembly has several instructions specifically designed to interact with the stack.
PUSH <operand>; Example: PUSH EAX
- Decrements ESP by 4 and then places the operand (a register, address, etc.) onto the top of the stack. The stack GROWS.
POP <operand>; Example: POP EAX
- Loads the value from the top of the stack into the location specified in the operand, then increments ESP by 4. The stack SHRINKS.
RET
- Transfers program control to the return address located on the top of the stack
- Typically, this address is placed on the stack by a call instruction when a function is called. This instruction is intended to return to a normal execution flow after a function is finished executing.
Okay… so why does this matter?
Some programming languages (primarily C and C++) require the programmer to manually define how much memory is allocated for variables such as buffers.
If user-controlled data is accepted in a variable, the programmer must be sure to include logic to check the supplied data and ensure it is appropriately sized before it is accepted into process memory and placed onto the stack.
Many standard functions in these languages (strcpy and gets, for example) perform no bounds checking of their own.
If bounds checking is not done, more data can be written onto the stack than fits within the bytes allocated for it.
The excess data will overwrite higher memory addresses sequentially, along with everything stored there, perhaps even beyond the current stack frame.
The stack is also used to store memory addresses that will be loaded into EIP at some later time, such as return addresses. Overwrite one of those, and you control EIP.
That’s all folks! I understand how buffer overflow works and hope you do too. Time to get my hands dirty with some exploitation. 😉