I am in the midst of my pursuit of the Offensive Security Certified Professional (OSCP) certification in May, and today I will be writing up my notes on buffer overflows. Let’s get right into it!

IA-32 CPU Primer

System organisation basics:

Control Unit: The brain of the CPU → Gets instructions, decodes them, retrieves and stores data in memory

Execution Unit: The body of the CPU → Carries out the execution

Registers: Internal memory locations used as ‘variables’

Flags: Indicators of an ‘event’ that occurs during execution (for example, a result of zero or a carry)

IA-32 Registers:

CPU registers are small bits of memory inside the CPU itself used for CPU calculations and execution. We will be focusing on General Purpose Registers and EIP.

General Purpose Registers:

EAX: Accumulator register – used for storing operands and result data

EBX: Base Register – Pointer to Data

ECX: Counter Register – Loop Operations

EDX: Data Register – I/O Pointer

ESI: Source Index – Data Pointer for memory operations

EDI: Destination Index – Data Pointer for memory operations

ESP: Stack Pointer Register – points to the top of the stack

EBP: Stack Base Pointer Register – points to the base of the current stack frame

EIP:

  • Contains the memory address of the next instruction to be executed
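  • Cannot be written to directly (with MOV, for example); it changes only through control-flow instructions such as JMP, CALL and RET
  • The focus of almost all shellcoding, exploit research, etc.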
  • Control of EIP grants control of execution
  • By extension, control of the program and the underlying system.

Most memory-corruption RCE exploits come down to creative ways of placing an attacker-chosen value into EIP.

x86 Assembly Primer

What is assembly?

  • Low-level programming language
  • Used to communicate with the microprocessor directly
  • Specific to the processor family (Intel, ARM, MIPS, etc.)
  • Almost one-to-one correspondence with machine language

Intel Architecture

x86:

  • 32-bit CPU registers
  • Can address only a theoretical maximum of 4 gigabytes of RAM (in practice, the usable amount is often less than 3.5 GB)

x86_64:

  • An extension of the original 32-bit x86 architecture for 64-bit processors
  • 64-bit CPU registers can address a much wider range of memory: a theoretical maximum of 17,179,869,184 GB!
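
To sanity-check those two figures, here is a minimal sketch of the arithmetic. It assumes byte-addressable memory and that 1 GB means 2**30 bytes; the helper name addressable_gb is my own, not something from a library.

```python
# Addressable memory for a given pointer width, assuming byte-addressable
# memory and binary units (1 GB here means 2**30 bytes).
BYTES_PER_GB = 2 ** 30


def addressable_gb(pointer_bits: int) -> int:
    """Return the theoretical address space in GB for a pointer width."""
    return (2 ** pointer_bits) // BYTES_PER_GB


x86_gb = addressable_gb(32)
x86_64_gb = addressable_gb(64)

print(f"32-bit: {x86_gb:,} GB")     # 32-bit: 4 GB
print(f"64-bit: {x86_64_gb:,} GB")  # 64-bit: 17,179,869,184 GB
```

These are theoretical ceilings only; how much memory a real machine can actually use depends on the CPU and the operating system.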

Instruction Format

[label] mnemonic [operands] [; comment]

Label: Used to represent either an identifier or a constant

Mnemonic: Identifies the purpose of the statement. It is not required if a line contains only a label or a comment.

Operands: Specify the data to be manipulated. Most instructions take two operands.

Comment: For developer notes. The text is ignored by the assembler.
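
To make the four fields concrete, here is a rough sketch that splits one line into its parts. The function parse_instruction and the Instruction tuple are hypothetical names of mine, and the sketch assumes that a label ends with ':' and that operands are separated by commas. Real assemblers accept a lot more than this.

```python
from typing import NamedTuple, Optional


class Instruction(NamedTuple):
    label: Optional[str]
    mnemonic: Optional[str]
    operands: list
    comment: Optional[str]


def parse_instruction(line: str) -> Instruction:
    """Split one line of the form: [label] mnemonic [operands] [; comment].

    Assumptions for this sketch: a label ends with ':' and operands are
    separated by commas.
    """
    # Everything after the first ';' is a comment.
    code, sep, comment_text = line.partition(";")
    comment = comment_text.strip() if sep else None

    # An optional leading word ending in ':' is treated as the label.
    label = None
    words = code.split(None, 1)
    if words and words[0].endswith(":"):
        label = words[0][:-1]
        code = words[1] if len(words) > 1 else ""

    # The first remaining word is the mnemonic, the rest are operands.
    words = code.split(None, 1)
    mnemonic = words[0].upper() if words else None
    operand_text = words[1] if len(words) > 1 else ""
    operands = [op.strip() for op in operand_text.split(",") if op.strip()]
    return Instruction(label, mnemonic, operands, comment)


parsed = parse_instruction("start: mov eax, ebx ; copy EBX into EAX")
print(parsed.label, parsed.mnemonic, parsed.operands, parsed.comment)
```

A line that holds only a comment comes back with no mnemonic, which matches the note above that the mnemonic is optional in that case.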

Intel syntax: The destination operand comes first, then the source. For example, MOV EAX, EBX copies the value in EBX into EAX.

Win32 Process Memory

When an application is initialised, Windows creates the process and assigns virtual memory to it.

What is a Stack?

  • Located in RAM
  • Used for local variables, function calls, function parameters, and other temporary data
  • Last-in, first-out (LIFO) structure
  • Grows from higher to lower addresses

Every time a function is called, the function parameters, the saved value of EIP (the return address) and the saved stack base pointer (EBP) are pushed onto the stack. When the function returns, the saved value of EIP is popped off the stack and loaded back into EIP.

The stack GROWS down: the top of the stack sits at the lowest address the stack currently uses. On IA-32, each stack slot is 32 bits, which is 4 bytes.

Stack Instructions

  • Assembly has several instructions specifically designed to interact with the stack.

PUSH <operand>; Example: PUSH EAX

  • Decrements ESP and then places the operand (a register, address, etc.) onto the top of the stack. The stack GROWS.

POP <operand>; Example: POP EAX

  • Loads the value from the top of the stack into the location specified in the operand, then increments ESP. The stack SHRINKS.

RET

  • Transfers program control to the return address located on the top of the stack
    • Typically, this address is placed on the stack by a CALL instruction when a function is called. RET is intended to return to the normal execution flow after a function has finished executing.
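
The three instructions above can be sketched as a toy model. This is not how a CPU is implemented, just a way to watch ESP and EIP move. The class Stack32 and every address in it are made-up values for illustration.

```python
class Stack32:
    """Toy model of a 32-bit stack: a dict of address -> value plus ESP."""

    SLOT = 4  # each stack slot is 32 bits, i.e. 4 bytes

    def __init__(self, base: int = 0x0012FFFC):  # arbitrary example address
        self.memory = {}
        self.esp = base  # the stack grows down from here
        self.eip = 0

    def push(self, value: int) -> None:
        self.esp -= self.SLOT          # decrement ESP first...
        self.memory[self.esp] = value  # ...then store at the new top

    def pop(self) -> int:
        value = self.memory[self.esp]  # read from the top...
        self.esp += self.SLOT          # ...then increment ESP
        return value

    def call(self, target: int, return_address: int) -> None:
        self.push(return_address)  # CALL saves where to resume
        self.eip = target

    def ret(self) -> None:
        self.eip = self.pop()  # RET loads the saved address into EIP


stack = Stack32()
start_esp = stack.esp

stack.push(0xDEADBEEF)
esp_after_push = stack.esp  # 4 bytes lower than before
popped = stack.pop()

stack.call(target=0x00401000, return_address=0x00401234)
stack.ret()  # EIP is now whatever value was sitting on top of the stack

print(hex(esp_after_push), hex(popped), hex(stack.eip))
```

Notice that ret() trusts whatever value is on top of the stack. It has no way of knowing whether CALL put it there.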

Okay… so why does this matter?

Some programming languages (primarily C and C++) require the programmer to manually define the size to be allocated in memory for variables.

If user-controlled data is accepted in a variable, the programmer must be sure to include logic to check the supplied data and ensure it is appropriately sized before it is accepted into process memory and placed onto the stack.

Many standard functions in these languages (strcpy and gets, for example) perform no bounds checking by default.
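
The length check described above might look something like this. It is a sketch in Python rather than C, with a bytearray standing in for a fixed-size buffer. BUFFER_SIZE and checked_copy are hypothetical names chosen for the example.

```python
BUFFER_SIZE = 16  # hypothetical fixed allocation, like a C char[16]


def checked_copy(buffer: bytearray, data: bytes) -> None:
    """Copy data into buffer only if it fits; otherwise refuse."""
    if len(data) > len(buffer):
        raise ValueError(
            f"input is {len(data)} bytes, buffer holds {len(buffer)}"
        )
    buffer[:len(data)] = data


buffer = bytearray(BUFFER_SIZE)
checked_copy(buffer, b"hello")  # fits, so it is copied

try:
    checked_copy(buffer, b"A" * 64)  # too long, so it is rejected
    rejected = False
except ValueError:
    rejected = True

print(bytes(buffer[:5]), rejected)
```

In real C code the check would also need to leave room for the terminating null byte of a string.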


If bounds checking is not done, more data can be written onto the stack than fits within the bytes allocated for it.

The excess data overwrites higher memory addresses sequentially, along with everything stored there, perhaps even beyond the current stack frame.

The stack also stores memory addresses that will be loaded into EIP at some later time, such as return addresses, so an overflow can overwrite those too.

Visualisation
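
One way to picture the overwrite is to model a single stack frame as a byte array and copy too much data into it. This is a simplified sketch: the layout (an 8-byte buffer, then the saved EBP, then the saved EIP) and all of the addresses are assumptions for illustration, and real frames vary by compiler and calling convention.

```python
import struct

BUFFER_SIZE = 8
FRAME_SIZE = BUFFER_SIZE + 4 + 4  # buffer + saved EBP + saved EIP


def new_frame(saved_ebp: int, return_address: int) -> bytearray:
    """Lay out a frame from low to high address: buffer, EBP, EIP."""
    frame = bytearray(FRAME_SIZE)
    # Little-endian 32-bit values, as on IA-32.
    struct.pack_into("<II", frame, BUFFER_SIZE, saved_ebp, return_address)
    return frame


def unchecked_copy(frame: bytearray, data: bytes) -> None:
    """Write data from the start of the buffer with no length check."""
    data = data[:FRAME_SIZE]  # clamp only so the toy model stays one frame
    frame[:len(data)] = data


def saved_eip(frame: bytearray) -> int:
    return struct.unpack_from("<I", frame, BUFFER_SIZE + 4)[0]


def show(frame: bytearray) -> None:
    """Print the frame with the highest address first."""
    rows = [("saved EIP", BUFFER_SIZE + 4), ("saved EBP", BUFFER_SIZE),
            ("buffer[4:8]", 4), ("buffer[0:4]", 0)]
    for name, offset in rows:
        print(f"+{offset:02d}  {name:<12} {frame[offset:offset + 4].hex()}")


frame = new_frame(saved_ebp=0x0012FF80, return_address=0x00401234)
eip_before = saved_eip(frame)
show(frame)

# 8 bytes fill the buffer, the next 4 land on saved EBP, the last 4 on saved EIP.
payload = b"A" * BUFFER_SIZE + b"B" * 4 + b"C" * 4
unchecked_copy(frame, payload)
eip_after = saved_eip(frame)
show(frame)

print(hex(eip_before), "->", hex(eip_after))
```

After the copy, the saved EIP slot holds 0x43434343 (the four "C" bytes), so a later RET would send execution to an address chosen by whoever supplied the input.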

 

That’s all folks! I understand how a buffer overflow works and hope you do too. Time to get my hands dirty with some exploitation. 😉