The Missing Bits: Part 1
I remember my college days, the good old days, and the time we spent with friends. The most troublesome part was the exams. Whenever I tried to remember the answer to a question, all I could recall was English songs. I never questioned that behavior, maybe because I love songs, or maybe because I knew I wouldn’t find an answer even if I tried. Looking back now, I can tell you one thing for sure: the brain cells holding my exam answers had been overwritten by song lyrics. I can’t complain, because I tried to fit too many things into a limited space. The same applies to computer memory. When we try to fit x+y bytes of data into x bytes of space, the data overflows and the adjacent memory regions are overwritten with unwanted data. Some tried to fix this behavior while others tried to exploit it. As a result, a new class of vulnerability was born: the buffer overflow.
It is destructive yet beautiful at the same time, a true art. Hackers have a soft spot for buffer overflows because of their destructive nature. In this series I will guide you through how these vulnerabilities arise and how to exploit them.
Basics
We must learn the basics before trying anything more advanced. Understanding buffer overflows means understanding architecture, addressing, and memory layout.
- Architecture
Architecture is all about design; here we are concerned with CPU design. You might’ve already heard about the x86 (32-bit) and x64 (64-bit, also called x86-64) architectures. The processor design determines what software can run on the computer and what other hardware components are supported.
- Addressing
It is similar to a home address: a way to find something. Memory addresses are references to memory locations, usually written in hexadecimal. Since we are dealing with a 64-bit machine, each memory address is 64 bits (8 bytes) long, so in principle the system can represent 2⁶⁴ distinct memory addresses.
A sample 64-bit memory address:
0x7fffffffec50
But there is a catch: just because an architecture uses 64-bit pointers doesn’t mean that all the bits of those pointers are actually used. Typical current implementations do not allow the entire virtual address space of 2⁶⁴ bytes (16 EiB) to be used. So even though memory addresses are 64 bits long, only the lower 48 bits carry address information, and user space is limited to the lower half of that range, which ends at 0x00007fffffffffff. Keep this in mind, because if you specify an address greater than 0x00007fffffffffff, you’ll raise an exception.
- Memory layout
Memory is where all our data is stored. When a process is loaded into memory, a certain amount of memory is allocated to it in an organized manner, known as the memory layout of the process.
Stack: The stack is where a function’s local variables are stored. It works in LIFO (last-in-first-out) order and grows downwards (from higher memory addresses to lower ones). When a function is called, a stack frame containing all the local variables of that function is created and pushed onto the stack (a PUSH operation). When the function returns, the stack frame is removed through a POP operation.
Heap: The heap is responsible for dynamic memory allocations (e.g. malloc()).
BSS: BSS stands for ‘Block Started by Symbol’. It contains all global and static variables that are initialized to zero or have no explicit initialization in the source code.
DATA: It contains the global and static variables that are initialized with non-zero values.
TEXT: The text segment contains the executable instructions of your program; it’s also called the code segment.
Since we are about to explore stack-based buffer overflows, our main concern is the stack region. We need to understand what goes on in the stack and how registers are used during process execution.