High-level vs. Low-level
add rax, 1
-> 4883C001
-> 01001000 10000011 11000000 00000001
print("Hello World!")
-> write(1, "Hello World!", 12); _exit(0); (C)
-> mov rax, 1; mov rdi, 1; mov rsi, "Hello World!", {etc}
-> shellcode -> binary
Architecture
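The .bss section is the section for unassigned (uninitialized) variables
The address of the next instruction to run is held in the rip register
Data registers (rax, rbx, rcx, rdx, rsi, rdi, r8, r9, r10)
rax through rdx are general-purpose
rdi and rsi are used for destination and source operands
r8-r10 can be used when the others are in use
Pointer registers (rbp, rsp, rip)
rbp is the base stack pointer, keeping track of the beginning of the current stack frame
rsp points to the current location within the stack (the top)
rsp points to argc at the start of execution
rip is the instruction pointer, holding the address of the next instruction
rax is 64 bits, eax is the lower 32 bits, ax is the lower 16 bits, and al is the lowest 8 bits
rax is used for the syscall number - for example, 1 in rax means to print the data (sys_write)
rdi, rsi, rdx, r10, r8, and r9 are used (in the order given) as arguments for the syscall; rcx takes the fourth slot only for ordinary function calls, because the syscall instruction overwrites rcx
Memory addresses range from 0x0 to 0xffffffffffffffff (on 64 bit systems)
Addressing modes, by example: immediate (add 2), register (add rax), direct (call 0xffffffffaa8a25ff), indirect (call [rax]), stack (add rsp)
x86-64 is little-endian, so the lowest byte is stored first: 0x0011223344556677 would be stored as 0x7766554433221100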
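The register widths and the byte order described above can be checked with a small Python sketch. This is only a model using integer masks and the standard `struct` module; the helper names `sub_registers` and `stored_bytes` are made up for illustration.

```python
import struct

MASK64 = (1 << 64) - 1


def sub_registers(rax: int) -> dict:
    """Return the narrower views of a 64-bit value, as with rax/eax/ax/al."""
    rax &= MASK64
    return {
        "rax": rax,
        "eax": rax & 0xFFFFFFFF,  # lower 32 bits
        "ax": rax & 0xFFFF,       # lower 16 bits
        "al": rax & 0xFF,         # lowest 8 bits
    }


def stored_bytes(value: int) -> bytes:
    """Pack a 64-bit value the way a little-endian machine lays it out in memory."""
    return struct.pack("<Q", value & MASK64)


if __name__ == "__main__":
    value = 0x0011223344556677
    for name, part in sub_registers(value).items():
        print(name, hex(part))
    print(stored_bytes(value).hex())
```

Reading the packed bytes from the lowest address upward gives 77 66 55 44 33 22 11 00, which is the reversed form quoted in the notes.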
Assembly File Structure
Assembly files are split into sections, mainly .data, which contains the variables, and .text, which contains the code to be executed
There is also a directive (global _start), which exports _start so the linker knows to begin execution at _start
Variables are defined in the .data section using db for a list of bytes, dw for a list of words (2 bytes each), dd for a list of double words (4 bytes each)
message db "Hello World!", 0x0a
0x0a is just a line feed (newline), placing it there just appends it to the string
A string can also carry a format specifier, such as %d, to specify what type of value it holds
Use the equ directive to evaluate an expression and give the result a name
length equ $-message
would set length to the distance from where we’re currently at ($) back to message (the current address minus the address of message), so length would be the length of Hello World!
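The `$-message` trick is easier to see with a toy location counter. The sketch below is not how nasm works internally; `assemble_data` is a hypothetical helper that lays items out back to back and remembers where each label starts.

```python
def assemble_data(items):
    """Lay out (label, data) pairs back to back and record each label's offset."""
    image = bytearray()
    labels = {}
    for label, data in items:
        labels[label] = len(image)  # offset at which this item starts
        image += data
    return bytes(image), labels


if __name__ == "__main__":
    image, labels = assemble_data([("message", b"Hello World!")])
    here = len(image)                  # plays the role of $ right after message
    length = here - labels["message"]  # same idea as: length equ $-message
    print(length)
```

Because `$` is read immediately after the string, the difference is exactly the number of bytes in the string, whatever came before it in the section.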
.text holds all the assembly instructions and loads them into the Text segment of memory (from which they are executed)
The Text segment is read-only, unlike the Data segment, which is writable
global _start
section .data
    message db "Hello World!"
    length equ $-message
section .text
_start:
    mov rax, 1 ; syscall number 1 means use the sys_write syscall
    mov rdi, 1 ; 1st argument - file descriptor 1 means output to stdout
    mov rsi, message ; 2nd argument - pointer to message string
    mov rdx, length ; 3rd argument - number of bytes to write
    syscall
    mov rax, 60 ; syscall 60 is exit
    mov rdi, 0 ; 1st argument - return exit code 0
    syscall
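For comparison, roughly the same two system calls can be reached from Python through the `os` module. This is a sketch of the equivalent calls rather than a translation of the assembly; `sys_write` and `run` are illustrative names, and `os._exit` is only the closest Python counterpart of the exit syscall.

```python
import os

MESSAGE = b"Hello World!"


def sys_write(fd: int, data: bytes) -> int:
    """Thin wrapper over write(2); returns the number of bytes written."""
    return os.write(fd, data)


def run() -> None:
    """Mirror the program above: write the message to stdout, then exit with 0."""
    sys_write(1, MESSAGE)  # rax=1, rdi=1, rsi=message, rdx=length
    os._exit(0)            # roughly rax=60, rdi=0
```

Calling `run()` prints the message and ends the process immediately, so it is left uncalled here.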
Assembling a file
Assembly code is saved in .s or .asm files; the commands below assume a .s file
Assemble it with nasm
nasm -f elf64 {filename}.s
This produces a .o file
Link it with ld
ld -o {output_name} {assembled_file}.o
For a 32-bit binary, link with the -m elf_i386 flag
Disassembling a file
Use objdump to dump the machine code from a file and interpret it as assembly instructions
objdump -M {syntax_like_intel} -d {binary} -s
-M can be used to specify more disassembler options
-s dumps the full contents of every section, so we can read strings and other data in addition to the .text section
GDB
Install GEF with wget -O ~/.gdbinit-gef.py -q https://gef.blah.cat/py and then echo source ~/.gdbinit-gef.py >> ~/.gdbinit
gdb {binary} will open it up with GEF ready to go
help {command} - display usage of individual gdb commands (like the ones below)
info {category} - view general program information, such as functions, variables, breakpoints, or the stack
disas {function_name} - disassemble a function
registers - examine register contents
b {function_name} - set an execution breakpoint on a function
b *0x{memory_address} - set a breakpoint at a specific point in memory
d {breakpoint_id} - delete a breakpoint
n - go to the next source line (stepping over function calls)
ni - go to the next instruction (skipping function calls)
Use si for a more detailed instruction step-through (every single machine instruction run on the processor)
si would step into a call to a function such as sum(), whereas ni would skip over this
c - continue to next breakpoint
r - run the program
set args {args} - can be used to set the arguments before execution
x/{count}{format}{size} {$register_or_0xAddress} - examine memory at a certain point
{count} is the number of units to display
{format} is x for hex, s for string, and i for instruction
{size} is b for byte, h for halfword (2 bytes), w for word (4 bytes), g for giant (8 bytes)
x/4xb $rip would examine 4 bytes in hex, one byte at a time, starting at the memory address stored in rip
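To make the three parts of the x/ specifier concrete, here is a small decoder for the letters listed above. It is a reading aid only: it covers just the formats and sizes named in these notes, and `decode_x` is a made-up helper, not part of GDB or GEF.

```python
import re

FORMATS = {"x": "hex", "s": "string", "i": "instruction"}
SIZES = {"b": 1, "h": 2, "w": 4, "g": 8}  # unit size in bytes


def decode_x(spec: str) -> dict:
    """Decode a spec such as '4xb' into count, format name and unit size."""
    match = re.fullmatch(r"(\d*)([xsi]?)([bhwg]?)", spec)
    if match is None:
        raise ValueError(f"unsupported spec: {spec!r}")
    count, fmt, size = match.groups()
    return {
        "count": int(count) if count else 1,  # a missing count means one unit
        "format": FORMATS.get(fmt),           # None means "use the default"
        "size": SIZES.get(size),
    }


if __name__ == "__main__":
    print(decode_x("4xb"))
    print(decode_x("4ig"))
```

Here `4xb` decodes to four 1-byte units shown in hex, while a spec using the `i` format asks for instructions.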
Use patch (via GEF) to modify memory at a given address
patch {type/size} {location} {values_to_change_to}
{type/size} can be byte, word, dword, qword, or string
Registers can be changed with set in GDB
set ${reg}={value}
!command - run a shell command (useful for something like !strings)
Data Movement
Use mov to move a value into a register
mov rax, 1 - puts 1 in rax
mov rax, rsp - moves the address in rsp into rax
mov rax, [rsp] - moves the value stored at the address in rsp into rax
Use lea to load an address with pointer arithmetic into a register
lea rax, [rsp + 10] - loads the address rsp + 0xa into rax
lea rax, [rbx + rcx*4 + 32] would load rbx + rcx * 4 + 0x20 into rax
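The difference between computing an address (lea) and reading through one (mov with brackets) can be modelled in a few lines of Python. This is a sketch with a dictionary standing in for memory; the function names and the example addresses are hypothetical.

```python
MASK64 = (1 << 64) - 1


def lea(base: int, index: int = 0, scale: int = 1, disp: int = 0) -> int:
    """Compute base + index*scale + disp, wrapped to 64 bits; no memory is read."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4 or 8")
    return (base + index * scale + disp) & MASK64


def mov_load(memory: dict, address: int) -> int:
    """Model mov reg, [address]: fetch the value stored at that address."""
    return memory[address]


if __name__ == "__main__":
    rsp = 0x7FFE0000                   # made-up stack address
    memory = {rsp: 1}                  # the value 1 sits at the top of the stack
    print(hex(lea(rsp, disp=10)))      # address arithmetic only
    print(mov_load(memory, rsp))       # dereference
    print(hex(lea(0x1000, 3, 4, 32)))  # rbx + rcx*4 + 32
```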
Arithmetic
Use inc and dec to increment/decrement by 1
Use add, sub, and imul to add/subtract/multiply destination by source
Bitwise Instructions
and, or, not, and xor will all perform their respective operations
or rax, rax will just set rax to itself (same with and), whereas xor would set it to 0
not rax would invert every bit of rax
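These identities are quick to confirm in Python, as long as results are masked to 64 bits to imitate a register. A minimal sketch follows; the trailing underscores only avoid clashing with Python keywords.

```python
MASK64 = (1 << 64) - 1


def or_(a: int, b: int) -> int:
    return (a | b) & MASK64


def and_(a: int, b: int) -> int:
    return (a & b) & MASK64


def xor_(a: int, b: int) -> int:
    return (a ^ b) & MASK64


def not_(a: int) -> int:
    """Flip all 64 bits, as not rax does."""
    return ~a & MASK64


if __name__ == "__main__":
    rax = 0x1234
    print(hex(or_(rax, rax)), hex(and_(rax, rax)), hex(xor_(rax, rax)))
    print(hex(not_(rax)))
```

This is also why `xor rax, rax` is the usual way to zero a register.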
Loops
Loops count down using the rcx register
Whatever is in rcx will be used, so if we forget to set our loop iterations here it could underflow from 0 lol
Start the loop with {loop_name}: and instructions following
End it with loop {loop_name}, which decrements rcx and jumps back to the label while rcx is not 0
The loop can sit directly inside the _start function
Branching
Unconditional jumps use jmp
jmp doesn’t decrement rcx, so running jmp on the current label is basically a while true loop
jz/jnz - jump if destination equal to zero/not equal to zero
js/jns - jump if destination negative/non-negative
jg/jge - jump if destination greater than (or equal to) source (signed comparison)
jl/jle - jump if destination lesser than (or equal to) source (signed comparison)
There are also conditional instructions like cmovz, which moves the source into the destination if the zero flag is set
Others include cmovl or setz
The flags register holds, from bit 0 upward: Carry, {Reserved}, Parity, {Reserved}, Auxiliary Carry, {Reserved}, Zero, Sign, Trap, Interrupt, Direction, Overflow
Flags are usually set with cmp, which will subtract the 2nd operand from the 1st, discard the result, and populate the flags accordingly
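How cmp feeds the conditional jumps can be sketched in Python by doing the subtraction on 64-bit values and deriving four of the flags from it. This is a simplified model (Parity and Auxiliary Carry are left out), and the function names simply mirror the instructions they imitate.

```python
MASK64 = (1 << 64) - 1
SIGN = 1 << 63


def cmp_flags(dst: int, src: int) -> dict:
    """Compute ZF, SF, CF and OF for dst - src without keeping the result."""
    dst &= MASK64
    src &= MASK64
    result = (dst - src) & MASK64
    return {
        "ZF": result == 0,
        "SF": bool(result & SIGN),
        "CF": dst < src,                                  # unsigned borrow
        "OF": bool((dst ^ src) & (dst ^ result) & SIGN),  # signed overflow
    }


def jz(flags: dict) -> bool:
    return flags["ZF"]


def jl(flags: dict) -> bool:
    return flags["SF"] != flags["OF"]


def jg(flags: dict) -> bool:
    return not flags["ZF"] and flags["SF"] == flags["OF"]


if __name__ == "__main__":
    print(jz(cmp_flags(5, 5)), jl(cmp_flags(3, 5)), jg(cmp_flags(7, 5)))
```

Negative numbers are passed in two's complement form, so -1 is `MASK64`; `jl` still reports it as less than 1 because it looks at Sign and Overflow rather than Carry.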
Using the Stack
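To preserve a register across a call, push rax and then pop it afterward
The stack must be 16-byte aligned when a function is called, so we can sub from rsp before calling a function if we aren’t aligned, as long as we add to rsp afterward
If we have pushed a multiple of 16 bytes from an aligned starting point (e.g. two 8-byte registers, rax and rbx), then we don’t need to worry about stack alignment
rsp points to the top, while rbp points to the base
Subroutines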
Use call to call a function, which basically pushes the address of the next instruction to the stack (so we know where to return to) and then jumps to the associated point in memory
Use ret to pop the address at rsp into rip and jump to it
Functions
A function's return value is placed in rax
Import external functions with extern {function}, such as printf or scanf
These come from libc, so link the libc library within the ld command
-lc --dynamic-linker /lib64/ld-linux-x86-64.so.2
For printf, put the format string into rdi and the string to print into rsi, and then call printf
For scanf, we’ll need a buffer to hold the input
Reserve it in a .bss section (after .data) with {variable_name} resb {number_of_reserved_bytes}
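The push, pop, call and ret behaviour described in the stack and subroutine notes above can be modelled with a tiny Python class. It is a toy: memory is a dictionary, every slot is 8 bytes, and the class name and all addresses are invented for the example.

```python
class Machine:
    """Toy model of rsp, rip and a downward-growing stack of 8-byte slots."""

    def __init__(self, rsp: int = 0x7FFFFFFFE000):
        self.rsp = rsp
        self.rip = 0
        self.memory = {}

    def push(self, value: int) -> None:
        self.rsp -= 8                 # the stack grows toward lower addresses
        self.memory[self.rsp] = value

    def pop(self) -> int:
        value = self.memory[self.rsp]
        self.rsp += 8
        return value

    def call(self, target: int, return_address: int) -> None:
        self.push(return_address)     # remember where to come back to
        self.rip = target

    def ret(self) -> None:
        self.rip = self.pop()         # pop the saved address into rip


if __name__ == "__main__":
    m = Machine()
    m.call(target=0x401020, return_address=0x401005)
    print(hex(m.rip), m.rsp % 16)
    m.ret()
    print(hex(m.rip), m.rsp % 16)
```

The model also shows why alignment needs attention: a call pushes 8 bytes, so a stack that was 16-byte aligned before the call is 8 bytes off inside the callee.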
Pwntools
Assemble instructions into shellcode with pwn asm '{assembly}' -c '{arch_like_amd64}'
Extract the shellcode from a binary with python3 -c 'from pwn import *; file = ELF("{binary}"); print(file.section(".text").hex())'
Run shellcode with run_shellcode:
from pwn import *; context(os="linux",arch="amd64",log_level="error")
run_shellcode(unhex("{shellcode}")).interactive()
Shellcoding Techniques
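Shellcode can't use variables, so push strings onto the stack in reverse order, 8 bytes at a time: mov rbx, 'rld!', push rbx, mov rbx, 'Hello Wo', push rbx
Then set rsi to rsp (rsi is the argument to print, and rsp is the current start of the string)
Avoid hard-coded memory addresses; reference data relative to rsp
Shellcode must not contain NULL bytes, so instead of mov rax, 1, we’d want to do mov al, 1 (after clearing the register with xor rax, rax)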
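Both techniques above can be checked offline with a little Python. The sketch below splits a string into the 8-byte pieces to push (last piece first), reconstructs what would sit at rsp afterwards, and scans hex-encoded shellcode for NULL bytes. All three helpers are illustrative and not part of pwntools.

```python
def push_chunks(text: bytes, width: int = 8) -> list:
    """Split text into width-byte pieces and return them in push order."""
    pieces = [text[i:i + width] for i in range(0, len(text), width)]
    return list(reversed(pieces))  # the end of the string is pushed first


def stack_image(chunks: list, width: int = 8) -> bytes:
    """Bytes found at rsp after pushing the chunks in the given order."""
    # The most recent push sits at the lowest address, so read it first.
    return b"".join(c.ljust(width, b"\x00") for c in reversed(chunks))


def has_null_bytes(shellcode_hex: str) -> bool:
    return b"\x00" in bytes.fromhex(shellcode_hex)


if __name__ == "__main__":
    chunks = push_chunks(b"Hello World!")
    print(chunks)
    print(stack_image(chunks))
    print(has_null_bytes("48c7c001000000"))  # mov rax, 1
    print(has_null_bytes("b001"))            # mov al, 1
```

The 7-byte encoding of `mov rax, 1` carries three zero bytes from the padded immediate, while `mov al, 1` is just two non-zero bytes.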
Shellcoding Tools
Disassemble shellcode with pwn disasm
pwn disasm '{shellcode}' -c 'amd64'
A shell can be spawned by running /bin/sh using execve (syscall number 59)
execve("/bin//sh", ["/bin//sh"], NULL)
This needs rax to hold 59 (syscall), rdi and rsi to hold ['/bin//sh'] (pointer to program to execute and list of argument pointers), and rdx to hold NULL (no env variables)
We add an extra / to /bin/sh so it’s 8 bytes
global _start
section .text
_start:
    xor rax, rax ; clear rax so only the low byte needs setting
    mov al, 59 ; execve syscall number
    xor rdx, rdx ; set env to NULL
    push rdx ; push NULL string terminator
    mov rdi, '/bin//sh' ; program path, exactly 8 bytes
    push rdi ; push to stack
    mov rdi, rsp ; 1st argument - pointer to '/bin//sh'
    push rdx ; push NULL terminator for the argument list
    push rdi ; push pointer to '/bin//sh' as the only argument
    mov rsi, rsp ; 2nd argument - pointer to the argument list
    syscall
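The same call shape can be written with Python's `os.execve`, which helps when checking what each register is supposed to point at. This is a sketch only: `execve_arguments` and `spawn_shell` are made-up names, and Python passes an empty environment list rather than a literal NULL pointer.

```python
import os

PROGRAM = "/bin//sh"  # the extra slash pads the path to 8 bytes


def execve_arguments() -> tuple:
    """Return (path, argv, env), mirroring rdi, rsi and rdx in the assembly."""
    return PROGRAM, [PROGRAM], {}


def spawn_shell() -> None:
    """Replace the current process with a shell; does not return on success."""
    path, argv, env = execve_arguments()
    os.execve(path, argv, env)


if __name__ == "__main__":
    path, argv, env = execve_arguments()
    print(len(path.encode()), os.path.normpath(path))
```

The doubled slash is harmless because the kernel treats repeated separators inside a path as one, which is why the padded path still resolves to the normal shell.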
pwn also has shellcraft, which we can use to generate a /bin/sh shell with pwn shellcraft amd64.linux.sh
Add -r to run the shellcode
Use msfvenom to generate or encode a payload
msfvenom -p 'linux/x64/exec' CMD='sh' -a 'x64' --platform 'linux' -f 'hex'
Add -e 'x64/xor' to encode the payload
To process our own binary, first extract its raw .text section with objcopy -O binary -j .text {binary} {binary_name}.bin
msfvenom -p - -a 'x64' --platform 'linux' -f 'hex' < {binary_name}.bin will give us the shellcode