High-level vs. Low-level
add rax, 1 -> 4883C001 -> 01001000 10000011 11000000 00000001
print("Hello World!") -> write(1, "Hello World!", 12); _exit(0); (C) -> mov rax, 1; mov rdi, 1; mov rsi, "Hello World!", {etc} -> shellcode -> binary
Architecture
.bss section (for unassigned variables)
rip register
Data registers (rax, rbx, rcx, rdx, rsi, rdi, r8, r9, r10)
rax through rdx are the primary data registers
rdi and rsi are used for destination and source operands
r8-r10 can be used when the others are in use
Pointer registers (rbp, rsp, rip)
rbp is the base pointer, marking the base (beginning) of the current stack frame
rsp points to the current location within the stack (the top)
At the start of execution, rsp points to argc
rip is the instruction pointer, holding the address of the next instruction
rax is 64 bits, eax is the lower 32 bits, ax is the lower 16 bits, and al is the lowest 8 bits
rax is used for the syscall number - for example, 1 in rax selects sys_write, which prints the data
rdi, rsi, rdx, r10, r8, and r9 are used (in the order given) as arguments for the syscall (regular function calls use rcx in place of r10)
Memory addresses range from 0x0 to 0xffffffffffffffff (on 64-bit systems)
Addressing modes: immediate (add 2), register (add rax), direct (call 0xffffffffaa8a25ff), indirect (call [rax]), and stack (add rsp)
With little-endian byte order, 0x0011223344556677 would be stored as 0x7766554433221100
Assembly File Structure
Two main sections: .data, which contains the variables, and .text, which contains the code to be executed
A directive (global _start), which tells the linker that execution begins at _start
Variables are defined in the .data section using db for a list of bytes, dw for a list of words, and dd for a list of double words
message db "Hello World!", 0x0a
0x0a is just a line feed (newline); placing it there appends it to the string
%d, to specify what type of string it is
Use the equ instruction to evaluate an expression
length equ $-message sets length to the distance from the current position ($) back to the start of message (the current address minus the address of message), so length would be the length of Hello World!
.text holds all the assembly instructions, which are loaded into the text segment of memory (from which they are executed)
Data segment
global _start
section .data
message db "Hello World!"
length equ $-message
section .text
_start:
mov rax, 1 ; syscall number 1 means use the sys_write syscall
mov rdi, 1 ; 1st argument - file descriptor 1 means output to stdout
mov rsi, message ; 2nd argument - pointer to message string
mov rdx, length ; 3rd argument - number of bytes to write
syscall
mov rax, 60 ; syscall 60 is exit
mov rdi, 0 ; 1st argument - return exit code 0
syscall
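For comparison, the two syscalls in the program above can be mirrored from Python, whose os.write is a thin wrapper around the same write call. This is only a rough sketch and not part of the original notes: the helper name sys_write is made up for illustration, and len(message) stands in for what length equ $-message evaluates to.

```python
import os

message = b"Hello World!"   # same bytes as: message db "Hello World!"
length = len(message)       # stands in for: length equ $-message

def sys_write(fd: int = 1) -> int:
    """Hypothetical helper mirroring rax=1, rdi=fd, rsi=message, rdx=length."""
    # os.write returns the number of bytes written, like the syscall does in rax
    return os.write(fd, message[:length])
```

Calling sys_write() with no arguments writes the message to stdout (file descriptor 1); a following os._exit(0) would play the same role as the exit syscall at the end of the assembly program.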
Assembling a file
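Assembly code is saved in .s or .asm files
Assemble the .s file with nasm
nasm -f elf64 {filename}.s
This produces a .o file, which we link with ld
ld -o {output_name} {assembled_file}.o
When linking 32-bit code, add the -m elf_i386 flag
Disassembling a file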
Use objdump to dump the machine code from a file and decode it into assembly instructions
objdump -M {syntax_like_intel} -d {binary} -s
-M can be used to specify disassembler options, such as the syntax
-s dumps the full contents of each section, so we can see strings as well as the raw bytes of the .text section
GDB
Install GEF with wget -O ~/.gdbinit-gef.py -q https://gef.blah.cat/py and then echo source ~/.gdbinit-gef.py >> ~/.gdbinit
gdb {binary} will open it up with GEF ready to go
help {command} - display usage of individual gdb commands (like the ones below)
info {category} - view general program information, such as functions, variables, breakpoints, or the stack
disas {function_name} - disassemble a function
registers - examine register contents (via GEF)
b {function_name} - set an execution breakpoint on a function
b *0x{memory_address} - set a breakpoint at a specific point in memory
d {breakpoint_id} - delete a breakpoint
n - go to the next source line, stepping over function calls (without debug info, it runs until the current function exits)
ni - go to the next instruction (stepping over function calls)
Use si for a more detailed instruction step-through (every single machine instruction run on the processor)
si would step into a call such as sum(), whereas ni would skip over this
c - continue to next breakpoint
r - run the program
set args {args} - can be used to set the arguments before execution
x/{count}{format}{size} {$register_or_0xAddress} - examine memory at a certain point
{count} is the number of units to display
{format} is x for hex, s for string, and i for instruction
{size} is b for byte, h for halfword, w for word, g for giant (8 bytes)
x/4xb $rip would examine 4 bytes, shown in hex, starting at the memory address stored in rip
Use patch (via GEF) to modify memory at a given address
patch {type/size} {location} {values_to_change_to}
{type/size} can be byte, word, dword, qword, or string
Use set in GDB to change a register value
set ${reg}={value}
!command - run a shell command (useful for something like !strings)
Data Movement
Use mov to move a value into a register
mov rax, 1 - puts 1 in rax
mov rax, rsp - copies the address held in rsp into rax
mov rax, [rsp] - copies the value stored at the address in rsp into rax
Use lea to load an address computed with pointer arithmetic into a register
lea rax, [rsp + 10] - loads the address rsp + 0xa into rax
lea rax, [rbx + rcx*4 + 32] - loads rbx + rcx * 4 + 0x20 into rax
Arithmetic
Use inc and dec to increment/decrement by 1
Use add, sub, and imul to add/subtract/multiply the destination by the source
Bitwise Instructions
and, or, not, and xor will all perform their respective operations
or rax, rax will just set rax to itself (same with and), whereas xor rax, rax would set it to 0
not rax would invert every bit of rax
Loops
Loops use the rcx register as the iteration counter
The loop instruction always uses rcx, so if we forget to set our loop iterations there it could underflow from 0 lol
Define the loop with {loop_name}: and the instructions following it
loop {loop_name}
_start function
Branching
Use jmp to jump unconditionally to a label
jmp doesn’t decrement rcx, so running jmp on the current function is basically a while true loop
jz/jnz - jump if destination equal to zero/not equal to zero
js/jns - jump if destination negative/non-negative
jg/jge - jump if destination greater than (or equal to) source
jl/jle - jump if destination less than (or equal to) source
Conditions also apply to other instructions, such as cmovz, which moves the source into the destination if the zero flag is set
Other examples are cmovl or setz
RFLAGS bits, starting from bit 0: Carry, {Reserved}, Parity, {Reserved}, Auxiliary Carry, {Reserved}, Zero, Sign, Trap, Interrupt, Direction, Overflow
Use cmp, which will subtract the 2nd operand from the 1st (discarding the result) and populate the flags accordingly
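To make the link between cmp, the flags, and the conditional jumps more concrete, here is a small Python model of a 64-bit cmp. It is a sketch for intuition only: the function names cmp_flags and jump_taken are invented for this example, and only the Zero, Sign, Carry, and Overflow flags are modelled.

```python
MASK = (1 << 64) - 1   # keep values within 64 bits, like a register
SIGN = 1 << 63         # the sign bit of a 64-bit value

def cmp_flags(dst: int, src: int) -> dict:
    """Model cmp dst, src: compute dst - src and return the resulting flags."""
    a, b = dst & MASK, src & MASK
    result = (a - b) & MASK
    return {
        "ZF": result == 0,                           # operands were equal
        "SF": bool(result & SIGN),                   # result looks negative
        "CF": a < b,                                 # unsigned borrow
        "OF": bool((a ^ b) & (a ^ result) & SIGN),   # signed overflow
    }

def jump_taken(jump: str, flags: dict) -> bool:
    """Return whether the given conditional jump would be taken."""
    zf, sf, of = flags["ZF"], flags["SF"], flags["OF"]
    conditions = {
        "jz": zf, "jnz": not zf,
        "js": sf, "jns": not sf,
        "jg": (not zf) and sf == of, "jge": sf == of,
        "jl": sf != of, "jle": zf or sf != of,
    }
    return conditions[jump]
```

For example, jump_taken("jl", cmp_flags(3, 5)) is True because the subtraction sets the Sign flag without signed overflow, which is the condition jl tests.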
Using the Stack
To preserve a register across a function call, push rax and then pop it afterward
We can sub from rsp before calling a function if we aren’t aligned, as long as we add to rsp afterward
If we have pushed two 8-byte registers (such as rax and rbx), then we don’t need to worry about stack alignment
rsp points to the top, while rbp points to the base
Subroutines
Use call to call a function, which basically pushes the return address (the address of the next instruction) to the stack (so we know where to return to) and then jumps to the associated point in memory
Use ret to pop the address at rsp into rip and jump to it
Functions
The function’s return value is placed in rax
Import external functions with extern {function}, such as printf or scanf
Link the libc library within the ld command
-lc --dynamic-linker /lib64/ld-linux-x86-64.so.2
For printf, load the format string into rdi and the string to print into rsi, and then call printf
For scanf, we’ll need a buffer to hold the input
Reserve it in a .bss section (after .data) with {variable_name} resb {number_of_reserved_bytes}
Pwntools
pwn asm '{assembly}' -c '{arch_like_amd64}' - assemble instructions into machine code
python3 -c 'from pwn import *; file = ELF("{binary}"); print(file.section(".text").hex())' - dump the .text section of a binary as hex
run_shellcode:
from pwn import *; context(os="linux",arch="amd64",log_level="error")
run_shellcode(unhex("{shellcode}")).interactive()
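Before handing a hex string to run_shellcode, it can help to check what it decodes to. The sketch below uses only the standard library (bytes.fromhex in place of pwntools’ unhex) and reports the length and any null bytes, which usually break shellcode that is copied as a string. The function name inspect_shellcode is hypothetical.

```python
def inspect_shellcode(hex_string: str) -> dict:
    """Decode a hex string and report its size and the offsets of null bytes."""
    raw = bytes.fromhex(hex_string)
    null_offsets = [i for i, byte in enumerate(raw) if byte == 0]
    return {"length": len(raw), "null_offsets": null_offsets}

# 48c7c001000000 is one encoding of mov rax, 1; b001 is mov al, 1
print(inspect_shellcode("48c7c001000000"))
print(inspect_shellcode("b001"))
```

The first string decodes to 7 bytes with three trailing nulls, while the second is 2 bytes with none.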
Shellcoding Techniques
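Push strings onto the stack in reverse order, 8 bytes at a time: mov rbx, 'rld!', push rbx, mov rbx, 'Hello Wo', push rbx
Then set rsi to rsp (rsi is the argument to print, and rsp is the current start of the string)
rsp
To avoid null bytes in the shellcode, instead of mov rax, 1, we’d want to do mov al, 1 (with the rest of rax already zeroed)
Shellcoding Tools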
Use pwn disasm to turn shellcode back into assembly
pwn disasm '{shellcode}' -c 'amd64'
We can spawn /bin/sh using execve (syscall number 59)
execve("/bin//sh", ["/bin//sh"], NULL)
We need rax to hold 59 (syscall number), rdi to hold a pointer to '/bin//sh' (the program to execute), rsi to hold a pointer to ['/bin//sh'] (the list of argument pointers), and rdx to hold NULL (no env variables)
We add an extra / to /bin/sh so it’s 8 bytes
global _start
section .text
_start:
mov al, 59 ; execve syscall number
xor rdx, rdx ; set env to NULL
push rdx ; push NULL string terminator
mov rdi, '/bin//sh' ; load the 8-byte path string
push rdi ; push the string onto the stack
mov rdi, rsp ; 1st argument - pointer to '/bin//sh'
push rdx ; push NULL to terminate the argument list
push rdi ; push pointer to '/bin//sh' as the only argument
mov rsi, rsp ; 2nd argument - pointer to ['/bin//sh']
syscall
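The reason for the extra / and for pushing strings in reverse can be checked with a few lines of Python. This is an illustrative sketch: imm64 and push_order are made-up helper names, and struct’s "<Q" format is used to model a little-endian 64-bit immediate.

```python
import struct

def imm64(chunk: bytes) -> int:
    """Value an 8-byte string takes when loaded as a little-endian 64-bit immediate."""
    if len(chunk) != 8:
        raise ValueError("expected exactly 8 bytes")
    return struct.unpack("<Q", chunk)[0]

def push_order(text: bytes) -> list:
    """Split text into 8-byte chunks and return them in the order they must be pushed."""
    chunks = [text[i:i + 8] for i in range(0, len(text), 8)]
    # The stack grows downward, so the last chunk is pushed first
    return chunks[::-1]

print(len(b"/bin/sh"), len(b"/bin//sh"))   # 7 vs. 8 bytes
print(hex(imm64(b"/bin//sh")))             # immediate loaded by mov rdi, '/bin//sh'
print(push_order(b"Hello World!"))         # chunks in push order
```

Because /bin//sh fills the register exactly, the immediate contains no padding zero bytes, and the push order for Hello World! matches the 'rld!' then 'Hello Wo' sequence noted under Shellcoding Techniques.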
pwn also has shellcraft, which we can use to generate a /bin/sh shell with pwn shellcraft amd64.linux.sh
Add -r to run the shellcode
Use msfvenom to generate or encode a payload
msfvenom -p 'linux/x64/exec' CMD='sh' -a 'x64' --platform 'linux' -f 'hex'
Add -e 'x64/xor' to encode the payload
To encode our own shellcode, first extract the .text section with objcopy -O binary -j .text {binary} {binary_name}.bin
msfvenom -p - -a 'x64' --platform 'linux' -f 'hex' < {binary_name}.bin will then give us the shellcode
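As a rough idea of what an XOR encoder does, the sketch below XORs every byte with a single-byte key chosen so that no null byte appears in the output. This is a simplified illustration under that assumption, not msfvenom’s actual x64/xor implementation (which also prepends a decoder stub); pick_key and xor_bytes are hypothetical names.

```python
def pick_key(data: bytes) -> int:
    """Return the first non-zero single-byte key that leaves no null byte after XOR."""
    for key in range(1, 256):
        # b ^ key is zero only when b equals key, so the key must not occur in data
        if key not in data:
            return key
    raise ValueError("no single-byte key avoids null bytes for this data")

def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte with key; applying it twice restores the original."""
    return bytes(byte ^ key for byte in data)

shellcode = bytes.fromhex("48c7c001000000")   # mov rax, 1 (contains null bytes)
key = pick_key(shellcode)
encoded = xor_bytes(shellcode, key)
print(key, encoded.hex())
```

Decoding is the same operation with the same key, which is why a real encoder only needs a short stub that loops over the payload before jumping to it.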