Description
In this homework, you have to implement a simple instruction-level debugger that allows a user to debug a program interactively at the assembly instruction level. You can implement the debugger by using the ptrace interface.
To simplify your program, your debugger only has to handle static-nopie programs.
We use hello64 (https://up23.zoolab.org/up23/hw2/hello64), hello (https://up23.zoolab.org/up23/hw2/hello), and guess (https://up23.zoolab.org/up23/hw2/guess) to demonstrate the usage of the debugger.
Launch the program
Unlike gdb and lldb, your debugger launches the target program when the debugger starts.
The program should stop at the entry point, waiting for the user’s cont or si commands.
usage: ./sdb [program]
./sdb ./hello64
When the program is launched, the debugger should print the name of the executable and the entry point address. Before waiting for the user’s input, the debugger should disassemble 5 instructions starting from the current program counter (rip). The detailed requirements are described in the Disassemble section.
** program ‘./hello64’ loaded. entry point 0x4000b0
      4000b0: b8 04 00 00 00          mov       eax, 4
      4000b5: bb 01 00 00 00          mov       ebx, 1
      4000ba: b9 d4 00 60 00          mov       ecx, 0x6000d4
      4000bf: ba 0e 00 00 00          mov       edx, 0xe
      4000c4: cd 80                   int       0x80
(sdb)
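Hint (a sketch, not a required design): the entry point printed above is stored in the ELF header of the executable. The function below assumes the first bytes of the file have already been read into a buffer and that the target is a 64-bit ELF file with the same byte order as the host. The name elf64_entry is hypothetical.

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns 0 and stores the entry point in *entry if buf starts with a
 * 64-bit ELF header; returns -1 otherwise.
 * buf holds the first len bytes of the executable file. */
int elf64_entry(const unsigned char *buf, size_t len, uint64_t *entry)
{
    Elf64_Ehdr hdr;

    if (len < sizeof(hdr))
        return -1;                         /* too short to hold a header */
    memcpy(&hdr, buf, sizeof(hdr));        /* copy to avoid unaligned access */
    if (memcmp(hdr.e_ident, ELFMAG, SELFMAG) != 0)
        return -1;                         /* magic bytes do not match */
    if (hdr.e_ident[EI_CLASS] != ELFCLASS64)
        return -1;                         /* this sketch handles ELF64 only */
    *entry = hdr.e_entry;
    return 0;
}
```

The debugger could then print the value with a format such as "** program '%s' loaded. entry point 0x%lx".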
Disassemble
When returning from execution, the debugger should disassemble 5 instructions starting from the current program counter. The addresses of the 5 instructions should be within the range of the text section specified in the ELF file. We do not care about the exact format, but each line should contain:
- address, e.g. 4000b0
- raw instruction bytes, printed one byte per group, e.g. b8 04 00 00 00
- mnemonic, e.g. mov
- operands of the instruction, e.g. eax, 4
Make sure the output is aligned in columns.
Hint: You can link against the capstone library for disassembling.
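Hint (a sketch, not a required format): one way to get aligned columns is to build the byte column as a string first and then print every field with a fixed width. The parameters below are meant to be filled from the address, bytes, size, mnemonic, and op_str fields that capstone reports for each decoded instruction. The name format_insn and the column widths are arbitrary choices.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Formats one disassembly line into out (at most outsz bytes).
 * Columns: address, raw bytes (one byte per group), mnemonic, operands.
 * Returns the value of the final snprintf call. */
int format_insn(char *out, size_t outsz, uint64_t addr,
                const uint8_t *bytes, size_t nbytes,
                const char *mnemonic, const char *operands)
{
    char hex[3 * 16 + 1] = "";  /* x86 instructions are at most 15 bytes */
    size_t pos = 0;

    for (size_t i = 0; i < nbytes && i < 16; i++)
        pos += (size_t)snprintf(hex + pos, sizeof(hex) - pos,
                                i ? " %02x" : "%02x", bytes[i]);

    /* Fixed field widths keep the columns aligned across lines. */
    return snprintf(out, outsz, "%12llx: %-32s%-10s%s",
                    (unsigned long long)addr, hex, mnemonic, operands);
}
```

Each call produces one line, so the debugger would call it once per decoded instruction and print the result followed by a newline.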
- After an invalid command, or any command other than si, cont, or timetravel, the debugger should not disassemble the program.
- Patched instructions such as 0xcc (int3) should not appear in the output.
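Hint (a sketch, not a required design): one way to keep 0xcc out of the listing is to copy the code bytes read from the target into a local buffer and put the saved original bytes back before handing the buffer to the disassembler. The struct breakpoint record and the function name unpatch_code are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical breakpoint record: the patched address and the byte
 * that 0xcc replaced there. */
struct breakpoint {
    uint64_t addr;
    uint8_t  orig;
};

/* buf holds len bytes read from the target, starting at address base.
 * Every breakpoint that falls inside the buffer gets its original byte
 * restored, so the disassembler never sees the int3 patch. */
void unpatch_code(uint8_t *buf, uint64_t base, size_t len,
                  const struct breakpoint *bps, size_t nbps)
{
    for (size_t i = 0; i < nbps; i++) {
        if (bps[i].addr >= base && bps[i].addr - base < len)
            buf[bps[i].addr - base] = bps[i].orig;
    }
}
```

Only the local copy is modified here; the target's memory still contains the breakpoint.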
(sdb) si
      4000c4: cd 80                   int       0x80
      4000c6: b8 01 00 00 00          mov       eax, 1
      4000cb: bb 00 00 00 00          mov       ebx, 0
      4000d0: cd 80                   int       0x80
      4000d2: c3                      ret
(sdb) si
hello, world!
      4000c6: b8 01 00 00 00          mov       eax, 1
      4000cb: bb 00 00 00 00          mov       ebx, 0
      4000d0: cd 80                   int       0x80
      4000d2: c3                      ret
** the address is out of the range of the text section.
(sdb)

(sdb) si
      4000cb: bb 00 00 00 00          mov       ebx, 0
      4000d0: cd 80                   int       0x80
      4000d2: c3                      ret
** the address is out of the range of the text section.
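Hint (a sketch, not a required design): the transcripts above print fewer than 5 instructions when the listing reaches the end of the text section. One way to decide where to stop is to compare each decoded instruction address against the start and size of the text section taken from the ELF section headers. The names in_text, count_in_text, and MAX_INSNS are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_INSNS 5  /* the debugger lists at most 5 instructions */

/* Returns 1 if addr lies in [text_start, text_start + text_size). */
int in_text(uint64_t addr, uint64_t text_start, uint64_t text_size)
{
    return addr >= text_start && addr - text_start < text_size;
}

/* addrs holds the addresses of n consecutive decoded instructions.
 * Returns how many of them should be printed; a result below MAX_INSNS
 * means the out-of-range message should follow the listing. */
size_t count_in_text(const uint64_t *addrs, size_t n,
                     uint64_t text_start, uint64_t text_size)
{
    size_t k = 0;

    while (k < n && k < MAX_INSNS && in_text(addrs[k], text_start, text_size))
        k++;
    return k;
}
```

The subtraction form of the range check avoids overflow when text_start + text_size would wrap around.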
Step Instruction
When the user uses the si command, the target program should execute a single instruction.
(sdb) si
      4000c4: cd 80                   int       0x80
      4000c6: b8 01 00 00 00          mov       eax, 1
      4000cb: bb 00 00 00 00          mov       ebx, 0
      4000d0: cd 80                   int       0x80
      4000d2: c3                      ret
(sdb) si
hello, world!
      4000c6: b8 01 00 00 00          mov       eax, 1
      4000cb: bb 00 00 00 00          mov       ebx, 0
      4000d0: cd 80                   int       0x80
      4000d2: c3                      ret
** the address is out of the range of the text section.
(sdb)
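Hint (a sketch, not a required design): after each single step the debugger has to find out whether the target is still alive, because the stepped instruction may be the system call that ends the program. Assuming the debugger waits for the target with waitpid, the returned status can be classified as below. The enum and the function name classify_status are hypothetical.

```c
#include <signal.h>
#include <sys/wait.h>

/* Hypothetical classification of a waitpid() status. */
enum stop_kind {
    TARGET_EXITED,   /* the target is gone: print the terminated message */
    TARGET_TRAPPED,  /* stopped by SIGTRAP: single step done or int3 hit */
    TARGET_OTHER     /* stopped by some other signal */
};

enum stop_kind classify_status(int status)
{
    if (WIFEXITED(status) || WIFSIGNALED(status))
        return TARGET_EXITED;
    if (WIFSTOPPED(status) && WSTOPSIG(status) == SIGTRAP)
        return TARGET_TRAPPED;
    return TARGET_OTHER;
}
```

Only in the TARGET_TRAPPED case would the debugger go on to read the registers and disassemble from rip.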
Continue
The cont command continues the execution of the target program. The program should keep running until it terminates or hits a breakpoint.
You may use at most two ptrace(PTRACE_SINGLESTEP) calls and two int3 in the implementation of cont; otherwise you will get 0 points.
The example below sets a breakpoint with break <address in hexadecimal> before continuing.
** program ‘./hello64’ loaded. entry point 0x4000b0
      4000b0: b8 04 00 00 00          mov       eax, 4
      4000b5: bb 01 00 00 00          mov       ebx, 1
      4000ba: b9 d4 00 60 00          mov       ecx, 0x6000d4
      4000bf: ba 0e 00 00 00          mov       edx, 0xe
      4000c4: cd 80                   int       0x80
(sdb) break 0x4000ba
** set a breakpoint at 0x4000ba.
(sdb) cont
** hit a breakpoint at 0x4000ba.
      4000ba: b9 d4 00 60 00          mov       ecx, 0x6000d4
      4000bf: ba 0e 00 00 00          mov       edx, 0xe
      4000c4: cd 80                   int       0x80
      4000c6: b8 01 00 00 00          mov       eax, 1
      4000cb: bb 00 00 00 00          mov       ebx, 0
(sdb) cont
hello, world!
** the target program terminated.
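Hint (a sketch, not a required design): when cont stops because the target executed an int3, x86-64 reports rip one byte past the patched address, since int3 is a one-byte instruction. Assuming the breakpoints are kept in an array of address and original-byte pairs, the lookup could look like this. The struct breakpoint record and the name find_hit are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical breakpoint record: the patched address and the byte
 * that 0xcc replaced there. */
struct breakpoint {
    uint64_t addr;
    uint8_t  orig;
};

/* rip_after_trap is the rip value read after the target stopped with
 * SIGTRAP. Returns the index of the breakpoint whose int3 was just
 * executed, or -1 if the stop was not caused by a known breakpoint. */
int find_hit(uint64_t rip_after_trap,
             const struct breakpoint *bps, size_t nbps)
{
    if (rip_after_trap == 0)
        return -1;

    uint64_t hit = rip_after_trap - 1;  /* int3 is one byte long */

    for (size_t i = 0; i < nbps; i++) {
        if (bps[i].addr == hit)
            return (int)i;
    }
    return -1;
}
```

On a hit, the debugger would set rip back to the breakpoint address before printing the message and the disassembly. When the user continues from that address, one possible approach is to restore the original byte, execute one PTRACE_SINGLESTEP, write 0xcc back, and then resume with PTRACE_CONT.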
Breakpoint
A user can use break <address in hexadecimal> to set a breakpoint. The target program should stop before the instruction at the specified address is executed. Then the debugger should print a message reporting the breakpoint. If the user resumes the program with si instead of cont, the program should not stop at the same breakpoint twice. The debugger still needs to print the message.
** program ‘./hello64’ loaded. entry point 0x4000b0
      4000b0: b8 04 00 00 00          mov       eax, 4
      4000b5: bb 01 00 00 00          mov       ebx, 1
      4000ba: b9 d4 00 60 00          mov       ecx, 0x6000d4
      4000bf: ba 0e 00 00 00          mov       edx, 0xe
      4000c4: cd 80                   int       0x80
(sdb) break 0x4000ba
** set a breakpoint at 0x4000ba.
(sdb) si
      4000b5: bb 01 00 00 00          mov       ebx, 1
      4000ba: b9 d4 00 60 00          mov       ecx, 0x6000d4
      4000bf: ba 0e 00 00 00          mov       edx, 0xe
      4000c4: cd 80                   int       0x80
      4000c6: b8 01 00 00 00          mov       eax, 1
(sdb) si
** hit a breakpoint at 0x4000ba.
      4000ba: b9 d4 00 60 00          mov       ecx, 0x6000d4
      4000bf: ba 0e 00 00 00          mov       edx, 0xe
      4000c4: cd 80                   int       0x80
      4000c6: b8 01 00 00 00          mov       eax, 1
      4000cb: bb 00 00 00 00          mov       ebx, 0
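Hint (a sketch, not a required design): setting a breakpoint involves two small steps that are easy to get wrong, namely parsing the hexadecimal address and replacing exactly one byte of code. ptrace reads and writes target memory one word at a time, and on little-endian x86-64 the byte at the requested address is the low byte of that word. The function names below are hypothetical.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Parses a hexadecimal address such as "0x4000ba" or "4000ba".
 * Returns 0 on success and -1 if s is not a complete hex number. */
int parse_hex_addr(const char *s, uint64_t *addr)
{
    char *end;

    errno = 0;
    unsigned long long v = strtoull(s, &end, 16);
    if (errno != 0 || end == s || *end != '\0')
        return -1;
    *addr = (uint64_t)v;
    return 0;
}

/* word is the 8-byte value read at the breakpoint address.
 * Saves the original low byte in *orig and returns the word with that
 * byte replaced by 0xcc (int3); the other 7 bytes are unchanged. */
uint64_t word_with_int3(uint64_t word, uint8_t *orig)
{
    *orig = (uint8_t)(word & 0xff);
    return (word & ~(uint64_t)0xff) | 0xcc;
}

/* Returns word with its low byte replaced by the saved original byte. */
uint64_t word_with_orig(uint64_t word, uint8_t orig)
{
    return (word & ~(uint64_t)0xff) | orig;
}
```

The word passed in would come from PTRACE_PEEKTEXT and the returned word would be written back with PTRACE_POKETEXT.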
Sometimes you might see some bugs that are hard to replicate. Use the anchor command to set a checkpoint, and use the timetravel command to restore the process status.
Hint:
There are two ways to implement this feature.
- Snapshot the process memory and general purpose registers.
- Patch fork into the target process and stop the parent or child as the checkpoint.
This functionality is inspired by Checkpoint/Restore In Userspace (CRIU) (https://criu.org/Main_Page). gdb also has a similar feature, checkpoint, which is implemented in a different way.
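Hint (a sketch of the first approach only, under assumptions): to snapshot the process memory, the debugger first needs to know which regions can change. On Linux, each line of /proc/<pid>/maps starts with an address range and a permission string, so the writable regions can be picked out as below. The struct region type and the name parse_maps_line are hypothetical.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical description of one mapped region of the target. */
struct region {
    uint64_t start;     /* first address of the region */
    uint64_t end;       /* one past the last address */
    int      writable;  /* nonzero if the permissions include 'w' */
};

/* Parses one line of /proc/<pid>/maps, for example
 *   00600000-00601000 rw-p 00000000 08:01 1234 /path/to/program
 * Returns 0 on success and -1 if the line does not match. */
int parse_maps_line(const char *line, struct region *r)
{
    unsigned long long start, end;
    char perms[5];

    if (sscanf(line, "%llx-%llx %4s", &start, &end, perms) != 3)
        return -1;
    r->start = start;
    r->end = end;
    r->writable = (perms[1] == 'w');
    return 0;
}
```

For anchor, the debugger could copy the bytes of every writable region together with the registers; for timetravel, it would write both back.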
** program ‘./hello64’ loaded. entry point 0x4000b0
      4000b0: b8 04 00 00 00          mov       eax, 4
      4000b5: bb 01 00 00 00          mov       ebx, 1
      4000ba: b9 d4 00 60 00          mov       ecx, 0x6000d4
      4000bf: ba 0e 00 00 00          mov       edx, 0xe
      4000c4: cd 80                   int       0x80
(sdb) anchor
** dropped an anchor
(sdb) break 0x4000cb
** set a breakpoint at 0x4000cb
(sdb) cont
hello, world!
** hit a breakpoint at 0x4000cb
      4000cb: bb 00 00 00 00          mov       ebx, 0
      4000d0: cd 80                   int       0x80
      4000d2: c3                      ret
** the address is out of the range of the text section.
(sdb) timetravel
** go back to the anchor point
      4000b0: b8 04 00 00 00          mov       eax, 4
      4000b5: bb 01 00 00 00          mov       ebx, 1
      4000ba: b9 d4 00 60 00          mov       ecx, 0x6000d4
      4000bf: ba 0e 00 00 00          mov       edx, 0xe
      4000c4: cd 80                   int       0x80
(sdb) cont
hello, world!
** hit a breakpoint at 0x4000cb
      4000cb: bb 00 00 00 00          mov       ebx, 0
      4000d0: cd 80                   int       0x80
      4000d2: c3                      ret
** the address is out of the range of the text section.
Example 1 (10pt)
Command: ./sdb ./hello
Inputs: cont
** program ‘./hello’ loaded. entry point 0x401000
      401000: f3 0f 1e fa             endbr64
      401004: 55                      push      rbp
      401005: 48 89 e5                mov       rbp, rsp
      401008: ba 0e 00 00 00          mov       edx, 0xe
      40100d: 48 8d 05 ec 0f 00 00    lea       rax, [rip + 0xfec]
(sdb) cont
hello world!
** the target program terminated.
Example 2 (10pt)
Command: ./sdb ./hello
Inputs:
break 0x401030
break 0x40103b
cont
cont
si
si
** program ‘./hello’ loaded. entry point 0x401000
      401000: f3 0f 1e fa             endbr64
      401004: 55                      push      rbp
      401005: 48 89 e5                mov       rbp, rsp
      401008: ba 0e 00 00 00          mov       edx, 0xe
      40100d: 48 8d 05 ec 0f 00 00    lea       rax, [rip + 0xfec]
(sdb) break 0x401030
** set a breakpoint at 0x401030
(sdb) break 0x40103b
** set a breakpoint at 0x40103b
(sdb) cont
** hit a breakpoint at 0x401030
      401030: 0f 05                   syscall
      401032: c3                      ret
      401033: b8 00 00 00 00          mov       eax, 0
      401038: 0f 05                   syscall
      40103a: c3                      ret
(sdb) cont
hello world!
** hit a breakpoint at 0x40103b
      40103b: b8 3c 00 00 00          mov       eax, 0x3c
      401040: 0f 05                   syscall
** the address is out of the range of the text section.
(sdb) si
      401040: 0f 05                   syscall
** the address is out of the range of the text section.
(sdb) si
** the target program terminated.
Example 3 (10pt)
Command: ./sdb ./guess
Inputs:
break 0x4010bf
break 0x40111e
cont
anchor
cont
haha
timetravel
cont
42
cont
** program ‘./guess’ loaded. entry point 0x40108b
      40108b: f3 0f 1e fa             endbr64
      40108f: 55                      push      rbp
      401090: 48 89 e5                mov       rbp, rsp
      401093: 48 83 ec 10             sub       rsp, 0x10
      401097: ba 12 00 00 00          mov       edx, 0x12
(sdb) break 0x4010bf
** set a breakpoint at 0x4010bf
(sdb) break 0x40111e
** set a breakpoint at 0x40111e
(sdb) cont
guess a number > ** hit a breakpoint at 0x4010bf
      4010bf: bf 00 00 00 00          mov       edi, 0
      4010c4: e8 67 00 00 00          call      0x401130
      4010c9: 48 89 45 f8             mov       qword ptr [rbp - 8], rax
      4010cd: 48 8d 05 3e 0f 00 00    lea       rax, [rip + 0xf3e]
      4010d4: 48 89 c6                mov       rsi, rax
(sdb) anchor
** dropped an anchor
(sdb) cont
haha
no no no
** hit a breakpoint at 0x40111e
      40111e: bf 00 00 00 00          mov       edi, 0
      401123: e8 10 00 00 00          call      0x401138
      401128: b8 01 00 00 00          mov       eax, 1
      40112d: 0f 05                   syscall
      40112f: c3                      ret
(sdb) timetravel
** go back to the anchor point
      4010bf: bf 00 00 00 00          mov       edi, 0
      4010c4: e8 67 00 00 00          call      0x401130
      4010c9: 48 89 45 f8             mov       qword ptr [rbp - 8], rax
      4010cd: 48 8d 05 3e 0f 00 00    lea       rax, [rip + 0xf3e]
      4010d4: 48 89 c6                mov       rsi, rax
(sdb) cont
42
yes
** hit a breakpoint at 0x40111e
      40111e: bf 00 00 00 00          mov       edi, 0
      401123: e8 10 00 00 00          call      0x401138
      401128: b8 01 00 00 00          mov       eax, 1
      40112d: 0f 05                   syscall
      40112f: c3                      ret
(sdb) cont
** the target program terminated.
Grading
2. [70%] We use N additional test cases to evaluate your implementation. You get 70/N points for each correct test case.