python - 使用自己的 python 调试器获取系统调用时遇到问题

标签 python linux x86 system-calls ptrace

我试图在 32 位测试 linux 系统(Lubuntu)上用 python3 编写一个简单的调试器,它应该能够捕获任意程序的所有系统调用(在这种情况下:/bin/ls).为此,我使用 ptrace 系统调用来单步执行整个过程。在每一步之后,我读取寄存器以找到指令指针 eip 以从下一条指令读取 2 个字节。如果这 2 个字节是 0xcd0x80,则表示 int 80 是系统调用。我知道还有用于此目的的 PTRACE_SYSCALL,但我想在不使用它的情况下执行此操作。

在下面我向您展示了代码,它似乎可以工作,但是有一些奇怪的行为:

为了弄清楚这是否有效,我使用 strace 将它的输出与我自己的系统调用进行比较。而且我的程序似乎只显示了系统调用的第一部分,第二部分就不见了。为了向您展示我在下面发布了我的程序和 strace 的输出。有人知道这里可能出了什么问题吗?

import os               # os interaction
from struct import pack # dealing with bytes (ptrace)
import ctypes           # support c data structures

""" ========================================================== """

# 32 bit reg process structrue
class UserRegsStruct(ctypes.Structure):
    _fields_ = [
        ("ebx", ctypes.c_ulong),
        ("ecx", ctypes.c_ulong),
        ("edx", ctypes.c_ulong),
        ("esi", ctypes.c_ulong),
        ("edi", ctypes.c_ulong),
        ("ebp", ctypes.c_ulong),
        ("eax", ctypes.c_ulong),
        ("xds", ctypes.c_ulong),
        ("xes", ctypes.c_ulong),
        ("xfs", ctypes.c_ulong),
        ("xgs", ctypes.c_ulong),
        ("orig_eax", ctypes.c_ulong),
        ("eip", ctypes.c_ulong),
        ("xcs", ctypes.c_ulong),
        ("eflags", ctypes.c_ulong),
        ("esp", ctypes.c_ulong),
        ("xss", ctypes.c_ulong),
    ]

# ptrace constants
PTRACE_TRACEME = 0
PTRACE_PEEKDATA = 2
PTRACE_SINGLESTEP = 9
PTRACE_GETREGS = 12

CPU_WORD_SIZE = 4   # size of cpu word size (32 bit = 4 bytes)

# for syscalls
libc = ctypes.CDLL('libc.so.6')

# check if child (tracee) is still running
def WIFSTOPPED(status):
    return (status & 0xff) == 0x7f

# read from process memory by PTRACE_PEEKDATA
def ReadProcessMemory(pid, address, size):

    # address must be aligned!!
    offset = address % CPU_WORD_SIZE
    if offset:
        address -= offset
        word = libc.ptrace(PTRACE_PEEKDATA, pid, address, 0)
        wordbytes = pack("i", word)
        subsize = min(CPU_WORD_SIZE - offset, size)
        data = wordbytes[offset:offset + subsize]
        size -= subsize
        address += CPU_WORD_SIZE
    else:
        data = bytes(0)

    while size:
        word = libc.ptrace(PTRACE_PEEKDATA, pid, address, 0)
        wordbytes = pack("i", word)
        if size < CPU_WORD_SIZE:
            data += wordbytes[:size]
            break
        data += wordbytes
        size -= CPU_WORD_SIZE
        address += CPU_WORD_SIZE

    return data

""" ========================================================== """

# extract syscall names
fp = open("/usr/include/i386-linux-gnu/asm/unistd_32.h", "r")
syscalls = [0] * 400

for line in fp:
    if "__NR_" in line:
        a = line.rstrip().split(" ")
        name = a[1].split("NR_")[1]
        number = int(a[2])
        syscalls[number] = name

# "int 80" asm instruction = (0xCD 0x80)
a0 = 0xcd
a1 = 0x80

# create child tracee
pid = os.fork()

if pid == 0:    # in tracee
    libc.ptrace(PTRACE_TRACEME, 0, 0, 0)    # make child traceable
    os.execv("/bin/ls", [":-P"])            # run test programm
else:           # in tracer
    pid, status = os.waitpid(pid, 0)
    regs = UserRegsStruct()

# catch all syscalls
while True:

    libc.ptrace(PTRACE_SINGLESTEP, pid, 0, 0)               # execute next instruction
    pid, status = os.waitpid(pid, 0)                        # wait for tracee
    libc.ptrace(PTRACE_GETREGS, pid, 0, ctypes.byref(regs)) # get register values
    data = ReadProcessMemory(pid, regs.eip, 2)              # read 2 bytes from instruction pointer address

    # now check if this is a syscall
    if data[0] == a0 and data[1] == a1:
        print("HEUREKA! SYSCALL at " + hex(regs.eip) + ": " + syscalls[regs.eax])

    if WIFSTOPPED(status) == False: break # exit loop when tracee stopped

这产生了以下输出:

HEUREKA! SYSCALL at 0xb7fae2c5: brk
HEUREKA! SYSCALL at 0xb7fa3944: access
HEUREKA! SYSCALL at 0xb7faf7ae: mmap2
HEUREKA! SYSCALL at 0xb7faf689: access
HEUREKA! SYSCALL at 0xb7faf4b5: openat
HEUREKA! SYSCALL at 0xb7faf419: fstat64
HEUREKA! SYSCALL at 0xb7faf7ae: mmap2
HEUREKA! SYSCALL at 0xb7faf755: close
HEUREKA! SYSCALL at 0xb7faa758: access
HEUREKA! SYSCALL at 0xb7faf4b5: openat
HEUREKA! SYSCALL at 0xb7faf57e: read
HEUREKA! SYSCALL at 0xb7faf419: fstat64
HEUREKA! SYSCALL at 0xb7faf7ae: mmap2
HEUREKA! SYSCALL at 0xb7faf7ae: mmap2
HEUREKA! SYSCALL at 0xb7faf7ae: mmap2
HEUREKA! SYSCALL at 0xb7faf755: close
HEUREKA! SYSCALL at 0xb7faa758: access
HEUREKA! SYSCALL at 0xb7faf4b5: openat
HEUREKA! SYSCALL at 0xb7faf57e: read
HEUREKA! SYSCALL at 0xb7faf419: fstat64
HEUREKA! SYSCALL at 0xb7faf7ae: mmap2
HEUREKA! SYSCALL at 0xb7faf822: mprotect
HEUREKA! SYSCALL at 0xb7faf7ae: mmap2
HEUREKA! SYSCALL at 0xb7faf7ae: mmap2
HEUREKA! SYSCALL at 0xb7faf755: close
HEUREKA! SYSCALL at 0xb7faa758: access
HEUREKA! SYSCALL at 0xb7faf4b5: openat
HEUREKA! SYSCALL at 0xb7faf57e: read
HEUREKA! SYSCALL at 0xb7faf419: fstat64
HEUREKA! SYSCALL at 0xb7faf7ae: mmap2
HEUREKA! SYSCALL at 0xb7faf7ae: mmap2
HEUREKA! SYSCALL at 0xb7faf755: close
HEUREKA! SYSCALL at 0xb7faa758: access
HEUREKA! SYSCALL at 0xb7faf4b5: openat
HEUREKA! SYSCALL at 0xb7faf57e: read
HEUREKA! SYSCALL at 0xb7faf419: fstat64
HEUREKA! SYSCALL at 0xb7faf7ae: mmap2
HEUREKA! SYSCALL at 0xb7faf7ae: mmap2
HEUREKA! SYSCALL at 0xb7faf755: close
HEUREKA! SYSCALL at 0xb7faa758: access
HEUREKA! SYSCALL at 0xb7faf4b5: openat
HEUREKA! SYSCALL at 0xb7faf57e: read
HEUREKA! SYSCALL at 0xb7faf419: fstat64
HEUREKA! SYSCALL at 0xb7faf7ae: mmap2
HEUREKA! SYSCALL at 0xb7faf7ae: mmap2
HEUREKA! SYSCALL at 0xb7faf7ae: mmap2
HEUREKA! SYSCALL at 0xb7faf755: close
HEUREKA! SYSCALL at 0xb7faf7ae: mmap2
HEUREKA! SYSCALL at 0xb7f95bd9: set_thread_area
HEUREKA! SYSCALL at 0xb7faf822: mprotect
HEUREKA! SYSCALL at 0xb7faf822: mprotect
HEUREKA! SYSCALL at 0xb7faf822: mprotect
HEUREKA! SYSCALL at 0xb7faf822: mprotect
HEUREKA! SYSCALL at 0xb7faf822: mprotect
HEUREKA! SYSCALL at 0xb7faf822: mprotect
HEUREKA! SYSCALL at 0xb7faf822: mprotect
HEUREKA! SYSCALL at 0xb7faf7ff: munmap
test.py

这是 strace 的输出:

execve("/bin/ls", ["/bin/ls"], 0xbfef5e40 /* 45 vars */) = 0
brk(NULL)                               = 0x220c000
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7f00000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=89915, ...}) = 0
mmap2(NULL, 89915, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7eea000
close(3)                                = 0
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/lib/i386-linux-gnu/libselinux.so.1", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\0L\0\0004\0\0\0"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0644, st_size=169960, ...}) = 0
mmap2(NULL, 179612, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb7ebe000
mmap2(0xb7ee7000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x28000) = 0xb7ee7000
mmap2(0xb7ee9000, 3484, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xb7ee9000
close(3)                                = 0
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/lib/i386-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\1\1\1\3\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\20\220\1\0004\0\0\0"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=1942840, ...}) = 0
mmap2(NULL, 1948188, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb7ce2000
mprotect(0xb7eb7000, 4096, PROT_NONE)   = 0
mmap2(0xb7eb8000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1d5000) = 0xb7eb8000
mmap2(0xb7ebb000, 10780, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xb7ebb000
close(3)                                = 0
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/lib/i386-linux-gnu/libpcre.so.3", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\360\16\0\0004\0\0\0"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0644, st_size=480564, ...}) = 0
mmap2(NULL, 483512, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb7c6b000
mmap2(0xb7ce0000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x74000) = 0xb7ce0000
close(3)                                = 0
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/lib/i386-linux-gnu/libdl.so.2", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\320\n\0\0004\0\0\0"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0644, st_size=13796, ...}) = 0
mmap2(NULL, 16500, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb7c66000
mmap2(0xb7c69000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x2000) = 0xb7c69000
close(3)                                = 0
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/lib/i386-linux-gnu/libpthread.so.0", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\300P\0\0004\0\0\0"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=142820, ...}) = 0
mmap2(NULL, 123544, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb7c47000
mmap2(0xb7c62000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1a000) = 0xb7c62000
mmap2(0xb7c64000, 4760, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xb7c64000
close(3)                                = 0
mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7c45000
set_thread_area({entry_number=-1, base_addr=0xb7c45780, limit=0x0fffff, seg_32bit=1, contents=0, read_exec_only=0, limit_in_pages=1, seg_not_present=0, useable=1}) = 0 (entry_number=6)
mprotect(0xb7eb8000, 8192, PROT_READ)   = 0
mprotect(0xb7c62000, 4096, PROT_READ)   = 0
mprotect(0xb7c69000, 4096, PROT_READ)   = 0
mprotect(0xb7ce0000, 4096, PROT_READ)   = 0
mprotect(0xb7ee7000, 4096, PROT_READ)   = 0
mprotect(0x469000, 4096, PROT_READ)     = 0
mprotect(0xb7f2d000, 4096, PROT_READ)   = 0
munmap(0xb7eea000, 89915)               = 0

直到这里完全符合我自己的输出,但剩余的系统调用从未出现在我的程序中。这就是问题所在。我希望有人知道答案:P 如果您有任何问题,请提出!

set_tid_address(0xb7c457e8)             = 9767
set_robust_list(0xb7c457f0, 12)         = 0
rt_sigaction(SIGRTMIN, {sa_handler=0xb7c4baf0, sa_mask=[], sa_flags=SA_SIGINFO}, NULL, 8) = 0
rt_sigaction(SIGRT_1, {sa_handler=0xb7c4bb80, sa_mask=[], sa_flags=SA_RESTART|SA_SIGINFO}, NULL, 8) = 0
rt_sigprocmask(SIG_UNBLOCK, [RTMIN RT_1], NULL, 8) = 0
ugetrlimit(RLIMIT_STACK, {rlim_cur=8192*1024, rlim_max=RLIM_INFINITY}) = 0
uname({sysname="Linux", nodename="p200300D053D7310F22107AFFFE01D58C", ...}) = 0
statfs("/sys/fs/selinux", 0xbffeddb4)   = -1 ENOENT (No such file or directory)
statfs("/selinux", 0xbffeddb4)          = -1 ENOENT (No such file or directory)
brk(NULL)                               = 0x220c000
brk(0x222d000)                          = 0x222d000
brk(0x222e000)                          = 0x222e000
openat(AT_FDCWD, "/proc/filesystems", O_RDONLY|O_CLOEXEC) = 3
fstat64(3, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
read(3, "nodev\tsysfs\nnodev\trootfs\nnodev\tr"..., 1024) = 401
read(3, "", 1024)                       = 0
close(3)                                = 0
brk(0x222d000)                          = 0x222d000
access("/etc/selinux/config", F_OK)     = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib/locale/locale-archive", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=3365136, ...}) = 0
mmap2(NULL, 2097152, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7a45000
close(3)                                = 0
ioctl(1, TCGETS, {B38400 opost isig icanon echo ...}) = 0
ioctl(1, TIOCGWINSZ, {ws_row=48, ws_col=198, ws_xpixel=0, ws_ypixel=0}) = 0
openat(AT_FDCWD, ".", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_CLOEXEC|O_DIRECTORY) = 3
fstat64(3, {st_mode=S_IFDIR|0775, st_size=4096, ...}) = 0
getdents64(3, /* 3 entries */, 32768)   = 80
getdents64(3, /* 0 entries */, 32768)   = 0
close(3)                                = 0
fstat64(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 1), ...}) = 0
write(1, "test.py\n", 8test.py
)                = 8
close(1)                                = 0
close(2)                                = 0
exit_group(0)                           = ?
+++ exited with 0 +++


最佳答案

如果您寻找int 0x80,您将错过使用sysenter指令进行的正常32位系统调用(通常通过 glibc 调用 VDSO 页面)。 https://blog.packagecloud.io/eng/2016/04/05/the-definitive-guide-to-linux-system-calls/。 (同样在旧的 AMD CPU 上,32 位 syscall 也是可能的,如果它们太旧而无法支持 sysenter,则可能默认使用。)

我想早期的 ld.so 代码使用遗留的 int 0x80 机制而不是调用 VDSO。 (这是有道理的;VDSO 将自己呈现为 一个映射到内存中的 ELF 共享对象;直到动态链接器将函数指针设置到其中,它才能使用它。)

64 位模式更简单:对于 64 位 ABI,一切都使用 syscall


请注意,在指令执行之前或之后检查机器代码可能会被试图隐藏在您的跟踪之外的代码欺骗。第二个线程可能会在您查看机器代码字节后交叉修改它,在它执行之前。 (也许让一个线程存储一个标志,这将导致另一个线程在它注意到时立即存储。如果时机合适,这可能会在您的 ptrace 提取和执行下一个单步之间潜伏。)

strace(或沙盒/系统调用日志记录或代码过滤工具可能试图欺骗它)在 64 位模式下试图弄清楚是调用了 32 位还是 64 位 ABI(因为调用号不同)。 Can ptrace tell if an x86 system call used the 64-bit or 32-bit ABI?(或者曾经是,直到 Linux 内核 5.3 添加了 PTRACE_GET_SYSCALL_INFO)。

可以在 64 位代码中调用 int 0x80,即使它基本上不是一个好主意:What is the explanation of this x86 Hello World using 32-bit int 0x80 Linux system calls from _start? 有一些关于内核端发生的事情的细节系统调用。


同样,只有当您关心试图从您的跟踪器中混淆其事件的程序时,这才是一个问题,例如作为反调试措施。让另一个线程覆盖正在执行的代码不会偶然发生。但在设计调试/跟踪工具时需要注意这一点。

如果将此代码用作库,那么有人可能会尝试从中构建沙盒系统调用过滤器,那么真正的危险就来了。例如检查所有文件访问系统调用中的路径,或拒绝未以只读方式打开的 open 调用。那么逃避追踪就成了一个真正的安全问题。 (当然,一般来说还有更好的沙盒方法。)

关于python - 使用自己的 python 调试器获取系统调用时遇到问题,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/60439155/

相关文章:

python - 加速 Python 中的循环

python - 将 ahk 转换为 python

python - Tensorflow找不到GPU

linux - Linux在进行IO时如何管理内存

linux - 在linux命令行上创建串行邮件

python - 推特/通用分类训练语料库

linux - 如何将一个 docker 卷上的文件解压缩到另一个?

c++ - 为什么 C++ 中的这个 lambda 包含每个引用?

assembly - I386指令集中GRPL指令的作用是什么?

gcc - 无法从引导加载程序(NASM + GCC工具链)调用实模式C函数