c - Trying to call a C function from glibc in an assembly program (64-bit)

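I have been working through Assembly Language Step by Step: Third Edition and am on the last chapter, "Heading Out to C". I am trying to find a consistent way to convert the 32-bit code, which calls the C library (glibc) function puts, so that it runs on my 64-bit Ubuntu system. (I would like to continue with the last 50 pages of the text, which probably wade deeper into C [more geek puns], but from an assembly basis written in 32-bit code.) The code is: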

SECTION .data           ; Section containing initialised data
EatMsg: db "Eat at Joe's!",0

SECTION .text           ; Section containing code
extern puts             ; Simple "put string" routine from clib
global main             ; Required so linker can find entry point
main:
        push ebp        ; Set up stack frame for debugger
        mov ebp,esp
        push ebx        ; Must preserve ebp, ebx, esi, & edi
        push esi
        push edi

;;; Everything before this is boilerplate; use it for all ordinary apps!
        push EatMsg     ; Push address of message on the stack
        call puts       ; Call clib function for displaying strings
        add esp,4       ; Clean stack by adjusting ESP back 4 bytes

;;; Everything after this is boilerplate; use it for all ordinary apps!
        pop edi         ; Restore saved registers
        pop esi
        pop ebx
        mov esp,ebp     ; Destroy stack frame before returning
        pop ebp
        ret             ; Return control to Linux

The suggested nasm and linker commands are:

nasm -f elf -g -F stabs eatclib.asm
gcc eatclib.o -o eatclib

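The closest thing to a solution I have found is: Call C functions from 64-bit assembly.

I have tried converting to the extended registers (rbp, rsp, and so on), adjusting the stack pointer by 8 bytes instead of 4 after the call to puts, and changing the makefile to use: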

nasm -f elf64 -g -F dwarf eatclib.asm

and

gcc eatclib.o -o eatclib -m64 -static

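but I get a segmentation fault.

My understanding of the C calling conventions is still so nebulous and fragile that, although I tried using the gdb debugger, I did not get very far in tracking down the error. (The problem is that I am only somewhat familiar with the 32-bit convention and not with C at all.) The book is meant as an introduction for novice assembly programmers with little or no C background.

Coming at it from the other direction, a simple C program that calls puts with a string produces this file (using the gcc -S option):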

.file   "SayHello.c"
        .text
        .section        .rodata
        .align 8
.LC0:
        .string "This is based on an example from C Primer Plus"
        .text
        .globl  main
        .type   main, @function
main:
.LFB0:
        .cfi_startproc
        pushq   %rbp
        .cfi_def_cfa_offset 16
        .cfi_offset 6, -16
        movq    %rsp, %rbp
        .cfi_def_cfa_register 6
        leaq    .LC0(%rip), %rdi
        call    puts@PLT
        movl    $0, %eax
        popq    %rbp
        .cfi_def_cfa 7, 8
        ret

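The compiled code runs, although I do not understand the .cfi directives, what .rodata means, or why gas appended @PLT to puts. This is of course gas syntax, whereas the text I am working from uses mostly NASM.

I also tried using the loader (ld) instead of gcc, and found this line on page 89 of Professional Assembly Language by Richard Blum: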

ld -dynamic-linker /lib/ld-linux.so.2 -o eatclib -lc eatclib.o

but I ended up with a very typical linker error that I have run into before:

ld: i386 architecture of input file `eatclib.o' is incompatible with i386:x86-64 output
ld: warning: cannot find entry symbol _start; defaulting to 0000000000400250
makefile:2: recipe for target 'eatclib' failed

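I tried passing the -m32 option to the linker, but that did not help either.

In any case, I am looking for actionable advice. In my searches I have seen examples where people suggest using apt-get to install new (really old) libraries, but these seem to effectively break 64-bit things system-wide, even though I have already been able to run earlier 32-bit code by passing the -melf_i386 option to the linker.

Best answer

To assemble and link 64-bit nasm code that uses libc, type: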

nasm -f elf64 program.asm
gcc -o program program.o

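Depending on your system and programming style, you may need to pass -no-pie to gcc so that it accepts position-dependent code.

Invoking the linker directly is not recommended when linking against libc, because there is no stable way to pull in the C runtime initialization code by hand. Just passing -lc to the linker is not enough to make libc work properly.

Note that elf64 makes nasm emit a 64-bit object file. gcc produces 64-bit code on 64-bit platforms unless told otherwise, so no other options are needed. You may want to add debugging symbols, but keep in mind that stabs is an obsolete format. You probably want this: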

nasm -f elf64 -gdwarf program.asm

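Converting the source code mechanically is more or less possible. Keep the following differences in mind:

Pointers and stack slots are 8 bytes long, and all general-purpose registers have been extended to 8 bytes; the 64-bit variants of the first 8 registers are called rax, rcx, rdx, rbx, rsp, rbp, rsi, and rdi.
There are 8 new general-purpose registers, r8 through r15. Their 32-bit, 16-bit, and 8-bit versions are called r8d, r8w, r8b, and so on.
SSE instructions are used for floating point instead of x87 instructions.
64-bit code generally follows a different calling convention than 32-bit code. On UNIX-like systems such as Linux, the amd64 SysV ABI is generally used. In this ABI, scalar arguments are passed from left to right in the registers rdi, rsi, rdx, rcx, r8, and r9. The registers rbx, rbp, rsp, r12, r13, r14, and r15 must be preserved by the callee; all other general-purpose registers may be overwritten freely. Floating-point arguments are passed and returned in SSE registers. If there are too many arguments, the extra ones are passed on the stack.
The SysV ABI requires the stack pointer to be aligned to 16 bytes at the point of a function call. Since the call instruction pushes 8 bytes and the push rbp instruction in the function prologue pushes another 8 bytes, this is the case by default unless you manually allocate space on the stack. Remember to do so in increments of 16 bytes.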
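
As a rough illustration of the points above, here is a minimal sketch that calls printf with four arguments and keeps a counter in a callee-saved register across the calls. The labels Fmt and Msg, the hard-coded length 13, and the choice of rbx are illustrative assumptions, not taken from the book or the question. It should build with the nasm and gcc commands shown earlier, possibly with -no-pie.

```nasm
        SECTION .data
Fmt:    db "%s (%d characters, call #%d)", 10, 0
Msg:    db "Eat at Joe's!", 0

        SECTION .text
        extern printf
        global main
main:                           ; on entry rsp = 16n + 8 (call pushed the return address)
        push rbp                ; rsp = 16n
        mov rbp, rsp
        push rbx                ; rbx is callee-saved, so save it before using it (rsp = 16n + 8)
        sub rsp, 8              ; padding: rsp is a multiple of 16 again before any call

        mov ebx, 1              ; call counter, kept in rbx because printf must preserve it
.again:
        lea rdi, [rel Fmt]      ; 1st argument: format string
        lea rsi, [rel Msg]      ; 2nd argument: string for %s
        mov edx, 13             ; 3rd argument: length of Msg (hard-coded assumption)
        mov ecx, ebx            ; 4th argument: call counter
        xor eax, eax            ; variadic call: al = number of SSE registers used (none)
        call printf
        inc ebx
        cmp ebx, 2
        jle .again              ; call printf twice to show that rbx survives the call

        xor eax, eax            ; return 0 from main
        add rsp, 8              ; drop the padding
        pop rbx                 ; restore the callee-saved register
        pop rbp
        ret
```

Nothing is pushed for the arguments, so there is nothing to clean up after the call; the only stack adjustment is the 8 bytes of padding that keep rsp aligned once rbx has been pushed.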

Here is the code from your question translated to 64-bit code. All changes are marked:

        SECTION .data
EatMsg: db "Eat at Joe's!",0

        SECTION .text
        extern puts
        global main
main:                           ; function entry (stack alignment: 16 bytes + 8 bytes)
        push rbp                ; setup...
        mov rbp, rsp            ; the stack frame (stack now aligned to 16 bytes + 0 bytes)

                                ; since we have so many registers, I only preserve those
                                ; I want to use and that must be preserved, of which there
                                ; are none in this program.

        lea rdi, [rel EatMsg]   ; load address of EatMsg into rdi
        call puts               ; call puts
                                ; no cleanup needed as we have not pushed anything

        pop rbp                 ; restore rbp
        ret                     ; return

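Note that I left out a bunch of boilerplate. lea is used to load the address of EatMsg instead of the simpler mov rdi, EatMsg so that your program is position-independent. If you do not know what that means, you can safely ignore this tidbit and come back to it later.

Finally, you can usually ignore the cfi directives. They add metadata for exception handling, which only matters if your code calls C++ functions that throw exceptions. They do not change the behavior of the code itself.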
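
To make the position-independence remark more concrete, here is a sketch of the same program written so that it should link as a position-independent executable without -no-pie. It relies on two NASM features that the code above does not use, the default rel directive and the wrt ..plt operator; the position-dependent alternative appears only as a comment. Treat it as one possible variant under those assumptions rather than the required form.

```nasm
        default rel             ; make [EatMsg] RIP-relative without writing "rel" each time

        SECTION .data
EatMsg: db "Eat at Joe's!",0

        SECTION .text
        extern puts
        global main
main:
        push rbp
        mov rbp, rsp

        lea rdi, [EatMsg]       ; RIP-relative: no absolute address is stored in the code
        ; mov rdi, EatMsg       ; position-dependent alternative: embeds a 64-bit absolute address
        call puts wrt ..plt     ; call through the PLT, the NASM counterpart of gas's puts@PLT

        xor eax, eax            ; return 0 from main
        pop rbp
        ret
```

The wrt ..plt form is what the puts@PLT in the gcc -S output corresponds to: the call goes through the procedure linkage table so that the dynamic linker can resolve puts at run time.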

Source: the Stack Overflow question "c - Trying to call a C function from glibc in an assembly program (64-bit)": https://stackoverflow.com/questions/53982670/
