Trying to deobfuscate x64 code. You can skim/skip most of this, since it mainly exists to prove I have tried and exhausted all avenues.
Burning Purpose
To massively automate the deobfuscation within a [memory dump] of a 64-bit Windows game.
Methods currently in use
PCRE driven byte replacement
Works fine with simple obfuscation like Variants 1 & 2
Original (Variant 1):
48 8D 64 24 F8 - lea rsp,[rsp-08] ; Stack -= 8
48 89 2C 24 - mov [rsp],rbp ; Push RBP
48 8D 2D 156A5A00 - lea rbp,[7FF749022784] ; Put JMP target in RBP
48 87 2C 24 - xchg [rsp],rbp ; Pop RBP (RBP restored)
48 8D 64 24 08 - lea rsp,[rsp+08] ; Stack += 8 (Balanced)
FF 64 24 F8 - jmp qword ptr [rsp-08] ; JMP (target)
Original (Variant 2):
48 89 6c 24 f8 - mov [rsp-0x8], rbp
48 8d 64 24 f8 - lea rsp, [rsp-0x8]
(rest as per Variant 1)
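Deobfuscated:
90 90 90 .. .. - (Variant 1: NOP * 9, Variant 2: NOP * 10)
90 90 - NOP * 2 ; Padding, so that the JMP ends where the LEA ended,
; which preserves the RIP of the next instruction and lets the
; original 4-byte displacement be reused unchanged
E9 ?? ?? ?? ?? - JMP NEAR
90 90 90 .. .. - NOP * 13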
De-obfuscation script:
#!/usr/bin/env sh
# |------------- 11 bytes--------| |-- 5 bytes--| |---------------- - 13 bytes---------|
# Signature: 48 8D 64 24 F8 48 89 2C 24 48 8D 2D ?? ?? ?? ?? 48 87 2C 24 48 8D 64 24 08 FF 64 24 F8 (29 bytes)
# Translate: 90 90 90 90 90 90 90 90 90 90 90 E9 ?? ?? ?? ?? 90 90 90 90 90 90 90 90 90 90 90 90 90
#
# |------------- 12 bytes ----------| |-- 5 bytes--| |---------------- - 13 bytes---------|
# Signature: 48 89 6c 24 f8 48 8d 64 24 f8 48 8d 2d ?? ?? ?? ?? 48 87 2c 24 48 8d 64 24 08 ff 64 24 f8 (30 bytes)
# Translate: 90 90 90 90 90 90 90 90 90 90 90 90 e9 ?? ?? ?? ?? 90 90 90 90 90 90 90 90 90 90 90 90 90
xxd -ps Game_Dumped.exe |
sed -e 's/\(..\)/\1 /g' |
# Delete newlines rather than turning them into spaces: sed already leaves a
# trailing space on every line, and a double space would break any match that
# straddles one of xxd's 30-byte output lines.
tr -d '\n' |
perl -p -e "s/48 8d 64 24 f8 48 \
89 2c 24 48 8d 2d (.. .. .. ..) \
48 87 2c 24 48 8d 64 24 08 ff 64 24 f8/90 90 90 90 90 \
90 90 90 90 90 90 e9 \1 90 90 90 90 90 90 90 90 90 90 \
90 90 90/g ; s/48 89 6c 24 f8 48 8d 64 24 f8 48 8d 2d \
(.. .. .. ..) 48 87 2c 24 48 8d 64 24 08 ff 64 24 \
f8/90 90 90 90 90 90 90 90 90 90 90 90 e9 \1 90 90 90 \
90 90 90 90 90 90 90 90 90 90/g" |
xxd -r -ps > Game_Dumped_NOP2.exe
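For comparison, here is a minimal sketch of the same rewrite done directly on raw bytes, assuming Python 3 run outside IDA. The helper names (`_sig`, `_to_jmp`, `deobfuscate`) are made up for illustration; the byte signatures are the two from the script above. Working on bytes avoids the hex-text round trip, and `re.DOTALL` is needed so that `.` also matches a 0x0A byte inside the displacement.

```python
import re

NOP = b"\x90"

# Shared 13-byte tail: xchg [rsp],rbp; lea rsp,[rsp+8]; jmp qword ptr [rsp-8]
TAIL = "48 87 2c 24 48 8d 64 24 08 ff 64 24 f8"


def _sig(prefix_hex, tail_hex=TAIL):
    """Compile prefix + 4 wildcard bytes (the disp32) + tail into a bytes regex."""
    return re.compile(
        re.escape(bytes.fromhex(prefix_hex))
        + rb"(.{4})"
        + re.escape(bytes.fromhex(tail_hex)),
        re.DOTALL,
    )


SIGNATURES = [
    _sig("48 8d 64 24 f8 48 89 2c 24 48 8d 2d"),     # Variant 1 (29 bytes)
    _sig("48 89 6c 24 f8 48 8d 64 24 f8 48 8d 2d"),  # Variant 2 (30 bytes)
]


def _to_jmp(match):
    """Replace one match with NOPs + E9 disp32 + NOPs, keeping the length."""
    disp = match.group(1)
    # The E9 opcode goes in the byte just before the displacement, so the
    # JMP ends exactly where the LEA ended and the disp32 stays valid.
    lead = match.start(1) - match.start() - 1
    tail = match.end() - match.end(1)
    return NOP * lead + b"\xe9" + disp + NOP * tail


def deobfuscate(data):
    """Return a copy of data with every Variant 1/2 sequence rewritten."""
    for sig in SIGNATURES:
        data = sig.sub(_to_jmp, data)
    return data
```

Usage would be along the lines of reading `Game_Dumped.exe` in binary mode, passing the contents to `deobfuscate`, and writing the result to a new file. Like the shell version, this matches byte patterns blindly and does not check that a match starts on an instruction boundary.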
The 3rd variety
143fe7a26 48 89 6c 24 f8 mov [rsp-8], rbp
143fe7a2b 48 8d 64 24 f8 lea rsp, [rsp-8]
143fe7a30 e9 60 71 8d ff jmp loc_1438beb95
1438beb95 48 8d 2d 10 b0 3a fd lea rbp, sub_jump_target
1438beb9c 48 87 2c 24 xchg rbp, [rsp]
1438beba0 48 8d 64 24 08 lea rsp, [rsp+8]
1438beba5 ff 64 24 f8 jmp qword ptr [rsp-8]
This is the same as Variant 2, except that it has been split into two sections joined by a jmp. The planned solution will involve distorm3's flow control flags, and rewriting jmp to 0x143fe7a26.
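The address arithmetic behind that plan can be sketched without any disassembler, assuming Python 3. The two helper names are hypothetical, and placing the final 5-byte jmp at 0x143fe7a26 (the start of the sequence) is one reading of the plan rather than a tested solution. The addresses and displacement bytes are taken from the listing above.

```python
import struct


def rel32_target(insn_ea, insn_len, disp_bytes):
    """Address referenced by a RIP-relative rel32/disp32 operand.

    The displacement is relative to the end of the instruction.
    """
    (disp,) = struct.unpack("<i", disp_bytes)
    return insn_ea + insn_len + disp


def encode_jmp_rel32(from_ea, to_ea):
    """Bytes of a 5-byte `jmp rel32` located at from_ea that lands on to_ea."""
    return b"\xe9" + struct.pack("<i", to_ea - (from_ea + 5))


# Step 1: follow the jmp that joins the two halves (e9 60 71 8d ff).
hop = rel32_target(0x143FE7A30, 5, bytes.fromhex("60718dff"))

# Step 2: resolve the real target from the lea (48 8d 2d 10 b0 3a fd).
target = rel32_target(0x1438BEB95, 7, bytes.fromhex("10b03afd"))

# Step 3: one direct jmp from the start of the sequence to that target.
patch = encode_jmp_rel32(0x143FE7A26, target)
```

Here `hop` comes out as 0x1438beb95, matching the second half of the listing, which is a cheap sanity check on the decoding before anything is patched.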
Variations on stack manipulation
48 89 E0 mov rax, rsp
48 05 F8 FF FF FF add rax, 0FFFFFFFFFFFFFFF8h ; Add
48 89 C4 mov rsp, rax
48 89 1C 24 mov [rsp], rbx
What a horrifically long way to decrement RSP by 8. By now I am fairly used to it, and it wouldn't bother me, except that I have just enabled the "Stack Pointer" display in IDA [6.8]'s General Options and realised that IDA isn't including lea rsp, [rsp-8] in its calculation of the stack pointer, which stops it from analysing the code properly.
RSP  Bytes                Disassembly
---  -------------------  -------------------------------
000  48 89 E0             mov rax, rsp
000  48 05 F8 FF FF FF    add rax, 0FFFFFFFFFFFFFFF8h
000  48 89 C4             mov rsp, rax          ; It tracked this
-20  48 89 1C 24          mov [rsp], rbx
-20  48 83 EC 20          sub rsp, 20h          ; and this
000  48 8B 41 10          mov rax, [rcx+10h]
000  48 89 4C 24 F8       mov [rsp-8], rcx
000  48 8D 64 24 F8       lea rsp, [rsp-8]      ; but not this
000  48 8B 1C 24          mov rbx, [rsp]
I am also starting to suspect that there are going to be many permutations of all these techniques, and I need to start addressing the problem in IDA.
The problem is that the only sample source I can find uses low-level IDAPython idaapi functions, so the code is ridiculously long. And because I am replacing a 5-byte instruction with a 4-byte one, the leftover byte decodes as a stray instruction that I cannot find a way to alter. (Fortunately, in this case, it's just CLC.)
Update: I have fixed this issue, and the solution has drastically reduced the size of my script. The pertinent fix is below:
def replace_pattern(ea):
    search = [0x48, 0x8d, 0x64, 0x24, 0xf8]
    replace = [0x48, 0x83, 0xec, 0x08, 0x90]
    current = []
    for i in xrange(5):
        current.append(idaapi.get_byte(ea+i))
    if 0 == cmp(search, current):
        for i in xrange(5):
            # fixed: replace put_byte with patch_byte
            idaapi.patch_byte(ea+i, replace[i])
        return 1
    return 0
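Since more permutations are likely, the same fix can be generalised into a table of rules. The following is a minimal sketch, assuming Python 3 and a plain `bytearray` instead of the IDA database so that it can be tried outside IDA. `RULES` and `apply_rules` are hypothetical names, and the second rule (`lea rsp,[rsp+8]` to `add rsp,8`) is an added illustration, not something from the script above. Padding every replacement with NOPs up to the pattern length is what avoids the stray CLC. One caveat: `sub` and `add` modify the flags while `lea` does not, so the rewrite is only safe where the flags are dead.

```python
NOP = 0x90

# (search, replace) pairs. A replacement shorter than its pattern is padded
# with NOPs so that no leftover byte decodes as a stray instruction.
RULES = [
    # lea rsp,[rsp-8] -> sub rsp,8
    (bytes.fromhex("48 8d 64 24 f8"), bytes.fromhex("48 83 ec 08")),
    # lea rsp,[rsp+8] -> add rsp,8 (illustrative extra rule)
    (bytes.fromhex("48 8d 64 24 08"), bytes.fromhex("48 83 c4 08")),
]


def apply_rules(buf, rules=RULES):
    """Patch every occurrence of each pattern in place.

    buf must be a bytearray. Returns the number of patches applied.
    """
    count = 0
    for search, replace in rules:
        if len(replace) > len(search):
            raise ValueError("replacement is longer than the pattern")
        padded = replace + bytes([NOP]) * (len(search) - len(replace))
        pos = buf.find(search)
        while pos != -1:
            buf[pos:pos + len(padded)] = padded
            count += 1
            pos = buf.find(search, pos + len(padded))
    return count
```

Inside IDA the slice assignment would be replaced by a loop over `idaapi.patch_byte`, as in the fix above, and the search would ideally be restricted to instruction starts rather than raw byte offsets.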
[original code] Incidentally, the example code was written by our own Rolf Rolles
import idaapi
import idc
# Planned task: replace
# 48 8d 64 24 f8 lea rsp,[rsp-0x8]
# with
# 48 83 ec 08 sub rsp,0x8
# 90 nop
#
# Actual result:
# Replaced: 48 8d 64 24 f8 lea rsp,[rsp-0x8]
# with:     48 83 ec 08 sub rsp,0x8
#           f8 clc
#
# Verdict: close enough, but way too much code involved.
def match_pattern(ea):
    search = [0x48, 0x8d, 0x64, 0x24, 0xf8]
    replace = [0x48, 0x83, 0xec, 0x08, 0x90]
    current = []
    for i in xrange(5):
        current.append(idaapi.get_byte(ea+i))
    if 0 == cmp(search, current):
        return 1
    return 0
# Note: I thought I might be able to simply rewrite
# at a byte level, but it threw an exception.
#
# for i in xrange(4):
# idaapi.put_byte(ea+i, replace[i])
class deobfu_hook(idaapi.IDP_Hooks):
    def __init__(self):
        idaapi.IDP_Hooks.__init__(self)
        self.n = idaapi.netnode("$ X86 Deobfuscator Modifications",0,1)
    def custom_ana(self):
        # Check first two bytes "by hand" for speed
        b = idaapi.get_byte(idaapi.cmd.ea)
        if b == 0x48: # First byte
            b = idaapi.get_byte(idaapi.cmd.ea+1)
            if b == 0x8d: # Second byte
                # Discard speed, do a full match
                # (match_pattern takes only the address)
                if match_pattern(idaapi.cmd.ea):
                    # If matched, supply all required values for
                    # SUB RSP,8 - Surely there is an easier way!
                    idaapi.cmd.itype = 0xd1
                    idaapi.cmd.size = 4
                    idaapi.cmd.auxpref = 0x1810
                    idaapi.cmd.segpref = 0
                    idaapi.cmd.insnpref = 0x48
                    idaapi.cmd.flags = 2
                    idaapi.cmd.Op1.type = 1
                    idaapi.cmd.Op1.offb = 0
                    idaapi.cmd.Op1.offo = 0
                    idaapi.cmd.Op1.flags = 8
                    idaapi.cmd.Op1.dtyp = 7
                    idaapi.cmd.Op1.reg = 4
                    idaapi.cmd.Op1.phrase = 4
                    idaapi.cmd.Op1.value = 0
                    idaapi.cmd.Op1.addr = 0
                    idaapi.cmd.Op1.specval = 0
                    idaapi.cmd.Op1.specflag1 = 0
                    idaapi.cmd.Op1.specflag2 = 0
                    idaapi.cmd.Op1.specflag3 = 0
                    idaapi.cmd.Op1.specflag4 = 0
                    idaapi.cmd.Op2.type = 5
                    idaapi.cmd.Op2.offb = 3
                    idaapi.cmd.Op2.offo = 0
                    idaapi.cmd.Op2.flags = 8
                    idaapi.cmd.Op2.dtyp = 7
                    idaapi.cmd.Op2.reg = 0
                    idaapi.cmd.Op2.phrase = 0
                    idaapi.cmd.Op2.value = 8
                    idaapi.cmd.Op2.addr = 0
                    idaapi.cmd.Op2.specval = 0
                    idaapi.cmd.Op2.specflag1 = 0
                    idaapi.cmd.Op2.specflag2 = 0
                    idaapi.cmd.Op2.specflag3 = 0
                    idaapi.cmd.Op2.specflag4 = 0
                    return True
        return False
class deobfu_t(idaapi.plugin_t):
    flags = idaapi.PLUGIN_PROC | idaapi.PLUGIN_HIDE
    comment = "Deobfuscator"
    wanted_hotkey = ""
    help = "Runs transparently"
    wanted_name = "deobx86"
    hook = None
    def init(self):
        self.hook = None
        self.hook = deobfu_hook()
        self.hook.hook()
        print("deobfu init")
        return idaapi.PLUGIN_KEEP
    def run(self, arg):
        pass
    def term(self):
        print("deobfu term")
        if self.hook:
            self.hook.unhook()
def PLUGIN_ENTRY():
    print("PLUGIN_ENTRY:deobfu")
    return deobfu_t()
Where do you want to go today?
I want a better solution, and I am not afraid to code it. I am, however, in need of some starting advice, and I need to make sure I am not re-coding the wheel here.
I have since written some other IDAPython code using the higher level idautils to create call trees, and collate xrefs and such. But I don't know how to rewrite the actual disassembled code at that level. There is one example in the IDAPython repo: https://github.com/pfalcon/idapython/blob/master/examples/ex_idphook_asm.py but that's
- Full of silly bugs (I fixed them)
- Hooks the Assembly command, not the disassembly process
I have reviewed the answer to a similar question at Creating IDA Pro debugger plugins - API documentation and examples?. I have looked at various examples of quite nice IDAPython code that can:
But, I have seen nothing about actually changing instructions.
I have not purchased The IDA Pro Book because I do not live in the U.S. and I do not want to wait n weeks for on-demand printing and delivery. I am not averse to writing an .idc, as I'm quite familiar with C (more so than with Python), although I suspect that despite the shallower learning curve (and presumably ready examples) it would be harder in the long term than using higher-level IDAPython code. (I'm learning Python as I go, but .. well, didn't we all?)
Because the code I am working with is exclusively 64-bit, there are few (basically no) pre-existing deobfu samples or code out there.
And so here I find myself, asking for your patient guidance. (Very patient if you actually read all of that).
PS: I took the time to document everything I had done, because I know how little respect we all have for people who don't even attempt something before hollering for help.
PPS: OMG 2nd most avid member of RE is Igor Skochinsky, I can only bow in humility.
patch_byte instead of put_byte. – Orwellophile Jul 18 '16 at 17:03