add my code and shit
141
README.md
@@ -152,6 +152,11 @@ The actual code in the test case is boring and doesn't matter. I'm sure there
are disagreements on whether I've picked good candidates of "exotic" or new
instructions, but those were the first that came to mind.

(It's also doubtful whether you'll ever encounter functions where the first
instructions are of this category, because most probably there's some setup
needed before, e.g. checking that addresses are aligned, initializing loop
counters, yadda, yadda.)

Test case: loop and TailRec
===========================

@@ -235,6 +240,133 @@ Which isn't right and will crash horribly.
|     MHook|     |      |      X     |   |      |    |       |
+----------+-----+------+------------+---+------+----+-------+

As expected, nothing could correctly hook the loop. In fact I had to comment out
those parts because even Catch2 couldn't recover from the crashes generated by
the botched hooks. Some hooking engines are a bit lacking in their support for
newer instruction sets, but a simple update of the disassembler library should
fix that.

I was pleasantly surprised by MinHook, both by the general API and because it
managed to build a trampoline that worked perfectly even for the tail
recursion case. I'd recommend it, even though it seems there's no chance that
the disassembler will ever be updated.

Detecting tail recursive functions / loops into overwritten code
================================================================

Back in 2015 I wanted to write my own hooking engine which would be able to
hook ALL THE FUNCTIONS! And I did actually start to write it and then
abandoned it before I got to the interesting part. However, since then I've had
the basic idea down:

1) Find out how long the function is
2) Analyze it, by checking whether some jump could jump into the overwritten
   instructions
3) Somehow fix that

Fixing that code probably means putting the whole function in the trampoline,
since by definition there is no room in place for the additional/longer
instructions.

However, I think that hooking engines should at least fail fast if they can't
hook that function and give the user the ability to handle that error at that
stage instead of waiting for unpredictable crashes. I'll post example code
[here](https://git.free-hack.com/wacked/x64hook) and outline the general
technique below.

(My x64hook hooking engine doesn't work. There are literally two interesting
functions in it, and I give pseudocode for them below.)

Estimate the length of a function
---------------------------------

Note: This is an estimation of the function length. There are various ways to
go about it; one would be to search for the prologue and epilogue, which would
fail for all functions that -- for whatever reason -- don't have them. I'm sure
this way isn't perfect either, but maybe it could be used as another source of
information[5].

Over the years I've seen various attempts at estimating the function length.
One of the top hits in my Google history is a question on Stack Overflow[3]
which uses the same technique that I've seen in various malware strains:
checking byte by byte until the RET opcode is found. That won't work if any of
the following holds:

1) The `RET imm16` opcode is used, which is often the case for `__stdcall` funcs.
2) There are multiple returns
3) The function doesn't actually return with the RET instruction. For example,
   if a function A at its end calls another function B, with A and B sharing the
   same parameters and either A or B not modifying the stack pointer, it is
   perfectly possible to just jump to function B. Execution will continue in B,
   which ends with a normal RET.
4) The value 0xC3 appears for some other reason in the function.

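To make problems 1) and 4) concrete, here is a small sketch of the byte-scanning approach run against two hand-assembled x86 snippets. It's in Python only because that is quick to run; `naive_length` is a name made up for this example and isn't taken from the Stack Overflow answer or from any malware.

```python
def naive_length(code: bytes) -> int:
    """Scan byte by byte and stop right after the first 0xC3 (RET)."""
    return code.index(0xC3) + 1

# mov eax, 0xC3 ; ret   ->  B8 C3 00 00 00 C3   (6 bytes in total)
mov_then_ret = bytes([0xB8, 0xC3, 0x00, 0x00, 0x00, 0xC3])

# xor eax, eax ; ret 8  ->  31 C0 C2 08 00      (5 bytes, no 0xC3 anywhere)
stdcall_ret = bytes([0x31, 0xC0, 0xC2, 0x08, 0x00])

# Problem 4): the scan stops inside the MOV immediate, not at the real RET.
print(naive_length(mov_then_ret), "bytes reported,", len(mov_then_ret), "bytes actual")

# Problem 1): there is no 0xC3 at all, so a scan over real memory would
# simply run on into whatever follows the function.
try:
    naive_length(stdcall_ret)
except ValueError:
    print("no 0xC3 found inside the function")
```
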
Problem 4) can be easily solved by using a length disassembler engine and just
checking the actual opcode byte of each instruction. 1) and 3) aren't that hard
either, you'll just need to check for some additional opcodes. What about 2)?

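The "additional opcodes" are a short list. The sketch below assumes a length disassembler has already skipped any prefixes and handed over the primary opcode byte, plus the ModRM byte where there is one. The function names are invented for this example; the opcode values are the standard x86 encodings.

```python
from typing import Optional

RET_OPCODES = {
    0xC3,  # RET        (near)
    0xC2,  # RET imm16  (near, pops imm16 extra bytes; typical for __stdcall)
    0xCB,  # RETF       (far)
    0xCA,  # RETF imm16 (far)
}

def is_return(opcode: int) -> bool:
    return opcode in RET_OPCODES

def is_unconditional_jump(opcode: int, modrm: Optional[int] = None) -> bool:
    if opcode in (0xE9, 0xEB):      # JMP rel32 / JMP rel8
        return True
    if opcode == 0xFF and modrm is not None:
        reg = (modrm >> 3) & 0b111  # ModRM.reg picks the operation in group FF
        return reg in (4, 5)        # /4 = JMP r/m, /5 = JMP far
    return False

print(is_return(0xC2))                     # ret 8
print(is_unconditional_jump(0xFF, 0xE0))   # jmp rax  (FF /4)
print(is_unconditional_jump(0xFF, 0xD0))   # call rax (FF /2), not a jump
```
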
The key insight I had was why a function might have multiple returns -- because
it needed to do additional work in some cases. Which meant that there had to be
branching, to sometimes skip some instructions or get to them.

If there is a branch backwards it's a loop. But a branch forwards means that
the function extends at least up to there[4]. Or in pseudocode:

    offsetOfInstr = 0     // offset of the current instruction from the function start
    funcLen = 0           // estimated function length so far
    furthestJump = 0      // furthest forward branch target seen so far

    while(can disassemble next instruction)
    {
        op = getOpcode(instruction);
        if(is_jump(op))
        {
            // branch target, relative to the function start
            off = get_jump_offset(instruction);
            if(off > furthestJump)
                furthestJump = off;
        }

        // the function includes at least the current instruction
        funcLen = offsetOfInstr + length(instruction);

        if(is_end_of_function(op, furthestJump, offsetOfInstr))
        {
            break;
        }

        offsetOfInstr = funcLen;
    }

    bool is_end_of_function(opc, furthestJump, instrOffset)
    {
        // a RET only ends the function if no earlier branch jumps past it
        if(opc == RET && furthestJump <= instrOffset)
            return true;
        else if(opc == UD_Ijmp)
        {
            // unconditional jump to an immediate or register: treated as a tail call
            if(destination is IMM || destination is register)
                return true;
        }

        return false;
    }

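A runnable sketch of the same heuristic, under a big assumption: the instructions have already been decoded by some disassembler into `(length, kind, target)` records. The `Insn` type and `estimate_length` are names made up for this example, not part of x64hook or udis86. As in the pseudocode, every unconditional `jmp` is taken to be a tail call, which is a simplification.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

@dataclass
class Insn:
    length: int                   # size of the instruction in bytes
    kind: str                     # "ret", "jmp" (unconditional), "jcc" or "other"
    target: Optional[int] = None  # branch target relative to the function start

def estimate_length(insns: Sequence[Insn]) -> int:
    offset = 0     # offset of the current instruction
    furthest = 0   # furthest forward branch target seen so far
    for insn in insns:
        if insn.kind in ("jmp", "jcc") and insn.target is not None:
            furthest = max(furthest, insn.target)
        end = offset + insn.length
        # A RET only ends the function if no earlier branch jumps past it.
        if insn.kind == "ret" and furthest <= offset:
            return end
        # Heuristic from the pseudocode: an unconditional JMP is a tail call.
        if insn.kind == "jmp":
            return end
        offset = end
    return offset  # ran out of instructions without finding an end

# A function with an early return:
#    0: test edi, edi      7: mov eax, 1
#    2: jz   7            12: ret
#    4: xor  eax, eax
#    6: ret
two_returns = [
    Insn(2, "other"), Insn(2, "jcc", target=7), Insn(2, "other"),
    Insn(1, "ret"), Insn(5, "other"), Insn(1, "ret"),
]
print(estimate_length(two_returns))  # the RET at offset 6 is skipped over
```
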
Detecting loops to the start of a function
------------------------------------------

    firstJumpOffset = MAX_INT
    foreach(instruction in function)
        if(instruction is a jump)
            jumpOffset = getOffset(instruction) // relative to function start

            /* jumps to exactly the start of a function are fine, since that is
               where our overwritten code starts. Thus it doesn't jump into the
               middle of an instruction */
            if(jumpOffset == 0)
                continue

            if(jumpOffset < firstJumpOffset)
                firstJumpOffset = jumpOffset;

    return firstJumpOffset < lengthNeededForHook
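
The check above, combined with the fail-fast idea, might look roughly like this. It assumes the branch targets (relative to the function start) have already been collected by a disassembler pass. All names are invented for this sketch. One deliberate difference from the pseudocode: targets before the function start are ignored here, whereas the pseudocode would flag them.

```python
from typing import Iterable

class UnhookableFunction(Exception):
    """Raised instead of installing a hook that would crash later."""

def jumps_into_overwritten_code(branch_targets: Iterable[int], hook_len: int) -> bool:
    # Target 0 is the function start, where the hook itself begins, so it is safe.
    return any(0 < target < hook_len for target in branch_targets)

def check_hookable(branch_targets: Iterable[int], hook_len: int) -> None:
    if jumps_into_overwritten_code(branch_targets, hook_len):
        raise UnhookableFunction(
            "a branch lands inside the first %d bytes" % hook_len)

# A loop branching back to offset 2 cannot take a 5-byte JMP rel32 hook...
loop_targets = [2]
# ...while tail recursion to offset 0 and a forward branch to offset 40 are fine.
tailrec_targets = [0, 40]

check_hookable(tailrec_targets, hook_len=5)  # passes silently
try:
    check_hookable(loop_targets, hook_len=5)
except UnhookableFunction as err:
    print("refusing to hook:", err)
```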
------------

[1] This is one of the things that could easily be improved, but haven't been
because I just couldn't motivate myself. Putting the data right after the func
meant that a section containing code needed to be writable. Which is bad. Also
@@ -246,3 +378,12 @@ compiled into. That would be read only data. Oh well.

[2] And Microsoft decided at some point to make it even easier for their code
with the advent of hotpatching.

[3] https://stackoverflow.com/questions/8705215/get-the-size-length-of-a-c-function

[4] With some caveats, e.g. one could assume that no function is longer than
512 bytes. And obviously keeping in mind point 3.

[5] Another heuristic would be to check for the next slide of filler
instructions, such as INT3 or NOP. Some compilers align functions on 16-byte
boundaries and fill the gaps with those.

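The heuristic from [5] could be sketched like this. `next_filler_run` and the `min_run` threshold are assumptions made for the example. The scan is byte-based, so 0x90 or 0xCC inside an immediate can still fool it, which is why it is only an additional source of information.

```python
from typing import Optional

FILLER = {0xCC, 0x90}  # INT3, NOP

def next_filler_run(code: bytes, start: int = 0, min_run: int = 2) -> Optional[int]:
    """Return the offset of the first run of at least min_run filler bytes."""
    run_start = None
    for i in range(start, len(code)):
        if code[i] in FILLER:
            if run_start is None:
                run_start = i
            if i - run_start + 1 >= min_run:
                return run_start
        else:
            run_start = None
    return None

# xor eax, eax ; ret, then INT3 padding up to the next 16-byte boundary
padded = bytes([0x31, 0xC0, 0xC3]) + bytes([0xCC] * 13)
print(next_filler_run(padded))  # offset where the padding starts
```
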