Fandom

Magic Lantern Firmware Wiki

IDAPython/Code paths

< IDAPython

328pages on
this wiki
Add New Page
Talk0 Share

Ad blocker interference detected!


Wikia is a free-to-use site that makes money from advertising. We have a modified experience for viewers using ad blockers

Wikia is not accessible if you’ve made further modifications. Remove the custom ad blocker rule(s) and the page will load as expected.

Remember the C# compiler message: not all code paths return a value?

So what's a code path?

A path is a unique sequence of branches from the function entry to the exit [1]. Any module with a succession of n decisions in it can have up to 2^n paths within it [2].

The basicsEdit

In order to perform static analysis of a function, it's useful to identify the code paths. This can be done by considering jumps (especially conditional ones).

Conditional suffixes in ARM ASM are [here]. You might have noticed that each suffix has an opposite one, like EQ -- NE or LS -- GE.

A conditional branch looks like this:

CMP     R0, #0x1F
LDREQ   R0, [R4,#4]
CMPEQ   R0, #0
BNE     a
BL      b

which is equivalent to (in Python-like pseudocode):

if R0 == 0x1F:
    R0 = RAM(R4 + 4)
    if R0 != 0:
        goto a
b()

Other example:

CMP R0, 0
BNE a
BEQ b

This is an if-then-else construct.

All conditional suffixes (EQ, NE etc.) look at the flags to decide if a branch is taken or not. Flags are changed by CMP, CMN, TST, TEQ and MSR (afaik), and also by the instructions with the S suffix. If you know more instructions which change the flags, please edit.

So, when one of those instructions is encountered, we have another branch.

Code is here. It's a mess, so if you find bugs, please report them here.

A code path is stored as a list of RAM addresses (long integers) and is built recursively by taking (or not taking) each conditional branch.

To test this, GraphViz can create a flow chart from all the code paths:

Static

Tricky casesEdit

Complex branch conditionsEdit

A single "if" and two conditions which are not mutually exclusive: MI (opposite of PL) and CS (opposite of CC). I think both can be true or false or one true/other false simultaneously, so in this case there are 4 possible paths from the same point.

ROM:FF44F818                 MOVS    R2, R2,LSL#31
ROM:FF44F81C                 LDRMIB  R2, [R1],#1
ROM:FF44F820                 LDRCSB  R3, [R1],#1
ROM:FF44F824                 LDRCSB  R12, [R1],#1
ROM:FF44F828                 STRMIB  R2, [R0],#1
ROM:FF44F82C                 STRCSB  R3, [R0],#1
ROM:FF44F830                 STRCSB  R12, [R0],#1

Solution: when considering which branch to take, instead of saying "take the MI branch", it's better to say "take the MI or CS branch" (without limiting the number of non-opposite branch conditions). The list of conditions should be reset when encountering an instruction which changes the flags.

  • if you understand this and are native English speaker, please rephrase it

LoopsEdit

Those are really tricky. They can be found easily (in theory) by detecting a jump to a visited address (instruction).

Trick: there can be visited instructions which are not executed (because of the opposite branch condition. So there must be two lists: one with executable instructions (the actual code path) and another one with visited addresses (which I have to fix in the implementation).

Also on Fandom

Random Wiki