1. Objectives

Boss: AI has already done the work of code restoration. Do we still need to write a tutorial for code restoration?

Me: Of course we do. AI is AI: a batch assembly line. How could it be as cool as my purely handwritten code? My code has warmth.

A famous person once said: you can never make money beyond your cognitive scope. Likewise, you can’t direct AI to do work beyond your cognitive scope.

2. Steps

2. Conditional breakpoints 3. Data printing

We want to know where the value x4=0xdd89ca68 comes from when the program reaches 0x1170. To find its origin, we need to start tracing from the code that runs before it.

Searching upwards, I found a conditional check at 0x1144, which suggests that several loop iterations have already run before this point.

.text:0000000000001144 7F 01 08 EB                 CMP             X11, X8
.text:0000000000001148 88 CC FF 54                 B.HI            loc_AD8

To keep the analysis simple, I only want to trace the code of the last loop iteration, so a conditional breakpoint is needed.

Before setting the conditional breakpoint, we first print the values of X11 and X8.

debugger.addBreakPoint(module.base + 0x1144, new BreakPointCallback() {
    @Override
    public boolean onHit(Emulator<?> emulator, long address) {
        Arm64RegisterContext ctx = emulator.getContext();
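        // getXInt reads the register as a 32-bit int, which is enough for the small values seen here
        int iX11 = ctx.getXInt(11);
        int iX8 = ctx.getXInt(8);

        System.out.printf("X11 = %d , X8 = %d\n", iX11, iX8);

        // Returning true lets execution continue instead of stopping in the debugger
        return true;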
    }
});

Run it.

X11 = 64 , X8 = 64

Strangely, it is printed only once, which means this is not a loop, or at least not a real one.
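One way to read the check at 0x1144 is the minimal C sketch below. It assumes the usual AArch64 meaning of B.HI after CMP (branch only when the first operand is strictly greater than the second, compared as unsigned values). The name branch_taken_hi is made up for illustration and does not come from the target library.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper modelling "CMP X11, X8" followed by "B.HI loc_AD8".
 * The branch is taken only when X11 > X8 as unsigned 64-bit values. */
bool branch_taken_hi(uint64_t x11, uint64_t x8)
{
    return x11 > x8;
}
```

With the printed values (X11 = 64, X8 = 64) the helper returns false, so execution falls through instead of jumping to loc_AD8. That fits a breakpoint that fires only once.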

Then keep looking upwards.

.text:0000000000000C8C 8C 81 00 91                 ADD             X12, X12, #0x20 ; ' '
.text:0000000000000C90             ; 156:       while ( v24 < 0x10 );
.text:0000000000000C90 03 FA FF 54                 B.CC            loc_BD0
.text:0000000000000C94 FE 17 50 29                 LDP             W30, W5, [SP,#0x110+var_90]
.text:0000000000000C98 11 03 19 4A                 EOR             W17, W24, W25

It looks like 0xC94 is where the final computation starts. (We can first set a breakpoint at 0xC94: if it triggers only once, it is usable as a starting point; if it triggers multiple times, it is still inside the loop body.)

5. Trace code

There are two ways to run a code trace in Unidbg. One is to set it up directly in code and store the trace result in traceCode1.log:

try {
    emulator.traceCode(module.base+0xC94, module.base+0x1170).setRedirect(new PrintStream(new File("traceCode1.log")));
} catch (IOException e) {
    throw new IllegalStateException(e);
}

The other is to enter the Trace command in the debug command line.

First, set a breakpoint at 0xC94, enter the debugging window, enter the traceCode command and press Enter.

traceCode
Set trace LinuxModule{base=0x40000000, size=12288, name='libnative-lib.so'} instructions success.

To save the trace results to a file, append the file name: traceCode traceCode1.log

Then enter the c command to continue execution, and the trace results appear.

Start restoring the algorithm

Searching for 0xdd89ca68 in the trace result, we find that it is the byte-reversed form of 0x68ca89dd:

0x40001154: "rev w4, w22" w22=0x68ca89dd => w4=0xdd89ca68

If you run into an unfamiliar instruction here, you can of course ask AI.
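For the rev line specifically, here is a small C sketch of what the instruction does to a 32-bit register: it reverses the order of the four bytes. The function name rev32 is my own and is only for illustration.

```c
#include <stdint.h>

/* Hypothetical helper modelling "rev w4, w22":
 * reverse the byte order of a 32-bit value. */
uint32_t rev32(uint32_t v)
{
    return (v >> 24)
         | ((v >> 8) & 0x0000ff00u)
         | ((v << 8) & 0x00ff0000u)
         | (v << 24);
}
```

Calling rev32(0x68ca89dd) gives 0xdd89ca68, which matches the traced w22 and w4 values.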

We first pick out the results related to 0x68ca89dd. The principle is to trace back from the results to the input parameters.
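The backward search can also be done mechanically. The sketch below is a rough illustration rather than a finished tool: it assumes the trace lines are held in an array in execution order and that each line ends with a result part of the form "=> reg=value", as in the trace output shown in this article. The name find_producer is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: scan trace lines backwards, starting just before
 * index `from`, and return the index of the nearest line whose result part
 * ("=> reg=...") writes the register `reg`. Returns -1 if none is found. */
int find_producer(const char *const lines[], int from, const char *reg)
{
    size_t reg_len = strlen(reg);

    for (int i = from - 1; i >= 0; --i) {
        const char *result = strstr(lines[i], "=> ");
        if (result == NULL)
            continue;               /* line has no result part */
        result += 3;                /* skip over "=> " */
        /* The '=' check keeps "w1" from matching "w12". */
        if (strncmp(result, reg, reg_len) == 0 && result[reg_len] == '=')
            return i;
    }
    return -1;
}
```

Starting from the rev line, asking for the producer of w22 leads to the add at 0x40001108. Asking for the producer of w12 before that leads to 0x400010b8, and so on back towards the inputs.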

0x40001048: "add w12, w12, w15" w12=0x47448b48 w15=0x93f6f2bb => w12=0xdb3b7e03

0x40001098: "orn w12, w12, w11" w12=0xdb3b7e03 w11=0x49ac16b => w12=0xfb7f7e97
0x4000109c: "eor w12, w12, w16" w12=0xfb7f7e97 w16=0xa96ba62 => w12=0xf1e9c4f5
0x400010a0: "add w12, w12, w15" w12=0xf1e9c4f5 w15=0x93f6f2bb => w12=0x85e0b7b0
0x400010b0: "add w12, w12, w15" w12=0x85e0b7b0 w15=0x4212f2e5 => w12=0xc7f3aa95

0x4000107c: "add w11, w11, w17" w11=0xffa26fc9 w17=0x2cc489bc => w11=0x2c66f985
0x40001088: "add w11, w11, w17" w11=0x2c66f985 w17=0xf3d1564b => w11=0x20384fd0
0x4000108c: "ror w11, w11, #0xb" w11=0x20384fd0 => w11=0xfa040709

0x400010b4: "ror w12, w12, #0x1a" w12=0xc7f3aa95 => w12=0xfceaa571
0x40001090: "add w11, w11, w16" w11=0xfa040709 w16=0xa96ba62 => w11=0x49ac16b

0x400010b8: "add w12, w12, w11" w12=0xfceaa571 w11=0x49ac16b => w12=0x18566dc

0x40001108: "add w22, w12, w22" w12=0x18566dc w22=0x67452301 => w22=0x68ca89dd
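The two ror lines in the trace are rotations, not plain shifts: the bits pushed out on the right come back in on the left. A minimal C sketch follows. The helper name ror32 is my own, and the masking is only there so that a rotate count of 0 does not shift by 32.

```c
#include <stdint.h>

/* Hypothetical helper modelling "ror wN, wN, #n" on a 32-bit register. */
uint32_t ror32(uint32_t v, unsigned n)
{
    n &= 31u;                                   /* keep the count in 0..31 */
    return (v >> n) | (v << ((32u - n) & 31u)); /* bits wrap around to the top */
}
```

Checked against the trace, ror32(0x20384fd0, 0xb) gives 0xfa040709 and ror32(0xc7f3aa95, 0x1a) gives 0xfceaa571.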

Then open VSCode and start writing code.

unsigned int w12, w11, w15, w16;
unsigned int Num_w15;     // the new value held in w15 at 0x400010b0 (0x4212f2e5 in this trace)

w12 = w12 | ~w11;         // 0x40001098: "orn w12, w12, w11" w12=0xdb3b7e03 w11=0x49ac16b => w12=0xfb7f7e97
w12 = w12 ^ w16;          // 0x4000109c: "eor w12, w12, w16" w12=0xfb7f7e97 w16=0xa96ba62 => w12=0xf1e9c4f5

// w12 = (w12 | ~w11) ^ w16

w12 = w12 + w15;          // 0x400010a0: "add w12, w12, w15" w12=0xf1e9c4f5 w15=0x93f6f2bb => w12=0x85e0b7b0
w12 = w12 + Num_w15;      // 0x400010b0: "add w12, w12, w15" w12=0x85e0b7b0 w15=0x4212f2e5 => w12=0xc7f3aa95
w12 = (w12 >> 0x1a) | (w12 << (32 - 0x1a));  // 0x400010b4: "ror w12, w12, #0x1a" w12=0xc7f3aa95 => w12=0xfceaa571 (ror is a rotate, not a plain shift)
w12 = w12 + w11;          // 0x400010b8: "add w12, w12, w11" w12=0xfceaa571 w11=0x49ac16b => w12=0x18566dc

Yes, the restoration algorithm is so simple and boring.

The formula w12 = (w12 | ~w11) ^ w16 is a little exciting, though.

It looks very much like the I function of the standard MD5 algorithm.

#define F(x,y,z) ((x & y) | (~x & z))
#define G(x,y,z) ((x & z) | (y & ~z))
#define H(x,y,z) (x^y^z)
#define I(x,y,z) (y ^ (x | ~z))

This means we can probably restore it more easily by starting from the standard MD5 algorithm and only hunting down the parts that were modified.
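To make the resemblance concrete, here is a small C sketch that compares the two traced instructions with MD5's I function on the values from the trace. The mapping of registers to arguments (x = w12, y = w16, z = w11) is my reading of the trace rather than something the library documents, and both function names are made up for illustration.

```c
#include <stdint.h>

/* MD5 auxiliary function I, written as a function instead of a macro. */
uint32_t md5_I(uint32_t x, uint32_t y, uint32_t z)
{
    return y ^ (x | ~z);
}

/* The two traced instructions applied in sequence. */
uint32_t traced_orn_eor(uint32_t w12, uint32_t w11, uint32_t w16)
{
    w12 = w12 | ~w11;   /* orn w12, w12, w11 */
    w12 = w12 ^ w16;    /* eor w12, w12, w16 */
    return w12;
}
```

With w12=0xdb3b7e03, w11=0x49ac16b and w16=0xa96ba62, both traced_orn_eor(w12, w11, w16) and md5_I(w12, w16, w11) give 0xf1e9c4f5, the value the trace shows after the eor.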

I can’t help but use AI

What happens if we throw this boring process at Cursor?

The prompt: use C to restore the calculation of w12 and add Chinese comments, followed by the trace lines.

0x40001048: "add w12, w12, w15" w12=0x47448b48 w15=0x93f6f2bb => w12=0xdb3b7e03

0x40001098: "orn w12, w12, w11" w12=0xdb3b7e03 w11=0x49ac16b => w12=0xfb7f7e97
0x4000109c: "eor w12, w12, w16" w12=0xfb7f7e97 w16=0xa96ba62 => w12=0xf1e9c4f5
0x400010a0: "add w12, w12, w15" w12=0xf1e9c4f5 w15=0x93f6f2bb => w12=0x85e0b7b0
0x400010b0: "add w12, w12, w15" w12=0x85e0b7b0 w15=0x4212f2e5 => w12=0xc7f3aa95

0x4000107c: "add w11, w11, w17" w11=0xffa26fc9 w17=0x2cc489bc => w11=0x2c66f985
0x40001088: "add w11, w11, w17" w11=0x2c66f985 w17=0xf3d1564b => w11=0x20384fd0
0x4000108c: "ror w11, w11, #0xb" w11=0x20384fd0 => w11=0xfa040709

0x400010b4: "ror w12, w12, #0x1a" w12=0xc7f3aa95 => w12=0xfceaa571
0x40001090: "add w11, w11, w16" w11=0xfa040709 w16=0xa96ba62 => w11=0x49ac16b

0x400010b8: "add w12, w12, w11" w12=0xfceaa571 w11=0x49ac16b => w12=0x18566dc

#include <stdio.h>

int main() {
    unsigned int w12 = 0x47448b48;
    unsigned int w15 = 0x93f6f2bb;
    unsigned int w11 = 0xffa26fc9;
    unsigned int w17 = 0x2cc489bc;
    unsigned int w16 = 0xa96ba62;

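    w12 = w12 + w15;                          // 0x40001048: add
    printf("w12 = %x\n", w12);

    // The w11 updates at 0x4000107c..0x40001090 execute before the orn at
    // 0x40001098, so they are computed first: orn must see w11 = 0x49ac16b.
    w11 = w11 + w17;                          // 0x4000107c: add
    printf("w11 = %x\n", w11);

    w11 = w11 + 0xf3d1564b;                   // 0x40001088: add (w17 holds a new value)
    printf("w11 = %x\n", w11);

    w11 = (w11 >> 11) | (w11 << (32 - 11));   // 0x4000108c: ror #0xb
    printf("w11 = %x\n", w11);

    w11 = w11 + w16;                          // 0x40001090: add
    printf("w11 = %x\n", w11);

    w12 = (w12 | ~w11);                       // 0x40001098: orn
    printf("w12 = %x\n", w12);

    w12 = w12 ^ w16;                          // 0x4000109c: eor
    printf("w12 = %x\n", w12);

    w12 = w12 + w15;                          // 0x400010a0: add
    printf("w12 = %x\n", w12);

    w12 = w12 + 0x4212f2e5;                   // 0x400010b0: add (w15 holds a new value)
    printf("w12 = %x\n", w12);

    w12 = (w12 >> 26) | (w12 << (32 - 26));   // 0x400010b4: ror #0x1a
    printf("w12 = %x\n", w12);

    w12 = w12 + w11;                          // 0x400010b8: add
    printf("w12 = %x\n", w12);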

    return 0;
}

I think it’s better than what I wrote. Programmers are going to evolve into a new breed: people who write prompts for AI.

Conclusion

I used to think being a leader was a blessing: I wouldn’t have to do any work myself and could just direct my subordinates to do theirs.

Later, when I actually became a leader, I realized that I had to know everything from A to Z. Otherwise the PHP guy would tell me that this thing could not be implemented, and the Flash guy would tell me that this feature would take a month to complete.

The same goes for AI. It can do mechanical and repetitive tasks better than you, but you still have to do the creative and guiding work yourself.