Python scripting in dirtyJOE can be used to modify CONSTANT_Utf8 objects in the Constant Pool. One of its main uses is decrypting CONSTANT_Utf8 objects in obfuscated .class files. Decrypted objects are useful in forensic analysis, and they can also be used to translate obfuscated Java applications.

PyJOE library

In the \scripts\ directory you can find a file that contains a few useful functions that can be used by other scripts:

def dj_normalizeUtf8(inBuffer):
Translates a Java Utf8 input buffer (tuple) into a list of integers, one per character.

def dj_encodeUtf8(inBuffer):
Translates a list of integers, one per character, into a Java Utf8 buffer (list).
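
To make the two conversions concrete, here is a minimal sketch of decoding and encoding the "modified UTF-8" form that class files use for CONSTANT_Utf8 data. The names normalize_utf8_sketch and encode_utf8_sketch are hypothetical stand-ins; the bodies are not taken from the PyJOE library and only assume characters in the range 0..0xFFFF.

```python
# Hypothetical stand-ins for dj_normalizeUtf8 / dj_encodeUtf8; the names and
# bodies are illustrative and are not copied from the PyJOE library.

def normalize_utf8_sketch(in_buffer):
    """Decode Java modified UTF-8 bytes (tuple or list of ints) to char codes."""
    chars = []
    i = 0
    while i < len(in_buffer):
        b = in_buffer[i]
        if (b >> 5) == 0x6:        # 110xxxxx: two-byte sequence
            chars.append(((b & 0x1F) << 6) | (in_buffer[i + 1] & 0x3F))
            i += 2
        elif (b >> 4) == 0xE:      # 1110xxxx: three-byte sequence
            chars.append(((b & 0x0F) << 12)
                         | ((in_buffer[i + 1] & 0x3F) << 6)
                         | (in_buffer[i + 2] & 0x3F))
            i += 3
        else:                      # 0xxxxxxx: single byte
            chars.append(b)
            i += 1
    return chars


def encode_utf8_sketch(chars):
    """Encode char codes (0..0xFFFF) back to Java modified UTF-8 bytes."""
    out = []
    for c in chars:
        if 0x01 <= c <= 0x7F:
            out.append(c)
        elif c <= 0x7FF:           # also covers c == 0, which is stored as C0 80
            out += [0xC0 | (c >> 6), 0x80 | (c & 0x3F)]
        else:
            out += [0xE0 | (c >> 12), 0x80 | ((c >> 6) & 0x3F), 0x80 | (c & 0x3F)]
    return out


# 'A', U+00E9, U+20AC and U+0000 in modified UTF-8
sample = (0x41, 0xC3, 0xA9, 0xE2, 0x82, 0xAC, 0xC0, 0x80)
codes = normalize_utf8_sketch(sample)
```

Working on character codes rather than raw bytes matters because the obfuscators discussed here transform characters, and a single character may occupy one, two or three bytes in the buffer.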

def dj_decryptUTF8_ZKM(inBuffer, key):
Standard decrypter for strings encrypted by the Zelix KlassMaster obfuscator. It can also be used to re-encrypt strings (very useful when translating obfuscated Java applications). A usage example can be found in \scripts\ file.

def dj_decryptUTF8_Allatori(inBuffer, key):
Standard decrypter for strings encrypted by the Allatori obfuscator. It can also be used to re-encrypt strings (very useful when translating obfuscated Java applications). A usage example can be found in \scripts\ file.


I'll show how to use Python scripting on a sample of obfuscated Java malware called Boonana. This malware appeared in October 2010 and was obfuscated with Zelix KlassMaster 5.3.3E (according to the "ZKM5.3.3E" entry in the constant pool).

The script should contain only one simple function, which dirtyJOE calls for each encrypted Utf8 object:

def dj_decryptUTF8(inBuf):
    return []

inBuf - input buffer: a tuple in which each byte of the Utf8 string is represented as an integer value
return - output buffer: should be a list in which each byte of the Utf8 string is represented as an integer value
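
Before loading a script into dirtyJOE, it can help to exercise the function from a plain Python interpreter. The sketch below is only an illustration: check_contract is a hypothetical helper, the placeholder dj_decryptUTF8 simply copies its input, and the 0..255 range check is an assumption based on the description of the buffers as byte values.

```python
def dj_decryptUTF8(inBuf):
    # Placeholder body: returns the input unchanged, converted to a list.
    return list(inBuf)


def check_contract(func, sample_bytes):
    """Hypothetical helper: call func with a tuple and verify it returns a list of bytes."""
    out = func(tuple(sample_bytes))
    if not isinstance(out, list):
        raise TypeError("script must return a list, got %s" % type(out).__name__)
    for value in out:
        # Assumption: every element is a byte value in the range 0..255.
        if not isinstance(value, int) or not 0 <= value <= 255:
            raise ValueError("not a byte value: %r" % (value,))
    return out


result = check_contract(dj_decryptUTF8, [0x48, 0x69])
```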

A universal script for decrypting Utf8 objects encrypted by the Zelix KlassMaster obfuscator can be found in the \scripts\ directory. It is very simple; all that is needed is to find the proper 'key' value:

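def dj_decryptUTF8_ZKM(inBuffer, key):
	i = 0	# byte position in inBuffer
	ik = 0	# character position, selects the key element
	ret = []
	last = len(inBuffer)
	while i < last:
		c = inBuffer[i]
		# single-byte character by default
		defAdd = 1
		if (c >> 5) == 6:
			# 110xxxxx 10xxxxxx: two-byte character
			c = ((c & 0x1f) << 6) + (inBuffer[i+1] & 0x3f)
			defAdd = 2
		elif (c >> 4) == 0xE:
			# 1110xxxx 10xxxxxx 10xxxxxx: three-byte character
			c = ((c & 0xf) << 12) + ((inBuffer[i+1] & 0x3f) << 6) \
			+ (inBuffer[i+2] & 0x3f)
			defAdd = 3

		# XOR with the key element for this character position
		# (the key has five elements)
		ret += [c ^ key[ik % 5]]
		i += defAdd
		ik += 1
	return ret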

# 'inBuf' argument is the input buffer, represented as a tuple
# the function should return a list object
def dj_decryptUTF8(inBuf):
	key = [48, 16, 127, 16, 97]
	return dj_decryptUTF8_ZKM(inBuf, key)
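
Because the transformation is a plain XOR with a repeating key, applying it twice restores the original characters, which is why the same routine can be used to re-encrypt translated strings. The following sketch illustrates this on character codes; xor_with_key and the test string are hypothetical, and the example assumes ASCII input so that one byte equals one character.

```python
KEY = [48, 16, 127, 16, 97]  # key used by the script above


def xor_with_key(chars, key):
    """XOR each character code with key[i % len(key)] (hypothetical helper)."""
    return [c ^ key[i % len(key)] for i, c in enumerate(chars)]


plain = [ord(ch) for ch in "hello!"]   # made-up ASCII test string
encrypted = xor_with_key(plain, KEY)   # "encrypt"
decrypted = xor_with_key(encrypted, KEY)  # the same operation decrypts
```

For non-ASCII strings the bytes would first have to be converted to character codes, as dj_decryptUTF8_ZKM does while it walks the buffer.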

The 'key' value is very easy to find: check all references to the encrypted object, and following those references will reveal the function responsible for decryption. For example:

  • Let's take the first encrypted CONSTANT_Utf8 object from the Constant Pool; it is object number 147
  • Check references to this object by selecting the 'Show References' option from the context menu
  • The References window shows that object 147 is referenced only once, by 'Constant Pool: 13'
  • Check references to object 13 (it is a CONSTANT_String object)
  • The References window shows that object 13 is referenced only once, by 'Method: <clinit>, attribute: Code, bytecode@00000089'
Checking this method at the given position reveals this code:
00000086 : dup
00000087 : bipush              15
00000089 : ldc                 "Xd `[?gxdB~dOS"
0000008B : jsr                 pos.000000B0
0000008E : aastore
The jsr opcode stands for 'jump subroutine', and the subroutine at pos.000000B0 looks like this:
000000B0 : astore_0
000000B1 : invokevirtual       char[] java.lang.String.toCharArray()
000000B4 : dup
000000B5 : arraylength
000000B6 : swap
000000B7 : iconst_0
000000B8 : istore_1
000000B9 : swap
000000BA : dup_x1
000000BB : iconst_1
000000BC : if_icmpgt           pos.0000010A
000000BF : dup
000000C0 : iload_1
000000C1 : dup2
000000C2 : caload
000000C3 : iload_1
000000C4 : iconst_5
000000C5 : irem
000000C6 : tableswitch         l: 0, h: 3, def: pos.000000F8,
                               pos.(000000E4, 000000E9, 000000EE, 000000F3)
000000E4 : bipush              48
000000E6 : goto                pos.000000FA
000000E9 : bipush              16
000000EB : goto                pos.000000FA
000000EE : bipush              127
000000F0 : goto                pos.000000FA
000000F3 : bipush              16
000000F5 : goto                pos.000000FA
000000F8 : bipush              97
000000FA : ixor
000000FB : i2c
000000FC : castore
000000FD : iinc                local.01, 1
00000100 : swap
00000101 : dup_x1
00000102 : ifne                pos.0000010A
00000105 : dup2
00000106 : swap
00000107 : goto                pos.000000C1
0000010A : swap
0000010B : dup_x1
0000010C : iload_1
0000010D : if_icmpgt           pos.000000BF
00000110 : new                 java.lang.String
00000113 : dup_x1
00000114 : swap
00000115 : invokespecial       void java.lang.String.<init>(char[])
00000118 : invokevirtual       java.lang.String java.lang.String.intern()
0000011B : swap
0000011C : pop
0000011D : ret                 local.00
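
The constants pushed at the tableswitch targets can also be pulled out of a text copy of the listing programmatically. The sketch below is an assumption-laden illustration, not a dirtyJOE feature: extract_key is a hypothetical helper, it expects the exact text layout shown above, and it only understands constants pushed with bipush (other opcodes, for example iconst_1, would need extra handling). The default target comes last because indexes 0 to 3 go through the table and index 4 falls through to the default case.

```python
import re

# Excerpt of the subroutine listing, copied as text.
LISTING = """\
000000C6 : tableswitch         l: 0, h: 3, def: pos.000000F8,
                               pos.(000000E4, 000000E9, 000000EE, 000000F3)
000000E4 : bipush              48
000000E6 : goto                pos.000000FA
000000E9 : bipush              16
000000EB : goto                pos.000000FA
000000EE : bipush              127
000000F0 : goto                pos.000000FA
000000F3 : bipush              16
000000F5 : goto                pos.000000FA
000000F8 : bipush              97
000000FA : ixor
"""


def extract_key(listing):
    """Return the bipush constants at the tableswitch targets, default case last."""
    # Map each offset that holds a bipush to the constant it pushes.
    pushes = dict(
        (m.group(1), int(m.group(2)))
        for m in re.finditer(r"^([0-9A-F]{8}) : bipush\s+(-?\d+)", listing, re.M)
    )
    default = re.search(r"def: pos\.([0-9A-F]{8})", listing).group(1)
    targets = re.search(r"pos\.\(([^)]*)\)", listing).group(1).split(", ")
    return [pushes[t] for t in targets + [default]]


key = extract_key(LISTING)
```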
As can be seen, the tableswitch block of this subroutine (offsets 000000C6 to 000000F8) contains the key for the decryption routine: 48, 16, 127, 16, 97. At this point, the script can be tested in dirtyJOE by selecting the 'Run Python Script' option from the Constant Pool context menu.

The 'Decrypt' button is strictly for testing the script; it shows the decrypted string in the 'Preview' field. After clicking the 'Save' button, the object will be decrypted again and the user will be prompted with a message box.

When the script is finished, it can be run on all encrypted Utf8 objects by choosing the 'Run Python Script on All Utf8 Objects' option from the Constant Pool context menu.

The 'Decrypt' button works as in the previous window. After clicking the 'Save' button, the user will be prompted to accept all changes.

That's pretty much all.

(C) 2008-2014, ReWolf