map-list

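What is map_list?

It is a table of contents for the entire dex file. It lives in the data section, and its file offset is given by the mapOff field in DexHeader.

A more detailed explanation follows in the type section below.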


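Structure

At the offset given by mapOff, the first thing you find is the DexMapList structure, which stores the number of map_item (DexMapItem) entries in map_list and the entries themselves. In other words, the size field is followed by size entries of type DexMapItem.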

DexMapList

/*
 * Direct-mapped "map_list".
 */
struct DexMapList {
    u4 size;               /* number of DexMapItem entries */
    DexMapItem list[1];    /* array of DexMapItem */
};
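As a rough sketch of the first step, the snippet below reads mapOff from the header and then the size field at that offset. The function name and the synthetic buffer are illustrative only. It assumes a little-endian dex file and takes the position of mapOff inside the header (0x34) from the Dalvik Executable format document.

```python
import struct

# Position of mapOff inside the 0x70-byte header (per the Dalvik format spec).
MAP_OFF_FIELD = 0x34

def read_map_list_size(dex_bytes):
    """Return (map_off, size), where size is the number of DexMapItem entries."""
    (map_off,) = struct.unpack_from('<I', dex_bytes, MAP_OFF_FIELD)
    (size,) = struct.unpack_from('<I', dex_bytes, map_off)
    return map_off, size

# Synthetic buffer for illustration only: a zeroed header followed by a size field.
fake = bytearray(0x70 + 4)
struct.pack_into('<I', fake, MAP_OFF_FIELD, 0x70)  # mapOff points right after the header
struct.pack_into('<I', fake, 0x70, 3)              # size = 3 entries
print(read_map_list_size(bytes(fake)))             # (112, 3)
```

With a real file, dex_bytes would simply be the whole file read in binary mode.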

DexMapItem

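/*
 * Direct-mapped "map_item".
 */
struct DexMapItem {
    u2 type;      /* item type; every constant name starts with kDexType */
    u2 unused;    /* unused, present for alignment */
    u4 size;      /* number of items of this type */
    u4 offset;    /* file offset where the data of this type starts */
};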
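The layout above maps directly onto a Python struct format string. The following is a minimal sketch, assuming little-endian data. MAP_ITEM and map_list_byte_length are names made up for this example, and the packed values are arbitrary sample numbers.

```python
import struct

# u2 type, u2 unused, u4 size, u4 offset
MAP_ITEM = struct.Struct('<HHII')

def map_list_byte_length(item_count):
    """Bytes occupied by a map_list: the u4 size field plus item_count entries."""
    return 4 + item_count * MAP_ITEM.size

# Round-trip one entry built from arbitrary sample values.
raw = MAP_ITEM.pack(0x2001, 0, 9, 0xC10)
item_type, unused, size, offset = MAP_ITEM.unpack(raw)

print(MAP_ITEM.size)                    # 12
print(hex(item_type), size, hex(offset))
print(hex(map_list_byte_length(16)))    # 0xc4
```

Each entry is 12 bytes, so a map_list with 16 entries takes 4 + 16 * 12 = 0xc4 bytes in total.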

type

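Among these types, 0x0000 stands for DexHeader itself, and the 0x0001 to 0x1000 range (0x0001 through 0x0006 for the id and class-def tables, 0x1000 for map_list itself) matches the sections that DexHeader already describes;
the 0x1001 to 0x2006 range subdivides the data section.
This design doubles as an integrity check: if an entry disagrees with the DexHeader data, the dex file can be judged corrupted. The map_list entries are also more detailed than the header, so map_list works well as an index of the whole file.
The manual lookup example below bears this out.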

/* map item type codes */
enum {
kDexTypeHeaderItem = 0x0000,
kDexTypeStringIdItem = 0x0001,
kDexTypeTypeIdItem = 0x0002,
kDexTypeProtoIdItem = 0x0003,
kDexTypeFieldIdItem = 0x0004,
kDexTypeMethodIdItem = 0x0005,
kDexTypeClassDefItem = 0x0006,
kDexTypeMapList = 0x1000,
kDexTypeTypeList = 0x1001,
kDexTypeAnnotationSetRefList = 0x1002,
kDexTypeAnnotationSetItem = 0x1003,
kDexTypeClassDataItem = 0x2000,
kDexTypeCodeItem = 0x2001,
kDexTypeStringDataItem = 0x2002,
kDexTypeDebugInfoItem = 0x2003,
kDexTypeAnnotationItem = 0x2004,
kDexTypeEncodedArrayItem = 0x2005,
kDexTypeAnnotationsDirectoryItem = 0x2006,
};
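The consistency check described above could be sketched as follows. This is only an illustration: it assumes the header has already been parsed into a dict whose keys mirror the DexHeader field names (that dict layout is an assumption of this example), and that each map entry is a dict with type, size and offset keys.

```python
# type code -> (size key, offset key) in the parsed-header dict (key names assumed)
HEADER_FIELDS = {
    0x0001: ('stringIdsSize', 'stringIdsOff'),
    0x0002: ('typeIdsSize', 'typeIdsOff'),
    0x0003: ('protoIdsSize', 'protoIdsOff'),
    0x0004: ('fieldIdsSize', 'fieldIdsOff'),
    0x0005: ('methodIdsSize', 'methodIdsOff'),
    0x0006: ('classDefsSize', 'classDefsOff'),
}

def find_mismatches(map_items, header):
    """Return the type codes whose map entry disagrees with the header."""
    bad = []
    for item in map_items:
        if item['type'] == 0x1000:               # kDexTypeMapList
            if item['offset'] != header['mapOff']:
                bad.append(0x1000)
            continue
        keys = HEADER_FIELDS.get(item['type'])
        if keys is None:
            continue                             # no header counterpart for this type
        size_key, off_key = keys
        if (item['size'], item['offset']) != (header[size_key], header[off_key]):
            bad.append(item['type'])
    return bad

# Illustrative values only.
items = [
    {'type': 0x0001, 'size': 0x5C, 'offset': 0x70},
    {'type': 0x0002, 'size': 0x19, 'offset': 0x1E0},
    {'type': 0x1000, 'size': 1, 'offset': 0x1000},
]
header = {'stringIdsSize': 0x5C, 'stringIdsOff': 0x70,
          'typeIdsSize': 0x19, 'typeIdsOff': 0x1E0, 'mapOff': 0x1000}

print(find_mismatches(items, header))                       # []
print(find_mismatches(items, dict(header, typeIdsOff=0)))   # [2]
```

An empty result means every entry that has a header counterpart agrees with it.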

Manual lookup

The map_list portion of a sample dex file

00001000 10 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 |................|
00001010 01 00 00 00 5c 00 00 00 70 00 00 00 02 00 00 00 |....\...p.......|
00001020 19 00 00 00 e0 01 00 00 03 00 00 00 12 00 00 00 |................|
00001030 44 02 00 00 04 00 00 00 01 00 00 00 1c 03 00 00 |D...............|
00001040 05 00 00 00 2b 00 00 00 24 03 00 00 06 00 00 00 |....+...$.......|
00001050 02 00 00 00 7c 04 00 00 02 20 00 00 5c 00 00 00 |....|.... ..\...|
00001060 bc 04 00 00 01 10 00 00 0a 00 00 00 f8 09 00 00 |................|
00001070 04 20 00 00 01 00 00 00 4e 0a 00 00 03 10 00 00 |. ......N.......|
00001080 02 00 00 00 58 0a 00 00 06 20 00 00 01 00 00 00 |....X.... ......|
00001090 64 0a 00 00 03 20 00 00 09 00 00 00 7c 0a 00 00 |d.... ......|...|
000010a0 01 20 00 00 09 00 00 00 10 0c 00 00 00 20 00 00 |. ........... ..|
000010b0 02 00 00 00 c2 0f 00 00 00 10 00 00 01 00 00 00 |................|
000010c0 00 10 00 00 |....|

The first 4 bytes indicate that 0x00000010 DexMapItem structures follow:

No.   binary                               type                              size  offset
0x01  00 00 00 00 01 00 00 00 00 00 00 00  kDexTypeHeaderItem                0x01  0x0
0x02  01 00 00 00 5c 00 00 00 70 00 00 00  kDexTypeStringIdItem              0x5c  0x70
0x03  02 00 00 00 19 00 00 00 e0 01 00 00  kDexTypeTypeIdItem                0x19  0x1e0
0x04  03 00 00 00 12 00 00 00 44 02 00 00  kDexTypeProtoIdItem               0x12  0x244
0x05  04 00 00 00 01 00 00 00 1c 03 00 00  kDexTypeFieldIdItem               0x01  0x31c
0x06  05 00 00 00 2b 00 00 00 24 03 00 00  kDexTypeMethodIdItem              0x2b  0x324
0x07  06 00 00 00 02 00 00 00 7c 04 00 00  kDexTypeClassDefItem              0x02  0x47c
0x08  02 20 00 00 5c 00 00 00 bc 04 00 00  kDexTypeStringDataItem            0x5c  0x4bc
0x09  01 10 00 00 0a 00 00 00 f8 09 00 00  kDexTypeTypeList                  0x0a  0x9f8
0x0a  04 20 00 00 01 00 00 00 4e 0a 00 00  kDexTypeAnnotationItem            0x01  0xa4e
0x0b  03 10 00 00 02 00 00 00 58 0a 00 00  kDexTypeAnnotationSetItem         0x02  0xa58
0x0c  06 20 00 00 01 00 00 00 64 0a 00 00  kDexTypeAnnotationsDirectoryItem  0x01  0xa64
0x0d  03 20 00 00 09 00 00 00 7c 0a 00 00  kDexTypeDebugInfoItem             0x09  0xa7c
0x0e  01 20 00 00 09 00 00 00 10 0c 00 00  kDexTypeCodeItem                  0x09  0xc10
0x0f  00 20 00 00 02 00 00 00 c2 0f 00 00  kDexTypeClassDataItem             0x02  0xfc2
0x10  00 10 00 00 01 00 00 00 00 10 00 00  kDexTypeMapList                   0x01  0x1000

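These values can be compared against the data in the manual lookup part of the previous section, "Parsing the dex file structure - DexHeader"; they turn out to be identical.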
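The same manual decoding can be reproduced in a few lines. The sketch below takes the first 28 bytes of the dump above (the size field plus the first two entries) and unpacks them as little-endian values. The variable names are illustrative.

```python
import struct

# First 28 bytes of the dump: u4 size, then two 12-byte DexMapItem entries.
head = bytes.fromhex(
    '10 00 00 00 '
    '00 00 00 00 01 00 00 00 00 00 00 00 '
    '01 00 00 00 5c 00 00 00 70 00 00 00'
)

(count,) = struct.unpack_from('<I', head, 0)

rows = []
for i in range(2):
    # Each entry is u2 type, u2 unused, u4 size, u4 offset.
    item_type, _unused, size, offset = struct.unpack_from('<HHII', head, 4 + 12 * i)
    rows.append((item_type, size, offset))

print(hex(count))   # 0x10
for item_type, size, offset in rows:
    print('type=0x%04x size=0x%x offset=0x%x' % (item_type, size, offset))
```

The two decoded rows correspond to kDexTypeHeaderItem (size 0x1, offset 0x0) and kDexTypeStringIdItem (size 0x5c, offset 0x70).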


Parsing map_list with a program

import struct

class DexStruct(object):
    DexMapList = {
        "size": 0,
        "DexMapItem": []
    }
    # DexMapItem = {
    #     "type"   : 0
    #     "unused" : 0
    #     "size"   : 0
    #     "offset" : 0
    # }
    DexMapItemCode = {
        0x0000: "kDexTypeHeaderItem",
        0x0001: "kDexTypeStringIdItem",
        0x0002: "kDexTypeTypeIdItem",
        0x0003: "kDexTypeProtoIdItem",
        0x0004: "kDexTypeFieldIdItem",
        0x0005: "kDexTypeMethodIdItem",
        0x0006: "kDexTypeClassDefItem",
        0x1000: "kDexTypeMapList",
        0x1001: "kDexTypeTypeList",
        0x1002: "kDexTypeAnnotationSetRefList",
        0x1003: "kDexTypeAnnotationSetItem",
        0x2000: "kDexTypeClassDataItem",
        0x2001: "kDexTypeCodeItem",
        0x2002: "kDexTypeStringDataItem",
        0x2003: "kDexTypeDebugInfoItem",
        0x2004: "kDexTypeAnnotationItem",
        0x2005: "kDexTypeEncodedArrayItem",
        0x2006: "kDexTypeAnnotationsDirectoryItem",
    }

def parseMapList(map_data):
    # map_data is the file content starting at mapOff; fields are little-endian.
    # size is a u4, so it occupies the first 4 bytes.
    DexStruct.DexMapList['size'] = struct.unpack('<I', map_data[:4])[0]
    # Start from an empty list so repeated calls do not accumulate entries.
    DexStruct.DexMapList['DexMapItem'] = []
    curPos = 4
    for x in range(DexStruct.DexMapList['size']):
        tmpDexMapItem = {
            "type": struct.unpack('<H', map_data[curPos:curPos+2])[0],
            "unused": 0,
            "size": struct.unpack('<I', map_data[curPos+4:curPos+8])[0],
            "offset": struct.unpack('<I', map_data[curPos+8:curPos+12])[0]}
        curPos += 12  # each DexMapItem is 12 bytes
        DexStruct.DexMapList["DexMapItem"].append(tmpDexMapItem)
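Once the entries are parsed, a couple of sanity checks are cheap to run. The Dalvik Executable format document states that map entries are ordered by initial offset without overlapping, and that a given type appears at most once. The sketch below checks both conditions on a list of entry dicts shaped like the ones built above. The function name is made up for this example, and treating "ordered and non-overlapping" as strictly ascending offsets is a simplifying assumption.

```python
def is_well_ordered(map_items):
    """True if offsets are strictly ascending and no type code repeats."""
    offsets = [item['offset'] for item in map_items]
    types = [item['type'] for item in map_items]
    ascending = all(a < b for a, b in zip(offsets, offsets[1:]))
    unique = len(set(types)) == len(types)
    return ascending and unique

# Illustrative entries only.
good = [
    {'type': 0x0000, 'size': 1, 'offset': 0x0},
    {'type': 0x0001, 'size': 0x5C, 'offset': 0x70},
    {'type': 0x1000, 'size': 1, 'offset': 0x1000},
]
swapped = [good[1], good[0], good[2]]

print(is_well_ordered(good))      # True
print(is_well_ordered(swapped))   # False
```

A failed check does not say what is wrong with the file, only that the map_list should not be trusted as an index.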

Reference

Dalvik Executable format

《Android 软件安全与逆向分析》