Android application files have an .apk file extension. APK files are archive files that generally contain the following files and directories [1]:

  •  META-INF directory:

      •  MANIFEST.MF: The manifest file, which lists the files in the package along with their digests.

      •  CERT.RSA: The certificate of the application.

      •  CERT.SF: The list of resources and the SHA-1 digests of the corresponding entries in the MANIFEST.MF file.

  •  lib: A directory containing compiled code that is specific to a processor architecture (for example, armeabi-v7a, arm64-v8a, or x86).

  •  res: A directory containing resources not compiled into resources.arsc (see below).

  •  assets: A directory containing the application’s assets.

  •  AndroidManifest.xml: An additional Android manifest file that includes the name, version, access rights, and referenced library files for the application. This file may be in Android binary XML, which can be converted into readable plaintext by tools such as AXMLPrinter2, android-apktool, and Androguard.

  •  classes.dex: The application’s classes, compiled into the .dex file format that the Dalvik Virtual Machine interprets.

  •  resources.arsc: A file containing precompiled resources, such as binary XML.
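Because an .apk is an ordinary ZIP archive, the layout described above can be checked with a few lines of Python. The following is a minimal sketch, not a forensic tool: the function names and the in-memory sample archive are assumptions made for illustration, and the descriptions are paraphrased from the list above.

```python
import io
import zipfile

# Entry names taken from the list above; descriptions are paraphrased.
KNOWN_ENTRIES = {
    "META-INF/": "signing data (MANIFEST.MF, CERT.SF, CERT.RSA)",
    "lib/": "processor-specific compiled code",
    "res/": "resources not compiled into resources.arsc",
    "assets/": "application assets",
    "AndroidManifest.xml": "Android manifest",
    "classes.dex": "Dalvik bytecode",
    "resources.arsc": "precompiled resources",
}


def describe_apk_entries(apk):
    """Return (entry name, description) pairs for every entry in an APK.

    `apk` may be a path or a binary file object; it is read as a ZIP archive.
    """
    described = []
    with zipfile.ZipFile(apk) as archive:
        for name in archive.namelist():
            label = "other"
            for prefix, description in KNOWN_ENTRIES.items():
                is_dir_match = prefix.endswith("/") and name.startswith(prefix)
                if name == prefix or is_dir_match:
                    label = description
                    break
            described.append((name, label))
    return described


def build_sample_apk():
    """Build a tiny stand-in archive in memory (hypothetical, not installable)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name in ("META-INF/MANIFEST.MF", "AndroidManifest.xml",
                     "classes.dex", "resources.arsc", "res/layout/main.xml"):
            archive.writestr(name, b"placeholder")
    buffer.seek(0)
    return buffer


for entry, label in describe_apk_entries(build_sample_apk()):
    print(f"{entry}: {label}")
```

To run the same check against a real package, pass its path instead of the sample, for example describe_apk_entries("pokemongo.apk").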

Every Android application runs in its own process, with its own instance of the Dalvik Virtual Machine. Dalvik has been written so that a device can run multiple VMs efficiently.

Android programs are compiled into .dex (Dalvik Executable) files, which are then zipped into a single .apk file for installation on the device. The .dex files can be created manually by the developer or automatically by the Android build tools when they translate compiled Java classes. Each .dex file holds a set of class definitions and their associated data and is interpreted by the Dalvik Virtual Machine.

The .dex file structure includes the following elements [3]:

  •  Header: Contains basic file information, such as the file size and the sizes and offsets of the other sections

  •  String_ids: Contains a list of identifiers of all strings used

  •  Type_ids: Includes a list of identifiers of all the types (classes, arrays, primitives)

  •  Proto_ids: Contains a list of method prototype identifiers (the return type and parameter types of each method signature)

  •  Fields: Includes a list of field identifiers

  •  Methods: Contains a list of identifiers of all the methods included in the file

  •  Classes: Each class definition consists of eight parts: class id, access_flags, superclass type_id, interface list address, source file name string_id, class data address, address of the data initializing the static fields, and address of the annotations related to the class

  •  Data: A section of data
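The header is a fixed-size structure, so it can be decoded directly with Python’s struct module. The sketch below assumes the layout given in the published Dalvik Executable format documentation (8-byte magic, Adler-32 checksum, SHA-1 signature, then twenty little-endian 32-bit integers); the field names come from that documentation rather than from the list above, and the sample bytes are synthetic.

```python
import struct

# 8-byte magic, 4-byte checksum, 20-byte SHA-1 signature,
# then twenty little-endian 32-bit integers: 112 bytes in total.
HEADER = struct.Struct("<8sI20s20I")
INT_FIELDS = (
    "file_size", "header_size", "endian_tag", "link_size", "link_off",
    "map_off", "string_ids_size", "string_ids_off", "type_ids_size",
    "type_ids_off", "proto_ids_size", "proto_ids_off", "field_ids_size",
    "field_ids_off", "method_ids_size", "method_ids_off",
    "class_defs_size", "class_defs_off", "data_size", "data_off",
)


def parse_dex_header(data):
    """Decode the fixed-size header at the start of a .dex file."""
    if len(data) < HEADER.size:
        raise ValueError("too short to hold a .dex header")
    magic, checksum, signature, *ints = HEADER.unpack_from(data)
    if not (magic.startswith(b"dex\n") and magic.endswith(b"\x00")):
        raise ValueError("missing .dex magic bytes")
    header = dict(zip(INT_FIELDS, ints))
    header["version"] = magic[4:7].decode("ascii")
    header["checksum"] = checksum
    header["signature"] = signature.hex()
    return header


# Synthetic header for demonstration only: made-up counts, no file body.
sample = HEADER.pack(
    b"dex\n035\x00", 0, bytes(20),
    112, 112, 0x12345678,   # file_size, header_size, endian_tag
    0, 0, 0,                # link_size, link_off, map_off
    5, 112,                 # string_ids_size, string_ids_off
    *([0] * 12),            # remaining section sizes and offsets
)

header = parse_dex_header(sample)
print(header["version"], header["string_ids_size"])
```

For a real file, read the first 112 bytes of classes.dex in binary mode and pass them to parse_dex_header. The size and offset pairs in the result point at the String_ids, Type_ids, Proto_ids, Fields, Methods, Classes, and Data sections.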

Example Scenario for unpacking/decompiling an .apk package:

The classes.dex file within an .apk file contains the programming classes for the application. Reviewing this file can show what the application is programmed to do.

Step 1: Unpacking

The compiled contents of an .apk file can be viewed by simply giving the file a .zip extension and opening it. This process is known as “unpacking” the .apk file. In the example shown in the slide, the .apk file associated with the application “Pokemon Go APK” is named pokemongo.apk.

By renaming it to pokemongo.apk.zip and then opening the zip file, you can now see and browse through the various files and folders within the .apk file. This may give you some important information about the application’s activities, both legitimate and illegitimate. In the example, an .apk file named Zombie Highway has been unpacked.
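The same unpacking step can be scripted without renaming the file, since Python’s zipfile module reads the archive regardless of its extension. This is a minimal sketch: unpack_apk is a hypothetical helper name, and the archive built here is an in-memory stand-in rather than a real application.

```python
import io
import tempfile
import zipfile
from pathlib import Path


def unpack_apk(apk, destination):
    """Extract every entry of an APK into `destination`; return the entry names."""
    with zipfile.ZipFile(apk) as archive:
        archive.extractall(destination)
        return archive.namelist()


# Stand-in archive for demonstration (hypothetical contents).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("AndroidManifest.xml", b"placeholder")
    archive.writestr("classes.dex", b"placeholder")
buffer.seek(0)

with tempfile.TemporaryDirectory() as workdir:
    names = unpack_apk(buffer, workdir)
    dex_found = (Path(workdir) / "classes.dex").is_file()

print(names, dex_found)
```

With a real sample, the call would look like unpack_apk("pokemongo.apk", "pokemongo_unpacked"). When the package is suspected malware, do this inside the analysis VM.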

Step 2: Decompiling

The next part of the examination process is decompiling the classes.dex file for the application within the .apk file.

To do this, locate the classes.dex file from within the unpacked .apk file.

Next, copy the classes.dex file into the directory associated with the program dex2jar. Then open a command prompt and navigate to the “dex2jar” folder on your desktop.

The dex2jar-2.0 folder contains the batch file that you will run against the classes.dex file you just placed in this directory.

From the command prompt type:

d2j-dex2jar.bat classes.dex

Step 3: Results of Decompiling

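Running this batch file creates a Java .jar file named classes_dex2jar.jar that contains all of the Java classes that were contained within the classes.dex file. Depending on the dex2jar release, the output name may instead be written with a hyphen (classes-dex2jar.jar).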

Dex2jar is included in the course VM, on the desktop in the fences area, and is available as a free download at https://github.com/pxb1988/dex2jar.
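A .jar file is itself a ZIP archive of .class files, so the result of the conversion can be checked before opening it in a decompiler. The sketch below lists the class names inside a jar. list_jar_classes is a hypothetical helper, and the package and class names in the sample are invented for illustration.

```python
import io
import zipfile


def list_jar_classes(jar):
    """Return the fully qualified class names for the .class entries in a jar."""
    suffix = ".class"
    with zipfile.ZipFile(jar) as archive:
        return sorted(
            name[: -len(suffix)].replace("/", ".")
            for name in archive.namelist()
            if name.endswith(suffix)
        )


# Stand-in jar for demonstration (hypothetical class names).
sample_jar = io.BytesIO()
with zipfile.ZipFile(sample_jar, "w") as archive:
    archive.writestr("META-INF/MANIFEST.MF", b"Manifest-Version: 1.0\n")
    archive.writestr("com/example/app/MainActivity.class", b"placeholder")
    archive.writestr("com/example/app/Uploader$Task.class", b"placeholder")
sample_jar.seek(0)

classes = list_jar_classes(sample_jar)
print(classes)
```

An empty list for a real jar suggests the conversion did not produce any classes and is worth rerunning before moving on to analysis.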

Step 4: Analysis of the Application

You can now use a free Java Decompiler tool called jd-gui to view the data you just unpacked and decompiled from the .apk file.

Double-click the jd-gui executable file and click Run if requested. Next, select File and Open File, then navigate to the classes_dex2jar.jar created in the previous step and open this file to view the underlying code.

The Java Decompiler allows you to view, navigate, and search the data that was packed inside the .apk file to see what the application was programmed to do. You may find IP addresses, email addresses, information about permissions, or other information that will help determine whether the application is malicious. JD-GUI displays color-coded source code and highlights the classes you select as you click on them.

Color-coding for classes in JD-GUI is as follows [4]:

  •  File-related classes (in red): for access, reading, and writing local files.

  •  Java reflection classes (in green): for creating new classes and instances and invoking methods dynamically.

jd-gui is included in the course VM, on the desktop, in the fences area. It’s available as a free download at https://code.google.com/archive/p/innlab/downloads.
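Manual review in jd-gui can be complemented by a quick scripted pass for the indicators mentioned above, such as IP addresses and email addresses. The following is a rough sketch under stated assumptions: the regular expressions are deliberately simple, the function names are hypothetical, and the sample values use reserved documentation addresses. Hits are leads to verify in the decompiled code, not conclusions (a dotted version number, for example, can look like an IP address).

```python
import io
import re
import zipfile

OCTET = rb"(?:25[0-5]|2[0-4]\d|1?\d?\d)"
IPV4 = re.compile(rb"(?<![\d.])(?:" + OCTET + rb"\.){3}" + OCTET + rb"(?![\d.])")
EMAIL = re.compile(rb"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def find_indicators(data):
    """Return the unique IPv4-like and email-like strings found in raw bytes."""
    return {
        "ipv4": sorted({m.decode("ascii") for m in IPV4.findall(data)}),
        "email": sorted({m.decode("ascii") for m in EMAIL.findall(data)}),
    }


def scan_archive(archive):
    """Scan each decompressed entry of a ZIP-based file (.apk or .jar)."""
    hits = {}
    with zipfile.ZipFile(archive) as zf:
        for name in zf.namelist():
            found = find_indicators(zf.read(name))
            if found["ipv4"] or found["email"]:
                hits[name] = found
    return hits


# Stand-in archive with made-up indicators (documentation-only addresses).
sample = io.BytesIO()
with zipfile.ZipFile(sample, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr(
        "classes.dex",
        b"\x00\x10http://192.0.2.10/gate.php\x00analyst@example.com\x00v1.0\x00",
    )
    zf.writestr("resources.arsc", b"\x00nothing of interest\x00")
sample.seek(0)

hits = scan_archive(sample)
print(hits)
```

Entries are read through zipfile rather than scanned as raw archive bytes because most entries are stored compressed, which would hide the strings from a direct search.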

References:

[1] https://for585.com/nvaf6 (.apk file)

[2] https://for585.com/k14j0 (Dalvik Executable Format)

[3] https://for585.com/q6opw (.dex file anatomy description)

[4] https://for585.com/j8qyx (On Cyber Blog: GM Bot Android Malware Teardown)