Li L, Li D, Bissyandé TF et al. On locating malicious code in piggybacked Android apps. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY 32(6): 1108–1124 Nov. 2017. DOI 10.1007/s11390-017-1786-z

On Locating Malicious Code in Piggybacked Android Apps

Li Li¹, Daoyuan Li¹, Tegawendé F. Bissyandé¹, Jacques Klein¹, Haipeng Cai², Member, ACM, IEEE, David Lo³, and Yves Le Traon¹

¹Interdisciplinary Centre for Security, Reliability and Trust, University of Luxembourg, Luxembourg 2721, Luxembourg

²School of Electrical Engineering and Computer Science, Washington State University, Washington, WA 99163, U.S.A.

³School of Information Systems, Singapore Management University, Singapore 178902, Singapore

E-mail: {li.li, daoyuan.li, tegawende.bissyande, jacques.klein}@uni.lu; hcai@eecs.wsu.edu; davidlo@smu.edu.sg; yves.letraon@uni.lu

Received April 20, 2017; revised October 13, 2017.

Abstract To devise efficient approaches and tools for detecting malicious packages in the Android ecosystem, researchers are increasingly required to have a deep understanding of malware. There is thus a need to provide a framework for dissecting malware and locating malicious program fragments within app code in order to build a comprehensive dataset of malicious samples. Towards addressing this need, we propose in this work a tool-based approach called HookRanker, which provides ranked lists of potentially malicious packages based on the way malware behaviour code is triggered. With experiments on a ground truth of piggybacked apps, we are able to automatically locate the malicious packages from piggybacked Android apps with an accuracy@5 of 83.6% for such packages that are triggered through method invocations and an accuracy@5 of 82.2% for such packages that are triggered independently.

Keywords Android, piggybacked app, malicious code, HookRanker

1 Introduction

Malware is pervasive in the Android ecosystem. This is unfortunate since Android is the most widespread operating system in handheld devices and has increasing market shares in various home and office smart appliances. As we now heavily depend on mobile apps in various activities that pervade our modern life, security issues with Android web browsers, media players, games, social networking or productivity apps can have severe consequences. Yet, regularly, high profile security mishaps with the Android platform shine the spotlight on how easily malware writers can exploit a large attack surface, eluding all detection systems both at the app store level and at the device level.

Nonetheless, research and practice on malware detection have produced a substantial number of approaches and tools for addressing malware. The literature contains a large body of such work[1-4]. Unfortunately, the proliferation of malware[5] in stores and on user devices is a testimony that 1) state-of-the-art approaches have not matured enough to significantly address malware, and 2) malware writers are still able to react quickly to the capabilities of current detection techniques. Broadly, malware detection techniques either leverage malware signatures or build machine learning (ML) classifiers based on static/dynamic features. On the one hand, it is rather tedious to manually build a (near) exhaustive database of malware signatures: new or modified malware is thus likely to slip through. On the other hand, ML classifiers are too generic to be relevant in the wild: features currently used in the literature, such as n-grams, permissions or system calls, allow apps to be flagged without providing any hint on either which malicious actions are actually detected, or where they are located in the app.

The challenges in Android malware detection are mainly due to a lack of accurate understanding of what constitutes malicious code. In 2012, Zhou and Jiang[6] manually investigated 1 260 malware samples to characterize: 1) their installation process, i.e., which social engineering-based techniques (e.g., repackaging, update-attack, drive-by-attack) are used to slip them into users' devices; 2) their activation process, i.e., which events (e.g., SMS_RECEIVED) are used to trigger the malicious behaviour; 3) the category of malicious packages (e.g., privilege escalation or personal information stealing); 4) how malware exploits the permission system. The produced dataset, named MalGenome, has opened several directions in the research of malware detection, most of which either have focused on detecting specific malware types (e.g., malware leaking private data[7]), or are exploiting features such as permissions in ML classification[8]. The MalGenome dataset however has shown its limitations in hunting for malware: the dataset, which was built manually, has become obsolete as new malware families are now prevalent; the characterization provided in the study is too high-level to allow the inference of meaningful structural or semantic features of malware.

Regular Paper

Special Section on Software Systems 2017

This work was supported by the Fonds National de la Recherche (FNR), Luxembourg under projects AndroMap C13/IS/5921289 and Recommend C15/IS/10449467.

©2017 Springer Science+Business Media, LLC & Science Press, China

The ultimate goal of our work is to build an approach towards systematizing the dissection of Android malware and automating the collection of malicious code packages in Android apps. Previous studies, including our own, have exposed statistical facts which suggest that malware writing is performed at an “industrial” scale and that a given piece of malicious code can be extensively reused across a large number of malware samples[5-6]. Malware writers can indeed simply unpack a benign, preferably popular, app and graft some malicious code onto it before finally repackaging it. The resulting app, which thus piggybacks malicious packages, is referred to as a piggybacked app. Our assumption that most malware is piggybacked on benign apps is confirmed by the MalGenome dataset, in which over 80% of the samples were built through repackaging. For simplicity, in this paper we refer to any code package injected through piggybacking as a “malicious” package. In practice, such a package may fall into one of the following three cases:

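1) it directly contributes to the implementation of the malicious behaviour;

2) it helps to further hide the malicious operations from static analyzers;

3) it provides commodity functionality (e.g., in the form of a library) that piggybackers leverage to facilitate payload hooking.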

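However, identifying and extracting the exact malicious code in an app is challenging. In any case, a malicious behaviour can be implemented as an orchestration of different behaviour steps spread across multiple packages. To the best of our knowledge, state-of-the-art studies mainly rely on comparison-based approaches (1-to-1[1] or 1-to-n[10] comparisons) to identify malicious payloads. Approaches that analyze only the malware sample itself and systematically identify the packages contributing to the malicious behaviour are rare. Our goal is therefore to propose a step that helps analysts readily identify malicious packages in an Android app without requiring other apps for comparison. To this end, we build HookRanker, a ranking approach that orders the packages of an app by their likelihood of being malicious. Overall, we make the following contributions.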

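• We propose an automated approach for locating hooks in piggybacked apps (i.e., the code that switches the execution context from benign to malicious code, or that independently triggers the malicious code). Our approach eventually yields two lists of the most likely malicious packages, which lets malware analysts quickly understand how the malicious behaviour is implemented and how the malicious code is triggered. A key characteristic of our approach is that it does not require the original benign version of the piggybacked app, which is usually hard to harvest, in order to perform some form of differential analysis.

• We provide a tool called HookRanker to automatically recommend potentially malicious packages and components. An evaluation on a set of benchmark apps has demonstrated that HookRanker can effectively locate the malicious packages of piggybacked apps.

• We show experimentally that our work can, to some extent, be immediately leveraged by researchers and practitioners to build classifiers whose outputs are explainable: when an app is flagged as malware, one can know precisely that it exhibits functionality from specific malicious packages, and can thus directly point to the relevant malware type/family.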

Reproducibility. We make our dataset and experimental results available online①.

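This paper is an extended and improved version of a short paper[11] that presented preliminary results at the 2017 International Conference on Mobile Software Engineering and Systems (MobileSoft). In that previous version, we explored only Type1 hooks for piggybacked apps, although we had shown that there are two hook types in total (Type1 and Type2; cf. Listing 1). In this extension, in addition to Type1 hooks, which trigger the piggybacked rider code through method invocations, we also explore Type2 hooks, through which the malicious rider code is triggered by the Android event system.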

  // Activity for launching the app
  public class com.unity3d.player.UnityPlayerProxyActivity extends android.app.Activity {
    protected void onCreate(android.os.Bundle) {
      $r0 := @this: com.unity3d.player.UnityPlayerProxyActivity;
      $r1 := @parameter0: android.os.Bundle;
      $b0 = 1;
      specialinvoke $r0.<android.app.Activity: void onCreate(android.os.Bundle)>($r1);
+     staticinvoke <com.gamegod.Touydig: void init(android.content.Context)>($r0);
      $r2 = newarray (java.lang.String)[2];
      $r2[0] = "com.unity3d.player.UnityPlayerActivity";
      $r2[1] = "com.unity3d.player.UnityPlayerNativeActivity";
      staticinvoke <com.unity3d.player.UnityPlayerProxyActivity: void copyPlayerPrefs(android.content.Context, java.lang.String[])>($r0, $r2);
  }}

  // Broadcast Receiver for listening PACKAGE_ADDED, CONNECTIVITY_CHANGE, and BOOT_COMPLETED events
+ public class com.mobile.co.UR extends AdPushReceiver {...}

Listing 1. Example of Type1 and Type2 hooks. This snippet is extracted from a real piggybacked app named apscallion.sharq2. The “+” sign marks code injected into the original app.
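The two hook types in Listing 1 can be told apart mechanically. The following Python sketch illustrates the idea only and is not HookRanker's actual implementation. It assumes that the call edges and the event-triggered components have already been extracted by some static analyzer, and the names `CallEdge`, `package_of`, `type1_candidates` and `type2_candidates`, as well as the fixed package-prefix depth of 2, are hypothetical choices made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CallEdge:
    """One method invocation, given as fully qualified caller/callee class names."""
    caller: str
    callee: str


def package_of(class_name: str, depth: int = 2) -> str:
    """Return the first `depth` dot-separated segments as a package key.

    Assumption: a fixed-depth prefix is enough to separate carrier packages
    from injected ones; real apps may need finer-grained rules.
    """
    return ".".join(class_name.split(".")[:depth])


def type1_candidates(edges, framework_prefixes=("android.", "java.")):
    """Type1 candidates: invocations crossing from one app package to another."""
    hits = []
    for edge in edges:
        if edge.callee.startswith(framework_prefixes):
            continue  # calls into the Android/Java framework are not hooks
        if package_of(edge.caller) != package_of(edge.callee):
            hits.append(edge)
    return hits


def type2_candidates(receivers, launcher_class):
    """Type2 candidates: event-triggered components outside the launcher's package."""
    main_package = package_of(launcher_class)
    return [r for r in receivers if package_of(r) != main_package]


# Data transcribed by hand from Listing 1.
launcher = "com.unity3d.player.UnityPlayerProxyActivity"
edges = [
    CallEdge(launcher, "android.app.Activity"),
    CallEdge(launcher, "com.gamegod.Touydig"),
    CallEdge(launcher, launcher),
]
receivers = ["com.mobile.co.UR"]

print(type1_candidates(edges))
print(type2_candidates(receivers, launcher))
```

On the data of Listing 1, the only cross-package invocation is the call into `com.gamegod.Touydig`, and the only component outside the launcher's package is `com.mobile.co.UR`. A prefix comparison of this kind will also flag legitimate third-party libraries, which is one reason why a ranking step over the candidates is needed.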

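The rest of this paper is organized as follows. Section 2 provides the necessary background on piggybacked apps, including the piggybacking terminology used throughout this paper. Section 3 presents our approach for locating malicious packages in piggybacked apps. We evaluate our work in Section 4 and discuss threats to validity and perspectives in Section 5. Section 6 discusses related work and Section 7 concludes the paper.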

2 Preliminaries

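We now provide preliminary details that are essential for understanding the purpose, techniques and key concerns of Android piggybacking. Specifically, we first briefly introduce the terminology related to the piggybacking process in Subsection 2.1. Then, in Subsection 2.2, we present the Android app launch model, which is essential for understanding how malicious packages in piggybacked apps trigger their malicious behaviour. Next, in Subsection 2.3, we summarize the techniques that malware writers use to graft code onto existing app code. Finally, in Subsection 2.4, we present the ground-truth dataset that we use in this work to evaluate the effectiveness of HookRanker.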

2.1 Piggybacking Terminology

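We now introduce the terminology to which we will refer in the remainder of this paper. Fig. 1 shows the constituent parts of a piggybacked malware②, which is built by taking a given original app, referred to in the literature as the carrier[12], and grafting onto it a malicious package (also referred to as malicious code③), which we call the rider. The malicious behaviour is triggered by the hook that malware writers insert to ensure that the injected package will be executed.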

[Fig. 1 not reproduced. Recoverable labels: Android Apps; Malware; Original App (a1); Piggybacked App (a2); Carrier; Hook; Rider.]

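It is worth noting that, in this work, we make a clear distinction between piggybacking and repackaging, two terms that are frequently used in the literature. Indeed, unlike piggybacking, repackaging does not necessarily involve modifying the bytecode of a given Android app. Instead, repackaging can be performed merely to change the app certificate, thereby switching ownership. Piggybacking, however, always implies repackaging.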
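The distinction can be made concrete with a small sketch. The Python code below is an illustration under simplifying assumptions, not a detection method from this paper: `AppSnapshot` is a hypothetical summary of an APK (signer certificate fingerprint plus raw bytecode), and comparing a SHA-256 digest of the bytecode is a deliberately naive stand-in for real code comparison.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class AppSnapshot:
    """Hypothetical summary of an APK: signer certificate and raw bytecode."""
    cert_fingerprint: str
    dex_bytes: bytes


def bytecode_digest(app: AppSnapshot) -> str:
    """SHA-256 digest of the bytecode, used here as a naive equality check."""
    return hashlib.sha256(app.dex_bytes).hexdigest()


def relation(original: AppSnapshot, candidate: AppSnapshot) -> str:
    """Describe how `candidate` relates to `original` under the definitions above."""
    same_code = bytecode_digest(original) == bytecode_digest(candidate)
    same_cert = original.cert_fingerprint == candidate.cert_fingerprint
    if same_code and same_cert:
        return "identical"
    if same_code:
        # Only the certificate changed: ownership switched, bytecode untouched.
        return "repackaged only"
    if same_cert:
        # Assumption: the same signer with different code is a regular update.
        return "update by the same signer"
    # Certificate and bytecode both changed: repackaged with modified code,
    # which is the necessary condition for piggybacking.
    return "repackaged with modified bytecode (possible piggybacking)"


original = AppSnapshot("AA:11", b"carrier code")
print(relation(original, AppSnapshot("BB:22", b"carrier code")))
print(relation(original, AppSnapshot("BB:22", b"carrier code + rider code")))
```

The last case is only a necessary condition: deciding that the modified bytecode actually contains a rider and a hook requires the kind of analysis this paper is about.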

2.2 Android App Launch Model

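Android apps are made up of the following four types of components:

• Activity, which represents the graphical user interface of an Android app;

• Service, which is dedicated to executing time-intensive tasks in the background;

• Broadcast Receiver, which waits for and handles system as well as user-defined events;

• C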
