

CERIAS Tech Report 2006-22
ENABLING INTERNET WORMS AND MALWARE
INVESTIGATION AND DEFENSE USING VIRTUALIZATION
by Xuxian Jiang
Center for Education and Research in
Information Assurance and Security,
Purdue University, West Lafayette, IN 47907-2086

ENABLING INTERNET WORMS AND MALWARE INVESTIGATION AND DEFENSE USING VIRTUALIZATION

A Dissertation Submitted to the Faculty of Purdue University by Xuxian Jiang

In Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

August 2006

Purdue University West Lafayette, Indiana


ACKNOWLEDGMENTS

It is a daunting task for me to enumerate, let alone repay, all those to whom I am indebted for their great assistance during my years at Purdue. In the following, I will mention a few despite inevitable omissions.

First, I would like to thank my major advisor, Professor Dongyan Xu, for providing an energizing research environment and for patiently motivating and supporting me during my graduate study at Purdue. Professor Xu has touched almost every aspect of my life in a positive way and I could not have asked for a more supportive and engaging mentor.

Second, I would like to thank Professors Eugene H. Spafford, Mikhail (“Mike”) Atallah, Ninghui Li, Tony Hosking, and David K Y Yau for their time and efforts serving on my Ph.D. thesis committee and giving me valuable advice. In particular, I am deeply indebted to Professor Spafford for his great shepherding and detailed feedback throughout my Ph.D. research. I would also like to thank Professors Xiaojun Lin and Ninghui Li for their constructive suggestions to improve my presentation and Professor Cristina Nita-Rotaru for kindly offering me an opportunity as a CERIAS seminar speaker. All of your support and guidance have significantly helped me make research progress and advance my professional career.

West Lafayette is a nice and quiet place without much distraction. However, daily life for young graduate students such as myself would be quite mundane were it not for the constant interactions with my office mates, colleagues, and friends here. Yu (Jerry) Dong, Heung-Keung (Johnny) Chai, Wu Yan, Paul Ruth, Aaron Walters, Florian Buchholz, Jen-Yeu Chen, Gang Ding, Junghwan Rhee, and Ryan Riley are great friends and I greatly enjoy our time together. Our spontaneous and stimulating discussions on various topics from time to time provided much-needed inspiration and laughter, beneficial to both my work and life.

I am indebted to my colleagues in industry, especially Yi-Min Wang, Helen J. Wang, Shuo Chen, and Doug Beck at Microsoft Research and Rong N. Chang, Christopher Ward, Melissa J. Buco, and Laura Z. Luan at IBM Research, for providing me with an avenue of technical exploration outside the confines of Purdue and exposing me to the commercial realities of industry research. I hope you found our work together as rewarding as I did.

William J. Gorman, Amy Ingram, Mike Motuliak, Linda Byfield, and all other staff members of the Department of Computer Science also deserve my gratitude. I still remember Dr. Gorman opening the door for me one weekend when I locked myself out of my office and left my interview materials inside. Amy patiently answered the course registration questions that I repeatedly asked every semester during the last three years. Mike cleaned up my laptop monitor many times and Linda helped me fill out numerous travel forms. I appreciate all of your help!

Finally, I cannot overemphasize the importance of the persistent support and warm encouragement from my loving and beautiful wife Xining. Also, I must admit that I have enormously enjoyed the distraction from my two kids, Matthew and Grace, ever since they were born.


LIST OF TABLES

Table 5.1 Characterizing self-propagating worms with their behavioral footprints
Table 5.2 Worm detection with content fingerprints
Table 5.3 Snort signatures for the Slapper worm
Table 6.1 A simplified color diffusion model
Table 6.2 LMBench results showing low process coloring overhead
Table 6.3 Statistics of process coloring log in three worm experiments


ABSTRACT

Jiang, Xuxian. Ph.D., Purdue University, August, 2006. Enabling Internet Worms and Malware Investigation and Defense Using Virtualization. Major Professor: Dongyan Xu.

Internet worms and malware remain a threat to the Internet, as demonstrated by a number of large-scale Internet worm outbreaks, such as the MSBlast worm in 2003 and the Sasser worm in 2004. Moreover, every new wave of outbreak reveals the rapid evolution of Internet worms and malware in terms of infection speed, virulence, and sophistication. Unfortunately, our capability to investigate and defend against Internet worms and malware has not seen the same pace of advancement.

In this dissertation, we present an integrated, virtualization-based framework for malware capture, investigation and defense. This integrated framework consists of a front-end and a back-end. The front-end is a virtualization-based honeyfarm architecture, called Collapsar, to attract and capture real-world malware instances from the Internet. Collapsar is the first honeyfarm that virtualizes full systems and enables centralized management of honeypots while preserving their distributed presence. The back-end is a virtual malware “playground,” called vGround, to perform destruction-oriented experiments with captured malware or worms, which were previously expensive, inefficient, or even impossible to conduct.

On top of the integrated framework, we have developed a number of defense mechanisms from various perspectives. More specifically, based on the unique infection behavior of each worm we run in vGround, we define a behavioral footprinting model for worm profiling and identification, which complements the state-of-the-art content-based signature approach. We also develop a provenance-aware logging mechanism, called process coloring, that achieves higher efficiency and accuracy than existing systems in revealing malware break-ins and contaminations.

1 INTRODUCTION

1.1 Background and Problem Statement

Internet worms and malware remain a threat to the Internet, as demonstrated by a number of large-scale Internet worm outbreaks, such as the MSBlast worm in 2003 and the Sasser worm in 2004. Moreover, every new wave of outbreak reveals the rapid evolution of Internet worms and malware with respect to their infection speed, virulence, and sophistication. Examples of malware capabilities include infecting via multiple software vulnerabilities [2–4]; propagating to a large machine population in tens of seconds [9]; planting “backdoors” in victim machines [2, 3]; installing malicious programs for spam relay [4] or personal information collection [2]; and forming botnets among victim machines [10, 159].

Unfortunately, our capability to investigate and defend against Internet malware has not seen the same pace of advancement since the Code Red episode in mid-2001. The current approach of detection, characterization, and containment was developed to address the spread of file-based viruses, which mainly corrupt file contents, and has not changed significantly over the last five years. Furthermore, emerging Internet worms and malware are notably different from earlier file-based viruses in their infection methods, propagation means, and malicious payloads. As a result, advanced mechanisms are required to defend against emerging Internet worms and malware.

In this dissertation, we argue that our lack of a thorough understanding of Internet worms and malware, and of the corresponding defense techniques, is partially due to the absence of a systematic experimental platform and scientific methodology for observing, investigating, and modeling Internet worms and malware. Such a platform and the corresponding methodology should help answer the following questions: How to monitor the health of the Internet and generate timely attack alerts? Once an alert is generated, how to trace

1.2 Dissertation Contributions

The contributions of this dissertation are three-fold: malware capture, malware investigation, and malware defense.

  • Malware capture: We have designed, implemented, and evaluated a virtualization-based honeyfarm architecture, Collapsar [11, 12], to capture real-world malware attacks from the Internet. Collapsar realizes the honeyfarm vision of distributed presence and centralized management of honeypots. Moreover, Collapsar supports both server-side honeypots and client-side honeypots [13]. Server-side honeypots run vulnerable server programs and passively wait for incoming attacks, while client-side honeypots act as vulnerable clients (e.g., running a vulnerable web browser) and actively crawl the Internet to be compromised by real-world malicious servers. Collapsar is the first virtualization-based honeyfarm system that supports both server-side and client-side honeypots.
  • Malware investigation: We have designed, implemented, and evaluated a virtualization-based malware playground, vGround [14], to safely reproduce malware behavior. vGround is the first safe, scalable playground that can be used to unleash and observe real-world worms and malware in a confined, realistic virtual environment on top of a general-purpose shared infrastructure (e.g., a physical machine or a cluster). vGround enables destruction-oriented experiments with real-world malware or worms captured by the Collapsar front-end. Such experiments were previously expensive, inefficient, or even impossible to conduct.
  • Malware defense: Using Collapsar and vGround as an integrated experiment platform, we have developed a number of defense mechanisms [15, 16]. In this dissertation, we describe two new defense mechanisms, one for worm behavior profiling and one for malware forensics: (1) For worm profiling, we have defined a behavioral footprinting model [15] that complements the content-based signature model and therefore enriches a worm’s profile for more accurate worm identification; (2) For malware forensics, we have designed and implemented a provenance-aware logging mechanism called process coloring [16] to accurately and efficiently trace malware break-ins and contaminations.

1.3 Terminology

This section establishes the terminology used throughout the rest of the dissertation. We adopt the definitions of worm and virus given by Eugene H. Spafford in 1989 [19]. The definition of honeypot is based on the definition by Lance Spitzner [20].

  • Worm: A worm is “a program that can run independently and can propagate a fully working version of itself to other machines”. As noted in [19], “it is derived from the word tapeworm, a parasitic organism that lives inside a host and uses its resources to maintain itself.”
  • Virus: A virus is “a piece of code that adds itself to other programs, including operating systems.” It cannot run independently – it requires that its “host” program be run to activate it. As pointed out in [19], it has “an analog to biological viruses – those viruses are not considered alive in the usual sense; instead, they invade host cells and corrupt them, causing them to produce new viruses.”
  • Rootkit: A rootkit is “a set of software tools or programs frequently used by an intruder after gaining access to a computer system.” [5] It allows an intruder to access the victim’s system without being noticed. A rootkit can intentionally conceal certain status of a running system, such as current running processes, existing files, or open network connections. Various rootkits exist for a variety of operating systems including Microsoft Windows, Linux, and Solaris.
  • Backdoor: A backdoor is “an undocumented way to get access to a computer system or the data it contains.” [6] The backdoor is usually combined with a rootkit. For example, when a backdoor is being provided by a malicious process, a rootkit can be deployed to hide its existence from a legitimate system administrator.

and process coloring, which are developed and evaluated on top of the integrated platform. We make concluding remarks and outline future work in Chapter 7.

2 AN INTEGRATED FRAMEWORK FOR MALWARE CAPTURE, INVESTIGATION, AND DEFENSE: AN OVERVIEW

In this chapter, we present an overview of our integrated framework, followed by a brief description of its three key components and their relation.

2.1 Framework Overview

[Figure 2.1 diagram. Labels: Front-End: Malware Trap (Collapsar); Back-End: Malware Playground (vGround); Advanced Malware Defense Mechanisms (System Randomization, Behavioral Footprinting, Contamination Tracking); Reactive Defense; Proactive Defense.]

Figure 2.1. An integrated framework for malware capture, investigation, and defense

Figure 2.1 shows the overall organization of the integrated framework. This framework has three main components: (1) a honeyfarm front-end for malware capture (Collapsar), (2) a back-end playground for malware investigation (vGround), and (3) a suite of malware defense mechanisms.
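The three-component organization can be pictured as a small data-flow model: a capture step feeds an investigation step, whose observations feed the defense mechanisms. The Python sketch below is only an illustration of that flow. It is not the Collapsar or vGround implementation, and every class, method, and field name in it (Framework, MalwareSample, HoneypotKind, capture, investigate, defense_inputs) is hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum


class HoneypotKind(Enum):
    """The two honeypot roles named in the text (enum itself is hypothetical)."""
    SERVER_SIDE = "server-side"  # passively waits for incoming attacks
    CLIENT_SIDE = "client-side"  # actively visits possibly malicious servers


@dataclass(frozen=True)
class MalwareSample:
    """A captured sample, tagged with the kind of honeypot that caught it."""
    name: str
    captured_by: HoneypotKind


@dataclass
class Framework:
    """Toy model of the flow: capture -> investigation -> defense."""
    captured: list = field(default_factory=list)
    observations: dict = field(default_factory=dict)

    def capture(self, name, kind):
        # Front-end role (played by Collapsar in the text): record a sample.
        sample = MalwareSample(name, kind)
        self.captured.append(sample)
        return sample

    def investigate(self, sample, observed_behavior):
        # Back-end role (played by vGround in the text): keep the behavior
        # observed for a sample that the front-end has already captured.
        if sample not in self.captured:
            raise ValueError("sample was not captured by the front-end")
        self.observations[sample.name] = list(observed_behavior)

    def defense_inputs(self):
        # Defense role: expose the observations as read-only tuples.
        return {name: tuple(steps) for name, steps in self.observations.items()}
```

A usage example, with made-up sample and behavior labels: after `fw = Framework()`, calling `s = fw.capture("sample-a", HoneypotKind.SERVER_SIDE)` and then `fw.investigate(s, ["step-1", "step-2"])` makes `fw.defense_inputs()` return the recorded steps keyed by sample name. The point of the design is that the defense step only ever sees samples that went through the capture and investigation steps.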