Implications of Peer-to-Peer Networks on Worm Attacks and Defenses, Hacking and IT E-Book Dump Release, pobierz pdf

[ Pobierz całość w formacie PDF ]
Implications of Peer-to-Peer Networks on
Worm Attacks and Defenses
Jayanthkumar Kannan Karthik Lakshminarayanan
kjk,karthik
@cs.berkeley.edu
CS294-4 Project, Fall 2003
Abstract
worm design is captured in [1] which provides the
blueprint for a worm that can infect the entire vulner-
able population in “under 1 hour, in possibly about
Recently, two trends have emerged in the eld of
peer-to-peer networks: widespread deployment of
peer-to-peer systems for le sharing and develop-
ment of distributed hash tables that provide efcient
lookups. In this paper, we study how to harness the
power of these technologies to further the state-of-
the-art in both designing and defending against Inter-
net worms. We quantify this advance from three dif-
ferent viewpoints. Firstly, peer-to-peer trafc char-
acteristics differs from traditional Internet trafc in
several aspects, and we quantitatively analyze the ef-
fect of these differences on worm propagation and
control. Secondly, we show that a DHT is an ideal
model for coordination among worms, and design a
DHT-enabled worm that is an improvement over ex-
isting worm designs in a number of aspects, mainly
stealth in propagation and speed of propagation. Our
DHT-based worm designs can be used to implement
a variety of policies aimed at circumventing existing
schemes for worm propagation control. Our results
also show that a coordinated worm can spread more
than twice as fast as worms such as Slammer, while
halving the number of unsuccessful probes. In this
way, this paper attempts to “raise the bar” in worm
design, and this is essential to the development of
suitable defenses. Finally, we offer some prelimi-
nary insights on how a DHT can be used to be defend
against worms.
minutes”. This worm uses a combination of a pre-
collected hit list of vulnerable machines and a tech-
nique to divide the probing load among the infected
machines. Concurrently, there have been several pro-
posals for worm defense. Two mechanisms that have
been suggested include content ltering and address
ltering. However, [2] is a negative result that shows
that neither of these is likely to work at the time scales
of the worm propagation. Another mechanism that
uses a identifying characteristic of worm trafc is
Virus Throttling [3]. However, this requires deploy-
ment at all end-hosts in order to be effective, and does
not prevent the end-host itself from getting infected.
Methods that are being explored recently mainly rely
on unusual trafc patterns during a worm attack. For
example, Cossack [11] aims for cooperative detec-
tion, while Netbait [7] provides mechanisms for shar-
ing information. It is currently unclear whether such
schemes can control worm propagation. To summa-
rize then, the fastest known worms can spread much
faster that our best known defenses can counter them.
In this work, we attempt to show the inuence of
a new Internet technology, peer-to-peer networks, on
this state of affairs. This technology has two aspects
pertinent to this problem:
Widespread deployment of P2P networks
:
File sharing networks like Kazaa have gained
popularity, and have substantial user popula-
tions [5]. Trafc patterns in such networks dif-
fer substantially from those in traditional Inter-
net applications in that a participating end-host
is a client as well as a server.
1
Introduction and Motivation
A paper on worm design and defense requires no mo-
tivation thanks to several well-publicized attacks and
their global impact. The current state of the art in
In acting as a
1
server, it is typically contacted by several ma-
chines over the Internet without any solicitation.
Thus, a participant end-host has a far richer and
diverse incoming trafc. This has an impact on
worm propagation because passive generation
of hitlists and passive propagation (infected ma-
chine attempts to infected only hosts that contact
it) can be potentially much faster. This effect is
well-noted in literature, in particular in [1], and
we will attempt to quantify this effect. Further-
more, we also quantify how active hitlist gener-
ation can be done by crawling a deployed P2P
network such as Gnutella.
worm operates. Section 3 enumerates techniques of
harnessing the host population and the trafc patterns
of deployed P2P networks and evaluates them. In
Section 4, we discuss possible ways in which worms
would desire coordination. We also outline how Ke-
lips, an existing DHT can be modied to achieve this.
In Section 5, we outline specic schemes that make
use of the ability to coordinate, and evaluate these
schemes. In Section 6, we briey discuss how DHTs
can be used for worm defenses. We discuss related
work in Section 7, possible future work in Section 8,
and conclude in Section 9.
DHT design
:
2 Worm Model
Peer-to-peer networks can also provide an ef-
cient implementation of the DHT abstraction,
and this allows for efcient distributed coordi-
nation. We will show that worms that use DHTs
for coordination during propagation can poten-
tially be much faster and be harder to detect (this
possibility was rst suggested, though not ex-
plored, in [6]). The other side of the equation,
worm defense, has not been well explored as
well (preliminary approaches has been proposed
in [7], [11], [12]). We will also offer some in-
sights in this direction.
We will rst discuss the simple model of a worm
that we will assume henceforth. In this model, the
worm exploits a bug in some service running at end-
hosts, then proceeds to probe the address space in
some fashion in order to discover other vulnerable
hosts. In the most naive form of probing, the infected
host probes addresses randomly. Smarter worms [1]
might choose to attempt localized infection (infect
other hosts in its network) or work off hitlists em-
bedded in the code or use sequence generators. The
random probing worm is known to have a sigmoid
growth curve, exponential in the beginning and then
beginning to slow down, when the remaining vulner-
able hosts are hard to discover.
We also assume that the probes that the worm per-
forms at the infected host is only limited by the outgo-
ing bandwidth at the host. UDP worms such as Slam-
mer are limited by the outgoing bandwidth alone.
Traditionally, TCP based worms use the kernel TCP
implementations which places a limit on the number
of connections that a host can open. Since this is not
a fundamental limitation, our assumption would hold
for a worm in the context of TCP-based worms too.
At this point, we also address the following ques-
tion: given that known worm designs can spread very
fast and no good defenses are known against worm
designs, is it necessary to design better worms? For
example, it has been stated in [4] that coordinated
worms are not a prerequisite for fast propagation. We
believe that the answer is yes, because of the follow-
ing reasons. Firstly, propagation time has been the
only metric that has been used so far. A coordinated
worm would improve over existing designs in reduc-
ing the susceptibility to intrusion detection systems,
thus possibly evading detection techniques that may
be devised for uncoordinated worms. Secondly, the
propagation period calculated in [1] makes several
assumptions that are not true in practice, that our de-
signs attempt to accommodate for. Finally, it is nec-
essary to understand attacks in order to build designs
that work against them.
The rest of the report is organized as follows. In
Section 2, we discuss our assumptions of the way the
3 Discovering Vulnerable Hosts
We discuss and quantify in this section how the di-
verse trafc in P2P networks allow for faster worm
propagation.
2
3.1 Hitlist Generation
120000
1 crawler
5 crawlers
15 crawlers
25 crawlers
Passive hitlist generation before the attack can con-
siderably speed up worm propagation time because
the worm attack takes time to get up to speed. If the
worm exploits a bug in the peer to peer application
itself, then a hitlist can be obtained by deploying a
number of clients that log the IP address of every ma-
chine that contacts them. These clients can also crawl
the network actively to discover as many hosts as pos-
sible in the network. Active probing has the advan-
tage of being faster and of yielding a more complete
topology information, as compared to passive prob-
ing. On the other hand, if the worm exploits a bug
in some other application, the above hitlist still offers
a useful starting point: tools like NMap [8] can be
used to probe the hosts to discover whether they are
susceptible to the worm.
100000
80000
60000
40000
20000
0
0
1000
2000
3000
4000
5000
6000
7000
8000
Time (seconds)
Figure 1: Number of IP addresses crawled as a func-
tion of time. Different lines correspond to different
total number of crawlers.
110000
100000
90000
3.2 Worm Propagation
80000
70000
Assuming no hitlist is generated beforehand, and the
bug is in the P2P application itself, worm propagation
in a P2P network can potentially be very rapid, due to
the frequent communication among peers. Moreover,
such networks are known to follow the power-law
model [9], and such graph are known to have a tightly
knit core that connects the other nodes. Techniques
used to optimize lookup distances, such as a biased
random walk [9], can also be used by the worm to
optimize its infection strategy. Such a walk biases it-
self towards this richly connected core, ensuring that
a large part of the network is discovered quickly.
60000
50000
40000
30000
20000
10000
0
5
10
15
20
25
Number of crawlers
Figure 2: Number of IP addresses crawled as a func-
tion of total number of crawlers used.
at which new IP addresses are seen decreases rapidly.
After a certain point, the number of new IP addresses
seen increases only linearly at a small rate. This can
be attributed to the churn that the Gnutella system
typically sees combined with the aggressive crawling
technique. The other phenomenon evident from the
graph is the diminishing returns in using additional
nodes for crawling. In particular, with 15 crawlers,
we are able to gather
IP addresses, and with
10 more crawlers, we add only
more new IP
addresses. This effect of diminishing returns is fur-
ther demonstrated in Figure 2 which plots the num-
ber of IP addresses seen as a function of number of
crawlers. In the plots, a random subset of crawlers
is chosen for each point corresponding to a certain
number of crawlers. We note that changing the set of
crawlers, in spite of affecting the actual numbers, had
3.3 Evaluation
We deployed Gnutella clients on 25 Planetlab nodes
that performed active crawling of IP addresses, in
order to assess the effectiveness of this scheme.
The implementation was a modication performed to
the Limewire implementation to perform appropriate
packet logging. The crawlers were run for about
hours and the crawl logs were post-processed.
3.3.1 Hitlist Generation
Figure 1 shows the number of IP addresses crawled
as a function of time. The different lines on the graph
correspond to varying the number of crawlers used to
generate the IP addresses. As time increases, the rate
3
4 Coordinated Worm Attacks
12000
10000
In this section, we outline several ways in which a
worm might wish to coordinate during its propaga-
tion, and then discuss why a DHT is the ideal solution
to provide that coordination.
8000
6000
4000
4.1 Desirable Coordination
Crawler #1
Crawler #2
Crawler #3
Crawler #4
2000
In the traditional worm design, an infected machine
begins to probe independently of other infected ma-
chines. Schemes in [1] attempt to provide some coor-
dination by sharing information when a new machine
is infected. However, coordination during propaga-
tion allows for more powerful designs. The following
factors which deter a traditional worm can be solved
using coordination:
0
0
2
4
6
8
10
12
14
Number of hops
Figure 3: Cumulative number of nodes as a function
of depth from a super-peer in Gnutella.
little effect on the overall trends.
We also study the limitations of hitlist generation
in a peer-to-network by studying how fast such a list
goes stale given the typical host availabilities. On
some initial studies, we found that about
of the
list is stale after a period of about
week.
Avoiding detection
: As worm defense technol-
ogy improves, it might become necessary for
worms to follow certain policies during propa-
gation to avoid detection. Following is an sim-
ple example of such a policy: if a machine inside
an access network is infected, then no other ma-
chines should probe the network so as to mini-
mize worm trafc seen by vigilant rewalls that
interpose themselves in the path from external
hosts. Another example of such a policy could
be that a infected machine avoids probing the
same access network more than
times in order
to counter a rewall that uses a threshold of
at-
tempts from the same source to detect an attack.
3.3.2 Worm Propagation
Using the edges that were seen during the crawl of the
Gnutella network, we regenerated the Gnutella graph
structure. Since many edges were not reported be-
cause the clients did not send responses, the graph we
obtained is a proper subgraph of the actual Gnutella
network. In fact, in some cases, we could success-
fully construct the edges for only about half the nodes
that the Gnutella crawler encountered.
Using this graph structures generated, we stud-
ied how an infection would spread along a Gnutella
graph. Figure 3 plots the cumulative number of nodes
at a particular depth starting from the root node.
Here, the root node is the rst node that is infected
by the worm. The different lines in the graph corre-
spond to multiple starting points for the worm, ie. a
set of machines are infected and then start propagat-
ing it in the Gnutella network.
As Figure 3 indicates, most of the nodes that are
visible from any vantage point are only a few hops
away. In fact, for crawler 1 (in gure), about
of the
hosts whose edges we were able to re-
construct were found within
hops away. In fact, by
having multiple infected hosts, similar phenomenon
persists indicating benet in such a technique.
Uneven IP address space allocation
: The occu-
pation of the 32-bit IP address space is far from
even as has been pointed out in [13]. This im-
plies that random probing is not the ideal infec-
tion strategy. One would like to direct probes to-
wards likely targets, by allowing nodes to share
information about heavily occupied IP prexes
etc.
Locality-based infection
: It is clearly desirable
for machines to attempt to infect a close-by host
rather than a distant host. Given that round
trip times can vary widely in the Internet (for
one such study, see [14]), it is conceivable that
choosing close-by targets for infection can speed
up infection for TCP-based worms. Also, since
delay is correlated to number of ASes in the
4
path, this ensures the probes cross lesser num-
ber of ASes.
Ability to answer such questions such as “Is
any node from IP address prex x.y.w.z/16 in-
fected?” efciently.
Probing rate control
: It has been pointed out in
[15] that the CodeRed worm could have spread
faster if it had avoided congesting links due to
its own probe trafc. Coordination can be used
to achieve such rate control.
Ability to answer nearest neighbor nearest
queries. Accuracy is a secondary requirement.
It is clear that DHTs can provide the above ser-
vices a worm would require to coordinate, though it
is questionable whether DHTs can handle the high
churn rates of such an environment. We chose a
churn-resistant DHT, Kelips [17] to explore this ques-
tion. Also, note that accuracy of lookups is not of
paramount importance – if consistency is delayed by
a few seconds, it would not affect the coordination
phase.
Before delving into using Kelips to achieve coor-
dination, we rst discuss how a DHT can be made
to work in our environment with rapid joins. Firstly,
our overlay mechanism is organized as a two-level
hierarchy, utilizing the well-known heterogeneity of
peer-to-peer hosts ([18]). Each worm joins the super-
peer level only if it has sufcient bandwidth to do.
Otherwise, it joins a super-peer which shields it from
DHT control trafc. We expect that each worm will
nd a super-peer nearby which maintains worm state
on its behalf.
Avoid needless probing
: Infected hosts can share
information in order to avoid attempt to re-infect
the same end-host.
At a high-level, coordination can achieve two
goals. The obvious one is that worm propagation
can be speeded up. A more subtle and important
point is that all of these factors aid evasion of de-
tection mechanisms. For example, if a worm avoids
probing an unoccupied IP prex, it can avoid detec-
tion by schemes such as backscatter analysis [16]. A
worm that chooses close-by targets is likely to en-
counter fewer core routers in its path to the target,
thus improving evasion against defense mechanisms
deployed by ISPs (say, content ltering). There are
other situations where such coordination is useful.
It is especially useful in getting the worm off the
ground, i.e. if the initial startup phase has been iden-
tied as the slow phase, we can speed it up using
coordination. It is also useful for worms targeting a
small population of machines running a rare instal-
lation (which for some reason is interesting to the
attacker). In such a case, the start-up phase will be
more of a limiting factor, and our schemes will con-
siderably speed it up.
4.3 Kelips-Based Coordination
In this work, we use Kelips, a hybrid DHT for main-
taining information across all infected hosts.
4.3.1
Introduction
4.2 Coordination Using a DHT
We rst give a brief introduction to Kelips. Kelips is
a hybrid of an unstructured network and an DHT de-
signed to handle high churn. Kelips nodes are orga-
nized into a set of
afnity groups (based on a hash
function). Keys are hashed using the same hash func-
tion to determine which afnity group they should be
stored in. Note however that a key can be stored at
any
node in its afnity group. Each node maintains
the following state:
The coordination of infected machines is a challeng-
ing problem because the set of infected machines in-
creases rapidly during the initial stages of the propa-
gation. Based on the desirable goals of the previous
section, the scheme we use for coordination should
provide the following services:
Fast routing table maintenance to deal with the
rapid joining of new nodes. Information should
not be lost pathologically in the face of nodes
leaving.
Afnity Group Pointers: Pointers to
every
node
in its group.
Load balancing properties to help divide probing
load equally across all infected machines.
Foreign Contacts: A set of
pointers to nodes in
every other group.
5
[ Pobierz całość w formacie PDF ]

Wątki