Important definitions
Source
A Source is an IP address that sends packets to a given platform, for as long as no more than 25 hours elapse between two consecutive packets received from that IP address. For instance, if IP address IP1 sends a packet to a given platform every hour for several months, all these packets are linked to the same Source. On the other hand, if we receive 10 packets from IP1 on the first of March and then more packets on the third of March, the first 10 are linked to Source 1 while the later ones are linked to another Source, Source 2, corresponding to the same IP address. This accounts for the fact that an IP address may be reallocated to a different machine after some time, typically 24 hours.
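The 25-hour rule can be illustrated with a minimal Python sketch. The function name assign_sources, the (ip, timestamp) input format and the integer Source numbering are assumptions made for illustration only; they do not describe the platform's actual implementation.

```python
from datetime import datetime, timedelta

# Threshold taken from the definition of a Source.
MAX_GAP = timedelta(hours=25)

def assign_sources(packets):
    """Label each packet of one platform with a Source number.

    packets: iterable of (ip, timestamp) tuples (hypothetical input format).
    Returns a list of (ip, timestamp, source_id) tuples sorted by timestamp.
    """
    last_seen = {}  # ip -> (timestamp of its latest packet, its current source id)
    next_id = 1
    labelled = []
    for ip, ts in sorted(packets, key=lambda p: p[1]):
        previous = last_seen.get(ip)
        if previous is None or ts - previous[0] > MAX_GAP:
            # First packet from this IP, or a silence longer than 25 hours:
            # start a new Source for the same IP address.
            source_id = next_id
            next_id += 1
        else:
            source_id = previous[1]
        last_seen[ip] = (ts, source_id)
        labelled.append((ip, ts, source_id))
    return labelled
```

With this sketch, ten packets from IP1 on the first of March followed by one on the third of March yield two distinct Source numbers, whereas hourly packets from the same IP all keep the same number.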
Ports sequence
Attack sources send packets to various ports of the honeypots. A Ports Sequence records the order in which ports were first targeted on a given virtual machine. For instance, if source A sends requests to port 80, and then to ports 8080 and 1080, the associated ports sequence is 80|8080|1080|, no matter how many packets are sent to each port. The same ports sequence is also generated if source A sends packets to port 80, then to port 8080, then again to port 80 and then to port 1080, because a port that already appears in the sequence is not added again.
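As an illustration, the sketch below builds such a sequence from the list of destination ports of a Source's packets, in arrival order. The function name ports_sequence and the input format are hypothetical; only the output notation (ports separated and terminated by "|") follows the example above.

```python
def ports_sequence(ports):
    """Return the ports sequence for destination ports given in arrival order.

    Each port is recorded once, at the position of its first occurrence.
    """
    seen = set()
    ordered = []
    for port in ports:
        if port not in seen:
            seen.add(port)
            ordered.append(port)
    # Same notation as in the text: every port is followed by "|".
    return "".join(f"{port}|" for port in ordered)


print(ports_sequence([80, 8080, 1080]))      # 80|8080|1080|
print(ports_sequence([80, 8080, 80, 1080]))  # 80|8080|1080|
```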
Cluster
A Cluster is a set of Sources that leave the same attack fingerprint on a honeypot platform. The parameters used to determine the fingerprint are:
- global duration of the activities
- targeted ports sequences
- number of targeted Virtual Machines
- number of packets sent
- ordering of the attack against Virtual Machines (in sequence or in parallel)
- the payloads of the packets
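One way to picture this grouping is to store the parameters listed above in a record and to put Sources with identical records in the same Cluster. The sketch below is only that: the Fingerprint type, its field names and the exact-match grouping are assumptions made for illustration, and the actual clustering may well compare some parameters (durations, packet counts, payloads) with tolerances rather than strict equality.

```python
from collections import defaultdict
from typing import NamedTuple

class Fingerprint(NamedTuple):
    """Hypothetical record of the fingerprint parameters of one Source."""
    duration_class: str      # coarse class for the global duration of the activities
    ports_sequences: tuple   # targeted ports sequences, e.g. ("80|8080|1080|",)
    n_virtual_machines: int  # number of targeted virtual machines
    n_packets: int           # number of packets sent
    ordering: str            # "sequence" or "parallel"
    payloads: tuple          # payloads of the packets (or digests of them)

def cluster_sources(fingerprints):
    """Group Source ids by identical fingerprint.

    fingerprints: dict mapping a Source id to its Fingerprint.
    Returns a dict mapping each Fingerprint to the list of matching Source ids.
    """
    clusters = defaultdict(list)
    for source_id, fingerprint in fingerprints.items():
        clusters[fingerprint].append(source_id)
    return dict(clusters)
```

Because a NamedTuple is hashable when all its fields are, it can serve directly as a dictionary key, which keeps the grouping to a single pass over the Sources.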
Backscatters
Backscatter is a side effect of denial-of-service (DoS) attacks. Many automated flooding tools select a random source address for each packet they send. As a result, the DoS victim's server sends its responses to these spoofed addresses.
These unexpected replies may sometimes hit our honeypots. By identifying them, we can identify the machines that are under attack. What distinguishes these packets from all the others we receive is that their source IP address indicates the victim of the attack, not its author.
Reference: David Moore, Geoffrey M. Voelker, and Stefan Savage, "Inferring Internet Denial-of-Service Activity", USENIX Security Symposium, 2001. From the San Diego Supercomputer Center's Cooperative Association for Internet Data Analysis (CAIDA).
