Starting on January 25th, 2004, the number of hits per day at www.fourmilab.ch exploded from the typical weekday level of around 650,000 first to 823,000 on the 25th, then 1,051,992 on the 26th and comparable levels on subsequent days (with the typical drop-off expected on the week-end). The anomalous jump in hits is immediately apparent from the daily usage chart for January 2004.
About this Document
Examination of the Web access logs revealed that the increment in accesses was composed entirely of requests to the site's home page, most occurring over and over from a given IP address at intervals of four minutes (or from proxy servers relaying such requests from hosts behind them, but I didn't understand that until later in the analysis process). These hosts requested nothing but the home page, which is highly anomalous, since that page on this site consists of nothing but a <frameset> container for which any browser will, immediately upon receiving it, request the content pages to which it links. This can easily be seen in the daily usage chart above in that the kilobytes transferred per day (red bar chart at bottom) was barely affected at all by the attack, as opposed to the other measures which do not take into account the volume of data transferred.
This pattern of accesses was highly suspicious, since at the very time the attack erupted against this site, the Mydoom.A worm was spreading across the Internet, and had been determined to contain code which would launch a distributed denial of service attack against the www.sco.com site on February 1, 2004 with precisely the form of requests I was seeing--repeated hits to the home page without any other requests to the site. However, my understanding was that Mydoom hit the SCO site as quickly as it could, not at the modest rate of one hit per four minutes I was seeing. Given the similarity of the pattern and the fact that I'd been hit several days before SCO was to be targeted, this raised suspicion in my mind that my site might be the victim of an early test version of the worm, deployed to check it out prior to release as the fully-virulent Mydoom.
After mentioning this on a discussion list relating to the Mydoom worm, I was contacted by the administrator of another site who reported precisely the same pattern of hits: requests to his site's home page over and over, four minutes apart, totaling on the order of 100,000 per day (lower than the 400,000 I was seeing). However, this site had been under a consistent attack of this kind for almost a year, with two inexplicable intervals during which the attack stopped and then resumed with full intensity. We exchanged lists of IP addresses attacking us and compared them, and found no meaningful commonality. However, knowing that another site had been attacked for such a long period of time prompted me to analyse historical logs (I have logs of every hit to the site since it opened to the public on November 28th, 1994) to check for evidence of precursors to the main attack. The results were very interesting, indeed. In the daily status report for February 6, 2004, I report the discovery of a sequence of accesses identical to those of the attack dating back to December 6, 2002, and continuing from the same IP address until November 25, 2003. Other sites were observed to make similar accesses throughout 2003, and have been summarised in the early attacks report.
During the attack, I wrote to a number of system administrators who might be able to identify machines responsible for the attack and provide information about the possible cause. I received only one reply, from a site which tracked down the accesses to a single PC behind their firewall which was found to be infected with a variety of "spyware" and "adware" which, when cleansed from the machine, immediately stopped the hits originating there. Details, including a log of the cleanup of the machine, may be found in the daily update for January 30, 2004.
After these investigations and revelations, I mostly ran out of ideas apart from monitoring the statistics on a daily basis. Given that the Mydoom worm had a self-destruct date of February 12th, I was intensely interested to see what would happen to the hits on my site as that date approached. The hits started here first; would they also end before the 12th? Well, no, but on Friday the Thirteenth of February 2004, the attack began to deflate along a trajectory almost symmetric to the initial ramp-up, and has continued to do so subsequently. The end appears to be at hand. Still, one must temper one's optimism; recall that the other site hit by an almost identical attack saw pauses, then resumption of the assault. In any case, with the attack having ended or at the very least going on hiatus, there's nothing to do other than keep an eye on the logs to see if it comes back. The daily usage chart for February shows the abrupt drop in hit rate as the attack ended, then resumed on the 17th. Entries for the last week of February are perturbed by the introduction of countermeasures against the attack.
The main text of the document which follows was written on January 27, 2004 as an initial incident report on the attack. The historical evolution chart and the analysis reports linked to from the document were updated as the attack progressed, but subsequent revelations were not integrated into the main text. Discoveries made as the attack progressed and forensic analyses of the attack in progress and its history as recorded in the logs are chronicled in the daily updates appended to the original incident report.
It's back! On February 18th, the attack, moribund for several days, ramped back up with a vengeance and reached half its peak level of the first round. I've resumed posting daily updates, the first of which for the second round was posted on 2004-02-18.
I am not aware of any particular reason my site would merit being the target of an attack. Although I do publish Annoyance Filter, which may irritate spammers, and The Digital Imprimatur, which could offend idiots who fail to comprehend it is a cautionary tale written in the style of a dystopia, neither of these is a recent addition to the site. The only significant change of late was the discontinuation of Speak Freely on January 15th, 2004, but that was announced in the End of Life announcement of August 1st, 2003, and in any case occasioned no acrimonious reactions I am aware of, particularly since the final version of the program is in the public domain and available from an archive site on SourceForge. All of this is by way of saying that I can't see any reason my site should be a probable candidate for attack, which makes me suspect (along with the nature of the attack I'm seeing) that I'm simply a target of opportunity, in all likelihood being used to test an attack network.
GET / HTTP/1.1
When I started to notice this on the evening (GMT +1 time zone) of January 26th, two things were immediately apparent. First, the requests were coming from a large number of different IP addresses which, for those I was able to resolve a host name, were located all over the world. Second, these hosts' accesses to the site consisted exclusively of retrieving the home page--they did not make any other HTTP requests. Now this is downright weird, because the home page document at this site is just a <frameset> container which has no content unless one fetches the <frame>s it references. And most of these sites that were hitting me were requesting the home page over and over, without asking for anything else, but at a rate which was rather modest--not a flat out bombardment like you see when somebody's venomous spider program is in a loop blasting requests at you. Here is the "trajectory" of an individual IP address:
which began sending requests at 2004-01-26 03:15:28 and was still sending them 40 hours later at 2004-01-27 20:03:27 when I last scanned the log file. Note how the requests seem to arrive almost precisely four minutes apart, to within a second or so. This pattern is common, but not universal. Here's an extract of a second trajectory, this one from a site which has sent a total of 1543 hits to the home page, which seems to send pairs of requests four minutes apart, as if in some circumstances multiple instances of the process which is generating the requests can be running.
See the data table showing the acceleration of the attack over time. The following chart shows the development of the attack over time:
I'll update this document and chart as the situation evolves.
The "Hours Active" column gives the number of hours between the time of the first hit from this site and the most recent, and "Seconds per hit" the mean time in seconds between hits from this site. IP addresses which could not be resolved to host names via reverse DNS lookup are shown as "?"; host names which I haven't yet tried to resolve (the logresolve process is done as a batch job in the background) appear as blank.
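The two derived columns can be illustrated with a minimal sketch. It is written in Python rather than the Perl of the analysis tools, and it is not the code which produced the report. The function name `host_stats` is hypothetical, the sample timestamps after the first are invented to mimic the four minute spacing, and dividing the time span by the number of intervals (hits minus one) is an assumption about how the mean is taken.

```python
from datetime import datetime

def host_stats(timestamps):
    """Return (hits, hours_active, seconds_per_hit) for one host's hit times."""
    times = sorted(timestamps)
    hits = len(times)
    # Time between the first hit from this host and the most recent one.
    span = (times[-1] - times[0]).total_seconds()
    hours_active = span / 3600.0
    # Mean interval between consecutive hits (assumed definition);
    # undefined when only a single hit has been seen.
    seconds_per_hit = span / (hits - 1) if hits > 1 else None
    return hits, hours_active, seconds_per_hit

# Hypothetical sample: only the first timestamp appears in the text above.
fmt = "%Y-%m-%d %H:%M:%S"
sample = [datetime.strptime(s, fmt) for s in (
    "2004-01-26 03:15:28",
    "2004-01-26 03:19:28",
    "2004-01-26 03:23:29",
    "2004-01-26 03:27:28",
)]

hits, hours, sph = host_stats(sample)
print(hits, round(hours, 2), round(sph, 1))
```

For a host following the four minute pattern, the seconds per hit figure should come out close to 240.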
Update: A disassembly of the worm payload indicates it creates a thread which simply requests endless copies of the www.sco.com home page. This is precisely the pattern of requests I'm seeing here, albeit at a much slower rate. Perhaps this is evidence we're seeing a test version of Mydoom here.
So again, I wonder, is this a test? "This is only a test. If this had been a real attack, your site would be a squashed bug by now." Makes ya' wonder. Makes me worry.
I'll append updates at the end of this page as things develop and information is gleaned as to what's going on here.
Looking over the chart so far, it looks like the real onset of the attack was at around 09:00 on 2004-01-24 with a large jump in recruited hosts beginning around 12:00 on the 25th. The first peak at over 20,000 hits per hour occurred at 20:00 on the 26th, and has followed a diurnal pattern since then, apparently neither growing nor shrinking. Recruitment of new hosts (never seen before in the attack) appears to be at a constant rate, following the same diurnal pattern. Since hosts are distinguished by IP address, some of this may be dial-up or other machines with floating IP addresses which appear to be new hosts each time they connect.
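The three quantities discussed here (hits per hour, unique hosts per hour, and hosts never seen before) can be derived from the log along the following lines. This is a sketch in Python, not the program behind the chart. The function name `hourly_evolution` and the sample records are hypothetical, and the addresses come from the range reserved for documentation.

```python
from collections import defaultdict
from datetime import datetime

def hourly_evolution(hits):
    """hits: iterable of (datetime, ip) pairs, in any order.
    Returns a list of (hour, hit_count, unique_hosts, new_hosts) tuples."""
    per_hour = defaultdict(list)
    for when, ip in hits:
        # Truncate each timestamp to the start of its hour to form the bin.
        per_hour[when.replace(minute=0, second=0, microsecond=0)].append(ip)

    seen = set()  # every IP address observed in any earlier hour
    rows = []
    for hour in sorted(per_hour):
        ips = per_hour[hour]
        unique = set(ips)
        new = unique - seen  # hosts distinguished purely by IP address
        rows.append((hour, len(ips), len(unique), len(new)))
        seen |= unique
    return rows

# Hypothetical sample data.
sample_hits = [
    (datetime(2004, 1, 26, 20, 1), "192.0.2.1"),
    (datetime(2004, 1, 26, 20, 5), "192.0.2.1"),
    (datetime(2004, 1, 26, 20, 30), "192.0.2.2"),
    (datetime(2004, 1, 26, 21, 2), "192.0.2.2"),
    (datetime(2004, 1, 26, 21, 9), "192.0.2.3"),
]
for row in hourly_evolution(sample_hits):
    print(row)
```

Because hosts are keyed on IP address alone, a machine whose address changes between sessions is counted as new each time, which is the caveat noted above.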
Now here's something extremely interesting. One of the machines consistently hitting me was behind the firewall of a commercial site, so I wrote that site's administrator to report the situation. They tracked down the source of the packets to a single PC which was found to be running a variety of "spyware" and "adware" packages when scanned by Ad-aware and Spybot-S&D. Further, when these products were instructed to remove the "malware" packages from the computer, it immediately ceased sending packets and hasn't sent one since. Consequently, the working assumption at this point must be that one of the packages identified and removed by these system cleaning tools is responsible for the attack, whether deliberately or accidentally. (One can easily imagine a spyware package checking in with headquarters every four minutes and using an HTTP connection, which sails through most firewalls, to do so. Now suppose somehow it got my IP address instead of the one to which it was intended to "phone home". . . .) Here is a copy of the Spybot log from the run on that machine. I have replaced text which might identify the site which so kindly shared this information with me or the user whose machine was infected with "REDACTED" to protect their privacy; none of the redacted information is relevant to identifying the source of the attack. We now have a list of suspects; perhaps tracking down other machines involved in the attack and cleaning them will permit winnowing the line-up to the actual perp. (Note: by publishing this list of spyware and adware cleaned from the machine which, after said ablution, ceased to sin against Fourmilab, I am not asserting that one of the packages named in this report is, in fact, responsible for the attack or imputing blame to their creators. I'm simply reporting the observation that this machine consistently sent packets every four minutes for more than 50 hours and, immediately after the Ad-aware and Spybot run, ceased to send them. 
You never know--perhaps the machine was rebooted after the malware was removed, and the actual culprit was a program which had been running but didn't restart after the reboot. We shall see.)
The first evidence that other sites are being hit with this came today in an E-mail responding to my Slashdot post (again, I keep the identity of the sender and site confidential). The administrator of this site reports that he's seen the same pattern of hits to that site's home page for almost a year, starting slowly with several hundred a day and rising to on the order of 100,000 per day at present. The hits have the same characteristic four minute delay I'm seeing. He says he's heard of no other site being hit with these requests prior to my report.
What to make of all of this? Well, the rapid onset I saw looks very much like the propagation of a worm or virus, but may be the uptake of a command into a network of remote-controlled zombies. The malware connection is certainly worth investigating further when the opportunity to do a forensic analysis of another culprit host presents itself. I'm also going to analyse archived access logs to see if there's evidence for this attack prior to the January 21 date which I've used as the starting point for all analyses to date. Wouldn't it be cool if the perpetrator of this attack ran a test from their own machine before loosing this thing into the wild? The game is afoot.
. . . descending triangle which established its base and peak on the 26th and has been following that pattern ever since. Based on this advanced numerological technology, we'd expect a major violation of the baseline of the triangle around 12,500 to signal the end of the attack. The down-spikes yesterday were due to crashes of the regrettable 3Com firewall appliance whose hideous defects I will chronicle in excruciating detail once this adventure is over.
The administrator of the other site (see the last update) who's been seeing these hits for a year furnished me a list of IP addresses which have been hitting them. I cross-correlated this with my list of "heavy hitters" and found these sites as having hit both of us since January 21st, 2004. There's nothing obvious to be gleaned from this list: two are in Brazil and one in Portugal--some kind of weird lusophone conspiracy? I think not.
There have been reports of extortion threats against on-line gambling sites which threatened to take them down with a DDoS attack during the Super Bowl of American football. This made me wonder if the sites attacking me might be a DDoS network in "idle mode" waiting to be deployed against gambling sites which didn't pay up. A drastic decrease in hits on my site during the Super Bowl would be evidence for this. At this writing, the Super Bowl is still in the first quarter and I haven't seen a material decrease in the number of hits. I'll update this tomorrow when I have data for the entire duration of the game in hand.
I've updated the analysis tools available for download from the link at the bottom of this page to include a few new programs I've written. These programs are still abysmally documented, but better than the last time around. The onset.pl program is my attempt to determine the first plausible instance of the attack. I've run this not just on the log starting on January 21st which I've used for most of these analyses, but on logs dating back to December 8th, 2003, and I've found a few potential precursors of the main attack. The activity in this report from 220.127.116.11 is interesting, to say the least. On December 10th, this site hit my home page 300 times without ever hitting anything else for one hour with a mean time of 21 seconds between hits. Then, on Christmas Eve, a series of 3761 hits began which ran for 594 consecutive hours until January 18th, 2004, all without a single hit to any other page and an average time between hits of 569 seconds. Nothing remotely like this shows up in the historical logs I've scanned, and this IP address hasn't been seen since. The IP address resolves to a class C network belonging to pilosoft.com, an ISP in New York City. I shall be writing them tomorrow.
From yesterday's data, it's clear there wasn't a drop-off in the attacks during the Super Bowl as I speculated might happen if this was a DDoS network assembled for attacking gambling sites. Of course, maybe the extortionists didn't have a target to direct this particular network toward.
I've run my program which looks for early instances of attacks following the pattern on archived historical HTTP access logs (I have all of them, back to 1994 when the site opened) as far back as August 11th, 2003, and as far back as I've gone I've seen small numbers of sites hitting with the classic profile of the current mass attack. See the early attacks report for details. (Note: I haven't merged items from different log files, so some IP addresses are reported multiple times from the individual logs they were found within. The list is in ascending order of the date of the first packet received from the site.)
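The "classic profile" amounts to a host which fetches the home page repeatedly and never requests anything else. A minimal Python sketch of such a scan follows. It is not the author's Perl program, whose details are specific to the servers here. The function name `home_page_only_hosts`, the pre-parsed record format, and the default threshold of 100 hits are all assumptions made for illustration.

```python
from collections import Counter

def home_page_only_hosts(records, min_hits=100):
    """records: iterable of (ip, method, path) tuples parsed from an access log.
    Returns {ip: home_page_hits} for hosts whose every request was a GET or
    HEAD of "/" and which made at least min_hits such requests."""
    home = Counter()   # home page requests per host
    other = set()      # hosts which asked for anything else at all
    for ip, method, path in records:
        if method in ("GET", "HEAD") and path == "/":
            home[ip] += 1
        else:
            other.add(ip)
    return {ip: n for ip, n in home.items()
            if n >= min_hits and ip not in other}

# Hypothetical records: the second host also fetched a content page,
# as a browser would, so it does not match the profile.
records = (
    [("192.0.2.1", "GET", "/")] * 3
    + [("192.0.2.2", "GET", "/")] * 3
    + [("192.0.2.2", "GET", "/index.html")]
)
print(home_page_only_hosts(records, min_hits=3))
```

Run separately on each archived log, a scan of this kind reports the same IP address once per log in which it qualifies, which matches the unmerged listing described in the note above.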
I also used "snoop" (the Solaris equivalent of "tcpdump") to monitor and dump incoming packets from a variety of hosts hitting the site. Packets from hosts hitting about every four minutes are identical. The mystery of why a few IP addresses hit much more frequently is now solved: they are proxy servers relaying requests from hosts behind them. I've posted packet dumps of examples of direct hit and proxied requests. Note that the HTTP GET request includes the specification "Pragma: no-cache", which forces the request, if relayed through a proxy server, to go ahead and hit my site anyway rather than serving a cached copy. Also note that that's the only specification in the HTTP header: no referer, no user agent, or anything else which might identify the program sending the request. Needless to say, these hits don't look like they're coming from a Web browser. Finally, note that the host name is specified--the request is not routed by IP address but includes the host name. This means that the strategy of moving the target site from an IP address wildcard to name-based virtual hosting and discarding all non-"Host:" specified requests, as suggested by one comment on Slashdot, will not block these requests.
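The characteristics just listed are enough to write down a signature test for a directly-connected request. The following Python sketch is an illustration only. The function name `matches_attack_signature` is hypothetical, and the sample request bytes are reconstructed from the description above, not copied from the packet dumps. The header order and capitalisation in the real packets are not given here, so the check ignores both. HTTP/1.0 is accepted as well as HTTP/1.1 because a later update reports both protocols in the second wave.

```python
def matches_attack_signature(raw_request: bytes) -> bool:
    """True if the request is a GET of "/" whose only headers are Host and
    "Pragma: no-cache", with no Referer, User-Agent, or anything else."""
    head = raw_request.split(b"\r\n\r\n", 1)[0].decode("latin-1")
    lines = head.split("\r\n")
    parts = lines[0].split(" ")
    if len(parts) != 3:
        return False
    method, target, version = parts
    if method != "GET" or target != "/":
        return False
    if version not in ("HTTP/1.0", "HTTP/1.1"):
        return False
    headers = {}
    for line in lines[1:]:
        name, sep, value = line.partition(":")
        if not sep:
            return False  # malformed header line
        headers[name.strip().lower()] = value.strip().lower()
    # Exactly two headers: the host name and the cache-busting pragma.
    return set(headers) == {"host", "pragma"} and headers["pragma"] == "no-cache"

# Reconstructed examples (assumed byte-level layout).
attack = b"GET / HTTP/1.1\r\nHost: www.fourmilab.ch\r\nPragma: no-cache\r\n\r\n"
browser = (b"GET / HTTP/1.1\r\nHost: www.fourmilab.ch\r\n"
           b"User-Agent: Example/1.0\r\n\r\n")
print(matches_attack_signature(attack), matches_attack_signature(browser))
```

A request relayed through a proxy will usually carry additional headers inserted by the proxy, so a strict test like this one would apply only to direct hits.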
I still don't know for sure whether changing the server's IP address would help, but given that the HTTP requests include the domain name, that now seems far less probable. An IP address change might escape currently-hitting hosts, but as you can see from the chart and time evolution table, new hosts are continuously being recruited, and it's hard to imagine that they would include the domain name in the HTTP request but hard-code the IP address rather than doing a DNS lookup for it when the attacking program starts.
The peak hit rate remains steady at about 20,000 hits per hour with the divergence between unique hosts and hit rate remaining constant. There's no obvious change in the trend of newly recruited hosts. As I've observed before, there's no way to distinguish a genuinely new host from one with a floating IP address which has changed from one session to the next.
I have continued to scan historical logs and have not so far found anything matching the pattern of the attack earlier than the 2003-01-12 access I reported yesterday. By tomorrow a complete scan from August 2002 through the present should be complete and I'll update the early attacks document to include everything in that interval. The programs I use to scan historical logs are now included in the analysis tools download at the bottom of this page, but they are extremely specific to how the servers are set up here and may be completely useless at sites with different configurations.
Today's real news is the completion of the historical log scan and discovery of the date of the first access which matches the pattern of the current attack. Complete details are in the time evolution document--please read it. Because this information is of ongoing value, I'll leave the details there for any reader who consults that report, as opposed to including them only in this daily update. Eventually, I'll probably integrate that information into this document, but heaven knows when I'll find the time for that. In short, the evidence so far is that the attack began at 18:16:31 local time (17:16:31 UTC) on 2002-12-06 from IP address 18.104.22.168 which resolves to bgp01391858bgs.sequoa01.nm.comcast.net with "HEAD" requests. On 2003-01-12 at 06:22:55 local time, these switched to "GET" requests with only a few "HEAD"s seen in the next few days. This same IP address continued to pound away until 2003-11-25 at 10:49:02 local time, sending a total of 43,249 home page hits during this interval without a single hit to any other page on the site. This IP address has never been seen since in an attack log.
A couple more extremely arcane Perl programs have been added to the analysis tools download at the bottom of this page. These allow "retro-resolving" IP addresses that weren't resolved at the time a report was prepared and then merging the domain names back into the report. The attack onset program now looks for "HEAD" as well as "GET" requests.
I've extended the search for the first attack as far back as 2002-09-27 and found nothing earlier than the 2002-12-06 attack discovered yesterday. This appears to be when it all began.
The number of unique hosts with 100 or more hits passed the 10,000 mark today.
There's no significant change in the intensity of the attack today compared to yesterday.
After a little reflection, I figured out how to merge the old and new HTTP logs when preparing the reports, so I'll stay with merged data until it proves unwieldy.
Starting at 2004-02-13 13:00 local time (which, interestingly enough, is 12:00 UTC, and hence midnight of the 14th at the International Date Line), hits per hour and unique hosts hitting the site began to rapidly fall and have continued to drop monotonically ever since. Further, new hosts (attacks from IP addresses never seen in earlier attacks) dropped from 104 in the 12:00 local time bin to 34 in the 13:00 time bin, then 23 and 16 in the next two hours. Since then the number has continued to fall, although not monotonically, to a mean of around 10 in the last few hours. All of these are levels not seen since the attack began spooling up in earnest on 2004-01-25. The chart shows how dramatic this abatement has been.
These events put the "Mydoom precursor" hypothesis back into play. Recall that the main attack began a couple of days before Mydoom was unleashed in its full ferocity (although, as noted in earlier updates, there is abundant evidence of low-level attacks matching this pattern for over a year before the main assault began). Further, the requests this site was hit with were precisely of the form directed at SCO--repeated hits on the home page without any other requests. Disassembly and analysis of the Mydoom worm indicated that its attack on SCO had a timeout--it was pre-programmed to cease on February 12th. And lo and behold, if the current trend continues, it looks like the attack against this site began to wind down on a specific date--in this case as February 14 arrived around the globe. (Or, perhaps, it was triggered to stop at noon local time on the 13th, and we're seeing noon pass through the regions most heavily populated with hosts. It should be possible to tease this out of the heavy hitters report by analysing the time of the last packet sent from hosts whose time zone can be inferred from their top level domain.) Of course any analysis based on the local time of Windows machines is necessarily fuzzy because most of these machines do not run NTP or any other clock synchronisation protocol and may have clocks set mildly or wildly incorrectly.
While things are definitely looking up, it bears keeping in mind that the administrator of the other site who's seen similar attacks (see the update for 2004-01-30) observed two intervals during which the attack against his site almost entirely ceased, then spontaneously resumed. We'll see what the morrow brings.
The administrator of the other site (see the update for 2004-01-30) observed precisely the same cessation and resumption of the attack, at the same time it was seen here. It is virtually certain that both of our sites are being hit with the same thing, but what and why remains a mystery.
I also came across this in-depth analysis of a distributed denial of service which both provides technical details of how the attack zombie hosts are remote-controlled and offers an insight into the motivation and psychology of those behind these attacks.
. . . chart for February. Note that prior to February 23rd, the Pages, Files, and Hits in the top chart closely tracked one another, but on the 23rd, while the Hits remained high, the Pages and Files counts dropped to typical pre-attack levels; that's the patch in action. Starting around 18:00 local time on the 23rd, the unique IP addresses and new IP addresses per hour started an atypical decline (best seen on the time evolution chart--look at the very right and notice how the blue and green lines made a spike and then began to decline away from the red line, as opposed to following its general shape as before). Tomorrow we'll see if this means anything or is just a blip. I'm still researching alternatives within Apache--I know what I want to try next, but I haven't figured out how to shoehorn it into Apache's memory management and process model. The total number of unique IP addresses seen since the start of the attack topped 20,000 today.
. . . chart showing proxy and direct hits and the sum. (The chart begins on 2004-02-22 because that's when I started collecting data which permits distinguishing proxy and direct hits.)
As a glance at the chart will reveal, there's been no secular change since that date in the mix of proxy and direct hits, so that cannot explain the divergence of hit rate and unique IPs seen recently. Further, it's clear that although proxies occupy the top slots in the heavy hitters table, they account for a small fraction of the total hits, most of which come from individual machines hitting directly. This is useful information, as one of the active measures I'm thinking about deploying will not work on hosts connected through a proxy. With the total contribution of proxies so small, proceeding with anything which might reduce hits, even if only from directly connecting hosts, is justified. The program used to produce the proxy and direct hit chart has been added to the analysis tools download at the bottom of the page. Note, however, that it requires an HTTP server configured to write a special "forensic log", not one of the standard log formats. Instructions for configuring the Apache HTTP server (version 2) to write such a log are included in the program, proxymix.pl.
. . . chart, you'll notice some inconsistencies between previous days' data and today's. In the process of checking out the latest round of active measures code, I discovered that in the second wave of the attack, hits may be submitted in either HTTP/1.1 or HTTP/1.0 protocol. In the first wave, only HTTP/1.1 was used. I modified the analysis programs to take into account hits in both protocols, and this revised upward the hit counts for the second wave.
The last few hours are, as a glance at the chart will reveal, a mess. This is due to my rolling out the second round of active measures, which identifies the IP addresses of attacking hosts and automatically blocks them from ever reaching the HTTP server. This is presently in very crude form, and I'm running it mostly to stumble into bugs in the design and implementation. While it's running and doing its job, however, the attack hits should be drastically reduced. I'll have more to say about this attack blocker once it's checked out and put into routine service.
. . . analysis tools and ponder the Perl program gardol.pl. Gardol relies upon the most excellent system-independent kernel-level packet filtering provided by IP Filter.
The residual level of attack persists for two principal reasons. First, to avoid accidentally identifying a legitimate user as an attacker, the packet monitor requires that packets both conform to a signature which the attackers match and repeat a specified number of times without sending any other packets not matching the signature. This means that each host newly recruited into the attack will slip a few packets past the filter before it is unambiguously identified as an attacker and blocked. Second, and more significant, is that attack packets which are relayed through ISP HTTP proxy servers are not blocked at the IP address level as are those originating from directly-connected attackers. If proxy servers found to be forwarding attack packets were blocked, innocent users of those servers would find themselves blocked without any explanation. So, I permit the proxy requests to reach the HTTP server, where the second level of defence against the attack discards them. This causes them to show up in the log and appear in the chart above. Since they are discarded immediately upon being identified as attack packets, their impact on the server and outbound network bandwidth (the resources most scarce at this and most Web sites, and hence most vulnerable to an attack of this kind) is negligible. It would be easy enough to block the proxy servers and, say, generate an automatic E-mail to the abuse desk of the ISP they belong to indicating why. If I don't get any more response, not to speak of assistance, from the ISPs who are relaying these attacks by their customers, it may come to that.
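The first rule described above (block only after a run of signature-matching packets with nothing else in between) can be sketched as a small state machine. This is a Python illustration of the idea, not the logic of gardol.pl. The class name `AttackDetector`, the threshold of three, and the `exempt` set standing in for known proxy servers are all assumptions, since no specific values are given here.

```python
class AttackDetector:
    """Flags a host once it has sent `threshold` signature-matching requests
    with no non-matching request in between."""

    def __init__(self, threshold=3, exempt=()):
        self.threshold = threshold   # assumed value; the text gives none
        self.exempt = set(exempt)    # e.g. proxy servers, never blocked by IP
        self.streak = {}             # ip -> consecutive matching requests
        self.blocked = set()

    def observe(self, ip, matches_signature):
        """Record one request; return True if the host is (now) blocked."""
        if ip in self.exempt:
            return False
        if ip in self.blocked:
            return True
        if not matches_signature:
            # Any ordinary request suggests a legitimate user: start over.
            self.streak.pop(ip, None)
            return False
        self.streak[ip] = self.streak.get(ip, 0) + 1
        if self.streak[ip] >= self.threshold:
            self.blocked.add(ip)
            del self.streak[ip]
            return True
        return False

# Hypothetical host which sends nothing but matching requests.
detector = AttackDetector(threshold=3)
for _ in range(4):
    print(detector.observe("192.0.2.7", True))
```

With a threshold of three, the first two packets from a newly recruited host pass before it is blocked, which is the leakage accounting for part of the residual attack level. In a real deployment the blocked set would be handed to the packet filter; here it is only kept in memory.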
In other news, today I heard from an administrator at yet another site which is experiencing this same attack--thousands of hosts hitting, most every four minutes. They are currently seeing about 3200 distinct IP addresses hitting them, which is consistent with what I saw in the second phase of the attack, and also the attack reported by the other site I discussed in the update for 2004-01-30. That makes three sites so far under this attack. One wonders how many more of us there are.
Today the attack blocker intercepted and discarded its five millionth packet since being put into service. I'm in the process of documenting the tools I've developed to respond to the attack, which I'll make available to other sites under attack. Having suffered occasional DDoS attacks on other occasions over the last few years (although nothing remotely as severe as this one), I've tried to make the detection and blocking facility flexible enough to be deployed against most attacks which do not saturate a site's inbound bandwidth.
. . . post-remediation chart above through 19:00 local time on 2004-03-09.
Having minimised the impact of the attack on the site about as much as possible by measures within the target zone, I've spent the last few days researching possible sources of the attack, and I have found a very promising candidate--not the person or persons responsible for launching the attack, but the software which may be responsible for sending the packets hitting the site. At this point, I have no probative evidence that this software is responsible, but every characteristic of the software in question is completely consistent with the nature of the attack, and I have discovered no discrepancies between what I'm seeing on the receiving end and what I would expect to see were this software responsible for the attack.
I am sure you will be neither shocked nor stunned to learn who is responsible for this software: Microsoft. The facility in question is a "feature" introduced with Windows XP (and which may be retrofitted to Windows 98 and Me) called Universal Plug and Play (UPnP), which had such a catastrophic potential impact that the United States Federal Bureau of Investigation's National Infrastructure Protection Center (now part of the Department of Homeland Security) issued a warning about it on December 20, 2001. Basically, any unpatched XP system (or 98/Me with UPnP installed), becomes a wide-open Web server which can be used to bombard any host on the Internet with packets. Detailed information about the UPnP vulnerability is available in the following documents.
Now, at this point, I have absolutely no hard proof whatsoever that the UPnP vulnerability is what's causing the attack. But what is obvious is that it could be used to mount such an attack and, if it were, the resulting attack would have precisely the characteristics of the attack seen at this and the two other sites which are enduring it. All a potential attacker needs to do is take the already-posted exploit program, plug in the URL of the site to be attacked, then aim it at ranges of IP addresses likely to contain vulnerable PCs, of which there are doubtless bazillions, since any original installation of XP which has not been patched and can receive a UDP packet on port 1900 can be recruited into the attack. What's more, the very first evidence of this kind of attack was seen in the log file here on 2001-12-06, just two weeks before the vulnerability was publicly announced (see the update for 2004-02-06), and the attack on one of the other sites which have been hit became apparent in early 2002, shortly after the vulnerability was announced and the exploit code published. "If it looks like a duck, and it walks like a duck, and it quacks like a duck. . .".
Only by actually identifying a host participating in the attack and inspecting its registry settings (or wherever the UPnP poll address is kept) can one determine for certain whether UPnP is the source of the attack. But, if it is, one can make some predictions which are falsifiable on the receiving end of the attack. If the evidence seen is inconsistent with the predictions, then UPnP can be ruled out as the cause. The most obvious prediction is the following: Any attacking host which is not behind a firewall, NAT box, or other filter should have the ports used by UPnP (5000/tcp and 1900/udp) open. If a large population of attacking hosts does not fit this profile, then they are probably not XP machines running UPnP and could not be the source of an attack exploiting it.
For a period of 16 hours, I monitored each newly attacking host and, immediately after receiving an attack packet from it, launched an nmap port scan upon it, checking ports 1900 and 5000. Port 5000 is a TCP port, for which there are three possible results from the scan: "open", which means the machine will accept connections on that port, "closed", which means no service is listening on that port, and "filtered", which generally indicates the port is behind a firewall of some kind. I scanned a total of 3666 attacking hosts, and found the following for port 5000:
The high fraction of "Filtered" results is consistent with most of these hosts being home PCs connected to the Internet with NAT boxes which do not permit inbound connections on TCP ports. A "Closed" result does not necessarily mean the machine is not listening on a port--many firewalls, including the one at this site, respond to attempted connections on prohibited ports as if they were closed--these could be, for example, machines on university networks behind a campus firewall (but which could be recruited into the attack by another machine on the local network behind the firewall). Port scanning takes a while, and since I scanned hosts serially, there was often a delay between receipt of the attack packet and the port scan of its sender. It's possible, therefore, that some hosts may have disconnected in the interim, which would yield a "Closed" status.
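The three TCP outcomes described above can be illustrated with a minimal Python sketch. This is not the nmap scan I ran: it makes an ordinary full connection attempt, and the function name `classify_tcp_port` and the three-second timeout are assumptions chosen for illustration. It also inherits the caveat just mentioned, since a firewall which actively refuses connections is indistinguishable from a genuinely closed port.

```python
import errno
import socket

def classify_tcp_port(host: str, port: int, timeout: float = 3.0) -> str:
    """Classify a TCP port as 'open', 'closed', or 'filtered'.

    open     -- the connection attempt succeeded
    closed   -- the attempt was actively refused (nothing listening, or a
                firewall which answers as if nothing were listening)
    filtered -- no usable answer before the timeout, or some other failure
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        # connect_ex returns an errno value instead of raising for
        # errors reported by the underlying connect() call.
        code = sock.connect_ex((host, port))
    except OSError:
        # Timeouts and resolution failures may still be raised.
        return "filtered"
    finally:
        sock.close()
    if code == 0:
        return "open"
    if code == errno.ECONNREFUSED:
        return "closed"
    return "filtered"
```

Such a probe covers only port 5000. A UDP port like 1900 cannot be classified this way, because there is no connection handshake to observe.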
Port 1900, where the "NOTIFY" request is sent to direct a vulnerable machine toward the target site, is a UDP port. UDP is a "fire and forget" protocol, and there is no way to distinguish a UDP port blocked by a firewall from an open UDP port. Consequently, a port scan will report only "Open" or "Closed" for port 1900. Here's the result:
This result is largely consistent with the hypothesis, as the number of machines with port 1900 found "Open" (2961) is quite close to the sum of machines with port 5000 "Open" or "Filtered" (2913). Additional "data mining" studies are possible based on the port scan database (for example, do all machines connected through a given ISP or .edu domain have the same filtering profile?). I will report whatever I discover in subsequent updates.
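To put "quite close" in numbers, here is a small arithmetic check using only the three figures quoted above. The variable names are mine, and the split of 2913 between "Open" and "Filtered" is deliberately not assumed.

```python
total_scanned = 3666               # attacking hosts scanned
udp_1900_open = 2961               # port 1900 reported "Open"
tcp_5000_open_or_filtered = 2913   # port 5000 "Open" plus "Filtered"

# How many hosts separate the two counts, and what share of the sample is that?
difference = udp_1900_open - tcp_5000_open_or_filtered
share_of_scanned = difference / total_scanned

print(difference)                   # 48
print(f"{share_of_scanned:.1%}")    # 1.3%
```

The two counts differ by 48 hosts, a little over one percent of the hosts scanned.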
The final question is, if this is indeed the source of the attack, is there anything that can be done to identify who is responsible for it and/or to cause the attack to abate? Several possibilities come to mind in this regard.
Another approach would be to counter-attack in the same manner the published "chargen" exploit does--send an endless stream of data back to the attacker, eventually filling up its memory and hanging it in a CPU loop (which may get users' attention enough to motivate them to apply the patches which will fix the vulnerability) or, at least, by taking the attacker down, keep it from hitting us. The problem with this is that to take down an attacker, we'd have to send a constant data stream large enough to fill its memory. Few sites, certainly not this one, have sufficient outbound bandwidth to stream hundreds of megabytes of data to each attacking host. Somebody with a big pipe might consider this, but then the impact of this kind of attack on such a site would be tolerable and may not be worth the trouble trying to respond.
After the installation of the remediation software on 2004-03-03, the intensity of the attack (measured now by the greatly-reduced level of hits received before the packet blocking kicks in for an attacking site) remained more or less constant until March 23-24 when it appeared to fall by about a factor of two. (The spike in the chart is an artefact due to my installing a new version of the remediation software, which results in a short-term increase in attacking packets until the new version "learns" the list of currently attacking IP addresses and blocks them.) The attack remained at this reduced level of intensity until around 2004-04-01, when it started to diminish at a roughly linear rate. By 2004-04-12 the attack had diminished to levels not seen since the "holiday" in mid-February. The number of attacking hosts has declined to a "mere" 400 or so from the thousands seen shortly after the packet blocking was implemented. Looking at the chart above, it's tempting to conclude that something switched off on the first of April, with the population of attackers and the intensity of the attack declining ever since.
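The spike which follows a software upgrade can be illustrated with a toy model. This is only a sketch under stated assumptions, not the actual remediation software: the `Blocker` class, its hit threshold of three, and the five attacking addresses (taken from the 192.0.2.0/24 documentation range) are all hypothetical. The point is simply that a freshly started blocker has an empty block list, so every attacker gets a few packets through before it is recognised again.

```python
from collections import Counter

class Blocker:
    """Hypothetical blocker: an address is blocked once it reaches `threshold` hits."""

    def __init__(self, threshold: int = 3):
        self.threshold = threshold
        self.hits = Counter()   # hits seen per address since start-up
        self.blocked = set()    # addresses currently being discarded

    def admit(self, ip: str) -> bool:
        """Return True if the packet reaches the server, False if discarded."""
        if ip in self.blocked:
            return False
        self.hits[ip] += 1
        if self.hits[ip] >= self.threshold:
            self.blocked.add(ip)
        return True

def passed_per_round(blocker, attackers, rounds):
    """Each round, every attacker sends one packet; count those which get through."""
    return [sum(blocker.admit(ip) for ip in attackers) for _ in range(rounds)]

attackers = [f"192.0.2.{n}" for n in range(1, 6)]
fresh = Blocker(threshold=3)   # a newly installed version knows no attackers yet
print(passed_per_round(fresh, attackers, 5))   # [5, 5, 5, 0, 0]
```

In this model the burst lasts exactly as long as it takes each attacker to reach the threshold, after which the count of admitted packets drops back to zero.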