Software for ping probing the density of IPv4 address utilization

Here I describe the software I wrote and give details of how I used it, including discussion of the challenges of sampling IP addresses, such as trying to do it over a 24 hour period in order not to miss hosts which are connected only at certain times of the day.

I am well aware that some or many hosts do not respond to pings. I have no idea what the overall percentage of "non-ping-responsive" hosts is. If you have any ideas about this, please let me know and I will try to improve these pages. Nonetheless, I think these ping-based techniques are good enough to show some interesting aspects of IPv4 address space usage.

I am mainly trying to foresee what sorts of adjustments in currently allocated space are likely to be made once the reserves of addresses run out around 2011 or 2012. As far as I know, there is no other comparable analysis of host densities, using ping or any other techniques. I do not suggest that this analysis can't be bettered. I hope to inspire others to pursue this line of researching and visualising how the IPv4 address space is being used at present.

The tarball of software (source code in C for a Unix/GNU/Linux system), results and a file to drive gnuplot are here:

ping-scanner-software.tar.gz

The terminology used here includes, for brevity:

"host" - any device which is actually using one or more IPv4 IP addresses, whether or not it responds to a ping packet. A host is assumed to be a router, server or desktop computer - implicitly it means that this IP address is being "used".
"ping-responsive-host" - any device which responds to a ping packet, but (hopefully) not an intermediate router which returns the packet from another address.
"ping-responsive-host density" - over a certain range of IP addresses, usually specified by a prefix, what percentage of pings sent to that range generate acknowledgements. Sometimes, I have pinged every address. Other times I have pinged a randomised sample of those addresses.
"Random" - actually pseudo-random, using a 31 bit pseudo-random-number generator (PRNG) with the Park-Miller-Carta algorithm, as described in ../../dsp/rand31/ . This is seeded on the current time, in seconds.

I don't claim that pinging is the definitive way of deciding whether there is a host connected to an IP address. Some hosts may not respond to pings. Another approach is looking at the reverse DNS entries of every IP address in the prefix - which I have a program to do for any /24. Another approach, although imperfect too, would be to look up the IP address in a body of data such as that from the Internet Systems Consortium www.isc.org/index.pl?/ops/ds/ Internet Domain Survey Host Count, which scans name servers looking for hostnames, and the IP addresses of those hostnames.

Another approach, if we find a prefix which apparently has no hosts in it (according to ping), is to try connecting to some well known TCP ports on every IP address, for instance for SMTP, HTTP, SSH etc. This is likely to be viewed as unfriendly, and I have not attempted it.

A prefix without any hosts connected to it at the time of the survey is not necessarily a poor use of IP addresses. Maybe hosts will be connected soon. Maybe the hosts were turned off when I pinged them.

Also, a prefix with a single host is not necessarily a "waste" of IP address space, since the user may only have a single host they need to multihome, and using a /24 (or similar) prefix is the smallest use of IP address space they can make, with BGP-based multihoming, in order to get that host on the Net. The one IP address may serve an entire organisation, providing a mail server, name server, VPN services, Net access for hundreds of computers in the local LAN via NAT or a proxy server etc.

The software

ping-scanner.c

Random pings at about 1 per second to a complete /8 prefix (16,777,216 IP addresses). The prefix is specified numerically on the command line and the program runs for up to 86400 pings, at one a second. Results are written to a file such as 208.txt, if 208.0.0.0/8 is the prefix being scanned. This can take up to 24 hours, but sometimes it runs for less time. For instance, when a ping is acknowledged, the ping program will usually exit sooner than the 1 second wait. Also, some responses from the network make it exit very quickly, so this is not a completely reliable way of pinging for 24 hours. Nonetheless, I find it generally runs for close to 24 hours and so is useful for determining the average ping-responsive-host density of entire /8 prefixes.
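The random addressing can be sketched as follows. This is a minimal illustration, not the code from the tarball: it assumes the Park-Miller-Carta generator mentioned in the terminology above (multiplier 16807, modulus 2^31 - 1) and assumes the low 24 bits of each random value are used as the address within the /8. The function names rand31_next, random_addr_in_slash8 and format_addr are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* One step of the Park-Miller "minimal standard" generator,
   seed = (16807 * seed) mod (2^31 - 1), using Carta's method of
   folding the high bits back in instead of dividing.
   The seed must be in the range 1 .. 2^31 - 2. */
uint32_t rand31_next(uint32_t *seed)
{
    assert(*seed != 0 && *seed < 0x7FFFFFFFu);

    uint32_t lo = 16807u * (*seed & 0xFFFFu);   /* low 16 bits of seed  */
    uint32_t hi = 16807u * (*seed >> 16);       /* high 15 bits of seed */

    lo += (hi & 0x7FFFu) << 16;   /* low 15 bits of hi go to bits 16..30 */
    lo += hi >> 15;               /* each multiple of 2^31 counts as 1   */
    if (lo > 0x7FFFFFFFu)
        lo -= 0x7FFFFFFFu;        /* one conditional subtraction suffices */

    *seed = lo;
    return lo;
}

/* Combine a /8 number (1..223) with the low 24 bits of a random value
   to get an IPv4 address as a 32-bit integer. */
uint32_t random_addr_in_slash8(unsigned slash8, uint32_t rnd)
{
    assert(slash8 >= 1 && slash8 <= 223);
    return ((uint32_t)slash8 << 24) | (rnd & 0x00FFFFFFu);
}

/* Write the address in dotted decimal; buf needs at least 16 bytes. */
void format_addr(uint32_t addr, char *buf, size_t len)
{
    snprintf(buf, len, "%u.%u.%u.%u",
             (unsigned)(addr >> 24), (unsigned)((addr >> 16) & 0xFF),
             (unsigned)((addr >> 8) & 0xFF), (unsigned)(addr & 0xFF));
}
```

A scanner built on this would seed the generator once (for instance from the time in seconds, mapped into the valid seed range), then call rand31_next once per ping and pass the formatted address to the ping program.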

The ping program on Debian 3.1 can't ping 0.0.0.0/8, so I don't attempt to ping this prefix in any of the following tests.

There is also a version of this program which sends a ping approximately every 5 seconds. This is to enable a 24 hour scan of all the /8 prefixes from the next-mentioned program. Ideally, this program should be altered to print the percentage first, because that is what we sort the output lines on to make a distribution chart.

ping-scanner-call-multiple.c

ping-scanner-slash24.c

ping-scanner-slash24-slow.c

prefix-process.sh

The purpose of this is to take the latest raw "global BGP routing table" file from bgp.potaroo.net/ipv4-stats/prefixes.txt and split it into multiple files, counting the prefixes in the output files. Firstly, it greps prefixes.txt to leave only lines which contain prefixes which are "advertised" - that is, a router says it is responsible for this prefix and asks all others to send packets to it which are addressed to this prefix. This gets rid of prefixes which are not allocated and those which are allocated and not advertised. The result is prefixes-adv.txt. The prefixes.txt file is, as far as I know, completely contiguous in that all the prefixes, together, entirely cover the IPv4 address space 0.0.0.0 to 255.255.255.255. This file is produced daily, based on the BGP routing table in a particular router. I understand it is a good representation of the routes that every transit and multihomed border router knows about and needs to consider when deciding which interface to forward packets on.

prefixes-adv.txt is then processed with grep to produce multiple files, each containing only one prefix length. I do this for all lengths 1 to 32, but there are none shorter than /8, and we are generally not interested in those longer than /24, of which there are a handful. The main purpose of this is to generate the files prefixes-adv-08.txt to prefixes-adv-24.txt and to count their lines. The counts appear in out.txt. These files can be used for several purposes, including sending to ping-scanner-mask.c and ping-scanner-mask-multi.c.

There is an OpenOffice Calc spreadsheet Prefix-analysis-1.ods to process a column from out.txt to calculate how many IP addresses are covered by advertised prefixes of different lengths. In all of this work, I need to be able to edit a text file and select, copy and paste a rectangular block of text, such as the numbers in a particular column. There are various editors which do this, but I use Code Genie 3.0, a Windows program from 2002.

ping-scanner-mask.c

This program reads a text file, where each line specifies a prefix between /8 and /24 in length. The program has a mask of the IPv4 address space, on /24 boundaries, so it has 2^24 (16,777,216) entries, one for each set of 256 IP addresses. The input file's prefixes set the mask to allow pings to these sets of IP addresses. Pings are generated at one per second, with address bits 23 to 0 controlled by the PRNG. For each such set of bits, the program tries to ping that address in every /8 (from 1 to 223), depending on whether the mask is set in that prefix. So, for instance, it tries 1.23.45.67 (where 23.45.67 are from the PRNG) and then it tries 2.23.45.67 etc. up to 223.23.45.67.
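The mask logic described here can be sketched roughly as below. This is an illustration under assumptions, not the code of ping-scanner-mask.c: it assumes one byte per /24 in the mask, and the names mask_set_prefix and targets_for_random_bits are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define SLASH24_COUNT (1u << 24)   /* one mask entry per /24 */

/* Mark every /24 covered by base/len (8 <= len <= 24) as pingable.
   base is the prefix address as a 32-bit integer. Counting forward
   from base also handles prefixes that do not start on their own
   boundary. */
void mask_set_prefix(uint8_t *mask, uint32_t base, unsigned len)
{
    assert(len >= 8 && len <= 24);
    uint32_t first = base >> 8;              /* index of the first /24   */
    uint32_t count = 1u << (24 - len);       /* number of /24s in prefix */
    for (uint32_t i = 0; i < count && first + i < SLASH24_COUNT; i++)
        mask[first + i] = 1;
}

/* For one 24-bit random value, collect the addresses to ping:
   the same low 24 bits in every /8 from 1 to 223 whose /24 is unmasked.
   Returns the number of addresses written to out (at most 223). */
unsigned targets_for_random_bits(const uint8_t *mask, uint32_t low24,
                                 uint32_t out[223])
{
    unsigned n = 0;
    low24 &= 0x00FFFFFFu;
    for (uint32_t slash8 = 1; slash8 <= 223; slash8++) {
        uint32_t addr = (slash8 << 24) | low24;
        if (mask[addr >> 8])
            out[n++] = addr;
    }
    return n;
}
```

The mask would be allocated once with calloc(SLASH24_COUNT, 1), filled from the input file, and then consulted for every random 24-bit value. With only a few prefixes unmasked, most passes through the 223 candidates find nothing to ping.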

This program enables randomised ping scanning of large and complex areas of the address space. For instance, I used it to find the average ping-responsive-host density for IP addresses which are advertised as /24 prefixes, and likewise for all the other lengths to /8. There were considerable differences between these, which prompted me to analyse the distribution of ping-responsive-host densities in a sample of individual prefixes of the same length, which leads to the next program:

ping-scanner-mask-multi.c

This resembles ping-scanner-mask.c except that instead of producing just a single result, for the entire set of pings sent and acknowledgments received, it maintains separate pingsent and ackcount counters for the set of IP addresses specified in each prefix. The program reads the prefixes on individual lines, and each prefix it reads generates a different number (above 0) in the 16,777,216 long mask array. Subnets of /8 to /24 are accepted as inputs, including those which span the boundaries of longer prefixes. (This is discussed below in the section prefix-span-tag(zeropad).c.)

This is probably the most powerful and useful program here. It can use every prefix in the input file, such as a file identical to or derived from prefixes-20.txt or similar, as described above. It can also step through the file, using only particular lines. The span and the offset command line parameters enable one instance of this program, using one input file, to ping only those prefixes on, for instance, lines 12, 22, 32 etc. This is with a span of 10 and an offset of 1. This makes it relatively easy to craft a shell script, using "&" at the end of each line, to fire up a bunch of instances of this program, all working from the same input file, but each pinging a certain number (the length of the file divided by the span) of prefixes, and each instance doing a different set, assuming the spans are the same and the offsets are different.

Each instance writes its results to a distinctively named file, with the input filename, the span and the offset in the results filename. These can then be concatenated to show the total results of pinging a bunch of prefixes at the same time. It would be tricky having one executable program doing more than about one ping per second, since we need to wait up to a second for the acknowledgement packet to arrive. Therefore, to launch ping surveys at more than one per second, it is desirable to have an easy way of running multiple instances of the same program, with the results being easy to bring together.

A single line of the results file of this program looks like this, assuming its input file is something like prefixes-adv-23.txt. In this case, the input file has been tagged with "Dxx", using prefix-span-tag.c as described below, to indicate whether each prefix spans the border of a longer prefix.

I have labeled the columns, but this is not part of what is in the file.

Acks as %        Pings  Line number                 Depth of longer
of pings         sent   from input                  prefixes spanned
sent      Acks   |      file   Prefix               |     Other items from input file
|         |      |      |      |                    |     |
000.0000, 00000, 00172, 02562, 193.108.224.000/23 , D00 , 512 20010629 ripencc adv
003.1645, 00005, 00158, 03389, 194.103.055.000/23 , D01 , 512 19941019 ripencc adv
000.6369, 00001, 00157, 04216, 196.008.114.000/23 , D00 , 512 19931110 afrinic adv
001.1111, 00002, 00180, 05043, 199.079.133.000/23 , D01 , 512 19940131 arin adv

Multiple such results files can be combined into a single file. When the lines of that file are sorted, the lowest percentage results are at the top. Then it is simply a matter of selecting the left column of percentages, copying, pasting in to a spreadsheet program and then one can have a bar-graph showing the distribution of host densities.
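For anyone who prefers to process the results in code rather than a spreadsheet, the first five fields of a results line can be read with sscanf. This is a minimal sketch, assuming the comma-separated layout shown above; the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

struct result_line {
    double percent;      /* acks as a percentage of pings sent   */
    int acks;
    int pings;
    int line_number;     /* line of the prefix in the input file */
    char prefix[32];     /* e.g. "194.103.055.000/23"            */
};

/* Parse the first five comma-separated fields of a results line.
   Returns 1 on success, 0 if the line does not match. */
int parse_result_line(const char *line, struct result_line *r)
{
    assert(line != NULL && r != NULL);
    /* %d reads "00005" as decimal 5; %i would treat it as octal. */
    int n = sscanf(line, "%lf , %d , %d , %d , %31s",
                   &r->percent, &r->acks, &r->pings,
                   &r->line_number, r->prefix);
    return n == 5;
}

/* Recompute the percentage from the raw counts. */
double density_percent(int acks, int pings)
{
    return pings > 0 ? 100.0 * acks / pings : 0.0;
}
```

Recomputing the percentage from the acks and pings columns is a cheap consistency check on concatenated files before sorting them.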

A variation on this program, made by commenting out some lines, is ping-scanner-mask-nt.c. Normally, the program produces an output file with the lines as shown above, one for each prefix selected from the input file, and at the end a line with total statistics for all the pings sent and acknowledged. That total line is not needed when we concatenate output files and sort them, so this "-nt" version does not print the total line.

prefix-span-tag.c & prefix-span-tag-zeropad.c

In this example, 192.126.1.0/22 spans four /24s, but not in a simple manner. If it started at 192.126.0.0 it would be on a /22 boundary and there would be no such "spanning". Because it starts at 192.126.1.0 it does not have a simple set of bits which are fixed and another set which have all combinations. This would be tagged as having depth "D02" with these programs.

The zeropad version also pads out the prefix address numbers to three leading-zero digits. I wrote it this way at first, to enable easy sorting of lines according to their addresses. However, ping and other programs don't like the zero-padded format, so I wrote a second version without this.

Input lines such as from one of the prefixes-adv-xx.txt files:

24.196.224.0/20 4096 20010523 arin adv
24.196.241.0/23 512 20010523 arin adv
24.196.244.0/20 4096 20010523 arin adv
24.197.4.0/22 1024 20010523 arin adv
24.197.8.0/20 4096 20010523 arin adv

produce lines such as this (no zeropadding):

24.196.224.0/20 , D00 , 4096 20010523 arin adv
24.196.241.0/23 , D01 , 512 20010523 arin adv
24.196.244.0/20 , D02 , 4096 20010523 arin adv
24.197.4.0/22 , D00 , 1024 20010523 arin adv
24.197.8.0/20 , D01 , 4096 20010523 arin adv

or this, with zeropadding:

024.196.224.000/20 , D00 , 4096 20010523 arin adv
024.196.241.000/23 , D01 , 512 20010523 arin adv
024.196.244.000/20 , D02 , 4096 20010523 arin adv
024.197.004.000/22 , D00 , 1024 20010523 arin adv
024.197.008.000/20 , D01 , 4096 20010523 arin adv

The idea is that later, when I am looking at the host density of various advertised prefixes, I can easily tell which are straightforward prefixes or which span several longer prefixes, and might therefore in some way be used differently. It is possible to count these in each prefixes-adv-xx.txt file with grep. All /24s will be tagged "D00" since they are all on x.x.x.0 /24 boundaries.
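The depth tags in the examples above are consistent with a simple rule: find the lowest set bit of the start address, count its position from the most significant bit (1 to 32), and subtract the prefix length, with anything at or below zero tagged D00. The sketch below implements that rule. It is inferred from the examples on this page, not taken from prefix-span-tag.c, and the names span_depth and ip4 are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* How many bits past the prefix length the start address still has set.
   0 means the prefix starts on its own boundary. */
unsigned span_depth(uint32_t addr, unsigned len)
{
    assert(len >= 1 && len <= 32);
    if (addr == 0)
        return 0;
    unsigned lowest = 32;          /* bit position, 1 = most significant */
    while ((addr & 1u) == 0) {     /* walk up to the lowest set bit      */
        addr >>= 1;
        lowest--;
    }
    return lowest > len ? lowest - len : 0;
}

/* Pack four octets into a 32-bit address. */
uint32_t ip4(unsigned a, unsigned b, unsigned c, unsigned d)
{
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) | ((uint32_t)c << 8) | d;
}
```

Under this rule 192.126.1.0/22 has its lowest set bit at position 24, two bits past the /22 length, which matches the "D02" tag described earlier in this section.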

prefix-process.sh & prefix-process-leading-zeroes.sh

These shell scripts require grep to be installed (which it usually is on a Unix/GNU-Linux system) and one of the programs prefix-span-tag or prefix-span-tag-zeropad to be in the parent directory.

prefix-process.sh uses the former, and processes prefixes.txt (which should be in the current directory) with grep to create prefixes-adv-untagged.txt , which contains only those prefixes which are advertised and therefore connected to the Net. It then uses prefix-span-tag to create a tagged version of this, prefixes-adv.txt . The script then uses grep to create separate files prefixes-adv-02.txt to prefixes-adv-32.txt . The -02 to -07 files have zero length. The -08 to -24 files contain lines with the depth tags, and it is these files we are primarily interested in for directing ping surveys. The -25 to -32 files are produced from prefixes-adv-untagged.txt. They have no depth tags. out.txt has the prefix counts for each prefix length.

prefix-process-leading-zeroes.sh does the same job, but with decimal sections of prefix addresses expanded to 3 digits with leading zeroes.

inaddr-slash-24.c

On 2007-02-27 I had a bunch of /24 advertised prefixes which were "ping unresponsive". I wanted to know if there were any hosts there, so I wrote a program to use dig to do reverse mapping queries on every IP address in a /24 prefix. This program gets its prefixes from a text file, such as a results file from concatenating and sorting the output files of 282 instances of ping-scanner-slash24-slow.c into ascending percentage order.

It runs dig, with lines of the form:

+short -x 192.26.45.0
+short -x 192.26.45.1
+short -x 192.26.45.2
+short -x 192.26.45.3
. . .

+short -x 192.26.45.254
+short -x 192.26.45.255

dig usually only puts out some text when it finds a text name returned for its query. Some error messages may still get through. I tried to filter these with grep, but piping to grep on a command line run with system() turned out to cause all sorts of havoc. The result of the scan of one /24 prefix is written into a file with a name such as: 192.026.045.0-24-inaddr.txt .

Addresses in some prefixes, such as 192.231.192.0/24, cause dig to produce as its result ";; connection timed out; no servers could be reached". This is very slow, and it can take a long time (more than 10 minutes) to get through the 256 requests. I try to stop this by quitting that prefix if dig returns a non-zero status, which I think it does for such a condition, but there will still be this error message in the output file. At the end of each cycle, I wanted to put the file through grep to get rid of any such error messages, but I had all sorts of trouble calling grep in any way using system(). This means you will have to look manually at the contents of any non-zero results file. The single error message my version of dig produces shows up as 53 bytes in the results file.
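One possible way around the grep problem is to read dig's output through popen() and drop the error lines in C, since dig's diagnostics begin with ";;" as in the message quoted above. The following is a sketch of that idea, not what inaddr-slash-24.c does; the function names are hypothetical and popen() is assumed to be available (it is a POSIX function).

```c
#define _POSIX_C_SOURCE 200809L   /* for popen/pclose under strict C modes */
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build "dig +short -x a.b.c.host" for one address in a /24. */
void build_dig_command(char *buf, size_t len,
                       unsigned a, unsigned b, unsigned c, unsigned host)
{
    assert(a <= 255 && b <= 255 && c <= 255 && host <= 255);
    snprintf(buf, len, "dig +short -x %u.%u.%u.%u", a, b, c, host);
}

/* Lines starting with ";;" are dig diagnostics, not host names. */
int is_dig_error_line(const char *line)
{
    return strncmp(line, ";;", 2) == 0;
}

/* Run one query and append any real answers to out.
   Returns the number of answer lines kept, or -1 if the command
   could not be started. */
int reverse_lookup(const char *command, FILE *out)
{
    FILE *p = popen(command, "r");    /* reads the command's stdout */
    if (p == NULL)
        return -1;
    char line[512];
    int kept = 0;
    while (fgets(line, sizeof line, p) != NULL) {
        if (line[0] == '\n' || is_dig_error_line(line))
            continue;                 /* drop blanks and ";; ..." messages */
        fputs(line, out);
        kept++;
    }
    pclose(p);
    return kept;
}
```

A caller would loop the host value from 0 to 255, build each command and pass it to reverse_lookup with the results file as out. The return value of pclose() could also be inspected to abandon a prefix early, in the same spirit as the non-zero status check described above.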

Research 1: Host density of /8 prefixes 1.0.0.0/8 to 223.0.0.0/8

On 21 February 2007, I used the following command line to randomly ping the Net for 24 hours, with an instance each of a 5 second per ping version of ping-scanner.c for each /8 prefix 1.0.0.0/8 to 223.0.0.0/8:

ping-scanner-call-multiple 1 223 17280

Each instance sent a ping every 5 seconds to some random address in its assigned /8 prefix. Each prefix got 17280 pings.

A day later (some quite a lot earlier, for reasons I investigated to the point of not being concerned about them) I had a bunch of files 001.txt to 223.txt. I used concatenate-outfiles.sh to bring them all together, and they did so in name order. The results are results-slash-1-to-223.txt and looked like this, with the prefix, the raw acks received, and the percentage of pings for which acks were received:

070, 03485,   20.1678
071, 05984,   34.6296
072, 03582,   20.7292
073, 01637,    9.4734
074, 02987,   17.2859
075, 02020,   11.6898
076, 00884,    5.1157
077, 00355,    2.0544
078, 00000,    0.0000
079, 00085,    0.4919

From this, I created a text file, using information about each /8 prefix from www.potaroo.net/tools/ipv4/ and then used this text file as the basis of the colour HTML table here ../#slash8-1-223-results .

The earlier endings to some instances seem to be related to responses coming back which are not proper acks, but which end the ping program earlier than the normal 1 second delay. I satisfied myself that there was no reason to think that the acknowledgement rate was lower than it should be. The earlier endings meant that the sampling time wasn't a full 24 hour cycle in weekdays (this was 2007-02-21) but that some /8 prefixes were sampled for a shorter time. I don't think this would affect the acknowledgement rate enough to render the striking differences in host density invalid. The details are here: completion-times.txt.

Research 2: ping-responsive-host density of selected /24 prefixes

I can't remember now (mid-March 2007) how I discovered that the ping-responsive-host density of many /24s was rather low. I wanted to find out the distribution of their densities. For instance, if I tested 100 /24s by pinging every address in each one, and 30 of them gave no acknowledgements, then the start of my graph would be a horizontal line at 0 height, stretching 30% of the way towards the right.

On 2007-02-23 I ran 282 instances of ping-scanner-slash24-slow from a manually prepared shell script. All ran in parallel, mainly for a day (to try to avoid missing computers from some country where they were not turned on while a shorter test was run), but some finished earlier. I retried for shorter times, at 1 ping per second, those prefixes for which the program had finished earlier, and found that some of them were getting acks from a router at an IP address other than that which the ping was sent to. I did not count these as acks. The 282 prefixes were chosen manually, in a reasonably random fashion, from the prefixes-adv-24.txt of a day or so before. (I got the file and repeatedly: selected two screenfuls, deleted it, moved down a line and so on.) I also retested some of those prefixes which returned no acks and found them firstly to be returning no acks again, and secondly to be still in the latest prefixes.txt.

All the output files were concatenated and sorted by percentage (the first result), and then I used the percentages to make a bar-graph chart.

This led to the realisation that quite a lot of these /24 prefixes appear to have no computers connected to them - or at least that any connected computers were not returning pings. Perhaps there are other explanations, but it seems the most likely one is that a significant number, maybe as high as 35% of these /24s, are not being used for traffic and so could be removed from the Net, reducing the load on routers all over the world, without affecting traffic. In order to research these apparently "ping unresponsive" prefixes, I also looked at the reverse address mapping of these prefixes. More on that below.

Research 3: ping-responsive-host density of BGP advertised prefixes /24 to /10

I could tell from Research 1 above that the big /8 prefixes in the "global BGP routing table" generally had a very low host density. Research 2 made me think that the average of /24s was rather low too, and that the problem was that about 1/3 had zero ping-responsive-host density and that only a few of them had substantially high ping-responsive-host densities.

ping-scanner-mask.c can be used to tell me the average host density for all the /24 prefixes, or all the /23 etc. all the way to the /8s, which I already knew. The results from a 24 hour weekday set of runs of this program (one for each of the lists of advertised prefixes) are in the table ../#prefixes-analysis . The form of this command was:

ping-scanner-mask prefixes-adv-11.txt 86400 &

I did this 24 hour scan on 26 and 27 March 2007.

Research 4: ping-responsive-host density distributions of BGP advertised prefixes /9 to /24

I wanted to see the distribution curves of host density for samples of BGP advertised prefixes from /10 to /24. I already knew the ping-responsive-host density for each /8 prefix. There were only two /9 prefixes, and for one of them, I got no responses to pings (details below).

For the address space covered by advertised /10 prefixes, and then for /11, /12 etc. to /24, I wanted to know not just the average ping-responsive-host density, but how much the average was held down by lots of low ping-responsive-host density prefixes, while a smaller number of operators made much better use of their address space (or at least let their computers respond to ping). This is part of trying to understand what a reasonable expectation of host density is, so when we do start to run out of IPv4 addresses - both reserved addresses and by making better use of those currently allocated (advertised and not) - we would have some idea now of how far this process can go. Clearly, different users have different needs. A single multihomed organisation may only present a few computers to the Net, and needs to do so in prefixes of 256 addresses, so their host density is going to be pretty low.

ping-scanner-mask-multi.c is the tool for finding distributions of densities. It can operate in a number of ways and some thought is required to use it well. There is a big difference between sampling a small number of /10 prefixes compared to sampling a few dozen or a few hundred /24 prefixes.

I had two practical constraints using this program. Firstly, it uses about 30MB of memory for each instance, so I could run 24 or so instances at most. That was OK. The other restriction is that when it is working on a small number of /24 subnets, the CPU spends most of its time running through the mask not finding many /24's unmasked for pinging. For instance, I ran it with only 6 /24s, which means only a small proportion (6 / (224 * 256 * 256) = one part in 2.4 million) of the /24s were unmasked. I could only run about 7 instances at a time before the 2.4GHz P4 Celeron became overloaded. Also, with longer prefixes with fewer numbers of IP addresses, it makes no sense to scan them for a long time - for instance sending more than 256 pings to a /24 on a random basis - unless it is desired to spread the sampling over a 24 hour period to avoid the chance of testing at a time of day when some or all of the computers were not operating.

The computer used for all these tests was a Debian 3.1 machine behind a NAT firewall of another Linux machine, which connects to the Net via a lightly-loaded Internode ADSL link, capable of 3.137 Mbps downstream and 304kbps upstream.

Sampling questions

I am not trying to discover ping-responsive-host densities with the sort of rigour reserved for scientific experiments. However, I want the figures to be reliable enough to get some idea of how the address space is currently being used.

With the short prefixes (/8 etc. which really means a long or large subnet in terms of the number of IP addresses), there are not so many of them, so it makes sense to survey each one of them. For instance, there are 283 /13s - so I could survey each one. This means I won't be pinging every IP address within each prefix, but each one has half a million IP addresses and I don't need to ping more than every few hundred or so addresses to have a good idea of the ping-responsive-host density.

With the longer prefixes (towards /24), I want to sample at least a hundred of them, and ideally, have a ping density high enough that in general I am sending at least one ping per IP address. In practice, I am only looking for general trends in figures, so I don't need to ping every possible IP address in the prefix.

Ideally, I would run 24 hour scans to overcome biases in computers being turned on and off at various times of day. I did this in some cases, but in other cases, to save time, I used 12 and 4 hour scans, using the following techniques. Some I did in February and others I did in March. The ones in March were done in more of a hurry, so I used shorter scan times. I doubt that this would substantially affect the results, but if someone wants to do a more thorough survey, the software is here and I will be happy to link to similar or improved surveys.

The graphs I produced (with gnuplot) and the raw results files on which they are based are all in the software tarball.

The figures I used for the graph come from the 24 hour scan on 21 February, where an instance of ping-scanner.c, modified to send a ping every 5 seconds, pinged random addresses in each of the /8 prefixes 1.0.0.0/8 to 223.0.0.0/8. Each prefix got about 17280 pings. The results are as follows, with the percentage acknowledgements, the prefix address and the number of acknowledgements.

0.0000   003, 00000
0.5787   004, 00100
0.1273   008, 00022
3.0266   012, 00523
0.0000   015, 00000
0.0000   016, 00000
0.0058   017, 00001
0.1215   018, 00021
0.0405   032, 00007
0.0058   033, 00001
0.0694   035, 00012
0.3877   038, 00067
0.0000   044, 00000
0.0000   045, 00000
0.0000   053, 00000
0.0000   055, 00000
0.0174   057, 00003
0.3414   126, 00059
0.0000   214, 00000

It can be seen that many of these /8s have extremely low, perhaps zero, ping-responsive-host density. The average is 0.248%.

/10

On 20 March 2007, there were only 13 /10 prefixes. One of them - the last - spans /10 boundaries.

20.0.0.0/10 , D00 ,  4194304 19890904 iana adv
60.64.0.0/10 , D00 ,  4194304 20040520 apnic adv
63.0.0.0/10 , D00 ,  4194304 19980731 arin adv
63.64.0.0/10 , D00 ,  4194304 19990122 arin adv
75.0.0.0/10 , D00 ,  4194304 20060228 arin adv
84.128.0.0/10 , D00 ,  4194304 20040310 ripencc adv
86.128.0.0/10 , D00 ,  4194304 20050207 ripencc adv
91.0.0.0/10 , D00 ,  4194304 20060703 ripencc adv
172.128.0.0/10 , D00 ,  4194304 20000324 arin adv
208.192.0.0/10 , D00 ,  4194304 19960508 arin adv
219.0.0.0/10 , D00 ,  4194304 20011031 apnic adv
220.0.0.0/10 , D00 ,  4194304 20020412 apnic adv
221.16.0.0/10 , D02 ,  4194304 20030115 apnic adv

I used this command:

ping-scanner-mask-multi prefixes-adv-10.txt 7200 &

This sent about 559 pings to each prefix over 2 hours (7265 pings across the 13 prefixes), starting at 8:45AM GMT. The first number in each line of the output file shows the percentage response from each prefix. The average response rate was 9.16%.

000.0000, 00000, 00558, 00001, 20.0.0.0/10 , D00 , 4194304 19890904 iana adv
001.0600, 00006, 00566, 00002, 60.64.0.0/10 , D00 , 4194304 20040520 apnic adv
002.6881, 00015, 00558, 00003, 63.0.0.0/10 , D00 , 4194304 19980731 arin adv
003.0035, 00017, 00566, 00004, 63.64.0.0/10 , D00 , 4194304 19990122 arin adv
022.4014, 00125, 00558, 00005, 75.0.0.0/10 , D00 , 4194304 20060228 arin adv
023.5188, 00131, 00557, 00006, 84.128.0.0/10 , D00 , 4194304 20040310 ripencc adv
004.8473, 00027, 00557, 00007, 86.128.0.0/10 , D00 , 4194304 20050207 ripencc adv
002.6881, 00015, 00558, 00008, 91.0.0.0/10 , D00 , 4194304 20060703 ripencc adv
010.0538, 00056, 00557, 00009, 172.128.0.0/10 , D00 , 4194304 20000324 arin adv
001.8416, 00010, 00543, 00010, 208.192.0.0/10 , D00 , 4194304 19960508 arin adv
017.0250, 00095, 00558, 00011, 219.0.0.0/10 , D00 , 4194304 20011031 apnic adv
015.4121, 00086, 00558, 00012, 220.0.0.0/10 , D00 , 4194304 20020412 apnic adv
014.5359, 00083, 00571, 00013, 221.16.0.0/10 , D02 , 4194304 20030115 apnic adv

prefixes-adv-10.txt, 07265, 00666, 9.1672
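The totals line gives the file name, total pings sent, total acks and the pooled percentage. As a small sketch of how that pooled figure relates to the per-prefix lines (the function name pooled_percent is hypothetical, and this is not code from the tarball):

```c
#include <assert.h>
#include <stddef.h>

/* Pooled response rate in percent: total acks over total pings.
   This weights each prefix by the number of pings it received,
   unlike a plain average of the per-prefix percentages. */
double pooled_percent(const int *acks, const int *pings, size_t n)
{
    assert(acks != NULL && pings != NULL);
    long total_acks = 0, total_pings = 0;
    for (size_t i = 0; i < n; i++) {
        total_acks += acks[i];
        total_pings += pings[i];
    }
    return total_pings > 0 ? 100.0 * total_acks / total_pings : 0.0;
}
```

Fed the acks and pings columns of the thirteen lines above, the totals come to 666 acks from 7265 pings, in agreement with the totals line.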

/11

/12

/13

On 22 March 2007 there were 287 /13 prefixes. For this and the longer prefixes below, I tested about 120 prefixes of each length. Generally I used 8 instances of the ping-scanner-mask-multi program for 4 hours, to send about 960 pings to each prefix. However, for the last two - /23 and /24 - I used shorter times, since these prefixes only have 512 and 256 IP addresses in them, respectively.

To sample 120 of the 287 /13 prefixes, I used 8 programs for four hours, each to scan 15 of the prefixes. The last two parameters on each line are the step (called "span" in the program) and offset, telling each instance to start at the line number "offset" to get its first prefix to test, and then to step a certain number of lines to get the next. In this case, I want 15 of 287 lines for each instance, so the step is 19. The 8 values of offset start at 0 and move up in steps of about 19/8 to evenly sample the whole set of prefixes.

ping-scanner-mask-multi prefixes-adv-13.txt 14400 19    0 &
ping-scanner-mask-multi prefixes-adv-13.txt 14400 19    2 &
ping-scanner-mask-multi prefixes-adv-13.txt 14400 19    5 &
ping-scanner-mask-multi prefixes-adv-13.txt 14400 19    7 &
ping-scanner-mask-multi prefixes-adv-13.txt 14400 19   10 &
ping-scanner-mask-multi prefixes-adv-13.txt 14400 19   12 &
ping-scanner-mask-multi prefixes-adv-13.txt 14400 19   14 &
ping-scanner-mask-multi prefixes-adv-13.txt 14400 19   17 &

This was a 4 hour test, starting at 6:30AM GMT. The average response rate was 12.59%.
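The step and offset values used in the command lines above can be derived mechanically. This is a minimal sketch of the arithmetic as described in the text (integer division for the step, and offsets at multiples of step/instances, rounded to the nearest line); the function names are hypothetical.

```c
#include <assert.h>

/* Step ("span") so that each instance picks about per_instance lines. */
unsigned sample_step(unsigned total_lines, unsigned per_instance)
{
    assert(per_instance > 0);
    return total_lines / per_instance;
}

/* Offset for instance i of n, spreading the n starting lines evenly
   across one step: i * step / n rounded to nearest, halves rounded up. */
unsigned sample_offset(unsigned i, unsigned n, unsigned step)
{
    assert(n > 0 && i < n);
    return (2u * i * step + n) / (2u * n);
}
```

With 287 lines and 15 prefixes per instance the step is 19, and for 8 instances the offsets come out as 0, 2, 5, 7, 10, 12, 14 and 17, the values in the command lines above.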

/15

/16

On 22 March 2007 there were 5162 /16 prefixes. I used a similar scheme to that just described for /13 prefixes, with step = 344.

120 prefixes scanned with 8 instances of the program each scanning 15 prefixes for 4 hours, with each prefix getting about 960 pings.

This was a 4 hour test, starting at 10:40AM GMT. The average response rate was 4.10%.

I also did an earlier, more extensive scan:

In late February there were 5136 /16 prefixes, taking up nearly 20% of the advertised space, with relatively low ping-responsive-host density. I initially used 22 instances, each for 24 hours, each scanning 40 prefixes. This means each prefix got about 2160 pings. This would have been 880 prefixes, which was more than I needed for a graph, and the memory requirements were beyond physical RAM. So I killed some processes and wound up with 644, which is still rather a lot. The command lines which contributed to the final result were:

ping-scanner-mask-multi prefixes-adv-16.txt 43200 128   0 &
ping-scanner-mask-multi prefixes-adv-16.txt 43200 128 10 &
ping-scanner-mask-multi prefixes-adv-16.txt 43200 128 20 &
ping-scanner-mask-multi prefixes-adv-16.txt 43200 128 30 &
ping-scanner-mask-multi prefixes-adv-16.txt 43200 128 40 &
ping-scanner-mask-multi prefixes-adv-16.txt 43200 128 50 &
ping-scanner-mask-multi prefixes-adv-16.txt 43200 128 60 &
ping-scanner-mask-multi prefixes-adv-16.txt 43200 128 70 &
ping-scanner-mask-multi prefixes-adv-16.txt 43200 128 80 &
ping-scanner-mask-multi prefixes-adv-16.txt 43200 128 90 &
ping-scanner-mask-multi prefixes-adv-16.txt 43200 128 100 &
ping-scanner-mask-multi prefixes-adv-16.txt 43200 128   5 &
ping-scanner-mask-multi prefixes-adv-16.txt 43200 128 15 &
ping-scanner-mask-multi prefixes-adv-16.txt 43200 128 25 &
ping-scanner-mask-multi prefixes-adv-16.txt 43200 128 35 &
ping-scanner-mask-multi prefixes-adv-16.txt 43200 128 45 &

Of the 644 results, one stood out as anomalous: 1086 responses from 1088 pings sent to a particular /16: 158.99.0.0. Manually probing this I found every address I tried returned pings, from 158.99.0.0 to 158.99.255.255. The range was split evenly in two in terms of the last router: 130.206.220.67 for the bottom half and 130.206.220.66 for the top. The second last router was the same, with no reverse mapped name, and the third last was: j-1-1.car2.Madrid1.Level3.net. I figure no-one has a host on every IP address, so this must be some router-based ping response which has nothing to do with actual utilisation of the address space. I deleted this line from the results to create: slash-16-643-sorted-percentages.txt and slash-16-643-sorted-percentages.txt.

The average response rate was 4.045%.

/17

/18

/19

/20

/21

/23

/24

On 23 March 2007 there were 14939 /24 prefixes. I ran 4 instances pinging 30 prefixes each (span 497) for 3 hours, starting at 9:23PM GMT - a time when most business premises in Europe and America would be unattended. Each of the 120 prefixes (256 IP addresses in each prefix) got about 360 pings. The average response rate was 3.165%. As can be seen in the graph, this is noticeably lower than the graph which results from the next set of data, which had an average of 4.84%.

The next set was a 24 hour survey, on a weekday in all timezones.

On 22 February 2007 I used some shell scripts to run 282 instances of ping-scanner-slash24-slow, each performing a complete ping scan of a single /24 prefix. I generated this list of 282 /24 prefixes by a manual approach of taking the full list, chopping out two screenfuls, keeping one line, chopping out another two screenfuls etc. Each instance pinged every one of the 256 IP addresses once, in a way which was randomised (XORed by the bits 8 to 15 of the prefix's address). The idea was to avoid systematic biases due to computers being off at certain times of the day.
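The XOR ordering can be illustrated as follows. This is a sketch under assumptions: it takes "bits 8 to 15" to mean the third octet of the prefix address (counting bit 0 as the least significant), and the function name xor_scan_host is hypothetical. The actual program may differ in detail.

```c
#include <assert.h>
#include <stdint.h>

/* Host byte to ping at step i (0..255) of a /24 scan.
   XOR with a fixed key is a permutation of 0..255, so every address
   is visited exactly once, in an order that differs between prefixes
   with different third octets. */
uint8_t xor_scan_host(uint32_t prefix_addr, unsigned i)
{
    assert(i < 256);
    uint8_t key = (uint8_t)(prefix_addr >> 8);   /* bits 8 to 15 */
    return (uint8_t)(i ^ key);
}
```

This scrambles the starting point and visiting order per prefix rather than producing a truly random sequence, which is enough to stop all 282 instances from pinging the same host numbers at the same time of day.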

95 of these 282 prefixes (33.7%) did not acknowledge a single ping. One of them, 207.171.235.0/24, acknowledged 247 of the 256 pings. I performed a complete reverse DNS lookup of every IP address in every one of these 282 /24 prefixes. I wanted to try to discern whether some of the non-responders were probably being used for traffic, and had routers or hosts which were configured not to respond to pings. The results of that can't easily be quantified. There is a zip file with the results here: ../#reverse-dns-zip .

The 24 hour survey's rate was 4.69% - or 4.35% if the highest responding prefix is ignored.
