The Transition and Deployment of IPv6 in Australia and China

By Vannak Lach

Internet Protocol version 6 (IPv6) is the long-term answer to the long-anticipated exhaustion of Internet Protocol version 4 (IPv4) addresses, and is intended to replace IPv4 eventually. IP is the central protocol of the Internet. Each device connected to the Internet must have a unique IP address to communicate, and routers in the network use the destination IP address in each packet header to forward the packet towards its receiver.

With the rapid growth in the number of devices connecting to the network, the IPv4 address space has nearly run out. Like many other countries, both Australia and China rely heavily on IPv4 and have made little progress towards IPv6 deployment. If both countries are to participate fully in, and benefit from, the digital economy in the future, they need to prepare for the transition to IPv6. Otherwise, they will face a huge challenge in connecting devices across different platforms.

Getting on board with IPv6 will not only prevent countries like Australia and China from being at a competitive disadvantage, but also save operational costs in the near future. As APNIC’s Director General Paul Wilson was quoted as saying in iTnews, “Without IPv6, the Australian internet will be less efficient, it will be slower and less reliable, and more expensive – and that would be bad for the country.” China is in the same boat as Australia.

The IPv4 address shortage will become an even bigger issue for both countries as their economies move rapidly toward the fourth industrial revolution, in which manufacturing and services rely heavily on the Internet.

To promote IPv6 deployment and help prepare Australia and China for the transition, APNIC has provided a grant to the School of Engineering and IT at Murdoch University to conduct research on IPv6 readiness and deployment across all industry sectors that use the Internet in both countries. The survey received responses from 198 participants from Australia and 188 participants from China.

The main objective of this research project is to gain insights into the motivations of Australian and Chinese organisations for deploying or not deploying IPv6, to identify whether IPv6 deployment is likely to increase in the future, and to determine the driving and hindering forces behind the deployment of IPv6 in their economies.

A few previous studies on IPv6 readiness are relevant to both countries: a study of the IPv6 readiness of companies in Australia based on a 2011 survey, an RIR community survey on IPv6 deployment (2012-2013), an IPv6 industry survey (2014), and the most recent worldwide IPv6 deployment survey (2016).

However, most of the data from those studies is now several years old, and only a small number of the participants came from Australia and China. While some studies provide a comprehensive analysis of the awareness and urgency of IPv6, none of them provides much insight into companies’ reasons for deploying or not deploying IPv6, or into the main obstacles they face in moving to IPv6 infrastructure.

In this sense, the focus of the Murdoch University study is crucial, particularly for these two countries as they plan for the IPv6 transition. The project focuses on Australia and China because both countries currently have low IPv6 deployment.

The survey results are available online, structured around the following areas:

We invite you to visit the research team’s website to understand how the survey was organised, and what we have learned through the process.

Simulating satellite Internet traffic to a small island Internet provider

A significant number of islands are too remote to make submarine cable Internet connections economical. We’re looking mostly at the Pacific, but islands like these exist elsewhere, too. There are also remote places on some continents, and of course there are cruise ships, which can’t connect to cables while they’re underway. In such cases, the only way to provision Internet commercially at this point is via satellite.

Satellite Internet is expensive and sells by the Mbps per month (megabits per second of capacity per month). So places with few people – up to several tens of thousands perhaps – usually find themselves with connections well below a Gbps (gigabit per second). The time it takes for the bits to make their way to the satellite and back adds latency, and so we have all the ingredients for trouble: a bottleneck prone to congestion, and long round-trip times (RTT) which give TCP senders an outdated picture of the actual level of congestion that their packets are going to encounter.

I have discussed these effects at length in two previous articles published on the APNIC blog, so I won’t repeat them here. Suffice it to say: It’s a problem worth investigating in depth, and we’re doing so with help from grants by ISIF Asia in 2014 and 2016 and Internet NZ. This blog post describes how we’ve built our simulator, which challenges we’ve come up against, and where we’re at some way down our journey.

What are the questions we want to answer?

The list is quite long, but our initial goals are:

  • We’d like to know under which circumstances (link type – GEO or MEO, bandwidth, load, input queue size) various adverse effects such as TCP queue oscillation or standing queues occur.
  • We’d like to know what input queue size represents the best compromise in different circumstances.
  • We’d like to know how much improvement we can expect from devices peripheral to the link, such as performance-enhancing proxies and network coders.
  • We’d like to know how best to parameterise such devices in the scenarios in which one might deploy them.
  • We’d like to know how devices with optimised parameters behave when loads and/or flow size distributions change.

That’s just a start, of course, and before we can answer any of these, the biggest question to solve is: How do you actually build, configure, and operate such a simulator?

Why simulate, and why build a hybrid software/hardware simulator?

We get this question a lot. There are things we simply can’t try on real satellite networks without causing serious inconvenience or cost. Some of the solutions we are looking at require significant changes in network topology. Any island out there keen to go without Internet for a few days while we try something out? So we’d better make sure it can be done in a controlled environment first.

Our first idea was to try to simulate things in software. There is a generation of engineers and networking people who have been brought up on the likes of ns-2, ns-3, mininet etc., and they swear by it. We’re part of that generation, but the moment we tried to simulate a satellite link in ns-2 with more than a couple of dozen Mbps and more than a handful of simultaneous users generating a realistic flow size distribution, we knew that we were in trouble. The experiment was meant to simulate just a few minutes’ worth of traffic, for just one link configuration, and we were looking at days of simulation. No way. This wasn’t scalable.

Also, with a software simulator, you rely on software simulating a complex system with timing, concurrent processes, etc., in an entirely sequential way. How do you know that it gets the chaos of congestion right?

So we opted for farming out as much as we could to hardware. Here, we’re dealing with actual network components, real packets, and real network stacks.

There’s been some debate as to whether we shouldn’t be calling the thing an emulator rather than a simulator. Point taken. It’s really a bit of both. We take a leaf here from airline flight simulators, which also leverage a lot of hardware.

The tangible assets

Our island-based clients at present are 84 Raspberry Pis, complemented by 10 Intel NUCs. Three Supermicro servers simulate the satellite link and terminal equipment (such as PEPs or network coding encoders and decoders), and another 14 Supermicros of varying vintage act as the servers of the world that provide the data which the clients on the island want.

The whole thing is tied together by a number of switches, and all servers have external Internet access, so we can remotely access them to control experiments without having to load the actual experimental channel. The image in Figure 1 below shows the topology – the “island” is to the right, the satellite “link” in the middle, and the “world” servers on the left.

Figure 1: The topology of our simulator. 84 Raspberry Pis and 10 Intel NUCs represent the island clients on the (blue) island network. Three Super Micro servers emulate the satellite link and run the core infrastructure either side (light blue network). A further 14 Super Micros represent the servers of the world that send data to the island (red network). All servers are accessible via our external network (green), so command and control don't interfere with experiments.



Simulating traffic: The need for realistic traffic data

High latency is a core difference between satellite networks and, say, LANs or MANs. As I’ve explained in a previous blog, this divides TCP flows (the packets of a TCP connection going in one direction) into two distinct categories: Flows which are long enough to become subject to TCP congestion control, and those that are so short that their last data packet has left the sender by the time the first ACK for data arrives.

In networks where RTT is no more than a millisecond or two, most flows fall into the former category. In a satellite network, most flows don’t experience congestion control – but contribute very little data. Most of the data on satellite networks lives in flows whose congestion window changes in response to ACKs received.

So we were lucky to have a bit of netflow data courtesy of a cooperating Pacific ISP. From this, we’ve been able to extract a flow size distribution to assist us in traffic generation. To give you a bit of an idea as to how long the tail of the distribution is: We’re looking at a median flow size of under 500 bytes, a mean flow size of around 50 kB, and a maximum flow size of around 1 GB.

A quick reminder for those who don’t like statistics: The median is what you get if you sort all flows by size and take the flow size half-way down the list. The mean is what you get by adding all flow sizes and dividing by the number of flows. A distribution with a long tail has a mean that’s miles from the median. Put simply: Most flows are small but most of the bytes sit in large flows.
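To make the difference between median and mean concrete, here is a minimal Python sketch. It uses a synthetic Pareto-shaped sample, which is purely an assumption for illustration and not the ISP’s netflow data; the function name `summarise` and the shape parameter are likewise made up for this example.

```python
import random
import statistics


def summarise(flow_sizes):
    """Return (median, mean, share of all bytes carried by the largest 1% of flows)."""
    ordered = sorted(flow_sizes)
    # The largest 1% of flows (at least one flow).
    top = ordered[-max(1, len(ordered) // 100):]
    return statistics.median(ordered), statistics.fmean(ordered), sum(top) / sum(ordered)


if __name__ == "__main__":
    # Synthetic long-tailed sample, NOT the real distribution.
    rng = random.Random(42)
    flows = [int(200 * rng.paretovariate(1.1)) for _ in range(100_000)]
    median, mean, top_share = summarise(flows)
    print(f"median = {median:.0f} B, mean = {mean:.0f} B")
    print(f"largest 1% of flows carry {top_share:.0%} of all bytes")
```

On any long-tailed input, the mean lands far above the median and a small fraction of the flows accounts for a large share of the bytes.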

Simulating traffic: Supplying a controllable load level

Another assumption we make is this: By and large, consumer Internet users are reasonably predictable creatures, especially if they come as a crowd. As a rule of thumb, if we increase the number of users by a factor of X, then we can reasonably expect that the number of flows of a particular size will also roughly increase by X. So if the flows we sampled were created by, say, 500 users, we can approximate the behaviour of 1000 users simply by creating twice as many flows from the same distribution. This gives us a kind of “load control knob” for our simulator.

But how are we creating the traffic? This is where our own purpose-built software comes in. Because we have only 84 Pis and 10 NUCs, but want to be able to simulate thousands of parallel flows, each physical “island client” has to play the role of a number of real clients. Our client software does this by creating a configurable number of “channels”, say 10 or 30 on each physical client machine.

Each channel creates a client socket, randomly selects one of our “world” servers to connect to, opens a connection and receives a certain number of bytes, which the server determines by random pick from our flow size distribution. The server then disconnects, and the client channel creates a new socket, selects another server, etc. Selecting the number of physical machines and client channels to use thus gives us an incremental way of ramping up load on the “link” while still having realistic conditions.
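The channel loop described above can be sketched roughly as follows. This is not our actual client or server software: it is a minimal loopback illustration in Python, and the names `FLOW_SIZES`, `serve_flows`, `run_channel` and `demo`, as well as the tiny stand-in distribution, are assumptions made for the example.

```python
import random
import socket
import threading

# Hypothetical stand-in for the empirical flow size distribution (bytes).
FLOW_SIZES = [200, 400, 500, 1_500, 20_000, 300_000]


def serve_flows(listener, rng):
    """'World' server: send a randomly picked number of bytes, then disconnect."""
    while True:
        conn, _addr = listener.accept()
        with conn:
            conn.sendall(b"x" * rng.choice(FLOW_SIZES))


def run_channel(server_addrs, n_flows, rng):
    """One client channel: pick a server, read until it disconnects, repeat."""
    received = []
    for _ in range(n_flows):
        with socket.create_connection(rng.choice(server_addrs)) as sock:
            total = 0
            while chunk := sock.recv(65536):
                total += len(chunk)
        received.append(total)
    return received


def demo(n_servers=2, n_flows=5, seed=1):
    """Start a few servers on loopback and run a single channel against them."""
    addrs = []
    for i in range(n_servers):
        listener = socket.create_server(("127.0.0.1", 0))  # port 0: any free port
        threading.Thread(
            target=serve_flows, args=(listener, random.Random(seed + i)), daemon=True
        ).start()
        addrs.append(listener.getsockname())
    return run_channel(addrs, n_flows, random.Random(seed))


if __name__ == "__main__":
    print(demo())
```

Running several such channels in parallel on each physical client, and varying how many of them run, is what provides the incremental load control.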

Simulating traffic: Methodology challenges

There are a couple of tricky spots to navigate, though: Firstly, netflow reports a significant number of flows that consist of only a single packet, with or without payload data. These could be rare ACKs flowing back from a slow connection in the opposite direction, or be SYN packets probing, or…

However, our client channels create a minimum amount of traffic per flow through their connection handshake. This amount exceeds the flow size of these tiny flows. So we approximate the existence of these flows by pro-rating them in the distribution, i.e., each client channel connection accounts for several of these small single-packet flows.

Secondly, the long tail of the distribution means that as we sample from it, our initial few samples are very likely to have an average size that is closer to the median than to the mean. In order to obtain a comparable mean, we need to run our experiments for long enough so that our large flows have a realistic chance to occur. This is a problem in particular with experiments using low bandwidths, high latencies (GEO sats), and a low number of client channels.

For example, a ten-minute experiment simulating a 16 Mbps GEO link with 20 client channels will typically generate a total of only about 14,000 flows. The main reason for this is the time it takes to establish a connection via a GEO RTT of over 500 ms. Our distribution contains well over 100,000 flows, with only a handful of really giant flows. So results at this end are naturally a bit noisy, depending on whether, and which, giant flows in the hundreds of MB get picked by our servers. This forces us to run rather lengthy experiments at this end of the scale.
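A back-of-the-envelope calculation shows why such results are noisy. The sketch below assumes that the servers pick flows uniformly at random with replacement, that the distribution holds exactly 100,000 flows, and that “a handful” means five giant flows; all three are simplifying assumptions, and `p_sampled` is a name made up for this example.

```python
def p_sampled(n_draws, population, n_special=1):
    """Probability that at least one of `n_special` flows is picked
    in `n_draws` uniform random picks (with replacement) from `population` flows."""
    return 1.0 - (1.0 - n_special / population) ** n_draws


if __name__ == "__main__":
    # Chance that one particular giant flow shows up in a 14,000-flow experiment.
    print(f"{p_sampled(14_000, 100_000):.1%}")
    # Chance that at least one of five giant flows shows up.
    print(f"{p_sampled(14_000, 100_000, n_special=5):.1%}")
```

Under these assumptions, any one giant flow has roughly a 13% chance of appearing in a given run, and the chance of seeing at least one of five is close to a coin toss, so two otherwise identical runs can carry very different byte volumes.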

Simulating the satellite link itself

For our purposes, simulating a satellite link mainly means simulating the bandwidth bottleneck and the latency associated with it. More complex scenarios may include packet losses from noise or fading on the link, or issues related to the link layer protocol. We’re dedicating an entire server to the simulation (server K in the centre of the topology diagram), so we have enough computing capacity to handle every case of interest. The rest is software, and here the choice is chiefly between a network simulator (such as sns-3) and something relatively simple like the Linux tc utility.

The latter lets us simulate bandwidth constraints, delay, sporadic packet loss and jitter: enough for the moment. That said, it’s a complex beast, which exists in multiple versions and – as we found out – is quite quirky and not very extensively documented.

Following examples given by various online sources, we configured a tc netem qdisc to represent the delay, which we in turn chained to a token bucket filter. The online sources also suggested quality control: ping across the simulated link to ensure the delay is in place, then run iperf in UDP mode to see that the token bucket filter is working correctly. Sure enough, the copy-and-paste example passed these two tests with flying colours. It’s just that we then got rather strange results once we ran TCP across the link. So we decided to ping while we were running iperf. Big surprise: Some of the ping RTTs were in the hundreds of seconds – far longer than any buffer involved could explain. Moreover, no matter which configuration parameter we tweaked, the effect wouldn’t go away. So, a bug it seems. We finally found a workaround involving ingress redirection to an intermediate function block device, which passes all tests and produces sensible results for TCP. Just goes to show how important quality control is!

Simulating world latency

We also use a similar technique to add a variety of fixed ingress and egress delays to the “world” servers. This models the fact that TCP connections in real life don’t end at the off-island sat gate, but at a server that’s potentially a continent or two down the road and therefore another few dozen or even hundreds of milliseconds away.

Link periphery and data collection

We already know that we’ll want to try PEPs, network coders etc., so we have another server each on both the “island” (server L) and the “world” (server J) side of the server (K) that takes care of the “satellite link” itself. Where applicable, these servers host the PEPs and / or network coding encoders / decoders. Otherwise, these servers simply act as routers. In all cases, these two servers also function as our observation points.

At each of the two observation points, we run tcpdump on eth0 to capture all packets entering and leaving the link at either end. These get logged into pcap capture files on L and J.

An alternative to data capture here would be to capture and log on the clients and / or “world” servers. However, capture files are large and we expect lots of them, and the SD cards on the Raspberry Pis really aren’t a suitable storage medium for this sort of thing. Besides that, we’d like to let the Pis and servers get on with the job of generating and sinking traffic rather than writing large log files. Plus, we’d have to orchestrate the retrieval of logs from 108 machines with separate clocks, meaning we’d have trouble detecting effects such as link underutilisation.

So servers L and J are really without a lot of serious competition as observation points. After each experiment, we use tshark to translate the pcap files into text files, which we then copy to our storage server (bottom).

For some experiments, we also use other tools such as iperf (so we can monitor the performance of a well-defined individual download) or ping (to get a handle on RTT and queue sojourn times). We run these between the NUCs and some of the more powerful “world” servers.

A basic experiment sequence

Each experiment basically follows the same sequence, which we execute via our core script:

  1. Configure the “sat link” with the right bandwidth, latency, queue capacity etc.
  2. Configure and start any network coded tunnel or PEP link we wish to use between servers L and J.
  3. Start the tcpdump capture at the island end (server L) of the link.
  4. Start the tcpdump capture at the world end (server J) of the link with a little delay. This ensures that we capture every packet heading from the world to the island side.
  5. Start the iperf server on one of the NUCs. Note that in iperf, the client sends data to the server rather than downloading it.
  6. Start the world servers.
  7. Ping the special purpose client from the special purpose server. This functions as a kind of “referee’s start whistle” for the experiment as it creates a unique packet record in both tcpdump captures, allowing us to synchronise them later.
  8. Start the island clients as simultaneously as possible.
  9. Start the iperf client.
  10. Start pinging – typically, we ping 10 times per second.
  11. Wait for the core experiment duration to expire. The clients terminate themselves.
  12. Ping the special purpose client from the special purpose server again (“stop whistle”).
  13. Terminate pinging (usually, we ping only for part of the experiment period, though)
  14. Terminate the iperf client.
  15. Terminate the iperf server.
  16. Terminate the world servers.
  17. Convert the pcap files on J and L into text log files with tshark
  18. Retrieve text log files, iperf log and ping log to the storage server.
  19. Start the analysis on the storage server.

Between most steps, there is a wait period to allow the previous step to complete. For a low load 8 Mbps GEO link, the core experiment time needs to be 10 minutes to yield a half-way representative sample from the flow size distribution. The upshot is that the pcap log files are small, so they need less time for conversion and transfer to storage. For higher bandwidths and more client channels, we can get away with shorter core experiment durations. However, as they produce larger pcap files, conversion and transfer take longer. Altogether, we budget around 20 minutes for a basic experiment run.
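The “whistle” pings in steps 7 and 12 give each capture a common reference. A minimal Python sketch of one way to use them: trim a capture to the window between the two whistles and rebase its timestamps to the start whistle. The record layout (a timestamp plus some packet description) and the name `trim_to_whistles` are assumptions for this example, not a description of our analysis script.

```python
def trim_to_whistles(records, is_whistle):
    """Keep only records between the first and last whistle packet,
    with timestamps rebased so the start whistle is at t = 0.

    records: list of (timestamp_s, packet) tuples in capture order.
    is_whistle: function returning True for the special-purpose ping packets.
    """
    marks = [t for t, pkt in records if is_whistle(pkt)]
    if len(marks) < 2:
        raise ValueError("need both a start and a stop whistle")
    start, stop = marks[0], marks[-1]
    return [(t - start, pkt) for t, pkt in records if start <= t <= stop]


if __name__ == "__main__":
    capture = [(10.0, "noise"), (10.5, "whistle"), (11.0, "data"),
               (12.5, "whistle"), (13.0, "late")]
    print(trim_to_whistles(capture, lambda pkt: pkt == "whistle"))
```

Note that rebasing each capture to its own whistle also removes the one-way link delay between J and L, which may or may not be what a particular analysis wants.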

Tying it all together

We now have more than 100 machines in the simulator. Even in our basic experiment sequence, we tend to use most if not all of them. This means we need to be able to issue commands to individual machines or groups of machines in an efficient manner, and we need to be able to script this.

Enter the pssh utility. This useful little program lets our scripts establish a number of SSH connections to multiple machines simultaneously, e.g., to start our servers or clients, or to distribute configuration information. It’s not without its pitfalls though: For one, the present version has a hardwired limit of 32 simultaneous connections that isn’t properly documented in the man page. If one requests more than 32 connections, pssh quietly runs the first 32 immediately and then delays the next 32 by 60 seconds, the next 32 by 120 seconds, etc.

We wouldn’t have noticed this had we not added a feature to our analysis script that checks whether all clients and servers involved in the experiment are being seen throughout the whole core experiment period. Originally, we’d intended this feature to pick up machines that had crashed or had failed to start. Instead, it alerted us to the fact that quite a few of our machines were late starters, always by exactly a minute or two.

We now have a script that we pass the number of client channels required. It computes how to distribute the load across the Pi and NUC clients, creates subsets of up to 32 machines to pass to pssh, and invokes the right number of pssh instances with the right client parameters. This lets us start up all client machines within typically less than a second. The whole episode condemned a week’s worth of data to /dev/null, but shows again just how important quality assurance is.
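The planning part of such a script might look roughly like the Python sketch below. It assumes an even split of channels across all machines (the real script may well weight NUCs and Pis differently), uses made-up host names, and stops short of actually invoking pssh; `plan_clients` is a hypothetical name.

```python
def plan_clients(hosts, total_channels, max_group=32):
    """Spread channels as evenly as possible over the hosts, then split the
    hosts that have work into groups of at most `max_group` machines."""
    base, extra = divmod(total_channels, len(hosts))
    per_host = {h: base + (1 if i < extra else 0) for i, h in enumerate(hosts)}
    active = [h for h in hosts if per_host[h] > 0]
    groups = [active[i:i + max_group] for i in range(0, len(active), max_group)]
    return per_host, groups


if __name__ == "__main__":
    # Hypothetical host names for 84 Pis and 10 NUCs.
    hosts = [f"pi{i:02d}" for i in range(84)] + [f"nuc{i:02d}" for i in range(10)]
    per_host, groups = plan_clients(hosts, total_channels=200)
    for group in groups:
        # Each group would be handed to its own pssh instance.
        print(len(group), "machines, first:", group[0])
```

Keeping every group at or below 32 machines is what avoids the silent 60-second staggering described above.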

Automating the complex processes is vital, so we keep adding scripts to the simulator as we go to assist us in various aspects of analysis and quality assurance.

Observations – and how we use them

Our basic experiment collects four pieces of information:

  1. A log file with information on the packets that enter the link from the “world” side at J (or the PEP or network encoder as the case may be). This file includes a time stamp for each packet, the source and destination addresses and ports, and the sizes of IP packets, the TCP packets they carry, and the size of the payload they contain, plus sequence and ACK numbers as well as the status of the TCP flags in the packet.
  2. A similar log file with information on the packets that emerge at the other end of the link from L and head to the “island” clients.
  3. An iperf log, showing average data rates achieved for the iperf transfer.
  4. A ping log, showing the sequence numbers and RTT values for the ping packets sent.

The first two files allow us to determine the total number of packets, bytes and TCP payload bytes that arrived at and left the link. This gives us throughput, goodput, and TCP byte loss, as well as a wealth of performance information for the clients and servers. For example, we can compute the number of flows achieved and the average number of parallel flows, or the throughput, goodput and byte loss for each client.
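As an illustration of how rates can be derived from such a log, here is a minimal Python sketch that bins packets into fixed intervals. It assumes each record has already been reduced to a timestamp, an IP packet size and a TCP payload size, and it treats goodput simply as TCP payload bytes per unit of time; both are assumptions for this example, as is the name `rates_per_interval`.

```python
from collections import defaultdict


def rates_per_interval(records, interval_s=0.1):
    """Return {bin_index: (throughput_bps, goodput_bps)}.

    records: iterable of (timestamp_s, ip_bytes, tcp_payload_bytes).
    Throughput counts whole IP packets, goodput counts TCP payload only.
    """
    ip_total = defaultdict(int)
    payload_total = defaultdict(int)
    for t, ip_len, payload_len in records:
        b = int(t // interval_s)
        ip_total[b] += ip_len
        payload_total[b] += payload_len
    return {
        b: (ip_total[b] * 8 / interval_s, payload_total[b] * 8 / interval_s)
        for b in sorted(ip_total)
    }


if __name__ == "__main__":
    log = [(0.01, 1500, 1448), (0.05, 1500, 1448), (0.15, 52, 0)]
    for b, (tput, gput) in rates_per_interval(log).items():
        print(f"bin {b}: throughput {tput:.0f} bps, goodput {gput:.0f} bps")
```

Intervals in which no packet was seen simply do not appear in the result; a plotting script would need to fill them in as zero.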

Figure 2: Throughput and goodput on a simulated 16 Mbps satellite link carrying TCP for 20 client sockets with an input queue of 100kB on the satellite uplink. Note clear evidence of link underutilisation - yet the link is already impaired.


Figure 2 above shows throughput (blue) and goodput (red) in relation to link capacity, taken at 100 ms intervals. The link capacity is the brown horizontal line – 16 Mbps in this case.

Any bit of blue that doesn’t reach the brown line represents idle link capacity – evidence of an empty queue some time during the 100 ms in question. So you’d think there’d be no problem fitting a little bit of download in, right? Well, that’s exactly what we’re doing at the beginning of the experiment, and you can indeed see that there’s quite a bit less spare capacity – but still room for improvement.

Don’t believe me? Well, the iperf log gives us an idea as to how a single long download fares in terms of throughput. Remember that our clients and servers aim at creating a flow mix but don’t aim to complete a standardised long download. So iperf is the more appropriate tool here. In this example, our 40 MB download takes over 120 s with an average rate of 2.6 Mbps. If we run the same experiment with 10 client channels instead of 20, iperf might take only a third of the time (41 s) to complete the download. That is basically the time it takes if the download has the link to itself. So adding the extra 10 client channel load clearly has a significant impact.

At 50 client channels, iperf takes 186 seconds, although this figure can vary considerably depending on which other randomly selected flows run in parallel. At 100 client channels, the download sometimes won’t even complete – if it does, it’s usually above the 400 second mark and there’s very little spare capacity left (Figure 3).


Figure 3: At 100 client channels, the download does not complete but there is still a little spare capacity left.



You might ask why the iperf download is so visible in Figure 2 compared to the traffic contributed by our hundreds of client channels. The answer lies once again in the extreme nature of our flow size distribution and the fact that at any time, a lot of the channels are in connection establishment mode: The 20 client channel experiment above averages only just under 18 parallel flows, and almost all of the 14,000 flows this experiment generates are less than 40 MB: In fact, 99.989% of the flows in our distribution are smaller than our 40 MB download. As we add more load, the iperf download gets more “competition” and also contributes at a lower goodput rate.

The ping log, finally, gives us a pretty good estimate of queue sojourn time. We know the residual RTT from our configuration but can also measure it by pinging after step 2 in the basic experiment sequence. Any additional RTT during the experiment reflects the extra time that the ICMP ping packets spend being queued behind larger data packets waiting for transmission.

One nice feature here is that our queue at server K practically never fills completely: To do so, the last byte of the last packet to be accepted into the queue would have to occupy the last byte of queue capacity. However, with data packets being around 1500 bytes, the more common scenario is that the queue starts rejecting data packets once it has less than 1500 bytes capacity left. There’s generally still enough capacity for the short ping packets to slip in like a mouse into a crowded bus, though. It’s one of the reasons why standard size pings aren’t a good way of detecting whether your link is suffering from packet loss, but for our purposes – measuring sojourn time – it comes in really handy.

Figure 4 shows the ping RTTs for the first 120 seconds of the 100 client channel experiment above. Notice how the maximum RTT tops out at just below 610 ms? That’s 50 ms above the residual RTT of 560 ms (500 ms satellite RTT and 60 ms terrestrial RTT, with the +/-5% terrestrial jitter that we’ve configured here). No surprises here: That’s exactly the time it takes to transmit the 800 kbits of capacity that the queue provides. In other words: The pings at the top of the peaks in the plot hit a queue that was, for the purposes of data transfer, overflowing.
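The 50 ms figure follows directly from the queue size and the link rate, as this small Python sketch shows. It takes the 100 kB queue to mean 100,000 bytes (i.e., 800 kbits, as stated above); the function names are made up for this example.

```python
def queue_drain_time_ms(queue_bytes, link_bps):
    """Time to transmit a completely full queue at the link rate, in milliseconds."""
    return queue_bytes * 8 / link_bps * 1000


def sojourn_ms(rtt_ms, residual_rtt_ms):
    """Estimated time a ping spent queued: measured RTT minus residual RTT."""
    return max(0.0, rtt_ms - residual_rtt_ms)


if __name__ == "__main__":
    print(queue_drain_time_ms(100_000, 16_000_000))  # full 100 kB queue at 16 Mbps
    print(sojourn_ms(610, 560))                      # peak RTT vs residual RTT
```

Both calculations give 50 ms, which is why an RTT at the top of the peaks indicates a queue that is effectively full.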

The RTT here manages to hit its minimum quite frequently, and this shows in throughput of just under 14 Mbps, 2 Mbps below link capacity.

Figure 4: Ping RTTs during the first 120 seconds.


Note also that where the queue hits capacity, it generally drains again within a very short time frame. This is queue oscillation. Note also that we ping only once every 100 ms, so we may be missing shorter queue drain or overflow events here because they are too short in duration – and going by the throughput, we know that we have plenty of drain events.

This plot also illustrates one of the drawbacks of having a queue: for example, between 35 and 65 seconds, there are multiple occasions when the RTT doesn’t return to residual for a couple of seconds. This is called a “standing queue” – the phenomenon commonly associated with buffer bloat. At times, the standing queue doesn’t contribute to actual buffering for a couple of seconds but simply adds 20 ms or so of delay. This is undesirable, not just for real-time traffic using the same queue, but also for TCP trying to get a handle on the RTT. Here, it’s not dramatic, but if we add queue capacity, we can provoke an almost continuous standing queue: the more buffer we provide, the longer it will get under load.

Should we be losing packet loss altogether?

There’s one famous observable that’s often talked about but surprisingly difficult to measure here: packet loss. How come, you may ask, given that we have lists of packets from before and after the satellite link?

Essentially, the problem boils down to the question of what we count as a packet, segment or datagram at different stages of the path.

Here’s the gory detail: The maximum size of a TCP packet can in theory be anything that will fit inside a single IP packet. The size of the IP packet in turn has to fit into the Ethernet (or other physical layer) frame and has to be able to be processed along the path.

In our simulator, and in most real connected networks, we have two incompatible goals: Large frames and packets are desirable because they lower overhead. On the other hand, if noise or interference hits on the satellite link, large frames present a higher risk of data loss: Firstly, at a given bit error rate, large packets are more likely to cop bit errors than small ones. Secondly, we lose more data if we have to discard a large packet after a bit error than if we have to discard a small packet only.

Then again, most of our servers sit on Gbps Ethernet or similar, where the network interfaces have the option of using jumbo frames. The jumbo frame size of up to 9000 bytes represents a compromise deemed reasonable for this medium. However, these may not be ideal for a satellite link. For example, given a bit error probability of 0.0000001, we can expect to lose 7 in 1000 jumbo frames, or 0.7% of our packet data. If we use 1500 byte frames instead, we’ll only lose just over 1 in 1000 frames, or 0.12% of our packet data. Why is that important? Because packet loss really puts the brakes on TCP, and these numbers really make a difference.
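The loss figures above can be reproduced with a few lines of Python. The sketch assumes independent bit errors and no error correction, so that a frame is lost if any one of its bits is hit; `frame_loss_probability` is a name made up for this example.

```python
def frame_loss_probability(frame_bytes, bit_error_rate):
    """Probability that a frame contains at least one bit error,
    assuming independent bit errors and no error correction."""
    return 1.0 - (1.0 - bit_error_rate) ** (frame_bytes * 8)


if __name__ == "__main__":
    for size in (9000, 1500):
        p = frame_loss_probability(size, 1e-7)
        print(f"{size} byte frames: {p:.2%} lost")
```

This gives roughly 0.7% for 9000 byte jumbo frames and roughly 0.12% for 1500 byte frames, matching the numbers in the text.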

The number of bytes that a link may transfer in a single IP packet is generally known as the maximum transmission unit (MTU). There are several ways to deal with diversity in MTUs along the path: Either we can restrict the size of our TCP segment right from the sender to fit into the smallest MTU along the path, or we can rely on devices along the way to split IP packets with TCP segments into smaller IP packets for us. Modern network interfaces do this on the fly with TCP segmentation offload (TSO) and generic segmentation offload (GSO). Finally, the emergency option when an oversize IP datagram hits a link is to fragment the IP datagram.

In practice, TSO and GSO are so widespread that TCP senders on a Gbps network will generally transmit jumbo frames and have devices further down the path worry about it. This leaves us with a choice in principle: Allow jumbo frames across the “satellite link”, or break them up?

Enter the token bucket filter: If we want to use jumbo frames, we need to make the token bucket large enough to accept them. This has an undesirable side effect: Whenever the bucket has had a chance to fill with tokens, any arriving packets that are ready to consume them get forwarded immediately, regardless of rate (which is why you see small amounts of excess throughput in the plots above). So we’d “buy” jumbo frame capability by considerably relaxing instantaneous rate control for smaller data packets. That’s not what we want, so it seems prudent to stick with the “usual” MTUs of around 1500 bytes and accept fragmentation of large packets.
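To illustrate the side effect, here is a deliberately simplified token bucket in Python. It is a sketch, not the Linux tbf implementation: the class, the 2 MB/s rate and the burst scenario are all our own assumptions, and packets that find too few tokens are simply counted as "not forwarded right now" rather than queued.

```python
class TokenBucket:
    """Toy token bucket: tokens are bytes, refilled at a constant rate."""

    def __init__(self, rate_bytes_per_s: float, capacity_bytes: float):
        self.rate = rate_bytes_per_s
        self.capacity = capacity_bytes
        self.tokens = capacity_bytes  # assume the bucket has had time to fill
        self.last = 0.0

    def try_send(self, now: float, packet_bytes: int) -> bool:
        # Refill for the elapsed time, but never beyond the bucket capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if packet_bytes <= self.tokens:
            self.tokens -= packet_bytes
            return True
        return False


def burst_bytes(capacity_bytes: int, packet_bytes: int = 1500,
                n_packets: int = 20) -> int:
    """Bytes forwarded instantly when a burst arrives at t=0 on a full bucket."""
    bucket = TokenBucket(rate_bytes_per_s=2_000_000,
                         capacity_bytes=capacity_bytes)
    return sum(packet_bytes for _ in range(n_packets)
               if bucket.try_send(0.0, packet_bytes))


print(burst_bytes(1500))   # bucket sized for ordinary frames
print(burst_bytes(9000))   # bucket sized for jumbo frames
```

In this toy model, a bucket big enough for one jumbo frame lets six back-to-back 1500 byte packets through at once, regardless of the configured rate, where the smaller bucket admits only one.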

There’s also the issue of tcpdump not necessarily seeing the true number of packets/fragments involved, because it captures before segmentation offload etc.

The gist of it all: The packets we see going into the link aren’t necessarily the packets that we see coming out at the other end. Unfortunately that happens in a frighteningly large number of cases.

In principle, we could check from TCP sequence numbers & IP fragments whether all parts of each packet going in are represented in the packets going out. However, with 94 clients all connecting to 14 servers with up to 40-or-so parallel channels, doing the sequence number accounting is quite a computing-intensive task. But is it worth it? For example, if I count a small data packet with 200 bytes as lost when it doesn’t come out the far end, then what happens when I have a jumbo frame packet with 8000 bytes that gets fragmented into 7 smaller packets and one of these fragments gets dropped? Do I count the latter as one packet loss, or 1/7th of a packet loss, or what?

The good news: For our purposes, packet loss doesn’t actually help explain much unless we take it as an estimate of byte loss. But byte loss is an observable we can compute very easily here: We simply compare the number of observed TCP payload bytes on either side of the link. Any missing byte must clearly have been in a packet that got lost.
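As a rough sketch of that comparison, assume each capture has already been reduced to a list of (timestamp, TCP payload length) records. That record format and the function names below are hypothetical, not the format of our actual log files.

```python
def total_payload(records):
    """Sum the TCP payload bytes over (timestamp, payload_len) records."""
    return sum(length for _, length in records)


def byte_loss(ingress_records, egress_records) -> float:
    """Fraction of payload bytes that entered the link but never left it."""
    bytes_in = total_payload(ingress_records)
    if bytes_in <= 0:
        raise ValueError("no payload bytes observed going into the link")
    return (bytes_in - total_payload(egress_records)) / bytes_in


# Made-up example: three packets go in, the second never comes out.
ingress = [(0.0, 1000), (0.1, 1000), (0.2, 500)]
egress = [(0.3, 1000), (0.5, 500)]
print(f"byte loss: {byte_loss(ingress, egress):.0%}")
```

The appeal of this approach is that it needs no per-packet matching at all, so fragmentation and segmentation offload stop mattering.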

Quality control

There is a saying in my native Germany: “Wer misst, misst Mist”. Roughly translated, it’s a warning that those who measure blindly tend to produce rubbish results. We’ve already seen a couple of examples of how an “out-of-left field” effect caused us problems. I’ll spare you some of the others but will say that there were just a few!

So what are we doing to ensure we’re producing solid data? Essentially, we rely on four pillars:

  1. Configuration verification and testing. This includes checking that link setups have the bandwidth configured, that servers and clients are capable of handling the load, and that all machines are up and running at the beginning of an experiment.
  2. Automated log file analysis. When we compare the log files from either side of the link, we also compute statistics about when each client and server was first and last seen, and how much traffic went to/from the respective machine. Whenever a machine deviates from the average by more than a small tolerance or a machine doesn’t show up at all, we issue a warning.
  3. Human inspection of results: Are the results feasible? E.g., are throughput and goodput within capacity limits? Do observables change in the expected direction when we change parameters such as load or queue capacity? Plots such as those discussed above also assist us in assessing quality. Do they show what we’d expect, or do they show artefacts? This also includes discussion of our results so there are at least four eyes looking at data.
  4. Scripting: Configuring an experiment requires setting no fewer than seven parameters for the link simulation, fourteen different RTT latencies for the servers, load and timeout configurations for 94 client machines, and an iperf download size, plus the orchestrated execution of everything with the right timing – see above. Configuring all of this manually would be a recipe for disaster, so we script as much as we can – this takes care of a lot of typos!
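The automated check in the second pillar could look roughly like the sketch below. The host names, the 20% tolerance and the function name are illustrative assumptions; our real scripts work on the log files themselves rather than on a ready-made dictionary.

```python
from statistics import mean


def flag_outliers(traffic_bytes, expected_hosts, tolerance=0.2):
    """Return warnings for hosts that are missing or far from the average.

    traffic_bytes: dict mapping host name to bytes seen for that host.
    expected_hosts: set of host names that should appear in the logs.
    tolerance: allowed relative deviation from the mean (assumed value).
    """
    warnings = []
    for host in sorted(set(expected_hosts) - set(traffic_bytes)):
        warnings.append(f"{host}: never seen")
    if traffic_bytes:
        avg = mean(traffic_bytes.values())
        for host, seen in sorted(traffic_bytes.items()):
            if avg > 0 and abs(seen - avg) / avg > tolerance:
                warnings.append(f"{host}: {seen} bytes vs. average {avg:.0f}")
    return warnings


# Made-up data: ten well-behaved clients, one slow one, one missing.
traffic = {f"client{i:02d}": 1000 for i in range(1, 11)}
traffic["client11"] = 400
for warning in flag_outliers(traffic, set(traffic) | {"client12"}):
    print("WARNING", warning)
```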

Also, an underperforming satellite link could simply be a matter of bad link configuration rather than a fundamental problem with TCP congestion control. It would be all too easy to take a particular combination of link capacity and queue capacity to demonstrate an effect without asking what influence these parameters have on the effect. This is why we’re performing sweeps – when it comes to comparing the performance of different technologies, we want to ensure that we are putting our best foot forward.


So what’s the best queue capacity for a given link capacity? You may remember the old formula for sizing router queues, RTT * bandwidth. However, there’s also Guido Appenzeller’s PhD thesis from Stanford, in which he recommends dividing this figure by the square root of the number of long-lived parallel flows.

This presents us with a problem: We can have hundreds of parallel flows in the scenarios we’re looking at. However, how many of those will qualify as long-lived depends to no small extent on the queue capacity at the token bucket filter!

For example, take the 16 Mbps link with 20 client channels we’ve already looked at before. At 16 Mbps (= 2 MB/s) and 500 ms RTT, the old formula suggests 1 MB queue capacity. We fairly consistently see 17-18 parallel flows (not necessarily long-lived ones, though) regardless of queue capacity. Assuming extremely naively that all of these flows might qualify as long-lived (well, we know they’re not), Guido’s results suggest dividing the 1 MB by a factor of around 4, which just so happens to be a little larger than the 100 kB queue we’ve deployed here. But how do we know whether this is the best queue capacity to choose?
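The two sizing rules are easy to put side by side. The sketch below uses the numbers from this example; the function names are ours, and treating all 17 observed flows as long-lived is the same naive assumption as in the text.

```python
from math import sqrt


def bdp_bytes(link_bps: float, rtt_s: float) -> float:
    """Classic rule of thumb: queue capacity = RTT * bandwidth (in bytes)."""
    return link_bps / 8 * rtt_s


def appenzeller_bytes(link_bps: float, rtt_s: float, n_flows: int) -> float:
    """RTT * bandwidth divided by sqrt(number of long-lived flows)."""
    return bdp_bytes(link_bps, rtt_s) / sqrt(n_flows)


link_bps, rtt_s = 16e6, 0.5
print(f"old formula:  {bdp_bytes(link_bps, rtt_s) / 1000:.0f} kB")
print(f"with sqrt(n): {appenzeller_bytes(link_bps, rtt_s, 17) / 1000:.0f} kB")
```

With 17 or 18 flows, the divisor comes out at a little over 4, so the second figure lands at roughly 240 kB.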

A real Internet satellite link generally doesn’t just see a constant load. So how do we know which queue capacity works best under a range of loads?

The only way to get a proper answer is to try feasible combinations of load levels and queue capacities. Which poses the next question: What exactly do we mean by “works best”?

Looking at the iperf download, increasing the queue size at 20 client channels always improves the download time. This would suggest dumping Guido’s insights in favour of the traditional value. Not so fast: Remember those standing queues in Figure 3? At around 20 ms extra delay, they seemed tolerable. Just going to a 200kB queue bumps these up to 80 ms, though, and they’re a lot more common, too. Anyone keen to annoy VoIP users for the sake of a download that could be three times faster? Maybe, maybe not. We’re clearly getting into compromise territory here, but around 100kB-200kB seems to be in the right ballpark.

So how do we get to zero in on a feasible range? Well, in the case of the 16 Mbps link, we looked at (“sweeped”) eleven potential queue capacities between 30 kB and 800 kB. For each capacity, we swept up to nine load levels between 10 and 600 client channels. That’s many dozens of combinations, each of which takes around 20 minutes to simulate, plus whatever time we then take for subsequent manual inspection. Multiply this by the number of possible link bandwidths of interest in GEO and MEO configuration, plus repeats for experiments with quality control issues, and we’ve got our work cut out. It’s only then that we can get to coding and PEPs.
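To get a feel for the size of one such sweep, here is a small sketch. Only the counts, the ranges and the 20 minutes per run come from the text; the individual capacity and load values in the two lists are placeholders we made up for illustration.

```python
from itertools import product

# Placeholder values: eleven capacities between 30 kB and 800 kB and nine
# load levels between 10 and 600 channels. The actual grid points differ.
queue_capacities_kb = [30, 40, 60, 80, 100, 150, 200, 300, 400, 600, 800]
load_levels = [10, 20, 30, 50, 100, 200, 300, 400, 600]
MINUTES_PER_RUN = 20


def sweep_plan(capacities, loads):
    """Every (queue capacity, load level) combination of a full sweep."""
    return list(product(capacities, loads))


plan = sweep_plan(queue_capacities_kb, load_levels)
hours = len(plan) * MINUTES_PER_RUN / 60
print(f"{len(plan)} runs, about {hours:.0f} hours of simulation time")
```

A full grid for a single link bandwidth therefore already takes well over a day of pure simulation time, before any manual inspection or repeats.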

What’s next?

A lot. If the question on your mind starts with “Have you thought of…” or “Have you considered…,” the answer is possibly yes. Here are a few challenges ahead:

  • Network coding (TCP/NC): We’ve already got the encoder and decoder ready, and once the sweeps are complete and we have identified the parameter combinations that represent the best compromises, we’ll collect performance data here. Again, this will probably take a few sweeps of possible generation sizes and overhead settings.
  • Performance-enhancing proxies (PEP): We’ve identified two “free” PEPs, PEPSal and TCPEP, which we want to use both in comparison and – eventually – in combination with network coding.
  • UDP and similar protocols without congestion control. In our netflow samples, UDP traffic accounts for around 12% of bytes. How will TCP behave in the presence of UDP in our various scenarios? How do we best simulate UDP traffic given that we know observed throughput, but can only control offered load? In principle, we could model UDP as a bandwidth constraint, but under heavy TCP load, we get to drop UDP packets as well, so it’s a constraint that’s a little flexible, too. What impact does this have on parameters such as queue capacities, generation sizes etc.?
  • Most real links are asymmetric, i.e., the inbound bandwidth is a lot larger than the outbound bandwidth. So far, we have neglected this as our data suggests that the outbound channels tend to have a comparatively generous share of the total bandwidth.
  • Simulating world latencies. At this point, we’re using a crude set of delays on our 14 “world servers”. We haven’t even added jitter. What if we did? What if we replaced our current crude model of “satgate in Hawaii” with a “satgate in X” model, where the latencies from the satgate in X to the servers would be distributed differently?


As you can see, lots of interesting work ahead!

Seed Alliance at IGF Mexico

The Seed Alliance members, FIRE Africa (AFRINIC), FRIDA Program (LACNIC) and ISIF Asia (APNIC), will be present at the 2016 Internet Governance Forum, which will take place in Guadalajara, Mexico, on 6-9 December.

During the IGF, the Seed Alliance will organize two workshops, one on cybersecurity and one on innovation and entrepreneurship, hold the Seed Alliance Awards Ceremony, and offer an opportunity to interact with grantees and Award Winners at the Seed Alliance booth in Guadalajara’s Palace of Culture and Communication, home of the Internet Governance Forum.

On Tuesday 6 December, the Seed Alliance will hold its first workshop of the week, which will focus on cybersecurity initiatives developed in and by the Global South. The session will be moderated by Carlos Martínez, LACNIC CTO, and will include noted speakers, all of them cybersecurity experts, including ISOC’s Olaf Kolkman. This workshop will explore how developing economies are working to address cybersecurity issues, highlighting successful initiatives in their corresponding regions.

In this sense, it is worth noting that this year the Seed Alliance included a specific category, funded by the Internet Society, which provided financial support to initiatives seeking to improve Internet security in the region: Protecting the TOR Network against Malicious Traffic in Brazil, BGP Security by RENATA (Colombia’s National Advanced Technology Academic Network) and Developing Tonga National CERT.

  • Prepared by Campinas State University (Brazil), the project for Protecting the TOR Network against Malicious Traffic seeks to implement a solution to the growing malicious code traffic operating over this network.
  • BGP Security by RENATA (Colombia’s National Advanced Technology Academic Network) involves implementing origin validation for BGP routes in RENATA’s network backbone.
  • In the case of the Tonga CERT, the project led by the Ministry of Meteorology, Energy, Environment, Climate Change, Information, Communication, Disaster Management (MEIDECC) will work on creating the first national CERT in the Pacific region.

Award Winners 2016

On Tuesday 6 December, the Seed Alliance members will also present the 2016 Awards recognizing eight innovative initiatives and practices that have contributed to the region’s social and economic development. These are:

  1. AgriNeTT by the University of the West Indies (Trinidad and Tobago);
  2. Mexicoleaks (Mexico);
  3. Restoring Connectivity: Movable and Deployable Resource ICT Unit (MDRU) by CVISNET Foundation (The Philippines);
  4. Towards A Fairer Electoral System: 1 Person, 1 Vote, 1 Value by Tindak (Malaysia);
  5. All Girls Tech Camp by Give1ProjectGambia (The Gambia);
  6. DocmeUP (Ghana);
  7. Kids Comp Camp (Kenya); and
  8. Tobetsa and WiFi TV Extension Project (South Africa).

To conclude, on Friday 9 December, FIRE, FRIDA and ISIF Asia will hold a second workshop on entrepreneurship and innovation in the Global South. This workshop will analyze the challenges innovators and entrepreneurs must face in developing countries and attempt to identify opportunities for Internet innovation in the countries of the Global South.

Finally, a Seed Alliance booth will be set up at the IGF Village, where FIRE, FRIDA and ISIF Asia Award winners and cybersecurity grant recipients will be available to share their experiences with Forum participants.

MDRU – restoring connectivity during disasters

CVISNET Foundation is the winner of the ISIF Asia 2016 Community Impact Award. They are eligible to get an additional 1000 AUD for the Community Choice Award 2016, so please vote for CVISNET Foundation and show your support.

CVISNET: Restoring Connectivity through the use of Movable and Deployable Resource ICT Unit (MDRU)

Article prepared by Vannak Lach

The MDRU is a unit that can be quickly deployed to restore communications in communities in the aftermath of a disaster. The unit is self-reliant, running on its own power source, and can also harness other power sources such as power generators or local active power lines. It accommodates communication and information processing functions, can be rapidly transported to the disaster zone, and can be deployed within a reasonably short time to establish the network at the disaster site and launch ICT services. The MDRU is equipped with an array of communications equipment, servers and storage devices, and is designed to bring not only a communications infrastructure but also data center functions to a disaster-stricken area in a very short time.

The MDRU system can be expanded by connecting one MDRU to another, thereby creating an MDRU network. Coverage grows with the number of connected units. The project extended the MDRU to Designated Evacuation Areas using Fixed Wireless Access (FWA). It implements an FWA IPAS (Wireless IP Access System), a broadband wireless point-to-multipoint communication system operating at 26 GHz that provides high-speed IP access at transmission rates of up to 80 Mbps.

The deployment of an MDRU network also helps communities improve their disaster management planning and preparedness.


CVISNET technical staff installing MDRU equipment in a pilot area in the Philippines.

The Municipality of San Remigio, in northern Cebu, was the pilot area of the MDRU project, with the municipal hall designated as the command and control center during disasters. So that local residents can communicate using their smartphones, a Wi-Fi network of access points (APs) will cover the entire municipal hall and its surrounding areas, giving residents a Wi-Fi signal within a radius of approximately 250 meters that they can use during a disaster. The first service to be delivered is voice communication. Because a large part of the population uses smartphones, the MDRU project leverages them to connect as many residents as possible with minimal training, thanks to their familiarity with Android applications.

The pilot site is located in a tropical area that is frequently hit by typhoons and severe weather disturbances. It is also a good location for testing the MDRU equipment in a hot and humid environment, so that the results can be replicated in other areas in the Pacific. Aside from testing the equipment, the project will also gather more information from the experiences and results of the disasters that Japan and the Philippines encountered in 2011 and 2013.

One of the relevant results of the pilot testing is the use of the MDRU equipment during non-disaster periods, or normal times. It was noticed that the MDRU can also be used to serve isolated island communities where there is no voice and data infrastructure. The output of this study is now called a “wireless IP PBX System”.

The MDRU Project is also a great example of inter-regional and multi-stakeholder collaboration. The work on the MDRU units started in Japan as an R&D effort by MIC and NTT after the experience of the 2011 Great East Japan Earthquake. Two years later, in 2013, the Philippines was hit by super typhoon Haiyan, which devastated the entire Central Philippines, so CVISNET negotiated with NTT for the MDRU to be tested in the Philippines, with the help of and through the channels of MIC, ITU and DOST.

As disasters become more and more frequent in the Asia Pacific region, the MDRU offers a scalable solution to restore connectivity during a disaster, as well as an alternative way to expand connectivity for the unconnected.

TINDAK MALAYSIA: Towards A Fairer Electoral System

Tindak Malaysia is the winner of the ISIF Asia 2016 Technical Innovation Award and the Community Choice Award 2016.

TINDAK MALAYSIA: Towards A Fairer Electoral System –
1 Person, 1 Vote, 1 Value

A democracy is reflected in the sovereignty of the people. They are supposed to have the power to choose their leaders under Free and Fair Elections. Unfortunately, those in power will try to manipulate the electoral system to entrench their grip on power. Attempts to manipulate the system could be…

  • in tweaking the rules of elections in their favour,
  • in the control of the mainstream media,
  • through threats,
  • through bribery,
  • through the pollsters to manipulate public perception,
  • during the vote count,
  • by making election campaigns so expensive that only the rich or powerful could afford to run or win,
  • through boundary delineation, either by gerrymandering or through unequal seat size.

The Nov 2016 US Presidential Election brought all of the above into sharp relief. There were two front runners, Donald Trump and Hillary Clinton.

Both candidates were disliked by more than half the electorate.

Both candidates generated such strong aversion that a dominant campaign theme was to vote for the lesser evil. The people were caught in the politics of no choice.
Eventually, the winning candidate won with slightly fewer votes (0.3%) than the losing candidate, each winning only 27% of the electorate. Yet the winner took 306 delegates (57%) while the loser got 232 (43%), a huge difference!

The winning candidate won with barely a quarter of the total voting population. 43% of the voters did not vote. In other words, only 27% of the electorate decided on the President.

Consider Malaysia. We are located in South-east Asia. We have a population of 31 million with about 13.5 million registered voters. We practise a First-Past-The-Post System of elections, meaning the winner takes all, just like in the US.

In the 2013 General Elections, the Ruling Party obtained 47.4% of the votes and 60% of the seats. Meanwhile the opposition, with 52% of the votes, won only 40% of the seats – more votes, but far fewer seats.

We had all the problems listed above except that no opinion polls were allowed on polling day. But the most egregious problem of all was boundary delimitation, which is the subject of our project.

In 2013, the Ruling Party, with 47.4% of the popular vote, secured 60% of the seats. To hang on to power, they resorted to abuse and to changing the laws to suppress the Opposition and the people. Our concern was that continuing oppression of the people in this manner could lead to violent protests. It was our hope to achieve peaceful change in a democratic manner through the Constitution.

From a Problem Tree Analysis, it was found that the problem was cyclic in nature. The root cause was a Fascist Government maintaining power through Fraudulent Elections. See red box opposite.
Problem Tree Analysis



If current conditions prevail without any changes, the Ruling Party can still win power with just 39% of the votes.
50-Year General Elections Voting Trend


What happened?

Malapportionment! In the chart below, the blue lines are the seats won by the Ruling Party: rural seats with small numbers of voters. The red lines, with huge numbers of voters, are the urban seats won by the Opposition. It was found that the Ruling Party could have won 50% of the seats with merely 20.22% of the votes.
Malapportionment in General Elections – GE213



The above computation was based on popular vote. If based on total voting population, Barisan Nasional (BN), the Ruling Party, needed only 17.4% to secure a simple majority.
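The arithmetic behind such figures can be sketched in a few lines of Python. This is an illustration only, not the computation we ran on the election data: it assumes two-candidate contests in every seat, and the seat sizes in the example are invented.

```python
def min_vote_share_for_majority(votes_per_seat):
    """Smallest share of all votes that can still win a majority of seats.

    Assumes first-past-the-post with two candidates per seat: win the
    smallest seats by one vote each and get no votes anywhere else.
    """
    sizes = sorted(votes_per_seat)
    seats_needed = len(sizes) // 2 + 1
    votes_needed = sum(size // 2 + 1 for size in sizes[:seats_needed])
    return votes_needed / sum(sizes)


# Invented example: six small rural seats and five large urban seats.
unequal = [10_000] * 6 + [100_000] * 5
equal = [10_000] * 11
print(f"unequal seats: {min_vote_share_for_majority(unequal):.1%}")
print(f"equal seats:   {min_vote_share_for_majority(equal):.1%}")
```

With equal seats, the minimum winning share in this model can never drop much below a quarter of the votes; the more unequal the seats, the further it falls.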

What is the solution we propose?

The solution was obvious. Equalize the seats.
But for the past 50 years, no one seemed to object to the unfair maps.

Why? The objectors never managed to submit a substantive objection because:

  • A biased EC stacked with Ruling Party cronies, who actively worked to prevent any objections being made,
  • Constitutional rules of delimitation drafted to make objections difficult, such that the EC had a lot of leeway to interpret them any way it wished,
  • Very high barriers to objection,
  • Insufficient information offered during a Redelineation exercise. Given the 1-month deadline, it was impossible for an ordinary voter to prepare a proper objection.

How are Constituencies Drawn – Districting?


We start with a Polling District (PD). The PD is the smallest unit of area in a Constituency. It is defined by a boundary and a name and/or ID code, and includes the elector population. Map 1 is an example of a PD. To avoid clutter, the elector numbers are carried in a separate layer which can be overlaid on top.

Districting is conducted by assembling these PD into Constituencies. In theory, the Constituencies are supposed to have roughly the same number of electors, unless variation is permitted in the Constitution.

What happens when the Election Commission presents a map without any PD, as shown in Map 2 below?


This was gazetted by the EC on 15th Sept 2016 for public objections. No Polling Districts are identified. In reality, the EC had all the information in digital format under an Electoral Geographical Information System (EGIS) but they kept it from the public.

An elector faced with such a map is stuck. He would not know where to begin. Nor would he have the technical knowledge to carry out the redistricting even if he wanted to, all within the time limit of 1 month.

This has been the case for the past 50 years. No one could object effectively.

So we had a situation where electors wanted to object but were unable to do so because of insufficient information and lack of expertise.

Studying the problem, we decided that the solution was to bridge the Digital Divide through Technical Innovation as well as to bring the matter out of the jurisdiction of the EC.


  1. Digitize all the PD in Malaysia, about 8000 of them. This took us 1 year.
  2. Learn how to redistrict using digital systems. We used QGIS, an open source GIS system.
  3. Develop a plug-in to semi-automate and speed up the redistricting process.
  4. Bring in legal expertise. Collaborate with lawyers to bring the matter out of the control of the EC and into the jurisdiction of the courts in order to defend the Constitution.

We started this initiative in July 2011 and by Dec 2015, we had digitised all the PD and redistricted the whole country twice, sharpening our expertise and correcting errors in the process. We got the Bar Council (Lawyers Association) to team up with us to guide the public on how to object when the Redelineation exercise by the EC is launched.

Redelineation, 1st Gazette:

On 15th Sept 2016, the EC published the First Gazette of the Redelineation Proposal. For the State of Selangor with 22 Parliamentary seats, they published one map only (MAP 2). We analysed their proposal and found glaring disparities in the seat sizes, with elector population ranging from 39% to 200% of the State Electoral Quota (EQ) (MAP 3).



At a more detailed level, it looks like MAP 4 below. We can see the densely populated central belt (brown columns) sticking out in sharp contrast to the under-populated outlying regions around the perimeter (ochre areas). Clearly the EC has not addressed the inequalities in the voting strength among the various regions.



Trial Run: We conducted a trial run on the EC maps for a local council in Selangor – MPSJ. See MAP 4. It was found that we could maintain local ties with 6 State and 2 Parliamentary Constituencies, with the elector population kept within +/-20% of the mean. This was much better than the EC’s range of -60% to +100%.
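The kind of check behind these ranges can be sketched as follows. We assume here that the quota is simply the mean number of electors per seat; the seat names and elector counts are invented for illustration and are not taken from our maps.

```python
def quota_deviation(electors_by_seat):
    """Relative deviation of each seat from the mean number of electors."""
    quota = sum(electors_by_seat.values()) / len(electors_by_seat)
    return {seat: n / quota - 1 for seat, n in electors_by_seat.items()}


def within_tolerance(electors_by_seat, tolerance=0.2):
    """True if every seat lies within +/- tolerance of the mean."""
    return all(abs(d) <= tolerance
               for d in quota_deviation(electors_by_seat).values())


# Invented example: a lopsided proposal and a more balanced alternative.
lopsided = {"Seat A": 40_000, "Seat B": 100_000, "Seat C": 160_000}
balanced = {"Seat A": 90_000, "Seat B": 100_000, "Seat C": 110_000}

for name, plan in (("lopsided", lopsided), ("balanced", balanced)):
    worst = max(abs(d) for d in quota_deviation(plan).values())
    print(f"{name}: worst deviation {worst:.0%}, "
          f"within 20%: {within_tolerance(plan)}")
```

A test like this is cheap to rerun after every change to a draft map, which is what makes semi-automated redistricting practical.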



We have submitted objections for the First Gazette and await the call for a public hearing by the EC. Our lawyers are monitoring the EC to ensure they comply with the Constitution and preparing lawsuits in case they don’t.

While conducting our research on how to object, we uncovered yet another area of abuse. The boundaries of the polling districts, and the electors within them, had been shifted to other constituencies unannounced. This was a surreptitious form of redelineation outside the ambit of the constitution and a gross abuse of authority. As part of our next project, we intend to focus on this, to prevent such gerrymandering.

In conclusion, we feel like we are peeling an onion. As we unfold one layer, a new layer of fraud is exposed. It is a never-ending process. But we are determined to keep on digging until we reach the core and achieve our goal of Free and Fair Elections.

IoT solutions to help reduce human-elephant conflict in Sri Lanka

The APNIC blog published yesterday an article written by Asanka Sayakkara, Assistant Lecturer at University of Colombo School of Computing (UCSC), about Internet of Things (IoT) solutions to deal with the problems that emerge from the interaction between humans and elephants.

At ISIF Asia, it is really great to see how one of the organizations that received one of our first grants continues to work on innovative solutions that use Internet technologies to address development problems. Back in 2010, Kasun de Zoysa from UCSC worked on a Virtual IPv6 application test bed.

Asanka’s article as published at the APNIC blog is below and information about Kasun’s work is linked there. Hope you enjoy!

ISIF Secretariat


IoT solutions to help reduce human-elephant conflict in Sri Lanka 


Human-elephant conflict is a very serious and destructive problem in rural Sri Lanka.

Each year, around 70 people are killed by elephants who wander into villages and farms in search of food; and nearly four times as many elephants are killed as a result.  Elephants wandering into farmland also damage crops.

Presenting at the Internet of Things (IoT) tutorial at the recent APNIC 42 conference held in Colombo, Sri Lanka, Dr Kasun de Zoysa from the University of Colombo’s School of Computing shared with attendees examples of how his team, in collaboration with Sweden’s Uppsala University, is employing simple IoT solutions to protect crops and both human and elephant lives.

“Different people have approached this problem in different ways: biologists and animal conservationists are trying their best to protect local habitats, and the government and villagers have built kilometres of electric fencing around their villages and farms,” says Kasun.

“Our approach seeks to complement these efforts by incorporating sensing and data processing technology.”

Such technologies include making electric fences smarter and improving elephant warning systems.

Smarter electric fences

Electric fencing is a common solution used to protect villagers from elephants, particularly on farmland bordering the jungle.

However, Kasun says elephants have learnt how to avoid electric fences and discovered ways to break them, making the practice less reliable.

Once a fence is broken, it takes significant human effort to find the breakage by walking along a fence wire several kilometers long, under the threat of nearby wild elephants.

To overcome this, Kasun’s team have developed a cost-effective electric fence, with small IoT nodes placed along the wire that can communicate with each other using the same wire as the communication medium.

“Their packets are encoded into the high-voltage electric pulses in a way that enables us to identify which node is disconnected from the network,” says Kasun. “When a node is disconnected from the network (part of the fence is broken) we can send alerts to maintenance crews with the exact location of the breakage.”
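The article does not describe the team's detection logic, so the following Python sketch is purely hypothetical. It assumes the nodes sit at known distances along a single wire, that a controller at the start of the wire knows which nodes it can still hear, and that a break cuts off every node beyond it; all names and distances are invented.

```python
def locate_break(nodes, responding):
    """Return the (start_m, end_m) stretch of wire where a break must be.

    nodes: list of (node_id, distance_m) ordered along the wire,
           measured from the controller at 0 m (assumed layout).
    responding: set of node ids the controller can still hear.
    Returns None if every node still responds.
    """
    previous_m = 0.0
    for node_id, distance_m in nodes:
        if node_id not in responding:
            # The break lies between the last node heard and this one.
            return (previous_m, distance_m)
        previous_m = distance_m
    return None


# Invented fence: one node every 500 m, nodes n3 and n4 have gone silent.
fence = [("n1", 500.0), ("n2", 1000.0), ("n3", 1500.0), ("n4", 2000.0)]
print(locate_break(fence, responding={"n1", "n2"}))
```

In this simplified model, the spacing of the nodes determines how precisely a maintenance crew can be directed to the breakage.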

Infrasonic elephant localization system

Kasun says that although this new system will help with alerting villagers to potential elephant intrusions, it is not by itself a sustainable solution to protect people’s lives.

“This is where our second approach comes in,” says Kasun. “We have been testing an infrasonic localization system to locate elephants.”

“Elephants emit infrasonic (low-frequency) sounds, which travel further than audible frequencies. The system we are working on can accurately locate elephants in the area and alert people via various means including SMS alerts and social media.”

Kasun says that both the infrasonic elephant localization system and the smart electric fence are still in experimental stages; however, they plan to launch a pilot program in the coming months to evaluate their effectiveness.

“Success of this pilot deployment will provide us with the valuable information we need to complete this work and produce a cost-effective, open-source product that anybody can build.”

Read more about Kasun’s team’s preliminary work with infrasonic elephant localization systems.


Voiceless faces and missed Cases – Operation ASHA bridges TB and technology

The road to recovery for a Tuberculosis (TB) patient in Cambodia can be long and arduous. For three months, 65-year old Mr. Nou Pov suffered from coughs, fatigue and night sweats, all of which are symptoms of TB, without being able …

Why you should apply for the ISIF Asia Awards: advice from past award winner

By Robert Mitchell, APNIC

With nominations for the ISIF Asia Awards 2016 now open, we thought we’d check back with some of our previous award winners to understand how the award benefitted their projects and get some advice on what …

ISIF Asia 2016 grant recipients announced!

The first CERT in the Pacific, a Peering Strategy for the Pacific, and a mobile app reader to access books in Thailand’s Karen dialects are just some of the initiatives that will receive funding. This year ISIF Asia will award …

Seed Alliance completion report 2012-2015 published

Back in 2011, APNIC and LACNIC were interested to join efforts to strengthen their regional programs for Internet development. Both ISIF Asia and FRIDA had many stories to tell and supported many projects since they were established. Although they operated in …