IPv6 TUN reflector

We had a look at a simple network simulation using TUN a couple of posts ago: https://matthewarcus.wordpress.com/2013/05/18/fun-with-tun/.

Let’s now get it all working with IPv6. Changing the address-swapping code is fairly straightforward. For some extra points we’ll also print the source and destination addresses of each forwarded packet; the correct function for this is now inet_ntop, which works for both v4 and v6 addresses.

Here are the main changes [see https://github.com/matthewarcus/stuff/tree/master/tun for full code]:

It’s convenient to define a 32-bit swap function:

void swap32(uint8_t *p, uint8_t *q)
{
  uint32_t t = get32(p);
  put32(p,get32(q));
  put32(q,t);
}

and our main function is now:

#define SRC_OFFSET4 12
#define DST_OFFSET4 16
#define SRC_OFFSET6 8
#define DST_OFFSET6 24

void reflect(uint8_t *p, size_t nbytes)
{
  uint8_t version = p[0] >> 4;
  switch (version) {
  case 4:
    if (verbosity > 0) {
      char fromaddr[INET_ADDRSTRLEN];
      char toaddr[INET_ADDRSTRLEN];
      inet_ntop(AF_INET, p+SRC_OFFSET4, fromaddr, sizeof(fromaddr));
      inet_ntop(AF_INET, p+DST_OFFSET4, toaddr, sizeof(toaddr));
      printf("%zu: %s->%s\n", nbytes, fromaddr, toaddr);
    }
    // Swap source and dest of an IPv4 packet
    // No checksum recalculation is necessary
    swap32(p+SRC_OFFSET4,p+DST_OFFSET4);
    break;
  case 6:
    if (verbosity > 0) {
      char fromaddr[INET6_ADDRSTRLEN];
      char toaddr[INET6_ADDRSTRLEN];
      inet_ntop(AF_INET6, p+SRC_OFFSET6, fromaddr, sizeof(fromaddr));
      inet_ntop(AF_INET6, p+DST_OFFSET6, toaddr, sizeof(toaddr));
      printf("%zu: %s->%s\n", nbytes, fromaddr, toaddr);
    }
    // Swap source and dest of an IPv6 packet
    // No checksum recalculation is necessary
    for (int i = 0; i < 4; i++) {
      swap32(p+SRC_OFFSET6+4*i,p+DST_OFFSET6+4*i);
    }
    break;
  default:
    fprintf(stderr, "Unknown protocol %u\n", version);
    exit(0);
  }
}
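
As a side note, the two logging branches above differ only in address family and offsets, and neither checks that the packet is long enough to hold a full header. Here is a minimal sketch of a combined formatter; describe_packet is a hypothetical helper (not part of the original program) and the header lengths of 20 and 40 bytes are the minimum fixed IPv4 and IPv6 header sizes:

```cpp
#include <arpa/inet.h>    // inet_ntop
#include <netinet/in.h>   // INET6_ADDRSTRLEN
#include <sys/socket.h>   // AF_INET, AF_INET6
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: returns "src->dst" for an IPv4 or IPv6 packet, or an
// empty string if the version is unknown or the packet is too short.
std::string describe_packet(const uint8_t *p, size_t nbytes)
{
  if (nbytes < 1) return "";
  int family;
  size_t src, dst, hdrlen;
  switch (p[0] >> 4) {
  case 4: family = AF_INET;  src = 12; dst = 16; hdrlen = 20; break;
  case 6: family = AF_INET6; src = 8;  dst = 24; hdrlen = 40; break;
  default: return "";
  }
  if (nbytes < hdrlen) return "";  // not enough bytes for both addresses
  // INET6_ADDRSTRLEN is large enough for an IPv4 string as well
  char from[INET6_ADDRSTRLEN];
  char to[INET6_ADDRSTRLEN];
  if (!inet_ntop(family, p + src, from, sizeof(from))) return "";
  if (!inet_ntop(family, p + dst, to, sizeof(to))) return "";
  return std::string(from) + "->" + to;
}
```

The caller can then print nbytes and the returned string in one place, and treat an empty string as a packet to drop rather than a reason to exit.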

Setting up the v6 addresses for our new interface is a little different. As before, we bring the interface up:

$ ip link set tun0 up

Now, we need to add a link-local address, mandatory for all IPv6 interfaces:

$ ip -6 addr add fe80::1/64 dev tun0

We can use ping6 to try this out, localizing the request to the tun0 interface:

$ ping6 -I tun0 fe80::a617:31ff:fe5a:334f
PING fe80::a617:31ff:fe5a:334f(fe80::a617:31ff:fe5a:334f) from fe80::1 tun0: 56 data bytes
64 bytes from fe80::a617:31ff:fe5a:334f: icmp_seq=1 ttl=64 time=0.164 ms
...

This address is also the link-local address of my Wifi interface, but there is no ambiguity as we must specify which interface to use.

We can also add a private network address. IPv6 does not have the same concept of a private network as IPv4; instead we define Unique Local Addresses: start with the byte 0xfd, follow it with a random 10-digit hex (40-bit) global id, then add an arbitrary 4-digit hex subnet identifier. Any random global id is fine. The idea is to ensure that any given network will have a different id from any other private network it is likely to come in contact with. We don’t need to worry about true global uniqueness, though the Birthday Paradox tells us that we are likely to have a potential conflict somewhere with only about a million private networks (there might be lots of people out there with the same name and birthday as you, but you are unlikely to meet one of them at random).
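
To put a number on the Birthday Paradox claim, here is a small sketch using the usual approximation 1 - exp(-n(n-1)/2N) with N = 2^40 possible global ids. The function name is made up for this example, and it assumes every network picks its id uniformly at random:

```cpp
#include <cmath>

// Approximate probability that at least two of n networks, each choosing a
// 40-bit global id uniformly at random, end up with the same id.
// Birthday approximation: 1 - exp(-n(n-1) / (2N)) with N = 2^40.
double ula_collision_probability(double n)
{
  const double N = 1099511627776.0;  // 2^40
  // expm1 keeps precision when the probability is tiny
  return -std::expm1(-n * (n - 1.0) / (2.0 * N));
}
```

For 2^20 (roughly a million) networks this gives about 0.39, so a clash somewhere among them is quite plausible, while for a thousand networks the result is below one in a million.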

We can generate our own random address, for example, using the method described in RFC 4193, or use /dev/urandom:

$ hexdump -v -e '/1 "%02x"' -n 5 /dev/urandom; echo
2acd2c8bc4

or just copy a random sequence from somewhere on the Internet, for example, the one used here:

$ ip -6 route add fd2a:cd2c:8bc4:0::/64 dev tun0

This adds a local network with a global id of 2acd2c8bc4 and a subnet id of 0.
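
To make the layout concrete (one byte of 0xfd, five bytes of global id, two bytes of subnet id), here is a sketch that assembles the /64 prefix and formats it with inet_ntop. ula_prefix64 is a hypothetical helper, not something the ip tools provide:

```cpp
#include <arpa/inet.h>    // inet_ntop
#include <netinet/in.h>   // in6_addr, INET6_ADDRSTRLEN
#include <sys/socket.h>   // AF_INET6
#include <array>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: build the /64 Unique Local Address prefix for a
// 40-bit global id and a 16-bit subnet id.
std::string ula_prefix64(const std::array<uint8_t, 5> &global_id,
                         uint16_t subnet)
{
  in6_addr addr = {};                      // all 16 bytes zero
  addr.s6_addr[0] = 0xfd;                  // locally assigned ULA prefix byte
  for (size_t i = 0; i < global_id.size(); i++)
    addr.s6_addr[1 + i] = global_id[i];    // bytes 1..5: global id
  addr.s6_addr[6] = static_cast<uint8_t>(subnet >> 8);    // bytes 6..7:
  addr.s6_addr[7] = static_cast<uint8_t>(subnet & 0xff);  // subnet id
  char buf[INET6_ADDRSTRLEN];
  if (!inet_ntop(AF_INET6, &addr, buf, sizeof(buf))) return "";
  return std::string(buf) + "/64";
}
```

With the global id 2acd2c8bc4 and subnet 0 this yields fd2a:cd2c:8bc4::/64, which is the same network as the fd2a:cd2c:8bc4:0::/64 used in the route command, just with the zero group compressed.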

We can also define a larger subnet:

$ ip -6 route add fd2a:cd2c:8bc4:1100::/56 dev tun0

Now traffic to any IPv6 address of the form fd2a:cd2c:8bc4:11xx:… will be sent to our TUN device:

$ ping6 fd2a:cd2c:8bc4:11ff::23
PING fd2a:cd2c:8bc4:11ff::23(fd2a:cd2c:8bc4:11ff::23) 56 data bytes
64 bytes from fd2a:cd2c:8bc4:11ff::23: icmp_seq=1 ttl=64 time=0.110 ms
...
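
If you want to check by hand which addresses a route like the /56 above covers, the test is just a comparison of the leading bits. A minimal sketch follows; in_prefix is a hypothetical helper, and it only answers whether a single prefix matches, not which of several routes the kernel would choose:

```cpp
#include <arpa/inet.h>    // inet_pton
#include <netinet/in.h>   // in6_addr
#include <sys/socket.h>   // AF_INET6
#include <cstdint>

// Hypothetical helper: true if addr lies inside prefix/len (0 <= len <= 128).
// Returns false if either string is not a valid IPv6 address.
bool in_prefix(const char *addr, const char *prefix, unsigned len)
{
  in6_addr a, p;
  if (len > 128) return false;
  if (inet_pton(AF_INET6, addr, &a) != 1) return false;
  if (inet_pton(AF_INET6, prefix, &p) != 1) return false;
  unsigned full = len / 8;   // number of whole bytes to compare
  unsigned rest = len % 8;   // remaining bits in the next byte
  for (unsigned i = 0; i < full; i++)
    if (a.s6_addr[i] != p.s6_addr[i]) return false;
  if (rest == 0) return true;
  uint8_t mask = static_cast<uint8_t>(0xff << (8 - rest));
  return (a.s6_addr[full] & mask) == (p.s6_addr[full] & mask);
}
```

For example, in_prefix("fd2a:cd2c:8bc4:11ff::23", "fd2a:cd2c:8bc4:1100::", 56) is true, while the same address tested against fd2a:cd2c:8bc4::/64 is not.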

Indeed, we can define all subnets for another global id:

$ hexdump -e '/1 "%02x"' -n 5 /dev/urandom; echo
40bd2f7ba0
$ sudo ip -6 route add fd40:bd2f:7ba0::/48 dev tun0

Just for interest, here’s our entire IPv6 routing table:

$ route -A inet6
Kernel IPv6 routing table
Destination Next Hop Flag Met Ref Use If
fd2a:cd2c:8bc4::/64 :: U 1024 0 0 tun0
fd2a:cd2c:8bc4:1100::/56 :: U 1024 0 0 tun0
fd40:bd2f:7ba0::/48 :: U 1024 0 0 tun0
fe80::/64 :: U 256 0 0 wlan0
fe80::/64 :: U 256 0 0 tun0
::/0 :: !n -1 1 524 lo
::1/128 :: Un 0 1 35 lo
fe80::1/128 :: Un 0 1 10 lo
fe80::a617:31ff:fe5a:334f/128 :: Un 0 1 7 lo
ff00::/8 :: U 256 0 0 wlan0
ff00::/8 :: U 256 0 0 tun0
::/0 :: !n -1 1 524 lo

Finally, to set up a simple service to use IPv6:

In one terminal:

$ nc -l -6 9901
...

In another:

$ nc -6 fd2a:cd2c:8bc4:11ff::23 9901
...

and our logging now looks like this:

$ ./reflect -v
Capability CAP_NET_ADMIN: 1 0 1
Created tun device tun0
48: fe80::1->ff02::2
72: fe80::1->fd2a:cd2c:8bc4:11ff::23
72: fe80::1->fd2a:cd2c:8bc4:11ff::23
72: fe80::1->fd2a:cd2c:8bc4:11ff::23
72: fe80::1->fd2a:cd2c:8bc4:11ff::23
48: fe80::1->ff02::2
48: fe80::1->ff02::2
80: fe80::1->fd2a:cd2c:8bc4:11ff::23
80: fe80::1->fd2a:cd2c:8bc4:11ff::23
72: fe80::1->fd2a:cd2c:8bc4:11ff::23
79: fe80::1->fd2a:cd2c:8bc4:11ff::23
72: fe80::1->fd2a:cd2c:8bc4:11ff::23
72: fe80::1->fd2a:cd2c:8bc4:11ff::23
72: fe80::1->fd2a:cd2c:8bc4:11ff::23
72: fe80::1->fd2a:cd2c:8bc4:11ff::23

The packets to ff02::2, the link-local all-routers multicast address, are IPv6 router discovery. The rest are two TCP flows, one in each direction (we can get more detail from Wireshark, in particular the relevant port numbers, but this gives the general idea).
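
If we wanted the port numbers in our own log rather than from Wireshark, something like the following sketch would do for the simple case. tcp_ports6 and the Ports struct are hypothetical names, and the code assumes the TCP header directly follows the fixed 40-byte IPv6 header (Next Header is 6), so packets with extension headers are simply not recognized:

```cpp
#include <cstddef>
#include <cstdint>

struct Ports { bool ok; uint16_t src; uint16_t dst; };

// Hypothetical helper: extract the TCP source and destination ports from an
// IPv6 packet. Assumes no extension headers: the Next Header field (byte 6)
// must be 6 (TCP) and the TCP header starts at byte 40.
Ports tcp_ports6(const uint8_t *p, size_t nbytes)
{
  Ports r = {false, 0, 0};
  if (nbytes < 44 || (p[0] >> 4) != 6 || p[6] != 6) return r;
  // Ports are 16-bit big-endian values at the start of the TCP header
  r.src = static_cast<uint16_t>((p[40] << 8) | p[41]);
  r.dst = static_cast<uint16_t>((p[42] << 8) | p[43]);
  r.ok = true;
  return r;
}
```

Note that the ports are read before reflect swaps the addresses; the swap itself leaves the TCP header untouched.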


Fun with TUN

TUN devices are much used for virtualization, VPNs, network testing programs, etc. A TUN device is essentially a network interface that also exists as a user space file descriptor: data sent to the interface can be read from the file descriptor, and data written to the file descriptor emerges from the network interface.

Here’s a simple example of their use. We create a TUN device that simulates an entire network, with traffic to each network address just routed back to the original host.

For a complete program, see:

https://github.com/matthewarcus/stuff/blob/master/tun/reflect.cpp

First create your TUN device. This is fairly standard; most public code seems to be derived from Maxim Krasnyansky’s:

https://www.kernel.org/doc/Documentation/networking/tuntap.txt

and our code is no different:

int tun_alloc(char *dev) 
{
  assert(dev != NULL);
  int fd = open("/dev/net/tun", O_RDWR);
  CHECKFD(fd);

  struct ifreq ifr; 
  memset(&ifr, 0, sizeof(ifr)); 
  ifr.ifr_flags = IFF_TUN | IFF_NO_PI;
  strncpy(ifr.ifr_name, dev, IFNAMSIZ-1); // ifr is zeroed, so the name stays NUL-terminated
  CHECKSYS(ioctl(fd, TUNSETIFF, (void *) &ifr));
  strncpy(dev, ifr.ifr_name, IFNAMSIZ); 
  return fd;
}

We want a TUN device (rather than TAP, essentially the same thing but at the ethernet level) and we don’t want packet information at the moment. We copy the name of the allocated device to the char array given as a parameter.
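
For reference, if we left out IFF_NO_PI, each packet read from the descriptor would be preceded by four bytes of packet information: two bytes of flags, then two bytes of protocol (an EtherType in network byte order), as described in the tuntap.txt document linked above. A sketch of reading that prefix, with pi_protocol being a hypothetical helper name:

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: for a TUN device opened WITHOUT IFF_NO_PI, return the
// protocol field of the 4-byte packet information prefix (2 bytes flags,
// 2 bytes protocol in network byte order), or 0 if the buffer is too short.
uint16_t pi_protocol(const uint8_t *buf, size_t nbytes)
{
  if (nbytes < 4) return 0;
  return static_cast<uint16_t>((buf[2] << 8) | buf[3]);
}
```

The value would be 0x0800 for IPv4 and 0x86dd for IPv6, and the IP packet itself would start at buf+4. Since we can get the version from the first nibble of the IP header anyway, skipping the prefix keeps our code simpler.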

Now all our program needs to do is create the TUN device and sit in a loop copying packets:

int main(int argc, char *argv[])
{
  char dev[IFNAMSIZ+1];
  memset(dev,0,sizeof(dev));
  if (argc > 1) strncpy(dev,argv[1],sizeof(dev)-1);

  // Allocate the tun device
  int fd = tun_alloc(dev);
  if (fd < 0) exit(0);

  uint8_t buf[2048];
  while(true) {
    // Sit in a loop, read a packet from fd, reflect
    // addresses and write back to fd.
    ssize_t nread = read(fd,buf,sizeof(buf));
    CHECK(nread >= 0);
    if (nread == 0) break;
    reflect(buf,nread);
    ssize_t nwrite = write(fd,buf,nread);
    CHECK(nwrite == nread);
  }
}

The TUN mechanism ensures that we get exactly one packet for each read, so we don’t need to worry about fragmentation. We just send each packet back with the source and destination IPs swapped:

static inline void put32(uint8_t *p, size_t offset, uint32_t n)
{
  memcpy(p+offset,&n,sizeof(n));
}

static inline uint32_t get32(uint8_t *p, size_t offset)
{
  uint32_t n;
  memcpy(&n,p+offset,sizeof(n));
  return n;
}

void reflect(uint8_t *p, size_t nbytes)
{
  (void)nbytes;
  uint8_t version = p[0] >> 4;
  switch (version) {
  case 4:
    break;
  case 6:
    fprintf(stderr, "IPv6 not implemented yet\n");
    exit(0);
  default:
    fprintf(stderr, "Unknown protocol %u\n", version);
    exit(0);
  }
  uint32_t src = get32(p,12);
  uint32_t dst = get32(p,16);
  put32(p,12,dst);
  put32(p,16,src);
}

We don’t need to recalculate the header checksum: it is a ones’ complement sum of the header’s 16-bit words, so swapping two 32-bit fields (which only reorders the words being summed) doesn’t change it.
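
To convince ourselves about the checksum, here is a sketch of the standard Internet checksum (in the style of RFC 1071); inet_checksum is my own name for it and it assumes an even number of bytes, which is always true for an IP header:

```cpp
#include <cstddef>
#include <cstdint>

// Internet checksum over nbytes of data (nbytes assumed even): the ones'
// complement of the ones' complement sum of the big-endian 16-bit words.
uint16_t inet_checksum(const uint8_t *p, size_t nbytes)
{
  uint32_t sum = 0;
  for (size_t i = 0; i + 1 < nbytes; i += 2)
    sum += static_cast<uint32_t>((p[i] << 8) | p[i + 1]);
  while (sum >> 16)
    sum = (sum & 0xffff) + (sum >> 16);  // fold carries back into 16 bits
  return static_cast<uint16_t>(~sum);
}
```

Because addition is commutative, computing this over a header before and after swapping the source and destination addresses gives the same value. The same reasoning covers the TCP, UDP and ICMPv6 checksums, which include both addresses via a pseudo-header.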

Handling IPv6 is left as an exercise for the reader (we just need to use a different offset and address size I think).

In this day and age, security should be prominent in our minds, particularly for long-running programs like our TUN server, so for extra points, let’s add in some capability processing.

(You might need to install a libcap-dev package for this to work, for example, with “sudo apt-get install libcap-dev” and link with -lcap).

Once we have started up, we should check that we have the required capability. We just require CAP_NET_ADMIN to be permitted:

  cap_t caps = cap_get_proc();
  CHECK(caps != NULL);

  cap_value_t cap = CAP_NET_ADMIN;
  const char *capname = STRING(CAP_NET_ADMIN);

  cap_flag_value_t cap_permitted;
  CHECKSYS(cap_get_flag(caps, cap,
                        CAP_PERMITTED, &cap_permitted));
  if (!cap_permitted) {
    fprintf(stderr, "%s not permitted, exiting\n", capname);
    exit(0);
  }

and then make effective what we require:

  CHECKSYS(cap_clear(caps));
  CHECKSYS(cap_set_flag(caps, CAP_PERMITTED, 1, &cap, CAP_SET));
  CHECKSYS(cap_set_flag(caps, CAP_EFFECTIVE, 1, &cap, CAP_SET));
  CHECKSYS(cap_set_proc(caps));

Finally, after creating our TUN object, before entering our main loop, we can relinquish our extra privileges altogether:

  CHECKSYS(cap_clear(caps));
  CHECKSYS(cap_set_proc(caps));
  CHECKSYS(cap_free(caps));

For completeness, here are the error checking macros used above:

#define CHECKAUX(e,s)                            \
 ((e)? \
  (void)0: \
  (fprintf(stderr, "'%s' failed at %s:%d - %s\n", \
           s, __FILE__, __LINE__,strerror(errno)), \
   exit(0)))
#define CHECK(e) (CHECKAUX(e,#e))
#define CHECKSYS(e) (CHECKAUX((e)==0,#e))
#define CHECKFD(e) (CHECKAUX((e)>=0,#e))
#define STRING(e) #e

Of course, production code will want to do something more sophisticated than calling exit(0) when an error occurs…
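
One possible direction, sketched here under the assumption that the failing call has set errno (as system calls do), is to throw an exception instead of exiting. check_sys is a hypothetical stand-in for CHECKSYS, not part of the original program:

```cpp
#include <cerrno>
#include <system_error>

// Hypothetical alternative to CHECKSYS: throw std::system_error carrying
// errno instead of calling exit(0). Assumes rc == 0 means success and that
// errno was set by the call that produced rc.
inline void check_sys(int rc, const char *what)
{
  if (rc != 0)
    throw std::system_error(errno, std::generic_category(), what);
}
```

main could then catch std::system_error, print e.what(), and return a non-zero status, so that a shell or supervisor can tell a failure from a clean exit.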

To use, compile for example with:

g++ -W -Wall -O3 reflect.cpp -lcap -o reflect

We can set permissions for our new executable to include the relevant capability, so we don’t need to start it as root:

$ sudo setcap cap_net_admin+ep ./reflect

Actually start it:

$ ./reflect&
Capability CAP_NET_ADMIN: 1 0 1
Created tun device tun0

We now have an interface, but it isn’t configured:

$ ifconfig tun0
tun0 Link encap:UNSPEC HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
POINTOPOINT NOARP MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:500
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)

With the interface running, set up networking:

$ sudo ip link set tun0 up
$ sudo ip addr add 10.0.0.1/8 dev tun0

Check all is well:

$ ifconfig tun0
tun0 Link encap:UNSPEC HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
inet addr:10.0.0.1 P-t-P:10.0.0.1 Mask:255.0.0.0
UP POINTOPOINT RUNNING NOARP MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:500
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)

And try it out:

$ ping -c 1 10.0.0.41
PING 10.0.0.41 (10.0.0.41) 56(84) bytes of data.
64 bytes from 10.0.0.41: icmp_req=1 ttl=64 time=0.052 ms

--- 10.0.0.41 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.052/0.052/0.052/0.000 ms

Let’s check performance. First, a flood ping on the loopback device:

$ sudo ping -f -c10000 -s1500 127.0.0.1
PING 127.0.0.1 (127.0.0.1) 1500(1528) bytes of data.

--- 127.0.0.1 ping statistics ---
10000 packets transmitted, 10000 received, 0% packet loss, time 778ms
rtt min/avg/max/mdev = 0.003/0.006/0.044/0.002 ms, pipe 2, ipg/ewma 0.077/0.006 ms

compared to one through the TUN connection:

$ sudo ping -f -c10000 -s1500 10.0.0.100
PING 10.0.0.100 (10.0.0.100) 1500(1528) bytes of data.

--- 10.0.0.100 ping statistics ---
10000 packets transmitted, 10000 received, 0% packet loss, time 945ms
rtt min/avg/max/mdev = 0.022/0.032/3.775/0.038 ms, pipe 2, ipg/ewma 0.094/0.032 ms

Respectable. We have got ourselves a network!