Veth Devices, Network Namespaces and Open vSwitch

It’s useful to be able to set up miniature networks on a Linux machine, for development, testing or just for fun. Here we use veth devices and network namespaces to create a small virtual network, connected together with an Open vSwitch instance. I’m using a Raspberry Pi 3 for this, since it’s less inconvenient when things go wrong, but I don’t think anything here is Pi-specific (and I certainly wouldn’t recommend a Pi for serious routing applications).

A veth device pair acts as a virtual ethernet cable; packets sent into one end come out of the other (and vice versa, of course):

$ sudo ip link add veth0 type veth peer name veth1
$ ip link show type veth
4: veth1@veth0:  mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 92:e3:6f:51:b7:96 brd ff:ff:ff:ff:ff:ff
5: veth0@veth1:  mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether aa:4c:fd:e3:cc:a1 brd ff:ff:ff:ff:ff:ff

I could assign an IP address to both ends and try to send traffic through the link, but since both ends live in the same network stack, the kernel would treat the destination as a local address and deliver the traffic directly rather than over the veth link. Instead, I need to hide one end in a network namespace. Each namespace has its own set of interfaces, routing tables etc. that are private to that namespace. Initially everything is in the global namespace. We can create a new namespace (they are often named after colours) with the ip command:

$ sudo ip netns add blue

Now put the “lower” end of the veth device into the new namespace:

$ sudo ip link set veth1 netns blue

veth1 is no longer visible in the global namespace:

$ ip link show type veth
5: veth0@if4:  mtu 1500 qdisc noqueue state LOWERLAYERDOWN mode DEFAULT group default qlen 1000
    link/ether aa:4c:fd:e3:cc:a1 brd ff:ff:ff:ff:ff:ff link-netnsid 0

but we can see it in the blue namespace:

$ sudo ip netns exec blue ip link show
1: lo:  mtu 65536 qdisc noop state DOWN mode DEFAULT group default qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
4: veth1@if5:  mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 92:e3:6f:51:b7:96 brd ff:ff:ff:ff:ff:ff link-netnsid 0

Note that veth0 is in state LOWERLAYERDOWN because veth1 is now DOWN (as is the loopback interface, lo, in the namespace). We can now assign addresses to veth0 and veth1 and make sure all the interfaces are up:

$ sudo ip addr add 10.0.0.10/24 dev veth0
$ sudo ip netns exec blue ip addr add 10.0.0.1/24 dev veth1
$ sudo ip link set veth0 up
$ sudo ip netns exec blue ip link set veth1 up
$ sudo ip netns exec blue ip link set lo up

Now we can ping the other end:

$ ping -c1 10.0.0.1
PING 10.0.0.1 (10.0.0.1) 56(84) bytes of data.
64 bytes from 10.0.0.1: icmp_seq=1 ttl=64 time=0.197 ms

--- 10.0.0.1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.197/0.197/0.197/0.000 ms

and tcpdump confirms that traffic really is being sent over the veth link:

$ sudo tcpdump -i veth0 icmp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on veth0, link-type EN10MB (Ethernet), capture size 262144 bytes

16:36:50.390395 IP 10.0.0.10 > 10.0.0.1: ICMP echo request, id 20113, seq 1, length 64
16:36:50.390714 IP 10.0.0.1 > 10.0.0.10: ICMP echo reply, id 20113, seq 1, length 64

That’s all we need for the most basic setup. Now we’ll add a second namespace and connect everything together with a switch – we could use a normal Linux bridge for this, but it’s more fun to use Open vSwitch and later use some very basic OpenFlow commands to set up a learning switch.

For a more complicated setup it’s usually a good idea to enable IP forwarding, so while we remember:

$ sudo bash -c "echo 1 > /proc/sys/net/ipv4/ip_forward"

And we now want another veth pair and another namespace:

$ sudo ip netns add red
$ sudo ip link add veth2 type veth peer name veth3
$ sudo ip link set veth3 netns red
$ sudo ip netns exec red ip addr add 10.0.0.2/24 dev veth3
$ sudo ip netns exec red ip link set lo up
$ sudo ip netns exec red ip link set veth3 up

Let’s remove the address assigned above to veth0 (we are going to put veth0 in the bridge anyway, but explicitly removing the address is tidier and prevents confusion later):

$ sudo ip addr del 10.0.0.10/24 dev veth0

Check we have Open vSwitch installed; on Ubuntu:

$ sudo apt-get install openvswitch-switch

and see what is already running:

$ sudo ovs-vsctl show
b494c304-46b7-4ff8-9fa4-581952fae2f1
    ovs_version: "2.3.0"

Add a new bridge:

$ sudo ovs-vsctl add-br ovsbr0
$ sudo ovs-vsctl show
b494c304-46b7-4ff8-9fa4-581952fae2f1
    Bridge "ovsbr0"
        Port "ovsbr0"
            Interface "ovsbr0"
                type: internal
    ovs_version: "2.3.0"

If you’ve been experimenting, remove the upper veths from any bridge they might be in:

$ sudo ip link set veth0 nomaster
$ sudo ip link set veth2 nomaster

and add to the OVS bridge:

$ sudo ovs-vsctl add-port ovsbr0 veth0
$ sudo ovs-vsctl add-port ovsbr0 veth2
$ sudo ovs-vsctl show
b494c304-46b7-4ff8-9fa4-581952fae2f1
    Bridge "ovsbr0"
        Port "veth0"
            Interface "veth0"
        Port "veth2"
            Interface "veth2"
        Port "ovsbr0"
            Interface "ovsbr0"
                type: internal
    ovs_version: "2.3.0"
$ ip link show type veth
5: veth0@if4:  mtu 1500 qdisc noqueue master ovs-system state UP mode DEFAULT group default qlen 1000
    link/ether 4a:b7:05:b1:29:d6 brd ff:ff:ff:ff:ff:ff link-netnsid 0
7: veth2@if6:  mtu 1500 qdisc noqueue master ovs-system state UP mode DEFAULT group default qlen 1000
    link/ether 5e:a7:f8:99:d4:ba brd ff:ff:ff:ff:ff:ff link-netnsid 1

Master for veth0 and veth2 is now the ovs-system device. Note that both links are UP.

Now we are ready to go (we set up everything within the namespaces earlier):

$ sudo ip netns exec blue ping -c1 10.0.0.2
PING 10.0.0.2 (10.0.0.2) 56(84) bytes of data.
64 bytes from 10.0.0.2: icmp_seq=1 ttl=64 time=0.922 ms

--- 10.0.0.2 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.922/0.922/0.922/0.000 ms

Let’s try external connectivity:

$ sudo ip netns exec blue ping -c1 8.8.8.8
connect: Network is unreachable

Looks like a routing problem:

$ sudo ip netns exec blue ip route
10.0.0.0/24 dev veth1 proto kernel scope link src 10.0.0.1 

There is no default route, so let’s add one:

$ sudo ip netns exec blue ip route add default via 10.0.0.254

and this will need an address on the bridge itself:

$ sudo ip addr add 10.0.0.254/24 dev ovsbr0

Try again:

$ sudo ip netns exec blue ping -c1 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.

--- 8.8.8.8 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms

Better, we seem to be sending packets out of the namespace and this is confirmed by tcpdump on the bridge interface:

$ sudo tcpdump -i ovsbr0
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ovsbr0, link-type EN10MB (Ethernet), capture size 262144 bytes
13:20:19.777324 IP 10.0.0.1 > google-public-dns-a.google.com: ICMP echo request, id 5667, seq 1, length 64
13:20:24.831303 ARP, Request who-has 10.0.0.254 tell 10.0.0.1, length 28
13:20:24.831565 ARP, Reply 10.0.0.254 is-at 06:d4:34:9b:26:42 (oui Unknown), length 28

And we can see the packet exiting on wlan0:

$ sudo tcpdump -i wlan0 icmp
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on wlan0, link-type EN10MB (Ethernet), capture size 262144 bytes
13:21:54.727306 IP 10.0.0.1 > google-public-dns-a.google.com: ICMP echo request, id 5697, seq 1, length 64

but sadly the source address is still in the 10.0.0.0 subnet and it’s not surprising that the Google DNS server isn’t responding.

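Part of this exercise is to find out about Open vSwitch and its capabilities, and I would hope that they include setting up simple NAT, but I have no idea how to do that right now, so we’ll just use iptables. Set up NAT and make sure forwarding is allowed (note that iptables -F with no table given flushes every rule in the filter table, so skip that line if you have existing rules you want to keep; the nat rule just added is not affected):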

$ sudo iptables -t nat -A POSTROUTING -s 10.0.0.0/24 -o wlan0 -j MASQUERADE
$ sudo iptables -F
$ sudo iptables -P FORWARD ACCEPT

Now all is well:

$ sudo ip netns exec blue ping -n -c1 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
64 bytes from 8.8.8.8: icmp_seq=1 ttl=57 time=19.4 ms

--- 8.8.8.8 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 19.434/19.434/19.434/0.000 ms

This assumes we have a default forwarding policy of ACCEPT. It may be prudent to be more selective:

$ sudo iptables -P FORWARD DROP
$ sudo iptables -A FORWARD -s 10.0.0.0/24 -o wlan0 -j ACCEPT
$ sudo iptables -A FORWARD -d 10.0.0.0/24 -i wlan0 -j ACCEPT

so that we only forward traffic to and from the switch network.

We can add an external IP to the switch. The main connection to my Pi 3 is through wlan0 and so I’d like to leave that alone, so let’s put eth0 into the switch:

$ sudo ovs-vsctl add-port ovsbr0 eth0

Now attach a network cable, e.g. directly to another laptop (crossover cables are largely a thing of the past), and on that machine configure an IP address in our 10.0.0.0/24 subnet (here its wired interface is assumed to be called eth0 as well):

$ sudo ip addr add 10.0.0.100/24 dev eth0

and we have connectivity out of our box – the external IP is now the switch address.

Finally, since we have been doing so well, let’s program our OVS bridge to be a learning switch. See, for example, http://openvswitch.org/support/dist-docs-2.5/tutorial/Tutorial.md.html for further information.

First, turn off the default flow rules:

$ sudo ovs-vsctl set Bridge ovsbr0 fail-mode=secure

When we created the OVS bridge earlier, it started out with a single default flow rule whose action is NORMAL, which makes the bridge behave like a conventional MAC-learning switch, so what follows reimplements by hand something OVS already does for us. Setting “fail-mode=secure” means no default rule is installed, so with an empty flow table all packets are dropped.

Next, if we have been playing, it’s a good idea to clear the rule table:

$ sudo ovs-ofctl del-flows ovsbr0

Now set up the learning rules. The idea is that when a packet comes in from a particular MAC address, the switch remembers which interface the packet arrived on, so when it wants to send a packet to that address, it can just send it on the interface recorded earlier. We can do a similar thing with the local interface so we don’t need to configure the rules to handle whatever the local MAC address is (maybe there is a better way to handle the local interface – comments welcome).

$ sudo ovs-ofctl add-flow ovsbr0 "table=0, priority=60, in_port=LOCAL, actions=learn(table=10, NXM_OF_ETH_DST[]=NXM_OF_ETH_SRC[], load:0xffff->NXM_NX_REG0[0..15]), resubmit(,1)"
$ sudo ovs-ofctl add-flow ovsbr0 "table=0, priority=50, actions=learn(table=10, NXM_OF_ETH_DST[]=NXM_OF_ETH_SRC[], load:NXM_OF_IN_PORT[]->NXM_NX_REG0[0..15]), resubmit(,1)"

The first rule says that when a packet originating locally is received, i.e. one being sent from a local process, add a rule (to table 10) that says that when a packet is later received addressed to the same MAC address, put the value 0xFFFF in the lower 16 bits of register 0. The second is the same but for packets received from the other interfaces in the switch: add a rule that puts the interface number in register 0. Having added a rule, processing continues with table 1.

In table 1, we have:

$ sudo ovs-ofctl add-flow ovsbr0 "table=1 priority=99 dl_dst=01:00:00:00:00:00/01:00:00:00:00:00 actions=resubmit(,2)"
$ sudo ovs-ofctl add-flow ovsbr0 "table=1 priority=50 actions=resubmit(,10), resubmit(,2)"

The first rule sends packets with a multicast or broadcast ethernet address (anything with the group bit set in the destination) directly through to table 2. The second rule goes through table 10 first, the idea being that if the packet is being sent to a known MAC address, table 10 will put the number of the interface in register 0, or 0xffff if it’s the local MAC address; register 0 is left at 0 if the address hasn’t been learned yet.

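Finally table 2 just sends packets off to the right place using the register 0 values (this assumes the switch ports are OpenFlow ports 1, 2 and 3; ovs-ofctl show ovsbr0 lists the actual numbers):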

$ sudo ovs-ofctl add-flow ovsbr0 "table=2 reg0=0 actions=LOCAL,1,2,3"
$ sudo ovs-ofctl add-flow ovsbr0 "table=2 reg0=1 actions=1"
$ sudo ovs-ofctl add-flow ovsbr0 "table=2 reg0=2 actions=2"
$ sudo ovs-ofctl add-flow ovsbr0 "table=2 reg0=3 actions=3"
$ sudo ovs-ofctl add-flow ovsbr0 "table=2 reg0=0xffff actions=LOCAL"
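
The way the three tables fit together can be modelled in a few lines of plain C++. This is only a sketch under my own assumptions, not anything OVS provides: LearningSwitch, forward, kLocalMarker and kGroupBit are made-up names, MAC addresses are held as 48-bit integers, and I am assuming that a packet is never sent back out of the port it arrived on.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical model of the three-table pipeline; not OVS code.
constexpr uint16_t kLocalMarker = 0xffff;          // reg0 value standing for LOCAL
constexpr uint64_t kGroupBit = 0x010000000000ULL;  // 01:00:00:00:00:00

struct LearningSwitch {
    std::vector<uint16_t> ports;           // numbered ports, e.g. {1, 2, 3}
    std::map<uint64_t, uint16_t> table10;  // learned: MAC -> reg0 value

    // in_port is a port number, or kLocalMarker for locally generated frames.
    // Returns the ports the frame goes out of (kLocalMarker meaning LOCAL).
    std::vector<uint16_t> forward(uint16_t in_port, uint64_t src, uint64_t dst) {
        assert(in_port != 0);              // 0 is reserved for "not learned"
        table10[src] = in_port;            // table 0: learn the source address
        uint16_t reg0 = 0;
        if ((dst & kGroupBit) == 0) {      // table 1: unicast, so consult table 10
            auto it = table10.find(dst);
            if (it != table10.end()) reg0 = it->second;
        }
        std::vector<uint16_t> out;         // table 2: choose the outputs
        if (reg0 == 0) {                   // unknown or multicast: flood
            if (in_port != kLocalMarker) out.push_back(kLocalMarker);
            for (uint16_t p : ports)
                if (p != in_port) out.push_back(p);
        } else if (reg0 != in_port) {      // known destination: one port only
            out.push_back(reg0);
        }
        return out;
    }
};
```

Calling forward for a frame from port 1 to an unknown address floods it; once the reply has been seen coming in on port 2, later frames to that address go out of port 2 only.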

To inspect all the rule tables (including any rules that have been added to table 10 by the learn actions in table 0):

$ sudo ovs-ofctl dump-flows ovsbr0

Now things should work much as they did before. If not, see the link above for further information on OVS testing and debugging.


Redirection

Like many programmers, I suspect, I have used many software tools on a regular basis for years while remaining woefully ignorant of their inner workings and true potential. One of these is the Unix command line.

It’s common, for example, to want the error output of a program to appear as the normal output, and trial and error, Googling or looking at Stack Overflow leads to:

strace ls 2>&1 >/dev/null

which works fine but seems puzzling: we’ve told the shell to send error output to normal output, then normal output to /dev/null, so why doesn’t that discard everything, as this does:

strace ls >/dev/null 2>&1

This is because we don’t understand what is going on.

A redirection is actually a call to the dup2 system call. From the man page:

dup2() makes newfd be the copy of oldfd, closing newfd first if necessary

So n>&m does a dup2(m,n): close fd n if necessary, then make n a copy of fd m. And n>file means: open file as some new fd m, do dup2(m,n), then close m.

Now it all makes sense:

strace ls 2>&1 1>/dev/null 

first of all makes 2 be a copy of 1, then changes 1 to point to /dev/null – the copying is done ‘by value’ as it were (despite the confusing, for C++ programmers anyway, use of ‘&’).
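
The same ‘by value’ behaviour can be reproduced directly with dup2 in a small C++ function. This is just a sketch: what_reaches_terminal is a name I made up, and two pipes stand in for the terminal and for /dev/null so that the experiment doesn’t disturb the real fds 1 and 2.

```cpp
#include <cassert>
#include <string>
#include <unistd.h>

// Hypothetical helper: returns whatever reaches the "terminal" pipe when
// "err" is written to the stand-in for fd 2.
//   err_first == true  models  2>&1 1>file
//   err_first == false models  1>file 2>&1
std::string what_reaches_terminal(bool err_first) {
    int term[2], file[2];
    if (pipe(term) != 0 || pipe(file) != 0) return "pipe failed";
    int out = dup(term[1]);              // plays the role of fd 1
    int err = dup(term[1]);              // plays the role of fd 2
    assert(out >= 0 && err >= 0);
    if (err_first) {
        dup2(out, err);                  // 2>&1: err copies what out is right now
        dup2(file[1], out);              // 1>file: only out changes
    } else {
        dup2(file[1], out);              // 1>file
        dup2(out, err);                  // 2>&1: err copies the file as well
    }
    if (write(err, "err", 3) != 3) return "write failed";
    // Close every write end so that reading the terminal pipe can see EOF.
    close(out); close(err); close(term[1]); close(file[1]);
    char buf[16];
    ssize_t n = read(term[0], buf, sizeof buf);
    close(term[0]); close(file[0]);
    return std::string(buf, n > 0 ? static_cast<size_t>(n) : 0);
}
```

With err_first set the text still arrives at the terminal pipe; with the redirections the other way round nothing does, because by then the stand-in for fd 1 already refers to the file.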

Using strace here is not an accident, but used like this it doesn’t tell us much: redirection is handled by the shell, not by the program, so we need to do something like this for further insight:

$ strace -f -etrace=clone,execve,open,dup2 bash -c 'ls >/dev/null 2>&1'
execve("/bin/bash", ["bash", "-c", "ls >/dev/null 2>&1"], [/* 46 vars */]) = 0
clone(Process 20454 attached
child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f643f4c09d0) = 20454
Process 20453 suspended
[pid 20454] open("/dev/null", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3
[pid 20454] dup2(3, 1)                  = 1
[pid 20454] dup2(1, 2)                  = 2
[pid 20454] execve("/bin/ls", ["ls"], [/* 45 vars */]) = 0
Process 20453 resumed
Process 20454 detached
--- SIGCHLD (Child exited) @ 0 (0) ---

bash forks off a subprocess (a clone syscall these days rather than fork), which then sets up the input and output before calling execve to actually run the command.
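
What bash does here can be sketched in C++ as the same open, dup2, dup2, exec sequence run in a forked child. This is a minimal illustration rather than what bash really contains; run_redirected is a hypothetical name, and error handling is kept to the bare minimum.

```cpp
#include <cassert>
#include <fcntl.h>
#include <sys/wait.h>
#include <unistd.h>

// Run argv[0] with stdout and stderr both sent to `file`, roughly what the
// shell does for `cmd >file 2>&1`. Returns the child's exit status, or -1.
int run_redirected(const char* file, char* const argv[]) {
    assert(argv != nullptr && argv[0] != nullptr);
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        // Child: set up the descriptors, then replace ourselves with the command.
        int fd = open(file, O_WRONLY | O_CREAT | O_TRUNC, 0666);
        if (fd < 0 || dup2(fd, 1) < 0 || dup2(1, 2) < 0) _exit(127);
        if (fd > 2) close(fd);           // the temporary descriptor is not needed
        execvp(argv[0], argv);
        _exit(127);                      // only reached if the exec failed
    }
    // Parent: wait for the command, as the shell does for a foreground job.
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The redirections are all applied in the child between fork and exec, which is why stracing the command itself shows none of them.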

We don’t have to limit ourselves to fds 0,1 and 2 that we get to start with:

$ runit () { echo stderr 1>&2; echo stdout; }
$ runit 1>/dev/null
stderr
$ runit 2>/dev/null
stdout
$ runit 3>&2 2>&1 1>&3
stderr
stdout
$ (runit 3>&2 2>&1 1>&3) 1> /dev/null
stdout
$ (runit 3>&2 2>&1 1>&3) 2> /dev/null
stderr

We duplicate 2 to 3, then 1 to 2, then 3 to 1, and we have swapped stdout and stderr.
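
The three-step swap translates directly into dup and dup2 calls. A minimal sketch (swap_fds is my own name for it, and it works on any pair of descriptors rather than just 1 and 2):

```cpp
#include <cassert>
#include <unistd.h>

// Swap what two descriptors refer to, using a temporary third descriptor.
// With a == 1 and b == 2 this mirrors 3>&2 2>&1 1>&3.
// Returns 0 on success, -1 on failure.
int swap_fds(int a, int b) {
    assert(a != b);
    int tmp = dup(b);                      // 3>&2: remember where b points
    if (tmp < 0) return -1;
    if (dup2(a, b) < 0 || dup2(tmp, a) < 0) {  // 2>&1, then 1>&3
        close(tmp);
        return -1;
    }
    close(tmp);  // the shell version leaves fd 3 open; 3>&- would close it
    return 0;
}
```

After swap_fds(1, 2), anything written to fd 1 goes where fd 2 used to point, and vice versa.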

We can also pass non-standard file descriptors in to programs, though this doesn’t seem to be a technique used much:

#include <unistd.h>
int main()
{
  char buffer[256];
  ssize_t n;
  while ((n = read(3,buffer,sizeof(buffer))) > 0) {
    write(4,buffer,n);
  }
}

and do:

$ g++ -Wall cat34.cpp -o cat34
$ echo Hello World | ./cat34 3<&0 4>&1
Hello World

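It’s interesting that this also works (for duplication, n<&m and n>&m both end up as dup2(m,n); the direction of the arrow only decides which fd is the default when n is omitted):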

$ echo Hello World | ./cat34 3>&0 4<&1
Hello World

While we are in this part of town, let’s talk briefly about named pipes, another feature that has been around for ever, but doesn’t seem to get used as much as it deserves. We can run into problems though:

Suppose I want to capture an HTTP response from a web server. I can do this:

$ mkfifo f1
$ mkfifo f2
$ mkfifo f3
$ netcat -l 8888 <f1 >f2 &
$ netcat www.bbc.co.uk 80 <f2 >f3 &
$ tee foo.txt <f3 >f1 &

and try a download, but alas:

$ GET http://localhost:8888/ >/dev/null
Can't connect to localhost:8888 (Connection refused)

This doesn’t seem right, netcat should be listening on port 8888, I told it so myself! And checking with netstat shows no listener on 8888 and finally ps -aux shows no sign of any netcat processes – what is going on?

Once again, strace helps us see the true reality of things:

$ kill %1
$ strace netcat -l 8888 < f1 > f2

But strace tells us nothing – there is no output! Like the dog that didn’t bark in the night though, this is an important clue, and widening our area of investigation, we find:

$ strace -f -etrace=open,execve bash -c 'netcat -l 8888 < f1 > f2'
execve("/bin/bash", ["bash", "-c", "netcat -l 8888 < f1 > f2"], [/* 46 vars */]) = 0
Process 20621 attached
Process 20620 suspended
[pid 20621] open("f1", O_RDONLY ...

The shell is stalled trying to open the “f1” fifo, before it even gets around to starting the netcat program, which is why the first strace didn’t show anything. What we have forgotten is that opening a fifo blocks until some process has the other end open (it doesn’t have to be actively reading or writing, it just has to be there). The shell handles redirections in the order they appear, so since our 3 processes all open their read fifo first, none gets around to opening its write fifo. We have deadlock, in fact the classic dining philosophers problem, and a simple solution is for one philosopher to pick up the forks in a different order:

$ netcat -l 8888 >f2 <f1 &
$ netcat www.bbc.co.uk 80 <f2 >f3 &
$ tee foo.txt <f3 >f1 &
$ GET http://localhost:8888/ >/dev/null
$ cat foo.txt
HTTP/1.1 301 Moved Permanently
Server: Apache
...
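
The blocking behaviour of open on a fifo can also be probed from C++ without hanging, by asking for O_NONBLOCK: a non-blocking open for writing fails (with ENXIO) when nobody has the read end open, while a non-blocking open for reading returns at once. The sketch below uses made-up names (ProbeResult, try_open_writer, probe_fifo) and assumes /tmp is writable.

```cpp
#include <cassert>
#include <cstdlib>
#include <fcntl.h>
#include <string>
#include <sys/stat.h>
#include <unistd.h>

struct ProbeResult {
    bool writer_opens_without_reader;
    bool writer_opens_with_reader;
};

// Open the write end without blocking; fails with ENXIO if no process has
// the read end open.
int try_open_writer(const std::string& path) {
    return open(path.c_str(), O_WRONLY | O_NONBLOCK);
}

// Create a fifo in a fresh temporary directory and try to open its write end
// before and after a reader exists.
ProbeResult probe_fifo() {
    ProbeResult r{false, false};
    char dir[] = "/tmp/fifo-demo-XXXXXX";
    if (mkdtemp(dir) == nullptr) return r;
    std::string path = std::string(dir) + "/f1";
    if (mkfifo(path.c_str(), 0600) == 0) {
        int w = try_open_writer(path);            // nobody is reading yet
        r.writer_opens_without_reader = (w >= 0);
        if (w >= 0) close(w);
        int rd = open(path.c_str(), O_RDONLY | O_NONBLOCK);
        assert(rd >= 0);                          // this open never waits
        w = try_open_writer(path);                // a reader is now present
        r.writer_opens_with_reader = (w >= 0);
        if (w >= 0) close(w);
        close(rd);
        unlink(path.c_str());
    }
    rmdir(dir);
    return r;
}
```

Without O_NONBLOCK the first attempt would simply sit there, which is exactly the state each of our three shell processes was stuck in.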

We can, it should be noted, do this more easily with a normal pipeline and a single fifo, and avoid all these problems:

$ netcat www.bbc.co.uk 80 <f1 | tee foo.txt | netcat -l 8888 >f1

but that would be less fun and possibly less instructive.


Embedded Python Interpreter

And now for something completely different…

Often, I’d like to embed a reasonably capable command interpreter in a C++ application. Python seems a likely candidate, so here’s some investigative code using separate processes (the next step will be to use threads, if that’s possible, so the interpreter can live in the same memory space as our application, that can wait for part II though). As well as the mechanics of embedding Python, we have a pleasant excursion through the sometimes murky worlds of signal handling and pseudo-terminals.

The server structure is conventional (though not necessarily suitable for a serious production server): on each incoming connection we fork a handler process, which in turn splits into two processes that form their own process group under the control of a pseudo-terminal (pty). One forwarding process copies data between the socket and the master side of the pty; the other runs the interpreter itself on the slave side. Simple enough, with a few subtleties. To get signal handling right, we have to ignore SIGINT in the forwarding process (otherwise it will terminate on interrupt, taking the interpreter with it), but leave the default handler in the interpreter process: Python sets up its own signal handler, but it only seems to do this if the handler hasn’t been redefined already. Also, Python seems to insist on using fds 0, 1 and 2, so we need to rebind them. Finally, to get Python to do line editing, we need to import readline in the interpreter.

My main interest here is in getting external access to the interpreter, rather than the mechanics of calling between C and Python, so we just have a couple of simple functions init() and func() defined in the embedded interpreter as examples. At this simple level I don’t think we need to worry about reference counts etc.

#include <Python.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include <signal.h>
#include <time.h>
#include <errno.h>
#include <netinet/ip.h>
#include <sys/epoll.h>

// Some handy macros to help with error checking
// When prototyping, it's a good idea to check every
// system call for errors, these macros help to keep
// the code uncluttered.

#define CHECK(e) \
 ((e)? \
  (void)0: \
  (fprintf(stderr, "'%s' failed at %s:%d\n - %s\n", \
           #e, __FILE__, __LINE__,strerror(errno)), \
   exit(1)))

#define CHECKSYS(e) (CHECK((e)==0))
#define CHECKFD(e) (CHECK((e)>=0))

// We are told not to use signal, due to portability problems
// so we will define a similar function ourselves with sigaction
void setsignal(int signal, sighandler_t handler)
{
  struct sigaction sa;
  memset(&sa,0,sizeof(sa));
  sa.sa_handler = handler;
  CHECKSYS(sigaction(signal,&sa,NULL));
}

// Make a suitable server socket, as a small concession to
// security, we will hardwire the loopback address as the
// bind address. People elsewhere can come in through an SSH
// tunnel.
int makeserversock(int port)
{
  int serversock = socket(AF_INET,SOCK_STREAM,0);
  CHECKFD(serversock);
  sockaddr_in saddr;
  memset(&saddr,0,sizeof(saddr));
  saddr.sin_family = AF_INET;
  saddr.sin_port = htons(port);
  saddr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

  int optval = 1;
  CHECKSYS(setsockopt(serversock, SOL_SOCKET, SO_REUSEADDR, 
                      &optval, sizeof optval));
  CHECKSYS(bind(serversock,(sockaddr*)&saddr,sizeof(saddr)));
  CHECKSYS(listen(serversock,10));
  return serversock;
}

// Copy data between our socket fd and the master
// side of the pty. A simple epoll loop.
int runforwarder(int mpty, int sockfd)
{
  static const int MAX_EVENTS = 10;
  int epollfd = epoll_create(MAX_EVENTS);
  CHECKFD(epollfd);
  epoll_event event;
  memset (&event, 0, sizeof(event));
  event.events = EPOLLIN;
  event.data.fd = sockfd;
  CHECKSYS(epoll_ctl(epollfd, EPOLL_CTL_ADD, sockfd, &event));
  event.data.fd = mpty;
  CHECKSYS(epoll_ctl(epollfd, EPOLL_CTL_ADD, mpty, &event));
  char ibuff[256];
  while (true) {
    struct epoll_event events[MAX_EVENTS];
    int nfds = epoll_wait(epollfd, events, MAX_EVENTS, -1);
    // Maybe treat EINTR specially here.
    CHECK(nfds >= 0);
    for (int i = 0; i < nfds; ++i) {
      int fd = events[i].data.fd;
      if (events[i].events & EPOLLIN) {
        ssize_t nread = read(fd,ibuff,sizeof(ibuff));
        CHECK(nread >= 0);
        if (nread == 0) {
          goto finish;
        } else {
          write(mpty+sockfd-fd,ibuff,nread);
        }
      } else if (events[i].events & (EPOLLERR|EPOLLHUP)) {
        goto finish;
      } else {
        fprintf(stderr, "Unexpected event for %d: 0x%x\n", 
                fd, events[i].events);
        goto finish;
      }
    }
  }
 finish:
  CHECKSYS(close(mpty));
  CHECKSYS(close(sockfd));
  CHECKSYS(close(epollfd));
  return 0;
}

// The "application" functions to be accessible from
// the embedded interpreter
int myinit()
{
  srand(time(NULL));
  return 0;
}

int myfunc()
{
  return rand();
}

// Python wrappers around our application functions
static PyObject*
emb_init(PyObject *self, PyObject *args)
{
    if (!PyArg_ParseTuple(args, ":init")) return NULL;
    return Py_BuildValue("i", myinit());
}

static PyObject*
emb_func(PyObject *self, PyObject *args)
{
    if (!PyArg_ParseTuple(args, ":func")) return NULL;
    return Py_BuildValue("i", myfunc());
}

static PyMethodDef EmbMethods[] = {
    {"init", emb_init, METH_VARARGS,
     "(Re)initialize the application."},
    {"func", emb_func, METH_VARARGS,
     "Run the application"},
    {NULL, NULL, 0, NULL}
};

int runinterpreter(char *argname, int fd)
{
  CHECKFD(dup2(fd,0));
  CHECKFD(dup2(fd,1));
  CHECKFD(dup2(fd,2));
  CHECKSYS(close(fd)); 

  Py_SetProgramName(argname);
  Py_Initialize();
  Py_InitModule("emb", EmbMethods);
  PyRun_SimpleString("from time import time,ctime\n");
  PyRun_SimpleString("from emb import init,func\n");
  PyRun_SimpleString("print('Today is',ctime(time()))\n");
  PyRun_SimpleString("import readline\n");
  PyRun_InteractiveLoop(stdin, "-");
  Py_Finalize();

  return 0;
}

int main(int argc, char *argv[])
{
  int port = -1;
  if (argc > 1) {
    port = atoi(argv[1]);
  } else {
    fprintf(stderr, "Usage: %s <port>\n", argv[0]);
    exit(1);
  }
  setsignal(SIGCHLD, SIG_IGN);
  int serversock = makeserversock(port);
  while (true) {
    int sockfd = accept(serversock,NULL,NULL);
    CHECKFD(sockfd);
    if (fork() != 0) {
      // Server side, close new connection and continue
      CHECKSYS(close(sockfd));
    } else {
      // Client side, close server socket
      CHECKSYS(close(serversock)); serversock = -1;
       // Create a pseudo-terminal
      int mpty = posix_openpt(O_RDWR);
      CHECKFD(mpty);
      CHECKSYS(grantpt(mpty)); // pty magic
      CHECKSYS(unlockpt(mpty));
      // Start our own session
      CHECK(setsid()>0); 
      int spty = open(ptsname(mpty),O_RDWR);
      // spty is now our controlling terminal
      CHECKFD(spty);
      // Now split into two processes, one copying data
      // between socket and pty; the other running the
      // actual interpreter.
      if (fork() != 0) {
        CHECKSYS(close(spty));
        // Ignore sigint here
        setsignal(SIGINT, SIG_IGN);
        return runforwarder(mpty,sockfd);
      } else {
        CHECKSYS(close(sockfd));
        CHECKSYS(close(mpty)); 
        // Default sigint here - will be replaced by interpreter
        setsignal(SIGINT, SIG_DFL);
        return runinterpreter(argv[0],spty);
      }
    }
  }
}

Compilation needs something like:

g++ -g -Wall -I/usr/include/python2.6 embed.cpp -o embed -L/usr/lib/python2.6/config -lpython2.6

Suitable flags can be obtained by doing:

	/usr/bin/python2.6-config --cflags
	/usr/bin/python2.6-config --ldflags

Of course, all this will depend on your exact Python version and where it is installed. Embedding has changed somewhat in Python 3, but most of this will still apply.

To connect to the interpreter, we can use our good friend netcat, with some extra tty mangling (we want e.g. control-C to be handled by the pty defined above in the server code, not the user terminal, so we put the latter into raw mode):

#!/bin/sh
ttystate=$(stty -g)
stty raw -echo
netcat "$@"
stty "$ttystate"

We set up the server socket to only listen on the loopback interface, so in order to have secure remote access, we can set up an SSH tunnel by running something like:

$ ssh -N -L 9998:localhost:9999 <serverhost>

on the client host.

Finally, we can run some Python:

$ connect localhost 9998
('Today is', 'Sun Nov  4 21:09:09 2012')
>>> print 1
1
>>> init()
0
>>> func()
191482566
>>> ^C
KeyboardInterrupt
>>> ^D
$