Threaded HTTP Server

Background

Python 2 ships with an enjoyably simple module called SimpleHTTPServer, which does exactly what it sounds like: it hosts a very basic HTTP server. (In Python 3 it was merged into http.server.) For those who aren't very familiar with the module, running it serves the contents of the current directory as a basic site. This is useful for browsing or downloading files remotely (when you're too lazy to scp) or for testing web resources.

To drive home the point, one might see the following while hosting a site via SimpleHTTPServer, and then loading it from the browser:

$ python -m SimpleHTTPServer
Serving HTTP on 0.0.0.0 port 8000 ...
192.168.0.20 - - [17/Oct/2016 00:08:16] "GET / HTTP/1.1" 200 -
192.168.0.20 - - [17/Oct/2016 00:08:18] "GET /favicon.ico HTTP/1.1" 404 -

Problem

This works wonderfully. One day, though, I realized that I was relying on this simplistic functionality while testing my opendir-dl project. This was before I had written most of the unit tests, so some testing was done manually (not something I remember fondly). The project was difficult to test by nature, since some functionality required interaction with a web server. It was obvious that I couldn't manually start an instance of SimpleHTTPServer each time I wanted to run the tests, so I needed a way to give the unit tests the power to start and stop the HTTP server on their own.

As with most challenges, this sounds relatively simple at first. But when we look at how SimpleHTTPServer runs, the simplicity of programmatic interaction is quickly lost. We clearly cannot just run SimpleHTTPServer from the unit test, as there is no effective way to stop the server in a linear workflow. For example, let's slightly modify the example code provided by the SimpleHTTPServer documentation.

import SimpleHTTPServer
import SocketServer

PORT = 8000
Handler = SimpleHTTPServer.SimpleHTTPRequestHandler
httpd = SocketServer.TCPServer(("", PORT), Handler)

print "serving at port", PORT
httpd.serve_forever()

httpd.shutdown()
httpd.server_close()

Notice that, aside from the whitespace, the only change I've made is adding the last two lines, which are meant to stop the server and cleanly close the open socket via the shutdown and server_close methods.

Running this code will be no different from running the original example, though, because httpd.serve_forever() does exactly what its name implies: it serves forever, and execution never reaches the next line. So how do I not serve forever?

Solution

A very simple solution is to run the HTTP server in a separate thread, which can be controlled from the main thread of execution. Makes enough sense, right? I found this Stack Overflow page, which covers the core of what I'm trying to accomplish here, so I've cleaned it up and packaged it into a nice class. The first draft of this is as follows:

import threading
import SocketServer
import SimpleHTTPServer

class ThreadedHTTPServer(object):
    handler = SimpleHTTPServer.SimpleHTTPRequestHandler
    def __init__(self, host, port):
        self.server = SocketServer.TCPServer((host, port), self.handler)
        self.server_thread = threading.Thread(target=self.server.serve_forever)
        self.server_thread.daemon = True

    def start(self):
        self.server_thread.start()

    def stop(self):
        self.server.shutdown()
        self.server.server_close()

We can test the functionality of this using the following:

import httplib2

# Start the threaded server
server = ThreadedHTTPServer("localhost", 8000)
server.start()

# Make the request
http_session = httplib2.Http()
response = http_session.request("http://localhost:8000/")
status = response[0]["status"]
print status

# Close the server
server.stop()

When we run that from the file serve_test.py, it yields the following results:

$ ./serve_test.py
127.0.0.1 - - [17/Oct/2016 00:35:00] "GET / HTTP/1.1" 200 -
200

This is excellent, and means we can go home for the day now, right? I mean, we started and stopped the SimpleHTTPServer just fine. Wrong. We've gotten this somewhat functional, but we now have two problems to address.

Problem 1: Unnecessary Output

The first problem is that the class we have is very noisy. I have many tests to run, and I don't need to see every single request that comes through to the temporary server I'm hosting. SimpleHTTPRequestHandler doesn't define its own log_message method, so we'll take a look at BaseHTTPRequestHandler, the class it subclasses. Sure enough, BaseHTTPRequestHandler defines log_message() here.

Now all we have to do is subclass the handler and override the log_message() method. This solution was given by Lauritz V. Thaulow on this Stack Overflow question. The resulting class looks like the following:

from SimpleHTTPServer import SimpleHTTPRequestHandler

class QuietSimpleHTTPRequestHandler(SimpleHTTPRequestHandler):
    # A quiet version of the SimpleHTTPRequestHandler
    def log_message(self, *args):
        pass

Now all we have to do is update our ThreadedHTTPServer to use our quiet handler instead of the SimpleHTTPRequestHandler:

class ThreadedHTTPServer(object):
    handler = QuietSimpleHTTPRequestHandler
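
To see the effect of the override in isolation, here is a minimal sketch. It assumes Python 3, where SimpleHTTPServer and SocketServer live on as http.server and socketserver; the names QuietHandler and captured_log are made up for this illustration and are not part of the original class. It serves a single request with a given handler and returns whatever that handler wrote to stderr, which is where the default log_message() sends its output.

```python
# Python 3 sketch (assumption: the post itself targets Python 2).
import contextlib
import http.client
import http.server
import io
import socketserver
import threading


class QuietHandler(http.server.SimpleHTTPRequestHandler):
    """Hypothetical name: a handler that discards per-request log lines."""

    def log_message(self, format, *args):
        pass


def captured_log(handler_class):
    """Serve one GET / with handler_class and return what it wrote to stderr."""
    log = io.StringIO()
    with contextlib.redirect_stderr(log):
        # Port 0 asks the OS for any free port.
        server = socketserver.TCPServer(("127.0.0.1", 0), handler_class)
        thread = threading.Thread(target=server.serve_forever, daemon=True)
        thread.start()
        try:
            port = server.server_address[1]
            conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
            conn.request("GET", "/")
            conn.getresponse().read()
            conn.close()
        finally:
            server.shutdown()
            server.server_close()
            thread.join()
    return log.getvalue()
```

Calling captured_log(http.server.SimpleHTTPRequestHandler) should return the familiar request line, while captured_log(QuietHandler) should return an empty string.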

Problem 2: 'Socket in use' errors

There are many tests to run, and they need to run quickly, so an important property of the ThreadedHTTPServer is that it cleanly releases the socket it opens. This ensures that the next test doesn't get socket errors while trying to start its own instance of an HTTP server. I had assumed that shutting down the server and closing it properly would be enough, but the port is not released immediately. I first suspected garbage collection, though the more likely cause is TCP's TIME_WAIT state, in which the operating system holds on to a recently closed connection's address for a short while. Unfortunately this delay is enough to cause 'socket in use' errors when the next test binds to the same port. There is, however, a pretty simple workaround:

SocketServer.TCPServer.allow_reuse_address = True

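Telling the SocketServer that it's okay to reuse an address (this sets the SO_REUSEADDR option on the listening socket) is enough for us to bind a new HTTP server to a port before the previous one has been fully cleaned up. The flag is read when the server binds, so it has to be set before the TCPServer is instantiated. This results in smooth sailing all the way through our unit tests.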
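
A rough way to check this behaviour is to bind, serve, and close on the same port several times in a row. The sketch below assumes Python 3 module names, and ReusableTCPServer, QuietHandler, serve_once, and rebind_repeatedly are hypothetical names for this illustration. It sets the flag on a subclass rather than on TCPServer itself, which is a design choice of the sketch and not what the line above does.

```python
# Python 3 sketch (assumption: the post itself targets Python 2).
import http.client
import http.server
import socketserver
import threading


class QuietHandler(http.server.SimpleHTTPRequestHandler):
    def log_message(self, format, *args):
        pass


class ReusableTCPServer(socketserver.TCPServer):
    # Read by server_bind(), which the constructor calls, so the flag
    # must be in place before the server object is created.
    allow_reuse_address = True


def serve_once(port):
    """Bind to port, answer one request, shut down; return (status, port)."""
    server = ReusableTCPServer(("127.0.0.1", port), QuietHandler)
    bound_port = server.server_address[1]
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", bound_port, timeout=5)
        conn.request("GET", "/")
        response = conn.getresponse()
        response.read()
        conn.close()
    finally:
        server.shutdown()
        server.server_close()
        thread.join()
    return response.status, bound_port


def rebind_repeatedly(rounds=3):
    """Reuse one port for several back-to-back servers; return the statuses."""
    statuses = []
    port = 0  # the OS picks a free port for the first round only
    for _ in range(rounds):
        status, port = serve_once(port)
        statuses.append(status)
    return statuses
```

Setting the flag on a subclass keeps the change local to the test helper, so any other code in the same process that uses TCPServer is left alone.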

Updates

Implementing 'with' (Updated on Oct 22, 2016)

I've recently come to learn a little more about Python's 'with' statement, which can be used in situations where a resource requires that some cleanup be done whether or not there is an error. The equivalent can be achieved using try and finally statements. For instance, my unit tests looked similar to the following:

def example_test(self):
    server = ThreadedHTTPServer("localhost", 8000)
    server.start()
    try:
        page_content = urllib2.urlopen("http://localhost:8000/").read()
    finally:
        server.stop()
    print page_content

As described in this excellent post on effbot, we can make this object work with the with statement by defining __enter__ and __exit__ methods. In this case, the only sort of setup and clean up that's needed is the call to the start and stop methods. I've defined these methods as follows:

def __enter__(self):
    self.start()
    return self

def __exit__(self, exc_type, exc_value, traceback):
    self.stop()

That was super simple, right? You can see the fully updated version of the ThreadedHTTPServer below in the Final Code section. With this updated version of the class, the unit test case would look like the following:

def example_test(self):
    with ThreadedHTTPServer("localhost", 8000) as server:
        page_content = urllib2.urlopen("http://localhost:8000/").read()
    print page_content

As you can see, that's a huge improvement over the original version of the example test case.
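
The part that matters for tests is that the cleanup still happens when the body of the with block raises. Here is a small sketch of that guarantee, using a hypothetical FakeServer stand-in that records calls instead of opening a socket, so it runs anywhere:

```python
class FakeServer(object):
    """Hypothetical stand-in for ThreadedHTTPServer; it only records calls."""

    def __init__(self):
        self.calls = []

    def start(self):
        self.calls.append("start")

    def stop(self):
        self.calls.append("stop")

    def __enter__(self):
        self.start()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.stop()
        # Returning None (falsy) lets the exception keep propagating.


def run_failing_test():
    """Simulate a test body that fails, then report what the server saw."""
    server = FakeServer()
    try:
        with server:
            raise ValueError("simulated test failure")
    except ValueError:
        pass
    return server.calls
```

Here run_failing_test() returns ['start', 'stop']: stop() was called even though the block raised, which is the same guarantee the try/finally version gave.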

Final Code

I've created a gist on GitHub for this blog post. It contains the most up-to-date ThreadedHTTPServer class as well as a short example function.
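
The snippets in this post are written for Python 2. For readers on Python 3, a rough port that combines the pieces above might look like the following. This is a sketch and not the contents of the gist: the module names follow the Python 3 renames (http.server, socketserver), and ReusableTCPServer and fetch_status are names invented for the illustration.

```python
# Python 3 sketch of the class described in this post (not the gist itself).
import http.client
import http.server
import socketserver
import threading


class QuietSimpleHTTPRequestHandler(http.server.SimpleHTTPRequestHandler):
    # A quiet version of the SimpleHTTPRequestHandler
    def log_message(self, format, *args):
        pass


class ReusableTCPServer(socketserver.TCPServer):
    # Hypothetical helper: scopes allow_reuse_address to this server type.
    allow_reuse_address = True


class ThreadedHTTPServer(object):
    handler = QuietSimpleHTTPRequestHandler

    def __init__(self, host, port):
        self.server = ReusableTCPServer((host, port), self.handler)
        self.server_thread = threading.Thread(target=self.server.serve_forever)
        self.server_thread.daemon = True

    def start(self):
        self.server_thread.start()

    def stop(self):
        self.server.shutdown()
        self.server.server_close()

    def __enter__(self):
        self.start()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.stop()


def fetch_status(host="127.0.0.1", port=0):
    """Example: serve the current directory and return the status of GET /."""
    with ThreadedHTTPServer(host, port) as server:
        # With port 0 the OS chooses the port, so read back the real one.
        bound_port = server.server.server_address[1]
        conn = http.client.HTTPConnection(host, bound_port, timeout=5)
        conn.request("GET", "/")
        response = conn.getresponse()
        response.read()
        conn.close()
        return response.status
```

Passing port 0 is a convenience of the sketch; with a fixed port such as 8000 the usage matches the examples earlier in the post.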