Python has an enjoyably simple module called SimpleHTTPServer, which does exactly what it sounds like: it hosts a very basic HTTP server. For those who aren't very familiar with the module, running it will host a basic site from the current directory. This is useful for browsing/downloading files remotely (when you're too lazy to scp) or testing web resources.
To drive home the point, one might see the following while hosting a site via SimpleHTTPServer, and then loading it from the browser:
    $ python -m SimpleHTTPServer
    Serving HTTP on 0.0.0.0 port 8000 ...
    192.168.0.20 - - [17/Oct/2016 00:08:16] "GET / HTTP/1.1" 200 -
    192.168.0.20 - - [17/Oct/2016 00:08:18] "GET /favicon.ico HTTP/1.1" 404 -
This works wonderfully. One day, though, I realized that I was relying on this simplistic functionality while testing my opendir-dl project. This was before I had written most of the unit tests, so some testing was done manually (not something I remember fondly). But the nature of the project made it difficult to test, since some functionality required interaction with a web server. It was obvious that I couldn't manually start an instance of SimpleHTTPServer each time I wanted to run the tests, so I needed a way to give the unit tests the power to start and stop the HTTP server on their own.
As with most challenges, this sounds relatively simple at first. But when we look at how SimpleHTTPServer runs, the simplicity of programmatic interaction is quickly lost. We clearly cannot just run SimpleHTTPServer from the unit test, as there is no effective trigger to stop the server in a linear workflow. For example, let's slightly modify the example code provided by the SimpleHTTPServer documentation.
    import SimpleHTTPServer
    import SocketServer

    PORT = 8000

    Handler = SimpleHTTPServer.SimpleHTTPRequestHandler

    httpd = SocketServer.TCPServer(("", PORT), Handler)

    print "serving at port", PORT
    httpd.serve_forever()
    httpd.shutdown()
    httpd.server_close()
Running this code will be no different from the original example code though, because httpd.serve_forever() will do exactly as it implies: it will serve forever, and execution will never reach the next line. So how do I not serve forever?
A very simple solution is to run the HTTP server in a separate thread, which can be controlled from the main thread of execution. Makes enough sense, right? I found this Stack Overflow page which covers the core of what I'm trying to accomplish here, so I've cleaned it up and packaged it into a nice class. The first draft of this is as follows:
    import threading
    import SocketServer
    import SimpleHTTPServer

    class ThreadedHTTPServer(object):
        handler = SimpleHTTPServer.SimpleHTTPRequestHandler

        def __init__(self, host, port):
            self.server = SocketServer.TCPServer((host, port), self.handler)
            self.server_thread = threading.Thread(target=self.server.serve_forever)
            self.server_thread.daemon = True

        def start(self):
            self.server_thread.start()

        def stop(self):
            self.server.shutdown()
            self.server.server_close()
We can test the functionality of this using the following:
    import httplib2

    # Start the threaded server
    server = ThreadedHTTPServer("localhost", 8000)
    server.start()

    # Make the request; httplib2 returns a (response, content) pair
    http_session = httplib2.Http()
    response, content = http_session.request("http://localhost:8000/")
    status = response["status"]
    print status

    # Close the server
    server.stop()
When we run that from the file serve_test.py, it yields the following results:
    $ ./serve_test.py
    127.0.0.1 - - [17/Oct/2016 00:35:00] "GET / HTTP/1.1" 200 -
    200
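For anyone following along on Python 3, where SimpleHTTPServer and SocketServer were folded into http.server and socketserver, the same start/request/stop cycle might look roughly like the sketch below. It is a port of the draft class above, not a drop-in replacement for it. The fetch_status helper, the use of http.client instead of httplib2, and binding to port 0 (so the operating system picks a free port) are all assumptions made for this illustration.

```python
import http.client
import http.server
import socketserver
import threading


class ThreadedHTTPServer(object):
    """Python 3 sketch: serve the current directory from a background thread."""

    handler = http.server.SimpleHTTPRequestHandler

    def __init__(self, host, port):
        self.server = socketserver.TCPServer((host, port), self.handler)
        self.server_thread = threading.Thread(target=self.server.serve_forever)
        self.server_thread.daemon = True

    def start(self):
        self.server_thread.start()

    def stop(self):
        # shutdown() ends the serve_forever() loop; server_close() frees the socket
        self.server.shutdown()
        self.server.server_close()


def fetch_status(host="127.0.0.1"):
    """Hypothetical helper: start a server, request "/", stop, return the status."""
    # Port 0 is an assumption of this sketch: the OS assigns any free port.
    server = ThreadedHTTPServer(host, 0)
    port = server.server.server_address[1]
    server.start()
    try:
        connection = http.client.HTTPConnection(host, port)
        connection.request("GET", "/")
        response = connection.getresponse()
        response.read()  # drain the body so the server finishes the request
        connection.close()
        return response.status
    finally:
        server.stop()
```

Calling fetch_status() from a readable directory should return the same status code the Python 2 run printed, while still logging the request line to stderr.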
This is excellent, and means we can go home for the day now right? I mean, we started and stopped the SimpleHTTPServer just fine. Wrong. We've gotten this somewhat functional, but we now have two problems to address.
Problem 1: Unnecessary Output
The first problem is that the class we have is very noisy. I have many tests to run, and I don't need to see every single request that comes through to the temporary server I'm hosting. We can see that the SimpleHTTPRequestHandler doesn't define its own log_message method, so we'll take a look at BaseHTTPRequestHandler, which is what is being subclassed. And sure enough, BaseHTTPRequestHandler defines log_message().
Now all we have to do is subclass the handler and override the log_message() method. This solution was given by Lauritz V. Thaulow on this Stack Overflow question. The resulting class looks like the following:
    from SimpleHTTPServer import SimpleHTTPRequestHandler

    class QuietSimpleHTTPRequestHandler(SimpleHTTPRequestHandler):
        # A quiet version of the SimpleHTTPRequestHandler
        def log_message(self, *args):
            pass
Now all we have to do is update our ThreadedHTTPServer to use our quiet handler instead of the SimpleHTTPRequestHandler:
    class ThreadedHTTPServer(object):
        handler = QuietSimpleHTTPRequestHandler
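To check that the quiet handler really does silence the per-request log lines, a small Python 3 sketch like the one below can help. It assumes the Python 3 module names (http.server, socketserver), an OS-assigned port (port 0), and a hypothetical helper called quiet_request_status; none of these appear in the original code.

```python
import http.client
import http.server
import socketserver
import threading


class QuietSimpleHTTPRequestHandler(http.server.SimpleHTTPRequestHandler):
    """A quiet version of SimpleHTTPRequestHandler: nothing is logged per request."""

    def log_message(self, format, *args):
        pass


def quiet_request_status():
    """Hypothetical helper: serve one request with the quiet handler, return the status."""
    # Port 0 is an assumption of this sketch: the OS assigns any free port.
    server = socketserver.TCPServer(("127.0.0.1", 0), QuietSimpleHTTPRequestHandler)
    thread = threading.Thread(target=server.serve_forever)
    thread.daemon = True
    thread.start()
    try:
        connection = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
        connection.request("GET", "/")
        response = connection.getresponse()
        response.read()  # drain the body so the server finishes the request
        connection.close()
        return response.status
    finally:
        server.shutdown()
        server.server_close()
```

Since the request and error log lines all go through log_message(), overriding that one method should be enough to keep stderr clean during a test run.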
Problem 2: 'Socket in use' errors
There are many tests to run, and they need to run quickly, so an important part of the ThreadedHTTPServer is that it cleanly closes the socket it opens. This ensures that the next test doesn't get socket errors while trying to start its own instance of an HTTP server. I had assumed that shutting down the server and closing it properly would be enough, but something slows the socket release process (I suspected garbage collection, though this needs to be confirmed; closed connections lingering in the TCP TIME_WAIT state are another likely cause). Unfortunately this slowdown is enough to cause socket lock problems. There is, however, a pretty simple workaround:
SocketServer.TCPServer.allow_reuse_address = True
Telling the SocketServer that it's okay to reuse a port is enough for us to re-bind a new HTTP server to a port before it's been fully cleaned up. This results in smooth sailing all the way through our unit tests.
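One variation on this, sketched below in Python 3, is to set allow_reuse_address on a small subclass instead of on TCPServer itself, so the flag only affects the test servers. The ReusableTCPServer class and the restart_on_same_port helper are hypothetical names for this illustration, and the first server binds to port 0 only so the sketch can find a free port to reuse. The flag is read when the socket is bound, which is why it is set as a class attribute before any server is created.

```python
import http.client
import http.server
import socketserver
import threading


class ReusableTCPServer(socketserver.TCPServer):
    # Read at bind time, so it has to be in place before the server is created.
    allow_reuse_address = True


def restart_on_same_port(times=3):
    """Hypothetical helper: serve one request per server, reusing a single port."""
    ports = []
    port = 0  # first run only: let the OS choose a free port (assumption)
    for _ in range(times):
        server = ReusableTCPServer(
            ("127.0.0.1", port), http.server.SimpleHTTPRequestHandler
        )
        port = server.server_address[1]  # later servers bind to this same port
        thread = threading.Thread(target=server.serve_forever)
        thread.daemon = True
        thread.start()
        try:
            connection = http.client.HTTPConnection("127.0.0.1", port)
            connection.request("GET", "/")
            connection.getresponse().read()
            connection.close()
        finally:
            server.shutdown()
            server.server_close()
        ports.append(port)
    return ports
```

Each pass makes a real request before shutting down, so the following server has to bind while the previous connection may still be lingering, which is the situation the unit tests run into.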
Implementing 'with' (Updated on Oct 22, 2016)
I've recently come to learn a little more about Python's 'with' statement, which can be used in situations where a resource requires that some cleanup be done whether or not there is an error. The equivalent can be achieved using try/finally statements. For instance, my unit tests looked similar to the following:
    def example_test(self):
        server = ThreadedHTTPServer("localhost", 8000)
        server.start()
        try:
            page_content = urllib2.urlopen("http://localhost:8000/").read()
        finally:
            server.stop()
        print page_content
As described in this excellent post on effbot, we can make this object work with the with statement by defining __enter__ and __exit__ methods. In this case, the only setup and cleanup needed are the calls to the start and stop methods. I've defined these methods as follows:
    def __enter__(self):
        self.start()
        return self

    def __exit__(self, type, value, traceback):
        self.stop()
That was super simple right? You can see the fully updated version of the ThreadedHTTPServer below in the Final Code section. With this updated version of the class, the unit test case would look like the following.
    def example_test(self):
        with ThreadedHTTPServer("localhost", 8000) as server:
            page_content = urllib2.urlopen("http://localhost:8000/").read()
        print page_content
As you can see, that's a huge improvement over the original version of the example test case.
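The real payoff of the with statement is that stop() still runs when the test body blows up. The Python 3 sketch below tries to show that with a deliberately failing block. The stopped attribute and the cleanup_runs_on_error helper are hypothetical additions used only to observe the cleanup, and port 0 (an OS-assigned port) is again an assumption.

```python
import http.server
import socketserver
import threading


class ThreadedHTTPServer(object):
    """Python 3 sketch of the context-manager version of the class."""

    handler = http.server.SimpleHTTPRequestHandler

    def __init__(self, host, port):
        self.server = socketserver.TCPServer((host, port), self.handler)
        self.server_thread = threading.Thread(target=self.server.serve_forever)
        self.server_thread.daemon = True
        self.stopped = False  # hypothetical flag, only here to observe cleanup

    def start(self):
        self.server_thread.start()

    def stop(self):
        self.server.shutdown()
        self.server.server_close()
        self.stopped = True

    def __enter__(self):
        self.start()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Runs whether or not the with block raised; returning None
        # lets any exception continue to propagate.
        self.stop()


def cleanup_runs_on_error():
    """Hypothetical helper: raise inside the with block, report whether stop() ran."""
    server = ThreadedHTTPServer("127.0.0.1", 0)  # port 0: OS picks a free port
    try:
        with server:
            raise RuntimeError("simulated test failure")
    except RuntimeError:
        pass
    return server.stopped
```

Because __exit__ does not return a true value, the exception still reaches the test runner after the server has been shut down, which matches the behaviour of the try/finally version.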
I've created the following gist for this blog post. It contains the most up to date ThreadedHTTPServer class as well as a short example function.