The most recent post on this blog is from…. last year—ACK! That’s lame. Sorry about that.
In my defense, this year has seen some upheaval in my life, as I decided to leave my cushy Google digs for more exciting pastures. Nothing is settled yet by any means, but they’re getting there, so I’m starting to think again about what’s next for the realchat project. I know I want to keep doing it, so that’s a start.
wsgi, bottlepy, and putting it “all” (read: “some of it”) together
In this post we’ll take a look at the most basic of WSGI apps (helloworld, natch) with the bottlepy micro-framework and how a wsgi app might work with what we’ve already built.
I know, I know. “Micro?” Really? Somewhat groan inducing, but given how many seemingly one-stop shopping frameworks out there that exist I think its important. Bottle doesn’t have an ORM. There’s no middleware for keeping sessions around. There’s a template language but it’s purely optional and we’ll probably steer clear of it. From the bottle page:
Bottle is a fast, simple and lightweight WSGI micro web-framework for Python. It is distributed as a single file module and has no dependencies other than the Python Standard Library.
We’re gonna use it for request context (GET and POST params) and routing. That’s all we need it for, so that’s all we’ll use it for.
Ok, enough soapbox, back to the codez. We’re adding a new directory called chatserver. We’ll add two module files, chatroom.py, which will encapsulate our bottle endpoints (the wsgi app), which are just decorated python functions, and service.py, which takes care of firing up a wsgiref http server in the context of startup components we already created.
Let’s take a look at the main bits of serivice.py as well as our imports:
You’ll notice we imported a few modules we’ve built before, conf and common.start. We use the conf module and arguments to create a context by which the process will run, like logfile setup and forking, then we use common.start.main() to do the running. The params for this call are:
serve_wsgiref: The actual method to call once we’ve setup logging and optionally forked. We’ll cover this a bit more later.
args: the argparse’d argument manager.
srvs: the service configuration list.
logname = ‘chatserver’: just a simple tag for our logs.
monkey_log = True: This flags the common.start.main() function to monkey patch the python logging module so that calls such as logging.info() go to our newly created log. I touched on this briefly here, and you’ll see us use this in the chatroom module later.
This is just really the glue into our startup code we’ve done before. Now let’s take a look at serve_wsgiref() method to see our wsgi app in action.
48defserve_wsgiref(log,logdir,args,here,**kwargs):49'''The realmain serve method, called by common.start.main()50 with nice newly minted log and the here dictionary, which is51 locked by common.serverlock.lock_server().5253 The here has the host and port name, which we use to construct our54 wsgiref httpd to expose our bottle wsgi app. 55 '''56host=here['host']57httpport=here['httpport' ]58log.info('chat service starting. host: %s port: %s'%(59host,httpport))60app=chatroom.APP61httpd=wsgiref.simple_server.make_server(host,httpport,app)62try:63httpd.serve_forever()64exceptKeyboardInterrupt,e:
65pass66log.info('chat service done')
Here, of note:
common.start.main() actually takes a callback, the “realmain” function, which in this case is serve_wsgiref. These calls needs to have a signature as follows: the log, logdir, argparse args, and the here dictionary, which is the socket-locked element we pull out of conf.
The handle to the bottle app is needed to hand off to the wsgiref simple_server. Yes, bottle has it own facilities to start up a wsgiref httpd but I thought it would be nice to show it in action here, especially since in another iteration of this code we’re going to be firing up different http request processing mechanism (like gevent). And yes, I know that bottle also supports starting with gevent. This, I think, counts against the loosely coupledness of the micro-framework, which is a bummer. But thankfully we don’t have to use it.
We call serve_forever(), which, as you may imagine, servers for, um, ever.
Ok, so maybe its not the most boring piece of code ever. There’s some interesting stuff going on here. It’s the most basic of route functions for bottle, we:
Decorate the method with bottle.route().
Call logging.info() with our message.
Set the content_type of the response to be plain text, and return it.
I’m not going to go to much into how bottle works, you can run through the excellent tutorial on the bottlepy site. What you should take away from this is how easy and simple http request routing is with it.
There is some module level setup we do as well, namely reaching into the module level default bottle app and placing it here in this module for easier access by the service.py components, which it uses to start up the wsgiref server. We also call the module function debug(True), which will flag bottle to do stuff like write tracebacks encountered during request handling to stderr (and thus to the log we setup in common.start).
Running The Code (finally)
Finally time for the fun part! Remember we have 2 processes configured and all we have to do is run service.py twice for them to both fire up (because of how we do socket locking). Here’s the commandline:
Bah, looks like the wsgiref defaults its access.log to go to stderr. We could jump through some hoops to fix this but since we’re really just using it as an example we’ll just leave it.
We now have a good starting point, with some hopefully more interesting bits for starting up a service rather then the boring old I can start a wsgi app tutorial. It should be too hard to extract this approach to something more service-like, where we start up N wsgi processes as daemons with their own logs in a relatively straightforward and predictable manner. This will serve as the foundation as we build out our horizontally scaling chat app.
A Note On Importing
You may be wondering how we can have import statements in our code like this:
You may be asking yourself something like “damn those highlighted code snippets on GRD are dope. How do I get such sweet codez?” Well fear not gentle reader!! Not only have I had too much coffee this morning, but I’ve got the hookup for you.
Check out pygments. As you might imagine, it lexes out various languages and outputs nicely to CSS’d html, images, etc. And its written in python, so you can actually utilize it as a programming API.
In fact I was using one set of command lines so much I decided to write up a small script that’ll slice a file and output html with the correct line numbers and offsets. Here it is all gisted up for you.
As the realchat project gets more pieces, testing is starting to become more important (and to be honest, “more pieces” here means, like, 3… which is to say you might as well start doing testing with 1 piece… which is to say which is to say start of thinking about testing at the beginning!)
Case in point: as I’m working on the next bits, I encountered a bug in common/start.py. I was closing of the stdout/stderr file descriptors in daemon mode without also monkey patching them with a wrapper that behaves like the stream but actually writes to the log files we’ve set up. This way anything that does a sys.stdout.write() call, like say a small implementation of a wsgiref simple_server, will go to the logs and not barf out. Simple enough to get back on track.
You may notice, too, that I created a tests directory. As I was debugging this fix I created this simple test script to exhibit the bad behavior, based off the small example snippet on the python docs about wsgiref:
importsysfromwsgiref.simple_serverimportmake_serverfromwsgiref.simple_serverimportmake_server,demo_apphttpd=make_server('alkime',8111,demo_app)print"Serving HTTP on port 8000..."fromcommonimportstarttry:start.daemonize('foo.pid')exceptstart.IAmParent,e:sys.exit(0)# Respond to requests until process is killedhttpd.serve_forever()
I did this mostly because the error was a bit of a PITA because it was the result of closing down file descriptors inappropriately, one of which was the stderr fd, which makes it a weeee bit tricky to know what error is going down.
Anyways, the real lesson for me, after thinking about it a bit, was that this script was already there in my code… I just didn’t realize it because I didn’t think about it as “a test script.” The piece of code was actually in the __main__ piece of common/start.py where I have a small integration test for those components. The bug was there so I should have gone there to look but instead I created a wholly new, wholly redundant piece of code when all I needed was to add a line or two to the current test, which I didn’t remember existed. I wasn’t in the mindset that I even had tests, so why would I go look for them? To mitigate this I created I moved my integration test to its own file and elevated it by putting it in its very own directory. There don’t you feel special.
So yah, the simplest testing lesson is: start out in the mind set that you have tests, and you’ll be more apt to do those tests.
A friend of mine the other day mentioned I should take a look at python fcntl.flock for doing process locking instead of the socket-oriented way I describe here.
The primary reasoning for using sockets is because it gives nice “free” cleanup of the lock if the process should die. Apparently os level file locks do much the same cleanup, so it might be worth checking out. Certainly worth a look.
I’ve been loving my dive into cython. Well loving might be a little strong, considering learning about it has necessitated so many punch-me-in-the-nuts moments that I remember from C hacking days. Yet so far most of those moments have been tempered by moments at the end where something starts to work and I think, “oh that’s pretty cool.”
Here’s one of those moments. So in my “learning cython” project (more on this as it gets further along), I’m wrapping an existing C api. This api defines a type which is basically a big container for callbacks which get fired a various times in a session. In order to best describe such a type in cython, I thought it might be worth while dip my toe into cython-based functional programming.
This small example is a result, with some comments to explain the interesting bits:
foo_lib.h:
1//simple struct with a function pointer2typedefstructFoo_t{3inta;4intb;5int(*func)(inti);
6}Foo_t;
foo.pyx:
1cdefexternfrom"foo_lib.h": 2# mirror the C struct, including the function pointer 3# we can't just pass in the struct body here, the Cython compiler 4# appearently needs to know these symbols in order to, upon instantiation 5# below, translate them info the C stuff defined in foo_lib.h, which makes 6# sense. 7# 8ctypedefstructFoo_t: 9inta10intb11int(*func)(inti)1213# the function we'll be giving our Foo_t14#15cdefintmunge(inti):16print'munging',i17returni**i1819# a python function, in which we define a Cython type, intantiated with the20# munge() Cython function defined above, and call it.21#22defgo(inta):23cdefFoo_tf=Foo_t(a=a,b=2,func=munge)24r=f.func(a)25print'called with %s and returned %s'%(a,r)
After build_ext’ing this guy we have the foo.so module which can be imported. Running the go() method give us this:
>>> import foo; foo.go(2)
munging 2
called with 2 and returned 4
>>>
I’ve never really used spotify much since it’s become all the rage (and, I believe, the single biggest clogger of facebook’s vaguely creepy realtime feed). And I have to admit when they announced their developer platform a little while ago, I rolled my eyes a bit. Seriously, another platform?
But I have to say, after looking at it, especially libspotify which is a C API for interfacing with the music streams the service provides (!!), it seems like it might be, in a word, THESHIT.
Having all kinds of mad-scientisty thoughts… I think I might have found my first cython project…
P.S. Another interesting online music service, centered around music recommendations—i.e., “songs like this song” a la pandora—is Echonest. Including here as a reminder to myself to check it out!
Just commited a small change to the common.start logging setup to allow for a “sort-of” monkey patching of the python logging API. From the commit comment:
common.start.main() now takes a monkey_log flag param which will cause the Logger created by the setup (which routes logs to a file or stdout depending on startup params) to be monkey patched into the logging.root object so that module level calls will us it. So now code can just import logging and call logging.info(), for example, and it will log correctly without needing to have a handle on the Logger instance itself.
I term this a “sort-of” monkey patch because, while its changing something behind the scenes of a standard python api, the object being change is of the same class as the old object, just configured differently. I guess you might call it a monkey-config.
This issue came up because of my starting in on the web endpoints for the app using the bottle python wsgi framework. Unlike a lot of wsgi frameworks out there, bottle takes a module level approach to request routing instead of a object oriented based one (and I would argue this makes it more lightweight, which if you will recall, using lightweight, loosely coupled frameworks is a big part of this project).
Because it operates at module levels, the bottle API makes it tricky to do stuff like “create this object and pass it in to some framework object.” For example, a Logger object.
Luckily python’s logging api is fairly easy to coax into doing what we want:
This code snippet corresponds very closely with the module level code that sets up the root logger in the logging module (in python2.6). The reasoning here should become more clear once I get the commit in for the bottle stuff. Stay tuned.
In discussing this project with a friend of mine, he pointed me at YAML as an option for doing configurations. I haven’t ever really explored it but it seems promising, so I’m putting it here as a reminder to myself to look at it again when the project’s a bit more fully formed (I think our use of the library config is good for now).