
Posting from XML Feeds to Twitter. Or, This post’s title is entirely too long to fit in a single Tweet; whatever can be done about this grave situation?

I recently decided that I’d like to link to all of my blog posts from Twitter; obviously, this is a task for a script, and while I found a lot of blog posts describing one or two pieces of the process, when I put together a list of features I wanted:

  • Periodically poll an XML Feed for blog posts
  • In chronological order, link to new posts from Twitter
  • Use a URL shortening/tracking service
  • Automatically truncate long titles*
  • Limit number of posts linked to per run
  • No external dependencies (i.e., nonstandard PHP libraries, MySQL, etc.)

and started Googling for it, all I came across were postings on various script-for-hire sites offering to buy or sell such a script, mostly in the $5-15 range. While that seems like a pretty small price to pay for the convenience, I don’t think I’d have been able to hold my head up high knowing that I’d paid someone else to write such a simple script. So, I started piecing together all of the parts, and came up with this PHP class, which I tossed up on Gist for public consumption:

TwitterFeedSync.php (raw)

As with many of my one-off scripts, this one is sorely lacking in documentation, and I’m not going to break it down line-by-line here as I often do, but I’ve intentionally broken down the steps into discrete chunks that should be pretty easy to digest. In general, the logical flow goes something like this:

  1. Load the timestamp of the last linked post from a file
  2. Load up the XML feed
  3. Starting at the end of the feed, loop through all entries
  4. Compare the date of each entry to the saved timestamp
  5. Shorten link to post using tr.im
  6. Calculate how much space we have left in the tweet for the title, and truncate if needed
  7. Post the title + link to Twitter
  8. Check if we’ve posted our max for this run
  9. Update the saved timestamp
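
As a rough illustration of steps 3, 4, 6, and 8, here is a minimal sketch in Python rather than the script's PHP. It is not the code from the Gist: the function names, the 140-character limit constant, and the assumption that the feed lists its newest entry first are all choices made for this example.

```python
TWEET_LIMIT = 140  # Twitter's character limit when this post was written
ELLIPSIS = "..."


def select_new_posts(entries, last_seen, max_posts):
    """Pick up to max_posts entries newer than last_seen, oldest first.

    entries is a list of (published, title, link) tuples, assumed to be
    ordered newest first, as most blog feeds are.
    """
    selected = []
    # Walk from the end of the feed so posts are linked chronologically.
    for published, title, link in reversed(entries):
        if published <= last_seen:
            continue  # already linked on an earlier run
        selected.append((published, title, link))
        if len(selected) >= max_posts:
            break  # per-run cap reached
    return selected


def build_tweet(title, short_url, limit=TWEET_LIMIT):
    """Return "title url", shortening the title with "..." if it will not fit."""
    # Reserve room for the link plus the space separating it from the title.
    room = limit - len(short_url) - 1
    if room < len(ELLIPSIS):
        raise ValueError("short URL leaves no room for a title")
    if len(title) > room:
        # Cut so that the shortened title plus the ellipsis fits in the room left.
        title = title[: room - len(ELLIPSIS)].rstrip() + ELLIPSIS
    return f"{title} {short_url}"
```

After a run, the saved timestamp would be set to the published value of the last entry returned by select_new_posts, so the next run picks up where this one stopped.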

The code linked above includes a sample invocation at the bottom, and can be called every other hour with a cron job entry similar to:

45 */2 * * * /export/scripts/TwitterFeedSync.php

I think it’s likely that the file I/O code, used to store and retrieve the timestamp of the most recent post, could be much more robust; in particular, you might have to ‘touch’ the timestamp file into existence in order to get the script to run for the first time. Also, the default $date_format parameter is structured to work well with the feed for this blog, which of course is generated by Wordpress; other feeds may require some tweaking. But hopefully, someone out there besides myself can make use of this little snippet of code; if so, I’d encourage you to consider yourself $5-15 richer, and think about how you in turn can pass that value on to others in your world!
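
One way the timestamp handling could be made more forgiving is sketched below, again in Python rather than PHP. The function names and the choice to fall back to 0 are assumptions for this example, not what the Gist does. A missing or unreadable file is treated as "nothing linked yet", so no ‘touch’ is needed before the first run.

```python
from pathlib import Path


def load_last_timestamp(path: Path) -> int:
    """Return the saved Unix timestamp, or 0 if the file is missing or corrupt."""
    try:
        return int(path.read_text().strip())
    except (FileNotFoundError, ValueError):
        # First run (no file yet) or unparseable contents: treat every post as new.
        return 0


def save_last_timestamp(path: Path, timestamp: int) -> None:
    """Write the timestamp to a temporary file, then rename it into place."""
    tmp = path.with_name(path.name + ".tmp")
    tmp.write_text(str(int(timestamp)))
    # Renaming over the old file means a crash mid-write cannot leave a
    # half-written timestamp behind.
    tmp.replace(path)
```

With a fallback of 0, the first run sees every entry in the feed as new, so the per-run posting limit is what keeps it from flooding Twitter.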

* as it turns out, I haven’t needed this yet; maybe I’ll pump up the length on this post’s title…  :)

Categories: Random.


Blogger -> Wordpress Migration

I got an email this morning (several of them, actually) letting me know that Blogger would be discontinuing FTP/SFTP publishing of blogs. Since I don’t like some of the minor restrictions of using Blogger’s hosted publishing, I’ve always appreciated this feature. But alas, all good things come to an end. Faced with the decision of moving from Blogger to some other blogging platform, the choice was an easy one; in my opinion, Wordpress is hands-down the best small-blog publishing platform available. I moved to Blogger several years back for performance reasons; I’d started hosting one of my sites on a severely underpowered server (486-class processor and 128MB RAM), and running dynamic PHP pages would have taxed it to its limits; MySQL completely pushed it over the edge. Currently, this site is running on a Slicehost server with plenty of RAM and the flexibility to easily add more if needed, so dynamic, database-driven pages are back in vogue. Migrating was relatively painless; on the one hand, the Blogger import tool introduced a few errors, some of which still need to be dealt with, but on the other hand, I used this as an opportunity to make some minor styling tweaks that I’d been putting off.

So enjoy — you can now log in via Facebook Connect, by the way — and let me know if you come across any egregious problems with the new site!

Categories: Random.

My server’s hard drive crashed; I have a new server

This weekend I spent quite a bit of time (that should have been dedicated to doing something fun, or at least to productive work) dealing with a significant hardware scare. The OS on my web server started freezing up randomly, not allowing me to make any changes to the system, or even to shut it down cleanly. A little bit of investigation showed that the root filesystem was setting itself to read-only, which in turn led me to “unspecified errors” in the SMART diagnostics. My excellent tech support contacts at Core Networks were quickly able to determine that the drive was indeed failing, and after a bunch of prep work and backups, we got the data moved to a new drive. Since mid-2009 I’ve had a colocated server with Core, and I honestly cannot say enough good things about them; they run a very cost-effective colo service, and their tech support is absolutely top-notch. If you need a physical colo server for any reason, I highly recommend them. However, physical servers do have one flaw: they’re physical, and they run on real hardware. Of course, virtual machines also run on real hardware, but the abstraction between the two is extensive enough that failing hardware can be easily migrated away from without the virtual system being aware of the change.

Wasting a day on diagnostics, tech support conversations, backups, and restorations made me question whether a physical server was really what I needed. The conclusion I came to was that I did not, and that a virtual server was the best choice. Although Core recently began offering virtual servers, and I was reluctant to take my business elsewhere, the fact is that while they’ve been doing colo for years, VMs are a very new market for them, and I’ve always had incredible success with Slicehost’s virtual servers. So, I signed up for a “512 slice” at Slicehost (which is a little cramped; I may upgrade in the near future) and have migrated all of my sites and data off of the physical server. While I’ll be sorry to say goodbye to Core, the fact is that I didn’t need the extras that having a colo server provided (hard drive space, in particular, tends to be much cheaper in physical servers) and the extra cost in terms of management burden simply wasn’t worthwhile.

Categories: Random.


A Simple Django WebAuth Decorator

Like many universities, my employer uses Stanford’s WebAuth Single Sign-On package as one major piece in its computing account system. WebAuth is an MIT-licensed infrastructure that allows decentralized web applications to securely authenticate users without themselves ever handling user credentials. Websites begin by sending unauthenticated users to a trusted WebAuth server which validates the user and provides them with a ticket which is passed back to the application. The application web server then communicates directly with the WebAuth server to validate the ticket it was given. If the ticket is valid, the application is provided with information on the user, and it can proceed without further interaction with the server. As long as the user’s WebAuth session remains active, any other application the user visits can authenticate the user behind the scenes.

The benefits to users of a consistent security interface are significant, so I’ve recently been pushing myself to make use of WebAuth where possible. Unfortunately, there’s not a lot of preexisting code for integrating WebAuth into mainstream web frameworks, so I’ve had to write my own, which — fortunately for me — hasn’t proven to be all that difficult. Today I’m going to share a snippet of code that I wrote to integrate WebAuth into a Django app:

webauth.py — Pretty Print HTML

webauth.py — Raw Code

@webauth_required is a Python decorator that functions similarly to the @login_required decorator that is provided with Django. Importing this decorator and prepending a view function with @webauth_required is all that’s needed to force that view to authenticate the user. Once they’re authenticated, their username is stored in a Django session for authorization purposes (as ‘netid’ in this code, since that’s the parlance familiar to users and developers on my campus). Obviously the WebAuth endpoint (AUTH_URL) is also specific to my situation, and most WebAuth providers will require registration of client applications to enhance security.
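
For readers who just want the general shape, a decorator along these lines might look something like the sketch below. This is not the linked webauth.py. The AUTH_URL value is a placeholder, the ‘ticket’ and ‘return_url’ parameter names are guesses, validate_ticket is a hypothetical stub standing in for the server-to-server check, and Redirect stands in for Django’s HttpResponseRedirect so that the example runs without Django installed.

```python
from functools import wraps
from urllib.parse import urlencode

AUTH_URL = "https://webauth.example.edu/login"  # placeholder, not a real endpoint


class Redirect:
    """Stand-in for django.http.HttpResponseRedirect."""

    def __init__(self, url):
        self.url = url


def validate_ticket(ticket):
    """Hypothetical helper: ask the WebAuth server which user owns this ticket.

    A real version would make a server-to-server request; this stub accepts
    one hard-coded ticket so the sketch can run offline.
    """
    return "jdoe" if ticket == "valid-ticket" else None


def webauth_required(view):
    """Require a 'netid' in the session before running the wrapped view."""

    @wraps(view)
    def wrapper(request, *args, **kwargs):
        # Already authenticated in this session: run the view directly.
        if request.session.get("netid"):
            return view(request, *args, **kwargs)
        # Coming back from the WebAuth server with a ticket: validate it.
        ticket = request.GET.get("ticket")
        netid = validate_ticket(ticket) if ticket else None
        if netid:
            request.session["netid"] = netid
            return view(request, *args, **kwargs)
        # No session and no valid ticket: send the user off to authenticate,
        # asking the server to return them to the current page.
        query = urlencode({"return_url": request.build_absolute_uri()})
        return Redirect(f"{AUTH_URL}?{query}")

    return wrapper
```

In a real Django app the wrapped function would be an ordinary view, and the Redirect class would be swapped for HttpResponseRedirect.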

The only thing that’s left is to provide a logout mechanism. My logout view (one of the few views that doesn’t need the @webauth_required decorator in my app!) simply destroys the session data and provides the user with a link to log out of WebAuth entirely; it’s possible for the user to log out of the app but remain logged in to WebAuth, which effectively leaves them logged into the app (since they’ll be re-authenticated behind the scenes if they return), but that’s how the powers that be have asked client apps to behave, so that’s how it is.

Categories: Random.
