and I'm all out of bubble gum…

For the last few years (my JSON feed tells me: since 2008), I have been tagging and annotating articles of interest as they passed before my eyes in Google Reader. This served a two-fold purpose:

  1. I could find them again later, easily, because they were tagged and annotated.
  2. I could share an RSS feed of those annotations with particular interest groups that I worked with (e.g. anything tagged “for robotics” would show up on my advanced computer science class’s portal page, and anything tagged “for academic computing” would show up on my school home page).

This was a great way to share (and manage) resources. Granted, much of what passed before my eyes in Google Reader was trivial and not of lasting value, but this filtering allowed me to hang on to at least a few gems for future reference.

And then Google Reader got the Google+ treatment and sharing items broke. But you could download a JSON dump of all the items that you had ever shared. It wasn’t entirely clear what you could do with this JSON dump, but… there it was. And then: I realized that all of my other information is stashed on my web server (and that I have become increasingly distrustful of relying on cloud services to maintain my data and workflows — e.g. my weekly backup of all my Google Docs… just in case).

Wouldn’t it be handy to import that JSON feed into a new blog on my server? So I wrote a PHP script that converts (at least my) Google Reader JSON dump into an XML file that WordPress can import as a list of posts. With the tags and annotations converted over. In fact, with all of the data in the JSON dump embedded in the XML file (although WordPress doesn’t read all of it).
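
To give a feel for the shape of that conversion, here is a minimal sketch in Python (not the PHP script itself). It assumes the dump is an object with an `items` list, and that each item carries `title`, `published` (seconds since the epoch), `author`, `alternate`, `categories`, `annotations`, and `content` or `summary` fields; those field names, the `/label/` convention for tags, and the WXR version number are all assumptions made for illustration. The function names `reader_json_to_wxr`, `label_of`, and `q` are hypothetical, and whether a given WordPress importer accepts exactly this subset of elements is untested.

```python
import json
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Namespace URIs found in WordPress export (WXR) files; the version is an assumption.
NS = {
    "content": "http://purl.org/rss/1.0/modules/content/",
    "dc": "http://purl.org/dc/elements/1.1/",
    "wp": "http://wordpress.org/export/1.2/",
}
for prefix, uri in NS.items():
    ET.register_namespace(prefix, uri)


def q(prefix, tag):
    """Build a namespace-qualified tag name such as wp:post_date."""
    return "{%s}%s" % (NS[prefix], tag)


def label_of(category):
    """Return NAME from a 'user/.../label/NAME' string, or None if it is not a label."""
    marker = "/label/"
    if isinstance(category, str) and marker in category:
        return category.split(marker, 1)[1]
    return None


def reader_json_to_wxr(json_text):
    """Convert a Reader-style JSON dump (assumed shape) into a WXR-like XML string."""
    data = json.loads(json_text)
    rss = ET.Element("rss", {"version": "2.0"})
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, q("wp", "wxr_version")).text = "1.2"
    for entry in data.get("items", []):
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = entry.get("title", "(untitled)")
        links = entry.get("alternate", [])
        if links:
            ET.SubElement(item, "link").text = links[0].get("href", "")
        ET.SubElement(item, q("dc", "creator")).text = entry.get("author", "")
        # Guess: 'published' is seconds since the epoch and becomes the post date.
        when = datetime.fromtimestamp(entry.get("published", 0), tz=timezone.utc)
        ET.SubElement(item, q("wp", "post_date")).text = when.strftime("%Y-%m-%d %H:%M:%S")
        ET.SubElement(item, q("wp", "status")).text = "private"
        ET.SubElement(item, q("wp", "post_type")).text = "post"
        # Put the sharer's annotations ahead of the original body.
        notes = [a.get("content", "") for a in entry.get("annotations", [])]
        body = (entry.get("content") or entry.get("summary") or {}).get("content", "")
        ET.SubElement(item, q("content", "encoded")).text = "\n\n".join(notes + [body])
        for category in entry.get("categories", []):
            label = label_of(category)
            if label:
                attrs = {"domain": "post_tag", "nicename": label.lower().replace(" ", "-")}
                ET.SubElement(item, "category", attrs).text = label
    return ET.tostring(rss, encoding="unicode")
```

Reading the dump from a file and writing the result next to it is then a matter of passing the file's text to `reader_json_to_wxr` and saving the returned string. Marking every post `private` mirrors the choice described below for full-feed items, and is one of the arbitrary defaults that ought to be configurable.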

This comes with a few caveats:

  • For items that came from blogs with a full feed, the result is a republication of the original post — which feels ethically dubious to me. (I have made my new blog of Google Reader shared items private, so that I have the data but I’m not sharing it with the world).
  • I’ve made guesses as to how to treat some of Google’s data. Reasoned, educated guesses, but guesses nonetheless. For example, I’m not super-clear on which dates in the file correspond to which events: does a publication date refer to when the item was shared or to when the original post was published?
  • I’ve added in some arbitrary (and therefore, ideally, eventually, configurable) WordPress tags to make the import go more smoothly. Where I have done that, I mark it in the script as a TODO item. (And, in truth, I didn’t really test to see if all of these items were necessary.)
  • The original authors of the posts are transferred to the XML file, which means that when the actual import into WordPress is done, you will have the option to either laboriously create a new user for each distinct author or simply revert authorship to the currently logged-in WordPress user. It doesn’t seem like WordPress has a format for exporting or importing users (or, at least, my cursory search didn’t find it). Clearly an ancillary SQL query could be generated that pre-populates the WordPress database with the users that the XML file refers to. But I haven’t bothered to do that.
  • You’ll need your own PHP-compatible webserver to run the script, since I have been quick and dirty and simply imported the JSON file from and exported the XML file to the script’s local directory. And I have no interest in setting up my world-facing webserver to take the traffic hit of processing other people’s multi-megabyte JSON dumps.
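
Before deciding between creating users and reverting authorship, it may help to see how many distinct authors a dump actually contains. The following is a minimal sketch in Python, assuming the same `items` list as the dump and an `author` field on each item (an assumed field name); `count_authors` is a hypothetical helper, not part of the script.

```python
import json
from collections import Counter


def count_authors(json_text):
    """Count shared items per author name in a Reader-style JSON dump."""
    data = json.loads(json_text)
    counts = Counter()
    for entry in data.get("items", []):
        # 'author' is an assumed field name; items without one are grouped together.
        counts[entry.get("author") or "(unknown)"] += 1
    return counts


if __name__ == "__main__":
    # A tiny made-up dump, only to show the call.
    sample = json.dumps({"items": [{"author": "Ann"}, {"author": "Bob"}, {"author": "Ann"}, {}]})
    for name, total in count_authors(sample).most_common():
        print(name, total)
```

A long list of one-off authors is an argument for reverting authorship at import time; a short list might make creating the users by hand bearable.
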
With that said, here is the script, as it stands this morning.

November 27th, 2011

Posted In: How To, Social Bookmarking, Social Media, Useful Tools


One of my responsibilities at Jewish Day School is to write a weekly “tech tips” column for the online faculty news. This is one such tip.

One of the oft-touted features of social media, blogs, and news sites is RSS feeds. The phrase “subscribe to my feed for updates” probably connotes some twenty-something layabout in a coffee shop, but, in fact, RSS feeds are enormously useful to grown-ups (like thee and me) for managing large (vast, huge) amounts of information.

First, RSS stands for Really Simple Syndication. In this case, we’re using syndication in the same sense as a newspaper syndicate (not a crime syndicate — there’s different software for that): suppose Dave Barry writes for the Miami Herald and the San Francisco Chronicle carries the Bizarro cartoon. How is it that we open the LA Times and see both of these in our paper? The newspaper syndicates distribute all of the updates to Dave Barry’s column and Bizarro just prior to the newspaper going to press each night.

RSS, really simple syndication, is a newspaper syndicate for the rest of us: we can subscribe to the RSS feeds on web sites for updates from that web site, and use a feed reader (Google Reader, Newsgator, Bloglines, iGoogle, etc.) to present all of these updated feeds to us in one place. Common Craft has a short (brisk, even) video explaining this:

All of the blogs on our school blog server have RSS feeds. In fact, you can subscribe to updates from a particular category on a blog, or to updated comments on a particular post on a blog, if you want. Major newspapers provide lists of RSS feeds for their articles.
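
For the curious, a feed is just a small XML file listing recent items, and a feed reader does little more than fetch that file and read out the titles and links. Here is a minimal sketch in Python of that reading step, using a made-up feed embedded in the code; the blog name, the addresses, and the function name `list_updates` are all hypothetical.

```python
import xml.etree.ElementTree as ET

# A made-up RSS 2.0 feed with two items, standing in for a real blog's feed.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example School Blog</title>
    <link>http://example.com/</link>
    <item>
      <title>Field trip photos</title>
      <link>http://example.com/field-trip</link>
      <pubDate>Thu, 22 Apr 2010 08:00:00 +0000</pubDate>
    </item>
    <item>
      <title>Science fair schedule</title>
      <link>http://example.com/science-fair</link>
      <pubDate>Wed, 21 Apr 2010 15:30:00 +0000</pubDate>
    </item>
  </channel>
</rss>"""


def list_updates(feed_xml):
    """Return (title, link) pairs for every item in an RSS 2.0 feed."""
    root = ET.fromstring(feed_xml)
    return [
        (item.findtext("title", default=""), item.findtext("link", default=""))
        for item in root.iter("item")
    ]


if __name__ == "__main__":
    for title, link in list_updates(SAMPLE_FEED):
        print(title, "->", link)
```

A real reader repeats this for every subscription on a schedule and remembers which items you have already seen, which is what makes thousands of updates a week manageable.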

As a voracious newshound myself (I used to read two, three and sometimes four newspapers in a morning), I’m finding that — with RSS feeds — I no longer even open a physical paper. Instead, all of my information comes through Google Reader subscriptions to blog and newspaper feeds. About 5,000 updates a week.

For more plain English explanations of web technology, check out Common Craft’s YouTube channel.

April 22nd, 2010

Posted In: "Tech Tips" Column
