Tuesday, June 11, 2013

A climate blog reader


Google Reader is kaput at the end of June. I had been lazily eyeing alternatives, but I had also been looking into RSS systems, and it seemed that I could fairly easily write my own. It's a bit like re-inventing the wheel, but there are advantages. I used Google Reader a lot, though its limitations were painful. Improved searching is one aspiration. And if you read the feeds yourself, you can accumulate as much back data as you like.

Anyway, I found along the way that I could fairly easily compile an updated, searchable list of comments on the main blogs that I was reading. My first attempt is below the jump. So far, I just have a few days' data on the main WordPress blogs. There are a lot of idiosyncrasies, so I'll extend it gradually. When it has stabilized, I'll promote it to a page.




Update - I see the time ordering does not work in Chrome, though it does in Firefox. It's late here, so that will have to wait until morning. Although the time order is wrong, the rest seems OK. (Now fixed.)

Currently it shows all the comments that it is aware of. Updating is hourly. All times are GMT. You can click the buttons at the top to reorder. Main posts are indicated by a background color.

You can select subsets. You'll see four selection boxes top right. If they are empty, everything passes. Otherwise, for three of them (commenters, blogs and threads), you can enter names, and only those will show. Time is different: you can select one or two times. With one, only posts more recent than that time will show. With two, only posts between those times will show.
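
The selection rule can be sketched roughly as follows. This is only an illustration, not the page's actual code: the function name `passesSelection`, the field names (`commenter`, `blog`, `thread`, `time`) and the sample commenters are all made up for the example, and times are assumed to be millisecond timestamps in GMT.

```javascript
// Hypothetical sketch of the selection rule; all names here are assumptions.
// `selection` holds four lists; an empty list means "everything passes".
function passesSelection(item, selection) {
  // Name filters: if any names are listed, the item's value must be one of them.
  for (const field of ["commenter", "blog", "thread"]) {
    const wanted = selection[field] || [];
    if (wanted.length > 0 && !wanted.includes(item[field])) return false;
  }
  // Time filter: one time means "at or after it"; two mean "between them".
  const times = [...(selection.time || [])].sort((a, b) => a - b);
  if (times.length >= 1 && item.time < times[0]) return false;
  if (times.length >= 2 && item.time > times[1]) return false;
  return true;
}

// Made-up sample data (month index 5 is June).
const sampleItems = [
  { commenter: "Alice", blog: "WUWT", thread: "Sea ice", time: Date.UTC(2013, 5, 9, 12) },
  { commenter: "Bob", blog: "Lucia", thread: "Trends", time: Date.UTC(2013, 5, 10, 12) },
  { commenter: "Alice", blog: "Lucia", thread: "Trends", time: Date.UTC(2013, 5, 11, 12) },
];
const sampleSelection = {
  commenter: ["Alice"],
  blog: [],
  thread: [],
  time: [Date.UTC(2013, 5, 10)],
};
const shown = sampleItems.filter((item) => passesSelection(item, sampleSelection));
console.log(shown.map((item) => item.blog + " / " + item.thread));
```

Sorting the two times first means it does not matter in which order they were entered.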

On the right you'll see a selection panel. The steps are:
  1. Click on one of the selection boxes on the left. It will turn pink to show it is active.
  2. Make a name appear in the Result section on the right (see below for how).
  3. Click Enter. Your selection will be at the bottom of the list.
  4. You can click delete to remove the top item in the active list.
There are two ways to enter a result. The simplest is to find a line displayed that has the aspect you want. Then click at that level on the red bar that is to the right of the table. You may have to click twice. Your selection should appear. If it is what you want, press enter.

The second way is to enter the first few letters in the text box, top right, and then click. If it gets something else, add more letters. Again, when it is right, press enter.
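
The text-box lookup behaves like a first-match prefix search. A minimal sketch of that idea, assuming a case-insensitive match (the post does not say whether case matters) and a hypothetical helper called `firstMatch`:

```javascript
// Hypothetical sketch: return the first known name that starts with the typed letters.
function firstMatch(names, typed) {
  const prefix = typed.trim().toLowerCase();
  if (prefix === "") return null; // nothing typed, nothing to suggest
  const hit = names.find((name) => name.toLowerCase().startsWith(prefix));
  return hit === undefined ? null : hit;
}

const blogNames = ["SkS", "WUWT", "WUWT_"];
const guess = firstMatch(blogNames, "w");       // the first match in list order wins
const refined = firstMatch(blogNames, "wuwt_"); // more letters narrow it down
console.log(guess, refined);
```

Because the first match wins, a short prefix can land on the wrong name, which is why adding letters helps.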

To make your selection show, just click one of the four column buttons, depending on the ordering you want.

If you want to select by blog, the posts and comments count separately (separate RSS files). Posts are indicated thus (WUWT_).

Issues

There has been more fuss than I expected. I managed to get my IP banned at Lucia's (promptly restored). The RSS files aren't all that standard. My outstanding problem is with SkS comments, which don't have dates. RSS files have a mix of old and new, and I use dates to distinguish. Even the SkS link numbering is non-unique. So I'll have to do a sort of dendrochronology. You'll see that there are chunks of SkS comments, with duplicates.
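
One possible way to do that kind of tree-ring matching on dateless items is to look for the longest run where the end of what is already stored coincides with the start of the new fetch, and then append only the remainder. The sketch below is an assumption about how it might be done, not the method used here: `mergeByOverlap` and `commentKey` are hypothetical names, comments are identified by author plus text because the link numbers can't be trusted, and both lists are assumed to be in oldest-first order.

```javascript
// Hypothetical sketch of overlap matching for items that carry no dates.
// Both lists are assumed to be in oldest-first order.
function commentKey(comment) {
  // Link numbers are not unique, so identify a comment by author and text.
  return comment.author + "\u0000" + comment.text;
}

function mergeByOverlap(stored, fetched) {
  const maxOverlap = Math.min(stored.length, fetched.length);
  // Try the longest possible overlap first, then shorter ones.
  for (let k = maxOverlap; k > 0; k--) {
    let matches = true;
    for (let i = 0; i < k; i++) {
      if (commentKey(stored[stored.length - k + i]) !== commentKey(fetched[i])) {
        matches = false;
        break;
      }
    }
    // The last k stored items equal the first k fetched ones: append the rest.
    if (matches) return stored.concat(fetched.slice(k));
  }
  // No overlap found: treat the whole fetch as new (there may be a gap).
  return stored.concat(fetched);
}

// Made-up sample data.
const mk = (author, text) => ({ author, text });
const storedComments = [mk("A", "first"), mk("B", "second"), mk("C", "third")];
const newFetch = [mk("B", "second"), mk("C", "third"), mk("D", "fourth")];
const merged = mergeByOverlap(storedComments, newFetch);
console.log(merged.map((comment) => comment.author).join(","));
```

A scheme like this can still be fooled when the same commenter posts identical text twice, so it would reduce the duplicates rather than guarantee none.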

I'm only passing metadata (links etc.), but even so, it's going to get big. After a couple of days the data file is 180 KB. I'm not sure yet how to handle that. At worst, it will be restricted to the last fortnight or so. The limitation is download time, since this is all done in JavaScript. I may end up downloading in chunks on request, as Reader did.
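
Both options are easy to sketch. The code below is an illustration under assumptions, not the site's code: `recentItems` and `chunkByDay` are hypothetical names, each item is assumed to carry a millisecond `time` field, and a per-day chunk is just one plausible unit for downloading on request.

```javascript
// Hypothetical sketch: keep only recent items and group them by GMT day,
// so a page could request one day's chunk at a time.
const DAY_MS = 24 * 60 * 60 * 1000;

function recentItems(items, now, days) {
  const cutoff = now - days * DAY_MS;
  return items.filter((item) => item.time >= cutoff);
}

function chunkByDay(items) {
  const chunks = new Map();
  for (const item of items) {
    // toISOString is always in UTC, so the first 10 characters are "YYYY-MM-DD".
    const day = new Date(item.time).toISOString().slice(0, 10);
    if (!chunks.has(day)) chunks.set(day, []);
    chunks.get(day).push(item);
  }
  return chunks;
}

// Made-up sample data (month index 4 is May, 5 is June).
const nowMs = Date.UTC(2013, 5, 11, 12);
const feedItems = [
  { link: "a", time: Date.UTC(2013, 4, 1, 8) },   // more than a fortnight old
  { link: "b", time: Date.UTC(2013, 5, 10, 23) },
  { link: "c", time: Date.UTC(2013, 5, 11, 1) },
  { link: "d", time: Date.UTC(2013, 5, 11, 9) },
];
const kept = recentItems(feedItems, nowMs, 14);
const dayChunks = chunkByDay(kept);
console.log([...dayChunks.keys()]);
```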

The main improvement planned is to allow (local) storage of the choices, which I've called environments. You could have several and switch between them.
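
As a rough idea of what that could look like, here is a minimal sketch that keeps named environments as one JSON string. Everything in it is an assumption: the key name `readerEnvironments`, the function names, and the shape of a saved selection. The `storage` argument is anything with `getItem` and `setItem`; in a browser that would be `window.localStorage`, and a small in-memory stand-in is included so the sketch runs elsewhere too.

```javascript
// Hypothetical sketch of named "environments" (saved selections).
// `storage` is anything with getItem/setItem, such as the browser's localStorage.
const ENV_KEY = "readerEnvironments"; // assumed key name

function loadEnvironments(storage) {
  const raw = storage.getItem(ENV_KEY); // null when nothing has been saved yet
  return raw === null ? {} : JSON.parse(raw);
}

function saveEnvironment(storage, name, selection) {
  const all = loadEnvironments(storage);
  all[name] = selection;
  storage.setItem(ENV_KEY, JSON.stringify(all));
}

// A tiny in-memory stand-in so the sketch also runs outside a browser.
function memoryStorage() {
  const data = new Map();
  return {
    getItem: (key) => (data.has(key) ? data.get(key) : null),
    setItem: (key, value) => data.set(key, String(value)),
  };
}

const store = memoryStorage();
saveEnvironment(store, "sea ice", { commenter: [], blog: ["WUWT"], thread: [], time: [] });
saveEnvironment(store, "everything", { commenter: [], blog: [], thread: [], time: [] });
const envs = loadEnvironments(store);
console.log(Object.keys(envs));
```

Switching environments would then just mean loading one of the saved selections back into the four boxes.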

And of course, to extend the range of blogs covered. Blogger is messier, but I'll tackle it next.

Of course, it all depends on my computer running every hour to catch and process the RSS files. On busier sites, the info is only there for a couple of hours. If my computer fails for any reason, there will be a gap. And it may all just get too hard and come to an end. No guarantees or promises. It's an experiment.
