MeeTimer and myware and SQLite

I’m interested in the idea of self-interested self-surveillance. Long before we had PMOG (the Passively Multiplayer Online Game, now called The Nethernet) to make a game of it, Seth Goldstein was calling the idea ‘myware’ and building the (short-lived) AttentionTrust site. As Fred Wilson said at the time, “If someone is going to spy on you, it’s probably best if its you.”

With that in mind, I installed MeeTimer over the weekend. It’s a Firefox plugin which…

records where you spend your time online. It does it in a rather useful way, by allowing you to group websites into activities … so you can make sense of where your time is going. Finally, it accumulates time spent on a site over the course of a day…

I’ve been using it for 3 days and it’s giving some interesting food for thought.

MeeTimer

You can even optionally set up ‘tab warnings’ on specific groups (sites you’ve labeled ‘Procrastination’, say) which will pop up with a nice overlay telling you exactly how much time you’ve wasted in this site, and others in the same category (though allows you to click through and ignore the warning just this once or for the current browsing session if you still want to). I’m already finding this feature useful on the handful sites whose feed I’m subscribed to but for some reason still find myself visiting out of habit. (For me, it’s Waxy links and Boing Boing. I love them, but I’d rather be reminded to enjoy them as part of my feed reading routine rather than browsing out of habit. I bet you have your own which make you ask is this really what you want to be doing right now?). A little reminder is really useful for habit-breaking here.

Mostly MeeTimer is just quietly keeping track of a bunch of per-site accumulators, cleverly based on whether Firefox has focus and which is the currently active tab. The results are already interesting. I realised that I was spending a bit less time on Twitter and Flickr, and a bit more time on work webmail, than I thought.

This is all very well, but I want more. Specifically, I wanted to get at the data. Not just the accumulated weekly/daily/monthly (etc) totals and averages, but the number of visits to each site per day. The raw visits. In as much detail as possible. I want CSV exports, or an API, or something. If I’m spending a daily average of 21 minutes on Twitter, how many visits comprise that time? MeeTimer simply doesn’t tell me.

Or does it?

Digging around my Firefox profile, I find a very interesting file at /Library/Application Support/Firefox/Profiles/{profile-id}/meetimer.sqlite. Ooh, I bet I know what that is. So I open up SQLite and start poking.

Sorry. It’s about to get a bit dull from here on in. Unless you get excited about the idea of being able to manipulate this data you’ll probably want to scroll down to the end. Honestly, I won’t mind.

They’ve gone? Right. Let’s get hacking.

$ sqlite
SQLite version 3.6.12
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> .restore meetimer.sqlite
sqlite> .tables
deterrent_stats  groups           log            
deterrentlinks   groups_urls      url_maps       
deterrents       ignored_urls     urls

Excellent. We’ve got tables with sensible names and everything. Let’s see what log looks like.

sqlite> .headers on
sqlite> select * from log limit 3;
url_id|startdate|duration|day|week
4|1238324612508|3|200987|200913
5|1238324617244|44|200987|200913
6|1238324647668|17|200987|200913
sqlite> select * from urls limit 3;
id|url
1|mail.google.com
2|www.google.com
3|www.google.co.uk

Lovely. Easy enough then. The groups and groups_urls tables do what you’d expect too. For now, let’s make url_id more meaningful by doing a join with the url table.

sqlite> select
  url_id, duration, day, week, url
from log
left join urls on log.url_id=urls.id
limit 5;
url_id|duration|day|week|url
4|3|200987|200913|google.co.uk
8|40|200987|200913|meetimer.productivefirefox.com
4|16|200987|200913|google.co.uk
11|10|200987|200913|technorati.com
12|14|200987|200913|google.com/reader/

What if we wanted to show the number of visits, the total duration, and the maximum length of duration for visits to Twitter…

sqlite> select
  count(url_id), sum(duration), max(duration), url
from log
left join urls on log.url_id=urls.id
where url = 'twitter.com';
count(url_id)|sum(duration)|max(duration)|url
34|2712|455|twitter.com

Excellent. I wonder what the top seven URLs when ordered by the number of visits?

sqlite> select
  url_id, count(url_id), sum(duration), max(duration), day, week, url
from log
left join urls on log.url_id=urls.id
group by url
order by count(url_id) desc
limit 7;
url_id|count(url_id)|sum(duration)|max(duration)|day|week|url
9|34|2712|455|200989|200914|twitter.com
10|30|1075|249|200989|200914|search.twitter.com
1|22|2505|928|200989|200914|mail.google.com
4|20|206|57|200989|200914|google.co.uk
17|18|476|114|200989|200914|flickr.com
21|10|2480|2125|200989|200914|bbc.co.uk
39|8|13152|10212|200989|200914|webmail.bbc.co.uk

Twitter, with 34 visits. Sheesh. And for comparison, the top 7 sites by total duration of visit?

sqlite> select
  url_id, count(url_id), sum(duration), max(duration), day, week, url
from log
left join urls on log.url_id=urls.id
group by url
order by sum(duration) desc
limit 5;
url_id|count(url_id)|sum(duration)|max(duration)|day|week|url
39|8|13152|10212|200989|200914|webmail.bbc.co.uk
9|34|2712|455|200989|200914|twitter.com
1|22|2505|928|200989|200914|mail.google.com
21|10|2480|2125|200989|200914|bbc.co.uk
12|6|1355|633|200989|200914|google.com/reader/

13152 seconds (3.6 hours) on my work webmail between Sunday morning and Wednesday aftenoon. And all done in 8 visits. Yuck.

Ok. Let’s start thinking about daily summaries. Grouping by day, and then by URL (since I’m not very good at SQL, and don’t know how to limit it to 5 per day, I’ll just manually snip out all but the top 5 for each day for now)…

sqlite> select
  url_id, count(url_id), sum(duration), max(duration), day, url from log
left join urls on log.url_id=urls.id
group by day, url
order by day, sum(duration) desc;
url_id|count(url_id)|sum(duration)|max(duration)|day|url
1|2|306|228|200987|mail.google.com
9|6|296|217|200987|twitter.com
12|2|225|211|200987|google.com/reader/
28|1|128|128|200987|hunch.com
21|1|66|66|200987|bbc.co.uk
[...]
39|3|10222|10212|200988|webmail.bbc.co.uk
21|3|2155|2125|200988|bbc.co.uk
9|18|1494|235|200988|twitter.com
1|12|1003|185|200988|mail.google.com
10|14|777|249|200988|search.twitter.com
[...]
39|5|2930|2667|200989|webmail.bbc.co.uk
1|8|1196|928|200989|mail.google.com
9|10|922|455|200989|twitter.com
12|1|394|394|200989|google.com/reader/
21|6|259|151|200989|bbc.co.uk
[...]

And returning to the original question of just how many visits do I make to Twitter

sqlite> select
  count(url_id) as visits,
  round(sum(duration) / 60.0, 2) as total,
  round(max(duration) / 60.0, 2) as longest
from log
left join urls on log.url_id=urls.id
where url = 'twitter.com'
group by day
order by day;
visits|total|longest
6|4.93|3.62
18|24.9|3.92
10|15.37|7.58

So it seems that on Sunday I made 6 visits for a total of about 5 minutes and a single longest session of 3 and a half minutes. On Monday it was 18 visits for a total of 25 minutes including one session of nearly 4 minutes, while today, 10 visits so far (including one of over 7 minutes) have already added up to over 15 minutes.

.mode csv

in SQLite is handy too, because it changes that list format to look like

visits,total,longest
6,4.93,3.62
18,24.9,3.92
10,15.37,7.58

so it’s trivial to open it in a spreadsheet.

Making graphs from MeeTimer

Even better will be something cunning and programmatic. Maybe in PHP or Ruby or something. Even this exploratory manual approach is fun though. It will obviously be better once I’ve built up a bit more history but now I know that MeeTimer is storing my data in a way that I can access it, I’m even more excited about it. Thanks, MeeTimer. You rock.

6 Comments »

RSS feed for comments on this post. TrackBack URI

  1. As for something cunning and programmatic, it’s probably really bad form to mention what my company’s doing – sorry! this is my guilty-looking face – but this kind of application was one we had in mind for Timetric. Some people are already using it for self-surveillance applications; monitoring Inbox Zero, things like that (eg http://timetric.com/series/PAOQhl_sQ3CkFvgsJjxvgA/).

    Our API docs are up at http://timetric.com/help/httpapi/, and there are Python bindings on Github. Drop me an email if I can help out – this’d be a really nifty tool to get in people’s hands!

    Comment by Andrew Walkingshaw — March 31, 2009 #

  2. Nice graphs! I’ll have a play, thanks. How does Timetric differ from Pachube?

    Spitting out a series of time-stamped events for Timetric to track should be trivial. Especially since it looks like the ‘startdate’ column in MeeTimer’s log table is simply the Unix Time stamp for that event (though expressed in milliseconds rather than seconds).

    I’m thinking it would be easy to chart the session duration, or (with an accumulator) expressing it as a daily cumulative total. This would give a more meaningful ‘current value’ for a live chart e.g. ‘how long has roo been online today?’ (and eventually, ‘how long has roo been on site x today?’).

    Comment by Roo — April 1, 2009 #

  3. Yeah, the cumulative data (per day) would be one way of doing it – it’s probably what I’d do. Session duration might be interesting too, if you were trying to track how long you spend (on average) per visit, but it’s not so natural a fit, I think. (Glad you like the graphs! I’ll tell Dan, who wrote the plotter.)

    Pachube and Timetric have something in common, but really I think our emphases are fairly different. Pachube’s very much focussed around routing sensor data, from what I’ve seen; we’re more general, in that we have an API you can use to hook up sensor data, but the thing we’re really focussed around, at least in my head – my colleagues might give you a different answer! – is comparison and analysis.

    So, for example, using Timetric you can easily set up a calculation on a series or set of series – say a moving average or sum over the last seven points for weekly stats (click “filter” on a series page if you’re logged in to see what I mean). Equally, if you prefer, you can set up a new series which is the result of a formula acting on several other series: say you want to know the total time you’re spending on groups of your favourite sites by what kind of site they are (digg + reddit + Hacker News, or last.fm + youtube, etc…). Even better, you could take your procastination data, and mine, and build a new series equal to the difference between the two, and heckle me whenever I’m skiving – because whenever any of the underlying data’s updated, the series built on that data are updated too, so the results of formulae are always up to date. And so there’s a good range of data to compare with and build on, we’ve got a load of public data on there to go along with the data our users upload.

    Here’s another example. Imagine there was another series – say the weather (which is public data we’re working on). Maybe you’re on the Web more when the weather’s cold? That’s something Timetric’d, hopefully, help you find out.

    http://timetric.com/help/formulae/ is our primer on the modelling/calculation engine.

    We’ve been doing some analysis of the data in the news, which is a pretty decent overview of the kind of analytics we’ve got – that’s at http://byline.timetric.com/. The most recent post about the G7 and BRIC economies uses derived/calculated series pretty heavily – click through to the underlying series from the graphs!

    Comment by Andrew Walkingshaw — April 2, 2009 #

  4. Thanks Andrew. Timetric looks more and more interesting the more I play with it. The analysis features around plotting comparison and calculations are indeed very nice and your web usage vs weather example is a great one. Now I see the differences in emphasis from Pachube, I can see why I might want to pump raw data about not only my web usage but also my home electricity/gas/water usage, home internal temperature, doorbell presses (etc) into Timetric as private feeds. So please don’t have any security breaches, mkay? :-)

    Comment by Roo — April 3, 2009 #

  5. great post and thanks for the detailed view… I installed RescueTime https://www.rescuetime.com/ a while back to understand how much i was really spending in outlook vs making stuff and it was worse than i thought… made me realise i am turning into a manager! Not sure if you can get to the data in rescue time but i left it running since i find it useful for the occasional self surveillance.

    Comment by Dunc — April 3, 2009 #

  6. [...] mentioned last week that I’d installed MeeTimer and was using it to track my browsing history. Now [...]

    Pingback by Roo Reynolds - Browsing my browsing — April 5, 2009 #

Leave a comment

XHTML: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>

Powered by WordPress with GimpStyle Theme design by Horacio Bella.
The postings on this site are my own and don't necessarily represent my employer's positions, strategies or opinions.