Serious Privacy Problems with Bots on Google Wave

Posted by knorby on February 25, 2010 under Python, app engine, facebook, fortune, google, internet, privacy, wave

I started writing this post while Google Wave was still pretty new, but it has now been out for a while and is half forgotten. It is still in closed preview, but it shouldn’t be hard to find an invite if you want to check it out. As I mentioned in my last post on Wave, I wrote a quick fortune bot for it. The bot got a decent bit of use at first, as many people played around, but now use has dropped to almost nothing. Based on my own use, I figured early on that most of the use was from one or two real people interacting with a bunch of bots. I tested and confirmed that with the data Google records by default.

Google App Engine, on which all bots must be hosted, by default logs every request and every error. A bot can register for a number of different events, each of which will trigger a request to the bot. The request contains the state of the wave in JSON format. The log files can easily be downloaded, and the JSON easily parsed. From that, you see everything. You see the addresses of everyone, and you see what has been entered, even if it doesn’t relate to the events of the bot. As far as I am aware, no TOS or privacy agreement exists that covers the use of this data, and even if one did, the most nefarious uses would still be silent.
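
To show how little effort this takes, here is a minimal sketch in Python, assuming the logged request bodies are available as JSON strings. The field names (`participants`, `blips`, `creator`, `content`) and the sample data are made up for illustration; they are not the actual format Wave sends to a bot.

```python
import json

# Hypothetical logged request body. The field names below are assumptions
# for illustration, not the actual Wave robot protocol format.
SAMPLE_BODY = json.dumps({
    "participants": ["alice@example.com", "bob@example.com"],
    "blips": {
        "b1": {"creator": "alice@example.com", "content": "quarterly numbers attached"},
        "b2": {"creator": "bob@example.com", "content": "fortune"},
    },
})


def harvest(body):
    """Return (sorted addresses, list of (creator, text)) from one logged body."""
    state = json.loads(body)
    addresses = set(state.get("participants", []))
    messages = []
    # Walk every blip, not just the one that triggered the bot's event.
    for blip_id in sorted(state.get("blips", {})):
        blip = state["blips"][blip_id]
        addresses.add(blip["creator"])
        messages.append((blip["creator"], blip["content"]))
    return sorted(addresses), messages


addresses, messages = harvest(SAMPLE_BODY)
```

In this made-up sample only the second blip was meant for the bot, yet the first one is collected just the same.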

By putting data on any web app, you open yourself up to the same risks and invasions. The Google ads in Gmail are targeted at you for a reason, after all. If you are using Gmail, though, it is safe to assume that Google will be the only one other than you to see your data. A bot, on the other hand, could be maintained by anyone. Facebook apps are a decent comparison. I have looked at the API a couple of times, and my understanding is that even with the permissions a user can grant or deny, apps get to see a lot. A fair bit of criticism has been made of this platform, but it is very safe to say the privacy structure in place for bots is much worse. Aside from the lack of permission controls, would you use something like Facebook apps on your e-mail or Google Docs (to the extent that makes sense…)? I hope not.

A Wave user has a somewhat unique problem here. If a bot provides a useful service for a particular use, and the wave for this use is private, should you use it? That isn’t a question anyone should have to ask. The question of “put this data in this web app or not” is one thing, but you shouldn’t have to worry about using a pivot tables tool on an online spreadsheet, which is essentially what is going on with bots here. There isn’t really a way to distinguish a good bot from a bad one either. If I wanted to snoop on people on Wave, I would write a useful bot, and no one, Google included, would be any the wiser about what I was doing with the collected data.

I don’t think there is an easy way to fix bots as they are. Anonymous search results aren’t really that anonymous, and I would guess Wave data would be much worse. The problem isn’t that App Engine logs requests; the problem is what Wave sends. If you consider the data in a wave in any way private, I would recommend against using bots.

My Project Ideas for Google Wave

Posted by knorby on November 10, 2009 under Python, coding, doit, google, internet, wave

Silly:

  • fortune/doit – Implemented. See Wave Fortune. You can use it by adding wavefortune@appspot.com to your contacts. I mostly made this bot to satisfy my fortune lust, and to get more familiar with App Engine and the Wave bot API.
  • wumpus/adventure – Not sure I am actually going to do this one. If I do, it will be the wumpus. Basically, the problem to solve is storing state effectively for such games. Wumpus is tiny, and the games are short, so it wouldn’t take much thinking. Adventure/Zork would require a lot more work, and I honestly don’t care that much.
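
For the state-storage problem in the game idea above, a minimal sketch, assuming the whole game fits in a few fields that can be written out as a JSON string and read back on the next request. The class and field names are hypothetical, and where the string would actually be kept (in the wave, in a datastore) is left open.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class WumpusState:
    # All fields are hypothetical; a real game would also need the cave map.
    player_room: int
    wumpus_room: int
    arrows: int = 5
    alive: bool = True


def dump_state(state):
    """Serialize the game state to a JSON string."""
    return json.dumps(asdict(state), sort_keys=True)


def load_state(text):
    """Rebuild the game state from a string produced by dump_state."""
    return WumpusState(**json.loads(text))


saved = dump_state(WumpusState(player_room=1, wumpus_room=14))
restored = load_state(saved)
```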

Tools:

  • logging interface – It occurs to me that Wave might work great in a situation where I think e-mail falls short now: data/msg dumps. I see this sort of thing at my jobs a lot. I get log messages I generally don’t care about, so I filter them out, and as a result I sometimes miss something. A similar case is something like a bug tracker, where so many replies can be generated that the thread is easy to ignore. Centralization would help a lot I think, but again, I am not sure I care.
  • RPN calculator – Nothing really to explain here. Could save the calculator’s state in past blips, and make them editable. The end result would be a collaborative calculator of sorts. Could be interesting.
  • something with jMol – Not too much thought here. When I was a student in the Computational Material Science group at ORNL, I ended up playing with jMol a bit from JavaScript. Some sort of gadget/bot combo could do some interesting stuff, but again, I don’t care.
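
The RPN calculator idea above can be sketched in a few lines of Python. This is only an illustration of the stack logic, assuming each blip holds one line of input and a snapshot of the stack; the function name and the way state is passed around are my own choices here, and nothing in it touches the Wave bot API.

```python
import operator

# Binary operators supported by this sketch.
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def rpn_step(stack, line):
    """Apply one line of RPN input to a stack and return the new stack.

    The input stack is not modified, so each blip could keep its own
    snapshot of the calculator state.
    """
    stack = list(stack)
    for token in line.split():
        if token in OPS:
            if len(stack) < 2:
                raise ValueError("not enough operands for %r" % token)
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[token](left, right))
        else:
            stack.append(float(token))
    return stack


first = rpn_step([], "3 4 +")
second = rpn_step(first, "2 *")
```

Because every step returns a fresh stack, editing an earlier blip would just mean replaying the later lines from that point on.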

I will post more about my thoughts on Wave later on, as I have many mixed thoughts on it. Google has a lot to do, both on Wave itself and on the extensions that they should provide. I am hesitant to work on large projects, as I don’t want to have Google copy my work, or to experience some odd situation with App Engine. I don’t think anyone, Google included, has any remote idea of what to expect from Wave yet.