Posted by knorby on November 26, 2008 under Python, google, internet |
After reading the comments on a reddit story about IQs, I became curious about how IQs are reported on the internet. A few people were saying that when they see someone mention their IQ on the internet, it is usually above 130. The explanations given were along the lines of people lying, biased online tests, and segmentation in where people browse. I was curious how frequently the different IQs are mentioned, so I wrote a little Python to get the Google search result counts for IQs 50-199 (I would have included lower values after seeing the results, but I chose to go the scraping route rather than gdata, and scraping ends up getting you blocked by Google, something I didn’t know). I ran each number together with the word “iq”; I think there may be better queries, but simple seemed good enough. Here are the results, plotted with matplotlib:

I found these kind of surprising. Most of the result counts were around 6 million, but there were a few sharp drops. I was especially surprised by 100 and 130, since, if memory serves, 100 is the 50th percentile for IQs and 130 is around the 98th; I would expect a greater count on these two, since more sites would include those numbers while explaining the scale; instead, there are large drops. Weird. I don’t think there is any connection between these results and anything proposed on reddit either.
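As a quick check on those percentiles, here is a minimal sketch that converts an IQ score to a percentile. It assumes the common convention that scores are normally distributed with a mean of 100 and a standard deviation of 15 (some tests use a different spread), and the function name iq_percentile is just something I made up for illustration.

```python
import math


def iq_percentile(iq, mean=100.0, sd=15.0):
    """Percentile of an IQ score under an assumed normal distribution."""
    # Standard score: how many standard deviations the score is from the mean.
    z = (iq - mean) / sd
    # Normal CDF written with the error function, scaled to 0-100.
    return 50.0 * (1.0 + math.erf(z / math.sqrt(2.0)))


if __name__ == "__main__":
    for score in (100, 115, 130, 145):
        print(score, round(iq_percentile(score), 2))
```

Under those assumptions a score of 130 sits two standard deviations above the mean, which is why it is often used as a cutoff.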
Posted by knorby on July 3, 2008 under humor, internet |
Twitterfeed got me thinking. How long could I get a twitter-twitterfeed infinite recursion loop to go for? I created the infinatwit to put that question to the test by setting twitterfeed to follow infinatwit and post to infinatwit. I think the next step will be to see how many web 2.0 services I can combine to produce this effect. I think feedburner is next…
Posted by knorby on April 29, 2008 under advertising, internet |
I was randomly browsing last night when I came across an eBay sniping service. I was recently sniped in the last 5 seconds on a VAX I was bidding on, so I viewed the site with a mild degree of interest. I started to read through the user testimonial page when I noticed a little gem. Each testimonial seemed to be fine on its own; I never really trust these pages, but I could believe that these were real, until I noticed these two:
| Not your Daddy’s Sniping Service!!! |
March 27, 2007, by lambykins |
Hello, I have used a few sniping services and none of them really did it for me. I found BidSlammer and it was just totally different. Very intuitive. I can move fast. Good job guys. Slam-It is awesome, BTW. Wm. Howard, citro_cell
| This is the best one |
March 24, 2007, by lobster_soss |
Hello, I have used a few sniping services and none of them really did it for me. I found BidSlammer and it was just totally different. Very intuitive. I can move fast. Good job guys. Slam-It is awesome, BTW. Thanks, Jeff
It seems that the only things they bothered to change in these two were the title, month, “username,” and end line. The worst part is that they put these two right next to each other. I guess it just goes to show how much you can trust advertising.
Posted by knorby on February 1, 2008 under Chicago, internet, personal |
Since my roommates and I first moved into our apartment, we have left our wifi access point open. No security at all. None of the other people in our apartment building strike me as the type to break any sort of security. Besides, some of my roommates were having some trouble getting it set up with security on their computers, and I really didn’t want to have to configure their computers or deal with problems that came up. Really, I liked the idea of leaving an access point open. I knew the security was weak to begin with, and it can be a lifesaver for others at times. Bruce Schneier wrote a piece on why he keeps his wireless network open that follows this same line of reasoning. Unfortunately, there is a very real problem with open wifi in apartments. Some people moved into the apartment a floor above ours, and it appears they never bothered to get an ISP; they just leeched off ours. I would think that few people would mind someone using their connection while waiting to get their own, but the connection started to really slow down. I suppose one way to solve the problem would have been to talk to them, but I decided to implement MAC address filtering instead. I suppose such things were to be expected, but I always hate when I end up being disappointed by human nature.
Posted by knorby on January 10, 2008 under blogs, internet |
TechCrunch ran an article a few hours ago (it’s about 11PM in Chicago as I write this) titled Network Solutions Using Questionable Tactic to Sell More Domain Names, which claims that Network Solutions, the domain registrar, was using its powers as a registrar in questionable ways to reserve any domain searched on their site, thus locking the searcher into buying it from them for much more than they could get otherwise. I thought this believable, but I decided to put this claim to the test for the fun of it, and it is looking much less true, or at least different from what was reported.
I searched for the domain “THISISABOGOUSDOMAIN-NETWORKSOLUTIONSSUCKS.COM”, as I wanted something that would not already be taken and that I would not want myself. Obviously, if Network Solutions put any filters on the domains they pull this scam on, this one wouldn’t pass them. Then again, since they lose nothing from doing it, they have no reason not to pull it for every one. I did my first test using whois on my computer, which appears to search VeriSign databases:
Whois Server Version 2.0
Domain names in the .com and .net domains can now be registered
with many different competing registrars. Go to http://www.internic.net
for detailed information.
No match for “THISISABOGOUSDOMAIN-NETWORKSOLUTIONSSUCKS.COM”.
>>> Last update of whois database: Fri, 11 Jan 2008 04:20:40 UTC <<<
NOTICE: The expiration date displayed in this record is the date the
registrar’s sponsorship of the domain name registration in the registry is
currently set to expire. This date does not necessarily reflect the expiration
date of the domain name registrant’s agreement with the sponsoring
registrar. Users may consult the sponsoring registrar’s Whois database to
view the registrar’s reported date of expiration for this registration.
TERMS OF USE: You are not authorized to access or query our Whois
database through the use of electronic processes that are high-volume and
automated except as reasonably necessary to register domain names or
modify existing registrations; the Data in VeriSign Global Registry
Services’ (“VeriSign”) Whois database is provided by VeriSign for
information purposes only, and to assist persons in obtaining information
about or related to a domain name registration record. VeriSign does not
guarantee its accuracy. By submitting a Whois query, you agree to abide
by the following terms of use: You agree that you may use this Data only
for lawful purposes and that under no circumstances will you use this Data
to: (1) allow, enable, or otherwise support the transmission of mass
unsolicited, commercial advertising or solicitations via e-mail, telephone,
or facsimile; or (2) enable high volume, automated, electronic processes
that apply to VeriSign (or its computer systems). The compilation,
repackaging, dissemination or other use of this Data is expressly
prohibited without the prior written consent of VeriSign. You agree not to
use electronic processes that are automated and high-volume to access or
query the Whois database except as reasonably necessary to register
domain names or modify existing registrations. VeriSign reserves the right
to restrict your access to the Whois database in its sole discretion to ensure
operational stability. VeriSign may restrict or terminate your access to the
Whois database for failure to abide by these terms of use. VeriSign
reserves the right to modify these terms at any time.
The Registry database contains ONLY .COM, .NET, .EDU domains and
Registrars.
Just for reference, there is a six hour time difference between Central (where I am) and UTC. Also, my clocks are in 24 hour time. Anyway, I then went to networksolutions.com and did a search there. It was about 22:23 Central.
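For anyone who wants to line up the whois timestamps with my local times, here is a rough sketch that pulls the “Last update” time out of whois output and shifts it to Central. It assumes a fixed six hour offset (fine for January, when there is no daylight saving), and the function names are my own inventions for illustration, not part of any whois tool.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumption: a fixed UTC-6 offset, which only holds for Central Standard Time.
CENTRAL_STANDARD = timezone(timedelta(hours=-6), "CST")

LAST_UPDATE = re.compile(r"Last update of whois database: (.+?) UTC")


def whois_update_in_central(whois_text):
    """Return the whois 'Last update' time as a Central-time datetime, or None."""
    match = LAST_UPDATE.search(whois_text)
    if match is None:
        return None
    # Parse a timestamp shaped like "Fri, 11 Jan 2008 04:20:40".
    utc_time = datetime.strptime(match.group(1), "%a, %d %b %Y %H:%M:%S")
    return utc_time.replace(tzinfo=timezone.utc).astimezone(CENTRAL_STANDARD)


def is_unregistered(whois_text, domain):
    """True if some line of the output reports 'No match for' the domain."""
    wanted = domain.upper()
    return any(
        line.strip().startswith("No match for") and wanted in line.upper()
        for line in whois_text.splitlines()
    )


if __name__ == "__main__":
    sample = ">>> Last update of whois database: Fri, 11 Jan 2008 04:20:40 UTC <<<"
    print(whois_update_in_central(sample))
```

So the 04:20:40 UTC stamp on the first lookup corresponds to 22:20:40 Central on the evening of January 10.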

I then got the results. It was marked as available, of course.

The test begins….
I tried GoDaddy first. This test was immediately after the search, at 22:25, so I didn’t necessarily expect it to change that fast, but I figured if it was going to happen, it was going to happen soon…

Whois confirmed the same a minute later…
No match for “THISISABOGOUSDOMAIN-NETWORKSOLUTIONSSUCKS.COM”.
>>> Last update of whois database: Fri, 11 Jan 2008 04:26:33 UTC <<<
Time passed, but still nothing… GoDaddy kept on giving me the same page, and about an hour later, whois still said nothing:
No match for “THISISABOGOUSDOMAIN-NETWORKSOLUTIONSSUCKS.COM”.
>>> Last update of whois database: Fri, 11 Jan 2008 05:16:42 UTC <<<
According to the TechCrunch story, they can only keep it for 5 days before they have to pay for it, so it seems like if they can’t get it in an hour, there isn’t much point. It is quite possible that they cut it out as soon as anyone said anything, but that seems dubious. It’s all very strange…
Posted by knorby on December 31, 2007 under humor, internet, personal |
World of Warcraft, the popular Massive Multi-User Online Heroin Alternative (MMUOHA), almost killed my friend. True story… sort of. My friend, I will call him John, has been playing WoW for a few months, and, like many players of the game, it has sucked away his life. Anyways, he recently had some surgery. Apparently, his arm started to hurt, so he went to the ER. The doctor came out and asked John if he had been in “an extended sedentary state.” John had developed a blood clot in his arm. I don’t know if this would have killed him, but it couldn’t have been good. I think there is only one word to describe this story: WOW. Granted, WoW had little to do with this problem, but it is true that WoW claims many lives each year, even if it is not in the physical sense.
Really, it’s amazing how addictive certain types of games can be. They tend to be games that allow players to have stats and don’t have any clear end game. When you add in the internet factor, it just means that one can find others who care about something so meaningless. Maybe people should carry around a notepad with “exp” written on the front. Anytime the carrier does something favorable in his (it is probably safe to assume that anyone who does this would be male) life, he adds some points to his score. Every so often, he can level himself up once he gets to a particular number of experience points. He could keep a board with various characteristics like dexterity, strength, etc… and just add a few points to some of them each time he levels up. Would it be any more absurd than those who spend months playing these games just doing tasks to level up? From what I understand, most of what you have to do is go around killing small animals. WoW.
Posted by knorby on December 12, 2007 under coding, design, internet, web design |
Since I am not a big fan of spam, I normally do something to obfuscate my e-mail address whenever I put it up on a publicly accessible site in plain text. I usually don’t do anything special; I just do something like name {at} example {dot} net. As far as I am aware, spammers have not started to collect addresses formatted using such methods, but it has always bothered me, because it would be so simple to collect such addresses. Of course, that fear is far greater with any sort of computer-generated obfuscation. For example, Mailman has a particularly dumb formatting. Every address is like name at example.net. Since spammers don’t have to worry about false positives, they could collect every address off of a Mailman archive just by joining together every word immediately before an ‘at’ with the word that comes immediately after the ‘at’ with an ‘@’. Without missing any real addresses, the false positive count could be reduced just by checking for a dot in the latter word. Really, the difficulty in extracting such addresses is incredibly minimal.
The only reason that it is not an issue yet is that there are still enough people out there who put addresses up without any sort of obfuscation, so the task is still easy. I think that spammers will have to start collecting such addresses soon enough, if they haven’t started already. So my goal is to determine exactly how I would go about building an e-mail extraction system were I a spammer; this way, I can determine what sort of addresses could not be extracted easily. To start, let’s go through the assumptions we are making about spammers:
- False positives aren’t important. There will be plenty of bad addresses already, so a few more won’t hurt.
- Want to keep everything simple. The spammer is not looking for the theoretically best system, just something that works and is simple to write.
- Want to write things for the most global cases. If someone does something unique, then we should expect the system to fail.
- Keep the system to the level of joining together strings. Don’t look for cyphers or anything like that.
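Under those assumptions, the Mailman case described earlier really is only a few lines. This is a minimal sketch of the “join the words on either side of ‘at’” idea, including the check for a dot in the second word; the regular expression and the function name are my own choices for illustration, not anything a real harvester is known to use.

```python
import re

# A word, the literal word "at", then another word.
MAILMAN_AT = re.compile(r"(\S+) at (\S+)")


def extract_mailman_addresses(text):
    """Rebuild addresses written as 'name at example.net'."""
    addresses = []
    for local, domain in MAILMAN_AT.findall(text):
        # Drop punctuation that commonly hugs an address in running text.
        local = local.lstrip("<(")
        domain = domain.rstrip(".,;:)>")
        # The cheap false-positive filter: a domain needs a dot in it.
        if "." in domain:
            addresses.append(f"{local}@{domain}")
    return addresses


if __name__ == "__main__":
    print(extract_mailman_addresses("Write to name at example.net, or meet at noon."))
```

Phrases like “meet at noon” get thrown out by the dot check, and anything that slips through is just one more bad address in a list that is already full of them.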
So, of all things, an obfuscation method is least likely to change the actual text of the address. In the name@example.net example, ‘name’, ‘example’, and ‘net’ are never changed. The only constants across addresses are the top-level domains such as ‘com’, ‘net’, ‘org’, ‘edu’, etc…. On the web, the most frequent occurrences of these TLDs will be in URIs and e-mail addresses. The first step would be to filter out the URLs from this mix. Any URL written without a protocol would result in a false positive. The next thing to do is find all the obvious addresses. With the rest of the TLDs found, if the match is connected by a dot to anything to the left, join the word to the word occurring two positions to the left with an ‘@’. Otherwise, join the word two positions to the left with a ‘.’, and then join that new string with the word two positions to the left of that with an ‘@’. The spammer surely could think of other methods like the ones I have outlined.
This exercise makes it clear that the only way to avoid most trouble is to come up with some sort of encoding method that is human-readable, but is obfuscated to these sorts of general extraction methods. With something like an e-mail address on a uchicago system, if it is listed on a uchicago site, it is possible to make abbreviations like uchic.... edu. These sorts of obfuscations could still be detected by a sophisticated extraction system, but it would be too much of a hassle for too limited results. There are other tricks one could employ along these same lines; for example, the addresses name..../\7.....example#...#net or USER:(....name....) |AT| ADDY:(....example{dot}net....) . The problem with these is that they are just barely human-readable, and a means of extraction is not that far off.
What is the best solution? For all intents and purposes, I consider JavaScript obfuscation equivalent to putting addresses in plain text without obfuscation. As I have previously discussed, it is pretty easy to extract the contents of the DOM from Firefox. The first method that comes to mind is essentially a series of variations on the barely readable obfuscations. Basically, using PHP or something else server-side, addresses can be written as normal, and then encoded. The problem with this solution is that these methods are barely human-readable, and the text is still left unaltered. What sort of method could leave everything human-readable but also modify the text itself? I haven’t been able to think of anything. Perhaps I will think of something soon…..
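To show how little work the TLD-anchored joining takes, here is a rough sketch of it. It assumes the text is split on whitespace, uses a deliberately tiny TLD list, and skips anything that already looks like a plain address or a URL with a protocol; the names TLDS and extract_by_tld are made up for this example.

```python
# Assumption: a short illustrative TLD list, not a complete one.
TLDS = {"com", "net", "org", "edu"}


def extract_by_tld(text):
    """Rebuild obfuscated addresses by anchoring on a trailing TLD."""
    words = text.split()
    found = []
    for i, word in enumerate(words):
        cleaned = word.strip(".,;:()<>").lower()
        head, dot, tld = cleaned.rpartition(".")
        if tld not in TLDS:
            continue
        if "@" in cleaned or "://" in cleaned:
            continue  # obvious addresses and URLs are handled elsewhere
        if dot and head:
            # "example.net" is already joined: take the word two to the left.
            if i >= 2:
                found.append(f"{words[i - 2]}@{cleaned}")
        elif i >= 4:
            # Bare "net": domain is two words left, name two further left.
            found.append(f"{words[i - 4]}@{words[i - 2]}.{cleaned}")
    return found


if __name__ == "__main__":
    print(extract_by_tld("mail me: name {at} example {dot} net"))
    print(extract_by_tld("name at example.net"))
```

It never looks at what the separators actually are, only at positions, so {at}, [at], and (at) all fall to the same loop. A URL written without a protocol, like www.example.com, produces a junk address, which is exactly the kind of false positive a spammer would not care about.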
Posted by knorby on November 20, 2007 under design, internet, media |
Not to be a shameless fanboy, but I really love Woot Shirt. I am a night owl, so I have no problem whatsoever checking this site all the time. There are at least a few decent ones per week, and $10 for a decent t-shirt with shipping included is hard to argue with. All the designs are done by random people, and there are random contests to decide some. I have definitely purchased quite a few of these. Anyways, it’s worth a look or two if you haven’t already seen it.