Friday, January 20, 2006

Google Power

While Microsoft is preoccupied with censoring and security problems, Google continues to deliver to the people.

Check out the beta version of Google Video.

Although Google's compliance with Chinese commies means users of the search engine in China are subject to the censor wall, Google, unlike other search engines such as Yahoo, is so far resisting demands by the US government to provide the contents of its search term database. As part of a governmental probe concerning online pornography, the Justice Department is also requesting a random sample of one million web addresses.

Google Inc. has been subpoenaed by the U.S. Justice Department to turn over a database of search terms as part of a government probe of online pornography but Google rejected the demand as overreaching by the government.

In a Wednesday filing in the U.S. District Court for the Northern District of California, the Justice Department demanded that Google provide all queries entered on the company's Web search system between June 1 and July 31 of last year.

The Justice Department includes a request for Google to produce a random sample of one million Web addresses, known as URLs.

The data request is part of a broader government effort to track the effectiveness of a 1998 law, the Child Online Protection Act, or COPA, which penalizes Web site operators who allow children to view pornography, the filing said.

A 2004 U.S. Supreme Court decision, Ashcroft vs ACLU, upheld an injunction that blocked the government from enforcing the law and the Justice Department is seeking evidence from Google and others as part of an appeal of this injunction.

See Search Engine Watch for a detailed analysis of the situation:

Getting a list of all searches in one week definitely would let the US federal government dig deep into the long tail of porn searches. But then again, the sheer amount of data would be overwhelming. Do you know every variation of a term someone might use, that you're going to dig out of the hundreds of millions of searches you'd get? Oh, and be sure you filter out all the automated queries coming in from rank checking tools, while you're at it. They won't skew the data at all, nope.

Moreover, since the data is divorced from user info, you have no idea what searches are being done by children or not. In the end, you've asked for a lot of data that's not really going to help you estimate anything at all.

[..] It's important to note that from what I read, the requests do not involve user data at all. Shutting off your cookies or purging your personalized search data wouldn't protect you with this request, because the request wasn't going after personal data. To stress again:

According to the report, they wanted a list of one million web addresses. Not who went to the web pages and when, just a list of URLs picked randomly.

They wanted searches for one week. I haven't seen the court documents, but I'm guessing Google could have handed over a list of searches that were entirely unassociated with IP addresses, times, cookies and registration information. Nothing suggests that they wanted to know who did the searches in any way.

Having said this, such a move absolutely should breed some paranoia. They didn't ask for data this time, but next time, they might. Of course, it bears reminding that this type of data is easily obtainable from ISPs. So even if the search engines refuse to comply, your own ISP could be giving up your data -- or selling it.