By Mark Malseed
An unsettling reality has begun to descend on the millions of fans and devotees of the Internet giants Google and Yahoo: They know an awful lot about us.
Every Google search ever typed, every Yahoo news article ever read—all are logged and stored indefinitely in these companies’ massive databases. Think about that for a moment. We whisper a lot into the ears of these shadowy search engines, including plenty of secrets that we’d want to keep from our spouses and kids. And we do so without ever bothering to check what is being done with that information.
If you don’t already know, let me be the first to tell you: Google, Yahoo and their less-well-known brethren are keeping tabs on what is being searched, viewed and clicked on, all across their sprawling Web empires.
You know all those e-mails you’ve sent using free services such as Yahoo Mail or Google’s Gmail? They are kept for posterity on company servers, even in cases when they have been deleted from users’ accounts. And instant messages? A new service from Google leaves a digital record long after the conversations have been forgotten. Driving directions? Not only do Google and Yahoo know the way to our intended destination, they also know that we probably made the trip. (We all but told them we were going, didn’t we?)
Searches are not by default linked to our names—only to an Internet address or a unique browser ID. But armed with that information, investigators and sometimes the companies themselves can make the crucial link to our names and addresses.
Existing laws offer fewer protections for data and e-mail communications stored by a third party than for the contents of someone’s personal computer. And though there are gray areas in the law, this much is clear: plenty of what the search engines have amassed about us may be obtained without a wiretap or search warrant.
The words we type into Google may seem anonymous and innocuous at the instant we’re doing a search, almost as if we are confessing to some digital high priest: “Lord, just between you and me, I am fascinated with . . . recreational drugs, bondage, Islamic radicalism, how to cheat my friends at poker. . . . ” But our inquiries leave behind permanent tracks that could come back to haunt us someday.
Consider for a moment what a complete history of just your Internet searches alone might reveal. Chances are the list would offer pretty good clues as to your political leanings, your health condition, your finances, your job satisfaction, your marital fidelity, your obsessions and addictions, and plenty else that you may want to keep private.
Add to that the full archive of your Web e-mails—and depending on what other Yahoo or Google services you use, a partial or full record of your Web surfing habits—and these companies have got a fairly comprehensive digital dossier on you, me and several hundred million other people.
This is a treasure trove by any accounting, and potentially a very valuable, perfectly legitimate asset to criminal and terrorism investigations. But such an accumulation of personal data also presents a tempting target for intrusive fishing expeditions by law enforcement, divorce lawyers, government prosecutors and even less savory characters.
What’s more, the whereabouts of this data are generally kept secret. The records are sequestered in undisclosed locations, entirely out of our control, and may even be stored in a country other than the one in which a user lives, raising potential legal complications.
In China, both companies have come under fire for complying with the communist regime’s censoring of the Internet. Yahoo has twice turned over personally identifying information about Chinese dissidents that led to their being jailed.
Here in the United States, the search engines say they comply with legal requests for information, but they rarely comment on the extent of their cooperation in handing over search data for criminal or civil cases. (In at least one case, a person’s search history was used in a prosecution, but that information was skimmed from his own computer.) Nevertheless, unless laws are rewritten or company policies changed, the search engines will find themselves increasingly bombarded with subpoenas for their users’ search histories.
If all this sounds like a privacy disaster in the making, it is one that until recently has received scant attention in the mainstream press. Google’s generally gung-ho media coverage has glossed over some of the serious concerns that privacy advocates have voiced about its information-hoarding habits.
This week’s cover story in Time magazine is a perfect example. Despite a cover headline that provocatively asks, “Can We Trust Google With Our Secrets?” the article largely dodges the question, instead retracing (yet again) the admittedly impressive rise of the seven-year-old firm. Not until the last paragraph does it break what is surely news to many people, mentioning in passing that Google “retains loads of our data—what we search for, what we say in our Gmails—so we need to know it won’t be evil with them.”
Well, indeed, it would be reassuring to know Google intends to uphold its unofficial motto of “Don’t Be Evil” with regard to our records. But how does one prove that? What we really need to know more about—and this is a matter of fact, not conjecture—is what data are being retained; for how long; who has access to the information, and for what purposes; and what our rights are under the law.
Given the fact that Google just made a $1-billion investment in AOL (which is owned by Time magazine’s parent company, Time Warner), one might expect the magazine to give the search company gentle treatment on touchy issues. It’s not going to bite the hand that is helping to rescue papa. But how about some tougher questioning? The softballs lobbed in the accompanying interview transcript reveal that the Google guys are still fun and down to earth, which they are, but they shed no light on the advertised topic of “Can we trust them?”
The existence of detailed logs like the ones Google and Yahoo compile has never been a secret among technology insiders. Owners and developers of websites naturally want to have data on what’s being viewed, how often and by whom, as this helps in analyzing and improving operations and in spotting malicious attacks. In some ways, it is no different than in the offline world, where businesses like to keep a careful eye on their inventories and customers.
Even so, last month’s news that the Justice Department had subpoenaed search records from Google, Yahoo, AOL and Microsoft came as an eye-opening jolt. Loyal users, investors and the media began asking long-overdue questions. What exactly was in those records and for the taking? Could search engines produce lists of what searches came from what Internet addresses?
Tech-news site CNET posed a series of specific questions along these lines to several major search engines, but many of the responses were comically short on detail. “We keep data for as long as it is useful,” said a Google spokesman when asked if records were ever purged. A Yahoo rep offered this: “We maintain data that will help us provide users with the best possible experience.”
The specifics of the Justice Department subpoena were as follows: Federal prosecutors, hoping to revive a previously overturned law protecting minors from exposure to pornography, went googling for data that would buttress their case. (Some early reports about the subpoena said the law in question dealt with child pornography, which was not true.)
Initially, the government demanded a list of every website address available on Google and every search term entered during July 2005—a staggering amount of data, considering that Google handles 300 million searches per day. The request was later narrowed to a list of 1 million random Web pages and all the search queries for a given week.
Perhaps trying to show off its bureaucratic muscle for data-crunching, the Justice Department also requested similar information from Yahoo, America Online and Microsoft, all of which have said they turned over some aggregated data, though they have not specified how much.
Their compliance with the subpoena is disappointing from a privacy standpoint, but it does not add up to a doomsday scenario. None of the search engines released any personally identifiable information to the government, nor were they asked to.
Google, to its credit, gallantly refuses to turn over any data at all. The company is being seen by many as taking a stand against the Bush administration, which is not well liked in Silicon Valley. “The demand for the information is overreaching,” Google attorney Nicole Wong told the San Jose Mercury News, which broke the subpoena story. Google co-founder Sergey Brin later told Bloomberg, “We don’t think it’s a proper subpoena for some legal case; it’s not anything we’re even a party to.” (A court hearing is scheduled for Feb. 27.)
Brin’s main reason for putting up a fight, of course, is to protect Google’s business. The Internet is as hotly competitive as ever, and while Google holds a commanding market share in search, Yahoo is still the most visited website in the world and Microsoft is still king of the desktop. Google does not want to give them or anyone else a window into its proprietary information. Nor does it want to see a precedent established for regular government trawling of its data, which might make users and investors skittish.
But Google has its work cut out, in part because of the high expectations it has set for itself. Even as the search leader seems to be standing firm against the Department of Justice, it sent the opposite signal last month when it rolled over and acceded to the Chinese government’s wishes.
By launching a China-based service in January, Google agreed to actively restrict certain Web pages on the totalitarian government’s behalf, a stark departure from its nose-thumbing approach to, at one time or another, its venture capitalists, Wall Street and even the SEC.
Although Google has operated a Chinese-language site for several years, the site had been run from outside the country and served unfiltered content that the government then censored through its so-called Great Firewall of China. After much internal debate, Google’s founders and executives made peace with the Chinese regime’s demands, deciding that was best for business and, they also argue, ultimately was a way to break down the oppressive speech restrictions.
Yahoo, which also operates in China, has come under even more intense fire for its apparent role in the jailing of two activists. The cases, which have been publicized by the human rights groups Reporters Without Borders and Amnesty International, have also sparked bipartisan criticism in Washington.
“I don’t like any American company ratting out a citizen for speaking out against their government,” Rep. Tim Ryan, an Ohio Democrat and member of the House Human Rights Subcommittee, told Reuters last week. The subcommittee is holding a hearing on Feb. 15 on the activities of U.S.-based Internet companies in China, and lawmakers have said they intend to push Yahoo to reveal what information it has provided to the Chinese government.
It’s easy to take an absolutist stance and condemn Google and Yahoo for their decisions to do business in China given the strict censorship. But to stay out would mean compromising the service they provide to the world’s second-largest Internet audience. (If Google didn’t agree to self-censorship, the Chinese government’s Great Firewall would do the censoring, and that firewall greatly slows the Web.) For companies whose missions are all about open information, but which need millions of satisfied users to keep their advertising engines running, this was a tough call.
Then again, the serious privacy concerns outlined above—namely, that Google and Yahoo know all and see all—now come into play for China’s 1.3 billion citizens, whose government is not as mindful of rights and legal processes as our own.
Google co-founders Sergey Brin and Larry Page may be young tech geeks but, make no mistake, they are also shrewd businessmen. They are visionaries too, with a grand mission to organize all the world’s information and make it accessible.
Yahoo’s leaders have a similarly broad vision, which they summarize with the acronym FUSE, for “find, use, share and expand all human knowledge.”
No ordinary dot-com enterprises, these are powerful global juggernauts whose actions matter, whose products increasingly define how we experience the Internet.
Brin and Page tend to beg forgiveness, not permission, when pursuing their bold ideas, as was the case in 2004 when they launched Gmail, a free e-mail service that automatically scanned the contents of messages to display relevant ads. Despite complaints by privacy advocates, the duo did not back down, insisting that the novel features and massive free storage capacity would win over users. They have. Gmail accounts are in high demand, and are now being offered in a trial version to schools and businesses.
Still, the Google guys’ penchant for pushing boundaries and challenging the status quo may make them party to landmark legal struggles in coming years, perhaps reaching as high as the Supreme Court.
Locked in a fierce battle for supremacy on the Internet, Google and Yahoo are innovating at a dizzying pace, in fields ranging from advertising to video search to artificial intelligence, biology, energy, even space exploration. Yahoo is aggressively researching new forms of online communities, such as the free photo-sharing site Flickr, to engage its enormous audience of 420 million registered users. Google, meanwhile, is busy scanning millions of library books without regard for traditional copyright laws, and it has quietly embarked on a project with maverick scientist Craig Venter to build a database of genetic and biological information.
Funding this race is a robust online advertising business that generates billions of dollars in yearly revenue for each of the companies. In order to deliver the best-performing ads, the firms will strive to learn and anticipate our wants, needs and aspirations. And that probably means tapping into our surfing habits, search histories, personal preferences and more.
How do we balance the admittedly impressive features that Google and Yahoo provide, on the one hand, and cherished notions of personal privacy on the other? There are small steps users can take to minimize the digital dossiers that these companies can amass—for example, clearing “cookies” from Web browsers every so often, or using different sites for search and for e-mail. But the lead must come from the firms themselves.
Tomorrow’s Internet will be far more interesting than today’s—which is why it is critical for the leading search engines to work out industry-wide privacy standards sooner rather than later. They can start by resisting unwarranted requests for data, appointing internal “chief privacy officers,” and being more forthcoming about what information they record and share with third parties.
Until that day, and probably for as long as they are around, Google and Yahoo will know a lot more about us than we know about them.
Mark Malseed is coauthor of “The Google Story: Inside the Hottest Business, Media and Technology Success of Our Time,” an international bestseller that is being published in 17 languages worldwide. Formerly the researcher to Bob Woodward for the books “Plan of Attack” and “Bush at War,” Malseed contributes to numerous online and offline publications, including The Washington Post.