Google: A Clear & Present Danger to Corporate Data Privacy

[Photo: Jayant Madhavan, Google]

A few hours ago, Google announced to the world that the company has been crawling forms on "high-quality" Web sites to index "Invisible Web" content in the Google.com search engine.

Google's intention, as always, is to improve the quality of search results for users of its search engine.

Crawling Web site forms, though, constitutes a sea change in terms of data privacy; specifically, the privacy of corporate data.

"In the past few months we have been exploring some HTML forms to try to discover new web pages and URLs that we otherwise couldn't find and index for users who search on Google," wrote Jayant Madhavan (pictured) and Alon Halevy of Google's Crawling and Indexing Team on the Official Google Blog.

Here's how Googlebot does it, according to Google engineers:

"We might choose to do a small number of queries using the form. For text boxes, our computers automatically choose words from the site that has the form; for select menus, check boxes, and radio buttons on the form, we choose from among the values of the HTML. Having chosen the values for each input, we generate and then try to crawl URLs that correspond to a possible query a user may have made. If we ascertain that the web page resulting from our query is valid, interesting, and includes content not in our index, we may include it in our index much as we would include any other web page."

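The steps the engineers describe can be sketched roughly as follows. This is a minimal illustration in Python, not Google's implementation: the names `FormParser` and `candidate_urls`, the `limit` parameter, and the assumption that the form submits via GET are all hypothetical and do not come from the Google blog post.

```python
from html.parser import HTMLParser
from itertools import islice, product
from urllib.parse import urlencode, urljoin


class FormParser(HTMLParser):
    """Collect the action URL and input fields of a page's form (assumes one GET form)."""

    def __init__(self):
        super().__init__()
        self.action = ""
        self.text_inputs = []  # names of free-text boxes
        self.choices = {}      # field name -> values offered in the HTML
        self._select = None    # name of the <select> currently open, if any

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "form":
            self.action = a.get("action") or ""
        elif tag == "input" and a.get("name"):
            kind = (a.get("type") or "text").lower()
            if kind == "text":
                self.text_inputs.append(a["name"])
            elif kind in ("radio", "checkbox"):
                self.choices.setdefault(a["name"], []).append(a.get("value") or "on")
        elif tag == "select" and a.get("name"):
            self._select = a["name"]
            self.choices.setdefault(self._select, [])
        elif tag == "option" and self._select and a.get("value"):
            self.choices[self._select].append(a["value"])

    def handle_endtag(self, tag):
        if tag == "select":
            self._select = None


def candidate_urls(page_url, html, site_words, limit=5):
    """Build a small number of query URLs a user of the form might have produced."""
    parser = FormParser()
    parser.feed(html)
    # Text boxes get words taken from the site; other inputs use their HTML values.
    fields = {name: list(site_words) for name in parser.text_inputs}
    fields.update(parser.choices)
    fields = {name: values for name, values in fields.items() if values}
    names = sorted(fields)
    combos = product(*(fields[name] for name in names))
    base = urljoin(page_url, parser.action)
    return [base + "?" + urlencode(list(zip(names, combo)))
            for combo in islice(combos, limit)]
```

Given a search form with one text box and one select menu, `candidate_urls` returns at most `limit` URLs such as `http://example.com/search?cat=a&q=widgets`. A crawler would then fetch each one and decide whether the resulting page is worth indexing; that evaluation step is not modeled here.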
Last year, as the search marketing analyst for JupiterResearch, I said that the biggest issue in 2007 would be the threat to the privacy of corporate data.

I was wrong: 2008 is the year corporate IT departments worldwide will be forced to spend time, money and resources to ensure that search engine spiders do not inadvertently index data a company would prefer to keep private.

The same holds true for non-profit organizations and other institutions.
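
One place such an audit might start is checking which form result URLs a site's robots.txt already excludes. The sketch below uses Python's standard `urllib.robotparser` module. It assumes the crawler honors the robots exclusion protocol, which the passage quoted above does not state; the helper name `blocked_for`, the sample rules, and the example URLs are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules that exclude a search form's result pages.
ROBOTS = """\
User-agent: *
Disallow: /search
Disallow: /internal/
"""


def blocked_for(robots_txt, user_agent, urls):
    """Map each URL to True if the given rules forbid user_agent from fetching it."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {url: not parser.can_fetch(user_agent, url) for url in urls}
```

Calling `blocked_for(ROBOTS, "Googlebot", [...])` with a list of form result URLs shows which of them a compliant crawler would skip. An exclusion rule is a request, not an access control, so data that must stay private still needs authentication.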

From a personal standpoint, I have confidence in Google's data security systems, despite the recent departure of Google CIO Doug Merrill.

I have full confidence that Google practices "good Internet citizenship."

I'm confident Google has paved the road to relevance with good intentions.

This is not simply a "pioneering move" by Google.

That the robotic filling-in of forms has already been practiced by AOL's Quigo, according to SearchEngineLand, does not reassure me.

I'm sorry, Sergey, Larry, Eric. I can't in good conscience defend Google's decision to our readers. The costs to CEOs, CIOs and CTOs at corporations far outweigh the benefits to consumers.

Please, reconsider.

Do not make the robotic querying of Web site forms the default spidering practice for Google. As a search engine, Google has become the gateway to the Internet, and with great power comes great responsibility.

End this experiment now.

Stop this experiment before a backlash against Google develops. A backlash is not something you want to be explaining when Wall St. analysts quiz you on the company's performance during the first-quarter earnings conference call on April 17th.


>>Source Link
>>Blog: searchenginewatch
>>Publish Date: 4/12/2008 7:01:47 AM
>>Keywords: google data
