
A few hours ago, Google announced to the world that the company has been crawling forms on "high-quality" Web sites to index "Invisible Web" content in the Google.com search engine.
Google's intention, as always, is to improve the quality of search results for users of its search engine.
Crawling Web site forms, though, constitutes a sea change in terms of data privacy; specifically, the privacy of corporate data.
"In the past few months we have been exploring some HTML forms to try to discover new web pages and URLs that we otherwise couldn't find and index for users who search on Google," according to Jayant Madhavan and Alon Halevy of the Crawling and Indexing Team, writing on the Official Google Blog.
Here's how Googlebot does it, according to Google engineers:
"We might choose to do a small number of queries using the form. For text boxes, our computers automatically choose words from the site that has the form; for select menus, check boxes, and radio buttons on the form, we choose from among the values of the HTML. Having chosen the values for each input, we generate and then try to crawl URLs that correspond to a possible query a user may have made. If we ascertain that the web page resulting from our query is valid, interesting, and includes content not in our index, we may include it in our index much as we would include any other web page."
Last year, as the search marketing analyst for JupiterResearch, I said that the biggest issue in 2007 would be the threat to the privacy of corporate data.
I was wrong. 2008 is the year corporate IT departments worldwide will be forced to spend time, money and resources to ensure that search engine spiders do not inadvertently index data a company would prefer to keep private.
The same holds true for non-profit organizations and other institutions.
From a personal standpoint, I have confidence in Google's data security systems, despite the recent departure of Google CIO Doug Merrill.
I have full confidence that Google practices "good Internet citizenship."
I'm confident Google has paved the road to relevance with good intentions.
This is not simply a "pioneering move" by Google.
That the robotic filling-in of forms has already been practiced by AOL's Quigo, according to SearchEngineLand, does not reassure me.
I'm sorry, Sergey, Larry, Eric. I can't in good conscience defend Google's decision to our readers. The costs to CEOs, CIOs and CTOs at corporations far outweigh the benefits to consumers.
Please, reconsider.
Do not make the robotic querying of Web site forms the default spidering practice for Google. As a search engine, Google has become the gateway to the Internet, and with great power comes great responsibility.
End this experiment now.
Stop this experiment before the backlash against Google develops. It's not a question you want to answer when Wall St. analysts quiz you on the company's performance during the first-quarter earnings conference call on April 17th.