Search Engine Dilemma

The history of Internet search engines can be traced back to 1990, when a group of students at McGill University in Montreal created Archie, a script-based data-gathering program that downloaded the directory listings of files located on File Transfer Protocol (FTP) sites and compiled them into a searchable database of file names. Archie was a response to the primary method of storing and retrieving files in the pre-web days. After the introduction of the World Wide Web in 1991, the Internet began expanding beyond the domain of academia, and it gained critical mass with the introduction of a web browser called ‘Mosaic’. Around the same time, the first web search engine, ‘Wandex’, was introduced. One of the most prominent search engines, Google, was launched in 1998, alongside peers such as Yahoo! Search and MSN Search, and later Bing. The backend of a search engine is very complicated, but in the end its functionality can be condensed into a single description: a curator of URLs and a facilitator of access[1].

It is to be noted that the Indian regime for data protection is still at an evolutionary stage. With the recent passage of the new Intermediary Guidelines, 2021[2], the government aims to create a liability framework for intermediaries operating in India. Search engine compliance remains a grey area, as highlighted in the recent case filed by Google challenging the new IT Rules[3]; the argument put forth by Google’s counsel was that it is a ‘mere aggregator’. Search engines can be treated as facilitators of access to various websites. This is correct on the surface of it, but once we take a look at the backend it becomes more complicated: akin to a library, a search engine maintains an index of all the URLs, and by virtue of tags a particular website can be placed anywhere in the results, be it at the top or the bottom.

"Tags" are keywords or terms assigned to a piece of information, and they influence the position of a website on a search engine[4]. A tag can relate to anything, and tags can lead to information which might be false in nature and can potentially affect the sanctity of the country. Google's claim that it is a ‘mere aggregator’ might therefore not be correct. Search engines are responsible for documenting the URLs of websites based upon their tags, which brings them under the ambit of the Intermediary Guidelines, because "indexing" is a form of automated curation[5]. In Metropolitan International Schools Limited v. Design Technica Corporation, Google UK, and Google Inc.[6], it was made clear that the indexing done by Google was completely automated, with no human intervention and no possible way to control all the results appearing at the receiving end, so the court held that Google is “an intermediary but a different kind of intermediary.”[7]

Since the process is automated, it can be trained to target malicious tags; the problem arises in a jurisdiction like ours when legal authorities apply rules that were not developed to address and fulfil content removal requests to their fullest extent. Websites and the content listed among search results are created, uploaded, and owned by third parties, not by the search engine operator, which makes it legally and technically impossible for the operator to interfere with the content itself. However, suggestions and the placement of websites with malicious tags can be filtered through automated processes.[8] Our recent law is underdeveloped: if we look at the definition of ‘online curated content’[9], it covers only audio-visual content and no other type of curated content.
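The mechanics described above, indexing as automated curation and tag-based filtering, can be illustrated with a minimal sketch. The tag names, URLs, and blocklist below are purely hypothetical assumptions for illustration and are not drawn from any real search engine's implementation.

```python
# Hypothetical sketch of indexing as automated curation: an inverted index
# maps tags to URLs, and a blocklist of "malicious" tags filters results.
from collections import defaultdict


def build_index(pages):
    """Build an inverted index mapping each tag to the URLs carrying it."""
    index = defaultdict(set)
    for url, tags in pages.items():
        for tag in tags:
            index[tag].add(url)
    return index


def search(index, query_tag, blocklist=frozenset()):
    """Return URLs for a tag, dropping any URL that carries a blocked tag."""
    blocked_urls = set()
    for bad_tag in blocklist:
        blocked_urls |= index.get(bad_tag, set())
    return sorted(index.get(query_tag, set()) - blocked_urls)


# Illustrative pages and tags (assumptions, not real data).
pages = {
    "https://example.org/a": {"news", "education"},
    "https://example.org/b": {"news", "malicious-tag"},
}
index = build_index(pages)
print(search(index, "news", blocklist={"malicious-tag"}))
# → ['https://example.org/a']
```

The point of the sketch is that no human touches any individual result: curation happens entirely through the index and the filter rules, which is the sense in which indexing is "automated curation".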

However, if we look at the definition of ‘access control mechanism’[10], it covers measures for controlling access to online curated content, and ‘access services’ means any measure, including any technical measure, for controlling access to online curated content. This implies that, in the case of a search engine, a company would have to deploy mechanisms for tackling situations like spamdexing[11]. Even the definition of ‘content’[12] covers all electronic records, meaning that malicious tags associated with search engine optimisation can be controlled using the mechanisms deployed by a search engine. The difficulty in tackling all spamdexed or meta-tagged websites is that the tag library would have to be upgraded from time to time. So the argument that a search engine is a ‘mere aggregator’ might not be correct in its entirety.
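The kind of mechanism contemplated above, a spam-tag library that is updated from time to time, can be sketched as follows. The tags, threshold, and function names are illustrative assumptions, not any real search engine's actual method.

```python
# Hypothetical sketch: flagging likely spamdexed pages against a periodically
# updated "spam tag library". All tag names and thresholds are assumptions.

def is_likely_spamdexed(page_tags, spam_tag_library, threshold=0.5):
    """Flag a page when more than `threshold` of its tags are known spam tags."""
    if not page_tags:
        return False
    overlap = len(set(page_tags) & spam_tag_library)
    return overlap / len(set(page_tags)) > threshold


spam_tags = {"free-money", "miracle-cure"}   # initial tag library
spam_tags |= {"get-rich-quick"}              # periodic upgrade of the library

print(is_likely_spamdexed({"news", "free-money", "get-rich-quick"}, spam_tags))
# → True
```

The sketch makes the compliance burden concrete: the filter itself is trivial, but it is only as good as the tag library behind it, which is why that library must be maintained and upgraded over time.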

However, flexibility needs to be built into the definition of online curated content. Indexing URLs is also a form of curation, and what is curated and what is not should be covered under the compliances expected of a search engine. Currently, the definitions of ‘online curated content’ and ‘content’ are not broad enough to cover all of the aforementioned; controls like SEO compliances might prove beneficial for the companies.

Footnotes:
[1] Urs Gasser, “Regulating Search Engines: Taking Stock & Looking Ahead”, Yale Journal of Law and Technology, Vol. 8, Issue 1. Accessed on 01/06/2021.
[2] Information Technology (Intermediary Guidelines and Digital Media Ethics Code) Rules, 2021, MEITY. Accessed on 29/05/2021.
[3] Indu Bahn, “Google challenges new IT rules, says it’s not a social intermediary”, The Financial Express. Accessed on 03/06/2021.
[4] “How much important Tags are for SEO?”, Tech Supremo, important-tags-are-for-seo/. Accessed on 06/05/2021.
[5] Aurélie Névéol, James G. Mork and Alan R. Aronson, “Automatic Indexing of Specialized Documents: Using Generic vs. Domain-Specific Document Representations”, National Library of Medicine, pub2007036.pdf. Accessed on 04/06/2021.
[6] Metropolitan International Schools Limited v. Design Technica Corporation, Google UK, and Google Inc., [2009] EWHC 1765 (QB). Accessed on 05/06/2021.
[7] Supra note 6, para 55.
[8] Gönenç Gürkaynak, “Understanding Search Engines: Legal Perspective on Liability in the Internet”, Law Vista, gonenc-gurkaynak/. Accessed on 06/06/2021. “Spamdexing” is a form of SEO spamming. SEO is an abbreviation for Search Engine Optimization, which is the art of having your website optimized, or attractive, to the major search engines for optimal indexing. Spamdexing is the practice of creating websites that will be illegitimately indexed with a high position in the search engines.
[9] Supra note 2, Rule 1(q).
[10] Supra note 2, Rule 1(a).
[11] “SEO Spam: Spamdexing, Web Spam”. Accessed on 07/06/2021.
[12] Supra note 2, Rule 1(g).

Submitted by,

Mr. Prajanya Rathore,

Sr. Editor, AmicusX.