Page masking (more commonly known as cloaking) can be broadly defined as a technique for delivering different web pages to different visitors. People use page masking for two main reasons:
i) It lets them create a separate page optimized for each search engine, plus another page, aesthetically pleasing and designed for their human visitors. When a search engine spider visits the site, it is served the page optimized for that engine; when a person visits, the page designed for humans is displayed. The main benefit is that human visitors never have to see the search-engine pages, which are often unsightly and stuffed with repeated keywords.
ii) It lets them hide the source code of the optimized pages they create, preventing competitors from copying it.
Page masking is implemented with special masking scripts. A masking script installed on the server detects whether a page request comes from a search engine or a human. If a search engine requests a page, the script delivers the page optimized for that engine; if a human requests it, the script delivers the page designed for human visitors.
Masking scripts can detect whether a search engine or a human is accessing the site in two main ways:
i) The first and simplest method is to check the User-Agent variable. Whenever anyone (whether a search engine spider or a human-operated browser) requests a page from the site, it reports a user agent name to the site. Generally, when a search engine spider requests a page, the User-Agent variable contains the name of that search engine. So if the masking script finds a search engine's name in the User-Agent variable, it delivers the page optimized for that engine; if it does not, it assumes the request came from a human and delivers the page designed for humans.
However, while this is the easiest way to implement a masking script, it is also the least secure. User-Agent values are trivial to fake, so anyone who wants to see the optimized pages being sent to the different search engines can do so easily.
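A minimal sketch of a User-Agent-based masking script might look like the following. The spider name tokens and page file names are assumptions for illustration; a real script would run server-side and return actual page content.

```python
# Hypothetical spider tokens; a real script would keep a fuller list.
SPIDER_NAMES = ("googlebot", "bingbot", "slurp")

def serve_page(user_agent: str) -> str:
    """Return the optimized page for known spiders, the human page otherwise."""
    ua = user_agent.lower()
    if any(name in ua for name in SPIDER_NAMES):
        return "optimized-page.html"
    return "human-page.html"

# A real spider identifies itself in the User-Agent header...
print(serve_page("Mozilla/5.0 (compatible; Googlebot/2.1)"))  # optimized-page.html

# ...but anyone can send the same string, which is why this method is weak.
print(serve_page("MyBrowser/1.0 pretending to be Googlebot"))  # optimized-page.html
```

The second call illustrates the weakness described above: a simple substring check cannot tell a genuine spider from a browser that merely claims the spider's name.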
ii) The second, more complicated method is IP (Internet Protocol) based masking. This uses an IP database containing the addresses of all known search engine spiders. When a visitor (search engine or human) requests a page, the masking script checks the visitor's IP address. If the address is in the database, the script knows the visitor is a search engine and serves the page optimized for that engine; if it is not, the script assumes a human requested the page and serves the page meant for human visitors.
Although more complicated than User-Agent-based masking, IP-based masking is more reliable and secure, because IP addresses are much harder to forge.
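An IP-based check can be sketched the same way. The spider networks below are placeholders; a real masking script would consult a maintained database of spider IP addresses.

```python
import ipaddress

# Hypothetical spider address ranges standing in for a real IP database.
KNOWN_SPIDER_NETWORKS = [
    ipaddress.ip_network("66.249.64.0/19"),   # placeholder spider range
    ipaddress.ip_network("157.55.39.0/24"),   # placeholder spider range
]

def serve_page(visitor_ip: str) -> str:
    """Return the optimized page if the visitor's IP is in the spider database."""
    ip = ipaddress.ip_address(visitor_ip)
    if any(ip in net for net in KNOWN_SPIDER_NETWORKS):
        return "optimized-page.html"
    return "human-page.html"

print(serve_page("66.249.66.1"))   # optimized-page.html (in a listed range)
print(serve_page("203.0.113.5"))   # human-page.html (unknown address)
```

Unlike the User-Agent string, the source IP is taken from the network connection itself, which is why a visitor cannot simply claim a spider's address the way they can claim a spider's name.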
Now that you understand what page masking is and how it is implemented, the question is whether you should use it. The one-word answer is “no”. The reason is simple: search engines don’t like it, and if they discover that your site uses masking, they are likely to ban it from their index. Search engines dislike page masking because it stops them from seeing the same page their users will see; if they cannot do that, they cannot confidently deliver relevant results. In the past, many people created pages optimized for very popular keywords and then used page masking to send visitors to real sites that had nothing to do with those keywords. If search engines allowed this, they would suffer, because their users would abandon them for another engine that returns more relevant results.
Of course, one question is how search engines can detect whether a site uses page masking. There are three ways to do this:
i) If the site uses User-Agent-based masking, the search engine can simply send a spider that does not report the engine’s name in the User-Agent variable. If the page delivered to this anonymous spider differs from the page delivered to a spider that does report the search engine’s name, the engine knows the site uses page masking.
ii) If the site uses IP-based masking, the search engine can send a spider from a brand-new IP address it has never used before. Because the address is new, no masking IP database can already contain it. If the page delivered to the spider at the new address differs from the page delivered to a spider at a known address, the engine knows the site uses page masking.
iii) A human representative of the search engine can visit the site. If the page she sees differs from the page sent to the search engine’s spider, she knows the site uses masking. So when it comes to page masking, my advice is simple: don’t even consider using it.
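The two automated checks above (i and ii) can be sketched as follows. The site functions stand in for real HTTP requests to a masking site; every User-Agent string and IP address here is hypothetical.

```python
# Stand-ins for a site that masks on the User-Agent and a site that masks
# on the visitor's IP address; both take (user_agent, source_ip).

def ua_masking_site(user_agent: str, source_ip: str) -> str:
    return "optimized page" if "googlebot" in user_agent.lower() else "human page"

def ip_masking_site(user_agent: str, source_ip: str) -> str:
    # Pretend 66.249.x.x is the range listed in the site's IP database.
    return "optimized page" if source_ip.startswith("66.249.") else "human page"

def masking_detected(site) -> bool:
    # Check i): same known spider IP, with and without a spider name in the UA.
    with_name = site("Googlebot/2.1", "66.249.66.1")
    without_name = site("Mozilla/5.0", "66.249.66.1")
    if with_name != without_name:
        return True
    # Check ii): same spider UA, from a known IP and then from a fresh IP
    # that no masking database could already contain.
    from_known_ip = site("Googlebot/2.1", "66.249.66.1")
    from_fresh_ip = site("Googlebot/2.1", "198.51.100.7")
    return from_known_ip != from_fresh_ip

print(masking_detected(ua_masking_site))  # True
print(masking_detected(ip_masking_site))  # True
```

A site that serves everyone the same page passes both checks, which is exactly the behavior search engines want to verify.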