Don’t Index My Pages

March 4, 2009 · Filed Under SEO 

Wednesday’s technical post day.

After my post about “Are Your Web Pages Indexed?” there was a comment left by Dante from www.datingattractionreviews.com asking, “How do I keep some pages from being indexed and listed in the search results?” Before I attempt to answer his question I need to warn you (the reader) that some of the content on his site is mildly risqué. I take no responsibility for any of the content or information on his site if it offends your sensibilities.

I researched his site in an attempt to discover what it is that he would not want a search robot to index and/or follow. The site is relatively new and it appears there are unfinished pages: pages that he has put some content on but has not completely finished and, unfortunately, has already uploaded to his server.

So, here’s Captain Obvious’ first tip: Don’t upload pages that are NOT completed. You can exclude this type of page from being indexed until you finish it, and that’s what this post is about. However, if you’re like me, you will forget that you set the page up as “no index” and won’t change the parameter when it’s completed. About 2 weeks later, when checking my stats, I discover it. Two weeks of potential readers and/or customers are lost.

Okay, let’s say there are completed pages that you don’t want indexed so they won’t show up in the search engine results. For example, I often find pages in search results for downloading software, ebooks, photos and other digital products that the site owner didn’t intend to be open to the public. But they are, because the owner either neglected to block the robots or simply didn’t know about indexing.

The first thing we need to understand is that the search engines want to index everything they find. Left alone, these “bots” crawl EVERYTHING on your site, but you can keep them from crawling every page. You can block them from crawling pages or folders in your “robots.txt” file, and you can stop them from indexing a page and/or following the links listed on it with robots meta tags. The robots.txt file is the most common method and has been in use since the mid-1990s. The table below compares the methods.

This table is from SEOBOOK.com which has the most comprehensive explanations about all things related to search engines and SEO. Please visit their site and join as a member. You can get a free 7 day email course about SEO Success, as well. I have no financial interest in their site or products. It’s simply a terrific site for learning about all things SEO. They constructed this table based on an interview between Eric Enge and Matt Cutts from Google. Read the interview.

Method: robots.txt
  Crawled by Googlebot? No.
  Appears in Index? If the document is linked to, it may appear URL only, or with data from links or trusted third party data sources like the ODP.
  Consumes PageRank? Yes.
  Risks? Waste? People can look at your robots.txt file to see what content you do not want indexed. Many new launches are discovered by people watching for changes in a robots.txt file. Using wildcards incorrectly can be expensive!

Method: robots meta noindex tag
  Crawled by Googlebot? Yes.
  Appears in Index? No.
  Consumes PageRank? Yes, but it can pass on much of its PageRank by linking to other pages.
  Risks? Waste? Links on a noindex page are still crawled by search spiders even if the page does not appear in the search results (unless they are used in conjunction with nofollow on that page). Pages using robots meta nofollow (the next method) in conjunction with noindex do accumulate PageRank, but do not pass it on to other pages.

Method: robots meta nofollow tag
  Crawled by Googlebot? The destination page is only crawled if linked to from other documents.
  Appears in Index? The destination page only appears if linked to from other documents.
  Consumes PageRank? No, PageRank is not passed to the destination.
  Risks? Waste? If you are pushing significant PageRank into a page and do not allow PageRank to flow out from that page you may waste significant link equity.

Method: link rel=nofollow
  Crawled by Googlebot? The destination page is only crawled if linked to from other documents.
  Appears in Index? The destination page only appears if linked to from other documents.
  Consumes PageRank? No, PageRank is not passed to the destination.
  Risks? Waste? If you are doing something borderline spammy and are using nofollow on internal links to sculpt PageRank then you look more like an SEO and are more likely to be penalized by a Google engineer for “search spam”.

Please note that Matt did not say they are more likely to ban you for using rel=nofollow, but they have on multiple occasions stated that they treat issues differently if they think it was an accident done by an ignorant person or a malicious attempt to spam their search engine by a known SEO blackhat trick.

The “nofollow” parameter stops search robots from following the links on a page, and it stops any PageRank from that page being passed to or shared with the pages linked from it. It does not keep the page itself out of the index; that is the job of “noindex”. For example, web pages like “About Us” and “Contact Us” are typically high PageRank pages because almost every other page on your site links to them, and they share that rank with every page they link to. Therefore, if the “About Us” page links to 9 other pages, each page gets roughly one ninth (about 11%) of the PageRank that the “About Us” page passes on. This effectively dilutes what each link is worth.
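The dilution arithmetic can be sketched in a few lines of Python. This is only the simplified even-split model used in this post, not how Google actually computes PageRank, and the function name and the starting score of 1.0 are made up for illustration.

```python
def share_per_link(page_rank: float, followed_links: int) -> float:
    """Split a page's PageRank evenly across its followed links.

    Simplified model from the post: every followed link gets an equal share.
    Links marked nofollow are assumed to be left out of the count.
    """
    if followed_links <= 0:
        return 0.0
    return page_rank / followed_links


about_us_rank = 1.0  # hypothetical score, chosen only for illustration

for links in (1, 2, 9):
    share = share_per_link(about_us_rank, links)
    print(f"{links} followed link(s): each receives {share:.0%} of the rank passed on")
```

The fewer followed links a page has, the bigger the slice each linked page receives, which is the whole idea behind sculpting.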

In order to keep the PageRank from being excessively diluted or spread, link to these types of pages from your “Home” page without a “nofollow” parameter. Then use a “nofollow” parameter on the links from any other pages on your site that link to the “About Us” page. You now have the PageRank split 50/50 between the “About Us” page and your “Home” page. You are sculpting the PageRank for your Home page. A little sculpting is generally okay with the search engines. Excessive sculpting is not okay and will get your site penalized.

Question: What is the best way to sculpt your PageRank, if you need those links on a page?

Answer: rel=nofollow on the link itself…all the other techniques have you spending PageRank on the pages you do not want indexed.

Question: How much does Google frown on using rel=nofollow to sculpt internal PageRank?

Answer: It is NOT officially against the rules to use rel=nofollow on links. However, if you are already fairly aggressive with your SEO techniques then using it confirms that you probably know some SEO tricks, and Google is likely to police you more closely and more often.

Question: What do you mean by “fairly aggressive with your SEO techniques”?

Answer: That’s like asking what’s going on inside the head of every Google site engineer and whether it will change or remain the same over time. My advice is not to use rel=nofollow on internal links unless your site is pretty darn clean of SEO. Just as an example, let’s say that you have ten links on a page to ten other pages on your site. You use rel=nofollow on nine of the ten links so all the PageRank is forced to just one of the ten. That’s pretty aggressive and likely would be flagged by Google.
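To see how heavily a page leans on rel=nofollow, you can count its links. Here is a minimal sketch using Python’s standard html.parser module; the class name, function name and sample page are made up for illustration, and it makes no judgment about what ratio Google would consider too aggressive.

```python
from html.parser import HTMLParser


class LinkAudit(HTMLParser):
    """Count the links on a page and how many carry rel=nofollow."""

    def __init__(self):
        super().__init__()
        self.total = 0
        self.nofollow = 0

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        if "href" not in attributes:
            return  # named anchors without a destination are not links
        self.total += 1
        # rel can hold several space-separated values, e.g. "external nofollow"
        rel_values = (attributes.get("rel") or "").lower().split()
        if "nofollow" in rel_values:
            self.nofollow += 1


def audit_links(html):
    """Return (total links, nofollow links) for a chunk of HTML."""
    parser = LinkAudit()
    parser.feed(html)
    return parser.total, parser.nofollow


# Hypothetical page: ten links, nine of them marked nofollow.
sample_page = '<a href="/page-1.html">Page 1</a>' + "".join(
    f'<a href="/page-{n}.html" rel="nofollow">Page {n}</a>' for n in range(2, 11)
)

total, nofollow = audit_links(sample_page)
print(f"{nofollow} of {total} links use rel=nofollow")
```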

Question: Is it okay to use “noindex, nofollow” in the page’s meta data?

Answer: Yes. Google can’t afford to have people believe that using those parameters in meta data or occasionally on page links is a poor quality attribute. These nofollow and noindex techniques are needed in some cases. Login pages need a noindex attribute and spammer links need a nofollow attribute. That is why so many blogs have their comments set up as nofollow. There are spam bots that seek out blogs to leave a crappy generic comment with 10 links attached.
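If you are not sure which of these directives a page currently carries, a short script can read them from its robots meta tag. This is a rough sketch using Python’s standard html.parser module; the class name, function name and sample page are hypothetical.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() != "robots":
            return
        content = attributes.get("content") or ""
        for directive in content.split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def robots_directives(html):
    """Return the set of robots meta directives in a page, lowercased."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical login page that should stay out of the search results.
login_page = (
    "<html><head><title>Login</title>"
    '<meta name="ROBOTS" content="NOINDEX, NOFOLLOW">'
    "</head><body></body></html>"
)

found = robots_directives(login_page)
print("noindex" in found, "nofollow" in found)
```

Running something like this over your finished pages is one way to catch a “no index” setting you forgot to remove.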

If you don’t want a page indexed or any of the links on that page followed, open its source code in an HTML editor and find the <head> section, where the Meta Keyword and Meta Description lines live. Leave those lines as they are and add a separate robots meta tag next to them: <meta name="robots" content="noindex,nofollow">. Or, if you want to keep the robots out of an entire folder on your site, do this: open Notepad and type in -

User-agent: *
Disallow: /cgi-bin/
Disallow: /secretfolder/

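Save the file as robots.txt and upload it to the root folder of your site (the same folder as your home page), not to the folder you want to protect. Search robots only look for the file at the top level of your domain. Remember that anyone can read this file, so it keeps well-behaved robots out but does not hide the folder from people.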
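Before uploading, you can check that the rules block what you expect. Here is a minimal sketch using Python’s standard urllib.robotparser module; the helper name and the test paths are made up for illustration, and it only shows how a robot that obeys robots.txt would read the file above.

```python
from urllib.robotparser import RobotFileParser

# The same rules as the robots.txt file above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /secretfolder/
"""


def is_allowed(path, robots_txt=ROBOTS_TXT, user_agent="Googlebot"):
    """Return True if a well-behaved robot may crawl the given path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


# Hypothetical paths, used only to exercise the rules.
for path in ("/index.html", "/secretfolder/ebook.pdf", "/cgi-bin/form.pl"):
    status = "allowed" if is_allowed(path) else "blocked"
    print(f"{path}: {status}")
```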

Well, this post has gone on long enough, or maybe too long. Check out some of the information on the sites I mentioned earlier in this post.
