Don’t Index My Pages

March 4, 2009 · Filed Under SEO 

Wednesday’s technical post day.

After my post about “Are Your Web Pages Indexed?” there was a comment left by Dante from www.datingattractionreviews.com asking “how do I keep some pages from being indexed and listed in the search results?” Before I attempt to answer his question I need to warn you (the reader) that some of the content on his site is mildly risqué. I take no responsibility for any of the content or information on his site if it offends your sensibilities.

I looked over his site to work out what he might not want a search robot to index and/or follow. The site is relatively new and it appears there are unfinished pages. That is, pages he has put some content on but has not completely finished, and that have unfortunately already been uploaded to his server.

So, here’s Captain Obvious’ first tip: Don’t upload pages that are NOT completed. You can exclude this type of page from being indexed until you finish it, and that’s what this post is about. However, if you’re like me, you forget that you set the page to “noindex” and you don’t change the parameter when the page is completed. Then, about 2 weeks later when checking your stats, you discover it. Two weeks of potential readers and/or customers are lost.

Okay, let’s say there are completed pages that you don’t want indexed so they won’t show up in the search engine results. For example, I often find pages in search results for downloading software, ebooks, photos and other digital products that the site owner didn’t intend to be open to the public. But they are open, because the owner either neglected to block the robots or simply didn’t know about indexing.

The first thing we need to understand is that the search engines want to index everything they find. Unless you tell them otherwise, these “bots” crawl EVERYTHING on your site. You can keep them from crawling some of your pages with a “robots.txt” file. This is the most common method and has been in use since 1994. To stop a page from being indexed, or the links on it from being followed, you use a robots meta tag on the page itself. See the table below.

This table is from SEOBOOK.com which has the most comprehensive explanations about all things related to search engines and SEO. Please visit their site and join as a member. You can get a free 7 day email course about SEO Success, as well. I have no financial interest in their site or products. It’s simply a terrific site for learning about all things SEO. They constructed this table based on an interview between Eric Enge and Matt Cutts from Google. Read the interview.

Method: robots.txt
    Crawled by Googlebot? No.
    Appears in index? If the document is linked to, it may appear URL only, or with data from links or trusted third party data sources like the ODP.
    Consumes PageRank? Yes.
    Risks? Waste? People can look at your robots.txt file to see what content you do not want indexed. Many new launches are discovered by people watching for changes in a robots.txt file. Using wildcards incorrectly can be expensive!

Method: robots meta noindex tag
    Crawled by Googlebot? Yes.
    Appears in index? No.
    Consumes PageRank? Yes, but can pass on much of its PageRank by linking to other pages.
    Risks? Waste? Links on a noindex page are still crawled by search spiders even if the page does not appear in the search results (unless they are used in conjunction with nofollow on that page). Pages using robots meta nofollow (next entry) in conjunction with noindex do accumulate PageRank, but do not pass it on to other pages.

Method: robots meta nofollow tag
    Crawled by Googlebot? Destination page only crawled if linked to from other documents.
    Appears in index? Destination page only appears if linked to from other documents.
    Consumes PageRank? No, PageRank not passed to destination.
    Risks? Waste? If you are pushing significant PageRank into a page and do not allow PageRank to flow out from that page you may waste significant link equity.

Method: link rel=nofollow
    Crawled by Googlebot? Destination page only crawled if linked to from other documents.
    Appears in index? Destination page only appears if linked to from other documents.
    Consumes PageRank? No, PageRank not passed to destination.
    Risks? Waste? If you are doing something borderline spammy and are using nofollow on internal links to sculpt PageRank then you look more like an SEO and are more likely to be penalized by a Google engineer for “search spam”.

Please note that Matt did not say they are more likely to ban you for using rel=nofollow, but they have on multiple occasions stated that they treat issues differently if they think it was an accident done by an ignorant person or a malicious attempt to spam their search engine by a known SEO blackhat trick.

The “nofollow” parameter tells robots not to follow the links on a page, so no PageRank from that page is passed to or shared with the pages linked from it. By itself it does not keep the page out of the index; that is the job of “noindex”. For example, web pages like “About Us” and “Contact Us” are typically high PageRank pages because nearly every page on your site links to them, and they pass that rank on to the pages they link to. Therefore, if the “About Us” page links to 9 other pages, each of those pages gets roughly one ninth (about 11%) of the PageRank the “About Us” page can pass on. Every extra link effectively dilutes the share each linked page receives.

In order to keep the PageRank from being excessively diluted or spread, link to this type of page from your “Home” page without a “nofollow” parameter. Then use a “nofollow” parameter on the links to the “About Us” page from every other page on your site. You now have the PageRank split 50/50 between the “About Us” page and your “Home” page. You are sculpting the PageRank for your Home page. And a little sculpting is generally okay with the search engines. Excessive sculpting is not okay and can get your site penalized.
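To make the dilution arithmetic concrete, here is a minimal Python sketch. It assumes the simple equal-split model used in this post, where a page's passable PageRank is divided evenly among its followed links and nofollowed links are left out of the split. The function name is my own, and the real formula Google uses (damping factor and all) is more involved, so treat this as an illustration only.

```python
def share_per_followed_link(total_links: int, nofollowed_links: int = 0) -> float:
    """Fraction of a page's passable PageRank that each followed link receives.

    Assumption: the simplified equal-split model described in this post,
    with nofollowed links excluded from the split.
    """
    followed = total_links - nofollowed_links
    if total_links < 0 or nofollowed_links < 0 or followed < 0:
        raise ValueError("link counts must be non-negative and consistent")
    if followed == 0:
        return 0.0  # nothing is passed on when every link is nofollowed
    return 1.0 / followed


if __name__ == "__main__":
    # An "About Us" page with 9 ordinary links.
    print(f"9 followed links: {share_per_followed_link(9):.1%} each")
    # A page with 10 links where 9 carry rel=nofollow.
    print(f"10 links, 9 nofollowed: {share_per_followed_link(10, 9):.1%} to the one left")
```

Under this model nine followed links get about 11% each, while nofollowing nine out of ten links pushes everything through the single remaining link, which is the aggressive kind of sculpting to be careful with.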

Question: What is the best way to sculpt your PageRank, if you need those links on a page?

Answer: rel=nofollow on the link itself…all the other techniques have you spending PageRank on the pages you do not want indexed.

Question: How much does Google frown on using rel=nofollow to sculpt internal PageRank?

Answer: It is NOT officially against the rules to use rel=nofollow on links. However, if you are already fairly aggressive with your SEO techniques then using it confirms that you probably know some SEO tricks. And Google is likely to police you more closely and more often.

Question: What do you mean by “fairly aggressive with your SEO techniques”?

Answer: That’s like asking what’s going on inside the head of every Google site engineer and whether it will change or remain the same over time. My advice is not to use rel=nofollow on internal links unless your site is pretty darn clean of SEO. Just as an example, let’s say that you have ten links on a page to 10 other pages on your site. You use rel=nofollow on nine of the ten links so all the PageRank is forced to just 1 of the ten. That’s pretty aggressive and likely would be flagged by Google.

Question: Is it okay to use “noindex, nofollow” in the page’s meta data?

Answer: Yes. Google can’t afford to have people believe that using those parameters in meta data or occasionally on page links is a poor quality attribute. These nofollow and noindex techniques are needed in some cases. Login pages need a noindex attribute and spammer links need a nofollow attribute. That is why so many blogs have their comments set up as nofollow. There are spam bots that seek out blogs to leave a crappy generic comment with 10 links attached.
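Since it is easy to forget which pages carry these settings, here is a small Python sketch that reads a page's HTML and reports the directives found in its robots meta tag. The class name, function name and sample page are my own inventions for illustration; it uses only the standard library's html.parser and is not how any search engine actually reads the tag.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives of every <meta name="robots"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute names arrive lowercased; values keep their original case.
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() != "robots":
            return
        for directive in attr_map.get("content", "").split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def robots_directives(html: str) -> set:
    """Return the lowercased robots directives found in the given HTML."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical login page marked as noindex, nofollow.
SAMPLE_PAGE = """<html><head>
<title>Members login</title>
<meta name="ROBOTS" content="NOINDEX,NOFOLLOW">
</head><body>Login form goes here.</body></html>"""

if __name__ == "__main__":
    print(sorted(robots_directives(SAMPLE_PAGE)))
```

Running something like this over your finished pages is one way to spot a leftover “noindex” before two weeks of traffic are lost.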

If you don’t want a page indexed or any of the links on that page followed, open its source code in an HTML editor and look for the Meta Keyword and Meta Description lines in the head section. Next to them, add a new line of its own: <meta name="ROBOTS" content="NOINDEX,NOFOLLOW">. It does not go inside the keyword or description fields. Or, if you want to keep robots out of an entire folder on your site, do this: open Notepad and type in:

User-agent: *
Disallow: /cgi-bin/
Disallow: /secretfolder/

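Save the file as robots.txt and upload it to the top-level (root) folder of your site, so that it is reachable at /robots.txt on your domain. Robots only look for the file there, so uploading it into the folder you want to protect has no effect.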
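If you want to check what a set of robots.txt rules actually blocks before you upload them, Python's standard urllib.robotparser module can evaluate them. This is a minimal sketch using the same rules as above; the helper name and the www.example.com URLs are placeholders of mine, and it only tells you what a well-behaved robot would do, not what any particular search engine does.

```python
from urllib.robotparser import RobotFileParser

# The same rules as in the robots.txt file shown above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /secretfolder/
"""


def is_crawl_allowed(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if a robot that honours robots_txt may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Placeholder URLs on a placeholder domain.
    for url in (
        "http://www.example.com/index.html",
        "http://www.example.com/secretfolder/ebook.pdf",
    ):
        verdict = "allowed" if is_crawl_allowed(ROBOTS_TXT, url) else "blocked"
        print(url, verdict)
```

Keep in mind that a blocked URL can still show up in results as a bare link if other pages point to it, as the table earlier in this post notes.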

Well, this post has gone on long enough, or maybe too long. Check out some of the information on the sites I mentioned earlier in this post.
