RE: Website Crawling

Website Crawling

A Web crawler is a computer program that browses the World Wide Web in a methodical, automated manner.  Other terms for Web crawlers are ants, automatic indexers, bots, Web spiders, Web robots, or (especially in the FOAF community) Web scutters.

This process is called Web crawling or spidering.  Many sites, in particular search engines, use spidering as a means of providing up-to-date data.  Web crawlers are mainly used to create a copy of all the visited pages for later processing by a search engine that will index the downloaded pages to provide fast searches.  Crawlers can also be used for automating maintenance tasks on a Web site, such as checking links or validating HTML code.  Also, crawlers can be used to gather specific types of information from Web pages, such as harvesting e-mail addresses (usually for sending spam).

A Web crawler is one type of bot, or software agent.  In general, it starts with a list of URLs to visit, called the seeds.  As the crawler visits these URLs, it identifies all the hyperlinks in the page and adds them to the list of URLs to visit, called the crawl frontier.  URLs from the frontier are recursively visited according to a set of policies.
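The seed and frontier idea described above can be sketched in a few lines of Python.  This is only a minimal illustration, not how any particular search engine does it.  The function name crawl, the get_links callback, the TOY_WEB dictionary and the example.com addresses are all made up for this sketch, and the visiting policy (first in, first out) is an assumption.  A real crawler would download each page and pull the links out of its HTML.

```python
from collections import deque
from typing import Callable, Iterable, List


def crawl(seeds: Iterable[str],
          get_links: Callable[[str], List[str]],
          max_pages: int = 100) -> List[str]:
    """Visit URLs starting from the seeds and return them in visit order."""
    frontier = deque()   # the crawl frontier: URLs waiting to be visited
    seen = set()         # every URL ever added, so nothing is queued twice
    for url in seeds:
        if url not in seen:
            seen.add(url)
            frontier.append(url)

    visited = []
    while frontier and len(visited) < max_pages:
        url = frontier.popleft()          # assumed policy: oldest URL first
        visited.append(url)
        for link in get_links(url):       # hyperlinks found on this page
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return visited


# A made-up, in-memory "web" so the sketch runs without any network access.
TOY_WEB = {
    "http://example.com/": ["http://example.com/a", "http://example.com/b"],
    "http://example.com/a": ["http://example.com/b", "http://example.com/c"],
    "http://example.com/b": ["http://example.com/"],
    "http://example.com/c": [],
}


def toy_get_links(url: str) -> List[str]:
    """Return the links on a page of the toy web (empty if unknown)."""
    return TOY_WEB.get(url, [])


if __name__ == "__main__":
    for page in crawl(["http://example.com/"], toy_get_links):
        print(page)
```

The seen set is what keeps the crawler from going around in circles when pages link back to each other.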

The large volume of the Web implies that the crawler can only download a limited number of Web pages within a given time, so it needs to prioritize its downloads.  The high rate of change implies that the pages might have already been updated or even deleted.          -- Wikipedia.
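One common way to prioritize downloads is to keep the waiting URLs in a priority queue instead of a plain list.  The sketch below assumes each URL already comes with a numeric priority; how that number is worked out is not covered in the text above, so the values and the PriorityFrontier class name here are purely hypothetical.

```python
import heapq
from typing import List, Tuple


class PriorityFrontier:
    """A crawl frontier that hands out the highest-priority URL first."""

    def __init__(self) -> None:
        self._heap: List[Tuple[float, int, str]] = []
        self._counter = 0  # tie-breaker: equal priorities come out in insertion order

    def push(self, url: str, priority: float) -> None:
        # heapq is a min-heap, so the priority is negated to pop the largest first.
        heapq.heappush(self._heap, (-priority, self._counter, url))
        self._counter += 1

    def pop(self) -> str:
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)


if __name__ == "__main__":
    frontier = PriorityFrontier()
    frontier.push("http://example.com/archive", 0.1)   # hypothetical scores
    frontier.push("http://example.com/", 0.9)
    frontier.push("http://example.com/news", 0.5)
    while len(frontier):
        print(frontier.pop())
```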

How big is the World Wide Web?  Billions of pages, not millions.  That is a bunch of website pages!  But that is just a part of the whole iceberg: it only counts the pages that the search engines have indexed and that the general public can see.

Prevent Crawling

No Index, No Follow

Are there certain areas of your website that you do not want indexed?  Some that you want private?  Maybe you have areas that are only meant for the employees of your company.  Maybe you have a page with insider information that you don't want the whole world to have access to.  Maybe these website pages need to be excluded from being indexed.

There are instructions that can be put into the back end of the website (typically a robots meta tag in the head of each page) to make this possible.  There are four combinations: Index, Follow; No Index, Follow; Index, No Follow; and No Index, No Follow.  Index refers to whether or not you want that particular page to be indexed.  Follow refers to whether or not you want the search engine spider to follow the links on that page.

Instructing the search engines in this manner to do exactly what you would like will help your website be more successful for your individual business needs.
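On most websites these instructions end up as a robots meta tag in the head section of the page.  The small Python helper below is just a sketch showing how the four combinations map onto that tag.  The function name robots_meta is made up for this example, and in Joomla! you would normally set this in the page or menu item options rather than write the tag by hand.

```python
def robots_meta(index: bool, follow: bool) -> str:
    """Build the robots meta tag for one of the four Index/Follow combinations."""
    directives = [
        "index" if index else "noindex",     # may this page be indexed?
        "follow" if follow else "nofollow",  # may the spider follow its links?
    ]
    return '<meta name="robots" content="{}">'.format(", ".join(directives))


if __name__ == "__main__":
    # Print all four combinations described in the text.
    for index in (True, False):
        for follow in (True, False):
            print(robots_meta(index, follow))
```

For a private employee page, robots_meta(False, False) gives the No Index, No Follow version of the tag.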

Here, keeping the spiders away from pages that do not need to be indexed will help your PageRank overall.

301 Redirect

To a search engine, http://www.mywebsite.com and http://mywebsite.com can be seen as two totally different websites, even though someone typing either address is probably going to wind up in the same place.  This can divide your possible PageRank into two parts and hurt your website's potential in searches.

A 301 Redirect can merge these website addresses together from a search engine's perspective and therefore cut its workload and increase your PageRank to its maximum potential.

OK.  Caution now.  Here is some code for you that should work in most Joomla! situations.  And note that I said most.  If you don't know what to do with this stuff, don't try this at home, and please leave it to the professionals!  Add this to the .htaccess file in your website's home (root) directory and replace mywebsite and .com with your own relevant domain name information.

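     # Always use www in the domain
     # (needs mod_rewrite and "RewriteEngine On" earlier in the file)
     # Match mywebsite.com and capture any subdomain prefix as %1
     RewriteCond %{HTTP_HOST} ^([a-z.]+)?mywebsite\.com$ [NC]
     # Skip hosts that already start with www.
     RewriteCond %{HTTP_HOST} !^www\. [NC]
     # Send a permanent (301) redirect to the www address, keeping the requested path
     RewriteRule .? http://www.%1mywebsite.com%{REQUEST_URI} [R=301,L]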

 

This is a 301 redirect.
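If the rewrite rules are hard to read, the same decision can be spelled out in plain Python.  This is only a sketch for understanding and testing the logic; it does not replace the .htaccess code, and the function name www_redirect_target is made up for the example.  It takes the host name and the requested path and returns the address the visitor would be redirected to, or None when no redirect applies.

```python
import re
from typing import Optional

# Same pattern as the first RewriteCond; [NC] becomes re.IGNORECASE.
_HOST_RE = re.compile(r"^([a-z.]+)?mywebsite\.com$", re.IGNORECASE)


def www_redirect_target(host: str, request_uri: str) -> Optional[str]:
    """Return the 301 target for this request, or None if no redirect is needed."""
    match = _HOST_RE.match(host)
    if match is None:
        return None                      # not our domain: first condition fails
    if host.lower().startswith("www."):
        return None                      # already on www: second condition fails
    prefix = match.group(1) or ""        # plays the role of %1 in the RewriteRule
    return "http://www.{}mywebsite.com{}".format(prefix, request_uri)


if __name__ == "__main__":
    print(www_redirect_target("mywebsite.com", "/contact"))
    print(www_redirect_target("www.mywebsite.com", "/contact"))
```

Note that, just like the rule it mirrors, a host such as blog.mywebsite.com is sent to www.blog.mywebsite.com, so check that this is what you want before using it on a site with subdomains.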

And if you need a good code editor, we like Bluefish.  It's free and open source, which makes it much less expensive than Dreamweaver ($399.00).  We like free!

      Step Ten -- Increasing Prominence

RE: can assist you with any of these basic SEO tasks.  Please contact us with any requirements that you may have.

 
 
