Use Sitemaps and Google Will Crawl Your Site 5 Times More

Status
Not open for further replies.

Mr Happy

So I got two brand new domains with zero backlinks. Each has the exact same number of links per page, with a similar layout and similar targeted keywords. The script is custom but based on a similar layout to WordPress. I gave each domain the same small number of inbound links so Google could find the site.

So basically two small pretty much identical sites with the same amount of content.

The only difference is that one has a sitemap and RSS feeds that Google can use and the other doesn't have any. At first I had just forgotten to enable them for a few days, but then I thought I'd leave it for a while longer to see the difference.


The difference is massive. Google crawls the site that has a sitemap and RSS feeds 5 times more than the one without, and that number is growing at a much faster rate.

Now everyone tells you to add a site map because it's better for your site but I never expected it to have this much of an impact.

So what do you do now? Well, if you don't have a sitemap you're killing yourself. Get one now!

Now make a robots.txt file and upload it to the root of your site.

I'm not really interested in what you allow or disallow Google to crawl; you can just Google that stuff yourself. But at the end of the robots.txt file I want you to add a link to the sitemap, as it helps search engines find it way, way faster.

Code:
User-agent: *
Sitemap: http://example.com/sitemap.xml
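
If you want to double-check that the Sitemap line is readable, here's a minimal sketch in Python using the standard library's urllib.robotparser (site_maps() needs Python 3.8 or newer). The robots.txt text is just the example from this post, and find_sitemaps is a made-up helper name, not something Google provides.

```python
from urllib.robotparser import RobotFileParser

def find_sitemaps(robots_txt: str) -> list[str]:
    """Return the Sitemap URLs declared in a robots.txt body."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present
    return parser.site_maps() or []

# Same content as the example robots.txt above
robots_txt = """User-agent: *
Sitemap: http://example.com/sitemap.xml
"""

print(find_sitemaps(robots_txt))
```

An empty list means there's no Sitemap line in the file.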

If you have RSS feeds (which you should, and if you don't, get them) then you can add them too, like below. Change the links to whatever your RSS feeds are.


Code:
User-agent: *
Sitemap: http://example.com/rss/
Sitemap: http://example.com/rss/app/
Sitemap: http://example.com/rss/movie/
Sitemap: http://example.com/rss/game/
Sitemap: http://example.com/rss/music/
Sitemap: http://example.com/rss/tv/
Sitemap: http://example.com/rss/ebook/
Sitemap: http://example.com/sitemap.xml
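
If you have a lot of feeds, typing these lines by hand gets tedious. Here's a rough Python sketch that builds the same file from a list of URLs. build_robots_txt is a hypothetical helper, and the section names are just the ones from my example above, so swap in your own.

```python
def build_robots_txt(sitemap_urls, user_agent="*"):
    """Build a robots.txt body with one Sitemap line per URL."""
    lines = [f"User-agent: {user_agent}"]
    lines += [f"Sitemap: {url}" for url in sitemap_urls]
    return "\n".join(lines) + "\n"

base = "http://example.com"  # placeholder domain from the example
sections = ["app", "movie", "game", "music", "tv", "ebook"]

# Main feed first, then one feed per section, then the XML sitemap
urls = [f"{base}/rss/"]
urls += [f"{base}/rss/{name}/" for name in sections]
urls.append(f"{base}/sitemap.xml")

print(build_robots_txt(urls), end="")
```

This only writes the User-agent and Sitemap lines. Any Allow or Disallow rules you want still have to be added yourself.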


Seriously, this will really help your site. I'm amazed how much of a difference it makes. Once you've added it to the robots.txt file you can see in the Google Webmaster Area that it has been accepted. They will usually find it in less than a day and you don't even have to manually submit it.

[slide]http://screensnapr.com/u/u46vbu.png[/slide]

If you don't have a Google Webmaster Area account you're missing out too. Get one straight after reading this topic, as the info they have is invaluable.

Help me spread the message. When you're reviewing a site, don't just say get a favicon. Tell them to add a sitemap too if they don't have one.
Anyone who asks me for help in future won't get any unless they have a sitemap :P

Any questions let me know


EDIT:

Another tip. If you have a huge site you can break up your sitemaps. Google has limits on how big a sitemap should be: no more than 50,000 URLs and a max 10MB file size (after un-gzipping).

nxo5ll.png


Here's an example of a site and the way I split them up.

5jfxn0.png
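
For anyone who wants to script the splitting, here's a minimal Python sketch of one way to do it: chunk the URL list at the 50,000 limit, write one urlset per chunk, and tie them together with a sitemap index file. The file names (sitemap-1.xml and so on) and the split_sitemap helper are made up for illustration and aren't how the site in the screenshot does it. It only enforces the URL count, so you would still need to check the file size limit yourself.

```python
import xml.etree.ElementTree as ET

NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS = 50000  # per-file URL limit mentioned above

def chunk(urls, size=MAX_URLS):
    """Yield consecutive slices of at most `size` URLs."""
    for i in range(0, len(urls), size):
        yield urls[i:i + size]

def build_xml(root_tag, item_tag, locs):
    """Serialize a <urlset> or <sitemapindex> holding the given locations."""
    root = ET.Element(root_tag, xmlns=NS)
    for loc in locs:
        ET.SubElement(ET.SubElement(root, item_tag), "loc").text = loc
    return ET.tostring(root, encoding="unicode")

def split_sitemap(urls, base, size=MAX_URLS):
    """Return {filename: xml} for the child sitemaps plus an index file."""
    files = {}
    for n, part in enumerate(chunk(urls, size), start=1):
        files[f"sitemap-{n}.xml"] = build_xml("urlset", "url", part)
    # The index lists every child file; it is built before being added
    child_locs = [f"{base}/{name}" for name in files]
    files["sitemap.xml"] = build_xml("sitemapindex", "sitemap", child_locs)
    return files

pages = [f"http://example.com/page/{i}/" for i in range(120000)]
files = split_sitemap(pages, "http://example.com")
print(sorted(files))
```

With this layout only sitemap.xml needs to go in robots.txt, since the index points to the rest.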
 
My Robots.txt

Code:
# Allow Archiver
User-agent: ia_archiver
Allow: /


User-agent: Slurp
Crawl-delay: 60


User-agent: *

Disallow: *.php
Disallow: *.js
Disallow: *.jsp
Disallow: *.cfm
Disallow: *.asp
Disallow: *.html
Disallow: *.htm
Disallow: *.aspx
Disallow: *.cgi

Disallow: /includes/
Disallow: /install/
Disallow: /customavatars/
Disallow: /archive/
Disallow: /sitemap/
Disallow: /calendar/
Disallow: /activeusers/
Disallow: /go/
 
I have all of this, I just don't have my RSS set up. Any chance of a little explanation on how to sort it out?

If it's for the site in your profile then WordPress already has RSS feeds.

The default one is
Code:
http://post-zone.com/feed/rss/

Then all the others follow the same format
Code:
http://post-zone.com/category/movies/feed/rss/
http://post-zone.com/category/games/feed/rss/
etc etc.

It's the same for any WordPress site. All you have to do is add them to your robots.txt file as outlined above.
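
If you have a lot of categories, a few lines of Python can print the Sitemap lines for you. This is only a sketch: wordpress_feed_urls is a made-up helper, and it assumes your permalinks follow the same /category/<slug>/feed/rss/ pattern shown above, which may not hold if you've changed the category base.

```python
def wordpress_feed_urls(site, categories):
    """Build the main feed URL plus one feed URL per category slug."""
    site = site.rstrip("/")
    urls = [f"{site}/feed/rss/"]
    urls += [f"{site}/category/{slug}/feed/rss/" for slug in categories]
    return urls

# Category slugs taken from the example above; replace with your own
for url in wordpress_feed_urls("http://post-zone.com", ["movies", "games"]):
    print(f"Sitemap: {url}")
```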


I also added tips to the first post on how big a sitemap should be, for anyone with a really large site.
 
What's your site and why are you disallowing so much?

Also you're not following my tutorial above. You're not adding

Code:
sitemap: http://example.com/sitemap.xml
to the end of it.
 
Oh I get it. Only thing is, there's an 'x' and clocks. They should be ticks, right?

[slide]http://i.imgur.com/PZJ6G.png[/slide]
 
The clock means it's processing it. The clock will disappear and it will turn to a green tick soon.

Once the clock goes if you still see a red 'x' you can click it and it will give you the error and reason it can't be accepted. Won't take long to process.

For example, in the case below it won't accept it and it's still a red 'x' because it's an HTML sitemap and not an XML sitemap.

6jqyz5.png


And then the warning message it gives after you've clicked on it

iwncwt.png
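
If you want to catch the HTML-instead-of-XML mistake before submitting, here's a rough Python sketch that checks whether a file parses as XML and has a sitemap root element. sitemap_kind is a hypothetical helper and the check is my own, not what Google runs. It only recognises the XML sitemap format, so an RSS feed would also come back as None even though feeds can be submitted.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

def sitemap_kind(text):
    """Return 'urlset', 'sitemapindex', or None if it isn't an XML sitemap."""
    try:
        root = ET.fromstring(text)
    except ET.ParseError:
        return None  # not well-formed XML, which is common for HTML pages
    for kind in ("urlset", "sitemapindex"):
        if root.tag == SITEMAP_NS + kind:
            return kind
    return None  # well-formed, but the root element is something else

xml_sitemap = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>http://example.com/</loc></url></urlset>"
)
html_sitemap = "<html><body><a href='/a'>A</a><br></body></html>"

print(sitemap_kind(xml_sitemap))
print(sitemap_kind(html_sitemap))
```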


It's added on my site, I just didn't add it to my post.
Oh right, that's cool so. Search engines will find it then.
 
@ CammyD, Mr Happy, Splitice..

When you type site:xxx.com, log out of your Google accounts first, and you'll see it's back to what it was before (no 100+k on restrictedwarez, no 6 million at nexusddl..)
 
@The Coon: that just means that not all links are in the main index for your site. All Google figures are approximate after all.
 
I am getting this error on every single one..

[slide]http://i.imgur.com/uphIL.png[/slide]

How did you submit your sitemaps?

Did you include www in the link when you were submitting them?

Also what sorta weird link do you have in your robots.txt file for sitemap?

Why don't you just add the RSS feeds in the sitemap so all search engines will find them more easily, instead of having to submit them to Google, MSN, Yahoo, Baidu etc etc.
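
On the www question, here's a quick Python sketch for spotting a www/non-www mismatch between the site address and the sitemap link. It's only a guess at the cause, since the actual error text is in the screenshot. same_host is a made-up helper and the URLs are placeholders.

```python
from urllib.parse import urlsplit

def same_host(site_url, sitemap_url):
    """True when both URLs use exactly the same host (www counts)."""
    site_host = urlsplit(site_url).netloc.lower()
    sitemap_host = urlsplit(sitemap_url).netloc.lower()
    return site_host == sitemap_host

# Placeholder URLs: one with www and one without
print(same_host("http://www.example.com/", "http://example.com/sitemap.xml"))
print(same_host("http://example.com/", "http://example.com/sitemap.xml"))
```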


It's very true.

2sd3da.png
lol I've never seen so many sitemaps for a single site.
 
@_brazzO .. I got the same error when I first submitted a sitemap.. I changed the URL settings in WordPress and no more error

[SLIDE]http://screensnapr.com/u/1tuqsn.png[/SLIDE]
 