AboutUsSiteMap


What (summary)

Index our site and present search engines with the resulting sitemap.

Why this is important

Traffic from search engines is our bread and butter. If there are pages that they would index if only they knew about them, we should let them know. This has a 5% chance of doubling our traffic for three days' worth of work.

DoneDone

  • We have validated index and sitemap files that include all of our page titles
  • We are serving them from the proper location so that Googlebot and other search engine crawlers can find them

Steps to get to DoneDone

  • Create a branch (sitemap) and stage it locally.
  • Read http://sitemaps.org/
  • Read http://sitemaps.org/protocol.php
  • Read https://www.google.com/webmasters/tools/docs/en/protocol.html
  • Read https://www.google.com/webmasters/tools/docs/en/sitemap-generator.html
  • Understand this task
  • Get a list of all of our pages. (Got hold of 100k pages from mist.)
  • Load these pages into our branch database.
  • Create a script that takes a limit and offset parameter to generate the sitemap for these pages
  • Convert time into W3C Datetime format
  • Refactor the code
  • Generate sitemap_index.
  • Write a runner script that will walk over the pages table and generate xml for sitemap.
  • Break it up into sitemaps that have no more than 50,000 urls and are smaller than 10MB each
  • Find a function in Ruby to encode the URLs in UTF-8
  • Run the script on mist and generate a sitemap.
  • Stage it on the staging server
  • Validate index and sitemap files
  • Simulate the sitemaps on the staging server.
    • Copy the sitemaps to the appropriate directories on the staging server.
    • Make proper symlinks for the sitemaps
    • Read and understand http://httpd.apache.org/docs/2.0/mod/mod_rewrite.html
    • Understand RewriteRule in apache
    • Configure Apache to point the sitemaps to their appropriate location
    • Test the sitemaps
    • Write steps to add sitemaps onto live server.
  • Ask EthanD to deploy the sitemaps. (Asked on User_talk:Ethan_Devenport.)
  • Deploy to the squal servers
  • Notify the search engines.
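
The generation steps above (limit and offset paging, W3C Datetime conversion, URL escaping, the 50,000-URL and 10MB caps, and the sitemap index) can be sketched in Ruby roughly as follows. This is a minimal sketch under stated assumptions, not the script that was actually run: the Page struct, every helper name, the sitemapN.xml.gz naming, and the example.com host are hypothetical. Only the limits and the XML element names are taken from the sitemaps.org protocol.

```ruby
MAX_URLS   = 50_000             # per-file URL cap from the protocol
MAX_BYTES  = 10 * 1024 * 1024   # per-file size cap (10MB) from the protocol
SITEMAP_NS = 'http://www.sitemaps.org/schemas/sitemap/0.9'
XML_ENTITIES = { '&' => '&amp;', '<' => '&lt;', '>' => '&gt;',
                 '"' => '&quot;', "'" => '&apos;' }.freeze

# Hypothetical stand-in for one row of the pages table.
Page = Struct.new(:url, :updated_at)

# Format a Time as W3C Datetime in UTC, e.g. 2008-02-11T09:30:00Z.
def w3c_datetime(time)
  time.getutc.strftime('%Y-%m-%dT%H:%M:%SZ')
end

# Percent-encode the UTF-8 bytes of every non-ASCII character.
def escape_non_ascii(url)
  url.gsub(/[^[:ascii:]]/) { |ch| ch.bytes.map { |b| format('%%%02X', b) }.join }
end

# URL-escape, then entity-escape, as the protocol asks for <loc> values.
def loc_for(url)
  escape_non_ascii(url).gsub(/[&<>"']/, XML_ENTITIES)
end

# Build one sitemap from the pages in [offset, offset + limit).
def sitemap_xml(pages, limit: MAX_URLS, offset: 0)
  entries = (pages[offset, limit] || []).map do |page|
    "  <url>\n" \
    "    <loc>#{loc_for(page.url)}</loc>\n" \
    "    <lastmod>#{w3c_datetime(page.updated_at)}</lastmod>\n" \
    "  </url>\n"
  end
  %(<?xml version="1.0" encoding="UTF-8"?>\n) +
    %(<urlset xmlns="#{SITEMAP_NS}">\n) + entries.join + "</urlset>\n"
end

# Runner: walk over all pages and return one XML string per sitemap file.
def build_sitemaps(pages, per_file: MAX_URLS)
  (0...pages.size).step(per_file).map do |offset|
    xml = sitemap_xml(pages, limit: per_file, offset: offset)
    raise "sitemap at offset #{offset} is 10MB or larger" if xml.bytesize >= MAX_BYTES
    xml
  end
end

# Build the index that points at sitemap1.xml.gz .. sitemapN.xml.gz.
def sitemap_index_xml(base_url, sitemap_count, generated_at)
  entries = (1..sitemap_count).map do |n|
    loc = loc_for("#{base_url}/sitemap#{n}.xml.gz")
    "  <sitemap>\n" \
    "    <loc>#{loc}</loc>\n" \
    "    <lastmod>#{w3c_datetime(generated_at)}</lastmod>\n" \
    "  </sitemap>\n"
  end
  %(<?xml version="1.0" encoding="UTF-8"?>\n) +
    %(<sitemapindex xmlns="#{SITEMAP_NS}">\n) + entries.join + "</sitemapindex>\n"
end
```

With 100k pages and the default cap, build_sitemaps would return two sitemaps, and sitemap_index_xml('http://example.com', 2, Time.now) would list both. Writing each string out through Zlib::GzipWriter is left out here to keep the sketch short.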

Steps to Deploy Sitemaps

  • Copy the sitemap files to /opt/sitemaps/versions/20080211 (where 20080211 is a folder named after the current date, in YYYYMMDD form).
  • Add symlink:
ln -s /opt/sitemaps/versions/20080211 /opt/sitemaps/versions/current
  • In your apache configuration file, add the rule:
RewriteRule ^/sitemap(.+)\.xml\.gz$ /sitemaps/sitemap$1.xml.gz [PT,L]
Alias /sitemaps /opt/sitemaps/versions/current
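
The copy and symlink steps above could also be scripted. The following is a minimal Ruby sketch, assuming the same versions/<date> layout. The deploy_sitemaps name and its arguments are hypothetical, and root would be /opt/sitemaps on the server described here.

```ruby
require 'fileutils'

# Copy sitemap files into a dated version directory under `root`, then
# point the `current` symlink at that directory. Returns the directory path.
def deploy_sitemaps(files, root:, date: Time.now)
  version_dir = File.join(root, 'versions', date.strftime('%Y%m%d'))
  FileUtils.mkdir_p(version_dir)
  FileUtils.cp(files, version_dir)

  current = File.join(root, 'versions', 'current')
  # Remove an old link first; otherwise the new link could be created
  # inside the previous version's directory instead of replacing it.
  FileUtils.rm_f(current) if File.symlink?(current)
  FileUtils.ln_s(version_dir, current)
  version_dir
end
```

Removing the old link before creating the new one means the script can be run again on a later date, at the cost of a brief moment in which current does not exist.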

Notes


