XML Sitemap

Submitting an XML sitemap of your site to Google is one of the fastest ways to get the site deeply indexed. The more quality pages that are indexed, the better for your site's optimization health. Drupal's XML Sitemap module, formerly known as Google Sitemap, will automatically generate an XML sitemap that conforms to the proper specifications.

With it, you can specify a priority for individual pages and terms. You can log access to the sitemap, so that watchdog shows how often it has been accessed and by whom. You can also specify whether you want your sitemap submitted when changes are made, and to which major search engines. Submission can also be attached to cron.
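To make the priority setting concrete, here is a minimal Python sketch of the kind of file the sitemap protocol expects. It is not the module's own code (Drupal modules are written in PHP), and the function name `build_sitemap` and the example URLs are hypothetical. The namespace and the `urlset`, `url`, `loc` and `priority` elements come from the sitemaps.org protocol, where priority ranges from 0.0 to 1.0.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build a sitemap string from (url, priority) pairs.

    Hypothetical helper for illustration only. The XML declaration
    is omitted for brevity.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, priority in pages:
        # The protocol only allows priorities between 0.0 and 1.0.
        if not 0.0 <= priority <= 1.0:
            raise ValueError(f"priority out of range for {url}: {priority}")
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        ET.SubElement(entry, "priority").text = f"{priority:.1f}"
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    print(build_sitemap([
        ("http://www.yoursite.com/", 1.0),
        ("http://www.yoursite.com/node/1", 0.8),
    ]))
```

Priority is only a hint about the relative importance of pages within your own site, so it makes sense to reserve the highest values for the pages you most want crawled.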

After installing XML Sitemap, you can enable the options to include nodes, taxonomy terms and user data. Each of these is listed individually on the modules page. If you are running the Pathauto module, it is highly recommended that you download and install the Module Weight module. Because the Pathauto module is by default executed after the XML Sitemap modules, the sitemap will contain the original paths instead of the optimized aliases.
Installing Module Weight adds a weight column to the modules table. I have set the weight for the XML Sitemap modules to 10 to ensure they are executed last.

The XML Sitemap module for Drupal 6 outputs the correct aliases in the sitemap without Module Weight being enabled.

To configure the XML sitemap, navigate to Administer > Site Configuration > XML Sitemap. You can leave the default settings as they are. Click on the Search Engine tab and check the box for Log Access. You can also add your personal data and Google verification details, which are found in your Webmaster Central control panel.
Lastly, you can add custom URLs to your sitemap by clicking on the Additional tab.

When you create content or vocabularies, you will now have the option to set the priority each item is given in your sitemap. You can also control each term and node from its edit page.

Your sitemap will be located at http://www.yoursite.com/sitemap.xml. For visual purposes, Drupal attaches a stylesheet so that the map displays as a formatted page. However, with the latest version, I get an error message in Firefox that states, "Error loading stylesheet: An XSLT stylesheet does not have an XML mimetype:" (only when I access it without the www). It is viewable as a table in Internet Explorer.

Take a good look at the URLs being generated in the sitemap. It's a good idea to remove pages that you are disallowing in your Drupal robots.txt file. You can do this by selecting "Not in sitemap" under the priority field in the sitemap settings located at the bottom of each node edit page. You can also remove pages by content type and taxonomy.
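If your sitemap is large, checking it against robots.txt by eye gets tedious. The following is a rough Python sketch of how that comparison could be automated using the standard library's `urllib.robotparser`. The function names and the sample data are hypothetical, and it assumes you have already saved the contents of your sitemap.xml and robots.txt as strings.

```python
import xml.etree.ElementTree as ET
from urllib.robotparser import RobotFileParser

# Namespace defined by the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(sitemap_xml):
    """Return every <loc> value found in a sitemap document."""
    root = ET.fromstring(sitemap_xml)
    return [loc.text.strip() for loc in root.findall("sm:url/sm:loc", NS)]


def disallowed_urls(sitemap_xml, robots_txt, user_agent="*"):
    """List sitemap URLs that robots.txt blocks for the given user agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [
        url for url in sitemap_urls(sitemap_xml)
        if not parser.can_fetch(user_agent, url)
    ]


if __name__ == "__main__":
    # Hypothetical sample data for illustration.
    robots = "User-agent: *\nDisallow: /admin/\n"
    sitemap = (
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        "<url><loc>http://www.yoursite.com/node/1</loc></url>"
        "<url><loc>http://www.yoursite.com/admin/settings</loc></url>"
        "</urlset>"
    )
    for url in disallowed_urls(sitemap, robots):
        print("In sitemap but disallowed:", url)
```

Any URL this reports is a candidate for the "not in sitemap" setting described above.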

Here's another thing to watch out for. Make sure that none of the URLs you are redirecting with a 301 redirect or mod_rewrite are present in your XML sitemap. For example, on this site, I have the top-level taxonomy pages redirecting to the parent book page that represents each category. However, I also had the term pages in the sitemap. With these settings, my Google Webmaster Tools dashboard returned a warning telling me to remove the redirecting URL and replace it with the destination URL.
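You can catch this before Google does by requesting each sitemap URL and looking for a redirect status. Below is a minimal Python sketch of the idea, not a polished tool. The names `head_status` and `find_redirects` are my own, and the URLs in the usage example are hypothetical. It sends a HEAD request with `http.client`, which does not follow redirects, so a 301 shows up as a 301.

```python
import http.client
from urllib.parse import urlsplit

# Status codes that indicate the URL points somewhere else.
REDIRECT_CODES = {301, 302, 303, 307, 308}


def head_status(url, timeout=10):
    """Send one HEAD request and return (status, Location header or None)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def find_redirects(urls, fetch=head_status):
    """Map each redirecting URL to the destination it reports."""
    redirects = {}
    for url in urls:
        status, location = fetch(url)
        if status in REDIRECT_CODES:
            redirects[url] = location
    return redirects


if __name__ == "__main__":
    # Hypothetical URL; replace with the <loc> values from your sitemap.
    for source, target in find_redirects(["http://www.yoursite.com/"]).items():
        print(f"{source} redirects to {target}; list the destination instead")
```

The `fetch` argument exists so the check can be exercised with canned responses instead of live requests, which is handy if you want to try it without hitting your server.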

about Weight module

On the Weight module page you read:

"Users are encouraged to use CCK and views instead of this module."

Is it possible to achieve the same result with Views and CCK?

manoloka | Fri, 02/22/2008 - 20:44

The Weight module is different from the Module Weight module. What Module Weight does is order the execution of your modules. It is used with XML Sitemap to make sure that your Pathauto aliases appear in your sitemap. Otherwise, the sitemap would display the original node/### style URLs.

The Weight module is used to order node types which can also be accomplished with CCK and Views.

Michelle | Sat, 02/23/2008 - 03:55

I see, my mistake :-(

Thanks

manoloka | Sat, 02/23/2008 - 09:21