How It Works

The Sitemap plug-in generates sitemaps in the sitemaps.org XML format specified at http://www.sitemaps.org/protocol.html. This is what a (very small) sitemaps.org sitemap document looks like:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
   <url>
      <loc>http://www.example.com/incoming/article51.ece</loc>
      <lastmod>2013-05-31T11:51:13+06:00</lastmod>
   </url>
   <url>
      <loc>http://www.example.com/incoming/article42.ece</loc>
      <lastmod>2013-05-31T11:51:23+06:00</lastmod>
   </url>
</urlset>

The plug-in can generate two basic types of sitemap:

Aggregated sitemap

An aggregated sitemap contains the URLs of all selected content items that are in a published state at the time the sitemap is generated. This kind of sitemap is intended to be generated once, when a site is first published and you want to ensure that the entire site is indexed. The idea is that you explicitly request generation of the sitemap yourself and then upload it to the search engines you are interested in.

Update sitemap

An update sitemap contains only the URLs of content items that have been published recently (by default, within the last 72 hours). The idea is that you publish the URL of this sitemap in your site's robots.txt file so that it can be found by search engine indexers, which periodically visit it and index all the listed URLs. Alternatively, you can control the process yourself by creating an application or cron job that actively posts it at intervals to the search engines you are interested in.
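
The selection rule described above can be illustrated with a minimal Python sketch. It is not the plug-in's implementation: the function name build_update_sitemap, the (url, published) input pairs and the window_hours parameter are all hypothetical, and only the 72-hour default and the XML format come from this guide.

```python
from datetime import datetime, timedelta, timezone
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_update_sitemap(items, now, window_hours=72):
    """Return sitemap XML listing only items published inside the window.

    items: iterable of (url, published) pairs (hypothetical input shape),
           where published is a timezone-aware datetime.
    now:   timezone-aware datetime marking the end of the window.
    """
    cutoff = now - timedelta(hours=window_hours)
    # The xmlns attribute is written out as a plain attribute on <urlset>.
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, published in items:
        if cutoff <= published <= now:
            entry = ET.SubElement(urlset, "url")
            ET.SubElement(entry, "loc").text = url
            ET.SubElement(entry, "lastmod").text = published.isoformat(timespec="seconds")
    return ET.tostring(urlset, encoding="unicode")
```

An aggregated sitemap would be produced by the same loop without the date filter, which is why the two types differ only in how many entries they contain.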

Both types of sitemap have exactly the same structure; the only difference is the number of entries they contain.

In order to prevent sitemap documents becoming unmanageably large, the sitemaps.org standard allows sitemaps to be split into multiple documents that are then referenced by a master sitemap index. The Sitemap plug-in makes use of this feature. It generates one sitemap index per Escenic publication, which in turn references one sitemap document for every content type that you choose to include. Here is a small example of a sitemap index:

<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
   <sitemap>
      <loc>http://www.example.com/sitemap/sections.xml</loc>
      <lastmod>2013-05-31T12:07:35+06:00</lastmod>
   </sitemap>
   <sitemap>
      <loc>http://www.example.com/sitemap/news.xml</loc>
      <lastmod>2013-05-31T12:07:35+06:00</lastmod>
   </sitemap>
   <sitemap>
      <loc>http://www.example.com/sitemap/review.xml</loc>
      <lastmod>2013-05-31T12:07:35+06:00</lastmod>
   </sitemap>
   <sitemap>
      <loc>http://www.example.com/sitemap/video.xml</loc>
      <lastmod>2013-05-31T12:07:35+06:00</lastmod>
   </sitemap>
</sitemapindex>
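
To show how a consumer might follow a sitemap index to the individual sitemap documents, here is a small Python sketch using the standard library's ElementTree module. The function name list_sitemaps is hypothetical; the element names and namespace are those of the sitemaps.org format shown above.

```python
import xml.etree.ElementTree as ET

# Namespace prefix mapping for the sitemaps.org schema.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def list_sitemaps(index_xml):
    """Return (loc, lastmod) pairs for every <sitemap> entry in an index."""
    root = ET.fromstring(index_xml)
    return [
        (
            entry.findtext("sm:loc", namespaces=NS),
            entry.findtext("sm:lastmod", namespaces=NS),
        )
        for entry in root.findall("sm:sitemap", NS)
    ]
```

Each returned loc is the URL of one sitemap document, which can then be fetched and parsed in the same way using the url element instead of sitemap.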

If the number of articles of a content type exceeds the entries-per-sitemap value defined in SitemapConfig.properties, the plug-in generates several sitemap documents for that content type. The number of documents is determined by the ratio of the article count to the entries-per-sitemap value. Here is a small example of a sitemap index that contains multiple sitemap documents for a single content type:

<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
   <sitemap>
      <loc>http://www.example.com/sitemap/sections.xml</loc>
      <lastmod>2013-05-31T12:07:35+06:00</lastmod>
   </sitemap>
   <sitemap>
      <loc>http://www.example.com/sitemap/news/1.xml</loc>
      <lastmod>2013-05-31T12:07:35+06:00</lastmod>
   </sitemap>
   <sitemap>
      <loc>http://www.example.com/sitemap/news/2.xml</loc>
      <lastmod>2013-05-31T12:07:35+06:00</lastmod>
   </sitemap>
   <sitemap>
      <loc>http://www.example.com/sitemap/review.xml</loc>
      <lastmod>2013-05-31T12:07:35+06:00</lastmod>
   </sitemap>
   <sitemap>
      <loc>http://www.example.com/sitemap/video.xml</loc>
      <lastmod>2013-05-31T12:07:35+06:00</lastmod>
   </sitemap>
</sitemapindex>
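
The splitting arithmetic can be sketched in Python as follows. This is an illustration only: the function name sitemap_paths is hypothetical, and rounding the ratio up to the next whole number is an assumption, since the plug-in's exact rule is not stated here. The path patterns are taken from the examples above.

```python
import math


def sitemap_paths(content_type, article_count, entries_per_sitemap):
    """Return the sitemap document paths for one content type.

    Assumption: the document count is the article count divided by the
    entries-per-sitemap value, rounded up.
    """
    if entries_per_sitemap <= 0:
        raise ValueError("entries_per_sitemap must be positive")
    if article_count <= entries_per_sitemap:
        # Everything fits in a single document, e.g. sitemap/review.xml
        return [f"sitemap/{content_type}.xml"]
    pages = math.ceil(article_count / entries_per_sitemap)
    # Numbered documents start at 1, e.g. sitemap/news/1.xml
    return [f"sitemap/{content_type}/{n}.xml" for n in range(1, pages + 1)]
```

With an illustrative limit of 10000 entries, a news content type holding 15000 articles would yield sitemap/news/1.xml and sitemap/news/2.xml, matching the index shown above.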

You choose which content types to include by adding seo:enabled elements to the content types in your publication's content-type resource (see Editing the content-type Resource).