Generating a sitemap for a headless WP + Next.js site

In this tutorial, you will learn how to create a dynamic sitemap for your headless WordPress site. I’m not going to build a full site here, only the sitemap, but Jeff Everhart has a great tutorial you can check out if you need one.

You can find the full project source code here.

Live site URL.

What do we need to create a sitemap?

At a basic level, all you need is the URLs of all the pages on your site in an XML page.

Here is a basic XML sitemap example:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/foo.html</loc>
    <lastmod>2018-06-04</lastmod>
  </url>
</urlset>

Learn more about sitemaps here.

Let’s take an example

We have a WordPress blog site with categories, tags, posts, and pages. We will create at least one sitemap page for each of these types, and if a type has more than 1000 items we will add further pages for that type.

Let’s say our site has 2K posts, 100 categories, 600 tags, and 10 pages. Here’s what our sitemap index page will look like:

<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>http://www.example.com/sitemap/post_sitemap1.xml</loc>
  </sitemap>
  <sitemap>
    <loc>http://www.example.com/sitemap/post_sitemap2.xml</loc>
  </sitemap>
  <sitemap>
    <loc>http://www.example.com/sitemap/category_sitemap1.xml</loc>
  </sitemap>
  <sitemap>
    <loc>http://www.example.com/sitemap/tag_sitemap1.xml</loc>
  </sitemap>
  <sitemap>
    <loc>http://www.example.com/sitemap/page_sitemap1.xml</loc>
  </sitemap>
</sitemapindex>
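
The page counts in the index above come from dividing each total by the per-sitemap limit and rounding up. Here is a minimal sketch of that arithmetic, assuming the limit of 1000 items used in this tutorial. `sitemapPageCount` is a hypothetical helper name used only for illustration, not something from the project.

```javascript
// Hypothetical helper: how many sitemap files a content type needs.
// Assumes a limit of 1000 URLs per sitemap file, as in this tutorial.
function sitemapPageCount(totalItems, perSitemap = 1000) {
  return Math.ceil(totalItems / perSitemap);
}

// Totals from the example site described above.
const totals = { post: 2000, category: 100, tag: 600, page: 10 };

const pagesPerType = Object.fromEntries(
  Object.entries(totals).map(([type, total]) => [type, sitemapPageCount(total)])
);

console.log(pagesPerType); // { post: 2, category: 1, tag: 1, page: 1 }
```

Posts are the only type that crosses the limit, which is why the index lists two post sitemaps and one for every other type.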

Configure Your WordPress Site

We are going to use this plugin to get our sitemap info from WordPress. GitHub link. Download the plugin from GitHub and upload it to your WordPress site.

Let’s quickly see how this plugin works

The WP Sitemap Rest API plugin adds four endpoints to the WordPress REST API.

/wp-json/sitemap/v1/totalpages
/wp-json/sitemap/v1/author?pageNo=1&perPage=1000
/wp-json/sitemap/v1/taxonomy?pageNo=1&perPage=1000&taxonomyType=category or tag
/wp-json/sitemap/v1/posts?pageNo=1&perPage=1000&postType=post or page
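
To make the query parameters concrete, here is a small sketch that builds request URLs for the endpoints listed above. The function name `buildSitemapApiUrl` and the base URL are assumptions for illustration; only the endpoint paths and parameter names come from the list. It also assumes WordPress is served from the domain root, since the path is resolved as an absolute path against the base URL.

```javascript
// Sketch only: builds request URLs for the plugin endpoints listed above.
// `buildSitemapApiUrl` is a hypothetical helper, not part of the plugin.
function buildSitemapApiUrl(baseUrl, endpoint, params = {}) {
  const url = new URL(`/wp-json/sitemap/v1/${endpoint}`, baseUrl);
  for (const [key, value] of Object.entries(params)) {
    url.searchParams.set(key, String(value));
  }
  return url.toString();
}

const base = "https://www.example.com"; // assumed WordPress URL

console.log(buildSitemapApiUrl(base, "totalpages"));
// https://www.example.com/wp-json/sitemap/v1/totalpages

console.log(
  buildSitemapApiUrl(base, "taxonomy", { pageNo: 1, perPage: 1000, taxonomyType: "tag" })
);
// https://www.example.com/wp-json/sitemap/v1/taxonomy?pageNo=1&perPage=1000&taxonomyType=tag
```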

Let’s create a sitemap index page 

In your Next.js project’s pages folder, create a file named sitemap.xml.js.

import getSitemapPages from "~/utils/getSitemapPages";
import getTotalCounts from "~/lib/getTotalCounts";
export default function SitemapIndexPage() {
  return null;
}
export async function getServerSideProps({ res }) {
  const details = await getTotalCounts();
  const { totalCategories, totalPublishedPages } = details;
  const { totalPublishedPosts, totalTags, totalUsers } = details;
  const categoryPaths = getSitemapPages(totalCategories, "category_sitemap");
  const tagPaths = getSitemapPages(totalTags, "tag_sitemap");
  const postPaths = getSitemapPages(totalPublishedPosts, "post_sitemap");
  const pagePaths = getSitemapPages(totalPublishedPages, "page_sitemap");
  const authorPaths = getSitemapPages(totalUsers, "author_sitemap");

  // getSitemapPages() already wraps every entry in <sitemap>, so the
  // generated entries go directly inside <sitemapindex>.
  let sitemapIndex = `<?xml version='1.0' encoding='UTF-8'?>
  <sitemapindex xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/siteindex.xsd"
           xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
     ${pagePaths}
     ${authorPaths}
     ${categoryPaths}
     ${tagPaths}
     ${postPaths}
  </sitemapindex>`;
  res.setHeader("Content-Type", "text/xml; charset=utf-8");
  res.setHeader(
    "Cache-Control",
    "public, s-maxage=600, stale-while-revalidate=600"
  );
  res.write(sitemapIndex);
  res.end();
  return { props: {} };
}

As you can see, this is a server-side rendered (SSR) page, because we want the sitemap to be dynamic. Let’s see how the two main functions work.

1. getTotalCounts()

import axios from "axios";

export default async function getTotalCounts() {
  const res = await axios.get(
    `${process.env.NEXT_PUBLIC_WORDPRESS_URL}/wp-json/sitemap/v1/totalpages`
  );
  // axios has already resolved the body, so res.data needs no extra await.
  return res?.data ?? {};
}

This is a simple fetch function that gets the total number of pages, posts, authors, etc. on your WP site.
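
The index page destructures five fields from this response, and a missing or non-numeric field would quietly produce no sitemap entries for that type. Here is a small defensive sketch, assuming the response is a flat object with the field names destructured in sitemap.xml.js. `normalizeCounts` and `COUNT_FIELDS` are hypothetical names, not part of the project.

```javascript
// Field names as destructured in sitemap.xml.js.
const COUNT_FIELDS = [
  "totalCategories",
  "totalPublishedPages",
  "totalPublishedPosts",
  "totalTags",
  "totalUsers",
];

// Hypothetical helper: coerce every count to a non-negative integer,
// falling back to 0 when a field is missing or not a number.
function normalizeCounts(details) {
  const source = details ?? {};
  const counts = {};
  for (const field of COUNT_FIELDS) {
    const value = Number(source[field]);
    counts[field] = Number.isFinite(value) && value > 0 ? Math.floor(value) : 0;
  }
  return counts;
}

console.log(normalizeCounts({ totalTags: "600", totalPublishedPosts: 2000 }));
```

With counts normalized this way, a type that has no items simply contributes no entries to the index.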

2. getSitemapPages()

export default function getSitemapPages(count, path) {
  const items = [];
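  // Env values are strings, so convert explicitly; fall back to 1000 if unset.
  const sitemapPerPage = Number(process.env.NEXT_PUBLIC_ITEM_PER_SITEMAP) || 1000;
  for (let i = 1; i <= Math.ceil(count / sitemapPerPage); i++) {
    let url = `${process.env.NEXT_PUBLIC_FRONTEND_URL}/sitemap/${path}${i}.xml`;
    // Keep <loc> on one line so the URL has no surrounding whitespace.
    items.push(
      `
      <sitemap>
         <loc>${url}</loc>
      </sitemap>
    `
    );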
  }
  return items.join("");
}

This function takes two parameters, the number of URLs and the sitemap name prefix, and returns one sitemap entry for every sitemap page that type needs. Two examples show how it works.

getSitemapPages(500, "category_sitemap");

/* <sitemap>
<loc>http://www.example.com/sitemap/category_sitemap1.xml</loc>
</sitemap>
 */

getSitemapPages(2300, "tag_sitemap");

/* <sitemap>
<loc>http://www.example.com/sitemap/tag_sitemap1.xml</loc>
</sitemap>
<sitemap>
<loc>http://www.example.com/sitemap/tag_sitemap2.xml</loc>
</sitemap>
<sitemap>
<loc>http://www.example.com/sitemap/tag_sitemap3.xml</loc>
</sitemap> */

Now that our index sitemap is done, let’s see how we can create all the individual sitemap pages. You can see the live example site here.

Create sitemap pages

In your pages folder, create a sitemap folder and, inside it, a file named [slug].js, which gives you `pages/sitemap/[slug].js`.

import getSitemapPageUrls from "~/lib/getSitemapPageUrls";
import generateSitemapPaths from "~/utils/generateSitemapPaths";
export default function SitemapTagPage() {
  return null;
}
export async function getServerSideProps({ res, params: { slug } }) {
  let slugArray = slug.replace(".xml", "").split("_");
  let type = slugArray[0];
  // match() returns null when there are no digits, so guard it with ?. before indexing.
  let pageNo = slugArray[1]?.match(/(\d+)/)?.[0] ?? null;
  let page = pageNo ? parseInt(pageNo, 10) : null;
  let possibleTypes = ["category", "tag", "post", "page", "author"];
  if (!page || !possibleTypes.includes(type)) {
    return {
      notFound: true,
    };
  }
  let pageUrls = await getSitemapPageUrls({ type, page });
  if (!pageUrls?.length) {
    return {
      notFound: true,
    };
  }
  let sitemap = `<?xml version="1.0" encoding="UTF-8"?>
  <urlset xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd"
  xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
    ${generateSitemapPaths(pageUrls)}
  </urlset>`;
  res.setHeader("Content-Type", "text/xml; charset=utf-8");
  res.setHeader(
    "Cache-Control",
    "public, s-maxage=600, stale-while-revalidate=600"
  );
  res.write(sitemap);
  res.end();
  return { props: {} };
}

If you are familiar with Next.js dynamic routes, you know that [slug].js receives the slug the user visits. Say you visit `/sitemap/post_sitemap1.xml`: how do you get the page type and the page number from the string `post_sitemap1.xml`? Split it on `_`. The first element of the array is the page type, and the last element contains the page number, which a simple regular expression extracts. Next, we validate that the slug follows our sitemap index page’s URL pattern and return a 404 page if it does not.
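
The parsing and validation steps described above can also be sketched as one standalone function. This is an alternative under stated assumptions, not the code used in the page: it swaps the split-plus-match approach for a single anchored regular expression, which makes it stricter (for example, it rejects slugs without the `.xml` suffix). `parseSitemapSlug` and `SITEMAP_TYPES` are hypothetical names.

```javascript
const SITEMAP_TYPES = ["category", "tag", "post", "page", "author"];

// Hypothetical helper: returns { type, page } for slugs shaped like
// "<type>_sitemap<number>.xml", or null for anything else.
function parseSitemapSlug(slug) {
  const match = /^([a-z]+)_sitemap(\d+)\.xml$/.exec(slug);
  if (!match) return null;

  const type = match[1];
  const page = Number.parseInt(match[2], 10);

  // Reject unknown types and page 0, since page numbers start at 1.
  if (!SITEMAP_TYPES.includes(type) || page < 1) return null;
  return { type, page };
}

console.log(parseSitemapSlug("post_sitemap1.xml")); // { type: 'post', page: 1 }
console.log(parseSitemapSlug("post_sitemap.xml")); // null
console.log(parseSitemapSlug("video_sitemap1.xml")); // null
```

A `null` result maps naturally onto returning `{ notFound: true }` from getServerSideProps.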

1. getSitemapPageUrls()

import axios from "axios";

export default async function getSitemapPageUrls({ type, page }) {
  let sitemapPerPage = process.env.NEXT_PUBLIC_ITEM_PER_SITEMAP; // 1000
  let wpSiteUrl = process.env.NEXT_PUBLIC_WORDPRESS_URL; // https://www.example.com
  // Categories and tags share the taxonomy endpoint.
  if (type === "category" || type === "tag") {
    const res = await axios.get(
      `${wpSiteUrl}/wp-json/sitemap/v1/taxonomy?pageNo=${page}&taxonomyType=${type}&perPage=${sitemapPerPage}`
    );
    // axios has already resolved the body, so res.data needs no extra await.
    return res?.data ?? [];
  }
  if (type === "author") {
    const res = await axios.get(
      `${wpSiteUrl}/wp-json/sitemap/v1/author?pageNo=${page}&perPage=${sitemapPerPage}`
    );
    return res?.data ?? [];
  }

  // Anything else ("post" or "page") goes to the posts endpoint.
  const res = await axios.get(
    `${wpSiteUrl}/wp-json/sitemap/v1/posts?pageNo=${page}&postType=${type}&perPage=${sitemapPerPage}`
  );
  return res?.data ?? [];
}

Let’s see how the `getSitemapPageUrls` function works. It takes an object with two properties, `type` and `page`, as its parameter. Based on our example above, type would be post and page would be 1, which triggers a request to the /wp-json/sitemap/v1/posts?pageNo=1&postType=post&perPage=1000 route.

After we get the URLs for any type of page, we need to generate the paths, and for that we use `generateSitemapPaths`. Let’s take a look at how it works.

2. generateSitemapPaths()

export default function generateSitemapPaths(array) {
  let frontendUrl = process.env.NEXT_PUBLIC_FRONTEND_URL;
  const items = array.map(
    (item) =>
      `
          <url>
              <loc>${frontendUrl + item?.url}</loc>
              ${
                item?.post_modified_date
                  ? `<lastmod>${
                      new Date(item?.post_modified_date)
                        .toISOString()
                        .split("T")[0]
                    }</lastmod>`
                  : ""
              }
          </url>
          `
  );
  return items.join("");
}

This function receives an array of objects containing url and post_modified_date and returns them as an XML string.
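
One thing to watch: the sitemap protocol requires URLs inside `<loc>` to be entity-escaped, so a URL containing `&` would make the XML invalid if inserted as-is. Here is a minimal sketch of an escaping step that could be applied before a value goes into `<loc>`. `escapeXml` and `buildUrlEntry` are hypothetical names, the item shape (`url`, `post_modified_date`) is taken from the description above, and the example URL is made up.

```javascript
// Hypothetical helper: escape the five characters XML treats specially.
function escapeXml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Hypothetical helper: build one <url> entry with an escaped <loc>.
function buildUrlEntry(frontendUrl, item) {
  const loc = escapeXml(frontendUrl + item.url);
  const date = item.post_modified_date ? new Date(item.post_modified_date) : null;
  // Skip <lastmod> when the date is missing or cannot be parsed.
  const lastmod =
    date && !Number.isNaN(date.getTime())
      ? `<lastmod>${date.toISOString().split("T")[0]}</lastmod>`
      : "";
  return `<url><loc>${loc}</loc>${lastmod}</url>`;
}

console.log(
  buildUrlEntry("https://www.example.com", {
    url: "/search?a=1&b=2",
    post_modified_date: "2018-06-04T10:00:00Z",
  })
);
// <url><loc>https://www.example.com/search?a=1&amp;b=2</loc><lastmod>2018-06-04</lastmod></url>
```

The date check also avoids the error that `toISOString()` throws when it is called on an invalid date.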

Hurray! 🎉

Congratulations, you have created a fully dynamic sitemap that will always be up to date with your WP site. No matter how much your site scales, you will never have to worry about the sitemap. If you have any questions, you can reach me at @maikap_dipankar

Dipankar Maikap


I'm a freelance web developer. I work out of Kolkata, India. My favorite number is 77.
