September 12, 2018 - This post has been updated to reflect the changes in release 0.9.9 and above. The following code examples will not work for releases before 0.9.9.

There are two queue systems in TriTan that allow you to run background tasks: the regular Queue, whose namespace is TriTan\Queue, and the Nodeq Queue, whose namespace is TriTan\Queue\NodeqQueue. You can use one or the other, or you can utilize both. In this article, I will explain the process of using the Queue system to run a background task.
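For orientation, here is a minimal sketch of how the two classes are referenced in a dropin. The regular Queue is instantiated with no arguments later in this article; whether NodeqQueue also takes no constructor arguments is an assumption on my part, so check the class before relying on it:

<?php
use TriTan\Queue;
use TriTan\Queue\NodeqQueue;

// The regular queue, which provides the task worker used in this article.
$queue = new Queue();

// The Nodeq queue, which adds an item to a queue and releases it once the
// item has been processed. (No-argument construction is an assumption.)
$nodeq = new NodeqQueue();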

First, before we continue, in order for background tasks to run, you will need to add a new line to your crontab on your server that runs every minute and requests the cron endpoint. It should look similar to this (using wget here; curl works just as well): * * * * * wget -q -O - http://replace_url/cronjob/ > /dev/null 2>&1. Once you've done that, head over to the General Options screen and enable Cronjobs.

Now that we've taken care of the preliminaries, I will show you how to run a background task that creates and updates a sitemap, adds it to your root directory, and updates the robots.txt file. In this example, I am using the regular Queue because it allows me to use its task worker. I will use it in conjunction with Nodeq, which adds an item to a queue and, once the item has been processed, releases it from the queue.

Now, we need to create a dropin file called sitemap.dropin.php. Since this is for the main site, the file needs to go in private/sites/1/dropins/. Open sitemap.dropin.php and paste in the following code:

<?php
use TriTan\Container as c;
use TriTan\Common\Date;
use TriTan\SitemapGenerator;
use TriTan\Queue;
use TriTan\Exception\Exception;
use TriTan\Database;
use Cascade\Cascade;
use TriTan\Common\Hooks\ActionFilterHook as hook;

function generate_sitemap_task_worker()
{
    $queue = new Queue();
    $task = [];
    $task['task_worker'] = [
        /**
         * Unique processing id. Must be an integer.
         */
        'pid' => (int) 8439,
        /**
         * Unique name of this task.
         */
        'name' => 'Generate Sitemap',
        /**
         * task_callback is the name of your custom function which is associated
         * with the action_hook.
         */
        'task_callback' => '_generate_sitemap_queue',
        /**
         * action_hook is the hook that should be fired when the queue runs. This
         * is a custom action hook which you define. Do not use one that is already
         * defined by the system. It could cause adverse reactions.
         */
        'action_hook' => '_generate_sitemap_hook',
        /**
         * The Cronjob schedule. This example is every hour. To learn more
         * about cronjob schedule format, check out
         * https://www.cyberciti.biz/faq/how-do-i-add-jobs-to-cron-under-linux-or-unix-oses/
         */
        'schedule' => '0 * * * *',
        /**
         * If your task is not working, set this to true for debugging.
         */
        'debug' => (bool) false,
        /**
         * How long the processing is expected to take in seconds. After this
         * expires, the item will be reset and another consumer can claim the item.
         */
        'max_runtime' => (int) 30,
        /**
         * When you want to disable a task from running for a while, set this to
         * false.
         */
        'enabled' => (bool) true
    ];
    $queue->enqueue($task);
}

function _generate_sitemap_queue()
{
    $now = (new Date())->format('YYYY-MM-DD H:i A');
    $db = new Database();

    // Capture the start time; microtime() returns "msec sec", so we keep the
    // seconds portion for the execution-time calculation at the end.
    $time = explode(" ", microtime());
    $time = $time[1];

    // create the sitemap generator object for this site
    $sitemap = new SitemapGenerator(site_url());

    // set to true if you also want a compressed (gzipped) sitemap
    $sitemap->createGZipFile = false;

    // determine how many urls should be put into one file
    $sitemap->maxURLsPerSitemap = 10000;

    // sitemap file name
    $sitemap->sitemapFileName = "sitemap.xml";

    // sitemap index file name
    $sitemap->sitemapIndexFileName = "sitemap-index.xml";

    // robots file name
    $sitemap->robotsFileName = "robots.txt";

    /**
     * Add main site url to sitemap.
     */
    $sitemap->addUrl(site_url(), date('c'), 'daily', '1');

    /**
     * Array of urls that will be added to the sitemap in bulk.
     */
    $urls = [];

    /**
     * Retrieve posts with the page posttype to add to sitemap.
     */
    $pages = $db->table(c::getInstance()->get('tbl_prefix') . 'post')
            ->where('post_type.post_posttype', 'page')
            ->where('post_status', 'published')
            ->where('post_created', '<=', $now)
            ->sortBy('post_created', 'DESC')
            ->get();
    foreach ($pages as $page) {
        $urls[] = [site_url(esc_html($page['post_slug']) . '/'), date('c'), 'daily', '0.5'];
    }

    /**
     * Add the main blog page to sitemap.
     */
    $sitemap->addUrl(site_url('blog/'), date('c'), 'daily', '0.5');

    /**
     * Retrieve posts with the post posttype to add to sitemap.
     */
    $posts = $db->table(c::getInstance()->get('tbl_prefix') . 'post')
            ->where('post_type.post_posttype', 'post')
            ->where('post_status', 'published')
            ->where('post_created', '<=', $now)
            ->sortBy('post_created', 'DESC')
            ->get();
    foreach ($posts as $post) {
        $urls[] = [site_url('blog/' . esc_html($post['post_slug']) . '/'), date('c'), 'daily', '0.5'];
    }

    // add many URLs at one time
    $sitemap->addUrls($urls);

    try {
        // create sitemap
        $sitemap->createSitemap();

        // write sitemap as file
        $sitemap->writeSitemap();

        // update robots.txt file
        $sitemap->updateRobots();
    } catch (Exception $ex) {
        Cascade::getLogger('error')->{'error'}(sprintf('QUEUE[%s]: %s', $ex->getCode(), $ex->getMessage()));
    }

    Cascade::getLogger('info')->{'info'}("Memory peak usage: " . number_format(memory_get_peak_usage() / (1024 * 1024), 2) . "MB");
    // Capture the end time and log how long the run took.
    $time2 = explode(" ", microtime());
    $time2 = $time2[1];
    Cascade::getLogger('info')->{'info'}("Execution time: " . number_format($time2 - $time) . "s");
}

hook::getInstance()->{'addAction'}('_generate_sitemap_hook', '_generate_sitemap_queue', 10);
hook::getInstance()->{'addAction'}('ttcms_task_worker_cron', 'generate_sitemap_task_worker', 10);

So let's break it down to better understand what's going on. Our new dropin consists of two functions: generate_sitemap_task_worker() and _generate_sitemap_queue(). The two addAction calls at the bottom wire everything together: generate_sitemap_task_worker() is attached to the ttcms_task_worker_cron hook so the cron run enqueues the task, and _generate_sitemap_queue() is attached to our custom _generate_sitemap_hook so it fires when the queue processes the item.

In generate_sitemap_task_worker(), we define the key/value pairs that the task worker needs (a second, simplified example follows the list below):

  1. pid = This is a unique processing id. It helps you identify which process is stalling if you run into issues with a particular background task.
  2. name = This is the unique name for the task.
  3. task_callback = This is the function that will be called to carry out the task when the action hook fires.
  4. action_hook = When this hook fires, the task_callback function will run. Use a custom hook, not one already defined by the system.
  5. schedule = How often the task should be added to the queue to run, in cron format. This example is set to every hour.
  6. debug = If your task is not being added to the queue, set this to true and check the error logs.
  7. max_runtime = How long, in seconds, you expect the task to run. After this expires, the item is reset so another consumer can claim it. For this example, it is set to 30 seconds because it shouldn't take that long to create a sitemap for a small site.
  8. enabled = Set this to false if the task should be paused or is no longer needed.
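
To see how the pattern generalizes, here is a hypothetical second task worker that might purge a cache nightly. The pid, name, callback, and hook names below are made up for illustration; only the array keys and the enqueue call match what the Queue expects, as shown in the sitemap example above:

<?php
use TriTan\Queue;

function purge_cache_task_worker()
{
    $queue = new Queue();
    $task = [];
    $task['task_worker'] = [
        'pid' => (int) 2741,                      // unique processing id
        'name' => 'Purge Cache',                  // unique task name
        'task_callback' => '_purge_cache_queue',  // your custom function
        'action_hook' => '_purge_cache_hook',     // your custom hook
        'schedule' => '0 2 * * *',                // every day at 2:00 a.m.
        'debug' => (bool) false,
        'max_runtime' => (int) 10,                // a purge should finish quickly
        'enabled' => (bool) true
    ];
    $queue->enqueue($task);
}

As with the sitemap example, you would still need to define the _purge_cache_queue() function and register both hooks with addAction().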

The second function, _generate_sitemap_queue(), generates the sitemap. It creates a record for the main site, the blog page, and each single blog post. In this example, I am not using post_relative_url to generate links for the sitemap; instead, I am using a custom route, which is blog/ + post_slug. Otherwise, you can use site_url($post['post_relative_url']) to generate the native blog urls, as the snippet below shows.
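Here is the difference side by side. The first line is the custom route used in the dropin above; the second is the native alternative, assuming the same $urls array and loop as in the example:

// Custom route used in this article: blog/ + post_slug
$urls[] = [site_url('blog/' . esc_html($post['post_slug']) . '/'), date('c'), 'daily', '0.5'];

// Native alternative using the post's relative url
$urls[] = [site_url($post['post_relative_url']), date('c'), 'daily', '0.5'];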

So, that is it. The task worker will be added to the task node, it will run based on its schedule, it will be added to the queue, and it will update the sitemap with new records when it finds them.

Need help? Head on over to the GitHub repository and create a new issue.