Fixing Drupal's Last-Modified header for RSS feeds
Update: I've added a patch for Drupal 7 that addresses a similar issue for the core node module. Take a look if you're interested.
If you use Drupal with Twitterfeed or other services that consume its RSS feeds, you may find that new posts aren't recognized. I recently had this problem with a client's website, but thanks to some excellent support from Twitterfeed (on a free service, no less. Thanks, Mario!), I was able to track down the problem and fix it.
In essence, Twitterfeed keeps track of the HTTP "Last-Modified" header to save bandwidth and other effort. They expect that Last-Modified will be set to the date the most recent entry in the feed was published, which seems reasonable. The problem with Drupal here is that the Last-Modified header is always set to the time of the request. Twitterfeed told me that when the Last-Modified value matches the request time, they ignore it, and consequently can miss posts (a manual "check now" will force Twitterfeed to ignore Last-Modified, but the whole point is to let this be automated). Other software does this as well, apparently, so it's not unique to Drupal (though for what it's worth, WordPress's core feeds set the value as expected).
The built-in rss.xml feed is controlled by the node module, and as far as I can see it can't be altered without making your own copy of it. I'm going to try to fix that for a future release (see this issue). In the meantime, here's how you can fix this in your theme (or a module if you prefer). This assumes you're using Drupal 7 and Views to generate your RSS feed(s), with the default RSS Feed display format and the newest post first in the feed. Add this code to your template.php:
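<?php
/**
 * Set the Last-Modified header of the RSS feed to the date of the most recent post.
 */
function phptemplate_views_pre_render(&$view) {
  switch ($view->name) {
    case 'YOURVIEWNAME':
      // Only touch the feed display, and only when the query returned rows.
      if ($view->current_display == 'feed' && !empty($view->result)) {
        // Assumes the view sorts newest first, so the first row is the latest post.
        $posts = $view->result;
        // HTTP dates must end in "GMT"; gmdate(DATE_RFC2822) would emit "+0000" instead.
        drupal_add_http_header('Last-Modified', gmdate('D, d M Y H:i:s', $posts[0]->node_created) . ' GMT');
      }
      break;
  }
}
?>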
hook_views_pre_render fires right before the view is rendered, so the query has already run and, importantly, the dates of all the items in the feed are known.
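The snippet above takes the first row as the newest post, which only holds when the view sorts newest first. As a rough sketch of a more defensive variant, the helpers below pick the largest timestamp from the result rows and format it as an HTTP date. The function names (example_newest_created, example_http_date) are hypothetical, and node_created is assumed to be the property your view exposes, as in the snippet above; other views may use a different alias.

```php
<?php
/**
 * Format a Unix timestamp as an HTTP date,
 * for example "Sun, 06 Nov 1994 08:49:37 GMT".
 */
function example_http_date($timestamp) {
  // gmdate() formats in UTC; the literal "GMT" suffix is appended by hand.
  return gmdate('D, d M Y H:i:s', $timestamp) . ' GMT';
}

/**
 * Return the newest creation timestamp among the rows, or NULL if none.
 *
 * Assumption: each row is an object with a numeric node_created property.
 */
function example_newest_created(array $rows) {
  $newest = NULL;
  foreach ($rows as $row) {
    if (!isset($row->node_created)) {
      // Skip rows that do not carry the expected property.
      continue;
    }
    $created = (int) $row->node_created;
    if ($newest === NULL || $created > $newest) {
      $newest = $created;
    }
  }
  return $newest;
}
```

Inside the hook you could then call example_newest_created($view->result) and, only when the result is not NULL, pass example_http_date() of that value to drupal_add_http_header('Last-Modified', ...).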
You'll need to change YOURVIEWNAME to the machine name of your view, and the current_display value if it doesn't match your feed. You can also change phptemplate in the function name to the name of your theme if you like. Clear your Drupal cache (or at least your theme registry cache) to make sure the function is picked up, and clear your Views caches if needed. Then open up your favorite Web Inspector or HTTP sniffer, refresh your feed, and check the value of Last-Modified (note that the value is always GMT, as required by the spec). In Chrome, you'll find it on the Network tab, among the response headers for the feed request.
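If you'd rather check the header from a script than from a browser, something like the following sketch could work. The function names are hypothetical and the feed URL in the usage comment is a placeholder. It only verifies that the value has the shape of an HTTP date ending in GMT, not that the date itself is the one you expect.

```php
<?php
/**
 * Find the Last-Modified value in a list of raw response header lines.
 * Returns NULL when the header is absent.
 */
function example_find_last_modified(array $header_lines) {
  $prefix = 'Last-Modified:';
  foreach ($header_lines as $line) {
    // Header names are case-insensitive, so compare with stripos().
    if (stripos($line, $prefix) === 0) {
      return trim(substr($line, strlen($prefix)));
    }
  }
  return NULL;
}

/**
 * Check that a value looks like "Sun, 06 Nov 1994 08:49:37 GMT".
 */
function example_is_http_date($value) {
  $pattern = '/^(Mon|Tue|Wed|Thu|Fri|Sat|Sun), \d{2} '
    . '(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) '
    . '\d{4} \d{2}:\d{2}:\d{2} GMT$/';
  return preg_match($pattern, $value) === 1;
}

// Usage (needs network access; replace the placeholder URL with your feed):
// $headers = get_headers('http://example.com/your-feed.xml');
// if ($headers !== FALSE) {
//   $value = example_find_last_modified($headers);
//   var_dump($value, $value !== NULL && example_is_http_date($value));
// }
```

Requesting the feed twice, a few seconds apart, is a quick way to confirm the fix: the value should stay the same between requests instead of tracking the request time.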
By the way, Twitterfeed is a really handy service if you want to automatically post content to various social networks. It's from bitly, and though it's free, they're apparently getting enough value from it that they have no plans to charge money for it. I find it to be a much better solution than adding modules to Drupal that would do the same thing. This is one kind of outsourcing I can totally get behind!
Comments
That solution won't update the header if a node gets removed from the feed.
How about simplifying it to just:
drupal_add_http_header('Last-Modified', gmdate(DATE_RFC2822));
And then enable views caching, probably with views_content_cache.
If that header is set in hook_views_pre_render then views caching will include that header in the cache and re-use it when serving the feed from cache.
If you want your feed's date to update when older content is removed, that's a fine-sounding solution. For this purpose, the modification date needed to be the same as the most recently published post, so caching that date with the feed, until there's new stuff in it, is the desired approach.