The HTTP specification allows browsers to make conditional requests, which avoid spending extra bandwidth when a page has not been modified. In 43 Things I added support for conditional HTTP requests to some of our RSS feeds to save bandwidth.
As an alternative you can use the built-in page caching, which lets the web server handle conditional requests automatically, but this is not always appropriate. If you have content that is customized for a user, like home page links for logged-in users, the page cache won't work because different users will have different home page links. In our case it was simpler to add a few lines to our code than to set up a shared space to use as a page cache.
The solution I chose can be applied to any page you can determine a last modified time for.
The hard part of adding Not Modified support is deciding when a page was last modified. For RSS feeds related to a single person's content I use "ten minutes before that person last visited the website". I can't use the last visited time itself: a user creates a new entry or comment and then visits a page showing the content they just created, which puts the last visited time after the time the content was created.
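To make the timing concrete, here is a small sketch in plain Ruby. The timestamps and the `not_modified?` helper are made up for illustration; only the ten-minute allowance comes from the approach described above.

```ruby
SLACK = 600 # seconds; the ten-minute allowance

# Hypothetical helper: true when the client's copy counts as current.
def not_modified?(since_date, content_date)
  since_date >= content_date
end

created_at = Time.utc(2006, 1, 1, 12, 0, 0) # the user posts an entry
last_visit = created_at + 5                 # then views the page showing it

# A reader that fetched the feed after the post sends back the entry's
# date, because that is the Last-Modified value it was given.
since_date = created_at

# With the raw last visit time the feed always looks modified.
not_modified?(since_date, last_visit)         # => false
# Backdating by ten minutes lets the reader's copy count as current.
not_modified?(since_date, last_visit - SLACK) # => true
```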
The conditional requests are handled by a method I've named check_modified which I added to the RssController:
def check_modified(person)
  return false unless @request.env.include? 'HTTP_IF_MODIFIED_SINCE'
  since_date = Time.parse @request.env['HTTP_IF_MODIFIED_SINCE']
  last_visit_date = person.last_visit_date
  return false if last_visit_date.nil? # last_visit_date may be nil
  # Last visit date will always be > last update, so let RSS
  # feeds be up to 10 minutes stale.
  content_date = last_visit_date - 600
  if since_date >= content_date then
    render :nothing => true, :status => 304
    return true
  end
  return false
end
Which is used in an action method roughly like this:
def uber
  person = Person.find_by_username params[:username]
  return if check_modified person
  # collect RSS feed items ...
  @pub_date = @rss_items.sort_by { |item| item.content_date }.last.content_date
  @response.headers['Last-Modified'] = @pub_date.httpdate
end
So, if there's no If-Modified-Since header in the request (CGI turns this into the HTTP_IF_MODIFIED_SINCE environment variable), there's nothing to check. Otherwise the content date and the since date are compared, and a 304 response with an empty body is returned if the user hasn't updated anything.
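The same decision can be sketched outside of Rails as a function that returns a status code instead of rendering. This is only an illustration: `conditional_status` and `STALE_ALLOWANCE` are hypothetical names, and it parses the header with the stricter `Time.httpdate` rather than `Time.parse`, falling back to a full response when the date can't be read.

```ruby
require 'time'

STALE_ALLOWANCE = 600 # seconds

# Hypothetical framework-free version of the check: returns 304 or 200
# rather than rendering a response.
def conditional_status(env, last_visit_date)
  header = env['HTTP_IF_MODIFIED_SINCE']
  return 200 if header.nil? || last_visit_date.nil?

  begin
    since_date = Time.httpdate(header)
  rescue ArgumentError
    return 200 # unparseable date: send the full feed
  end

  since_date >= last_visit_date - STALE_ALLOWANCE ? 304 : 200
end

visit = Time.utc(2006, 1, 1, 12, 0, 0)
conditional_status({}, visit) # => 200
conditional_status({ 'HTTP_IF_MODIFIED_SINCE' => (visit - 60).httpdate }, visit)   # => 304
conditional_status({ 'HTTP_IF_MODIFIED_SINCE' => (visit - 3600).httpdate }, visit) # => 200
```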
To fill in some missing context: a person is an instance of our user model (called Person). The last_visit_date for a person is stored in memcached to prevent writing to the database on every page view. The last_visit_date may have expired from memcached, so I need to check for nil.
When a browser or RSS reader first visits the page, it makes note of the Last-Modified header and stores its value. When the page is next reloaded, the browser adds an If-Modified-Since header containing the date it remembered from the previous load. The times are compared on the server, and either a complete 200 OK response or an empty 304 Not Modified response is returned. For RSS readers that frequently refresh feeds this can significantly reduce bandwidth consumption.
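The round trip can be simulated without a network. In this sketch `serve_feed` and `FeedClient` are hypothetical stand-ins for the feed action and an RSS reader, and the server compares against a plain last-modified time rather than the ten-minute heuristic, to keep the exchange easy to follow.

```ruby
require 'time'

# Hypothetical stand-in for the feed action: takes request headers and
# returns [status, headers, body]. Times are whole seconds, because HTTP
# dates have one-second resolution.
def serve_feed(request_headers, last_modified)
  since = request_headers['If-Modified-Since']
  if since && Time.httpdate(since) >= last_modified
    return [304, {}, '']
  end
  [200, { 'Last-Modified' => last_modified.httpdate }, '<rss>...</rss>']
end

# Hypothetical client that remembers Last-Modified between fetches.
class FeedClient
  attr_reader :cached_body

  def initialize
    @last_modified = nil
    @cached_body = nil
  end

  def fetch(last_modified_on_server)
    headers = {}
    headers['If-Modified-Since'] = @last_modified if @last_modified
    status, response_headers, body = serve_feed(headers, last_modified_on_server)
    if status == 200
      # Only a full response carries a new date and body to remember.
      @last_modified = response_headers['Last-Modified']
      @cached_body = body
    end
    status
  end
end

client = FeedClient.new
posted = Time.utc(2006, 1, 1, 12, 0, 0)
client.fetch(posted)      # => 200, first visit gets the full feed
client.fetch(posted)      # => 304, nothing changed, empty body
client.fetch(posted + 60) # => 200, new content since the remembered date
```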
