Redirecting documentation

I just spent an inordinately long time trying to get a simple documentation redirect working.

This was suggested to me by the recent discussion on PHP, where it was noted that http://php.net/foo takes you directly to the documentation on the function foo. Which is, of course, a really cool feature.

So, I decided that the Apache docs need that too. But while this would be really easy to do with, say, PHP, or perhaps a mod_perl handler, I wanted to do it with stuff that comes with Apache. mod_rewrite seemed like the obvious choice.

There are, however, a number of obstacles. mod_rewrite is, to put it nicely, documented in a user-hostile manner, and a particularly beginner-hostile one.

I had already decided that I needed to use RewriteMap, because I have a large number of URLs that I want to redirect, and I want that list to be easily updateable. So, using a simple Perl script, I generated a starter map file from the XML documentation source:

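#!/usr/bin/perl
use strict;
use warnings;
use XML::Simple;

# Emit one "Directive URL" map line per directive in each module XML file.
foreach my $file (@ARGV) {
    my $doc = XMLin($file);

    # Derive the module name from the file name: mod_rewrite.xml -> rewrite
    my $mod = $file;
    $mod =~ s/^.*mod_//;
    $mod =~ s/\.xml$//;

    my $directives = $doc->{directivesynopsis};
    foreach my $d (keys(%$directives)) {
        next unless $d;
        print "$d http://httpd.apache.org/docs-2.0/mod/mod_$mod.html#$d\n";
    }
}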

Running this script:

xml2rewritemap mod_rewrite.xml

generated a starter rewrite map. A single line of this file looks simply like:

RewriteMap http://httpd.apache.org/docs-2.0/mod/mod_rewrite.html#RewriteMap

So far, so good. If this works, it’s a simple matter to generate a map for the entire documentation tree, since it’s all in XML.

Then, in httpd.conf, I put the following:

RewriteEngine On
RewriteMap directive2url txt:/www/vhosts/drbacchus/fajita/scripts/rewritemap.txt

RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^/([^/]*)$ ${directive2url:$1} [NC,R]

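So that creates the mapping function “directive2url”, pointing at the rewritemap file I generated. And then it runs that function on any single-segment URL that doesn’t correspond to an existing file or directory, which is to say, more or less, any URL that would otherwise generate a 404. (Yes, that’s slightly over-simplifying.)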
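To make the lookup a little more concrete, here is a rough Perl sketch of what the map and the rule do together. It is only an illustration, not how mod_rewrite is implemented: the helper names load_map and lookup are made up for this example, and the sketch ignores the two RewriteCond checks and the [NC] flag.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Parse txt-map style content: one "key value" pair per line.
# Blank lines and lines starting with '#' are skipped.
# (load_map is a hypothetical helper, not part of Apache.)
sub load_map {
    my ($text) = @_;
    my %map;
    foreach my $line (split /\n/, $text) {
        next if $line =~ /^\s*(#|$)/;
        my ($key, $value) = split ' ', $line;
        $map{$key} = $value if defined $value;
    }
    return \%map;
}

# Mimic the RewriteRule pattern ^/([^/]*)$ and look up the captured
# segment in the map. Returns undef if the pattern or the key misses.
sub lookup {
    my ($map, $path) = @_;
    return unless $path =~ m{^/([^/]*)$};
    return $map->{$1};
}

my $map = load_map(<<'END');
# directive  URL
RewriteMap http://httpd.apache.org/docs-2.0/mod/mod_rewrite.html#RewriteMap
END

my $target = lookup($map, '/RewriteMap');
print defined $target ? "$target\n" : "no match\n";
```

Note that the value that comes back still contains a literal #. The trouble described next happens after this step, when the redirect target is escaped.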

And it almost works. Going to http://fajita.drbacchus.com/RewriteMap *almost* redirects you to the desired location. Unfortunately, the # gets converted to a %23, which, in turn, generates a 404.

Well, after a bit of rooting around, it turns out that this exact problem is covered in the RewriteGuide (Look for “extended redirection”) and that it’s actually by design. Hrmph. Well, I may end up doing this without mod_rewrite. But it will be very disappointing. Perhaps I should agitate for a DoNotEscape flag for RewriteRule so that stuff like this isn’t such a pain.

Or maybe there already is one, and I’m just missing it because the mod_rewrite documentation is written for people who have already read the code. *sigh*.
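As it happens, the RewriteRule flag list does mention a noescape flag, written NE, which is described as preventing the usual URI escaping of the rewrite result. Here is a sketch of the same rule with that flag added. I am assuming the flag applies to this case as documented, and I have not verified that it cures the %23 problem on this particular setup.

```apache
RewriteEngine On
RewriteMap directive2url txt:/www/vhosts/drbacchus/fajita/scripts/rewritemap.txt

RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# NE (noescape) asks mod_rewrite not to escape the substitution,
# so the # in the mapped URL should survive as a fragment separator.
RewriteRule ^/([^/]*)$ ${directive2url:$1} [NC,NE,R]
```

The only change from the rule I started with is the extra NE in the flag list.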