Tag Archives: apache

Happy 10th Birthday Apache

I do a podcast called Feathercast, about technologies and people within the Apache Software Foundation. I do this for a number of reasons.

I love playing with technology, even when I don’t really understand it. Using it is the best way to understand it, and I’ve learned a lot about audio recording in this process, although I’m still far from an expert.

I get to talk with some amazing people, and ask them about stuff that’s truly fascinating.

And I enjoy educating. I like to weasel out the important details and teach people about things that they might otherwise have dismissed as unimportant. I like taking complicated ideas and explaining them in terms that everyone can understand.

In other words, it’s a mixture of selfishness and altruism, as are all worthwhile human endeavors. If we’re doing something entirely for ourselves, that’s no good, but it’s also important to have a passion for something, and for it to be fun.

Coincidentally, these are the reasons that I’m involved in the ASF. They happened in a different order – I got involved because I found an interesting technology and started writing about it. But along the way I’ve met some amazing people – Douglas Adams, Brian Behlendorf, Arthur C Clarke, Sanjiva Weerawarana, Mark Shuttleworth, Ken Coar, Deepal Jayasinghe, Larry Wall, and so many others it’s impossible to list them. Some of these people I’ve come to consider friends.

I’ve also had the opportunity to be involved in amazing technologies that have changed the way we communicate, play, and do business. The Web is, of course, built on generations of advances, and even more amazing things are to come, but it’s been a fascinating ride to be part of that.

Apache, and other open source technologies that I’ve had the opportunity to be involved in, have changed the world. I got to be part of that because they are open source, where the willingness to participate is rewarded with the permission to participate, unlike so many other parts of our world. We get to be a part of things that matter, and the barrier to entry is that willingness to participate and make a difference.

It’s a great honor to be a member of the Apache Software Foundation. It’s a badge that I wear with pride, both because I know how hard I worked to achieve it, and because I’ve seen the other amazing things that the ASF has accomplished.

Happy Birthday, Apache. Here’s hoping the next ten years are as exciting as the last ten, and that I get the chance to be even more involved than I have been for the last ten.

mod_rewrite docs rewrite at ApacheCon

The plan (assuming I don’t get sidetracked on a million other things, which is what usually happens) is to do a major overhaul of the mod_rewrite documentation during the hackathon at ApacheCon. Please speak up if you have specific comments or recommendations. So far, the outline is something like this.

1) A couple of years ago, I split the “Rewrite Guide” into basic and advanced. This was ill-advised, and the division was stupid. Now it’s just harder to find stuff. Going to re-merge those, and then try to do a division based on topic, rather than difficulty, since difficulty isn’t a particularly useful way to organize things.

2) Rewrite cookbook, divided into categories of, perhaps:
a. redirecting/remapping
b. controlling access
c. when not to use mod_rewrite (aka ‘mod_rewrite is obsolete’)
d. advanced features

3) Scrap the inscrutable examples. Both the guide and the formal docs are littered with examples that either never happen in the real world, or are done better using some of the built-in functionality of other modules like mod_alias and mod_dir. Scrap those examples entirely, rather than continuing to try to make them scrutable.

4) Rewrite Flags documentation. Started this years ago, and never really finished it. Also, needs to be updated to include the new flags that have been added in 2.2 and trunk.

5) General grammatical overhaul, hopefully with help from Noirin, who has better grammar than all the rest of us put together. (Actually, that’s the problem – it was written by all of the rest of us put together, resulting in a mish-mash of styles and voices.)

6) A document about (so-called) S.E.O. uses of mod_rewrite, discussing both the techniques that can be used, and the misinformation that tends to drive the desire to use those techniques. This needs to be handled carefully, because there’s a tendency to simply state that all SEO is snake oil – which much of it is – and ignore the topic entirely. But, folks are going to do this stuff whether or not we approve, and it’s better if they do it well. At least, that’s what I think at this particular moment.

2c, above, is both about stuff that you shouldn’t do with mod_rewrite at all, and also some of the new features in 2.2 and trunk that make mod_rewrite unnecessary.
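As a rough sketch of the kind of thing 2c would cover, here is a plain redirect done both ways. The paths and the hostname are made up for illustration; the point is only that mod_alias handles this case without any regular expressions:

```apache
# Hypothetical paths and hostname, for illustration only.
#
# The mod_rewrite way (server or virtual host context):
#   RewriteEngine On
#   RewriteRule ^/old-docs/(.*)$ http://www.example.com/docs/$1 [R=301,L]
#
# The mod_alias way: Redirect matches the URL-path prefix and
# appends whatever follows it to the target.
Redirect permanent /old-docs http://www.example.com/docs
```

With the Redirect line, a request for /old-docs/intro.html is sent to http://www.example.com/docs/intro.html.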

Tomcat at ApacheCon

Tomcat is one of the oldest members of the Apache family, and one of the standard building blocks of the web as we know it today. It can sometimes fly under the radar, because it just works, so most folks are completely unaware of it.

Filip Hanik will be doing a training class on Tomcat at ApacheCon this year. I spoke with him last week for Feathercast, and I’ve finally edited it. You can listen here, or come to ApacheCon and hear him there.

FallbackResource

There’s a new directive in mod_dir, and I’m very pleased about it, because it’s something that we’ve wanted for a long time, and which a LOT of people will benefit from. The directive is called FallbackResource, and is documented here.

Many web applications use the concept of a “front controller”, which means that every incoming request is mapped to a single script/handler/program that determines what needs to be done based on the URI. There’s some kind of an internal URL parsing mechanism that takes the action and arguments from the URI and sends them to the correct function or controller to produce output.

In order to produce this effect, there needs to be some kind of rewrite that sends all requests to that front controller. Something like this:

RewriteEngine On
RewriteRule (.*) index.php

Then, when the URI /foo/bar is requested, that gets sent to index.php, which, in turn, looks at the variable REQUEST_URI (‘/foo/bar’) and figures out what to do.

The trouble with this approach is that actual file resources, like ‘my_logo.jpg’ and ‘style.css’ also get sent to index.php, so we have to take steps to avoid that happening:

RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule (.*) index.php

-f and -d are rewrite-speak for “is a file” and “is a directory”, respectively. So this says “if it’s not a file, and not a directory, do the rewrite.”

A lot of web app projects use a rewrite ruleset exactly like this one. WordPress does. CakePHP does. Drupal and Habari and Django, and lots of other things do. Unfortunately, many of them get it horribly wrong, or, sometimes, just sufficiently wrong to make it hard to troubleshoot. And WordPress famously had a 72-line version of this at one point. Fortunately, they fixed that several years ago.

But even that isn’t enough, because sometimes you have Aliases that need to be explicitly avoided, and you’ll need to list those, too:

RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_URI} !^/(pony|server-status|server-info|icons)
RewriteRule (.*) index.php

And, of course, you may have other rewrite rules for other things on your site. And with added complexity comes added fragility, so there’s greater chance of breaking something, or just getting confused.

All of the above is now replaced by one directive:

FallbackResource /index.php

That’s all. It says “if a request doesn’t map to something, send it to index.php.”

This directive is in trunk, and so will be in the 2.3/2.4 release. I don’t know if it will get backported to 2.2.
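For context, here is a minimal sketch of where the directive might live, assuming an application installed at a made-up path, /var/www/app, and served from the root of the site. The leading slash is deliberate: it names the front controller as a URL path from the site root, so that a request several levels deep, like /foo/bar, still ends up at the same index.php.

```apache
# Hypothetical document root; adjust to your own layout.
<Directory "/var/www/app">
    # Files and directories that exist are served as usual.
    # Anything that would otherwise be a 404 is handed to the
    # front controller instead.
    FallbackResource /index.php
</Directory>
```

The same single line should also work in a .htaccess file, provided overrides are allowed for that directory.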

[DPI]

The Apache Web Server version 2.2.12 was released today, including the following nugget of joy:

‘discardpathinfo|DPI’ (discard PATH_INFO)

In per-directory context, the URI each RewriteRule compares against is the concatenation of the current values of the URI and PATH_INFO.

The current URI can be the initial URI as requested by the client, the result of a previous round of mod_rewrite processing, or the result of a prior rule in the current round of mod_rewrite processing.

In contrast, the PATH_INFO that is appended to the URI before each rule reflects only the value of PATH_INFO before this round of mod_rewrite processing. As a consequence, if large portions of the URI are matched and copied into a substitution in multiple RewriteRule directives, without regard for which parts of the URI came from the current PATH_INFO, the final URI may have multiple copies of PATH_INFO appended to it.

Use this flag on any substitution where the PATH_INFO that resulted from the previous mapping of this request to the filesystem is not of interest. This flag permanently forgets the PATH_INFO established before this round of mod_rewrite processing began. PATH_INFO will not be recalculated until the current round of mod_rewrite processing completes. Subsequent rules during this round of processing will see only the direct result of substitutions, without any PATH_INFO appended.
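To make that concrete, here is a sketch of how I read the description above, for a .htaccess file in the document root. The file names (report.php, archive.php) and the URL layout are invented for the example:

```apache
RewriteEngine On

# Request: /report.php/2009/07
# report.php exists, so the URI part is "report.php" and PATH_INFO
# is "/2009/07". The pattern is tested against "report.php/2009/07".
RewriteRule ^report\.php/(.*)$ archive/$1 [DPI]

# Without [DPI] on the rule above, this pattern would be tested
# against "archive/2009/07/2009/07" (the old PATH_INFO appended
# again) and would not match. With [DPI], it sees "archive/2009/07".
RewriteRule ^archive/(\d+)/(\d+)$ archive.php?year=$1&month=$2 [L]
```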

It might also be handy, some day, to have a [DQS] flag that discards the query string explicitly, rather than having to do silly tricks to make it disappear.
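The silly trick in question, for the record: ending the substitution with a bare question mark replaces the existing query string with an empty one. A sketch, with made-up paths:

```apache
RewriteEngine On

# Request: /search.php?q=apache
# The trailing "?" on the substitution erases the query string,
# so the client is redirected to /find with nothing after it.
RewriteRule ^/?search\.php$ /find? [R=301,L]
```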

httpd.conf – the Apache Web Server conference

The Apache Software Foundation and the Apache Conference Committee are delighted to announce httpd.conf – the mini-conference focusing on the Apache Web Server, embedded into the larger ApacheCon conference.

httpd.conf consists of two days of talks focused on the Apache Web Server, as well as two days of pre-conference training classes.

On Monday, November 2, and Tuesday, November 3, JimJag and I will be teaching a two-day training class on the Apache Web Server for the beginner who wants to get up to speed on how to run the server.

On Wednesday, you can take a look around you at some of the other Apache technologies.

Then, on Thursday and Friday, there are two full days of httpd talks. Thursday is primarily focused on the administrator, while Friday is focused mainly on the developer.

And, of course, there’s also a full schedule of other ApacheCon events and talks, including the 10th anniversary party for the Apache Software Foundation. So make plans to come to Oakland in November.