Intro to URL Rewriting with Apache’s .htaccess

I have created an .htaccess file for URL rewriting on every site I’ve ever built. If you’re not familiar with URL rewriting, it modifies a URL or redirects the user before the requested resource is fetched. One of its major uses is to make URLs human-readable. That means your users can visit a pretty URL like http://www.tabworldonline.com/guitar/A/ and have it interpreted by the server as http://www.tabworldonline.com/artists.php?instrument=guitar&letter=A.
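To make the mapping concrete, here is a rough Python sketch of what such a rule does: match the pretty path with a regular expression and substitute the captured pieces into the real script URL. The pattern and the rewrite() helper are my own assumptions for illustration; they are not the actual rule used on that site.

```python
import re

# Assumed pattern: a lowercase instrument segment followed by a single
# uppercase index letter, with an optional trailing slash.
PRETTY_URL = re.compile(r"^/(?P<instrument>[a-z]+)/(?P<letter>[A-Z])/?$")


def rewrite(path: str) -> str:
    """Return the internal script URL for a pretty path, or the path unchanged."""
    match = PRETTY_URL.match(path)
    if match is None:
        # Anything that does not look like /instrument/LETTER/ is left alone.
        return path
    return "/artists.php?instrument={instrument}&letter={letter}".format(
        **match.groupdict()
    )


print(rewrite("/guitar/A/"))  # /artists.php?instrument=guitar&letter=A
```

In mod_rewrite the same idea is expressed as a pattern plus a substitution with $1 and $2 back-references, as in the .htaccess rules in this post.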

Most of the time, this file can be relatively simple. I would always recommend using one for URL canonicalization, which is a fancy term for making sure you have one unique URL for each page. For example, lumidant.com redirects to www.lumidant.com. This is beneficial for SEO because you want to ensure that search engines don’t split your ranking points between pages that are actually one and the same.

The code below is the .htaccess file from this site. The patterns in the file are regular expressions, so you may want a quick refresher if you’re not familiar with them. A few other things to be aware of: [NC] stands for no case and means that the match is not case-sensitive, [R=301] tells the server to do a 301 redirect, and [L] tells the server to stop there and not apply any further rules in this pass.

<IfModule mod_rewrite.c>

  RewriteEngine on

  # rewrite all lumidant.com requests to the lumidant subdirectory
  RewriteCond %{HTTP_HOST} ^(www\.)?lumidant\.com$
  # this is needed to stop infinite looping
  RewriteCond %{REQUEST_URI} !^/lumidant/.*$
  # don't redirect these directories to the lumidant subdirectory
  RewriteCond %{REQUEST_URI} !^/pinknews/.*$
  RewriteRule ^(.*)$ /lumidant/$1

  # if you're asking for a directory and there is no trailing slash then add one
  RewriteCond %{REQUEST_FILENAME} -d
  RewriteCond %{REQUEST_URI} !^.*/$
  RewriteRule ^/lumidant/(.*)$ http://www\.lumidant\.com%{REQUEST_URI}/ [R=301,L]

  # add a www if there's not one
  RewriteCond %{HTTP_HOST} ^lumidant\.com$ [NC]
  RewriteCond %{REQUEST_URI} !^/blog.*$
  RewriteRule ^lumidant/(.*)$ http://www\.lumidant\.com/$1 [R=301,L]

</IfModule>

This blog is currently hosted with BlueHost. For accounts with multiple domains, BlueHost places the add-on domains in subdirectories of the main domain. This can be confusing to maintain, so I moved all of the lumidant code to a subdirectory as well and then updated the .htaccess file to make this organization transparent to the end user.

The last few lines add a www to all non-www requests. I could have placed this rule at the beginning of the file, but the file is processed again after each redirect, so a request that also needed a trailing slash would then trigger a second redirect. Because the trailing-slash rule already redirects to the www host, putting it first fixes both problems in a single redirect. Keep in mind while organizing the file that you’d like to minimize the number of redirects, for reasons including response times, server load, and search engine optimization.
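The ordering concern can be sketched in Python as a single function that works out every fix a request needs and answers with one redirect instead of two. This is a simplified model, not a translation of the file above: canonical_redirect() and its is_directory argument are hypothetical, and the /blog exclusion is left out.

```python
import re
from typing import Optional

# Case-insensitive host match, playing the role of the [NC] flag.
BARE_HOST = re.compile(r"^lumidant\.com$", re.IGNORECASE)


def canonical_redirect(host: str, path: str, is_directory: bool) -> Optional[str]:
    """Return the single 301 target for a request, or None if it is already canonical."""
    needs_www = BARE_HOST.match(host) is not None
    needs_slash = is_directory and not path.endswith("/")
    if not (needs_www or needs_slash):
        return None
    # Apply both fixes at once so the visitor is redirected only one time.
    target_host = "www." + host.lower() if needs_www else host
    target_path = path + "/" if needs_slash else path
    return f"http://{target_host}{target_path}"


print(canonical_redirect("Lumidant.com", "/about", True))
# http://www.lumidant.com/about/
```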

URL rewriting can be tricky at first, especially if you’re not familiar with regular expressions. If you’re working with redirections, then it may help to check the HTTP headers of your request to see what intermediate redirects are occurring.
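If you would rather script that check than read headers by hand, something like the following Python sketch can list each hop. It uses only the standard library; the function names and the max_hops limit are my own choices, and the fetch parameter exists only so the loop can be exercised without a network connection.

```python
import http.client
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = (301, 302, 303, 307, 308)


def fetch_head(url):
    """Send one HEAD request and return (status, Location header or None)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        target = parts.path or "/"
        if parts.query:
            target += "?" + parts.query
        conn.request("HEAD", target)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def trace_redirects(url, fetch=fetch_head, max_hops=10):
    """Follow redirects one hop at a time and return a list of (status, url) pairs."""
    hops = []
    for _ in range(max_hops):
        status, location = fetch(url)
        hops.append((status, url))
        if status not in REDIRECT_CODES or not location:
            break
        # Location may be relative, so resolve it against the current URL.
        url = urljoin(url, location)
    return hops
```

Calling trace_redirects("http://lumidant.com/about") and printing the result shows how many intermediate redirects a request goes through before it reaches its final URL.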

Finally, if you’re not using Apache, there are alternatives to .htaccess. For example, I have used the UrlRewriteFilter in the past for Java web apps.


Welcome – Lumidant’s First Blog Post

I’ve been creating websites for years, since I started Tab World Online while in high school. That site was receiving 1 million page views per month before I sold it, though admittedly I knew comparatively little about web development at the time. So today, I decided I’d take the plunge, live in the spirit of the times, and start a blog. I pick up quite a lot of knowledge on a day-to-day basis that I thought would be worth passing along, so to begin I’ll share what it took to get this blog up and running:

The first step was choosing a blogging platform. I chose WordPress for two reasons. The first is that it’s probably the most common blogging platform, which helps when looking for themes, plug-ins, and support. The second is that I was in discussions with a potential client about migrating his current site from an outdated, kludgy CMS to WordPress.

Then it was time to get the ball rolling, so I followed the WordPress 5-minute installation.

Visiting my newly created blog showed a sample post on a very ugly site (no offense to the default theme creator). It was clear that a new site design was imperative to getting started blogging. I did a Google search for WordPress themes and found one released under the GPL called Almost Spring. I tweaked it a bit to fit my needs, starting with adding the Lumidant logo to the page header, which was easy enough since the header was in a file called (surprise!) header.php.

The next essential step towards getting started was changing the URL structure. By default, links looked like:

http://www.lumidant.com/blog/?p=123

This isn’t great for search engine optimization purposes and really it’s just not as pretty or as intuitive as the alternative I chose:

http://www.lumidant.com/blog/getting-started-blogging/

I’m not sure why the date and name based option isn’t used as the default, but it can be found under “Options >> Permalinks >> Date and name based”. Perhaps it is to allow WordPress to run on hosts without mod_rewrite enabled. I chose the custom option of just my post name as I believe it will be better for SEO since some search engines prefer posts which are not buried deeply in many layers of directories. I think sites should be designed for people first and foremost, so the shorter URL makes me smile.

I personally prefer tagging to categorizing, so I replaced the categories with a tag cloud by simply calling the handy function wp_tag_cloud(). I also added the RSS images to the theme; they can be found under wp-includes\images.

Finally, I thought I’d see how many visitors this new blog would garner, so I installed the Ultimate Google Analytics WordPress plug-in and configured it by going to “Options >> Ultimate GA” in the WordPress admin screen. Among other reasons, this is preferable to simply pasting the tracking code in the theme footer because the tracking keeps working when you switch themes. The only problem was that when I viewed the HTML output, I noticed the plug-in was using the legacy analytics code, so I had to update the plug-in source to use the newer tracking code. If you do this yourself, just remember to escape the single quotes in the analytics source with a ‘\’ character:

<script type="text/javascript">
    var gaJsHost = (("https:" == document.location.protocol) ? "https://ssl." : "http://www.");
    document.write(unescape("%3Cscript src=\'" + gaJsHost + "google-analytics.com/ga.js\' type=\'text/javascript\'%3E%3C/script%3E"));
</script>
<script type="text/javascript">
    var pageTracker = _gat._getTracker("'.uga_get_option('account_id').'");
    pageTracker._initData();
    pageTracker._trackPageview();
</script>
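If you have a longer snippet to paste into a single-quoted PHP string, the escaping can be done mechanically rather than by hand. The helper below is a small Python sketch of that step; its name is made up, and it is not part of the plug-in.

```python
def escape_for_single_quoted_php(source: str) -> str:
    """Escape text so it can sit inside a PHP single-quoted string literal."""
    # Escape backslashes first, so the backslashes added for the quotes
    # are not doubled by the second replacement.
    return source.replace("\\", "\\\\").replace("'", "\\'")


snippet = "<script src='ga.js' type='text/javascript'></script>"
print(escape_for_single_quoted_php(snippet))
# <script src=\'ga.js\' type=\'text/javascript\'></script>
```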

Undoubtedly, I will continue to make other changes to this blog’s layout as I familiarize myself with the WordPress environment, but I’m an 80/20 type of guy and this blog’s close enough to 80% done to share with the world now.
