Cost-Effective Website Acceleration pt. 1: Client-side Optimization

This three-part series outlines a common sense, cost-effective approach to Website acceleration according to the two simple laws of Web performance:

1. Send as little data as possible
2. Send it as infrequently as possible

If used properly, these basic principles should result in:

  • Faster Web page loads
  • Reduction of server usage
  • Improved bandwidth utilization

These techniques should not only improve user satisfaction with your site or Web-based application, but should also help you save money on site delivery costs.

The principles presented in this series apply not only to developer-accessible Web page source such as (X)HTML, CSS, and JavaScript, but also to Web server configuration and modification. Some suggestions may touch on structural site changes or modifications to server-side programming environments, but our primary focus will be on relatively easy changes that can be made to existing sites.

The techniques derived from the two principles mentioned above fall into three major categories:

  1. Client-side code optimization - dealt with in Part 1 of this series, which starts below.
  2. Optimal cache control - Part 2 of this series
  3. Server-side modifications - Part 3 of this series

In the first part of this series, we'll focus on client-side code optimization, the easiest and generally cheapest of the three site acceleration techniques to implement. As outlined above, the subsequent parts of the series will deal with cache control and server-side modifications in turn.

Part 1 - Client Side Acceleration

Code for Yourself, Compile for Delivery
Any application programmer knows that there are good reasons why the code we work with is not the code we should deliver. It's best to comment source code extensively, to format it for maximum readability, and to avoid overly terse but convoluted syntax that makes maintenance difficult. Later, we use a compiler to translate that source code into some other form that's optimized for performance and protected from reverse engineering.

This model can be applied to Web development as well. To do so, you would take the "source" version of your site and prepare it for delivery by "crunching" it down through simple techniques like white space reduction, image and script optimization, and file renaming. You would then take your delivery-ready site and post it.

Now, presumably, this isn't too foreign a concept, as you're probably at least working on a copy of your site, rather than posting changes directly to the live site. If not, please stop reading right now and make a copy of your site! This is the only proper way to develop, regardless of whether the site is a static brochure or a complex, CMS-driven application. If you don't believe us now, you will some day in the very near future when you ruin some of your site files and can't easily recover them.

As you build your site, you're probably focused on the biggest culprits in site download speed reduction — images and binary files like Flash. While reducing the colors in GIF files, compressing JPEGs, and optimizing SWF files will certainly help a great deal, there are still plenty of other areas for improvement.

Remembering the first rule of Web performance, we should always strive to send as few bytes as possible, regardless of whether the file is markup, image, or script. Now, it might seem like wasted effort to focus on shaving bytes here and there in (X)HTML, CSS or JavaScript. However, this may be precisely where the greatest attention ought to be paid.

During a typical Web page fetch, an (X)HTML document is the first to be delivered to a browser. We can dub this the host document, as it determines the relationships among all other files. Once the (X)HTML document is received, the browser begins to parse the markup and, in doing so, often initiates a number of requests for dependent objects such as external scripts, linked style sheets, images, embedded Flash, and so on. These CSS and JavaScript files may, in turn, host additional calls for related image or script files.

The faster these requests for dependent files are queued up, the faster they get back to the browser and start rendering in the page. Given the importance of the host document, it is critical to have it delivered to the browser and parsed as quickly as possible: despite constituting a relatively small percentage of the overall page weight, it can dramatically impede the loading of the page. Remember: users don't measure bytes, they measure time!

So what, specifically, do you need to do to fully prep your site for optimal delivery? The basic approach involves reducing white space, crunching CSS and JavaScript, renaming files, and similar strategies for making the delivered code as terse as possible (see Google for an example). These general techniques are well known and documented both on the Web and in books like Andy King's Speed up Your Site: Website Optimization.

In this article, we present what we consider to be the top twenty markup and code optimization techniques. You can perform some of these optimizations by hand, find Web editors and utilities that handle a few of them for you, or roll your own crunching utilities. We also, somewhat shamelessly, point you to a tool developed at Port80 Software called the w3compiler. This tool is the only one on the market today that provides a reference implementation for nearly all the optimizing features described here, and it serves as a legitimate example of the "real world" value of code optimization. Now, on with the tips!

Markup Optimization
Typical markup is either tight, hand-crafted, standards-focused code filled with comments and formatting white space, or bulky, editor-generated markup with excessive indenting, editor-specific comments often used as control structures, and redundant or needless markup or code. Neither case is optimal for delivery. The following tips are safe and easy ways to decrease file size:

1. Remove white space wherever possible.
In general, runs of white space characters (spaces, tabs, newlines) can safely be collapsed, but of course avoid changing the contents of <pre> and <textarea> tags, or of any element affected by the white-space CSS property.
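
As a simple (hypothetical) illustration, indented source markup like this:

<ul>
  <li><a href="/products.html">Products</a></li>
  <li><a href="/about.html">About</a></li>
</ul>

can be collapsed for delivery to:

<ul><li><a href="/products.html">Products</a></li><li><a href="/about.html">About</a></li></ul>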

2. Remove comments.
Almost all comments can safely be removed; the exceptions are client-side conditional comments for IE and the DOCTYPE declaration, which is not really a comment but could be mistaken for one by a naive stripper.
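
For example, an ordinary development note such as:

<!-- begin global navigation -->

can be stripped without consequence, whereas a conditional comment like the following (the stylesheet name is purely illustrative) must be kept, because IE actually acts on it:

<!--[if IE]><link rel="stylesheet" href="ie-fixes.css"><![endif]-->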

3. Remap color values to their smallest forms.
Rather than using all hex values or all color names, use whichever form is shortest in each particular case. For example, a color attribute value like #ff0000 could be replaced with red, while lightgoldenrodyellow would become #fafad2.

4. Remap character entities to their smallest forms.
As with color substitution, you can substitute a numeric entity for a longer alpha-oriented entity. For example, &Egrave; would become &#200;. Occasionally, this works in reverse as well: &#240; saves a byte if referenced as &eth;. However, this is not quite as safe to do, and the potential savings are limited.

5. Remove useless tags.
Some "junk" markup, such as tags applied multiple times or certain <meta> tags used as advertisements for editors, can safely be eliminated from documents.

Questionable Markup Optimization Techniques
While the first five techniques can result in significant savings on the order of ten to fifteen percent, many tools and developers looking for maximum delivery compression employ some questionable techniques, including:

  • Quote removal on attributes
  • Doctype statement elimination
  • Optional close tag removal
  • Tag substitution like <strong> to <b>

While it is true that most browsers will make sense of whatever "tag soup" they are handed, reasonable developers will not rely on this and will, instead, always attempt to deliver standards-compliant markup. Generally speaking, the problems associated with bypassing standards (for example, diminished portability and interoperability) outweigh the small gains in speed, and, in the case of missing closing tags, there may even be a performance penalty at page render. While sites like Google have consciously employed many of these techniques on their homepage markup, you probably don't need to go that far. We suggest that you avoid them unless you have extreme performance requirements.

CSS Optimizations
CSS is also ripe for simple optimizations. In fact, most CSS created today tends to compress much more successfully than (X)HTML. The following techniques are all safe, except for the final one, the complexities of which demonstrate the extent to which client-side Web technologies can be intertwined.

6. Remove CSS white space.
As is the case with (X)HTML, CSS is not terribly sensitive to white space, and thus its removal is a good way to significantly reduce the size of both CSS files and <style> blocks.
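
For example, a nicely formatted rule such as:

p {
  color: red;
  font-size: 36pt;
}

can be delivered as:

p{color:red;font-size:36pt}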

7. Remove CSS comments.
Just like markup comments, CSS comments should be removed, as they provide no value to the typical end user. However, a CSS masking comment in a <style> tag probably should not be removed if you're concerned about down-level browsers.
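
The masking comment in question is the HTML-style comment wrapper sometimes placed around embedded rules for the benefit of very old browsers. In a block like the following (a hypothetical example), the /* */ comment can go, but the <!-- and --> lines should stay if you still care about those browsers:

<style type="text/css">
<!--
p {color: red;} /* brand color for body text */
-->
</style>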

8. Remap colors in CSS to their smallest forms.
As in HTML, CSS colors can be remapped from word to hex format. However, the advantage gained by doing this in CSS is slightly greater. The main reason for this is that CSS supports three-hex color values like #fff for white.
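
For example, combining word-to-hex and hex-shortening remappings:

body {background-color: #ffffff; color: lightgoldenrodyellow;}

becomes:

body{background-color:#fff;color:#fafad2}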

9. Combine, reduce, and remove CSS rules.
CSS properties such as font-size, font-weight, and so on can often be expressed in a shorthand notation using the single font property. When employed properly, this technique allows you to take something like:

p {font-size: 36pt; font-family: Arial; line-height: 48pt; font-weight: bold;}

and rewrite it as:

p{font:bold 36pt/48pt Arial;}

You also may find that some rules in style sheets can be significantly reduced or even completely eliminated if inheritance is used properly. So far, there are no automatic rule-reduction tools available, so CSS wizards will have to hand-tweak for these extra savings. However, the upcoming 2.0 release of the w3compiler will include this feature.

10. Rename class and id values.
The most dangerous optimization that can be performed on CSS is to rename class or id values. Consider a rule like this:

.superSpecial {color: red; font-size: 36pt;}

It might seem appropriate to rename the class to sS. You might also take an id rule such as:

#firstParagraph {background-color: yellow;}

Here, you could use #fp in place of #firstParagraph, changing the appropriate id values throughout the document. Of course, in doing this you start to run into the problem of markup-style-script dependency: if a tag has an id value, it is possible that this value is used not only for a style sheet, but also as a script reference, or even a link destination. If you modify this value, you need to make very sure that you modify all related script and link references as well. These references may even be located in other files, so be careful.

Changing class values is not quite as dangerous, since experience shows that most JavaScript developers tend not to manipulate class values as often as they do id values. However, class name reduction ultimately suffers from the same problem as id reduction, so, again, be careful.

Note: You should probably never remap name attributes, particularly on form fields, as these values are also operated on by server-side programs that would have to be altered as well. Though not impossible, calculating such dependencies would be difficult in many Website environments.

JavaScript Optimization
More and more sites rely on JavaScript to provide navigational menus, form validation, and a variety of other useful things. Not surprisingly, much of this code is quite bulky and begs for optimization. Many of the techniques for JavaScript optimization are similar to those used for markup and CSS. However, JavaScript optimization must be performed far more carefully because, if it's done improperly, the result is not just a visual distortion, but potentially a broken page! Let's start with the most obvious and easiest improvements, then move on to those that require greater care.

11. Remove JavaScript comments.
Except for the <!-- //--> masking comment, all JavaScript comments indicated by // or /* */ can safely be removed, as they offer no value to end users (except those who want to understand how your script works).
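
For instance, in a hypothetical inline script like the one below, the // comment can be stripped, but the <!-- and //--> masking lines should be left alone if you are still catering to very old browsers:

<script type="text/javascript">
<!--
// warn the user if validation fails
function warn() { alert("Please complete all fields."); }
//-->
</script>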

12. Remove white space in JavaScript.
Interestingly, white space removal in JavaScript is not nearly as beneficial as it might seem. Code like this:

x = x + 1;

can obviously be reduced to:

x=x+1;

However, because of the common sloppy coding practice of JavaScript developers failing to terminate lines with semi-colons, white space reduction can cause problems. For example, consider the legal JavaScript below, which uses implied semi-colons:

x=x+1
y=y+1

A simple white space remover might produce the following:

x=x+1y=y+1

This would obviously throw an error. If you add the needed semi-colons to produce:

x=x+1;y=y+1;

you actually gain nothing in byte count. We still encourage this transformation, however, since Web developers who provided feedback on the Beta versions of the w3compiler found the "visually compressed" script more satisfying (perhaps as visual confirmation that they are looking at transformed rather than original code). They also liked the side benefit of delivering more obfuscated code.

13. Perform code optimizations.
Simple ideas, like removing implied semi-colons, var statements in certain cases, or empty return statements, can help to further reduce some script code. Shorthand can also be employed in a number of situations. For example:

x=x+1;

can become:

x++;

However, be careful, as it's quite easy to break your code unless your optimizations are very conservative.

14. Rename user-defined variables and function names.
For good readability, any script should use variables like sumTotal instead of s.

However, for download speed, the lengthy variable sumTotal is a liability and it provides no user value, so s is a much better choice. Here, again, writing your source code in a readable fashion and then using a tool to prepare it for delivery shows its value, since remapping all user-defined variable and function names to short one- and two-letter identifiers can produce significant savings.
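
As a hypothetical sketch of what such a remapping looks like, readable source code such as:

function calculateTotal(itemPrice, itemCount)
{
  var sumTotal = itemPrice * itemCount;
  return sumTotal;
}

might be delivered as:

function c(a,b){var s=a*b;return s}

Of course, if a renamed function is called from markup (an onclick attribute, for example), that reference has to be updated as well, just as with the id and class dependencies discussed earlier.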

15. Remap built-in objects.
The bulkiness of JavaScript code, beyond long user-defined variable names, comes from the use of built-in objects like window, document, navigator, and so on. For example, consider this code:

alert(window.navigator.appName);
alert(window.navigator.appVersion);
alert(window.navigator.userAgent);

You could rewrite the above as follows:

w=window;n=w.navigator;a=alert;
a(n.appName);
a(n.appVersion);
a(n.userAgent);

This type of remapping is quite valuable when objects are used repeatedly, which they generally are. Note, however, that if the window or navigator object were used only once, these substitutions would actually make the code bigger, so be careful if you are optimizing by hand. Fortunately, many JavaScript code optimizers will take this into account automatically.

This tip brings up a related issue regarding the performance of scripts with remapped objects: in addition to the benefit of size reduction, such remappings actually slightly improve script execution times because the objects are copied higher up into JavaScript's scope chain. This technique has been used for years by developers who write JavaScript games, and while it can improve both download and execution performance, it does so at the expense of local browser memory usage.

The Obfuscation Side Effect of JavaScript Optimization
You'll notice that, if you apply these various JavaScript optimizations, the source code becomes effectively unreadable or, some might even say, obfuscated. While it's true that the reverse engineering of optimized JavaScript can be difficult, it is far from impossible. Real obfuscation would use variables like O1l1l1O0l1 and Ol11l001l, so that unraveling the code would be more confusing. Some may even go so far as to employ light encryption on the page. However, be aware that, in general, obfuscation and optimization can be at odds with each other, to the point that more obfuscated code may be larger than the original code. Fortunately, lightweight code obfuscation is generally enough to deter casual code thieves, while still offering performance improvements.

File-Related Optimization
The last set of optimization techniques is related to file and site organization. Some of the optimizations mentioned here might require server modifications or site restructuring.

16. Rename non-user accessed dependent files and directories.
Sites will often have file names such as SubHeaderAbout.gif or rollover.js for dependent objects that are never accessed by a user via the URL. Very often, these are kept in a standard directory like /images, so you may see markup like this:

<img src="/images/SubHeaderAbout.gif">

Or, worse:

<img src="../../../images/SubHeaderAbout.gif">

Given that these files will never be accessed directly, this readability provides no value to the user, only to the developer. For delivery's sake, it would make more sense to use markup like this:

<img src="/0/a.gif">

While manual file-and-directory remapping can be an intensive process, some content management systems can deploy content to target names, including shortened values.

Furthermore, the w3compiler has a feature that automatically copies and sets up these dependencies. If used properly, this can result in very noticeable savings in the (X)HTML files that reference these objects, and can also make reworking of stolen site markup much more difficult.

17. Shorten all page URLs using a URL rewriter.
Notice that the previous step does not suggest renaming the host files like products.html, which would change markup like this:

<a href="products.html">Products</a>

to something like this:

<a href="p.html">Products</a>

The main reason is that end users will see a URL like http://www.sitename.com/p.html, rather than the infinitely more usable http://www.sitename.com/products.html.

However, it is possible to reap the benefits of file name reduction in your source code without sacrificing meaningful page URLs if you combine the renaming technique with a change to your Web server's configuration. For example, you could substitute p.html for products.html in your source code, but then set up a URL rewriting rule to be used by a server filter like mod_rewrite to expand the URL back into a user friendly value. Note that this trick will only put the new URL in the user's address bar if the rewrite rule employs an "external" redirect, thereby forcing the browser to re-request the page. In this case, the files themselves are not renamed, as the short identifiers are only used in the source code URLs.
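
As a rough sketch of how this might look on Apache (the rule below is hypothetical and uses an external redirect; check the mod_rewrite documentation before deploying anything similar), an .htaccess file could expand the short URL used in your source code back into the friendly one:

RewriteEngine On
# Externally redirect the short URL referenced in the source code
# to the user-friendly page name that appears in the address bar.
RewriteRule ^p\.html$ /products.html [R,L]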

Because of the reliance on URL rewriting and the lack of widespread developer access to, and understanding of, such server-side tools as mod_rewrite, even an advanced tool like the w3compiler does not currently promote this technique. However, considering that sites like Yahoo! actively employ this technique for significant savings, it should not be ignored, as it does produce noticeable (X)HTML reduction when extremely descriptive directory and file names are used in a site.

18. Remove or reduce file extensions.
Interestingly, there is really little value in including file extensions such as .gif, .jpg, .js, and so on. The browser does not rely on these values to render a page; rather, it uses the MIME type sent in the response's Content-Type header. Knowing this, we might take:

<img src="images/SubHeaderAbout.gif">

and shorten it to:

<img src="images/SubHeaderAbout">

If combined with file renaming, this might produce:

<img src="/0/sA">

Don't be scared by how strange this technique looks; your actual file will still be sA.gif. It's just the end user who won't see it that way!

In order to take advantage of this more advanced technique, however, you do need to make modifications to your server. The main thing you will have to do is enable something called "content negotiation," which may be native to your server or require an extension such as mod_negotiation for Apache or Port80's pageXchanger for IIS. The downside is that content negotiation may cause a slight performance hit on your server.
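
On Apache, for instance, switching this on can be as simple as the following sketch (it assumes mod_negotiation is available and that MultiViews is acceptable for the directory in question; IIS users would configure a filter such as pageXchanger instead):

# Enable content negotiation so that a request for /images/SubHeaderAbout
# can be answered with SubHeaderAbout.gif (or another matching variant).
Options +MultiViews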

However, the benefits of adding content negotiation far outweigh the costs. Clean URLs improve both security and portability of your sites, and even allow for adaptive content delivery whereby you can send different image types or languages to users based upon their browser's capabilities or system preferences! See "Towards Next Generation URLs" by the same authors for more information.

Note: Extension-less URLs will not hurt your search engine ranking. Port80 Software, as well as major sites like the W3C, use this technique and have suffered no ill effects.

19. Restructure <script> and <style> inclusions for optimal number of requests.
You will often find in the <head> of an HTML document markup such as:

<script src="/scripts/rollovers.js"></script>
<script src="/scripts/validation.js"></script>
<script src="/scripts/tracking.js"></script>

In most cases, this could be reduced to:

<script src="/0/g.js"></script>

Here, g.js contains all the globally used functions. While the break-up of the script files into three pieces makes sense for maintainability, for delivery, it does not. The single script download is far more efficient than three separate requests, and it even reduces the amount of needed markup. Interestingly, this approach mimics the concept of linking in a traditional programming language compiler.

20. Consider cacheability at the code level.
One of the most important improvements that can be made to site performance is improved cacheability. Web developers may be very familiar with using the <meta> tag to set cache control, but (apart from the fact that meta has no effect on proxy caches) the true value of cacheability is found in its application to dependent objects such as images and scripts.

To prepare your site for improved caching, you should consider segmenting your dependent objects according to frequency of change, storing your more cacheable items in a directory like /cache or /images/cache. Once you start organizing your site this way, it will be very easy to add cache control rules that will make your site clearly "pop" for users who are frequent visitors.
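
As a small preview of Part 2 (a hypothetical Apache sketch that assumes mod_expires is loaded; the directory path is illustrative), such a cache directory could then be given a long freshness lifetime in a single place:

# Everything under /images/cache is considered stable and may be cached
# by browsers and proxies for up to a month.
<Directory "/var/www/html/images/cache">
  ExpiresActive On
  ExpiresDefault "access plus 1 month"
</Directory>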

Conclusion
You now have twenty useful code optimization tips to make your site faster. One by one, they may not seem very powerful, but apply them together and you'll see an obvious improvement in site delivery.

In the next installment of this series, we'll focus primarily on caching, explaining how it is generally misused and how you can significantly improve performance with just a few simple changes. See you then!

Originally published on sitepoint.com on March 10, 2004.

About PINT

Headquartered in San Diego since 1994, PINT Inc. (http://www.pint.com) is a nationally recognized interactive Web agency providing web strategy, interactive design, development, user experience, analytics, search marketing, and optimization to global companies and institutions. PINT founder Thomas Powell is the author of eleven best-selling industry textbooks on HTML and Web design. Clients include the San Diego Chargers, ViewSonic, Hewlett-Packard, Allergan, Biogen Idec, UCSD, Linksys, Scripps Health, and USC. For updates and information about PINT and the Web, please subscribe to the PINT blog at http://blog.pint.com and follow PINT on Twitter at http://twitter.com/PINTSD.