Wipe up link rot with Web-based HTML error checker
Category: Web site link checker
Name of tool: HTML Toolbox
Company name: NetMechanic Inc.
Price: free to $200 and more, depending on the number of pages on your site
URL: www.netmechanic.com
Windows platforms supported: All
Quick description: A Web-based service providing a series of site tune-up
tools that check for broken links, misspellings, and other common errors.
Strom-meter:
*** = Hey, not bad. One notch below very cool
Key features:
All done via a browser. There is no software to download and install (see
cons). The product tests for bad links, HTML coding errors, and other
mistakes on your Web pages.
Pros:
Extremely easy and straightforward to use. You go to their web site and
click on a button to start the testing process.
Cons:
Reports should take into account the Microsoft XML page syntax and newest
Netscape browser versions.
Description:
Every web user knows the feeling: you go to a Web page, only to be stopped
by a broken link or some other annoying mistake. The challenge for web site
operators is to keep up with the ever-changing nature of the web. That means
checking your links, correcting grammar and spelling errors, and handling
other mundane housekeeping chores.
Back when I began my own Web site at strom.com several years ago, I had a
rather foolhardy notion of zero tolerance for mistakes. Today, I am more
forgiving, but it still sticks in my craw when I come across a rotten link
on my pages. Often the problem isn't my doing -- someone Out There in the
great cyber universe has reorganized their web site, and an outbound link of
mine goes nowhere now.
Over the years I have tried a wide variety of tools to combat what is
colorfully called link rot. The most reliable method is also the most
tedious: go to your site, bring up a page, and start clicking on the links
therein. Then go to another page, and another. It is time consuming, but
you'll find the rotten links almost every time.
If you are looking for a more automated technique, then you need a link
checker. There are literally hundreds of tools and services available. Over
the years I have used a number of link checker products, and the trouble is
trying to wade through their reports to find the really critical errors that
need fixing. Some of the products are so picky as to be useless, and others
operate at too coarse a level to do much good.
One product that strikes a nice balance is from NetMechanic, called HTML
Toolbox. It provides just the right kind of information, and its reports are
fairly easy to read and to act on.
HTML Toolbox actually runs five separate tests: a link checker that finds
dead links; HTML Check, to spot and fix HTML coding errors; Browser
Compatibility Check, to find unsupported HTML tags in both Netscape and
Microsoft browser versions; Load Time Check, to find pages which are slow to
load; and a spelling checker. The tests vary in their utility.
The toolbox is actually a service: there is no software for you to download,
and everything runs inside your browser. You submit your URL for analysis
and in a few minutes you will get an email notification with a link to your
reports. There are two basic types of reports: a summary, showing each page
scanned with a 1-5 "star" rating and the number of errors reported, and a
more detailed report for each page, linked from the summary.
You can perform a one-time test on up to five pages on your site for free,
and upgrade for $35 to check up to 100 pages on your site. The fee-based
products also operate on a fixed schedule (weekly, biweekly, or monthly), so
you can periodically check your site and get reports emailed to you with the
results. That is perhaps the most valuable service, because you can never
check your pages too often, given how links can change so quickly. For
example, when I ran the tool on one of my pages, I found out that the
magazine Web Review had changed all of its archives of my previously
published articles, and many articles were temporarily gone. Such is the
nature of the web.
Thus, as you would suspect, the main focus of the toolbox is the link
checker report. The free version can test up to 25 links on a particular
page, while the subscription version can test up to 5000 links. The report
shows the line number, the link itself (with a hyperlink to the actual site,
so you can see for yourself if it is broken), and a status indicator.
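To make that report format concrete, here is a minimal Python sketch of a link check that produces the same three columns: line number, link, and status. It is not NetMechanic's implementation; the names LinkCollector, fetch_status, and link_report are hypothetical, and the sketch only follows http and https links found in anchor tags.

```python
from html.parser import HTMLParser
from urllib.error import HTTPError, URLError
from urllib.parse import urljoin
from urllib.request import Request, urlopen


class LinkCollector(HTMLParser):
    """Collects (line_number, href) pairs for every <a href=...> tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # getpos() reports the line on which the tag starts.
                line_number, _column = self.getpos()
                self.links.append((line_number, href))


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url, or a short error string."""
    try:
        request = Request(url, method="HEAD")
        with urlopen(request, timeout=timeout) as response:
            return response.status
    except HTTPError as err:
        return err.code  # e.g. 404 for a rotten link
    except (URLError, OSError) as err:
        return f"error: {getattr(err, 'reason', err)}"


def link_report(html, base_url, check=fetch_status):
    """Return a list of (line_number, absolute_url, status) tuples."""
    collector = LinkCollector()
    collector.feed(html)
    report = []
    for line_number, href in collector.links:
        url = urljoin(base_url, href)  # resolve relative links
        if url.startswith(("http://", "https://")):
            report.append((line_number, url, check(url)))
    return report
```

The check argument is there so that the status lookup can be swapped out, for instance with a stub when you want to try the report logic without touching the network. Some servers reject HEAD requests, so a real checker would probably fall back to GET.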
Another report attempts to do more than check: it inserts in-line comments or
corrections into your HTML. I found this one less
than useful, particularly as it couldn't deal with the XML tagging from
Microsoft Word, which is one of the programs I use to generate my web pages.
(From my perspective, the Microsoft version of XML is enough of a standard
that the service should recognize it as such.)
If you pay a fee, this tool will also make changes to your HTML coding, and
allow you to compare the old and new versions side by side. That sounds good
in theory, but most of my Web pages don't have hard line breaks and the new
code comes with them embedded, so the side-by-side comparison doesn't work
well.
The page load time report shows the average time, in seconds, it will take
to view your page over various browser connections, and also shows the
locations of each of the different web servers that are used to make up the
data on your pages. That can be useful, particularly if you have a link from
somebody else's web server that you have forgotten about on your page.
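The arithmetic behind a report like this is simple enough to sketch in Python. The connection speeds below are illustrative assumptions rather than the list the service uses, the function names are hypothetical, and the estimate ignores latency and server delays.

```python
from urllib.parse import urlparse

# Assumed connection speeds in bits per second (illustrative only).
CONNECTION_SPEEDS_BPS = {
    "14.4K modem": 14_400,
    "28.8K modem": 28_800,
    "56K modem": 56_000,
    "ISDN": 128_000,
    "T1": 1_544_000,
}


def estimated_load_times(total_bytes, speeds=CONNECTION_SPEEDS_BPS):
    """Return {connection: seconds} to transfer total_bytes, ignoring latency."""
    total_bits = total_bytes * 8
    return {name: round(total_bits / bps, 1) for name, bps in speeds.items()}


def servers_used(resource_urls):
    """Return the sorted host names that serve a page's images and other files."""
    hosts = {urlparse(url).hostname for url in resource_urls}
    hosts.discard(None)  # relative URLs carry no host name
    return sorted(hosts)
```

For example, a page whose HTML and images total 36,000 bytes works out to 288,000 bits, or about 10 seconds over a 28.8K modem. Running servers_used over the image and script URLs on a page is a quick way to spot that forgotten link to somebody else's server.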
I was also less fond of the spell checker test. It didn't fare well with my
site, given that I have lots of company names and computer terms that weren't
in its dictionary. You can create a custom dictionary and load it as part
of the service, but that may be more trouble than it is worth. And the
browser compatibility test neglects to include the latest Netscape version
in its reports. Even if it did, the report is still hard to understand.
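As a rough illustration of why a custom dictionary helps with the spell checker, here is a small Python sketch. It is an assumption about how such a check could work, not a description of the service; the function name misspelled_words and the tiny word lists in the usage note are hypothetical.

```python
import re


def misspelled_words(text, dictionary, custom_words=()):
    """Return words from text found in neither dictionary, in order, no repeats."""
    known = {word.lower() for word in dictionary}
    known |= {word.lower() for word in custom_words}
    flagged = []
    seen = set()
    for word in re.findall(r"[A-Za-z']+", text):
        key = word.lower()
        if key not in known and key not in seen:
            seen.add(key)
            flagged.append(word)
    return flagged
```

With a main dictionary that contains only ordinary words, a sentence such as "NetMechanic checks yur links" flags both the company name and the genuine typo. Passing custom_words=["NetMechanic"] leaves only "yur", which is the kind of report you actually want to read.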
Overall, link checking is still far from an exact science, but NetMechanic's
service is a good start and can help you keep your links up to date on your
site. It is priced fairly, and the reports are clear enough to use in
your regular site maintenance routines.
Strom-meter key:
**** = Very cool, very useful
*** = Hey, not bad. One notch below very cool
** = A tad shaky to install and use but has some value.
* = Don't waste your time. Minimal real value.
Bio: David Strom is president of his own consulting firm in Port Washington,
NY. He has tested hundreds of computer products over the past two decades
working as a computer journalist, consultant, and corporate IT manager.
Since 1995 he has written a weekly series of essays on web technologies and
marketing called Web Informant. You can send him email at david@strom.com.