DOCUMENTATION

WebTester 1.04
by Darryl C. Burgdorf (burgdorf@awsd.com)

http://awsd.com/scripts/webtester/

===========================================

WebTester (formerly WebMapper) is a handy site management tool, the
primary purpose of which is to check your site for broken links. It
will report both on missing files and on those which exist but aren't
referenced. It can also check the validity of your external links.
(That way, if someone moves or removes a page to which you've linked,
you'll know that the link needs to be updated!) The script is fairly
robust; in addition to following "straight" links, it will also parse
and follow links in image maps, in text embedded via SSIs, and
(optionally) in CGI-generated pages. It will let you know the
effective "download" file size of your pages, and will provide
estimates of how long they'll take to download at various connection
speeds. Finally, it can create for you a simple "site map" showing, in
outline format, all of your site's pages. (The site map can be
included via SSIs on other pages.)

===========================================

The files that you need are as follows:

webtester.pl: This is the main program file. You don't actually need
to do anything to it; in fact, you don't even have to execute it.

config.pl: This is the configuration file. Most everything you need to
change or modify is contained here. This is also the file that you
will execute. (Things are set up this way so that you can effectively
maintain multiple versions of the script, for example if you want to
run separate site checks for different sites, just by keeping separate
config files for each.)

As noted above, the WebTester configuration file, and not the
WebTester program itself, should be executed. The configuration file
should, of course, be set executable. Make sure that the first line of
the script matches the location of your system's Perl interpreter.
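To make the arrangement above concrete, here is a rough sketch of what a configuration file might look like. It is not a copy of the config.pl distributed with WebTester: every path, URL and site name is a made-up placeholder, and the closing "require" line is only an assumption about how the configuration file hands control to webtester.pl. Check the config.pl that came with your copy for the real layout.

```perl
#!/usr/bin/perl
# config.pl: illustrative sketch only. Every path, URL and name below
# is a hypothetical placeholder, not a value shipped with WebTester.
#
# The first line above must match the location of your system's Perl
# interpreter ("which perl" will usually tell you where it is).

$InFile    = "/usr/foo/public_html/index.html";      # "key" file
$OutFile   = "/usr/foo/public_html/sitecheck.html";  # site check report
$MapFile   = "/usr/foo/public_html/sitemap.html";    # omit for no map

# Base directory and CGI-BIN directory, both minus trailing slashes.
$LocalPath = "/usr/foo/public_html";
$LocalURL  = "http://www.example.com";
$CGIPath   = "/usr/foo/cgi-bin";
$CGIURL    = "http://www.example.com/cgi-bin";

$SiteName  = "Example Site";

$MissingLinks    = 1;   # 1 = report files that exist but aren't referenced
$IgnoreExternals = 1;   # 1 = skip the (slow) check of external links
$ShowOnlyErrors  = 0;   # 0 = report everything, not just problems

# Assumption: the main program is pulled in from the configuration
# file along these lines. Adjust the path to wherever webtester.pl lives.
require "/usr/foo/cgi-bin/webtester.pl";
```

With a file like this saved as config.pl, you would typically make it executable with "chmod 755 config.pl" and then run it as "./config.pl". A second site would simply get its own copy of the file with different settings.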
As well, the following variables need to be defined:

$InFile: The absolute path of the file to be used as the "key" file.
This will usually be the main "index.html" file for the site you're
mapping.

$OutFile: The absolute path of the "site check" file to be created by
WebTester.

$MapFile: The absolute path of the "site map" file to be created. If
this variable is left undefined, no map file will be created.

$LocalPath and $LocalURL: The absolute path and URL (both minus
trailing slashes) of the base directory of the site to be checked.

$CGIPath and $CGIURL: The absolute path and URL (both minus trailing
slashes) of the site's CGI-BIN directory.

$ImageMapPointer and $ImageMapPath: These variables are used to help
the script correctly locate image map files. If you don't use image
maps, or reference them via "normal" URLs, you won't need to worry
about assigning them. However, if you use image maps which are
referenced via the old method of tacking the map's address to the end
of the imagemap program's address, you will need to assign them. (For
example, if your image map is referenced via the URL
"/cgi-bin/imagemap/foo/mapdirectory/mapfile", but the real path to
find it is "/usr/foo/mapdirectory/mapfile", you'd want to set
$ImageMapPointer = "/cgi-bin/imagemap/foo" and $ImageMapPath =
"/usr/foo". This tells WebTester, which unfortunately isn't inherently
smart enough to decipher all the possible variations in image map
addressing, where to find the map file.)

$SiteName: The name of the site.

$Avoid: A regex (regular expression) identifying any particular files
that you don't wish to be examined for links.

$ParseCGI: A regex identifying any CGI scripts which you want to have
parsed. If left undefined, the existence of any CGI scripts will be
noted, but they won't be run. (Generally, that will be what you want
to do.) However, if you have CGI scripts which generate actual pages
which you want included in the site map, note them in this variable.
$MissingLinks: If this variable is set to "1" the script will tell you
about files which exist but are not referenced. If it is set to "0"
the information will not be included. (Setting it to "0" is useful if
your directory structure contains a large number of files unrelated to
your Web site.)

$IgnoreExternals: If this variable is set to "1" the script will not
check the validity of external links. If it is set to "0" it will
check them. Note that checking them can take a significant amount of
time if you have a lot of links!

$ShowOnlyErrors: If this variable is set to "1" the "site check"
report will only tell you about problems. If it is set to "0" the
report will tell you about everything.

$MinLevel: This variable allows you a bit of control over how your
site map is constructed, by allowing you to specify that certain files
won't appear too high in the hierarchy. The index file is level 1,
files referenced by it are level 2, files referenced by them are level
3, etc. The "minimum level" for a particular file is the highest level
at which it is allowed to appear on the map. (For example, if file
"/usr/foo/document.html" is referenced from your main index page, but
you want it to appear on the map under your news and information page,
which also references it, you could set its minimum level to "3" and
thus ensure that it won't appear at level 2.) You can also specify
that certain files won't appear on the map at all, simply by giving
them very high minimum levels.

===========================================

This documentation assumes that you have at least a general
familiarity with setting up Perl scripts. If you need more specific
assistance, check with your system administrators, consult the
WebScripts FAQs (frequently-asked questions) file, or ask on the
WebScripts Forum.

-- Darryl C. Burgdorf