michael@0: # This Source Code Form is subject to the terms of the Mozilla Public
michael@0: # License, v. 2.0. If a copy of the MPL was not distributed with this
michael@0: # file, You can obtain one at http://mozilla.org/MPL/2.0/.
michael@0:
michael@0: Rough notes on setting up this test app. jrgm@netscape.com 2001/08/05
michael@0:
michael@0: 1) This is intended to be run as a mod_perl application under an Apache web
michael@0: server. [It is possible to run it as a cgi-bin script, but then you will pay
michael@0: the cost of forking perl and recompiling all the required modules on each
michael@0: page load.]
michael@0:
michael@0: 2) It should be possible to run this under Apache on win32, but I expect that
michael@0: there are *nix-oriented assumptions that have crept in. (You would also need
michael@0: a replacement for Time::HiRes, probably by using Win32::API to call the
michael@0: Windows 'GetLocalTime()' function directly.)
michael@0:
michael@0: 3) You need to have a few "non-standard" Perl Modules installed. This script
michael@0: will tell you which ones are not installed (let me know if I have left some
michael@0: out of this test).
michael@0:
michael@0: --8<--------------------------------------------------------------------
michael@0: #!/usr/bin/perl
michael@0: # Report which of the required modules are installed, and their versions.
michael@0: use strict;
michael@0: my @modules = qw{
michael@0:     LWP::UserAgent   SQL::Statement   Text::CSV_XS   DBD::CSV
michael@0:     DBI              Time::HiRes      CGI::Request   URI
michael@0:     MIME::Base64     HTML::Parser     HTML::Tagset   Digest::MD5
michael@0: };
michael@0: for my $module (@modules) {
michael@0:     printf "%20s", $module;
michael@0:     # String eval, so a missing module is reported instead of being fatal.
michael@0:     eval "use $module;";
michael@0:     if ($@) {
michael@0:         print ", I don't have that.\n";
michael@0:     } else {
michael@0:         print ", version: ", eval "\$${module}::VERSION", "\n";
michael@0:     }
michael@0: }
michael@0: --8<--------------------------------------------------------------------
michael@0:
michael@0: For modules that are missing, you can find them at http://www.cpan.org/.
michael@0: Download the .tar.gz files you need, and then (for the most part) just
michael@0: do 'perl Makefile.PL; make; make test; make install'.
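The build steps above can be wrapped in a small shell function. This is only a
sketch: the name 'install_cpan_tarball' is made up for illustration, and it
assumes the tarball unpacks into a directory named after the file (the usual
CPAN convention). It chains the steps with '&&' rather than ';' so that a
failed 'make test' stops the install.

```shell
# install_cpan_tarball TARBALL
# Hypothetical helper: unpack a CPAN distribution and build it.
install_cpan_tarball() {
    tarball=$1
    # Assumes Foo-Bar-1.23.tar.gz unpacks to Foo-Bar-1.23/.
    dist_dir=$(basename "$tarball" .tar.gz)
    tar xzf "$tarball" || return 1
    (
        # Subshell, so the caller's working directory is left alone.
        cd "$dist_dir" || exit 1
        perl Makefile.PL && make && make test && make install
    )
}
```

Usage would be something like 'install_cpan_tarball URI-1.23.tar.gz', run from
the directory you downloaded into (as root if installing system-wide).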
michael@0:
michael@0: [Update: 28-Mar-2003] I recently installed Red Hat 7.2, as a server, which
michael@0: installed Apache 1.3.20 with mod_perl 1.24 and perl 5.6.0. I then ran the
michael@0: CPAN shell (`perl -MCPAN -e shell') and, after completing configuration, I
michael@0: did 'install Bundle::CPAN', 'install Bundle::LWP' and 'install DBI' to
michael@0: upgrade those modules and their dependencies. These instructions work on OSX
michael@0: as well; make sure you run the CPAN shell with sudo so that you have
michael@0: sufficient privileges to install the files.
michael@0:
michael@0: CGI::Request seems to have disappeared from CPAN, but you can get a copy
michael@0: from and then install
michael@0: with the standard `perl Makefile.PL; make; make test; make install'.
michael@0:
michael@0: To install the SQL::Statement, Text::CSV_XS, and DBD::CSV modules, there is
michael@0: a bundle available on CPAN, so you can use the CPAN shell and just enter
michael@0: 'install Bundle::DBD::CSV'.
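If you would rather not type into the interactive CPAN shell, the same installs
can probably be scripted, since CPAN.pm exports 'install' for use in
one-liners. A sketch only: 'cpan_install' and the 'PERL' override are names
invented here. The override lets you substitute "sudo perl", or "echo" for a
dry run that just prints the arguments.

```shell
# cpan_install MODULE...
# Hypothetical helper: install each module or bundle through CPAN.pm.
# Set PERL="sudo perl" if you need root, or PERL=echo for a dry run.
cpan_install() {
    for module in "$@"; do
        # $PERL is unquoted on purpose so that "sudo perl" splits into two words.
        # The '|| return 1' only catches perl itself failing to run; CPAN.pm
        # reports build problems in its own output.
        ${PERL:-perl} -MCPAN -e 'install($ARGV[0])' "$module" || return 1
    done
}
```

For the modules named in these notes that would be, e.g.,
'cpan_install Bundle::CPAN Bundle::LWP DBI Bundle::DBD::CSV'. CPAN.pm still
needs to have been configured once beforehand.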
michael@0:
michael@0: At the end of this, the output for the test program above was the
michael@0: following. (Note: you don't necessarily have to have the exact version
michael@0: numbers for these modules, as far as I know, but something close would be
michael@0: safest).
michael@0:
michael@0: LWP::UserAgent, version: 2.003
michael@0: SQL::Statement, version: 1.005
michael@0: Text::CSV_XS, version: 0.23
michael@0: DBD::CSV, version: 0.2002
michael@0: DBI, version: 1.35
michael@0: Time::HiRes, version: 1.43
michael@0: CGI::Request, version: 2.75
michael@0: URI, version: 1.23
michael@0: MIME::Base64, version: 2.18
michael@0: HTML::Parser, version: 3.27
michael@0: HTML::Tagset, version: 3.03
michael@0: Digest::MD5, version: 2.24
michael@0:
michael@0: 4) There is code to draw a sorted graph of the final results, but I have
michael@0: disabled the place in 'report.pl' where its use would be triggered (look
michael@0: for the comment). This is so that you can run this without having gone
michael@0: through the additional setup of the 'gd' library, and the modules GD and
michael@0: GD::Graph. If you have those in place, you can turn this on by just
michael@0: re-enabling the print statement in report.pl.
michael@0:
michael@0: [Note - 28-Mar-2003: with Red Hat 7.2, libgd.so.1.8.4 is preinstalled to
michael@0: /usr/lib. The current GD.pm modules require libgd 2.0.5 or higher, but you
michael@0: can use 1.8.4 if you install GD.pm version 1.40, which is available at
michael@0: . Just do 'perl
michael@0: Makefile.PL; make; make install' as usual. I chose to build with JPEG
michael@0: support, but without FreeType, XPM and GIF support. I had a test error when
michael@0: running 'make test', but it works fine for my purposes. I then installed
michael@0: 'GD::Text' and 'GD::Graph' from the CPAN shell.]
michael@0:
michael@0: 5) To set this up with Apache, create a directory in the cgi-bin for the web
michael@0: server called e.g. 'page-loader'.
michael@0:
michael@0: 5a) For Apache 1.x/mod_perl 1.x, place this in the Apache httpd.conf file,
michael@0: and skip to step 5c.
michael@0:
michael@0: --8<--------------------------------------------------------------------
michael@0: Alias /page-loader/ /var/www/cgi-bin/page-loader/
michael@0: <Location /page-loader>
michael@0: SetHandler perl-script
michael@0: PerlHandler Apache::Registry
michael@0: PerlSendHeader On
michael@0: Options +ExecCGI
michael@0: </Location>
michael@0: --8<--------------------------------------------------------------------
michael@0:
michael@0: [MacOSX note: The CGI folder lives in /Library/WebServer/CGI-Executables/
michael@0: so the Alias line above should instead read:
michael@0:
michael@0: Alias /page-loader/ /Library/WebServer/CGI-Executables/page-loader/
michael@0:
michael@0: Case is important (even though the file system is case-insensitive) and
michael@0: if you type it incorrectly you will get "Forbidden" HTTP errors.
michael@0:
michael@0: In addition, perl (and mod_perl) aren't enabled by default. You need to
michael@0: uncomment two lines in httpd.conf:
michael@0: LoadModule perl_module libexec/httpd/libperl.so
michael@0: AddModule mod_perl.c
michael@0: (basically just search for "perl" and uncomment the lines you find).]
michael@0:
michael@0: 5b) If you're using Apache 2.x and mod_perl 1.99/2.x (tested with Red Hat 9),
michael@0: place this in your perl.conf or httpd.conf:
michael@0:
michael@0: --8<--------------------------------------------------------------------
michael@0: Alias /page-loader/ /var/www/cgi-bin/page-loader/
michael@0:
michael@0: <Location /page-loader>
michael@0: SetHandler perl-script
michael@0: PerlResponseHandler ModPerl::RegistryPrefork
michael@0: PerlOptions +ParseHeaders
michael@0: Options +ExecCGI
michael@0: </Location>
michael@0: --8<--------------------------------------------------------------------
michael@0:
michael@0: If your mod_perl version is less than 1.99_09, then copy RegistryPrefork.pm
michael@0: to your vendor_perl ModPerl directory (for example, on Red Hat 9, this is
michael@0: /usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi/ModPerl).
michael@0:
michael@0: If you are using mod_perl 1.99_09 or above, grab RegistryPrefork.pm from
michael@0: http://perl.apache.org/docs/2.0/user/porting/compat.html#C_Apache__Registry___C_Apache__PerlRun__and_Friends
michael@0: and copy it to the vendor_perl directory as described above.
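If you are not sure where the vendor_perl directory is on your system, perl can
usually tell you. A minimal sketch, assuming your perl was built with a vendor
directory (the 'vendorarch' entry in Config.pm); the function name is invented
for this example.

```shell
# modperl_vendor_dir
# Hypothetical helper: print the ModPerl directory under perl's vendorarch
# path, or fail if perl reports no vendor directory.
modperl_vendor_dir() {
    vendorarch=$(perl -MConfig -e 'print $Config{vendorarch}') || return 1
    [ -n "$vendorarch" ] || return 1
    printf '%s/ModPerl\n' "$vendorarch"
}
```

The copy step then becomes something like
'cp RegistryPrefork.pm "$(modperl_vendor_dir)/"', run as root. Check the
printed path against your distribution's layout before copying.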
michael@0:
michael@0: 5c) When you're finished, restart Apache. Now you can run this as
michael@0: 'http://yourserver.domain.com/page-loader/loader.pl'
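A quick way to confirm that the alias and handler are working is to fetch the
URL from the command line and look at the HTTP status. This is a sketch: it
assumes 'curl' is installed and that a working loader.pl answers with 200; the
function name is made up.

```shell
# check_loader URL
# Hypothetical helper: succeed only if URL answers with HTTP 200.
# A 403 here usually points at the Alias path or ExecCGI settings.
check_loader() {
    status=$(curl -s --max-time 5 -o /dev/null -w '%{http_code}' "$1") || return 1
    [ "$status" = "200" ]
}

# Typical sequence, run as root (apachectl may live elsewhere on your system):
#   apachectl configtest && apachectl restart
#   check_loader http://yourserver.domain.com/page-loader/loader.pl
```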
michael@0:
michael@0: 6) You need to create a subdirectory called 'db' under the 'page-loader'
michael@0: directory. This subdirectory 'db' must be writable by the UID that Apache
michael@0: executes as (e.g., 'nobody' or 'apache'). [You may want to figure out some
michael@0: other way to do this if this web server is not behind a firewall.]
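One way to set this up is to hand the directory to the Apache user instead of
making it world-writable. A minimal sketch; 'make_db_dir' is a hypothetical
name, and the user to pass in depends on your Apache configuration.

```shell
# make_db_dir APP_DIR [APACHE_USER]
# Hypothetical helper: create APP_DIR/db, writable by its owner only.
make_db_dir() {
    db_dir=$1/db
    mkdir -p "$db_dir" || return 1
    # Owner gets write access; everyone else is read-only.
    chmod 755 "$db_dir" || return 1
    # Changing the owner normally requires root, so only try when asked.
    if [ -n "${2:-}" ]; then
        chown "$2" "$db_dir" || return 1
    fi
}
```

For example, as root: 'make_db_dir /var/www/cgi-bin/page-loader apache'.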
michael@0:
michael@0: 7) You need to assemble a set of content pages, with all images, included JS
michael@0: and CSS pulled to the same directory. These pages can live anywhere on the
michael@0: same HTTP server that is running this app. The app assumes that each page
michael@0: is in its own sub-directory, with included content below that
michael@0: directory. You can set the location and the list of pages in the file
michael@0: 'urllist.txt'. [See 'urllist.txt' for further details on what needs to be
michael@0: set there.]
michael@0:
michael@0: There are various tools that will pull in complete copies of web pages
michael@0: (e.g. 'wget' or something handrolled from LWP::UserAgent). You should edit
michael@0: the pages to remove any redirects, popup windows, and possibly any platform
michael@0: specific JS rules (e.g., Mac specific CSS included with
michael@0: 'document.write("LINK...'). You should also check for missing content,
michael@0: or URLs that did not get changed to point to the local content. [One way to
michael@0: check for this is to tweak this simple proxy server to check your links:
michael@0: http://www.stonehenge.com/merlyn/WebTechniques/col34.listing.txt]
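As a rough illustration of the pull-and-check step, the sketch below mirrors
one page with wget and then greps the saved files for src/href attributes that
still point at an absolute http(s) URL. Both function names are invented here,
and the grep is deliberately crude: it also flags ordinary outbound links,
which are harmless, so treat its output as a list of things to look at.

```shell
# mirror_page DEST_DIR URL
# Hypothetical helper: save URL plus its images, JS and CSS flat into DEST_DIR.
mirror_page() {
    wget --page-requisites --convert-links --no-directories \
         --span-hosts --directory-prefix="$1" "$2"
}

# find_remote_refs DIR
# Hypothetical helper: list src/href attributes under DIR that still
# reference an absolute http:// or https:// URL.
find_remote_refs() {
    grep -rniE '(src|href)[[:space:]]*=[[:space:]]*["'\'']?https?://' "$1"
}
```

For example: 'mirror_page pages/example.com http://example.com/' followed by
'find_remote_refs pages/example.com'. The 'pages/' location is just a
placeholder for wherever your content pages live.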
michael@0:
michael@0: [MacOSX note: The web files live in /Library/WebServer/Documents, so you will
michael@0: need to modify urllist.txt to have the appropriate FILEBASE and HTTPBASE.]
michael@0:
michael@0: 8) The "hook" into the content is a single line in each top-level document like this:
michael@0:
michael@0: which should be placed immediately after the opening element. The script uses
michael@0: this as the way to substitute a BASE HREF and some JS into the page which will control
michael@0: the execution of the test.
michael@0:
michael@0: 9) You will most likely need to remove all load event handlers from your
michael@0: test documents (onload attribute on body and handlers added with
michael@0: addEventListener).
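To find the handlers that need removing, a recursive grep over the content
directory is usually enough. A sketch with an invented function name; the
pattern covers 'onload=' (attributes and 'window.onload = ...' assignments)
and addEventListener calls whose first argument is the string "load". It will
miss handlers registered in less direct ways.

```shell
# find_load_handlers DIR
# Hypothetical helper: list lines under DIR that appear to install a
# load event handler.
find_load_handlers() {
    grep -rniE 'onload[[:space:]]*=|addEventListener[[:space:]]*\([[:space:]]*["'\'']load["'\'']' "$1"
}
```

Run it against each page directory and edit the files it reports.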
michael@0:
michael@0: 10) Because the system uses (X)HTML base, and some XML constructs are not
michael@0: subject to that (for example xml-stylesheet processing instructions),
michael@0: you may need to provide the absolute path to external resources.
michael@0:
michael@0: 11) If your documents are transformed on the client side with XSLT, you will
michael@0: need to add this snippet of XSLT to your stylesheet (and possibly make
michael@0: sure it does not conflict with your other rules):
michael@0: --8<--------------------------------------------------------------------
michael@0:
michael@0:
michael@0:
michael@0:
michael@0:
michael@0:
michael@0:
michael@0:
michael@0:
michael@0: --8<--------------------------------------------------------------------
michael@0: And near the top of your output rules add:
michael@0:
michael@0: Finally make sure you define the XHTML namespace in the stylesheet
michael@0: with "html" prefix.
michael@0:
michael@0: 12) I've probably left some stuff out. Bug jrgm@netscape.com for the missing stuff.