tools/page-loader/README.txt

changeset 0
6474c204b198
     1.1 --- /dev/null	Thu Jan 01 00:00:00 1970 +0000
     1.2 +++ b/tools/page-loader/README.txt	Wed Dec 31 06:09:35 2014 +0100
     1.3 @@ -0,0 +1,206 @@
     1.4 +# This Source Code Form is subject to the terms of the Mozilla Public
     1.5 +# License, v. 2.0. If a copy of the MPL was not distributed with this
     1.6 +# file, You can obtain one at http://mozilla.org/MPL/2.0/.
     1.7 +
     1.8 +Rough notes on setting up this test app. jrgm@netscape.com 2001/08/05
     1.9 +
    1.10 +1) This is intended to be run as a mod_perl application under an Apache web
    1.11 +   server. [It is possible to run it as a cgi-bin, but then you will be paying
    1.12 +   the cost of forking perl and re-compiling all the required modules on each
    1.13 +   page load].
    1.14 +
    1.15 +2) It should be possible to run this under Apache on win32, but I expect that
    1.16 +   there are *nix-oriented assumptions that have crept in. (You would also need
    1.17 +   a replacement for Time::HiRes, probably by using Win32::API to call
    1.18 +   directly into the Windows 'GetLocalTime()' system function.)
    1.19 +
    1.20 +3) You need to have a few "non-standard" Perl Modules installed. This script
    1.21 +   will tell you which ones are not installed (let me know if I have left some
    1.22 +   out of this test).
    1.23 +
    1.24 +--8<--------------------------------------------------------------------
    1.25 +#!/usr/bin/perl 
    1.26 +my @modules = qw{
    1.27 +    LWP::UserAgent   SQL::Statement     Text::CSV_XS      DBD::CSV          
    1.28 +    DBI              Time::HiRes        CGI::Request      URI               
    1.29 +    MIME::Base64     HTML::Parser       HTML::Tagset      Digest::MD5      
    1.30 +    };
    1.31 +for (@modules) {
    1.32 +    printf "%20s", $_;
    1.33 +    eval "use $_;";
    1.34 +    if ($@) {
    1.35 +        print ", I don't have that.\n";
    1.36 +    } else {
    1.37 +        print ", version: ", eval "\$" . "$_" . "::VERSION", "\n";
    1.38 +    }
    1.39 +}
    1.40 +--8<--------------------------------------------------------------------
    1.41 +
    1.42 +   For modules that are missing, you can find them at http://www.cpan.org/.
    1.43 +   Download the .tar.gz files you need, and then (for the most part) just 
    1.44 +   do 'perl Makefile.PL; make; make test; make install'.
    1.45 +
    1.46 +   [Update: 28-Mar-2003] I recently installed Redhat 7.2, as server, which
    1.47 +   installed Apache 1.3.20 with mod_perl 1.24 and perl 5.6.0. I then ran the
    1.48 +   CPAN shell (`perl -MCPAN -e shell') and after completing configuration, I
    1.49 +   did 'install Bundle::CPAN', 'install Bundle::LWP' and 'install DBI' to
    1.50 +   upgrade those modules and their dependencies. These instructions work on OSX
    1.51 +   as well; make sure you run the CPAN shell with sudo so you have sufficient
    1.52 +   privileges to install the files.
    1.53 +
    1.54 +   CGI::Request seems to have disappeared from CPAN, but you can get a copy
    1.55 +   from <http://stein.cshl.org/WWW/software/CGI::modules/> and then install
    1.56 +   with the standard `perl Makefile.PL; make; make test; make install'.
    1.57 +
    1.58 +   To install the SQL::Statement, Text::CSV_XS, and DBD::CSV modules, there is
    1.59 +   a bundle available on CPAN, so you can use the CPAN shell and just enter
    1.60 +   'install Bundle::DBD::CSV'.
    1.61 +
    1.62 +   At the end of this, the output for the test program above was the
    1.63 +   following.  (Note: you don't necessarily have to have the exact version
    1.64 +   numbers for these modules, as far as I know, but something close would be
    1.65 +   safest).
    1.66 +
    1.67 +      LWP::UserAgent, version: 2.003
    1.68 +      SQL::Statement, version: 1.005
    1.69 +        Text::CSV_XS, version: 0.23
    1.70 +            DBD::CSV, version: 0.2002
    1.71 +                 DBI, version: 1.35
    1.72 +         Time::HiRes, version: 1.43
    1.73 +        CGI::Request, version: 2.75
    1.74 +                 URI, version: 1.23
    1.75 +        MIME::Base64, version: 2.18
    1.76 +        HTML::Parser, version: 3.27
    1.77 +        HTML::Tagset, version: 3.03
    1.78 +         Digest::MD5, version: 2.24
    1.79 +
    1.80 +4) There is code to draw a sorted graph of the final results, but I have
    1.81 +   disabled the place in 'report.pl' where its use would be triggered (look
    1.82 +   for the comment). This is so that you can run this without having gone
    1.83 +   through the additional setup of the 'gd' library, and the modules GD and
    1.84 +   GD::Graph. If you have those in place, you can turn this on by just
    1.85 +   reenabling the print statement in report.pl.
    1.86 +
    1.87 +   [Note - 28-Mar-2003: with Redhat 7.2, libgd.so.1.8.4 is preinstalled to
    1.88 +   /usr/lib. The current GD.pm modules require libgd 2.0.5 or higher, but you
    1.89 +   can use 1.8.4 if you install GD.pm version 1.40, which is available at
    1.90 +   <http://stein.cshl.org/WWW/software/GD/old/GD-1.40.tar.gz>. Just do 'perl
    1.91 +   Makefile.PL; make; make install' as usual. I chose to build with JPEG
    1.92 +   support, but without FreeType, XPM and GIF support. I had a test error when
    1.93 +   running 'make test', but it works fine for my purposes. I then installed
    1.94 +   'GD::Text' and 'GD::Graph' from the CPAN shell.]
    1.95 +
    1.96 +5) To set this up with Apache, create a directory in the cgi-bin for the web
    1.97 +   server called e.g. 'page-loader'.
    1.98 +
    1.99 +5a) For Apache 1.x/mod_perl 1.x, place this in the Apache httpd.conf file,
   1.100 +    and skip to step 5c.
   1.101 +
   1.102 +--8<--------------------------------------------------------------------
   1.103 +Alias /page-loader/  /var/www/cgi-bin/page-loader/
   1.104 +<Location /page-loader>
   1.105 +SetHandler  perl-script
   1.106 +PerlHandler Apache::Registry
   1.107 +PerlSendHeader On
   1.108 +Options +ExecCGI
   1.109 +</Location>
   1.110 +--8<--------------------------------------------------------------------
   1.111 +
   1.112 +    [MacOSX note: The CGI folder lives in /Library/WebServer/CGI-Executables/
   1.113 +    so the Alias line above should instead read:
   1.114 +
   1.115 +      Alias /page-loader/  /Library/WebServer/CGI-Executables/page-loader/
   1.116 +    
   1.117 +    Case is important (even though the file system is case-insensitive) and
   1.118 +    if you type it incorrectly you will get "Forbidden" HTTP errors.
   1.119 +    
   1.120 +    In addition, perl (and mod_perl) aren't enabled by default. You need to 
   1.121 +    uncomment two lines in httpd.conf:
   1.122 +      LoadModule perl_module        libexec/httpd/libperl.so
   1.123 +      AddModule mod_perl.c
   1.124 +    (basically just search for "perl" and uncomment the lines you find).]
   1.125 +  
   1.126 +5b) If you're using Apache 2.x and mod_perl 1.99/2.x (tested with Red Hat 9),
   1.127 +    place this in your perl.conf or httpd.conf:
   1.128 +
   1.129 +--8<--------------------------------------------------------------------
   1.130 +Alias /page-loader/  /var/www/cgi-bin/page-loader/
   1.131 +
   1.132 +<Location /page-loader>
   1.133 +SetHandler perl-script
   1.134 +PerlResponseHandler ModPerl::RegistryPrefork
   1.135 +PerlOptions +ParseHeaders
   1.136 +Options +ExecCGI
   1.137 +</Location>
   1.138 +--8<--------------------------------------------------------------------
   1.139 +
   1.140 +   If your mod_perl version is less than 1.99_09, then copy RegistryPrefork.pm
   1.141 +   to your vendor_perl ModPerl directory (for example, on Red Hat 9, this is
   1.142 +   /usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi/ModPerl).
   1.143 +
   1.144 +   If you are using mod_perl 1.99_09 or above, grab RegistryPrefork.pm from
   1.145 +   http://perl.apache.org/docs/2.0/user/porting/compat.html#C_Apache__Registry___C_Apache__PerlRun__and_Friends
   1.146 +   and copy it to the vendor_perl directory as described above.
   1.147 +
   1.148 +5c) When you're finished, restart Apache.  Now you can run this as
   1.149 +    'http://yourserver.domain.com/page-loader/loader.pl'
   1.150 +    
   1.151 +6) You need to create a subdirectory called 'db' under the 'page-loader'
   1.152 +   directory. This subdirectory 'db' must be writable by the UID that Apache
   1.153 +   executes as (e.g., 'nobody' or 'apache'). [You may want to figure out some
   1.154 +   other way to do this if this web server is not behind a firewall].
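   The directory setup in step 6 could be scripted roughly as follows. This
   is only a sketch: the function name 'make_db_dir' is made up for this
   example, the Apache user name differs between systems, and the chown
   step normally needs root.

```shell
#!/bin/sh
# Sketch only: make_db_dir is a hypothetical helper, not part of page-loader.
# Usage: make_db_dir <page-loader directory> [apache user]
make_db_dir() {
    app_dir=$1
    apache_user=${2:-}

    # Create the 'db' subdirectory (and any missing parents).
    mkdir -p "$app_dir/db" || return 1

    # Hand the directory to the UID Apache runs as, if one was given.
    # chown to another user normally requires root.
    if [ -n "$apache_user" ]; then
        chown "$apache_user" "$app_dir/db" || return 1
    fi

    # Owner may write; everyone else may only read and enter.
    chmod 755 "$app_dir/db"
}
```

   For example, run as root: make_db_dir /var/www/cgi-bin/page-loader apache
   (assuming the cgi-bin location from step 5 and an Apache user named
   'apache').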
   1.155 +
   1.156 +7) You need to assemble a set of content pages, with all images, included JS
   1.157 +   and CSS pulled to the same directory. These pages can live anywhere on the
   1.158 +   same HTTP server that is running this app. The app assumes that each page
   1.159 +   is in its own sub-directory, with included content below that
   1.160 +   directory. You can set the location and the list of pages in the file
   1.161 +   'urllist.txt'. [See 'urllist.txt' for further details on what needs to be
   1.162 +   set there.]
   1.163 +
   1.164 +   There are various tools that will pull in complete copies of web pages
   1.165 +   (e.g. 'wget' or something handrolled from LWP::UserAgent). You should edit
   1.166 +   the pages to remove any redirects, popup windows, and possibly any platform
   1.167 +   specific JS rules (e.g., Mac specific CSS included with
   1.168 +   'document.write("LINK...'). You should also check for missing content,
   1.169 +   or URLs that did not get changed to point to the local content. [One way to
   1.170 +   check for this is to tweak this simple proxy server to check your links:
   1.171 +   http://www.stonehenge.com/merlyn/WebTechniques/col34.listing.txt]
   1.172 +   
   1.173 +   [MacOSX note: The web files live in /Library/WebServer/Documents, so you will
   1.174 +   need to modify urllist.txt to have the appropriate FILEBASE and HTTPBASE.]
   1.175 +
   1.176 +8) The "hook" into the content is a single line in each top-level document like this:
   1.177 +      <!-- MOZ_INSERT_CONTENT_HOOK -->
   1.178 +   which should be placed immediately after the opening <HEAD> element. The script uses
   1.179 +   this as the way to substitute a BASE HREF and some JS into the page which will control
   1.180 +   the execution of the test.
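   A quick way to find pages that are still missing the hook from step 8 is
   sketched below. It assumes each page's top-level document is named
   'index.html' inside its own sub-directory; that file name and the
   function name 'find_pages_missing_hook' are assumptions made for this
   example, so adjust them to match your content set.

```shell
#!/bin/sh
# Sketch only: prints every <root>/<page>/index.html that lacks the hook line.
# The index.html naming is an assumption; page-loader itself does not require it.
find_pages_missing_hook() {
    root=$1
    for page in "$root"/*/index.html; do
        # Skip the unexpanded glob when no pages exist.
        [ -f "$page" ] || continue
        # Report the page only if the hook comment is absent.
        grep -q 'MOZ_INSERT_CONTENT_HOOK' "$page" || printf '%s\n' "$page"
    done
}
```

   Running 'find_pages_missing_hook /path/to/pages' prints nothing when
   every page has the hook. Note that it only checks that the hook is
   present, not that it sits immediately after the opening <HEAD> element.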
   1.181 +
   1.182 +9) You will most likely need to remove all load event handlers from your
   1.183 +   test documents (onload attribute on body and handlers added with
   1.184 +   addEventListener).
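   To see which files still contain load handlers (step 9), something like
   the following rough search may help. The function name
   'find_load_handlers' and the choice of '*.html' and '*.js' files are
   assumptions for this example. The pattern is deliberately loose, so
   expect some false positives (e.g. a 'loadstart' listener) and review
   each hit by hand.

```shell
#!/bin/sh
# Sketch only: lists .html/.js files under <root> that appear to set a load
# handler, either via an onload attribute/property or addEventListener("load").
find_load_handlers() {
    root=$1
    # -E extended regex, -i case-insensitive, -l print matching file names only.
    # '|| true' hides grep's "no match" exit status so callers using 'set -e'
    # are not aborted when some files are clean.
    find "$root" -type f \( -name '*.html' -o -name '*.js' \) \
        -exec grep -E -i -l \
        'onload[[:space:]]*=|addEventListener[[:space:]]*\([[:space:]]*.load' {} + \
        || true
}
```

   Running 'find_load_handlers /path/to/pages' prints one file name per
   line; an empty result means no obvious load handlers remain.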
   1.185 +
   1.186 +10) Because the system uses (X)HTML base, and some XML constructs are not
   1.187 +    subject to that (for example xml-stylesheet processing instructions),
   1.188 +    you may need to provide the absolute path to external resources.
   1.189 +
   1.190 +11) If your documents are transformed on the client side with XSLT, you will
   1.191 +    need to add this snippet of XSLT to your stylesheet (and possibly make
   1.192 +    sure it does not conflict with your other rules):
   1.193 +--8<--------------------------------------------------------------------
   1.194 +<!-- Page Loader -->
   1.195 +<xsl:template match="html:script">
   1.196 +  <xsl:copy>
   1.197 +    <xsl:for-each select="@*">
   1.198 +      <xsl:copy/>
   1.199 +    </xsl:for-each>
   1.200 +    <xsl:apply-templates/>
   1.201 +  </xsl:copy>
   1.202 +</xsl:template>
   1.203 +--8<--------------------------------------------------------------------
   1.204 +    And near the top of your output rules add:
   1.205 +       <xsl:apply-templates select="html:script"/>
   1.206 +    Finally make sure you define the XHTML namespace in the stylesheet
   1.207 +    with "html" prefix.
   1.208 +
   1.209 +12) I've probably left some stuff out. Bug jrgm@netscape.com for the missing stuff.
