tools/page-loader/README.txt

Fri, 16 Jan 2015 18:13:44 +0100

author
Michael Schloh von Bennewitz <michael@schloh.com>
branch
TOR_BUG_9701
changeset 14
925c144e1f1f

Integrate suggestion from review to improve consistency with existing code.

     1 # This Source Code Form is subject to the terms of the Mozilla Public
     2 # License, v. 2.0. If a copy of the MPL was not distributed with this
     3 # file, You can obtain one at http://mozilla.org/MPL/2.0/.
     5 Rough notes on setting up this test app. jrgm@netscape.com 2001/08/05
     7 1) this is intended to be run as a mod_perl application under an Apache web
     8    server. [It is possible to run it as a cgi-bin, but then you will be paying
     9    the cost of forking perl and re-compiling all the required modules on each
    10    page load].
    12 2) it should be possible to run this under Apache on win32, but I expect that
    13    there are *nix-oriented assumptions that have crept in. (You would also need
    14    a replacement for Time::HiRes, probably by using Win32::API to call the
    15    Windows 'GetLocalTime()' system function directly.)
    17 3) You need to have a few "non-standard" Perl Modules installed. This script
    18    will tell you which ones are not installed (let me know if I have left some
    19    out of this test).
--8<--------------------------------------------------------------------
#!/usr/bin/perl
# Report which of the required modules are installed, and their versions.
my @modules = qw{
    LWP::UserAgent   SQL::Statement     Text::CSV_XS      DBD::CSV
    DBI              Time::HiRes        CGI::Request      URI
    MIME::Base64     HTML::Parser       HTML::Tagset      Digest::MD5
    };
for (@modules) {
    printf "%20s", $_;
    # Try to load the module; $@ is set if it is not installed.
    eval "use $_;";
    if ($@) {
        print ", I don't have that.\n";
    } else {
        # Look up the module's package variable $VERSION by name.
        print ", version: ", eval "\$" . "$_" . "::VERSION", "\n";
    }
}
--8<--------------------------------------------------------------------
    39    For modules that are missing, you can find them at http://www.cpan.org/.
    40    Download the .tar.gz files you need, and then (for the most part) just 
    41    do 'perl Makefile.PL; make; make test; make install'.
    43    [Update: 28-Mar-2003] I recently installed Redhat 7.2, as server, which
    44    installed Apache 1.3.20 with mod_perl 1.24 and perl 5.6.0. I then ran the
    45    CPAN shell (`perl -MCPAN -e shell') and after completing configuration, I
    46    did 'install Bundle::CPAN', 'install Bundle::LWP' and 'install DBI' to
    47    upgrade those modules and their dependencies. These instructions work on OSX
    48    as well; make sure you run the CPAN shell with sudo so you have sufficient
    49    privileges to install the files.
    51    CGI::Request seems to have disappeared from CPAN, but you can get a copy
    52    from <http://stein.cshl.org/WWW/software/CGI::modules/> and then install
    53    with the standard `perl Makefile.PL; make; make test; make install'.
    55    To install the SQL::Statement, Text::CSV_XS, and DBD::CSV modules, there is
    56    a bundle available on CPAN, so you can use the CPAN shell and just enter
    57    'install Bundle::DBD::CSV'.
    59    At the end of this, the output for the test program above was the
    60    following.  (Note: you don't necessarily have to have the exact version
    61    numbers for these modules, as far as I know, but something close would be
    62    safest).
    64       LWP::UserAgent, version: 2.003
    65       SQL::Statement, version: 1.005
    66         Text::CSV_XS, version: 0.23
    67             DBD::CSV, version: 0.2002
    68                  DBI, version: 1.35
    69          Time::HiRes, version: 1.43
    70         CGI::Request, version: 2.75
    71                  URI, version: 1.23
    72         MIME::Base64, version: 2.18
    73         HTML::Parser, version: 3.27
    74         HTML::Tagset, version: 3.03
    75          Digest::MD5, version: 2.24
    77 4) There is code to draw a sorted graph of the final results, but I have
    78    disabled the place in 'report.pl' where its use would be triggered (look
    79    for the comment). This is so that you can run this without having gone
    80    through the additional setup of the 'gd' library, and the modules GD and
    81    GD::Graph. If you have those in place, you can turn this on by just
    82    re-enabling the print statement in 'report.pl'.
    84    [Note - 28-Mar-2003: with Redhat 7.2, libgd.so.1.8.4 is preinstalled to
    85    /usr/lib. The current GD.pm modules require libgd 2.0.5 or higher, but you
    86    can use 1.8.4 if you install GD.pm version 1.40, which is available at
    87    <http://stein.cshl.org/WWW/software/GD/old/GD-1.40.tar.gz>. Just do 'perl
    88    Makefile.PL; make; make install' as usual. I chose to build with JPEG
    89    support, but without FreeType, XPM and GIF support. I had a test error when
    90    running 'make test', but it works fine for my purposes. I then installed
    91    'GD::Text' and 'GD::Graph' from the CPAN shell.]
    93 5) To set this up with Apache, create a directory in the cgi-bin for the web
    94    server called e.g. 'page-loader'.
    96 5a) For Apache 1.x/mod_perl 1.x, place this in the Apache httpd.conf file,
    97     and skip to step 5c.
--8<--------------------------------------------------------------------
Alias /page-loader/  /var/www/cgi-bin/page-loader/
<Location /page-loader>
SetHandler  perl-script
PerlHandler Apache::Registry
PerlSendHeader On
Options +ExecCGI
</Location>
--8<--------------------------------------------------------------------
   109     [MacOSX note: The CGI folder lives in /Library/WebServer/CGI-Executables/
   110     so the Alias line above should instead read:
   112       Alias /page-loader/  /Library/WebServer/CGI-Executables/page-loader/
   114     Case is important (even though the file system is case-insensitive) and
   115     if you type it incorrectly you will get "Forbidden" HTTP errors.
   117     In addition, perl (and mod_perl) aren't enabled by default. You need to 
   118     uncomment two lines in httpd.conf:
   119       LoadModule perl_module        libexec/httpd/libperl.so
   120       AddModule mod_perl.c
   121     (basically just search for "perl" and uncomment the lines you find).]
   123 5b) If you're using Apache 2.x and mod_perl 1.99/2.x (tested with Red Hat 9),
   124     place this in your perl.conf or httpd.conf:
--8<--------------------------------------------------------------------
Alias /page-loader/  /var/www/cgi-bin/page-loader/

<Location /page-loader>
SetHandler perl-script
PerlResponseHandler ModPerl::RegistryPrefork
PerlOptions +ParseHeaders
Options +ExecCGI
</Location>
--8<--------------------------------------------------------------------
   137    If your mod_perl version is less than 1.99_09, then copy RegistryPrefork.pm
   138    to your vendor_perl ModPerl directory (for example, on Red Hat 9, this is
   139    /usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi/ModPerl).
   141    If you are using mod_perl 1.99_09 or above, grab RegistryPrefork.pm from
   142    http://perl.apache.org/docs/2.0/user/porting/compat.html#C_Apache__Registry___C_Apache__PerlRun__and_Friends
   143    and copy it to the vendor_perl directory as described above.
   145 5c) When you're finished, restart Apache.  Now you can run this as
   146     'http://yourserver.domain.com/page-loader/loader.pl'
   148 6) You need to create a subdirectory called 'db' under the 'page-loader'
   149    directory. This subdirectory 'db' must be writable by the UID that Apache
   150    executes as (e.g., 'nobody' or 'apache'). [You may want to figure out some
   151    other way to do this if this web server is not behind a firewall].
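A minimal sketch of step 6, assuming a POSIX shell. The helper name
'make_db_dir' and its arguments are hypothetical and not part of page-loader.
Pass the Apache user ('nobody', 'apache', or whatever your server runs as) as
the second argument; the chown usually requires root.

```shell
#!/bin/sh
# Hypothetical helper, not shipped with page-loader.
# Usage: make_db_dir APP_DIR [APACHE_USER]
make_db_dir() {
    app_dir=$1
    apache_user=${2:-}

    # Create APP_DIR/db (and APP_DIR itself if it is missing).
    mkdir -p "$app_dir/db" || return 1

    if [ -n "$apache_user" ]; then
        # Hand the directory to the UID Apache runs as; usually needs root.
        chown "$apache_user" "$app_dir/db" || return 1
    fi

    # Owner-only access, so that only the owning UID can read or write it.
    chmod 700 "$app_dir/db"
}
```

For example: 'make_db_dir /var/www/cgi-bin/page-loader apache'. Without the
second argument the directory stays owned by whoever ran the command, so
Apache can write to it only if that is the same user.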
   153 7) You need to assemble a set of content pages, with all images, included JS
   154    and CSS pulled to the same directory. These pages can live anywhere on the
   155    same HTTP server that is running this app. The app assumes that each page
   156    is in its own sub-directory, with included content below that
   157    directory. You can set the location and the list of pages in the file
   158    'urllist.txt'. [See 'urllist.txt' for further details on what needs to be
   159    set there.]
   161    There are various tools that will pull in complete copies of web pages
   162    (e.g. 'wget' or something handrolled from LWP::UserAgent). You should edit
   163    the pages to remove any redirects, popup windows, and possibly any platform
   164    specific JS rules (e.g., Mac specific CSS included with
   165    'document.write("LINK...'). You should also check for missing content,
   166    or URLs that did not get changed to point to the local content. [One way to
   167    check for this is to tweak this simple proxy server to check your links:
   168    http://www.stonehenge.com/merlyn/WebTechniques/col34.listing.txt]
   170    [MacOSX note: The web files live in /Library/WebServer/Documents, so you will
   171    need to modify urllist.txt to have the appropriate FILEBASE and HTTPBASE.]
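One possible way to do the fetch-and-check part of step 7 from the shell. Both
function names are hypothetical. 'mirror_page' assumes GNU wget and a network
connection; 'list_remote_refs' only needs a grep that supports -r and -o (GNU
and BSD grep both do). It is a rough filter for src/href attributes that still
point at an http(s) URL, not an HTML parser, so treat its output as a list of
places to inspect by hand.

```shell
#!/bin/sh
# Hypothetical helpers, not shipped with page-loader.

# mirror_page URL DEST_DIR
# Fetch a page plus its images, CSS and JS, rewriting links to local copies.
mirror_page() {
    wget --page-requisites --convert-links --no-host-directories \
         --directory-prefix="$2" "$1"
}

# list_remote_refs DIR
# Print file:line:match for every src/href that is still an absolute URL.
# Exits non-zero when nothing is found, as grep does.
list_remote_refs() {
    grep -rnoiE "(src|href)[[:space:]]*=[[:space:]]*[\"']?https?://[^\"' >]*" "$1"
}
```

Run 'list_remote_refs' on the directory that holds your page sets after
mirroring; each reported line is a reference that was not converted to local
content.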
   173 8) The "hook" into the content is a single line in each top-level document like this:
   174       <!-- MOZ_INSERT_CONTENT_HOOK -->
   175    which should be placed immediately after the opening <HEAD> element. The script uses
   176    this as the way to substitute a BASE HREF and some JS into the page which will control
   177    the execution of the test.
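If you have many pages, adding the hook by hand gets tedious. The following is
a sketch of one way to script it, assuming a POSIX shell with awk and mktemp.
The function name 'insert_hook' is hypothetical. It splits the line at the
first opening <head> tag (any letter case, with or without attributes) and
puts the hook on its own line right after it. Files that already contain the
hook are left alone.

```shell
#!/bin/sh
# Hypothetical helper, not shipped with page-loader.
# Usage: insert_hook FILE
insert_hook() {
    hook_file=$1
    hook='<!-- MOZ_INSERT_CONTENT_HOOK -->'

    # Do nothing if the hook is already present.
    if grep -q 'MOZ_INSERT_CONTENT_HOOK' "$hook_file"; then
        return 0
    fi

    hook_tmp=$(mktemp) || return 1
    awk -v hook="$hook" '
        # First line containing an opening <head> tag: split it there.
        !done && match(tolower($0), /<head([ \t][^>]*)?>/) {
            print substr($0, 1, RSTART + RLENGTH - 1)
            print hook
            rest = substr($0, RSTART + RLENGTH)
            if (rest != "") print rest
            done = 1
            next
        }
        { print }
    ' "$hook_file" > "$hook_tmp" && cat "$hook_tmp" > "$hook_file"
    hook_status=$?
    rm -f "$hook_tmp"
    return $hook_status
}
```

The result is written back with 'cat' rather than 'mv' so the original file
keeps its permissions. A page with no <head> tag is passed through unchanged.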
   179 9) You will most likely need to remove all load event handlers from your
   180    test documents (onload attribute on body and handlers added with
   181    addEventListener).
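To find the documents that still need this cleanup, a rough search such as the
one below may help. The function name 'find_load_handlers' is hypothetical,
and the pattern is only a heuristic: it looks for 'onload=' attributes and for
addEventListener calls whose first argument is the quoted string 'load'.
Handlers registered in other ways will not be caught.

```shell
#!/bin/sh
# Hypothetical helper, not shipped with page-loader.
# Usage: find_load_handlers DIR
# Prints the name of each file under DIR that appears to set a load handler.
find_load_handlers() {
    grep -rliE "onload[[:space:]]*=|addEventListener\\([[:space:]]*[\"']load[\"']" "$1"
}
```

The search is case-insensitive, so 'onLoad=' and 'ONLOAD=' are reported too.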
   183 10) Because the system uses (X)HTML base, and some XML constructs are not
   184     subject to that (for example xml-stylesheet processing instructions),
   185     you may need to provide the absolute path to external resources.
   187 11) If your documents are transformed on the client side with XSLT, you will
   188     need to add this snippet of XSLT to your stylesheet (and possibly make
   189     sure it does not conflict with your other rules):
--8<--------------------------------------------------------------------
<!-- Page Loader -->
<xsl:template match="html:script">
  <xsl:copy>
    <!-- Copy attributes first; they must be added before any child nodes. -->
    <xsl:for-each select="@*">
      <xsl:copy/>
    </xsl:for-each>
    <xsl:apply-templates/>
  </xsl:copy>
</xsl:template>
--8<--------------------------------------------------------------------
   201     And near the top of your output rules add:
   202        <xsl:apply-templates select="html:script"/>
   203     Finally make sure you define the XHTML namespace in the stylesheet
   204     with "html" prefix.
   206 12) I've probably left some stuff out. Bug jrgm@netscape.com for the missing stuff.
