Article 11537 of comp.infosystems.www:
Path: feenix.metronet.com!news.utdallas.edu!convex!cs.utexas.edu!howland.reston.ans.net!wupost!texbell.sbc.com!swuts!132.201.57.164!bm1822
From: bm1822@sbc.com (Brian Millett)
Newsgroups: comp.infosystems.www
Subject: wais.pl hack
Message-ID: <BM1822.94Mar28161423@adasv2.sbc.com>
Date: 28 Mar 94 22:14:23 GMT
Sender: usenet@swuts.sbc.com
Lines: 130

Well, I've hacked on the wais.pl script from NCSA to get it to work
for me.  That is, the returned data is now interpreted correctly and I
can fetch the correct document.  It looks like the 'pipes weren't hot
enough' (see 'Programming Perl', pg 110), so STDOUT has to be
unbuffered.  I also had to hack the variable '$headline'.  With the
version of freeWAIS (0.202) I am using, or the way I am indexing, the
headline comes back with the file name first, followed by the path,
which made the URL for the file garbage.  BUT I am still having the
following show up in the browser window:

Searching shakespear.src...Initializing connection...Found 3 items.HTTP/1.0 200 OK Date:
Monday, 28-Mar-94 21:41:11 GMT Server: NCSA/1.1 MIME-version: 1.0 Content-type: text/html

How do I get rid of it?  
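
One thing that may be worth trying, and this is only a guess: the
"Searching ... Found 3 items." text looks like waisq's progress
chatter, and if waisq writes it to STDERR (an assumption, not checked
against the freeWAIS source) it could be reaching the browser ahead of
the server's own header.  A minimal sketch of discarding the child's
STDERR before the exec follows.  It is written in Perl 5 syntax, and
the sh one-liner is a hypothetical stand-in for waisq so the sketch
runs without freeWAIS installed.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for waisq: one status line on STDERR and one
# result line on STDOUT.
my @cmd = ('sh', '-c', 'echo "Searching..." >&2; echo ":score 1000"');

$| = 1;    # unbuffer our own STDOUT before forking, as in the script

# Fork; the child's STDOUT is piped back to the parent as $child.
my $pid = open(my $child, '-|');
die "cannot fork: $!" unless defined $pid;

if ($pid == 0) {
    # Child: throw STDERR away so status chatter cannot reach the client.
    open(STDERR, '>', '/dev/null') or die "cannot reopen STDERR: $!";
    exec(@cmd) or die "cannot exec $cmd[0]: $!";
}

# Parent: only what the child wrote to STDOUT arrives here.
my @lines = <$child>;
close($child);
print @lines;
```

Unlike the one-line open(...) || exec(...) idiom, this version also
tells a failed fork apart from being the child.  If the stray text
turns out to come from somewhere other than STDERR, this will not
help.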

The two hacks can be found by searching for "HACK ALERT".  The new
script follows (you must change the $waisq, $waisd, and $src variables):

#!/usr/local/bin/perl
#
# wais.pl -- WAIS search interface
#
# wais.pl,v 1.1 1993/12/31 09:30:56 robm Exp
#
# Tony Sanders <sanders@bsdi.com>, Nov 1993
#
# Example configuration (in local.conf):
#     map topdir wais.pl &do_wais($top, $path, $query, "database", "title")
#

## CHANGE THESE
$waisq = "/users/bm1822/development/web/freeWAIS-0.202/bin/waisq";
$waisd = "/users/bm1822/development/web/freeWAIS-0.202/wais-sources";
$src = "shakespear";
$title = "REPO Wais documentation";
#end vars

# PrintHeader
# Prints the magic line which tells WWW that we're an HTML document

sub PrintHeader {
	print "Content-type: text/html\n\n";
}

sub send_index {
    
    print "<HEAD>\n<TITLE>Index of ", $title, "</TITLE>\n</HEAD>\n";
    print "<BODY>\n<H1>", $title, "</H1>\n";

    print "This is an index of the information on this server. Please\n";
    print "type a query in the search dialog.\n<P>";
    print "You may use compound searches, such as: <CODE>environment AND cgi</CODE>\n";
    print "<ISINDEX>";
}

sub do_wais {
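#    local($top, $path, $query, $src, $title) = @_;
#    (not used here: the query comes from @ARGV and the rest from the
#    CHANGE THESE globals above, so note that $top is never set)

    local(@query) = @ARGV;
    local($pquery) = join(" ", @query);

##  HACK ALERT
##  FIX added here to flush STDOUT so that the correct content-type
##  message gets delivered.  Setting $| to 1 makes the selected
##  filehandle flush after every print; the nested select() applies
##  that to STDOUT and then restores the previously selected handle.
    select((select(STDOUT), $| = 1)[0]);
##  end ALERT

    # Fork with the child's STDOUT piped back to WAISQ; in the child
    # open() returns 0, so the exec of waisq runs there.
    open(WAISQ, "-|") || exec ($waisq, "-c", $waisd,
                               "-f", "-", "-S", "$src.src", "-g", @query);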

    print "<HEAD>\n<TITLE>Search of ", $title, "</TITLE>\n</HEAD>\n";
    print "<BODY>\n<H1>", $title, "</H1>\n";

    print "Index \`$src\' contains the following\n";
    print "items relevant to \`$pquery\':<P>\n";
    print "<DL>\n";

    local($hits, $score, $headline, $lines, $bytes, $type, $date);
    while (<WAISQ>) {
        /:score\s+(\d+)/ && ($score = $1);
        /:number-of-lines\s+(\d+)/ && ($lines = $1);
        /:number-of-bytes\s+(\d+)/ && ($bytes = $1);
        /:type "(.*)"/ && ($type = $1);
        /:headline "(.*)"/ && ($headline = $1);         # XXX
        /:date "(\d+)"/ && ($date = $1, $hits++, &docdone);
    }
    close(WAISQ);
    print "</DL>\n";

    if ($hits == 0) {
        print "Nothing found.\n";
    }
    print "</BODY>\n";
}

sub docdone {
    if ($headline =~ /Search produced no result/) {
        print "<HR>";
        print $headline, "<P>\n<PRE>";
# the following was &'safeopen
        open(WAISCAT, "$waisd/$src.cat") || die "$src.cat: $!";
        while (<WAISCAT>) {
            s#(Catalog for database:)\s+.*#$1 <A HREF="/$top/$src.src">$src.src</A>#;
            s#Headline:\s+(.*)#Headline: <A HREF="$1">$1</A>#;
            print;
        }
        close(WAISCAT);
        print "\n</PRE>\n";
    } else {
        #
        # HACK ALERT!!!
        # With my index the headline looks like
        #   :headline "macbeth.sp   /users/bm1822/doc/shakespeare/Tragedies/"
        # i.e. file name first, then the directory.  So split on
        # whitespace and put the two back together as path + filename.
        ($filename, $path) = split (' ',$headline);
        print "<DT><A HREF=\"$path$filename\">$filename</A>\n";
        # end ALERT

        print "<DD>Score: $score, Lines: $lines, Bytes: $bytes\n";
    }
    $score = $headline = $lines = $bytes = $type = $date = '';
}

&PrintHeader;
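# No query words on the command line: show the search form.
# ($#ARGV < 0 tests for an empty array; defined() on an array is
# unreliable and is rejected outright by newer perls.)
if ( $#ARGV < 0 ) { &send_index; }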
else { &do_wais; }
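
The headline hack can also be tried on its own, outside the CGI.  The
following is a minimal sketch in Perl 5 syntax; headline_to_href is a
hypothetical helper name, not part of wais.pl, and it assumes the
headline always has the "filename, whitespace, directory" layout shown
in the HACK ALERT comment.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: turn a "filename   /dir/path/" headline into
# (href, label).  Returns an empty list if there is no second field.
sub headline_to_href {
    my ($headline) = @_;

    # split ' ' skips leading whitespace and splits on runs of
    # whitespace; the limit of 2 keeps the rest of the line intact.
    my ($filename, $path) = split(' ', $headline, 2);
    return unless defined $path;

    $path .= '/' unless $path =~ m{/$};   # make sure the directory ends in /
    return ($path . $filename, $filename);
}

my ($href, $label) =
    headline_to_href('macbeth.sp   /users/bm1822/doc/shakespeare/Tragedies/');

# prints <DT><A HREF="/users/bm1822/doc/shakespeare/Tragedies/macbeth.sp">macbeth.sp</A>
print qq{<DT><A HREF="$href">$label</A>\n};
```

A file name that itself contains a space would still be split in the
wrong place, since nothing in the headline marks where the name ends.
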
-- 
Brian Millett                    "The significant problems we face in life
Southwestern Bell Telephone Co.   cannot be solved at the same level of
(314) 235-3866                    thinking we were at when we created them"
bm1822@adasv2.sbc.com              Albert Einstein.


