[17/Sep/2009:19:20] Release 0.5 includes Open Flash Chart support.

[01/Jul/2009:10:50] Repoze.who authentication tutorial added

[22/Jun/2009:11:36] AJAX calculator tutorial added.

[01/May/2009:14:15] MVC/SQL based wiki tutorial added.


Source tree search application deployment (FSCSI search)

This subsection describes how to deploy a "search" application similar to the search demo that searches the source code for the Python distribution.

Get the required components
Build the search index
Try out the search index from the command line
Run the HTTP search interface on a test port
Install the search page as a CGI script
The demo page includes a link to a WHIFF application which searches the Python source code tree and can distinguish between the syntactic components of the source code. For example, the search engine knows the difference between a comment, a string, and a variable name.
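To illustrate what telling a comment from a string from a name can mean, here is a minimal Python 3 sketch built on the standard library tokenize module. It is only an illustration for Python source text: the function name classify_tokens is hypothetical, and nothing here is taken from the fsCSI implementation, which may work quite differently.

```python
import io
import tokenize

def classify_tokens(source):
    """Return (kind, text) pairs for comments, strings, and names in Python source."""
    kinds = {
        tokenize.COMMENT: "comment",
        tokenize.STRING: "string",
        tokenize.NAME: "name",  # note: keywords are also reported as NAME tokens
    }
    found = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        kind = kinds.get(tok.type)
        if kind is not None:
            found.append((kind, tok.string))
    return found

# The same words appear in three different syntactic roles on this line.
sample = 'switch = "hibernate"  # switch to hibernate mode\n'
for kind, text in classify_tokens(sample):
    print(kind, text)
```

A search restricted to comments would match the last token only, while a search for the name "switch" would match the first.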

The fsCSI search components which implement the search demo are distributed as part of the WHIFF distribution under the INSTALL/whiff/servers/fsCSI directory. This tutorial describes how to use these components to build a similar search interface for any source code tree available on your computer.

Get the required components

The fsCSI components require external packages. These modules are not automatically installed -- follow the links to find out how to download the packages and install them. WHIFF itself must also be installed before the search components will work. See the quickstart document if you need to install WHIFF.

Build the search index

Once the components are available you need to build an index for the source tree so the search form can find data quickly. In my case I want to be able to search the code files anywhere under the root directory /export/source/trunk/ and I will create an archive for searching this code in the new directory /export/search. To execute the build I change directory to the fsCSI installation location and run the build command:
$ cd INSTALL/whiff/servers/fsCSI
INSTALL/whiff/servers/fsCSI$ python build /export/search /export/source/trunk
Since the /export/source/trunk directory contains about a gigabyte of source code, the process of building the index takes about a half hour. By doing this build all at once the search interface does not need to look at the file content directly in order to find files containing patterns of interest.  
The trace output is provided for debugging purposes so you can tell if (for example) the build process is trying to read from a malfunctioning NFS partition. If the output freezes for an appreciable length of time, something is probably wrong: kill the process.
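The idea of building the index once so that later searches never reread the files can be sketched with a small inverted index. This is a minimal Python 3 illustration under stated assumptions: the names build_index and WORD are hypothetical, the index lives in memory rather than in an archive directory, and the actual fsCSI index format is not described here.

```python
import os
import re
import tempfile
from collections import defaultdict

# Assumption for this sketch: a "word" is an identifier-like run of characters.
WORD = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def build_index(root):
    """Map each lowercased word to the set of file paths (relative to root) containing it."""
    index = defaultdict(set)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8", errors="ignore") as handle:
                    text = handle.read()
            except OSError:
                continue  # unreadable file: skip it rather than abort the build
            rel = os.path.relpath(path, root)
            for word in WORD.findall(text):
                index[word.lower()].add(rel)
    return index

# Demonstrate on a throwaway two-file tree.
with tempfile.TemporaryDirectory() as root:
    with open(os.path.join(root, "power.c"), "w") as f:
        f.write("/* hibernate */ switch (mode) { }\n")
    with open(os.path.join(root, "main.c"), "w") as f:
        f.write("int main(void) { return 0; }\n")
    index = build_index(root)

print(sorted(index["switch"]))
```

All file reading happens inside build_index; once it returns, a lookup such as index["switch"] touches no files at all, which is why a slow one-time build pays off at query time.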

Try out the search index from the command line

Now that the index is built, we may look for files containing patterns using the command line find interface to the index. For example the search below looks for files containing both "hibernate" and "switch" anywhere.
INSTALL/whiff/servers/fsCSI$ python find \
    /export/search /export/source/trunk hibernate switch
searching for 'hibernate' anywhere
searching for 'switch' anywhere


found 22 paths
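A query for files containing both terms amounts to intersecting the sets of paths recorded for each term. The Python 3 sketch below shows that step on a hypothetical toy index; find_all and toy_index are illustrative names and the paths are made up, so this should not be read as how the find script is implemented.

```python
def find_all(index, terms):
    """Return the sorted paths that contain every term (logical AND)."""
    postings = [index.get(term.lower(), set()) for term in terms]
    if not postings:
        return []
    return sorted(set.intersection(*postings))

# Hypothetical toy index: word -> set of paths containing that word.
toy_index = {
    "hibernate": {"kernel/power.c", "docs/power.txt"},
    "switch": {"kernel/power.c", "kernel/sched.c"},
}
print(find_all(toy_index, ["hibernate", "switch"]))
```

Only kernel/power.c appears under both words, so it is the only path returned. A term that is missing from the index contributes an empty set, which makes the whole result empty.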

Run the HTTP search interface on a test port

Now that we know that the search index works properly, we can use the serve script to view the web search form running on the localhost at the test port 8888.
INSTALL/whiff/servers/fsCSI$ python serve /export/search /export/source/trunk
search start page at http://localhost:8888/search
serving wsgi on port 8888
Point a browser at http://localhost:8888/search and the search interface pops up.
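For readers new to WSGI, the following Python 3 sketch shows what serving a WSGI application on a test port involves, using only the standard library wsgiref package. The application hello_app is a hypothetical placeholder, not the search form, and the helper names serve and call_once are assumptions made for this illustration.

```python
from wsgiref.simple_server import make_server
from wsgiref.util import setup_testing_defaults

def hello_app(environ, start_response):
    """A placeholder WSGI application standing in for the search form."""
    body = ("you requested " + environ.get("PATH_INFO", "/")).encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8"),
                              ("Content-Length", str(len(body)))])
    return [body]

def serve(port=8888):
    """Serve hello_app on localhost until interrupted (Ctrl-C)."""
    with make_server("localhost", port, hello_app) as httpd:
        print("serving wsgi on port", port)
        httpd.serve_forever()

def call_once(path):
    """Invoke the app directly, without opening a socket, and return (status, body)."""
    environ = {}
    setup_testing_defaults(environ)  # fill in the keys a WSGI server would provide
    environ["PATH_INFO"] = path
    seen = {}
    def start_response(status, headers):
        seen["status"] = status
    body = b"".join(hello_app(environ, start_response))
    return seen["status"], body

print(call_once("/search"))
```

Calling serve() starts a blocking test server in the same spirit as the serve script above, while call_once exercises the application in-process, which is handy for a quick check before involving a browser.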

Install the search page as a CGI script

The tested web search interface may now be installed under any web server configuration able to serve WSGI pages. In my case I will use an Apache server and the CGI interface to the server.

The CGI script which sets up the search interface, /usr/local/bin/apache2/cgi-bin/code.cgi, has the following content:


import wsgiref.handlers
import whiff.resolver
from whiff.servers.fsCSI import fsCSI

# create a search application using the /export/search index and the /export/source/trunk tree
searcher = fsCSI.getApplication("/export/search", "/export/source/trunk")

# wrap the search interface as a WHIFF root application
application = whiff.resolver.moduleRootApplication("/cgi-bin/code.cgi", searcher)

# serve a CGI request using the application
wsgiref.handlers.CGIHandler().run(application)
As always, the code.cgi CGI script must be made executable before Apache will use it:
..$ chmod a+x code.cgi
Note that the search application must be wrapped as a WHIFF root application because it requires WHIFF name resolution services.

Now the search interface becomes automatically available on my Apache server at the URL
