
The demo page includes a link to a WHIFF application that searches the Python source code tree and can distinguish between different syntactic components in the source code. For example, the search engine knows the difference between a comment, a string, and a variable name.
This tutorial covers the following steps:
Get the required components
Build the search index
Try out the search index from the command line
Run the HTTP search interface on a test port
Install the search page as a CGI script
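To make the distinction between comments, strings, and variable names concrete, here is a minimal sketch using Python's standard tokenize module. It is not how fsCSI classifies source text (the tutorial does not describe that); the function name classify_tokens is hypothetical, and the sketch only handles Python source, written for Python 3.

```python
import io
import tokenize


def classify_tokens(source):
    """Return (category, text) pairs for comments, strings, and names.

    Illustration only: note that the NAME token type also covers
    Python keywords such as 'if' and 'def'.
    """
    categories = {
        tokenize.COMMENT: "comment",
        tokenize.STRING: "string",
        tokenize.NAME: "name",
    }
    found = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        category = categories.get(tok.type)
        if category is not None:
            found.append((category, tok.string))
    return found


sample = 'switch = "hibernate"  # switch to hibernate mode\n'
for category, text in classify_tokens(sample):
    print(category, repr(text))
```

In this sample the word "switch" appears once as a name and once inside a comment, while "hibernate" appears in a string and in the comment. A search that records the category of each occurrence can tell these cases apart.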
The fsCSI search components which implement
the search demo are distributed as part of the WHIFF
distribution under the INSTALL/whiff/servers/fsCSI
directory. This tutorial describes how to use these
components to build a similar search interface for any source code
tree available on your computer.
In my case the source code tree I want to search lives in the directory
/export/source/trunk/
and I will create an archive for searching this code in the new directory
/export/search. To execute the build, I change directory to the
fsCSI installation location and run the build command:
$ cd INSTALL/whiff/servers/fsCSI
INSTALL/whiff/servers/fsCSI$ python fsCSI.py build /export/search /export/source/trunk
... A LOT OF BUILD TRACE OUTPUT FOLLOWS ...
Because the /export/source/trunk directory contains about a gigabyte
of source code, building the index takes about half an hour.
Building the index once, up front, means the search interface does not need to
read the file contents directly in order to find files containing patterns of
interest.
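The general idea of indexing once and searching many times can be sketched with a small in-memory inverted index. This is an illustration under stated assumptions, not the fsCSI implementation: the names build_index and find_all are hypothetical, the index is not saved to disk, and matching is by whole lowercased word rather than by whatever matching rule fsCSI applies.

```python
import os
import re
import tempfile
from collections import defaultdict

# Assumption for this sketch: a "word" is an identifier-like run of characters.
WORD = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_index(root):
    """Map each lowercased word to the set of file paths under root containing it."""
    index = defaultdict(set)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8", errors="ignore") as handle:
                    text = handle.read()
            except OSError:
                continue  # skip files that cannot be read
            for word in WORD.findall(text):
                index[word.lower()].add(path)
    return index


def find_all(index, *words):
    """Return the sorted paths that contain every one of the given words."""
    if not words:
        return []
    path_sets = [index.get(word.lower(), set()) for word in words]
    return sorted(set.intersection(*path_sets))


# Tiny demonstration on a throwaway directory.
with tempfile.TemporaryDirectory() as root:
    samples = {
        "a.java": "switch (mode) { case HIBERNATE: break; }",
        "b.xml": "<property name='hibernate'/>",
        "c.txt": "switch the light off",
    }
    for name, text in samples.items():
        with open(os.path.join(root, name), "w", encoding="utf-8") as handle:
            handle.write(text)
    index = build_index(root)
    matches = [os.path.basename(p) for p in find_all(index, "hibernate", "switch")]
    print(matches)
```

Once the index exists, a query is answered by intersecting sets of paths, so no file has to be reopened at search time. The cost is paid at build time instead, which is consistent with a long build over a large tree.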
Once the index is built, it can be tried out from the command line using the
fsCSI.py find interface to the index.
For example, the search below looks for files containing both "hibernate" and "switch"
anywhere.
INSTALL/whiff/servers/fsCSI$ python fsCSI.py find \
/export/search /export/source/trunk hibernate switch
searching for 'hibernate' anywhere
searching for 'switch' anywhere
'/export/source/trunk/course-management/cm-impl/rutgers-impl/impl/src/java/edu/rutgers/sakai/coursemanagement/service/RutgersCourseManagementServiceImpl.java'
'/export/source/trunk/entitybroker/api/src/java/org/sakaiproject/entitybroker/DeveloperHelperService.java'
'/export/source/trunk/entitybroker/api/target/classes/src/java/org/sakaiproject/entitybroker/DeveloperHelperService.java'
'/export/source/trunk/gradebook/app/standalone-app/src/test/spring-hib-test.xml'
'/export/source/trunk/gradebook/app/standalone-app/target/test-classes/spring-hib-test.xml'
'/export/source/trunk/kernel/conversion/masterpom.patch'
'/export/source/trunk/kernel/pom.xml'
'/export/source/trunk/master/pom.xml'
'/export/source/trunk/msgcntr/messageforums-app/src/java/org/sakaiproject/tool/messageforums/PrivateMessagesTool.java'
'/export/source/trunk/osp/matrix/api-impl/src/java/org/theospi/portfolio/matrix/HibernateMatrixManagerImpl.java'
'/export/source/trunk/osp/presentation/api-impl/src/java/org/theospi/portfolio/presentation/model/impl/PresentationManagerImpl.java'
'/export/source/trunk/osp/presentation/tool/src/java/org/theospi/portfolio/presentation/control/AddTemplateController.java'
'/export/source/trunk/osp/wizard/api-impl/src/java/org/theospi/portfolio/wizard/mgt/impl/WizardManagerImpl.java'
'/export/source/trunk/reference/docs/conversion/sakai_2_6_0_oracle_conversion.sql'
'/export/source/trunk/reference/docs/docbook/Installation.xml'
'/export/source/trunk/rwiki/xdocs/srs.xml'
'/export/source/trunk/sam/samigo-app/src/java/org/sakaiproject/tool/assessment/ui/bean/evaluation/TotalScoresBean.java'
'/export/source/trunk/sam/samigo-hibernate/src/java/org/sakaiproject/tool/assessment/data/dao/authz/AuthorizationData.java'
'/export/source/trunk/sections/sections-impl/standalone/src/conf/org/sakaiproject/component/using/spring-hib.xml'
'/export/source/trunk/sections/sections-impl/standalone/src/java/org/sakaiproject/component/using/SectionManagerHibernateImpl.java'
'/export/source/trunk/sections/sections-impl/standalone/target/classes/org/sakaiproject/component/using/spring-hib.xml'
'/export/source/trunk/sections/xdocs/README.txt'
found 22 paths
INSTALL/whiff/servers/fsCSI$
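The result list above includes copies under target directories alongside the original files. If the output of the find command is captured as text, a short post-processing step can separate the two. The sketch below assumes the output has the shape shown in the transcript (one quoted path per line) and that directories named target hold build output in this tree; both function names are hypothetical and are not part of fsCSI.

```python
import ast


def parse_find_output(text):
    """Extract the quoted paths from find output shaped like the transcript."""
    paths = []
    for line in text.splitlines():
        line = line.strip()
        # Result lines look like Python string literals: '/some/path'
        if len(line) >= 2 and line.startswith("'") and line.endswith("'"):
            paths.append(ast.literal_eval(line))
    return paths


def drop_build_copies(paths):
    """Drop paths that pass through a directory named 'target' (assumed build output)."""
    return [p for p in paths if "target" not in p.split("/")]


# Abbreviated sample taken from the transcript.
sample_output = """searching for 'hibernate' anywhere
searching for 'switch' anywhere
'/export/source/trunk/entitybroker/api/src/java/org/sakaiproject/entitybroker/DeveloperHelperService.java'
'/export/source/trunk/entitybroker/api/target/classes/src/java/org/sakaiproject/entitybroker/DeveloperHelperService.java'
'/export/source/trunk/kernel/pom.xml'
"""

for path in drop_build_copies(parse_find_output(sample_output)):
    print(path)
```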
Next I use the serve command of the fsCSI.py script to view the web search
form running on the localhost at the test port 8888.
INSTALL/whiff/servers/fsCSI$ python fsCSI.py serve /export/search /export/source/trunk
search start page at http://localhost:8888/search
serving wsgi on port 8888
I point a browser at http://localhost:8888/search and the search interface
pops up.
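For readers new to WSGI, the sketch below shows in general terms what serving a WSGI application on a test port involves, using only the standard library (Python 3). It does not reproduce what fsCSI.py serve does internally: placeholder_app is a hypothetical stand-in for the search application, and serve and call_once are illustrative helper names.

```python
from wsgiref.simple_server import make_server
from wsgiref.util import setup_testing_defaults


def placeholder_app(environ, start_response):
    """Hypothetical stand-in for the search application; echoes the requested path."""
    body = ("requested path: " + environ.get("PATH_INFO", "/")).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def serve(app, port=8888):
    """Serve app on localhost until interrupted (blocks the calling thread)."""
    with make_server("localhost", port, app) as httpd:
        print("serving wsgi on port", port)
        httpd.serve_forever()


def call_once(app, path):
    """Invoke app directly with a synthetic request, without opening a socket."""
    environ = {}
    setup_testing_defaults(environ)
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body


print(call_once(placeholder_app, "/search"))
```

Calling serve(placeholder_app) would start a blocking test server on port 8888. call_once exercises the same application object without a network connection, which is convenient for a quick check before installing anything under a real web server.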
The CGI script that sets up the search interface,
/usr/local/bin/apache2/cgi-bin/code.cgi, has
the following content:
#!/Library/Frameworks/Python.framework/Versions/Current/bin/python
import wsgiref.handlers
import whiff.resolver
from whiff.servers.fsCSI import fsCSI
# create a search application using the /export/search index and the /export/source/trunk tree
searcher = fsCSI.getApplication("/export/search", "/export/source/trunk")
# wrap the search interface as a WHIFF root application
application = whiff.resolver.moduleRootApplication("/cgi-bin/code.cgi", searcher)
# serve a CGI request using the application
wsgiref.handlers.CGIHandler().run(application)
The code.cgi CGI script must be made executable before Apache will use it:
..$ chmod a+x code.cgi
Now the search interface becomes automatically available on my Apache server at the URL http://aaron.oirt.rutgers.edu/cgi-bin/code.cgi/search.