
Building a Simple Search Engine with ASP


March 25, 2004
By Scott Mitchell

As a web site grows, finding content on the site becomes increasingly difficult. To combat the difficulty
of finding relevant information on a large site, many developers turn to writing a search engine for their
site. This article discusses how to implement such a system using Active Server Pages and SQL Server.

There are two "types" of search engines. Both begin by taking a search string from the user, but what
exactly they search differs. A completely dynamic search engine for a completely dynamic web site will
hit a database table which ties an article URL to the article's description. The database can then compare
the user's search request to the descriptions of the available articles and return the relevant URLs.

Another approach is to do an actual text search through each of the files. For example, say that the user
searched for "Microsoft." Your search engine would then look through all of your HTML files and return the
URLs of those which had the word "Microsoft" somewhere in the document. Such a system is used for this web
site's search engine. In my opinion, it is much easier to write such a system in Perl (which this site's
search engine is written in) than in Active Server Pages; however, it is quite possible to write a
text-finding search system in ASP.

In this article I plan to implement the former search engine, the dynamic search engine. For this example
I will make a table called ArticleURL, which will have the following definition:


ArticleURL
Column        Type          Key
ArticleURLID  int           PK
URL           varchar(255)
Title         varchar(100)
Description   varchar(255)

Now that we've got our table definition, let's look at how our web visitors will enter their queries.

Search Querying
A search engine is rather useless unless queries can be made, and the results are returned. Let's examine
how we will code the first needed part, the user search requests. All we will need is a simple HTML FORM
which takes input from the user and passes it on to an ASP page. Here is an example of a file we'll call
SearchStart.htm:


<HTML>
<BODY>

<FORM METHOD=POST ACTION="Search.asp?ID=0">
Search for: <INPUT TYPE=TEXT NAME="txtSearchString" SIZE="50">
<P>
<INPUT TYPE=SUBMIT>
</FORM>

</BODY>
</HTML>

This is not a pretty HTML page, but it is functional. There are many ways to enhance it. For instance, a
JavaScript check can make sure the user has actually entered a search string (i.e. is not just clicking
Submit with an empty box).

Now that we have the query, we need to look at the second phase of any search engine: retrieving the data
and presenting it to the user. Here is where the real fun begins!

Retrieving the Data and Presenting It:
Our ASP page Search.asp must perform a few steps. First, it must parse the FORM variable txtSearchString.
Right now, I am assuming that the words in txtSearchString, separated by spaces, will be ANDed together.
You can alter this (have them ORed), or, to make it more professional, you can give the user the option of
which boolean operator to put between the words.

Next, Search.asp will need to hit the database table ArticleURL and return the data in a user-friendly
fashion. Also, we will want to display the results only 10 records at a time, so logic will need to be
implemented to handle this as well. Let's look at some code.


<%

'Connect to Database
Dim Conn
Set Conn = Server.CreateObject("ADODB.Connection")
Conn.Open Application("MyConnectString")

'Set these up to your preference
DefaultBoolean = "AND"
RecordsPerPage = 10

'Get our form variable
Dim strSearch
strSearch = Request.form("txtSearchString")

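'Get our current ID. This lets us know where we are
Dim ID
ID = Request.QueryString("ID")
'Treat a missing or non-numeric ID as 0 so only a number is ever placed in the SQL
If IsNumeric(ID) Then ID = CLng(ID) Else ID = 0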

'Set up our SQL Statement
Dim strSQL, tmpSQL
strSQL = "SELECT * FROM ArticleURL WHERE "
tmpSQL = "(Description LIKE "

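'OK, we need to parse our string here
'Trim the input and double any single quotes so they cannot end the SQL string literal
strSearch = Replace(Trim(strSearch), "'", "''")
Dim Pos
Pos = 1
While Pos > 0
      Pos = InStr(1, strSearch, " ")
      If Pos = 0 Then
            'We have hit the end
            tmpSQL = tmpSQL & "'%" & strSearch & "%')"
      Else
            'Take the word before the space (Pos - 1 characters, so the space itself is excluded)
            tmpSQL = tmpSQL & "'%" & Mid(strSearch, 1, Pos - 1) & "%' " & DefaultBoolean & " Description LIKE "
            strSearch = Mid(strSearch, Pos + 1)
      End If
Wend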

'Now, we've got to make sure we only get the right records
strSQL = strSQL & tmpSQL & " AND ArticleURLID > " & ID
strSQL = strSQL & " ORDER BY ArticleURLID"    'Important! Paging relies on ascending ID order

'Make our Recordset variable and get the results
Dim rsResults
Set rsResults = Server.CreateObject("ADODB.Recordset")

'Get the right number of records per page
rsResults.MaxRecords = RecordsPerPage

'Set our recordset properties (include ADOVBS.inc for the constant definitions!)
rsResults.CursorType = adForwardOnly

'Get our data, using the connection opened above
rsResults.Open strSQL, Conn

'OK, we've got the data, let's display it in HTML
'First, though, let's get the total number of records
Dim rsTotalRecords
strSQL = "SELECT COUNT(*) FROM ArticleURL WHERE " & tmpSQL
Set rsTotalRecords = Conn.Execute(strSQL)

'We also need the max ID value for our search
Dim rsMaxID
strSQL = "SELECT MAX(ArticleURLID) FROM ArticleURL WHERE " & tmpSQL
Set rsMaxID = Conn.Execute(strSQL)

%>


<HTML>
<BODY>

<% if rsResults.EOF then  'No matches found
%>

No matches found! Try broadening your search criteria.<P>
<A HREF="SearchStart.htm">Return to Search</A>

<% Else

Dim iCurrentID
While Not rsResults.EOF
iCurrentID = rsResults("ArticleURLID") %>
<A HREF="<%=rsResults("URL")%>"> <%=rsResults("Title")%></A>
<%=rsResults("Description")%>

<% rsResults.MoveNext
Wend %>

<P>
<%=rsTotalRecords(0)%> Found!<BR>


<% if iCurrentID < rsMaxID(0) then %>

<!-- We have at least another record... --><BR><FORM METHOD=POST ACTION="Search.asp?ID=<%=iCurrentID%>">
<INPUT TYPE=HIDDEN NAME="txtSearchString" VALUE="<%=Server.HTMLEncode(Request.Form("txtSearchString"))%>">
<INPUT TYPE=SUBMIT VALUE="Next">
</FORM>

<% end if

end if 'End if for .EOF clause above %>

</BODY>
</HTML>

Note: Please forgive me if there are many errors or typos. I wrote this code while writing this article.
It has not been fully tested. In theory it should work. More important than running source code are the
ideas behind the code. Source code is a mere transformation of ideas into something a computer can
understand. If you truly understand the ideas, the code should write itself.

Hopefully you can understand what this code is doing. This file, Search.asp, will be called the first time
a search is executed and each time the user wants to view the next N records. To start out, the file gets
the search string and the current ID. The current ID is an important value: it tells this page which
records we've already seen. The SQL searches for records that have an ArticleURLID greater than the
passed-in ID. To start off, we pass in an ID of 0, so all records (assuming ArticleURLID was set as an
IDENTITY(1,n)) will be included.

Next we parse the search string into a string variable called tmpSQL. If the user searched on "Magnum
P I", tmpSQL would contain (Description LIKE '%Magnum%' AND Description LIKE '%P%' AND Description LIKE
'%I%'). We then add ArticleURLID > ID to our WHERE clause, where ID is the ID we pass into Search.asp.
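
The parsing step can also be tried outside ASP. What follows is a minimal sketch in Python (chosen instead
of VBScript only so that it runs without IIS) of turning a space-separated search string into the WHERE
fragment described above. The function name build_where is hypothetical. Unlike the listing in this
article, the sketch emits ? placeholders plus a separate parameter list rather than concatenating the
user's words into the SQL text; that is an assumption about how you would want to run the query safely,
not something the original code does.

```python
def build_where(search, boolean="AND", column="Description"):
    """Return (where_sql, params) for a space-separated search string.

    Hypothetical helper: the name and the placeholder style are illustrative.
    """
    if boolean not in ("AND", "OR"):
        raise ValueError("boolean must be AND or OR")
    # split() with no arguments also collapses runs of spaces,
    # so "Magnum  P" never produces an empty '%%' term.
    words = search.split()
    if not words:
        raise ValueError("empty search string")
    # One "column LIKE ?" test per word, joined by the chosen operator.
    clause = f" {boolean} ".join(f"{column} LIKE ?" for _ in words)
    # The wildcards live in the parameters, not in the SQL text.
    params = [f"%{word}%" for word in words]
    return f"({clause})", params


where_sql, params = build_where("Magnum P I")
print(where_sql)  # (Description LIKE ? AND Description LIKE ? AND Description LIKE ?)
print(params)     # ['%Magnum%', '%P%', '%I%']
```

Passing boolean="OR" gives the ORed variant mentioned earlier without touching the parsing logic.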

Next, we create an instance of an ADO Recordset object, and set the MaxRecords property to N, where N is
the number of rows we want to display per page. This will only return N records to our recordset object.

Finally we get the total number of records which match our search criteria and the maximum ID which
matches our criteria. We need the maximum ID to determine if we are currently on the last page of results.
Once we have all of this data we are ready to display our information.

We start out by checking whether we have any results in the first place! You'll note the if rsResults.EOF
then. If no records are found, we inform the user that we could find no results and provide a link back
to the SearchStart.htm page from which they came. If, however, rsResults is not empty, we iterate
through the recordset, remembering the ArticleURLID of each row as we display it. We then check to see if
our last ArticleURLID is less than the maximum ID. If it is, then we know we have at least one more record
to show, so we display the "Next" button which will display the next N records.
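
The paging logic described above can be exercised end to end with a small sketch. It is written in Python
against an in-memory SQLite database, both of which are stand-ins chosen so the example runs anywhere; the
article itself targets ADO and SQL Server. The function name fetch_page, the sample rows, and the use of
LIMIT in place of the MaxRecords property are all assumptions made for illustration.

```python
import sqlite3


def fetch_page(conn, where_sql, params, last_id=0, per_page=10):
    """Return (rows, total, has_next) for the page that follows last_id.

    where_sql must be trusted text containing only ? placeholders;
    the user's words travel in params.
    """
    rows = conn.execute(
        "SELECT ArticleURLID, URL, Title, Description FROM ArticleURL "
        f"WHERE {where_sql} AND ArticleURLID > ? "
        "ORDER BY ArticleURLID LIMIT ?",
        [*params, last_id, per_page],
    ).fetchall()
    # Total matches and the largest matching ID, as in the article.
    total, max_id = conn.execute(
        f"SELECT COUNT(*), MAX(ArticleURLID) FROM ArticleURL WHERE {where_sql}",
        params,
    ).fetchone()
    # Show "Next" only if the last row displayed is below the maximum ID.
    has_next = bool(rows) and rows[-1][0] < max_id
    return rows, total, has_next


# Illustrative data: seven articles, the odd-numbered ones mention ASP.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ArticleURL (ArticleURLID INTEGER PRIMARY KEY, "
    "URL VARCHAR(255), Title VARCHAR(100), Description VARCHAR(255))"
)
conn.executemany(
    "INSERT INTO ArticleURL (URL, Title, Description) VALUES (?, ?, ?)",
    [(f"/articles/{i}.asp", f"Article {i}", "ASP tips" if i % 2 else "Perl tips")
     for i in range(1, 8)],
)

rows, total, has_next = fetch_page(conn, "(Description LIKE ?)", ["%ASP%"], 0, 3)
print([r[0] for r in rows], total, has_next)  # [1, 3, 5] 4 True
```

Calling fetch_page again with last_id set to the last ID displayed (5 here) returns the remaining match
and reports that there is no further page.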

Areas for Improvement:
As I'm sure you have noticed, this search engine leaves a lot to be desired in functionality when we
compare it to standard internet search engines. For example, there is no Back button, only a Next button.
Also, you cannot do any complex boolean searches, such as: "Microsoft AND 'Active Server Pages'
AND NOT (VBScript OR JScript)". These can both be accomplished, though!

Personally, I have written a parser which accepted complex boolean searches similar to the one shown above
and transformed it into a SQL WHERE clause. To implement a Back feature, I would recommend a dynamic Array
(or stack). You would need to put this in a Session-level variable. Each time the user hits Next, you will
want to push the Request.QueryString("ID") onto the stack. When they hit the "Back" button you'll want to
pop the last ID off the stack and pass it as ID to Search.asp.
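
The stack idea for a Back feature can be sketched as follows. This is Python rather than VBScript, and the
class name PageHistory and its method names are hypothetical; in ASP the underlying list would live in a
Session-level variable as suggested above.

```python
class PageHistory:
    """Stack of the ID values that each viewed page was requested with."""

    def __init__(self):
        self._ids = []

    def next(self, current_page_id):
        """User clicked Next: remember the ID the current page was requested with."""
        self._ids.append(current_page_id)

    def back(self):
        """User clicked Back: return the ID to pass to Search.asp (0 if at the start)."""
        return self._ids.pop() if self._ids else 0


history = PageHistory()
history.next(0)        # leaving the first page (requested with ID=0)
history.next(10)       # leaving the second page (requested with ID=10)
print(history.back())  # 10
print(history.back())  # 0
```

Returning 0 when the stack is empty simply sends the user back to the first page of results.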

Conclusion:
In this article we've examined how to implement a simplistic dynamic search engine using Active Server
Pages and SQL Server. While the model implemented in this article is not exactly "feature-ful," it does
search, and presents the basic ideas behind a search engine. Without major modifications, this system
could be transformed into a very impressive, professional looking search engine.

Happy Programming!
Copyright © 2001-2008 Shenzhen Hiblue Software Team All rights reserved