searchdb

Downloads: 385
User Rating: Good (3.7/5)
Rated by: 20 user(s)
Developer: webconcerns.co.uk
License / Price: Free Trial / Commercial License ($49.00)
Platforms: Windows
Databases: MSSQL / Access
Language: ASP.NET
Last Updated: July 3rd, 2007, 13:32 GMT
Category: Search Engines

searchdb description

 

searchdb is an ASP.NET search engine written in VB.NET. It incorporates a webcrawler, indexer and site search engine.

The program stores the crawled pages and extracted words in a database, and displays results in a way similar to popular internet search engines.

The program can index static text web pages as well as dynamic pages that are generated from a database and have URLs of the form 'default.asp?name=value'.

Formats such as Adobe PDF, Microsoft Word and Macromedia Flash are not supported.

The engine is intended for small to medium sized web sites. For example, one web site has about 50 searchable web pages, which produce an Access database file of about 600 kbytes that can be searched in less than 0.5 seconds.

Another web site has just over 1000 searchable web pages, which produce an Access database file of about 13 Mbytes that takes up to about 0.8 seconds to search.

Features
- Crawls and indexes static and dynamic web pages.
- Able to crawl multiple sites.
- Stores the crawled URLs and associated words in database tables.
- The word indexer extracts title, meta data, alt text and visible text from the web page.
- Common words are excluded by the word indexer and search engine.
- Search results are displayed in order of word hits in a way similar to popular internet engines. 
- Works with either Microsoft Access or SQL Server databases.  
- Set up via password protected management displays.

The Crawler

The webcrawler starts crawling from a given page, extracting a list of URL links. It then spiders each link, extracting further links. Eventually all pages for the domain are listed in the database.
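The crawl described above can be sketched as a breadth-first traversal. This is a minimal illustration in Python, not searchdb's own VB.NET code; the names `LinkExtractor`, `crawl` and the `fetch` callable are hypothetical, and the same-domain check is an assumption about how "all pages for the domain" is decided.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def crawl(start_url, fetch):
    """Breadth-first crawl restricted to the start page's domain.

    `fetch` is a hypothetical callable mapping a URL to its HTML text,
    or to None when the page cannot be retrieved.
    Returns the crawled URLs in the order they were visited.
    """
    domain = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([start_url])
    visited = []
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        visited.append(url)
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop any #fragment.
            link, _fragment = urldefrag(urljoin(url, href))
            if urlparse(link).netloc == domain and link not in seen:
                seen.add(link)
                queue.append(link)
    return visited
```

Passing `fetch` in as a parameter keeps the sketch independent of any particular HTTP library; a dictionary's `get` method is enough to try it out on a few in-memory pages.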

As each URL is found, the words on the page are extracted, including meta tag keywords, meta tag descriptions and image alt text. These are stored in the database together with the number of occurrences of each word.
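As a rough illustration of this extraction step, the Python sketch below gathers the title, meta keywords, meta description, image alt text and visible text of a page and counts each word. The class and function names are hypothetical, and skipping script and style content, lowercasing, and stripping punctuation before splitting are assumptions rather than documented searchdb behaviour.

```python
import re
from collections import Counter
from html.parser import HTMLParser


class WordExtractor(HTMLParser):
    """Gathers title, meta keywords/description, alt text and visible text."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = 0  # nesting depth inside <script>/<style>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("script", "style"):
            self._skip += 1
        elif tag == "meta":
            name = (attrs.get("name") or "").lower()
            if name in ("keywords", "description"):
                self.chunks.append(attrs.get("content") or "")
        elif tag == "img" and attrs.get("alt"):
            self.chunks.append(attrs["alt"])

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        # Title text and body text both arrive here.
        if not self._skip:
            self.chunks.append(data)


def count_words(html):
    """Return a Counter mapping each word to its number of occurrences."""
    parser = WordExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks).lower()
    text = re.sub(r"[^\w\s]", "", text)  # remove punctuation marks
    return Counter(text.split())
```

The resulting counts are what a crawler of this kind would write to its word table, one row per page and word.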

All words of more than one character are indexed, except those defined in the exclude word list. Punctuation marks are also removed, so a word such as asp.net is stored as aspnet in the database.

The same parsing is done on the search side as well as on the indexing side, so searching for asp.net will return the correct results.
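A minimal Python sketch of such a shared parsing step follows. The function name `normalise` and the sample exclude list are hypothetical, and lowercasing is an assumption; the point is that the indexer and the search box call the same function, so 'asp.net' maps to 'aspnet' on both sides.

```python
import string

# Hypothetical sample; the real exclude word list is configured in searchdb.
EXCLUDE_WORDS = {"in", "the", "them", "they", "and"}

# Translation table that deletes every ASCII punctuation character.
_STRIP_PUNCTUATION = str.maketrans("", "", string.punctuation)


def normalise(text, exclude=EXCLUDE_WORDS):
    """Lowercase, strip punctuation, drop one-character and excluded words."""
    stripped = text.lower().translate(_STRIP_PUNCTUATION)
    return [w for w in stripped.split() if len(w) > 1 and w not in exclude]
```

For example, `normalise("ASP.NET")` yields `["aspnet"]`, and `normalise("cycling in Scotland")` yields `["cycling", "scotland"]`.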

The current version does not obey the noindex and nofollow meta tag keywords which may appear in the head of a web page. If you wish to exclude certain areas of your site then you can do so by entering the directory names into the list of directories to be excluded.

All files within those directories and their subdirectories are then skipped by the indexer.
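The directory exclusion could be checked along these lines. This is a Python sketch under stated assumptions: `is_excluded` is a hypothetical name, directory names are taken relative to the site root, and the comparison is case-insensitive because the product targets Windows web servers.

```python
from urllib.parse import urlparse


def is_excluded(url, excluded_dirs):
    """True if the URL's path lies in an excluded directory or below it."""
    # A trailing slash makes "/admin" and "/admin/" compare the same way.
    path = urlparse(url).path.lower().rstrip("/") + "/"
    for directory in excluded_dirs:
        prefix = "/" + directory.strip("/").lower() + "/"
        if path.startswith(prefix):
            return True
    return False
```

Matching on a prefix that ends in a slash keeps a directory such as 'admin' from also excluding 'administrator'.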

As each page is indexed, its file size is stored in the database. This is so that you may re-index only those pages that have changed in file size rather than re-indexing the complete site.
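The size comparison can be sketched as follows in Python; `pages_to_reindex` and both dictionaries are hypothetical stand-ins for the stored database values and a fresh crawl.

```python
def pages_to_reindex(stored_sizes, current_sizes):
    """Return URLs that are new or whose size differs from the stored one.

    Both arguments map a URL to a file size in bytes.
    """
    return sorted(
        url
        for url, size in current_sizes.items()
        if stored_sizes.get(url) != size
    )
```

A size check is cheap, but an edit that leaves the file size unchanged would not be detected by it.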

The Search Engine

To search the site you enter one or more words into a text box. Any words of one character are ignored, as are common noise words such as 'them', 'they', etc.

The search system is based on the word count within the pages. For example, a search for 'cycling in Scotland' runs a SQL GROUP BY query on 'cycling' and 'Scotland', sorted by word count.

The word 'in' is dropped because it is in the exclude word list. A page that contains 'cycling' and 'Scotland' several times has a higher word count, hence higher relevance, and appears nearer the top of the search results.
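A query of this kind might look like the Python sketch below. The table `page_words` and its columns are hypothetical, since the real schema is not given here, and SQLite stands in for Access or SQL Server. The sketch returns pages that contain any of the search words; whether searchdb requires all of them is not stated.

```python
import sqlite3

# In-memory stand-in for the word index; schema and rows are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page_words (url TEXT, word TEXT, hits INTEGER)")
conn.executemany(
    "INSERT INTO page_words VALUES (?, ?, ?)",
    [
        ("/tours.html", "cycling", 5),
        ("/tours.html", "scotland", 4),
        ("/about.html", "scotland", 1),
        ("/shop.html", "cycling", 2),
        ("/shop.html", "bikes", 7),
    ],
)


def search(conn, words):
    """Rank pages by the summed hit count of the given search words."""
    if not words:
        return []
    placeholders = ", ".join("?" for _ in words)
    sql = (
        "SELECT url, SUM(hits) AS total_hits "
        "FROM page_words "
        f"WHERE word IN ({placeholders}) "
        "GROUP BY url "
        "ORDER BY total_hits DESC, url"
    )
    return conn.execute(sql, list(words)).fetchall()
```

With the sample rows, `search(conn, ["cycling", "scotland"])` ranks '/tours.html' first with 9 hits, followed by '/shop.html' with 2 and '/about.html' with 1.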

A search usually takes less than 0.5 seconds. As the number of web pages increases, the search time does increase, but not significantly, because all the processing has been done during indexing and the search method is based on an efficient SQL query.

Management Displays

To set up and configure the system, a set of password protected web pages is provided.

Requirements:

· web server with Microsoft .NET framework installed





TAGS:

search engine | crawler script | search engine script | search | engine | crawler


