26: Advanced Desktop Search

International Journal of Conceptions on Computing and Information Technology
Vol.2, Issue 1, Jan’ 2014; ISSN: 2345 - 9808
Advanced Desktop Search
M Venkatesh, Paul Bharath Bhushan Petlu
Devaki Pendlimarri
Dept. of Information Technology,
Sri Venkatesa Perumal College of Engineering &
Technology,
Puttur, Andhra Pradesh, India.
{venkatesh.mabbu, trishipaul}@gmail.com
Dept. of Computer Science and Engineering,
Sri Venkatesa Perumal College of Engineering &
Technology,
Puttur, Andhra Pradesh, India.
[email protected]
Abstract— Desktop search is the name for the field of search tools
which search the contents of a user's own computer files, rather
than searching the Internet. These tools are designed to find
information on the user's PC. One of the main advantages of
desktop search programs is that search results arrive within a
few seconds, considerably faster than the regular Windows Search
Companion. Desktop search is emerging as a concern for large
firms for two main reasons: untapped productivity and security.
Desktop search engines build and maintain an index database to
achieve reasonable performance when searching several gigabytes
of data. Our product is highly competitive with the existing
products in the market. To our knowledge, it is the only desktop
search program developed in India. It provides fast and accurate
results, and it includes a wide variety of filters to help the
user find the file he or she is looking for.
Keywords- Desktop-search, Performance, Comprehensive results,
ease of use, a must use desktop utility
I. INTRODUCTION
One of the main advantages of desktop search programs is
that search results arrive in a few seconds. The Windows Search
Companion can be of some help, but it searches only through
Windows files and folders, not e-mail or contact databases,
and unless the Indexing Service is enabled (in Windows 2000 or
XP), it is extremely slow. Windows Vista enables the indexing
service by default.
Desktop search is emerging as a concern for large firms for
two main reasons: untapped productivity and security. A
commonly cited statistic states that 80% of a company's data is
locked up inside unstructured data — the information stored on
an end user's PC, the files and directories they've created on
a network, documents stored in repositories such as
corporate intranets and a multitude of other locations.
Moreover, many companies have structured or unstructured
information stored in older file formats to which they don't
have ready access.
Companies doing business in the United States are
frequently required under regulatory mandates such as
Sarbanes-Oxley, HIPAA and FERPA to make sure that access to
sensitive information is 100% controlled. This creates a
challenge for IT organizations, which may not have a desktop
search standard, or may lack strict central control over end
users downloading tools from the Internet. Some
consumer-oriented desktop search tools make it possible to
generate indexes outside the corporate firewall and share those
indexes with unauthorized users. In some cases, end users are
able to index (but not preview) items they should not even know
exist.
Historically, full desktop search comes from the work of Apple
Computer's Advanced Technology Group, which resulted in the
underlying AppleSearch technology in the early 1990s. It was
used to build the Sherlock search engine and was later developed
into Spotlight, which brought automated, non-timer-based full
indexing into the operating system.
II. TECHNOLOGIES
A desktop search engine builds and maintains an index
database to achieve reasonable performance when searching
several gigabytes of data. Indexing usually takes place when
the computer is idle. When indexing the files, desktop search
tools collect three types of information about files:
• File and directory names
• Metadata, such as titles, authors and comments in file types
such as MP3, PDF and JPEG
• Content of supported documents
Fig. 1: Traditional Windows search
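The first kind of information above (file and directory names) can be illustrated with a minimal Java sketch. The class `NameIndex` and its methods are hypothetical names chosen for illustration and are not taken from our implementation; the sketch keeps the index in memory, assumes case-insensitive matching, and leaves out metadata and content extraction.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical in-memory index of file and directory names (illustration only). */
final class NameIndex {
    /** One indexed entry: the lower-cased name plus the full path. */
    static final class Entry {
        final String lowerName;
        final Path path;

        Entry(String lowerName, Path path) {
            this.lowerName = lowerName;
            this.path = path;
        }
    }

    private final List<Entry> entries = new ArrayList<Entry>();

    /** Walks the tree under root and records every file and directory name. */
    void indexLocation(Path root) throws IOException {
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs) {
                add(dir); // folders are indexed like files
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                add(file);
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFileFailed(Path file, IOException e) {
                return FileVisitResult.CONTINUE; // skip entries that cannot be read
            }
        });
    }

    private void add(Path p) {
        Path name = p.getFileName(); // null for a filesystem root
        if (name != null) {
            entries.add(new Entry(name.toString().toLowerCase(Locale.ROOT), p));
        }
    }

    /** Returns the paths whose file or directory name contains the keyword. */
    List<Path> search(String keyword) {
        String k = keyword.toLowerCase(Locale.ROOT);
        List<Path> hits = new ArrayList<Path>();
        for (Entry e : entries) {
            if (e.lowerName.contains(k)) {
                hits.add(e.path);
            }
        }
        return hits;
    }

    /** Number of indexed entries. */
    int size() {
        return entries.size();
    }
}
```

A real index database would be persisted to disk and updated incrementally; the linear scan in `search` is only meant to show what is stored per entry.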
Besides programs that use indexing, there are many
programs that open and search files directly at search time.
Their disadvantage is that they can search only a given
directory, not the entire computer, but their great advantage is
that they do not burden the computer's resources with indexing.
Furthermore, they always reflect the current state of the
documents.
Our program follows the indexing approach: it asks the user to
index the hard disk before performing any search. Fig. 2 shows
the very first look of the program, with a message displayed in
red asking the user to index before performing any search
operation.
To search within documents, the tools need to be able to
parse many different types of documents. This is achieved by
using filters that interpret selected file formats. For example,
a Microsoft Office Filter might be used to search
inside Microsoft Office documents.
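The filter mechanism described in this paragraph can be sketched in Java as a registry that maps file extensions to format-specific text extractors. The names `FormatFilter`, `PlainTextFilter` and `FilterRegistry` are hypothetical and chosen only for illustration; only a plain-text filter is shown, since filters for formats such as Microsoft Office documents would need a dedicated parser.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical filter contract: turns one file format into searchable text. */
interface FormatFilter {
    String extractText(Path file) throws IOException;
}

/** Filter for plain-text files; assumes UTF-8 encoded content. */
final class PlainTextFilter implements FormatFilter {
    public String extractText(Path file) throws IOException {
        return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    }
}

/** Maps file extensions to the filter that can interpret them. */
final class FilterRegistry {
    private final Map<String, FormatFilter> byExtension = new HashMap<String, FormatFilter>();

    /** Registers a filter for an extension given without the dot, e.g. "txt". */
    void register(String extension, FormatFilter filter) {
        byExtension.put(extension.toLowerCase(Locale.ROOT), filter);
    }

    /** Returns the filter for the file's extension, or null if the format is unsupported. */
    FormatFilter filterFor(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return null; // no extension, so no filter can be selected
        }
        return byExtension.get(fileName.substring(dot + 1).toLowerCase(Locale.ROOT));
    }
}
```

With this design, supporting a new document type only requires registering another `FormatFilter`; files without a registered filter can still be found by name.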
Long-term goals for desktop search include the ability to
search the contents of image files, sound files and video by
context.
The sector attracted considerable attention from the
struggle between Microsoft and Google. According to market
analysts, both companies are attempting to leverage their
monopolies (of web browsers and search engines, respectively)
to strengthen their dominance.
As for the technologies used in our project, we built the
entire project in Java (JDK 1.5), with a small amount of HTML
used to write the results to a web page. We chose Java
because it is open source and platform independent.
III. FEATURES
A. Include all File types -
Most of the desktop search programs available in the market
support only a limited number of file types, mostly the popular
ones such as doc, mp3 and jpg. Our product supports all
file types. Most importantly, it also treats folders as one
kind of file type, so folder names can be used as search
keywords, which most programs do not support.
B. Search Filters –
The most important feature of this program is its wide variety
of search filters. A search can be narrowed by file type, file
size, locations to include, locations to exclude, and so on.
C. High speed & User friendly –
As discussed earlier, this program returns results in less
than a second. The results are displayed in the web browser in a
user-friendly manner, showing the file size and complete path,
with a provision to open the folder containing the file.
IV. HOW IT WORKS
When the program is opened, it first displays a message in the
main window about the status of file indexing and asks the user
to index the hard disk before performing any search.
Fig. 2: Initial Screen
4.1. Indexing
On clicking the index button on the main window, the user is
directed to the file index options window. The window first
shows all the possible locations, including external hard disks
and USB memory devices (a feature not supported in many other
programs).
Fig. 3: Index options
The locations to be indexed can be altered; that is, locations
can be added or removed. On submit, the program starts indexing
the given locations, with a window showing the progress of
indexing, as shown in Fig. 4.
Fig. 4: Index status
The screenshot above shows the index status: 169782 files have
been indexed so far. The locations C:\ and D:\ are completely
indexed, and the location E:\ is currently being indexed.
After the initial indexing is completed, the user can start
searching for files. After a search is performed, the results
can also be returned in an Internet browser, much like the
results of a Google Web search.
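Returning results in a browser amounts to writing them into an HTML page. The following Java sketch shows one way this could be done; the class `ResultPage` and its methods are hypothetical, the page layout is an assumption, and only the summary line mirrors the wording of our results screen. Paths are escaped so that characters such as `<` in a file name cannot break the page.

```java
import java.util.List;

/** Hypothetical renderer that turns a list of result paths into a small HTML page. */
final class ResultPage {
    /** Escapes the characters that have a special meaning in HTML. */
    static String escape(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    /** Builds the page: a summary line followed by one list item per result path. */
    static String render(String keyword, List<String> paths) {
        StringBuilder html = new StringBuilder();
        html.append("<html><body>\n");
        html.append("<p>").append(paths.size())
            .append(" results found for the search keyword(s) ")
            .append(escape(keyword)).append("</p>\n");
        html.append("<ul>\n");
        for (String p : paths) {
            html.append("<li>").append(escape(p)).append("</li>\n");
        }
        html.append("</ul>\n</body></html>\n");
        return html.toString();
    }
}
```

The returned string can be written to a temporary `.html` file and opened in the default browser; file size and an "open containing folder" link would be added per list item in the same way.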
This desktop search program can index several different
types of files, and it allows the user to control which types of
data are indexed.
Google Desktop indexes only 100,000 files per drive during
the initial indexing period. If users have more than 100,000
files on a particular drive, Google Desktop will not index all
of them during this initial period; it adds the remaining files
to the index during real-time indexing, when users move or
open them. Our program has no such restriction.
4.2. Searching
We now describe how a search is performed in this program and
how to provide the keyword for the file being searched for. The
screenshot below shows the regular search screen, with the
keyword “java” entered in the text field of the search box.
Fig. 5: Regular search window
The regular search window takes a keyword related to the name
of the file or folder being searched for. On clicking the
“Search” button, the search engine looks for all the files and
folders whose names contain the keyword “java”, and the results
are displayed in a web browser in the following format.
Fig. 6: Results screen
The results page above shows the sample output for the keyword
“java”; it reports “713 results found for the search keyword(s)
java”.
This is the regular search. The program also has a feature
called “Advanced Search”, which exposes the real power of this
program.
4.2.1. Advanced Search –
On clicking the “Advanced Search” button on the main window,
the user is directed to the Advanced Search screen, shown below.
Fig. 7: Advanced Search
This screen offers every search filter a normal user may expect.
We now discuss these filters in detail.
4.2.1.1. Contains/Doesn’t contain –
The “Contains” filter takes the keyword(s) that the user expects
to be present in the name of the file he or she is looking for.
The “Doesn’t contain” filter lets the user exclude keyword(s)
that should not be present in the name of the file he or she is
looking for.
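A minimal Java sketch of the “Contains” and “Doesn’t contain” filters is given below. The class `KeywordFilter` is a hypothetical name, and two details are assumptions rather than documented behaviour of our program: matching is case-insensitive, and when several “Contains” keywords are given, all of them must be present.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical filter on file names: required keywords and excluded keywords. */
final class KeywordFilter {
    private final List<String> required = new ArrayList<String>();
    private final List<String> excluded = new ArrayList<String>();

    KeywordFilter(List<String> contains, List<String> doesNotContain) {
        for (String k : contains) {
            required.add(k.toLowerCase(Locale.ROOT));
        }
        for (String k : doesNotContain) {
            excluded.add(k.toLowerCase(Locale.ROOT));
        }
    }

    /** True if the name holds every required keyword and none of the excluded ones. */
    boolean matches(String fileName) {
        String name = fileName.toLowerCase(Locale.ROOT);
        for (String k : required) {
            if (!name.contains(k)) {
                return false; // assumption: all "Contains" keywords must be present
            }
        }
        for (String k : excluded) {
            if (name.contains(k)) {
                return false;
            }
        }
        return true;
    }
}
```

For example, a filter built with “java” as the required keyword and “script” as the excluded keyword accepts `JavaTutorial.pdf` but rejects `javascript_notes.txt`.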
4.2.1.2. Location(s) include/exclude –
This filter lets the user enter the location(s) in which the
search is to be performed. Folder or drive names can be
given here, and multiple locations can be included at once by
separating them with ‘,’.
For instance, entering “D:\java\,C:\program files\java\” means
that the search will look only into the specified paths.
Similarly, the equivalent “exclude location(s)” option
specifies the locations whose results are to be omitted.
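The include/exclude behaviour can be sketched in Java as a prefix test on the full path. The class `LocationFilter` is hypothetical; the sketch assumes case-insensitive comparison (as is usual for Windows paths), treats an empty include list as "search everywhere", and lets an exclude entry override an include entry.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical filter that keeps only paths under the included locations. */
final class LocationFilter {
    private final List<String> include;
    private final List<String> exclude;

    /** Both arguments are comma-separated lists such as "D:\java\,C:\program files\java\". */
    LocationFilter(String includeList, String excludeList) {
        this.include = parse(includeList);
        this.exclude = parse(excludeList);
    }

    /** Splits on ',' and normalises each entry; empty entries are ignored. */
    static List<String> parse(String commaSeparated) {
        List<String> out = new ArrayList<String>();
        for (String part : commaSeparated.split(",")) {
            String location = part.trim().toLowerCase(Locale.ROOT);
            if (!location.isEmpty()) {
                out.add(location);
            }
        }
        return out;
    }

    /** True if the path lies under an included location and under no excluded one. */
    boolean matches(String fullPath) {
        String p = fullPath.toLowerCase(Locale.ROOT);
        boolean included = include.isEmpty(); // assumption: no include list means everywhere
        for (String location : include) {
            if (p.startsWith(location)) {
                included = true;
                break;
            }
        }
        if (!included) {
            return false;
        }
        for (String location : exclude) {
            if (p.startsWith(location)) {
                return false;
            }
        }
        return true;
    }
}
```

Because the test is a plain string prefix, locations should be entered with a trailing separator, as in the example above, so that `D:\java\` does not also match `D:\javadocs\`.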
4.2.1.3. File type(s) –
In this text field we may specify the type of the file we are
looking for, such as mp3, jpg or doc. Similarly, there is an
option to specify the file type(s) to be omitted.
4.2.1.4. File size –
In this text field we may specify the size (in KB) of the file
we are looking for, using the <, > and = operators. For example,
specifying “>5000” means that we are looking for a file
larger than 5000 KB. There is also a provision to specify a file
size to be excluded from the results.
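The size expression can be parsed as one operator character followed by a number of kilobytes. The Java sketch below illustrates this; the class `SizeFilter` is a hypothetical name, and the rounding rule (file sizes in bytes are converted to whole KB, rounded down, with 1 KB taken as 1024 bytes) is an assumption made for the example.

```java
/** Hypothetical parser and matcher for size expressions such as ">5000" (in KB). */
final class SizeFilter {
    private final char operator;
    private final long limitKb;

    private SizeFilter(char operator, long limitKb) {
        this.operator = operator;
        this.limitKb = limitKb;
    }

    /** Parses "<N", ">N" or "=N", where N is a size in KB. */
    static SizeFilter parse(String expression) {
        String e = expression.trim();
        if (e.length() < 2) {
            throw new IllegalArgumentException("Expected <, > or = followed by a size: " + expression);
        }
        char op = e.charAt(0);
        if (op != '<' && op != '>' && op != '=') {
            throw new IllegalArgumentException("Unknown operator in: " + expression);
        }
        // Long.parseLong throws NumberFormatException if the rest is not a number.
        long kb = Long.parseLong(e.substring(1).trim());
        return new SizeFilter(op, kb);
    }

    /** Compares the file size, converted to whole KB, against the limit. */
    boolean matches(long sizeInBytes) {
        long sizeKb = sizeInBytes / 1024; // assumption: 1 KB = 1024 bytes, rounded down
        switch (operator) {
            case '<': return sizeKb < limitKb;
            case '>': return sizeKb > limitKb;
            default:  return sizeKb == limitKb;
        }
    }
}
```

The exclusion field can reuse the same class: a file is dropped from the results when the exclusion filter's `matches` returns true.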
We now show a sample output for a search performed with the
keyword “java” together with the search filters.
Fig. 8: Results page after using filters
Here the number of results drops from “713” in the regular
search to just “10” after using the filters. By combining these
filters, the user can find exactly the file required.
V. BENEFITS OVER EXISTING PRODUCTS
5.1. Include all file types –
As discussed earlier, this program indexes all the files
present on the disk, irrespective of their type. It also
indexes folder names, which many desktop search programs do
not support.
5.2. Search Filters –
Our program supports a wide variety of search filters, with
which a user can locate a file faster and more easily.
5.3. Search performed with part of file name –
This is a very important feature. Our program allows the user to
enter only part of a file name, which helps especially when the
user does not know the exact file name. Many desktop search
programs do not offer this flexibility.
5.4. No unwanted results –
Many desktop search programs available in the market return
hundreds or thousands of results, most of which are irrelevant,
and the user then has to search through the results page for the
required file, which is painstaking work.
5.4.1. The Reason for unwanted results –
We now explain why some desktop search programs return so many
irrelevant results.
Suppose we search with the keyword “java”, which we expect to be
present in a file name on the hard disk. Unfortunately, many
desktop search programs return all the files inside any folder
whose name contains “java”, including the files in all of its
subfolders, because the program finds the keyword “java” in the
path rather than in the file name. The results page thereby
fills up with hundreds or thousands of irrelevant results.
Many desktop search programs treat the number of results
returned as the measure of performance, rather than the number
of relevant results.
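The cause of unwanted results discussed in Section 5.4.1 (matching the keyword anywhere in the path instead of in the file name only) can be made concrete with a small Java sketch. The class `NameVersusPath` and its methods are hypothetical and written only for this comparison; they operate on path strings and assume case-insensitive matching.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical comparison of path-based and name-based keyword matching. */
final class NameVersusPath {
    /** Returns the last component of a path written with '\' or '/' separators. */
    static String fileNameOf(String fullPath) {
        int cut = Math.max(fullPath.lastIndexOf('\\'), fullPath.lastIndexOf('/'));
        return fullPath.substring(cut + 1);
    }

    /** Matches the keyword anywhere in the path, so every file below a matching folder is a hit. */
    static List<String> matchAnywhereInPath(List<String> paths, String keyword) {
        String k = keyword.toLowerCase(Locale.ROOT);
        List<String> hits = new ArrayList<String>();
        for (String p : paths) {
            if (p.toLowerCase(Locale.ROOT).contains(k)) {
                hits.add(p);
            }
        }
        return hits;
    }

    /** Matches the keyword only against the file or folder name itself. */
    static List<String> matchFileNameOnly(List<String> paths, String keyword) {
        String k = keyword.toLowerCase(Locale.ROOT);
        List<String> hits = new ArrayList<String>();
        for (String p : paths) {
            if (fileNameOf(p).toLowerCase(Locale.ROOT).contains(k)) {
                hits.add(p);
            }
        }
        return hits;
    }
}
```

For an index holding the folder `D:\java`, the files `D:\java\readme.txt` and `D:\java\lib\tools.jar`, and the file `D:\docs\java_notes.doc`, path matching on “java” returns all four entries, while name matching returns only the folder `D:\java` and `java_notes.doc`.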
VI. CURRENT LIMITS
The project we have demonstrated has some limitations at this
stage. It does not yet include the feature of indexing
metadata, that is, of looking inside the contents of files
(e.g. html, doc, rtf).
Apart from that, its features compare favourably with the
desktop search programs available in the market.
VII. CONCLUSION
The project we have built is an innovative one that, in our
view, is a step ahead of the existing products in the market. It
offers many features compared even with the products of the
large corporations that currently lead the market. In the near
future we plan to extend the capabilities of our product by
overcoming the existing limitations, so that it becomes
the first choice for computer users looking for the best
desktop search program.
REFERENCES
[1] http://en.wikipedia.org/wiki/Desktop_search
[2] http://www.copernic.com/en/products/desktop-search/index.html
[3] http://en.wikipedia.org/wiki/Google_Desktop
[4] http://www.top-windows-tutorials.com/windows-7-tutorial15.html
[5] http://terrier.org/docs/v2.2.1/terrier_desktop.html