International Journal of Conceptions on Computing and Information Technology
Vol.2, Issue 1, Jan’ 2014; ISSN: 2345 - 9808

Advanced Desktop Search

M Venkatesh, Paul Bharath Bhushan Petlu
Dept. of Information Technology, Sri Venkatesa Perumal College of Engineering & Technology, Puttur, Andhra Pradesh, India.
{venkatesh.mabbu, trishipaul}@gmail.com

Devaki Pendlimarri
Dept. of Computer Science and Engineering, Sri Venkatesa Perumal College of Engineering & Technology, Puttur, Andhra Pradesh, India.
[email protected]

Abstract- Desktop search is the name for the field of search tools which search the contents of a user's own computer files, rather than searching the Internet. These tools are designed to find information on the user's PC. One of the main advantages of desktop search programs is that search results arrive in a few seconds, compared with the regular Windows search companion. Desktop search is emerging as a concern for large firms for two main reasons: untapped productivity and security. Desktop search engines build and maintain an index database to achieve reasonable performance when searching several gigabytes of data. Our product is highly competitive with the existing products in the market. To our knowledge, it is the only desktop search program from India. It provides very fast and accurate results, and we have included a wide variety of index filters to find the file the user is looking for.

Keywords- Desktop-search, Performance, Comprehensive results, ease of use, a must use desktop utility

I. INTRODUCTION

One of the main advantages of desktop search programs is that search results arrive in a few seconds. The “Windows search companion” can be of some help, but it searches through Windows files and folders only, not e-mail or contact databases, and unless the Indexing Service is enabled (in Windows 2000 or XP), the Windows search tool is extremely slow. Windows Vista enables the indexing service by default.
Desktop search is emerging as a concern for large firms for two main reasons: untapped productivity and security. A commonly cited statistic states that 80% of a company's data is locked up inside unstructured data: the information stored on an end user's PC, the files and directories they have created on a network, documents stored in repositories such as corporate intranets, and a multitude of other locations. Moreover, many companies have structured or unstructured information stored in older file formats to which they do not have ready access.

Companies doing business in the United States are frequently required under regulatory mandates like Sarbanes-Oxley, HIPAA and FERPA to make sure that access to sensitive information is 100% controlled. This creates a challenge for IT organizations, which may not have a desktop search standard, or may lack strict central control over end users downloading tools from the Internet. Some consumer-oriented desktop search tools make it possible to generate indexes outside the corporate firewall and share those indexes with unauthorized users. In some cases, end users are able to index (but not preview) items they should not even know exist.

Historically, full desktop search came from the work of Apple Computer's Advanced Technology Group, which resulted in the underlying AppleSearch technology in the early 1990s. It was used to build the Sherlock search engine and was then developed into Spotlight, which brought automated, non-timer-based full indexing into the operating system.

II. TECHNOLOGIES

A desktop search engine builds and maintains an index database to achieve reasonable performance when searching several gigabytes of data. Indexing usually takes place when the computer is idle. When indexing files, desktop search tools collect three types of information about them: file and directory names; metadata, such as titles, authors and comments in file types such as MP3, PDF and JPEG; and the content of supported documents.

Fig. 1: Traditional windows search

Besides programs that use indexing, there are many programs that open and search files instantly. Their disadvantage is that they can search only a certain directory, not the entire computer, but their great advantage is that they do not load the computer's resources with indexing. Furthermore, they always use the current status of the documents.

To search within documents, the tools need to be able to parse many different types of documents. This is achieved by using filters that interpret selected file formats. For example, a Microsoft Office filter might be used to search inside Microsoft Office documents. Long-term goals for desktop search include the ability to search the contents of image files, sound files and video by context.

The sector attracted considerable attention from the struggle between Microsoft and Google. According to market analysts, both companies are attempting to leverage their monopolies (of web browsers and search engines, respectively) to strengthen their dominance.

As for the technologies used in our project, we built the entire project in Java (jdk1.5), with a small amount of HTML used to write the results to a web page. We chose Java because it is open source and platform independent.

III. FEATURES

A. Include all File types -

Most of the desktop search programs available in the market support only a limited number of file types, mostly the popular ones such as doc, mp3 and jpg. Our product supports all file types. Most importantly, it also indexes folder names: it treats a folder as one more kind of file, so we may search with folder names as keywords, which most programs do not support.

B. Search Filters –

The most important feature of this program is that it provides a wide variety of search filters. We may search on the basis of file type, file size, locations to include, locations to exclude, and so on.

C. High speed & User friendly –

As discussed earlier, the program returns results in less than a second and displays them in the web browser in a user-friendly manner, showing the file size and the complete path, with a provision to open the folder containing the file.

IV. HOW IT WORKS

When we open the program, it first displays a message about the status of file indexing in the main window and asks us to index the hard disk before performing any search. The screenshot in Fig. 2 shows the first look of the program, with the message displayed in red asking the user to index before performing any search operation.

Fig. 2: Initial Screen

4.1. Indexing

On clicking the index button on the main window, you are directed to the File index options window. The window first shows all the possible locations, including external hard disks and USB memory devices (a feature not supported in many other programs).

Fig. 3: Index options

We may alter the locations to be indexed, that is, add or remove locations. On submit, the program starts indexing the given locations and opens a window showing the progress of indexing, as in Fig. 4.

Fig. 4: Index status

The screenshot in Fig. 4 shows the index status: 169782 files have been indexed so far. The locations C:\ and D:\ are completely indexed, and the location E:\ is currently being indexed.
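The indexing step of Section 4.1 can be illustrated with a minimal sketch. It is not the authors' implementation: the class and method names (NameIndexSketch, indexLocation, search) are hypothetical, the index is a plain in-memory list rather than an index database, and it uses the java.nio.file API and lambdas, which are newer than the jdk1.5 the paper mentions. It only shows the idea of recording every file and folder under each chosen location, so that folders are searchable by name just like files.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.stream.Stream;

/** Hypothetical sketch: an in-memory index of file and folder names. */
class NameIndexSketch {
    // Each entry is the full path of one indexed file or folder.
    private final List<Path> entries = new ArrayList<>();

    /** Walks one location (for example a drive root) and records every file and folder below it. */
    void indexLocation(Path root) {
        try (Stream<Path> walk = Files.walk(root)) {
            // The root itself is the location, not a result, so it is skipped.
            walk.filter(p -> !p.equals(root)).forEach(entries::add);
        } catch (IOException | UncheckedIOException e) {
            // An unreadable directory ends the walk of this location;
            // entries collected before the failure are kept.
            System.err.println("Stopped indexing " + root + ": " + e.getMessage());
        }
    }

    /** Number of files and folders indexed so far (the figure shown in the status window). */
    int size() {
        return entries.size();
    }

    /** Returns entries whose own name (last path element) contains the keyword, ignoring case. */
    List<Path> search(String keyword) {
        String needle = keyword.toLowerCase(Locale.ROOT);
        List<Path> hits = new ArrayList<>();
        for (Path p : entries) {
            Path name = p.getFileName();
            if (name != null && name.toString().toLowerCase(Locale.ROOT).contains(needle)) {
                hits.add(p);
            }
        }
        return hits;
    }
}
```

Under these assumptions, calling indexLocation once per selected location and then reading size() corresponds to the running count in the index status window. A real tool would persist the index and update it incrementally instead of holding everything in memory.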
After the initial indexing is completed, the user can start searching for files. Search results are returned in an Internet browser, much like the results of Google Web searches. This desktop search program can index several different types of files, and it allows the user to control which types of data are indexed. Google Desktop indexes only 100,000 files per drive during the initial indexing period. If users have more than 100,000 files on a particular drive, Google Desktop will not index all of them during this initial period; it adds further files to the index during real-time indexing, when users move or open them. Our program has no such restriction.

4.2. Searching

We now look at how a search is performed in this program and how to provide the keyword for the file we are looking for. The screenshot in Fig. 5 shows the regular search screen, with the keyword “java” entered in the text field of the search box.

Fig. 5: Regular search window

The regular search window takes a keyword related to the name of the file or folder we are searching for. On clicking the “Search” button, the search engine looks for all the files and folders that contain the keyword “java”, and the results are displayed in a web browser in the format of Fig. 6.

Fig. 6: Results screen

The results page in Fig. 6 shows the sample output for the keyword “java”. It reports that there are “713 results found for the search keyword(s) java”. This is the regular search. The program also has a feature called “Advanced Search”, in which we may experience the real power of this program.

4.2.1. Advanced Search –

We now look at how to use the advanced search features of this program. On clicking the “Advanced Search” button on the main window, we are directed to the Advanced Search screen of Fig. 7.

Fig. 7: Advanced Search

This screen offers every search filter that a normal user may expect. We discuss these filters in detail below.

4.2.1.1. Contains/Doesn’t contain –

The “Contains” filter takes the keyword(s) that the user expects to appear in the name of the file he is looking for. The “Doesn’t contain” filter lets the user exclude keyword(s) that he does not want to appear in the name of that file.

4.2.1.2. Location(s) include/exclude –

This filter lets the user enter the location(s) where the search is to be performed. We may enter folder or drive names, and the most interesting part is that we may include multiple locations at once by separating them with ‘,’. For instance, we may enter “D:\java\,C:\program files\java\”. The search will then look only into the paths we specified. The equivalent “exclude location(s)” option specifies the locations whose results are to be omitted.

4.2.1.3. File type(s) –

In this text field we may specify the type of the file we are looking for, such as mp3, jpg or doc. Similarly, there is an option to specify which file type(s) are to be omitted.

4.2.1.4. File size –

In this text field we may specify the size (in KB) of the file we are looking for, using the <, > and = operators. For example, we may simply enter “>5000”, which means that we are looking for a file larger than 5000 KB. There is also a provision to enter a file size that is to be excluded from the results.

A sample output for a search with the keyword “java” and the search filters applied is shown in Fig. 8.
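The filters of Section 4.2.1 can be illustrated with a short sketch. It is an assumption-laden illustration, not the program's actual code: the class AdvancedFilterSketch, its Entry type and the methods parseList, sizeMatches and matches are hypothetical names. It also assumes that several keywords are separated by commas, which the paper states only for locations, and it omits the exclude-location, exclude-type and exclude-size fields for brevity. Keywords are matched against the file name only, never against the rest of the path.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical sketch of the Advanced Search filters applied to indexed entries. */
class AdvancedFilterSketch {

    /** Minimal stand-in for an index entry: full path plus size in KB. */
    static final class Entry {
        final String path;
        final long sizeKb;

        Entry(String path, long sizeKb) {
            this.path = path;
            this.sizeKb = sizeKb;
        }

        /** The file name is whatever follows the last path separator. */
        String name() {
            int cut = Math.max(path.lastIndexOf('\\'), path.lastIndexOf('/'));
            return path.substring(cut + 1);
        }
    }

    /** Splits a comma-separated field into trimmed, lower-case items; blanks are dropped. */
    static List<String> parseList(String field) {
        List<String> items = new ArrayList<>();
        for (String part : field.split(",")) {
            String item = part.trim().toLowerCase(Locale.ROOT);
            if (!item.isEmpty()) items.add(item);
        }
        return items;
    }

    /** Evaluates a size expression such as ">5000", "<100" or "=42" (sizes in KB). */
    static boolean sizeMatches(String expr, long sizeKb) {
        String e = expr.trim();
        if (e.isEmpty()) return true; // no size filter given
        long bound = Long.parseLong(e.substring(1).trim());
        switch (e.charAt(0)) {
            case '>': return sizeKb > bound;
            case '<': return sizeKb < bound;
            case '=': return sizeKb == bound;
            default: throw new IllegalArgumentException("Expected <, > or = in: " + expr);
        }
    }

    /** Returns true when the entry passes every filter that was filled in. */
    static boolean matches(Entry e, String contains, String notContains,
                           String includeLocations, String fileTypes, String sizeExpr) {
        String name = e.name().toLowerCase(Locale.ROOT);
        String path = e.path.toLowerCase(Locale.ROOT);

        // Keywords are checked against the file name only, not the folders above it.
        for (String kw : parseList(contains)) if (!name.contains(kw)) return false;
        for (String kw : parseList(notContains)) if (name.contains(kw)) return false;

        // An empty location field means "search everywhere".
        List<String> locations = parseList(includeLocations);
        if (!locations.isEmpty()) {
            boolean inside = false;
            for (String loc : locations) inside |= path.startsWith(loc);
            if (!inside) return false;
        }

        // An empty type field means "any file type".
        List<String> types = parseList(fileTypes);
        if (!types.isEmpty()) {
            boolean typeOk = false;
            for (String t : types) typeOk |= name.endsWith("." + t);
            if (!typeOk) return false;
        }
        return sizeMatches(sizeExpr, e.sizeKb);
    }
}
```

In this sketch, a file stored under a folder named “java” is rejected by the “Contains” filter unless its own name contains the keyword, which is the kind of narrowing the filters are meant to provide. Each filled-in field can only remove entries, so combining fields shrinks the result list.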
Fig. 8: Results page after using filters

Here the number of results drops from “713” in the regular search to just “10” after applying the filters. With this many filters we can find exactly the file we require.

V. BENEFITS OVER EXISTING PRODUCTS

5.1. Include all file types

As discussed earlier, this program indexes all the files present on the indexed drives, irrespective of their type. It also indexes folder names, which many desktop search programs do not support.

5.2. Search Filters –

Our program supports a wide variety of search filters, with which a user can locate a file faster and more easily.

5.3. Search performed with part of file name

This is a very important and interesting feature. Our program allows the user to enter only part of a file name, which is especially useful when the user does not know the exact pattern of the name. Many desktop search programs do not offer this flexibility.

5.4. No unwanted results –

Many desktop search programs available in the market return hundreds or thousands of results, most of which are irrelevant. The user then has difficulty finding the required file on the results page, and the search becomes painstaking work.

5.4.1. The Reason for unwanted results –

We now look at why some desktop search programs return so many irrelevant and unwanted results. Suppose we search with the keyword “java”, which we expect to be present in a file name on the hard disk. Unfortunately, many desktop search programs return every file inside any folder whose name contains “java”, including the files in all of its subfolders, because the program found the keyword in the path rather than in the file name. The results page is thereby filled with hundreds or thousands of irrelevant results. Many desktop search programs treat the number of results returned as the measure of performance, rather than the number of relevant results.

VI. CURRENT LIMITS

The project we have demonstrated has some limitations at this stage. It does not yet index metadata or look inside the contents of files (e.g. html, doc, rtf). Apart from that, we believe the remaining features compare favourably with any desktop search program available in the market.

VII. CONCLUSION

The project we have made is an innovative one that, in our view, is a step ahead of the existing technologies in the market. It offers many features when compared with products from even the giant corporations that currently rule the market. In the near future we will extend the capabilities of our product by overcoming the existing limitations, so that it becomes the first choice for computer users who are looking for the best desktop search program.