Computers Press Release Information technologies


by ALENA SIAMESHKA - Date: 2007-03-12

Information technologies have long since settled in the corporate sector. It is rare for a company not to have a well-organized local network and specialized software that controls the flow of information, stores documents and structures information, with convenient reports on the work process.

Information Diversity

Any company's information can be roughly divided into three types, depending on its "virtual" or physical location and its use in the work process. The first type is files on the user's disk, plus electronic mail and the logs of instant messaging programs such as ICQ or MSN Messenger. The second is corporate information: documents of different file types and electronic mail (MS Exchange, for instance), or a file archive on the company server. The third is the data in various information systems: DMS, PDM, CRM, etc. This may include everything from system objects stored in a file archive or in a database such as MS SQL to "external" electronic messages and documents used in the work of the system.

Search?

Given such a variety of information, it follows that information search has lately become a high-priority problem. The common obstacles are the sheer physical volume of data, the lack of proper organization of that data, and the wide variety of file types that contain the needed information. As a result, the demand for high-quality search and information processing tools keeps growing. However, besides managing search itself (whether in a file archive, corporate electronic mail or a document management system), there are quite a number of other requirements that corporate software has to comply with.
This obviously includes working with local networks, which implies a client-server software architecture; compliance with information security policies and user access management; and working not instead of an already installed system but alongside it, without disrupting established business processes. Let us look more carefully at these requirements.

Critical requirements to the corporate software

The ability to work with the local network implies a client-server software architecture, flexible network policy settings, support for different operating systems, etc. One of the latest trends is a web interface for the client part of corporate software: it removes the problems of adding workstations when the information structure is extended. This version may be more expensive, because with a web interface the number of workstations is unlimited. Yet the choice between a web interface and a standalone client program depends solely on the needs and problems that the purchased software is meant to solve.

The next critical factor for search software within a company is compliance with information security policies and access management. Any information system should be a structure with clearly defined channels of information exchange, both between users and with the "outside world." Thus any corporate software must meet strict information security requirements: differentiated user access, multi-level access to different sorts of information, an authorization system, and security policies that can be flexibly adjusted to the client's requests.

Another factor is the ability of corporate software to work with the various software products the company has already installed. As already mentioned, the information in any organization can be stored in files on disk, in a DBMS, and in various information systems (whether PDM, CRM or an accounting program).
That is exactly why the third requirement for any information system is the ability to function not instead of the software already existing in the company, but side by side with it. This is even more crucial for a corporate search engine, because searching across all of the company's information resources is the main purpose of search software.

Search Functions

Besides the requirements listed above, which put search systems on the same level as other corporate software, there are also requirements for the functional capacities of this software, that is, for the main functions of the program responsible for the fast and efficient search for which demand keeps growing.

Firstly, the old generation of "straight search" (a simple blind scan) and search strictly by document attributes is being replaced by full-text search with prior indexing. This is far more convenient, because it is faster even when the search is a dozen times more complicated.

Secondly, there is support for different file formats (both widely used and specialized ones), as well as reliable work with various types of DBMS, information systems, etc. This list should not leave out such indispensable tools as electronic mail (The Bat! or MS Exchange, for instance) and instant messaging programs like ICQ or MSN Messenger.

Another must-have attribute of a quality program is a set of search features: various types of search (by phrase or by separate words), search that takes stemming and synonyms into account, and so on.

And, of course, for the corporate sector in particular, with its gigantic volumes of information, high speed (both in data indexing and in the search itself) is not just a want but a need.

Progress looking for compromise

Now that we are clear on the requirements imposed on corporate search software, the only thing left is to actually find a program or system that meets them.
Obviously, it is impossible to satisfy every need without exception: there will always be black holes, missing functions and bugs, which will either have to be tolerated or covered with add-on programs. So we can forget about the ideal. Nothing stands still, and what seemed perfect yesterday may well be discarded before tonight is over.

In general, development in the field of full-text search is in full bloom these days, with the Internet leading (Google being the evidence) and the corporate sector catching up. This development is mainly carried out either by companies that have recently grown into popular online search engines or by search-focused companies that started working on this technology 15-20 years ago. Verity, ISYS and dtSearch, companies that develop corporate search systems, are good examples.

Solutions

Modern search technologies are based on two root processes: indexing the available information, and processing queries and displaying the results. As for the former, every program creates its own search area: it processes documents and builds an index of them (an organized structure that contains information about the processed data). The program later uses this index to quickly obtain the list of documents relevant to a query.

Recent tests of software from dtSearch, ISYS, Verity, SearchInform and others have shown impressive capacities. Indexing speed was quite high (in some search tools it reaches 30 gigabytes an hour), while the index remained small enough not to take up all of your drive space (SearchInform's index, for instance, amounts to 15-30% of the volume of the "clean" text).

Yet the requirements don't stop there. As we have already established, one of the critical requirements is precise and smooth work with the local network.
In this case the corporate versions of such tools as SearchInform, dtSearch, ISYS and Google can offer a client-server architecture; indexing of files from all accessible folders (or, with the administrator's permission, from all folders) on all the computers in the local network; indexing and subsequent search on all connected network disks; and a user access management system based on NTFS authentication. That way users can search only the network resources they have permission to access. Of course, certain lapses may occur from time to time when such a tool runs in a big enterprise, but from the technical point of view there are no complaints.

The third main requirement concerns working not only with information on disks, but also with other data sources. The standard packages of SearchInform and Verity, for instance, include the ability to index and search MS Access databases. Connecting such a data source is just as simple as, say, working with e-mail messages or mail clients: you only need to select the data source (in this case an MS Access database) and either show the program which fields should be indexed or leave that to the program, which will then index all the fields automatically. Access is only one example; such capabilities are more than enough for any enterprise to organize search across all its information under the management of one program.

Now let's look at the search capacities and functions. First of all, there is the number of supported file formats: most search engines index standard formats like txt, doc, rtf, html, CHM, OpenOffice, etc.; a few also support multimedia files (audio and video), various specialized "programmers'" formats, a dozen archive types, and the logs of instant messaging programs (MSN Messenger, ICQ, Trillian).
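
To make the two root processes (indexing, then query processing) and the access restriction a little more concrete, here is a minimal Python sketch. It is only an illustration under simplifying assumptions, not how any of the products named here actually work: the document names, the `permissions` table and the function names are all hypothetical, and a real tool would read files from disk and take access rights from the file system rather than from a dictionary.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Indexing step: map each word to the set of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def search(index, query, allowed_docs):
    """Query step: return permitted documents that contain every query word."""
    words = tokenize(query)
    if not words:
        return set()
    hits = set(allowed_docs)  # start from what this user may see at all
    for word in words:
        hits &= index.get(word, set())
    return hits


# Hypothetical sample data standing in for files on network disks.
documents = {
    "shared/report.txt": "Quarterly sales report for the corporate sector",
    "hr/salaries.txt": "Confidential salaries report",
    "shared/notes.txt": "Meeting notes about search software",
}
index = build_index(documents)

# Hypothetical permission table (an assumption made for this sketch).
permissions = {
    "alice": {"shared/report.txt", "shared/notes.txt"},
    "bob": set(documents),
}

print(sorted(search(index, "report", permissions["alice"])))
print(sorted(search(index, "report", permissions["bob"])))
```

The index is built once and then reused for every query, which is why full-text search with prior indexing can stay fast as the archive grows; the permission check simply narrows the candidate set before the query words are applied.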
Standard phrase search usually includes search that takes stemming and synonyms into account, fuzzy search (tolerant of mistakes), phrase search or search by the separate words of the phrase, search by attributes, etc. In practice, the main features to use are, of course, stemming search and search within results. However, each search tool has some peculiarities that deserve attention.

Copernic, for instance, offers an interesting search system in which the user selects the type of file (graphics, audio, video, etc.), enters a search query and picks the features specific to that file type. For audio files these might be mp3 tag fields (singer, album, date, etc.); for graphics you can choose their size (by extension). An extensive list of information then appears in the result window, and if files of types other than the one you specified also match the query, you can open them as well by clicking on a link.

ISYS Desktop offers templates for creating an index by folder: "My Documents", "Mail", "Specific Folder", "Folder with the choice of file types", etc. If you check "Folder with the choice of file types" when creating your index, you can choose manually which file types to index (by extension). The program also lets you sort documents by certain criteria (by default they are sorted by relevance) and look through the files already found by selecting separate folders (especially convenient when the results contain a large number of documents).

A unique feature of dtSearch is sound search, which is quite untypical even for professional search engines. The idea is that the program looks for words that sound similar to the query, which is exceptionally useful when searching a database of recorded calls.

SearchInform is known for its search for documents whose content is similar to the query text, a so-called "similar search".
This type of search is a lot more "intellectual" than simple phrase search. In practice it helps solve quite a few problems, for example those related to the length of the search session (continuously having to pick new keywords, or looking through and comparing all the documents in the company's database again and again to see whether there are duplicates). Practice shows that combining simple phrase search with "similar document" search lets you apply full-text search software more successfully, and with greater benefit, in information systems from DMS to ERP and PDM.

All in all, tools like dtSearch and ISYS mostly target mid-sized businesses, while SearchInform and Verity find their market in the corporate sector. Copernic does not quite suit the corporate sector and is best put to use on a home PC; its telling name, Desktop Search, places it in the field of desktop search engines. Google is also a player in this market, although its developers do not prioritize the corporate sector, and their key area of development is still Internet search.

Thus there are quite a few solutions to choose from when tackling today's pressing problem of corporate search. Most of the tools mentioned can satisfy at least the nominal demands of the corporate user; the choice depends on what exactly you are looking for.
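
For readers who want a feel for what "similar document" search involves, here is a rough Python sketch. It uses a common textbook technique, cosine similarity over word counts, which is an assumption chosen for illustration and is not claimed to be the method used by SearchInform or any other product mentioned above. The file names and sample texts are hypothetical.

```python
import math
import re
from collections import Counter


def term_counts(text):
    """Count how often each lowercase word occurs in the text."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two word-count vectors (0.0 to 1.0)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query_text, documents, top_n=3):
    """Rank documents by similarity of their content to the query text."""
    query = term_counts(query_text)
    scored = [
        (cosine_similarity(query, term_counts(text)), name)
        for name, text in documents.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [(name, round(score, 3)) for score, name in scored[:top_n]]


# Hypothetical sample documents.
documents = {
    "contract_v1.txt": "supply contract for office equipment and delivery terms",
    "contract_v2.txt": "supply contract for office equipment with revised delivery terms",
    "memo.txt": "holiday schedule for the support team",
}

for name, score in most_similar("contract for delivery of office equipment", documents):
    print(name, score)
```

Instead of requiring the user to guess the right keywords, the whole query text is compared with each document, so near-duplicates such as two versions of the same contract rise to the top together.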




© The article above is copyrighted by its author. You're allowed to distribute this work according to the Creative Commons Attribution-NoDerivs license.
 
