Drori, Offer., "Identifying the Subject of Documents in Digital Libraries Automatically Using Frequently-Occurring Words - Study and Findings Technical Report No. 2002-40 of the Leibniz Center for Research in Computer Science, School of Computer Science a

Offer Drori, Technical Report No. 2002-40 of the Leibniz Center for Research in Computer Science, School of Compu 01.06.2002 18:42


Contemporary information databases contain millions of electronic documents, and this sheer volume makes efficient searching on the Internet difficult. Several studies have found that associating documents with a subject or a list of topics can make them easier to locate online [5] [6] [7]. Effective cataloging, however, is still performed manually and requires extensive resources; consequently, most information is currently not cataloged. This paper presents the findings of a study based on a software tool (TextAnalysis) that automatically identifies the subject of a document. We tested documents in two subject categories: geography and family studies. The present study follows an earlier one that examined the subject categories of industrial management and general management.





attachment drori062002b.pdf