A Technique for Extracting Sub-Source Similarities from Information Sources Having Different Formats
Academic Article
Publication Date:
2003
Short description:
A Technique for Extracting Sub-Source Similarities from Information Sources Having Different Formats / Rosaci, D., Terracina, G., Ursino, D.. - In: WORLD WIDE WEB. - ISSN 1386-145X. - 6:4(2003), pp. 375-399. [10.1023/A:1025614005307]
Abstract:
In this paper we propose a semi-automatic technique for deriving the similarity degree between two portions of heterogeneous information sources (hereafter, sub-sources). The proposed technique consists of two phases: the first selects the most promising pairs of sub-sources, whereas the second computes the similarity degree for each promising pair. We show that the detection of sub-source similarities is a special case (and a very interesting one for semi-structured information sources) of the more general problem of Scheme Match. In addition, we present a real example case to clarify the proposed technique, a set of experiments we conducted to verify the quality of its results, a discussion of its computational complexity, and its classification in the context of related literature. Finally, we discuss some possible applications that can benefit from the derived similarities.
Iris type:
1.1 Journal article
Keywords:
Extraction of inter-source properties, Scheme Match, Semi-structured information sources, Sub-source similarities
List of contributors:
Rosaci, Domenico; Terracina, G.; Ursino, D.
Published in: