Document (#28096)

Prime-Claverie, C.
Beigbeder, M.
Lafouge, T.
Transposition of the cocitation method with a view to classifying Web pages
Journal of the American Society for Information Science and Technology. 55(2004) no.14, pp.1282-1289
The Web is a huge source of information, and one of the main problems facing users is finding documents which correspond to their requirements. Apart from the problem of thematic relevance, the documents retrieved by search engines do not always meet the users' expectations. The document may be too general, or conversely too specialized, or of a different type from what the user is looking for, and so forth. We think that adding metadata to pages can considerably improve the process of searching for information on the Web. This article presents a possible typology for Web sites and pages, as well as a method for propagating metadata values, based on the study of the Web graph and more specifically the method of cocitation in this graph.
Contribution to a special issue on webometrics
Citation indexing
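
The abstract refers to the method of cocitation in the Web graph. As a rough illustration only, and not the authors' actual procedure, the sketch below assumes the standard bibliometric definition: two pages are cocited once for every third page that links to both of them. The `links` dictionary and the page names in it are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def cocitation_counts(links):
    """Count, for every pair of target pages, how many source pages link to both.

    `links` maps a source page to the list of pages it links to
    (a hypothetical adjacency-list view of a Web graph).
    """
    counts = defaultdict(int)
    for source, targets in links.items():
        # Deduplicate and sort so each pair is counted once per source page
        # and always appears in the same (a, b) order.
        for a, b in combinations(sorted(set(targets)), 2):
            counts[(a, b)] += 1
    return dict(counts)


# Hypothetical toy graph: three source pages and their outgoing links.
links = {
    "p1": ["a", "b"],
    "p2": ["a", "b", "c"],
    "p3": ["b", "c"],
}
counts = cocitation_counts(links)
```

Pairs with a high count are candidates for being treated as related; how such counts would be used to propagate metadata values is described in the article itself and is not reproduced here.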

Similar documents (content)

  1. Villela Dantas, J.R.; Muniz Farias, P.F.: Conceptual navigation in knowledge management environments using NavCon (2010) 0.09
    0.09400454 = sum of:
      0.09400454 = product of:
        0.4700227 = sum of:
          0.0466186 = weight(abstract_txt:meet in 4230) [ClassicSimilarity], result of:
            0.0466186 = score(doc=4230,freq=1.0), product of:
              0.12514399 = queryWeight, product of:
                5.9603148 = idf(docFreq=309, maxDocs=44218)
                0.020996206 = queryNorm
              0.37251967 = fieldWeight in 4230, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.9603148 = idf(docFreq=309, maxDocs=44218)
                0.0625 = fieldNorm(doc=4230)
          0.030823728 = weight(abstract_txt:documents in 4230) [ClassicSimilarity], result of:
            0.030823728 = score(doc=4230,freq=1.0), product of:
              0.119665965 = queryWeight, product of:
                1.3829144 = boost
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.020996206 = queryNorm
              0.2575814 = fieldWeight in 4230, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.0625 = fieldNorm(doc=4230)
          0.051212266 = weight(abstract_txt:metadata in 4230) [ClassicSimilarity], result of:
            0.051212266 = score(doc=4230,freq=1.0), product of:
              0.16786617 = queryWeight, product of:
                1.6379158 = boost
                4.881247 = idf(docFreq=911, maxDocs=44218)
                0.020996206 = queryNorm
              0.30507794 = fieldWeight in 4230, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.881247 = idf(docFreq=911, maxDocs=44218)
                0.0625 = fieldNorm(doc=4230)
          0.17683664 = weight(abstract_txt:graph in 4230) [ClassicSimilarity], result of:
            0.17683664 = score(doc=4230,freq=2.0), product of:
              0.30438182 = queryWeight, product of:
                2.2055624 = boost
                6.572923 = idf(docFreq=167, maxDocs=44218)
                0.020996206 = queryNorm
              0.5809698 = fieldWeight in 4230, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.572923 = idf(docFreq=167, maxDocs=44218)
                0.0625 = fieldNorm(doc=4230)
          0.16453147 = weight(abstract_txt:pages in 4230) [ClassicSimilarity], result of:
            0.16453147 = score(doc=4230,freq=2.0), product of:
              0.33207303 = queryWeight, product of:
                2.8214505 = boost
                5.6055775 = idf(docFreq=441, maxDocs=44218)
                0.020996206 = queryNorm
              0.49546772 = fieldWeight in 4230, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.6055775 = idf(docFreq=441, maxDocs=44218)
                0.0625 = fieldNorm(doc=4230)
        0.2 = coord(5/25)
  2. Yang, C.C.; Liu, N.: Web site topic-hierarchy generation based on link structure (2009) 0.08
    0.08442736 = sum of:
      0.08442736 = product of:
        0.527671 = sum of:
          0.052257784 = weight(abstract_txt:always in 2738) [ClassicSimilarity], result of:
            0.052257784 = score(doc=2738,freq=1.0), product of:
              0.13504273 = queryWeight, product of:
                1.0387968 = boost
                6.1915555 = idf(docFreq=245, maxDocs=44218)
                0.020996206 = queryNorm
              0.38697222 = fieldWeight in 2738, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.1915555 = idf(docFreq=245, maxDocs=44218)
                0.0625 = fieldNorm(doc=2738)
          0.09430196 = weight(abstract_txt:correspond in 2738) [ClassicSimilarity], result of:
            0.09430196 = score(doc=2738,freq=1.0), product of:
              0.20016326 = queryWeight, product of:
                1.264699 = boost
                7.538004 = idf(docFreq=63, maxDocs=44218)
                0.020996206 = queryNorm
              0.47112525 = fieldWeight in 2738, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.538004 = idf(docFreq=63, maxDocs=44218)
                0.0625 = fieldNorm(doc=2738)
          0.21657978 = weight(abstract_txt:graph in 2738) [ClassicSimilarity], result of:
            0.21657978 = score(doc=2738,freq=3.0), product of:
              0.30438182 = queryWeight, product of:
                2.2055624 = boost
                6.572923 = idf(docFreq=167, maxDocs=44218)
                0.020996206 = queryNorm
              0.7115398 = fieldWeight in 2738, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                6.572923 = idf(docFreq=167, maxDocs=44218)
                0.0625 = fieldNorm(doc=2738)
          0.16453147 = weight(abstract_txt:pages in 2738) [ClassicSimilarity], result of:
            0.16453147 = score(doc=2738,freq=2.0), product of:
              0.33207303 = queryWeight, product of:
                2.8214505 = boost
                5.6055775 = idf(docFreq=441, maxDocs=44218)
                0.020996206 = queryNorm
              0.49546772 = fieldWeight in 2738, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.6055775 = idf(docFreq=441, maxDocs=44218)
                0.0625 = fieldNorm(doc=2738)
        0.16 = coord(4/25)
  3. Gomez, I.: Coping with the problem of subject classification diversity (1996) 0.08
    0.0787557 = sum of:
      0.0787557 = product of:
        0.3937785 = sum of:
          0.06992789 = weight(abstract_txt:meet in 5074) [ClassicSimilarity], result of:
            0.06992789 = score(doc=5074,freq=1.0), product of:
              0.12514399 = queryWeight, product of:
                5.9603148 = idf(docFreq=309, maxDocs=44218)
                0.020996206 = queryNorm
              0.5587795 = fieldWeight in 5074, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.9603148 = idf(docFreq=309, maxDocs=44218)
                0.09375 = fieldNorm(doc=5074)
          0.08063326 = weight(abstract_txt:specialized in 5074) [ClassicSimilarity], result of:
            0.08063326 = score(doc=5074,freq=1.0), product of:
              0.1376108 = queryWeight, product of:
                1.0486275 = boost
                6.2501497 = idf(docFreq=231, maxDocs=44218)
                0.020996206 = queryNorm
              0.58595157 = fieldWeight in 5074, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.2501497 = idf(docFreq=231, maxDocs=44218)
                0.09375 = fieldNorm(doc=5074)
          0.10664247 = weight(abstract_txt:thematic in 5074) [ClassicSimilarity], result of:
            0.10664247 = score(doc=5074,freq=1.0), product of:
              0.16580456 = queryWeight, product of:
                1.1510475 = boost
                6.8606052 = idf(docFreq=125, maxDocs=44218)
                0.020996206 = queryNorm
              0.64318174 = fieldWeight in 5074, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8606052 = idf(docFreq=125, maxDocs=44218)
                0.09375 = fieldNorm(doc=5074)
          0.04623559 = weight(abstract_txt:documents in 5074) [ClassicSimilarity], result of:
            0.04623559 = score(doc=5074,freq=1.0), product of:
              0.119665965 = queryWeight, product of:
                1.3829144 = boost
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.020996206 = queryNorm
              0.38637212 = fieldWeight in 5074, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.09375 = fieldNorm(doc=5074)
          0.09033929 = weight(abstract_txt:method in 5074) [ClassicSimilarity], result of:
            0.09033929 = score(doc=5074,freq=1.0), product of:
              0.21409237 = queryWeight, product of:
                2.265459 = boost
                4.50095 = idf(docFreq=1333, maxDocs=44218)
                0.020996206 = queryNorm
              0.42196405 = fieldWeight in 5074, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.50095 = idf(docFreq=1333, maxDocs=44218)
                0.09375 = fieldNorm(doc=5074)
        0.2 = coord(5/25)
  4. Baker, T.: Languages for Dublin Core (1998) 0.08
    0.07823058 = sum of:
      0.07823058 = product of:
        0.32596076 = sum of:
          0.04120541 = weight(abstract_txt:meet in 1257) [ClassicSimilarity], result of:
            0.04120541 = score(doc=1257,freq=2.0), product of:
              0.12514399 = queryWeight, product of:
                5.9603148 = idf(docFreq=309, maxDocs=44218)
                0.020996206 = queryNorm
              0.329264 = fieldWeight in 1257, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.9603148 = idf(docFreq=309, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1257)
          0.0475136 = weight(abstract_txt:specialized in 1257) [ClassicSimilarity], result of:
            0.0475136 = score(doc=1257,freq=2.0), product of:
              0.1376108 = queryWeight, product of:
                1.0486275 = boost
                6.2501497 = idf(docFreq=231, maxDocs=44218)
                0.020996206 = queryNorm
              0.34527525 = fieldWeight in 1257, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.2501497 = idf(docFreq=231, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1257)
          0.036742292 = weight(abstract_txt:looking in 1257) [ClassicSimilarity], result of:
            0.036742292 = score(doc=1257,freq=1.0), product of:
              0.14607011 = queryWeight, product of:
                1.0803778 = boost
                6.439392 = idf(docFreq=191, maxDocs=44218)
                0.020996206 = queryNorm
              0.25153875 = fieldWeight in 1257, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.439392 = idf(docFreq=191, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1257)
          0.01926483 = weight(abstract_txt:documents in 1257) [ClassicSimilarity], result of:
            0.01926483 = score(doc=1257,freq=1.0), product of:
              0.119665965 = queryWeight, product of:
                1.3829144 = boost
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.020996206 = queryNorm
              0.16098839 = fieldWeight in 1257, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1257)
          0.07840245 = weight(abstract_txt:metadata in 1257) [ClassicSimilarity], result of:
            0.07840245 = score(doc=1257,freq=6.0), product of:
              0.16786617 = queryWeight, product of:
                1.6379158 = boost
                4.881247 = idf(docFreq=911, maxDocs=44218)
                0.020996206 = queryNorm
              0.46705332 = fieldWeight in 1257, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                4.881247 = idf(docFreq=911, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1257)
          0.10283217 = weight(abstract_txt:pages in 1257) [ClassicSimilarity], result of:
            0.10283217 = score(doc=1257,freq=2.0), product of:
              0.33207303 = queryWeight, product of:
                2.8214505 = boost
                5.6055775 = idf(docFreq=441, maxDocs=44218)
                0.020996206 = queryNorm
              0.30966732 = fieldWeight in 1257, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.6055775 = idf(docFreq=441, maxDocs=44218)
                0.0390625 = fieldNorm(doc=1257)
        0.24 = coord(6/25)
  5. Collins-Thompson, K.; Callan, J.: Predicting reading difficulty with statistical language models (2005) 0.08
    0.077876754 = sum of:
      0.077876754 = product of:
        0.38938376 = sum of:
          0.048201054 = weight(abstract_txt:weil in 4579) [ClassicSimilarity], result of:
            0.048201054 = score(doc=4579,freq=1.0), product of:
              0.12796019 = queryWeight, product of:
                1.0111892 = boost
                6.027006 = idf(docFreq=289, maxDocs=44218)
                0.020996206 = queryNorm
              0.37668788 = fieldWeight in 4579, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.027006 = idf(docFreq=289, maxDocs=44218)
                0.0625 = fieldNorm(doc=4579)
          0.06303679 = weight(abstract_txt:classifying in 4579) [ClassicSimilarity], result of:
            0.06303679 = score(doc=4579,freq=1.0), product of:
              0.15302648 = queryWeight, product of:
                1.1058043 = boost
                6.590942 = idf(docFreq=164, maxDocs=44218)
                0.020996206 = queryNorm
              0.41193387 = fieldWeight in 4579, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.590942 = idf(docFreq=164, maxDocs=44218)
                0.0625 = fieldNorm(doc=4579)
          0.05338826 = weight(abstract_txt:documents in 4579) [ClassicSimilarity], result of:
            0.05338826 = score(doc=4579,freq=3.0), product of:
              0.119665965 = queryWeight, product of:
                1.3829144 = boost
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.020996206 = queryNorm
              0.44614407 = fieldWeight in 4579, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.0625 = fieldNorm(doc=4579)
          0.06022619 = weight(abstract_txt:method in 4579) [ClassicSimilarity], result of:
            0.06022619 = score(doc=4579,freq=1.0), product of:
              0.21409237 = queryWeight, product of:
                2.265459 = boost
                4.50095 = idf(docFreq=1333, maxDocs=44218)
                0.020996206 = queryNorm
              0.28130937 = fieldWeight in 4579, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.50095 = idf(docFreq=1333, maxDocs=44218)
                0.0625 = fieldNorm(doc=4579)
          0.16453147 = weight(abstract_txt:pages in 4579) [ClassicSimilarity], result of:
            0.16453147 = score(doc=4579,freq=2.0), product of:
              0.33207303 = queryWeight, product of:
                2.8214505 = boost
                5.6055775 = idf(docFreq=441, maxDocs=44218)
                0.020996206 = queryNorm
              0.49546772 = fieldWeight in 4579, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.6055775 = idf(docFreq=441, maxDocs=44218)
                0.0625 = fieldNorm(doc=4579)
        0.2 = coord(5/25)
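
The score explanations above all follow the same arithmetic, which can be read directly off the trees: each term score is queryWeight (boost x idf x queryNorm) multiplied by fieldWeight (tf x idf x fieldNorm), with tf equal to the square root of the term frequency, and the document score is the sum of the term scores multiplied by the coord factor. The sketch below recomputes the score of the first entry (doc 4230) from the figures shown. The function names are hypothetical, and the code only mirrors the displayed numbers rather than calling the search engine itself.

```python
import math


def term_score(freq, idf, query_norm, field_norm, boost=1.0):
    """Score of one matching term, as laid out in the explanation trees."""
    query_weight = boost * idf * query_norm            # "queryWeight, product of"
    field_weight = math.sqrt(freq) * idf * field_norm  # "fieldWeight, product of"
    return query_weight * field_weight


def document_score(term_scores, matched, total):
    """Sum of the term scores scaled by coord(matched/total)."""
    return sum(term_scores) * (matched / total)


QUERY_NORM = 0.020996206
FIELD_NORM_4230 = 0.0625

# (freq, idf, boost) for each matching term of doc 4230, copied from the listing.
terms_4230 = [
    (1.0, 5.9603148, 1.0),        # meet (no boost line shown, so 1.0 is assumed)
    (1.0, 4.1213026, 1.3829144),  # documents
    (1.0, 4.881247, 1.6379158),   # metadata
    (2.0, 6.572923, 2.2055624),   # graph
    (2.0, 5.6055775, 2.8214505),  # pages
]

scores = [
    term_score(freq, idf, QUERY_NORM, FIELD_NORM_4230, boost)
    for freq, idf, boost in terms_4230
]
total = document_score(scores, matched=5, total=25)
```

Up to floating-point rounding, `total` agrees with the 0.09400454 shown for the first entry, which is displayed as 0.09 in the result line.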