Document (#31927)

Author
Henzinger, M.R.
Title
Link analysis in Web information retrieval
Source
IEEE data engineering bulletin. 23(2000) no.3, pp.3-8
Year
2000
Abstract
The analysis of the hyperlink structure of the web has led to significant improvements in web information retrieval. This survey describes two successful link analysis algorithms and the state of the art of the field.
Content
The goal of information retrieval is to find all documents relevant for a user query in a collection of documents. Decades of research in information retrieval were successful in developing and refining techniques that are solely word-based (see e.g., [2]). With the advent of the web, new sources of information became available, among them the hyperlinks between documents and records of user behavior. To be precise, hypertexts (i.e., collections of documents connected by hyperlinks) have existed and have been studied for a long time. What was new was the large number of hyperlinks created by independent individuals. Hyperlinks provide a valuable source of information for web information retrieval, as we will show in this article. This area of information retrieval is commonly called link analysis. Why would one expect hyperlinks to be useful? A hyperlink is a reference to a web page B that is contained in a web page A. When the hyperlink is clicked on in a web browser, the browser displays page B. This functionality alone is not helpful for web information retrieval. However, the way hyperlinks are typically used by authors of web pages can give them valuable information content. Typically, authors create links because they think they will be useful for the readers of the pages. Thus, links are usually either navigational aids that, for example, bring the reader back to the homepage of the site, or links that point to pages whose content augments the content of the current page. The second kind of link tends to point to high-quality pages that might be on the same topic as the page containing the link.
Theme
Retrieval algorithms
Object
Google

Similar documents (author)

  1. Henzinger, M.R.: Hyperlink analysis for the Web (2001) 6.19
    6.190705 = sum of:
      6.190705 = weight(author_txt:henzinger in 8) [ClassicSimilarity], result of:
        6.190705 = fieldWeight in 8, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.905128 = idf(docFreq=5, maxDocs=44218)
          0.625 = fieldNorm(doc=8)
    
  2. Dean, J.; Henzinger, M.R.: Finding related pages in the World Wide Web (1999) 4.95
    4.952564 = sum of:
      4.952564 = weight(author_txt:henzinger in 6284) [ClassicSimilarity], result of:
        4.952564 = fieldWeight in 6284, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.905128 = idf(docFreq=5, maxDocs=44218)
          0.5 = fieldNorm(doc=6284)
    
  3. Henzinger, M.; Pöppe, C.: "Qualität der Suchergebnisse ist unser höchstes Ziel" : Suchmaschine Google (2002) 4.95
    4.952564 = sum of:
      4.952564 = weight(author_txt:henzinger in 851) [ClassicSimilarity], result of:
        4.952564 = fieldWeight in 851, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.905128 = idf(docFreq=5, maxDocs=44218)
          0.5 = fieldNorm(doc=851)
    
  4. Henzinger, M.; Wiesemann, M.: Google-Forschungschefin Monika Henzinger beklagt Manipulationen von Suchmaschinen : "Tricks der Porno-Branche" (2002) 4.95
    4.952564 = sum of:
      4.952564 = weight(author_txt:henzinger in 1137) [ClassicSimilarity], result of:
        4.952564 = fieldWeight in 1137, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.905128 = idf(docFreq=5, maxDocs=44218)
          0.5 = fieldNorm(doc=1137)
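
The author-match scores above can be reproduced by hand. The following is a minimal sketch, assuming the usual ClassicSimilarity definitions tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)); the function names are illustrative and not part of any library. The fieldNorm values are copied from the listing, not computed.

```python
import math

def classic_idf(doc_freq: int, max_docs: int) -> float:
    # Inverse document frequency as assumed for ClassicSimilarity.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def classic_tf(freq: float) -> float:
    # Term frequency factor: square root of the raw term count.
    return math.sqrt(freq)

def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    # fieldWeight = tf * idf * fieldNorm, as in the explain tree.
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm

# Values taken from the listing: docFreq=5, maxDocs=44218.
score_doc_8 = field_weight(1.0, 5, 44218, 0.625)    # first author match
score_doc_6284 = field_weight(1.0, 5, 44218, 0.5)   # second author match
print(round(score_doc_8, 4), round(score_doc_6284, 4))
```

Under these assumptions the only thing separating the first author match from the other three is fieldNorm (0.625 versus 0.5), since tf and idf are identical for all four.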
    

Similar documents (content)

  1. Rasmussen, E.: Clustering algorithms (1992) 0.38
    0.37879914 = sum of:
      0.37879914 = product of:
        0.81171244 = sum of:
          0.039890856 = weight(abstract_txt:structure in 3513) [ClassicSimilarity], result of:
            0.039890856 = score(doc=3513,freq=1.0), product of:
              0.14645566 = queryWeight, product of:
                1.8060372 = boost
                4.3579993 = idf(docFreq=1538, maxDocs=44218)
                0.018607683 = queryNorm
              0.27237496 = fieldWeight in 3513, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3579993 = idf(docFreq=1538, maxDocs=44218)
                0.0625 = fieldNorm(doc=3513)
          0.061779536 = weight(abstract_txt:field in 3513) [ClassicSimilarity], result of:
            0.061779536 = score(doc=3513,freq=2.0), product of:
              0.15560028 = queryWeight, product of:
                1.8615675 = boost
                4.491995 = idf(docFreq=1345, maxDocs=44218)
                0.018607683 = queryNorm
              0.39704 = fieldWeight in 3513, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.491995 = idf(docFreq=1345, maxDocs=44218)
                0.0625 = fieldNorm(doc=3513)
          0.013677204 = weight(abstract_txt:information in 3513) [ClassicSimilarity], result of:
            0.013677204 = score(doc=3513,freq=1.0), product of:
              0.09039245 = queryWeight, product of:
                2.0065718 = boost
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.018607683 = queryNorm
              0.15130915 = fieldWeight in 3513, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.0625 = fieldNorm(doc=3513)
          0.12675403 = weight(abstract_txt:algorithms in 3513) [ClassicSimilarity], result of:
            0.12675403 = score(doc=3513,freq=2.0), product of:
              0.2512398 = queryWeight, product of:
                2.365472 = boost
                5.707926 = idf(docFreq=398, maxDocs=44218)
                0.018607683 = queryNorm
              0.5045141 = fieldWeight in 3513, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.707926 = idf(docFreq=398, maxDocs=44218)
                0.0625 = fieldNorm(doc=3513)
          0.070067905 = weight(abstract_txt:retrieval in 3513) [ClassicSimilarity], result of:
            0.070067905 = score(doc=3513,freq=3.0), product of:
              0.1862543 = queryWeight, product of:
                2.8803267 = boost
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.018607683 = queryNorm
              0.37619486 = fieldWeight in 3513, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.4751394 = idf(docFreq=3720, maxDocs=44218)
                0.0625 = fieldNorm(doc=3513)
          0.14102839 = weight(abstract_txt:analysis in 3513) [ClassicSimilarity], result of:
            0.14102839 = score(doc=3513,freq=4.0), product of:
              0.30880338 = queryWeight, product of:
                4.542294 = boost
                3.6535451 = idf(docFreq=3112, maxDocs=44218)
                0.018607683 = queryNorm
              0.45669314 = fieldWeight in 3513, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.6535451 = idf(docFreq=3112, maxDocs=44218)
                0.0625 = fieldNorm(doc=3513)
          0.35851455 = weight(abstract_txt:link in 3513) [ClassicSimilarity], result of:
            0.35851455 = score(doc=3513,freq=4.0), product of:
              0.5024796 = queryWeight, product of:
                4.730944 = boost
                5.707926 = idf(docFreq=398, maxDocs=44218)
                0.018607683 = queryNorm
              0.7134907 = fieldWeight in 3513, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.707926 = idf(docFreq=398, maxDocs=44218)
                0.0625 = fieldNorm(doc=3513)
        0.46666667 = coord(7/15)
    
  2. Henzinger, M.R.: Hyperlink analysis for the Web (2001) 0.36
    0.35815203 = sum of:
      0.35815203 = product of:
        1.34307 = sum of:
          0.041031614 = weight(abstract_txt:information in 8) [ClassicSimilarity], result of:
            0.041031614 = score(doc=8,freq=1.0), product of:
              0.09039245 = queryWeight, product of:
                2.0065718 = boost
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.018607683 = queryNorm
              0.45392746 = fieldWeight in 8, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.1875 = fieldNorm(doc=8)
          0.38026208 = weight(abstract_txt:algorithms in 8) [ClassicSimilarity], result of:
            0.38026208 = score(doc=8,freq=2.0), product of:
              0.2512398 = queryWeight, product of:
                2.365472 = boost
                5.707926 = idf(docFreq=398, maxDocs=44218)
                0.018607683 = queryNorm
              1.5135423 = fieldWeight in 8, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.707926 = idf(docFreq=398, maxDocs=44218)
                0.1875 = fieldNorm(doc=8)
          0.7102337 = weight(abstract_txt:hyperlink in 8) [ClassicSimilarity], result of:
            0.7102337 = score(doc=8,freq=1.0), product of:
              0.48007667 = queryWeight, product of:
                3.2698581 = boost
                7.890225 = idf(docFreq=44, maxDocs=44218)
                0.018607683 = queryNorm
              1.4794172 = fieldWeight in 8, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.890225 = idf(docFreq=44, maxDocs=44218)
                0.1875 = fieldNorm(doc=8)
          0.21154258 = weight(abstract_txt:analysis in 8) [ClassicSimilarity], result of:
            0.21154258 = score(doc=8,freq=1.0), product of:
              0.30880338 = queryWeight, product of:
                4.542294 = boost
                3.6535451 = idf(docFreq=3112, maxDocs=44218)
                0.018607683 = queryNorm
              0.6850397 = fieldWeight in 8, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6535451 = idf(docFreq=3112, maxDocs=44218)
                0.1875 = fieldNorm(doc=8)
        0.26666668 = coord(4/15)
    
  3. Yang, P.; Gao, W.; Tan, Q.; Wong, K.-F.: A link-bridged topic model for cross-domain document classification (2013) 0.30
    0.30315444 = sum of:
      0.30315444 = product of:
        0.7578861 = sum of:
          0.006771631 = weight(abstract_txt:this in 2706) [ClassicSimilarity], result of:
            0.006771631 = score(doc=2706,freq=1.0), product of:
              0.044900667 = queryWeight, product of:
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.018607683 = queryNorm
              0.1508136 = fieldWeight in 2706, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.0625 = fieldNorm(doc=2706)
          0.039890856 = weight(abstract_txt:structure in 2706) [ClassicSimilarity], result of:
            0.039890856 = score(doc=2706,freq=1.0), product of:
              0.14645566 = queryWeight, product of:
                1.8060372 = boost
                4.3579993 = idf(docFreq=1538, maxDocs=44218)
                0.018607683 = queryNorm
              0.27237496 = fieldWeight in 2706, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.3579993 = idf(docFreq=1538, maxDocs=44218)
                0.0625 = fieldNorm(doc=2706)
          0.05430513 = weight(abstract_txt:state in 2706) [ClassicSimilarity], result of:
            0.05430513 = score(doc=2706,freq=1.0), product of:
              0.17989448 = queryWeight, product of:
                2.001624 = boost
                4.829954 = idf(docFreq=959, maxDocs=44218)
                0.018607683 = queryNorm
              0.30187213 = fieldWeight in 2706, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.829954 = idf(docFreq=959, maxDocs=44218)
                0.0625 = fieldNorm(doc=2706)
          0.019342488 = weight(abstract_txt:information in 2706) [ClassicSimilarity], result of:
            0.019342488 = score(doc=2706,freq=2.0), product of:
              0.09039245 = queryWeight, product of:
                2.0065718 = boost
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.018607683 = queryNorm
              0.21398345 = fieldWeight in 2706, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.0625 = fieldNorm(doc=2706)
          0.23674455 = weight(abstract_txt:hyperlink in 2706) [ClassicSimilarity], result of:
            0.23674455 = score(doc=2706,freq=1.0), product of:
              0.48007667 = queryWeight, product of:
                3.2698581 = boost
                7.890225 = idf(docFreq=44, maxDocs=44218)
                0.018607683 = queryNorm
              0.49313906 = fieldWeight in 2706, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.890225 = idf(docFreq=44, maxDocs=44218)
                0.0625 = fieldNorm(doc=2706)
          0.40083146 = weight(abstract_txt:link in 2706) [ClassicSimilarity], result of:
            0.40083146 = score(doc=2706,freq=5.0), product of:
              0.5024796 = queryWeight, product of:
                4.730944 = boost
                5.707926 = idf(docFreq=398, maxDocs=44218)
                0.018607683 = queryNorm
              0.7977069 = fieldWeight in 2706, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.707926 = idf(docFreq=398, maxDocs=44218)
                0.0625 = fieldNorm(doc=2706)
        0.4 = coord(6/15)
    
  4. Thelwall, M.: A comparison of link and URL citation counting (2011) 0.30
    0.29923457 = sum of:
      0.29923457 = product of:
        0.89770365 = sum of:
          0.009576532 = weight(abstract_txt:this in 4533) [ClassicSimilarity], result of:
            0.009576532 = score(doc=4533,freq=2.0), product of:
              0.044900667 = queryWeight, product of:
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.018607683 = queryNorm
              0.21328263 = fieldWeight in 4533, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.0625 = fieldNorm(doc=4533)
          0.07385221 = weight(abstract_txt:significant in 4533) [ClassicSimilarity], result of:
            0.07385221 = score(doc=4533,freq=2.0), product of:
              0.17526275 = queryWeight, product of:
                1.9756882 = boost
                4.76737 = idf(docFreq=1021, maxDocs=44218)
                0.018607683 = queryNorm
              0.42137998 = fieldWeight in 4533, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.76737 = idf(docFreq=1021, maxDocs=44218)
                0.0625 = fieldNorm(doc=4533)
          0.23674455 = weight(abstract_txt:hyperlink in 4533) [ClassicSimilarity], result of:
            0.23674455 = score(doc=4533,freq=1.0), product of:
              0.48007667 = queryWeight, product of:
                3.2698581 = boost
                7.890225 = idf(docFreq=44, maxDocs=44218)
                0.018607683 = queryNorm
              0.49313906 = fieldWeight in 4533, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.890225 = idf(docFreq=44, maxDocs=44218)
                0.0625 = fieldNorm(doc=4533)
          0.070514195 = weight(abstract_txt:analysis in 4533) [ClassicSimilarity], result of:
            0.070514195 = score(doc=4533,freq=1.0), product of:
              0.30880338 = queryWeight, product of:
                4.542294 = boost
                3.6535451 = idf(docFreq=3112, maxDocs=44218)
                0.018607683 = queryNorm
              0.22834657 = fieldWeight in 4533, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6535451 = idf(docFreq=3112, maxDocs=44218)
                0.0625 = fieldNorm(doc=4533)
          0.5070161 = weight(abstract_txt:link in 4533) [ClassicSimilarity], result of:
            0.5070161 = score(doc=4533,freq=8.0), product of:
              0.5024796 = queryWeight, product of:
                4.730944 = boost
                5.707926 = idf(docFreq=398, maxDocs=44218)
                0.018607683 = queryNorm
              1.0090282 = fieldWeight in 4533, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                5.707926 = idf(docFreq=398, maxDocs=44218)
                0.0625 = fieldNorm(doc=4533)
        0.33333334 = coord(5/15)
    
  5. Thelwall, M.; Li, X.; Barjak, F.; Robinson, S.: Assessing the international web connectivity of research groups (2008) 0.30
    0.2984061 = sum of:
      0.2984061 = product of:
        0.7460152 = sum of:
          0.011728808 = weight(abstract_txt:this in 1401) [ClassicSimilarity], result of:
            0.011728808 = score(doc=1401,freq=3.0), product of:
              0.044900667 = queryWeight, product of:
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.018607683 = queryNorm
              0.2612168 = fieldWeight in 1401, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.4130175 = idf(docFreq=10762, maxDocs=44218)
                0.0625 = fieldNorm(doc=1401)
          0.061779536 = weight(abstract_txt:field in 1401) [ClassicSimilarity], result of:
            0.061779536 = score(doc=1401,freq=2.0), product of:
              0.15560028 = queryWeight, product of:
                1.8615675 = boost
                4.491995 = idf(docFreq=1345, maxDocs=44218)
                0.018607683 = queryNorm
              0.39704 = fieldWeight in 1401, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.491995 = idf(docFreq=1345, maxDocs=44218)
                0.0625 = fieldNorm(doc=1401)
          0.013677204 = weight(abstract_txt:information in 1401) [ClassicSimilarity], result of:
            0.013677204 = score(doc=1401,freq=1.0), product of:
              0.09039245 = queryWeight, product of:
                2.0065718 = boost
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.018607683 = queryNorm
              0.15130915 = fieldWeight in 1401, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.0625 = fieldNorm(doc=1401)
          0.33480734 = weight(abstract_txt:hyperlink in 1401) [ClassicSimilarity], result of:
            0.33480734 = score(doc=1401,freq=2.0), product of:
              0.48007667 = queryWeight, product of:
                3.2698581 = boost
                7.890225 = idf(docFreq=44, maxDocs=44218)
                0.018607683 = queryNorm
              0.6974039 = fieldWeight in 1401, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.890225 = idf(docFreq=44, maxDocs=44218)
                0.0625 = fieldNorm(doc=1401)
          0.070514195 = weight(abstract_txt:analysis in 1401) [ClassicSimilarity], result of:
            0.070514195 = score(doc=1401,freq=1.0), product of:
              0.30880338 = queryWeight, product of:
                4.542294 = boost
                3.6535451 = idf(docFreq=3112, maxDocs=44218)
                0.018607683 = queryNorm
              0.22834657 = fieldWeight in 1401, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6535451 = idf(docFreq=3112, maxDocs=44218)
                0.0625 = fieldNorm(doc=1401)
          0.25350806 = weight(abstract_txt:link in 1401) [ClassicSimilarity], result of:
            0.25350806 = score(doc=1401,freq=2.0), product of:
              0.5024796 = queryWeight, product of:
                4.730944 = boost
                5.707926 = idf(docFreq=398, maxDocs=44218)
                0.018607683 = queryNorm
              0.5045141 = fieldWeight in 1401, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.707926 = idf(docFreq=398, maxDocs=44218)
                0.0625 = fieldNorm(doc=1401)
        0.4 = coord(6/15)
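
The content-match scores combine per-term scores with a coordination factor. The sketch below recomputes the first content match (doc 3513) from the boost, freq, and docFreq values printed in its explain tree. It assumes the ClassicSimilarity formulas tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)); the helper names are illustrative only, and queryNorm, fieldNorm, and the boosts are copied from the listing, not derived.

```python
import math

QUERY_NORM = 0.018607683   # copied from the explain output
MAX_DOCS = 44218
FIELD_NORM = 0.0625        # fieldNorm(doc=3513)

def idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(boost: float, freq: float, doc_freq: int, field_norm: float) -> float:
    # score = queryWeight * fieldWeight
    i = idf(doc_freq)
    query_weight = boost * i * QUERY_NORM
    field_weight = math.sqrt(freq) * i * field_norm
    return query_weight * field_weight

# (term, boost, freq, docFreq) as listed for doc 3513
terms = [
    ("structure",   1.8060372, 1.0, 1538),
    ("field",       1.8615675, 2.0, 1345),
    ("information", 2.0065718, 1.0, 10677),
    ("algorithms",  2.365472,  2.0, 398),
    ("retrieval",   2.8803267, 3.0, 3720),
    ("analysis",    4.542294,  4.0, 3112),
    ("link",        4.730944,  4.0, 398),
]

raw_sum = sum(term_score(b, f, df, FIELD_NORM) for _, b, f, df in terms)
coord = 7 / 15             # 7 of the 15 query terms matched
total = raw_sum * coord
print(round(raw_sum, 4), round(total, 4))
```

The coord factor explains why doc 8 (Hyperlink analysis for the Web) ranks below doc 3513 despite a larger raw sum (1.34307 versus 0.81171244): it matches only 4 of the 15 query terms, so its sum is scaled by 4/15 instead of 7/15.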