Document (#26427)

Author
Bartell, B.T.
Cottrell, G.W.
Belew, R.K.
Title
Representing documents using an explicit model of their similarities
Source
Journal of the American Society for Information Science. 46(1995) no.4, pp.254-271
Year
1995
Abstract
Proposes a method for creating vector space representations of documents based on modelling target interdocument similarity values. The target similarity values are assumed to capture semantic relationships, or associations, between the documents. The vector representations are chosen so that the inner product similarities between document vector pairs closely match their target interdocument similarities. The method is closely related to the Latent Semantic Indexing approach.
Object
Latent Semantic Indexing
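
The idea in the abstract, choosing document vectors whose inner products approximate given target similarities, can be illustrated with a small sketch. This is not claimed to be the authors' algorithm: it uses a truncated eigendecomposition of the target matrix, the standard least-squares solution for this kind of fit and the same linear-algebra machinery that underlies Latent Semantic Indexing. The function name `fit_document_vectors`, the toy matrix `target`, and the choice of `k` are assumptions made for illustration only.

```python
import numpy as np

def fit_document_vectors(target_sim, k):
    """Return an (n, k) array X such that X @ X.T approximates target_sim.

    Hypothetical helper: keeps the k largest eigenpairs of the symmetrised
    target matrix, a common way to fit inner products in the least-squares sense.
    """
    sym = (target_sim + target_sim.T) / 2.0        # enforce symmetry
    eigvals, eigvecs = np.linalg.eigh(sym)         # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]          # indices of the k largest
    top_vals = np.clip(eigvals[order], 0.0, None)  # drop negative eigenvalues
    return eigvecs[:, order] * np.sqrt(top_vals)   # scale each kept eigenvector

# Toy target similarities for three documents (made-up values).
target = np.array([
    [1.0, 0.8, 0.1],
    [0.8, 1.0, 0.2],
    [0.1, 0.2, 1.0],
])

vectors = fit_document_vectors(target, k=2)  # one 2-dimensional vector per document
approx = vectors @ vectors.T                 # inner-product similarities of the fit
print(np.round(approx, 3))
```

With `k` equal to the number of documents and a positive semidefinite target, the inner products reproduce the target exactly; smaller `k` trades accuracy for a more compact representation.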

Similar documents (content)

  1. Martin, D.I.; Berry, M.W.: Latent Semantic Indexing (2009) 0.23
    0.22589323 = sum of:
      0.22589323 = product of:
        0.8067615 = sum of:
          0.020405142 = weight(abstract_txt:between in 3834) [ClassicSimilarity], result of:
            0.020405142 = score(doc=3834,freq=1.0), product of:
              0.075413465 = queryWeight, product of:
                1.251295 = boost
                3.4633842 = idf(docFreq=3764, maxDocs=44218)
                0.017401574 = queryNorm
              0.2705769 = fieldWeight in 3834, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4633842 = idf(docFreq=3764, maxDocs=44218)
                0.078125 = fieldNorm(doc=3834)
          0.118945435 = weight(abstract_txt:latent in 3834) [ClassicSimilarity], result of:
            0.118945435 = score(doc=3834,freq=2.0), product of:
              0.15387486 = queryWeight, product of:
                1.2638749 = boost
                6.996407 = idf(docFreq=109, maxDocs=44218)
                0.017401574 = queryNorm
              0.7730011 = fieldWeight in 3834, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.996407 = idf(docFreq=109, maxDocs=44218)
                0.078125 = fieldNorm(doc=3834)
          0.087993205 = weight(abstract_txt:semantic in 3834) [ClassicSimilarity], result of:
            0.087993205 = score(doc=3834,freq=4.0), product of:
              0.12586412 = queryWeight, product of:
                1.6165391 = boost
                4.4743214 = idf(docFreq=1369, maxDocs=44218)
                0.017401574 = queryNorm
              0.6991127 = fieldWeight in 3834, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.4743214 = idf(docFreq=1369, maxDocs=44218)
                0.078125 = fieldNorm(doc=3834)
          0.06333811 = weight(abstract_txt:method in 3834) [ClassicSimilarity], result of:
            0.06333811 = score(doc=3834,freq=2.0), product of:
              0.1273667 = queryWeight, product of:
                1.6261598 = boost
                4.50095 = idf(docFreq=1333, maxDocs=44218)
                0.017401574 = queryNorm
              0.49728936 = fieldWeight in 3834, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.50095 = idf(docFreq=1333, maxDocs=44218)
                0.078125 = fieldNorm(doc=3834)
          0.10314837 = weight(abstract_txt:documents in 3834) [ClassicSimilarity], result of:
            0.10314837 = score(doc=3834,freq=4.0), product of:
              0.16017982 = queryWeight, product of:
                2.233494 = boost
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.017401574 = queryNorm
              0.64395356 = fieldWeight in 3834, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.078125 = fieldNorm(doc=3834)
          0.20427682 = weight(abstract_txt:vector in 3834) [ClassicSimilarity], result of:
            0.20427682 = score(doc=3834,freq=1.0), product of:
              0.40098888 = queryWeight, product of:
                3.5338414 = boost
                6.5207376 = idf(docFreq=176, maxDocs=44218)
                0.017401574 = queryNorm
              0.5094326 = fieldWeight in 3834, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.5207376 = idf(docFreq=176, maxDocs=44218)
                0.078125 = fieldNorm(doc=3834)
          0.20865443 = weight(abstract_txt:similarities in 3834) [ClassicSimilarity], result of:
            0.20865443 = score(doc=3834,freq=1.0), product of:
              0.40669733 = queryWeight, product of:
                3.5589063 = boost
                6.5669885 = idf(docFreq=168, maxDocs=44218)
                0.017401574 = queryNorm
              0.51304597 = fieldWeight in 3834, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.5669885 = idf(docFreq=168, maxDocs=44218)
                0.078125 = fieldNorm(doc=3834)
        0.28 = coord(7/25)
    
  2. Bartell, B.T.; Cottrell, G.W.; Belew, R.K.: Optimizing similarity using multi-query relevance feedback (1998) 0.19
    0.18973859 = sum of:
      0.18973859 = product of:
        0.6776378 = sum of:
          0.07742952 = weight(abstract_txt:similarity in 1152) [ClassicSimilarity], result of:
            0.07742952 = score(doc=1152,freq=4.0), product of:
              0.10644785 = queryWeight, product of:
                1.0512081 = boost
                5.8191514 = idf(docFreq=356, maxDocs=44218)
                0.017401574 = queryNorm
              0.7273939 = fieldWeight in 1152, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.8191514 = idf(docFreq=356, maxDocs=44218)
                0.0625 = fieldNorm(doc=1152)
          0.050875496 = weight(abstract_txt:match in 1152) [ClassicSimilarity], result of:
            0.050875496 = score(doc=1152,freq=1.0), product of:
              0.12771001 = queryWeight, product of:
                1.1514173 = boost
                6.373877 = idf(docFreq=204, maxDocs=44218)
                0.017401574 = queryNorm
              0.39836732 = fieldWeight in 1152, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.373877 = idf(docFreq=204, maxDocs=44218)
                0.0625 = fieldNorm(doc=1152)
          0.016324112 = weight(abstract_txt:between in 1152) [ClassicSimilarity], result of:
            0.016324112 = score(doc=1152,freq=1.0), product of:
              0.075413465 = queryWeight, product of:
                1.251295 = boost
                3.4633842 = idf(docFreq=3764, maxDocs=44218)
                0.017401574 = queryNorm
              0.21646151 = fieldWeight in 1152, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4633842 = idf(docFreq=3764, maxDocs=44218)
                0.0625 = fieldNorm(doc=1152)
          0.062058423 = weight(abstract_txt:method in 1152) [ClassicSimilarity], result of:
            0.062058423 = score(doc=1152,freq=3.0), product of:
              0.1273667 = queryWeight, product of:
                1.6261598 = boost
                4.50095 = idf(docFreq=1333, maxDocs=44218)
                0.017401574 = queryNorm
              0.4872421 = fieldWeight in 1152, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.50095 = idf(docFreq=1333, maxDocs=44218)
                0.0625 = fieldNorm(doc=1152)
          0.07146328 = weight(abstract_txt:documents in 1152) [ClassicSimilarity], result of:
            0.07146328 = score(doc=1152,freq=3.0), product of:
              0.16017982 = queryWeight, product of:
                2.233494 = boost
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.017401574 = queryNorm
              0.44614407 = fieldWeight in 1152, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.0625 = fieldNorm(doc=1152)
          0.16342145 = weight(abstract_txt:vector in 1152) [ClassicSimilarity], result of:
            0.16342145 = score(doc=1152,freq=1.0), product of:
              0.40098888 = queryWeight, product of:
                3.5338414 = boost
                6.5207376 = idf(docFreq=176, maxDocs=44218)
                0.017401574 = queryNorm
              0.4075461 = fieldWeight in 1152, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.5207376 = idf(docFreq=176, maxDocs=44218)
                0.0625 = fieldNorm(doc=1152)
          0.23606552 = weight(abstract_txt:target in 1152) [ClassicSimilarity], result of:
            0.23606552 = score(doc=1152,freq=2.0), product of:
              0.40669733 = queryWeight, product of:
                3.5589063 = boost
                6.5669885 = idf(docFreq=168, maxDocs=44218)
                0.017401574 = queryNorm
              0.58044523 = fieldWeight in 1152, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.5669885 = idf(docFreq=168, maxDocs=44218)
                0.0625 = fieldNorm(doc=1152)
        0.28 = coord(7/25)
    
  3. Liddy, E.D.: An alternative representation for documents and queries (1993) 0.18
    0.17905675 = sum of:
      0.17905675 = product of:
        0.6394884 = sum of:
          0.04839345 = weight(abstract_txt:similarity in 7813) [ClassicSimilarity], result of:
            0.04839345 = score(doc=7813,freq=1.0), product of:
              0.10644785 = queryWeight, product of:
                1.0512081 = boost
                5.8191514 = idf(docFreq=356, maxDocs=44218)
                0.017401574 = queryNorm
              0.4546212 = fieldWeight in 7813, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.8191514 = idf(docFreq=356, maxDocs=44218)
                0.078125 = fieldNorm(doc=7813)
          0.015491463 = weight(abstract_txt:their in 7813) [ClassicSimilarity], result of:
            0.015491463 = score(doc=7813,freq=1.0), product of:
              0.062760174 = queryWeight, product of:
                1.1415037 = boost
                3.1594994 = idf(docFreq=5101, maxDocs=44218)
                0.017401574 = queryNorm
              0.24683589 = fieldWeight in 7813, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.1594994 = idf(docFreq=5101, maxDocs=44218)
                0.078125 = fieldNorm(doc=7813)
          0.043996602 = weight(abstract_txt:semantic in 7813) [ClassicSimilarity], result of:
            0.043996602 = score(doc=7813,freq=1.0), product of:
              0.12586412 = queryWeight, product of:
                1.6165391 = boost
                4.4743214 = idf(docFreq=1369, maxDocs=44218)
                0.017401574 = queryNorm
              0.34955636 = fieldWeight in 7813, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4743214 = idf(docFreq=1369, maxDocs=44218)
                0.078125 = fieldNorm(doc=7813)
          0.06333811 = weight(abstract_txt:method in 7813) [ClassicSimilarity], result of:
            0.06333811 = score(doc=7813,freq=2.0), product of:
              0.1273667 = queryWeight, product of:
                1.6261598 = boost
                4.50095 = idf(docFreq=1333, maxDocs=44218)
                0.017401574 = queryNorm
              0.49728936 = fieldWeight in 7813, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.50095 = idf(docFreq=1333, maxDocs=44218)
                0.078125 = fieldNorm(doc=7813)
          0.10644078 = weight(abstract_txt:representations in 7813) [ClassicSimilarity], result of:
            0.10644078 = score(doc=7813,freq=1.0), product of:
              0.22682689 = queryWeight, product of:
                2.1701138 = boost
                6.006528 = idf(docFreq=295, maxDocs=44218)
                0.017401574 = queryNorm
              0.46925998 = fieldWeight in 7813, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.006528 = idf(docFreq=295, maxDocs=44218)
                0.078125 = fieldNorm(doc=7813)
          0.07293691 = weight(abstract_txt:documents in 7813) [ClassicSimilarity], result of:
            0.07293691 = score(doc=7813,freq=2.0), product of:
              0.16017982 = queryWeight, product of:
                2.233494 = boost
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.017401574 = queryNorm
              0.4553439 = fieldWeight in 7813, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.1213026 = idf(docFreq=1949, maxDocs=44218)
                0.078125 = fieldNorm(doc=7813)
          0.28889108 = weight(abstract_txt:vector in 7813) [ClassicSimilarity], result of:
            0.28889108 = score(doc=7813,freq=2.0), product of:
              0.40098888 = queryWeight, product of:
                3.5338414 = boost
                6.5207376 = idf(docFreq=176, maxDocs=44218)
                0.017401574 = queryNorm
              0.7204466 = fieldWeight in 7813, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.5207376 = idf(docFreq=176, maxDocs=44218)
                0.078125 = fieldNorm(doc=7813)
        0.28 = coord(7/25)
    
  4. Shibata, N.; Kajikawa, Y.; Sakata, I.: Measuring relatedness between communities in a citation network (2011) 0.18
    0.17865439 = sum of:
      0.17865439 = product of:
        0.6380514 = sum of:
          0.04839345 = weight(abstract_txt:similarity in 4484) [ClassicSimilarity], result of:
            0.04839345 = score(doc=4484,freq=1.0), product of:
              0.10644785 = queryWeight, product of:
                1.0512081 = boost
                5.8191514 = idf(docFreq=356, maxDocs=44218)
                0.017401574 = queryNorm
              0.4546212 = fieldWeight in 4484, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.8191514 = idf(docFreq=356, maxDocs=44218)
                0.078125 = fieldNorm(doc=4484)
          0.06316051 = weight(abstract_txt:capture in 4484) [ClassicSimilarity], result of:
            0.06316051 = score(doc=4484,freq=1.0), product of:
              0.1271285 = queryWeight, product of:
                1.1487929 = boost
                6.3593493 = idf(docFreq=207, maxDocs=44218)
                0.017401574 = queryNorm
              0.49682415 = fieldWeight in 4484, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.3593493 = idf(docFreq=207, maxDocs=44218)
                0.078125 = fieldNorm(doc=4484)
          0.020405142 = weight(abstract_txt:between in 4484) [ClassicSimilarity], result of:
            0.020405142 = score(doc=4484,freq=1.0), product of:
              0.075413465 = queryWeight, product of:
                1.251295 = boost
                3.4633842 = idf(docFreq=3764, maxDocs=44218)
                0.017401574 = queryNorm
              0.2705769 = fieldWeight in 4484, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4633842 = idf(docFreq=3764, maxDocs=44218)
                0.078125 = fieldNorm(doc=4484)
          0.043996602 = weight(abstract_txt:semantic in 4484) [ClassicSimilarity], result of:
            0.043996602 = score(doc=4484,freq=1.0), product of:
              0.12586412 = queryWeight, product of:
                1.6165391 = boost
                4.4743214 = idf(docFreq=1369, maxDocs=44218)
                0.017401574 = queryNorm
              0.34955636 = fieldWeight in 4484, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4743214 = idf(docFreq=1369, maxDocs=44218)
                0.078125 = fieldNorm(doc=4484)
          0.04478681 = weight(abstract_txt:method in 4484) [ClassicSimilarity], result of:
            0.04478681 = score(doc=4484,freq=1.0), product of:
              0.1273667 = queryWeight, product of:
                1.6261598 = boost
                4.50095 = idf(docFreq=1333, maxDocs=44218)
                0.017401574 = queryNorm
              0.3516367 = fieldWeight in 4484, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.50095 = idf(docFreq=1333, maxDocs=44218)
                0.078125 = fieldNorm(doc=4484)
          0.20865443 = weight(abstract_txt:target in 4484) [ClassicSimilarity], result of:
            0.20865443 = score(doc=4484,freq=1.0), product of:
              0.40669733 = queryWeight, product of:
                3.5589063 = boost
                6.5669885 = idf(docFreq=168, maxDocs=44218)
                0.017401574 = queryNorm
              0.51304597 = fieldWeight in 4484, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.5669885 = idf(docFreq=168, maxDocs=44218)
                0.078125 = fieldNorm(doc=4484)
          0.20865443 = weight(abstract_txt:similarities in 4484) [ClassicSimilarity], result of:
            0.20865443 = score(doc=4484,freq=1.0), product of:
              0.40669733 = queryWeight, product of:
                3.5589063 = boost
                6.5669885 = idf(docFreq=168, maxDocs=44218)
                0.017401574 = queryNorm
              0.51304597 = fieldWeight in 4484, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.5669885 = idf(docFreq=168, maxDocs=44218)
                0.078125 = fieldNorm(doc=4484)
        0.28 = coord(7/25)
    
  5. Dominich, S.; Kiezer, T.: A measure theoretic approach to information retrieval (2007) 0.16
    0.16286245 = sum of:
      0.16286245 = product of:
        0.6785936 = sum of:
          0.052159198 = weight(abstract_txt:product in 445) [ClassicSimilarity], result of:
            0.052159198 = score(doc=445,freq=2.0), product of:
              0.112656884 = queryWeight, product of:
                1.0814317 = boost
                5.98646 = idf(docFreq=301, maxDocs=44218)
                0.017401574 = queryNorm
              0.46299165 = fieldWeight in 445, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.98646 = idf(docFreq=301, maxDocs=44218)
                0.0546875 = fieldNorm(doc=445)
          0.014283598 = weight(abstract_txt:between in 445) [ClassicSimilarity], result of:
            0.014283598 = score(doc=445,freq=1.0), product of:
              0.075413465 = queryWeight, product of:
                1.251295 = boost
                3.4633842 = idf(docFreq=3764, maxDocs=44218)
                0.017401574 = queryNorm
              0.18940382 = fieldWeight in 445, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4633842 = idf(docFreq=3764, maxDocs=44218)
                0.0546875 = fieldNorm(doc=445)
          0.058874983 = weight(abstract_txt:latent in 445) [ClassicSimilarity], result of:
            0.058874983 = score(doc=445,freq=1.0), product of:
              0.15387486 = queryWeight, product of:
                1.2638749 = boost
                6.996407 = idf(docFreq=109, maxDocs=44218)
                0.017401574 = queryNorm
              0.382616 = fieldWeight in 445, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.996407 = idf(docFreq=109, maxDocs=44218)
                0.0546875 = fieldNorm(doc=445)
          0.14415224 = weight(abstract_txt:inner in 445) [ClassicSimilarity], result of:
            0.14415224 = score(doc=445,freq=2.0), product of:
              0.22186294 = queryWeight, product of:
                1.5176185 = boost
                8.401051 = idf(docFreq=26, maxDocs=44218)
                0.017401574 = queryNorm
              0.64973557 = fieldWeight in 445, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.401051 = idf(docFreq=26, maxDocs=44218)
                0.0546875 = fieldNorm(doc=445)
          0.030797621 = weight(abstract_txt:semantic in 445) [ClassicSimilarity], result of:
            0.030797621 = score(doc=445,freq=1.0), product of:
              0.12586412 = queryWeight, product of:
                1.6165391 = boost
                4.4743214 = idf(docFreq=1369, maxDocs=44218)
                0.017401574 = queryNorm
              0.24468945 = fieldWeight in 445, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.4743214 = idf(docFreq=1369, maxDocs=44218)
                0.0546875 = fieldNorm(doc=445)
          0.3783259 = weight(abstract_txt:vector in 445) [ClassicSimilarity], result of:
            0.3783259 = score(doc=445,freq=7.0), product of:
              0.40098888 = queryWeight, product of:
                3.5338414 = boost
                6.5207376 = idf(docFreq=176, maxDocs=44218)
                0.017401574 = queryNorm
              0.94348234 = fieldWeight in 445, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                6.5207376 = idf(docFreq=176, maxDocs=44218)
                0.0546875 = fieldNorm(doc=445)
        0.24 = coord(6/25)
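
The score trees above all follow the same arithmetic, which the sketch below reproduces for two values from the first entry. It is an illustration only: the function names are hypothetical, and the idf formula, 1 + ln(maxDocs / (docFreq + 1)), is inferred from the numbers in the listing, which it matches. The listing's values are single-precision floats, so the last digits can differ slightly from this double-precision calculation.

```python
import math

def classic_idf(doc_freq, max_docs):
    # Inferred from the listing: docFreq=109, maxDocs=44218 gives about 6.996407.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def term_score(freq, doc_freq, max_docs, boost, query_norm, field_norm):
    """Score of one query term in one document, as laid out in the trees."""
    idf = classic_idf(doc_freq, max_docs)
    query_weight = boost * idf * query_norm            # "queryWeight, product of"
    field_weight = math.sqrt(freq) * idf * field_norm  # "fieldWeight, product of"
    return query_weight * field_weight

def document_score(term_scores, matched, total):
    """Sum of the term scores scaled by the coord factor matched/total."""
    return sum(term_scores) * matched / total

# Term "latent" in doc 3834; compare with 0.118945435 in the listing.
latent = term_score(freq=2.0, doc_freq=109, max_docs=44218,
                    boost=1.2638749, query_norm=0.017401574,
                    field_norm=0.078125)
print(latent)

# Entry 1: seven matching terms out of 25 query terms, coord(7/25) = 0.28;
# compare with 0.22589323 in the listing.
entry1 = document_score(
    [0.020405142, 0.118945435, 0.087993205, 0.06333811,
     0.10314837, 0.20427682, 0.20865443],
    matched=7, total=25)
print(entry1)
```

Because idf enters both the query weight and the field weight, rare terms such as "inner" (docFreq=26) contribute far more per occurrence than common ones such as "between" (docFreq=3764).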