Document (#11950)

Author
Srinivasan, P.
Title
On generalizing the Two-Poisson Model
Source
Journal of the American Society for Information Science. 41(1990) no.1, pp. 61-66
Year
1990
Abstract
Automatic indexing is one of the important functions of a modern document retrieval system. Numerous techniques for this function have been proposed in the literature, ranging from purely statistical to linguistically complex mechanisms. Most result from examining properties of terms. Examines term distribution within the framework of the Poisson models. Specifically examines the effectiveness of the Two-Poisson and the Three-Poisson model to see if generalisation results in increased effectiveness. The results show that the Two-Poisson model is only moderately effective in identifying index terms. In addition, generalisation to the Three-Poisson does not give any additional power. The only Poisson model which consistently works well is the basic One-Poisson model. Also discusses term distribution information.
Theme
Automatic indexing

Similar documents (author)

  1. Srinivasan, P.: Expert interface to Library of Congress Subject Headings (1990/91) 5.41
    5.4077277 = sum of:
      5.4077277 = weight(author_txt:srinivasan in 2209) [ClassicSimilarity], result of:
        5.4077277 = score(doc=2209,freq=1.0), product of:
          0.99999994 = queryWeight, product of:
            8.652365 = idf(docFreq=20, maxDocs=44218)
            0.115575336 = queryNorm
          5.407728 = fieldWeight in 2209, product of:
            1.0 = tf(freq=1.0), with freq of:
              1.0 = termFreq=1.0
            8.652365 = idf(docFreq=20, maxDocs=44218)
            0.625 = fieldNorm(doc=2209)
    
  2. Srinivasan, P.: Query expansion and MEDLINE (1996) 5.41
    5.4077277 = sum of:
      5.4077277 = weight(author_txt:srinivasan in 8453) [ClassicSimilarity], result of:
        5.4077277 = score(doc=8453,freq=1.0), product of:
          0.99999994 = queryWeight, product of:
            8.652365 = idf(docFreq=20, maxDocs=44218)
            0.115575336 = queryNorm
          5.407728 = fieldWeight in 8453, product of:
            1.0 = tf(freq=1.0), with freq of:
              1.0 = termFreq=1.0
            8.652365 = idf(docFreq=20, maxDocs=44218)
            0.625 = fieldNorm(doc=8453)
    
  3. Srinivasan, P.: Intelligent information retrieval using rough set approximations (1989) 5.41
    5.4077277 = sum of:
      5.4077277 = weight(author_txt:srinivasan in 2526) [ClassicSimilarity], result of:
        5.4077277 = score(doc=2526,freq=1.0), product of:
          0.99999994 = queryWeight, product of:
            8.652365 = idf(docFreq=20, maxDocs=44218)
            0.115575336 = queryNorm
          5.407728 = fieldWeight in 2526, product of:
            1.0 = tf(freq=1.0), with freq of:
              1.0 = termFreq=1.0
            8.652365 = idf(docFreq=20, maxDocs=44218)
            0.625 = fieldNorm(doc=2526)
    
  4. Srinivasan, P.: Optimal document-indexing vocabulary for MEDLINE (1996) 5.41
    5.4077277 = sum of:
      5.4077277 = weight(author_txt:srinivasan in 6634) [ClassicSimilarity], result of:
        5.4077277 = score(doc=6634,freq=1.0), product of:
          0.99999994 = queryWeight, product of:
            8.652365 = idf(docFreq=20, maxDocs=44218)
            0.115575336 = queryNorm
          5.407728 = fieldWeight in 6634, product of:
            1.0 = tf(freq=1.0), with freq of:
              1.0 = termFreq=1.0
            8.652365 = idf(docFreq=20, maxDocs=44218)
            0.625 = fieldNorm(doc=6634)
    
  5. Srinivasan, P.: Thesaurus construction (1992) 5.41
    5.4077277 = sum of:
      5.4077277 = weight(author_txt:srinivasan in 3504) [ClassicSimilarity], result of:
        5.4077277 = score(doc=3504,freq=1.0), product of:
          0.99999994 = queryWeight, product of:
            8.652365 = idf(docFreq=20, maxDocs=44218)
            0.115575336 = queryNorm
          5.407728 = fieldWeight in 3504, product of:
            1.0 = tf(freq=1.0), with freq of:
              1.0 = termFreq=1.0
            8.652365 = idf(docFreq=20, maxDocs=44218)
            0.625 = fieldNorm(doc=3504)
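
The five author matches above all carry the same score because each explanation multiplies the same factors. A minimal sketch of that arithmetic follows, assuming the usual ClassicSimilarity definitions idf = 1 + ln(maxDocs / (docFreq + 1)) and tf = sqrt(freq). The function names are illustrative and not part of any Lucene API, and the queryNorm and fieldNorm values are copied from the listing rather than derived.

```python
import math

def classic_idf(doc_freq, max_docs):
    # Inverse document frequency as assumed for ClassicSimilarity.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def classic_tf(freq):
    # Term frequency factor: square root of the raw frequency.
    return math.sqrt(freq)

def term_score(freq, doc_freq, max_docs, query_norm, field_norm, boost=1.0):
    # score = queryWeight * fieldWeight, mirroring the explain tree.
    idf = classic_idf(doc_freq, max_docs)
    query_weight = boost * idf * query_norm
    field_weight = classic_tf(freq) * idf * field_norm
    return query_weight * field_weight

# Values taken from the author_txt:srinivasan explanation.
score = term_score(freq=1.0, doc_freq=20, max_docs=44218,
                   query_norm=0.115575336, field_norm=0.625)
print(round(score, 2))  # 5.41
```

Since termFreq, docFreq, maxDocs, queryNorm and fieldNorm are identical for all five documents, the computed score is identical too, which is why the author list cannot rank them against each other.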
    

Similar documents (content)

  1. Lassalle, E.; Lassalle, E.: Semantic models in information retrieval (2012) 0.16
    0.1603029 = sum of:
      0.1603029 = product of:
        0.6679288 = sum of:
          0.018513355 = weight(abstract_txt:properties in 97) [ClassicSimilarity], result of:
            0.018513355 = score(doc=97,freq=1.0), product of:
              0.050301667 = queryWeight, product of:
                1.0383923 = boost
                5.888745 = idf(docFreq=332, maxDocs=44218)
                0.0082261795 = queryNorm
              0.36804655 = fieldWeight in 97, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.888745 = idf(docFreq=332, maxDocs=44218)
                0.0625 = fieldNorm(doc=97)
          0.0119905025 = weight(abstract_txt:terms in 97) [ClassicSimilarity], result of:
            0.0119905025 = score(doc=97,freq=1.0), product of:
              0.047441732 = queryWeight, product of:
                1.426151 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0082261795 = queryNorm
              0.25274166 = fieldWeight in 97, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0625 = fieldNorm(doc=97)
          0.013769122 = weight(abstract_txt:only in 97) [ClassicSimilarity], result of:
            0.013769122 = score(doc=97,freq=1.0), product of:
              0.052024323 = queryWeight, product of:
                1.4934424 = boost
                4.234672 = idf(docFreq=1740, maxDocs=44218)
                0.0082261795 = queryNorm
              0.264667 = fieldWeight in 97, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.234672 = idf(docFreq=1740, maxDocs=44218)
                0.0625 = fieldNorm(doc=97)
          0.07087221 = weight(abstract_txt:generalizing in 97) [ClassicSimilarity], result of:
            0.07087221 = score(doc=97,freq=1.0), product of:
              0.12309571 = queryWeight, product of:
                1.6243954 = boost
                9.211981 = idf(docFreq=11, maxDocs=44218)
                0.0082261795 = queryNorm
              0.5757488 = fieldWeight in 97, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.211981 = idf(docFreq=11, maxDocs=44218)
                0.0625 = fieldNorm(doc=97)
          0.057425562 = weight(abstract_txt:model in 97) [ClassicSimilarity], result of:
            0.057425562 = score(doc=97,freq=4.0), product of:
              0.11524775 = queryWeight, product of:
                3.5145643 = boost
                3.986234 = idf(docFreq=2231, maxDocs=44218)
                0.0082261795 = queryNorm
              0.49827924 = fieldWeight in 97, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.986234 = idf(docFreq=2231, maxDocs=44218)
                0.0625 = fieldNorm(doc=97)
          0.49535805 = weight(abstract_txt:poisson in 97) [ClassicSimilarity], result of:
            0.49535805 = score(doc=97,freq=1.0), product of:
              0.89998466 = queryWeight, product of:
                12.423181 = boost
                8.806516 = idf(docFreq=17, maxDocs=44218)
                0.0082261795 = queryNorm
              0.55040723 = fieldWeight in 97, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.806516 = idf(docFreq=17, maxDocs=44218)
                0.0625 = fieldNorm(doc=97)
        0.24 = coord(6/25)
    
  2. Kuperman, V.: Productivity in the Internet mailing lists : a bibliometric analysis (2006) 0.16
    0.15986867 = sum of:
      0.15986867 = product of:
        0.79934335 = sum of:
          0.007657572 = weight(abstract_txt:results in 4907) [ClassicSimilarity], result of:
            0.007657572 = score(doc=4907,freq=1.0), product of:
              0.03518274 = queryWeight, product of:
                1.2281463 = boost
                3.482422 = idf(docFreq=3693, maxDocs=44218)
                0.0082261795 = queryNorm
              0.21765138 = fieldWeight in 4907, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.482422 = idf(docFreq=3693, maxDocs=44218)
                0.0625 = fieldNorm(doc=4907)
          0.017481213 = weight(abstract_txt:examines in 4907) [ClassicSimilarity], result of:
            0.017481213 = score(doc=4907,freq=1.0), product of:
              0.060998153 = queryWeight, product of:
                1.617125 = boost
                4.5853753 = idf(docFreq=1225, maxDocs=44218)
                0.0082261795 = queryNorm
              0.28658596 = fieldWeight in 4907, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5853753 = idf(docFreq=1225, maxDocs=44218)
                0.0625 = fieldNorm(doc=4907)
          0.04494975 = weight(abstract_txt:distribution in 4907) [ClassicSimilarity], result of:
            0.04494975 = score(doc=4907,freq=2.0), product of:
              0.0908679 = queryWeight, product of:
                1.9737425 = boost
                5.596568 = idf(docFreq=445, maxDocs=44218)
                0.0082261795 = queryNorm
              0.4946714 = fieldWeight in 4907, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.596568 = idf(docFreq=445, maxDocs=44218)
                0.0625 = fieldNorm(doc=4907)
          0.028712781 = weight(abstract_txt:model in 4907) [ClassicSimilarity], result of:
            0.028712781 = score(doc=4907,freq=1.0), product of:
              0.11524775 = queryWeight, product of:
                3.5145643 = boost
                3.986234 = idf(docFreq=2231, maxDocs=44218)
                0.0082261795 = queryNorm
              0.24913962 = fieldWeight in 4907, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.986234 = idf(docFreq=2231, maxDocs=44218)
                0.0625 = fieldNorm(doc=4907)
          0.70054203 = weight(abstract_txt:poisson in 4907) [ClassicSimilarity], result of:
            0.70054203 = score(doc=4907,freq=2.0), product of:
              0.89998466 = queryWeight, product of:
                12.423181 = boost
                8.806516 = idf(docFreq=17, maxDocs=44218)
                0.0082261795 = queryNorm
              0.7783933 = fieldWeight in 4907, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.806516 = idf(docFreq=17, maxDocs=44218)
                0.0625 = fieldNorm(doc=4907)
        0.2 = coord(5/25)
    
  3. Kim, W.; Wilbur, W.J.: Corpus-based statistical screening for content-bearing terms (2001) 0.15
    0.15115586 = sum of:
      0.15115586 = product of:
        0.6298161 = sum of:
          0.017985754 = weight(abstract_txt:terms in 5188) [ClassicSimilarity], result of:
            0.017985754 = score(doc=5188,freq=4.0), product of:
              0.047441732 = queryWeight, product of:
                1.426151 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0082261795 = queryNorm
              0.37911248 = fieldWeight in 5188, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.046875 = fieldNorm(doc=5188)
          0.010326841 = weight(abstract_txt:only in 5188) [ClassicSimilarity], result of:
            0.010326841 = score(doc=5188,freq=1.0), product of:
              0.052024323 = queryWeight, product of:
                1.4934424 = boost
                4.234672 = idf(docFreq=1740, maxDocs=44218)
                0.0082261795 = queryNorm
              0.19850025 = fieldWeight in 5188, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.234672 = idf(docFreq=1740, maxDocs=44218)
                0.046875 = fieldNorm(doc=5188)
          0.02619006 = weight(abstract_txt:three in 5188) [ClassicSimilarity], result of:
            0.02619006 = score(doc=5188,freq=5.0), product of:
              0.05657993 = queryWeight, product of:
                1.5574584 = boost
                4.41619 = idf(docFreq=1451, maxDocs=44218)
                0.0082261795 = queryNorm
              0.46288604 = fieldWeight in 5188, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.41619 = idf(docFreq=1451, maxDocs=44218)
                0.046875 = fieldNorm(doc=5188)
          0.026068706 = weight(abstract_txt:term in 5188) [ClassicSimilarity], result of:
            0.026068706 = score(doc=5188,freq=3.0), product of:
              0.06687555 = queryWeight, product of:
                1.6932416 = boost
                4.8012047 = idf(docFreq=987, maxDocs=44218)
                0.0082261795 = queryNorm
              0.38980925 = fieldWeight in 5188, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.8012047 = idf(docFreq=987, maxDocs=44218)
                0.046875 = fieldNorm(doc=5188)
          0.023838203 = weight(abstract_txt:distribution in 5188) [ClassicSimilarity], result of:
            0.023838203 = score(doc=5188,freq=1.0), product of:
              0.0908679 = queryWeight, product of:
                1.9737425 = boost
                5.596568 = idf(docFreq=445, maxDocs=44218)
                0.0082261795 = queryNorm
              0.26233912 = fieldWeight in 5188, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.596568 = idf(docFreq=445, maxDocs=44218)
                0.046875 = fieldNorm(doc=5188)
          0.52540654 = weight(abstract_txt:poisson in 5188) [ClassicSimilarity], result of:
            0.52540654 = score(doc=5188,freq=2.0), product of:
              0.89998466 = queryWeight, product of:
                12.423181 = boost
                8.806516 = idf(docFreq=17, maxDocs=44218)
                0.0082261795 = queryNorm
              0.583795 = fieldWeight in 5188, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.806516 = idf(docFreq=17, maxDocs=44218)
                0.046875 = fieldNorm(doc=5188)
        0.24 = coord(6/25)
    
  4. Huber, J.C.: A new model that generated Lotka's law (2002) 0.12
    0.118498206 = sum of:
      0.118498206 = product of:
        0.7406138 = sum of:
          0.019520916 = weight(abstract_txt:three in 248) [ClassicSimilarity], result of:
            0.019520916 = score(doc=248,freq=1.0), product of:
              0.05657993 = queryWeight, product of:
                1.5574584 = boost
                4.41619 = idf(docFreq=1451, maxDocs=44218)
                0.0082261795 = queryNorm
              0.34501487 = fieldWeight in 248, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.41619 = idf(docFreq=1451, maxDocs=44218)
                0.078125 = fieldNorm(doc=248)
          0.039730344 = weight(abstract_txt:distribution in 248) [ClassicSimilarity], result of:
            0.039730344 = score(doc=248,freq=1.0), product of:
              0.0908679 = queryWeight, product of:
                1.9737425 = boost
                5.596568 = idf(docFreq=445, maxDocs=44218)
                0.0082261795 = queryNorm
              0.4372319 = fieldWeight in 248, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.596568 = idf(docFreq=445, maxDocs=44218)
                0.078125 = fieldNorm(doc=248)
          0.062164992 = weight(abstract_txt:model in 248) [ClassicSimilarity], result of:
            0.062164992 = score(doc=248,freq=3.0), product of:
              0.11524775 = queryWeight, product of:
                3.5145643 = boost
                3.986234 = idf(docFreq=2231, maxDocs=44218)
                0.0082261795 = queryNorm
              0.5394031 = fieldWeight in 248, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.986234 = idf(docFreq=2231, maxDocs=44218)
                0.078125 = fieldNorm(doc=248)
          0.61919755 = weight(abstract_txt:poisson in 248) [ClassicSimilarity], result of:
            0.61919755 = score(doc=248,freq=1.0), product of:
              0.89998466 = queryWeight, product of:
                12.423181 = boost
                8.806516 = idf(docFreq=17, maxDocs=44218)
                0.0082261795 = queryNorm
              0.688009 = fieldWeight in 248, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.806516 = idf(docFreq=17, maxDocs=44218)
                0.078125 = fieldNorm(doc=248)
        0.16 = coord(4/25)
    
  5. Lee, C.; Lee, G.G.: Probabilistic information retrieval model for a dependence structured indexing system (2005) 0.12
    0.11526691 = sum of:
      0.11526691 = product of:
        0.7204182 = sum of:
          0.014988128 = weight(abstract_txt:terms in 1004) [ClassicSimilarity], result of:
            0.014988128 = score(doc=1004,freq=1.0), product of:
              0.047441732 = queryWeight, product of:
                1.426151 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0082261795 = queryNorm
              0.3159271 = fieldWeight in 1004, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.078125 = fieldNorm(doc=1004)
          0.035475016 = weight(abstract_txt:term in 1004) [ClassicSimilarity], result of:
            0.035475016 = score(doc=1004,freq=2.0), product of:
              0.06687555 = queryWeight, product of:
                1.6932416 = boost
                4.8012047 = idf(docFreq=987, maxDocs=44218)
                0.0082261795 = queryNorm
              0.53046316 = fieldWeight in 1004, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.8012047 = idf(docFreq=987, maxDocs=44218)
                0.078125 = fieldNorm(doc=1004)
          0.0507575 = weight(abstract_txt:model in 1004) [ClassicSimilarity], result of:
            0.0507575 = score(doc=1004,freq=2.0), product of:
              0.11524775 = queryWeight, product of:
                3.5145643 = boost
                3.986234 = idf(docFreq=2231, maxDocs=44218)
                0.0082261795 = queryNorm
              0.44042078 = fieldWeight in 1004, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.986234 = idf(docFreq=2231, maxDocs=44218)
                0.078125 = fieldNorm(doc=1004)
          0.61919755 = weight(abstract_txt:poisson in 1004) [ClassicSimilarity], result of:
            0.61919755 = score(doc=1004,freq=1.0), product of:
              0.89998466 = queryWeight, product of:
                12.423181 = boost
                8.806516 = idf(docFreq=17, maxDocs=44218)
                0.0082261795 = queryNorm
              0.688009 = fieldWeight in 1004, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.806516 = idf(docFreq=17, maxDocs=44218)
                0.078125 = fieldNorm(doc=1004)
        0.16 = coord(4/25)
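
The content-based scores above add two ingredients to the single-term case: a per-term boost inside queryWeight, and a coord factor that scales the summed term scores by the fraction of query terms matched. The sketch below recomputes the first hit (doc 97) under the same assumed ClassicSimilarity formulas, idf = 1 + ln(maxDocs / (docFreq + 1)) and tf = sqrt(freq). The boosts, docFreq values, queryNorm and fieldNorm are copied from the listing; the variable and function names are illustrative only.

```python
import math

MAX_DOCS = 44218
QUERY_NORM = 0.0082261795
FIELD_NORM = 0.0625      # fieldNorm(doc=97) from the listing
QUERY_TERMS = 25         # denominator of coord(6/25)

def term_score(freq, doc_freq, boost):
    # queryWeight = boost * idf * queryNorm; fieldWeight = tf * idf * fieldNorm
    idf = 1.0 + math.log(MAX_DOCS / (doc_freq + 1))
    query_weight = boost * idf * QUERY_NORM
    field_weight = math.sqrt(freq) * idf * FIELD_NORM
    return query_weight * field_weight

# (term, boost, docFreq, termFreq in doc 97), as shown in the explain tree.
matches = [
    ("properties",   1.0383923,  332, 1.0),
    ("terms",        1.426151,  2106, 1.0),
    ("only",         1.4934424, 1740, 1.0),
    ("generalizing", 1.6243954,   11, 1.0),
    ("model",        3.5145643, 2231, 4.0),
    ("poisson",     12.423181,    17, 1.0),
]

total = sum(term_score(freq, df, boost) for _, boost, df, freq in matches)
coord = len(matches) / QUERY_TERMS
final = total * coord
print(round(final, 2))  # 0.16
```

The "poisson" term contributes most of the sum because it combines a high boost with a high idf (only 17 documents contain it), which is consistent with every content match above being dominated by its poisson weight.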