Document (#31308)

Author
Ng, K.B.
Kantor, P.B.
Strzalkowski, T.
Wacholder, N.
Tang, R.
Bai, B.
Rittman, R.
Song, P.
Sun, Y.
Title
Automated judgment of document qualities
Source
Journal of the American Society for Information Science and Technology. 57(2006) no.9, p.1155-1164
Year
2006
Abstract
The authors report on a series of experiments to automate the assessment of document qualities such as depth and objectivity. The primary purpose is to develop a quality-sensitive functionality, orthogonal to relevance, to select documents for an interactive question-answering system. The study consisted of two stages. In the classifier construction stage, nine document qualities deemed important by information professionals were identified and classifiers were developed to predict their values. In the confirmative evaluation stage, the performance of the developed methods was checked using a different document collection. The quality prediction methods worked well in the second stage. The results strongly suggest that the best way to predict document qualities automatically is to construct classifiers on a person-by-person basis.

Similar documents (author)

  1. Kelly, D.; Wacholder, N.; Rittman, R.; Sun, Y.; Kantor, P.; Small, S.; Strzalkowski, T.: Using interview data to identify evaluation criteria for interactive, analytical question-answering systems (2007) 1.96
    1.9583391 = sum of:
      1.9583391 = product of:
        3.2638984 = sum of:
          0.81715876 = weight(author_txt:kantor in 332) [ClassicSimilarity], result of:
            0.81715876 = score(doc=332,freq=1.0), product of:
              0.40005195 = queryWeight, product of:
                1.0234504 = boost
                8.1705265 = idf(docFreq=33, maxDocs=44218)
                0.047840927 = queryNorm
              2.0426316 = fieldWeight in 332, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.1705265 = idf(docFreq=33, maxDocs=44218)
                0.25 = fieldNorm(doc=332)
          1.2046586 = weight(author_txt:strzalkowski in 332) [ClassicSimilarity], result of:
            1.2046586 = score(doc=332,freq=1.0), product of:
              0.5181889 = queryWeight, product of:
                1.1648034 = boost
                9.298992 = idf(docFreq=10, maxDocs=44218)
                0.047840927 = queryNorm
              2.324748 = fieldWeight in 332, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.298992 = idf(docFreq=10, maxDocs=44218)
                0.25 = fieldNorm(doc=332)
          1.242081 = weight(author_txt:wacholder in 332) [ClassicSimilarity], result of:
            1.242081 = score(doc=332,freq=1.0), product of:
              0.5288657 = queryWeight, product of:
                1.1767421 = boost
                9.394302 = idf(docFreq=9, maxDocs=44218)
                0.047840927 = queryNorm
              2.3485756 = fieldWeight in 332, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.394302 = idf(docFreq=9, maxDocs=44218)
                0.25 = fieldNorm(doc=332)
        0.6 = coord(3/5)
    
  2. Wacholder, N.; Kelly, D.; Kantor, P.; Rittman, R.; Sun, Y.; Bai, B.; Small, S.; Yamrom, B.; Strzalkowski, T.: A model for quantitative evaluation of an end-to-end question-answering system (2007) 1.71
    1.7135469 = sum of:
      1.7135469 = product of:
        2.8559113 = sum of:
          0.7150139 = weight(author_txt:kantor in 435) [ClassicSimilarity], result of:
            0.7150139 = score(doc=435,freq=1.0), product of:
              0.40005195 = queryWeight, product of:
                1.0234504 = boost
                8.1705265 = idf(docFreq=33, maxDocs=44218)
                0.047840927 = queryNorm
              1.7873027 = fieldWeight in 435, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.1705265 = idf(docFreq=33, maxDocs=44218)
                0.21875 = fieldNorm(doc=435)
          1.0540762 = weight(author_txt:strzalkowski in 435) [ClassicSimilarity], result of:
            1.0540762 = score(doc=435,freq=1.0), product of:
              0.5181889 = queryWeight, product of:
                1.1648034 = boost
                9.298992 = idf(docFreq=10, maxDocs=44218)
                0.047840927 = queryNorm
              2.0341544 = fieldWeight in 435, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.298992 = idf(docFreq=10, maxDocs=44218)
                0.21875 = fieldNorm(doc=435)
          1.086821 = weight(author_txt:wacholder in 435) [ClassicSimilarity], result of:
            1.086821 = score(doc=435,freq=1.0), product of:
              0.5288657 = queryWeight, product of:
                1.1767421 = boost
                9.394302 = idf(docFreq=9, maxDocs=44218)
                0.047840927 = queryNorm
              2.0550036 = fieldWeight in 435, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.394302 = idf(docFreq=9, maxDocs=44218)
                0.21875 = fieldNorm(doc=435)
        0.6 = coord(3/5)
    
  3. Tang, X.; Yang, C.C.; Song, M.: Understanding the evolution of multiple scientific research domains using a content and network approach (2013) 0.91
    0.91471833 = sum of:
      0.91471833 = product of:
        2.2867959 = sum of:
          1.1433979 = weight(author_txt:tang in 744) [ClassicSimilarity], result of:
            1.1433979 = score(doc=744,freq=1.0), product of:
              0.3819292 = queryWeight, product of:
                7.983315 = idf(docFreq=40, maxDocs=44218)
                0.047840927 = queryNorm
              2.9937432 = fieldWeight in 744, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.983315 = idf(docFreq=40, maxDocs=44218)
                0.375 = fieldNorm(doc=744)
          1.1433979 = weight(author_txt:song in 744) [ClassicSimilarity], result of:
            1.1433979 = score(doc=744,freq=1.0), product of:
              0.3819292 = queryWeight, product of:
                7.983315 = idf(docFreq=40, maxDocs=44218)
                0.047840927 = queryNorm
              2.9937432 = fieldWeight in 744, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.983315 = idf(docFreq=40, maxDocs=44218)
                0.375 = fieldNorm(doc=744)
        0.4 = coord(2/5)
    
  4. Wacholder, N.: Interactive query formulation (2011) 0.62
    0.6210405 = sum of:
      0.6210405 = product of:
        3.1052027 = sum of:
          3.1052027 = weight(author_txt:wacholder in 4196) [ClassicSimilarity], result of:
            3.1052027 = score(doc=4196,freq=1.0), product of:
              0.5288657 = queryWeight, product of:
                1.1767421 = boost
                9.394302 = idf(docFreq=9, maxDocs=44218)
                0.047840927 = queryNorm
              5.871439 = fieldWeight in 4196, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.394302 = idf(docFreq=9, maxDocs=44218)
                0.625 = fieldNorm(doc=4196)
        0.2 = coord(1/5)
    
  5. Strzalkowski, T.: Natural language information retrieval (1995) 0.60
    0.6023293 = sum of:
      0.6023293 = product of:
        3.0116465 = sum of:
          3.0116465 = weight(author_txt:strzalkowski in 1914) [ClassicSimilarity], result of:
            3.0116465 = score(doc=1914,freq=1.0), product of:
              0.5181889 = queryWeight, product of:
                1.1648034 = boost
                9.298992 = idf(docFreq=10, maxDocs=44218)
                0.047840927 = queryNorm
              5.81187 = fieldWeight in 1914, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.298992 = idf(docFreq=10, maxDocs=44218)
                0.625 = fieldNorm(doc=1914)
        0.2 = coord(1/5)
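
The score trees in this list can be checked by hand. The following is a minimal Python sketch, not the catalog's own code, that recomputes the 1.96 shown for result 1 (doc=332) from the numbers printed in its explanation. It assumes the structure the listing itself displays: each matching term contributes queryWeight (boost x idf x queryNorm) times fieldWeight (tf x idf x fieldNorm), the contributions are summed, and the sum is scaled by coord (matched terms / query terms). The names `term_score`, `matched`, and `FIELD_NORM` are illustrative only.

```python
QUERY_NORM = 0.047840927  # queryNorm printed in the author explanations


def term_score(boost, idf, field_norm, freq=1.0, query_norm=QUERY_NORM):
    """Contribution of one matching term: queryWeight * fieldWeight."""
    query_weight = boost * idf * query_norm
    # The listing shows tf = 1.0 for freq = 1.0; tf = sqrt(freq) is assumed.
    field_weight = (freq ** 0.5) * idf * field_norm
    return query_weight * field_weight


# (boost, idf) pairs copied from the explanation for doc=332.
matched = {
    "kantor": (1.0234504, 8.1705265),
    "strzalkowski": (1.1648034, 9.298992),
    "wacholder": (1.1767421, 9.394302),
}
FIELD_NORM = 0.25  # fieldNorm(doc=332)

partial = sum(term_score(b, i, FIELD_NORM) for b, i in matched.values())
coord = 3 / 5  # three of the five author query terms matched
total = partial * coord

print(f"sum of term scores: {partial:.4f}")
print(f"final score:        {total:.4f}")
```

The same arithmetic explains the ordering of the list: results 4 and 5 have larger per-term scores (fieldNorm 0.625 for a single-author record) but match only one of five query terms, so coord(1/5) pulls them below the multi-author matches.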
    

Similar documents (content)

  1. Barry, C.L.: Document representations and clues to document relevance (1998) 0.10
    0.098638155 = sum of:
      0.098638155 = product of:
        0.82198465 = sum of:
          0.13688314 = weight(abstract_txt:predict in 2325) [ClassicSimilarity], result of:
            0.13688314 = score(doc=2325,freq=2.0), product of:
              0.22851959 = queryWeight, product of:
                2.134533 = boost
                6.7769065 = idf(docFreq=136, maxDocs=44218)
                0.015797526 = queryNorm
              0.59899956 = fieldWeight in 2325, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.7769065 = idf(docFreq=136, maxDocs=44218)
                0.0625 = fieldNorm(doc=2325)
          0.18448688 = weight(abstract_txt:document in 2325) [ClassicSimilarity], result of:
            0.18448688 = score(doc=2325,freq=9.0), product of:
              0.22921495 = queryWeight, product of:
                3.3801239 = boost
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.015797526 = queryNorm
              0.80486405 = fieldWeight in 2325, product of:
                3.0 = tf(freq=9.0), with freq of:
                  9.0 = termFreq=9.0
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.0625 = fieldNorm(doc=2325)
          0.50061464 = weight(abstract_txt:qualities in 2325) [ClassicSimilarity], result of:
            0.50061464 = score(doc=2325,freq=3.0), product of:
              0.59704274 = queryWeight, product of:
                4.879315 = boost
                7.7456436 = idf(docFreq=51, maxDocs=44218)
                0.015797526 = queryNorm
              0.8384905 = fieldWeight in 2325, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.7456436 = idf(docFreq=51, maxDocs=44218)
                0.0625 = fieldNorm(doc=2325)
        0.12 = coord(3/25)
    
  2. Lykourentzou, I.; Giannoukos, I.; Mpardis, G.; Nikolopoulos, V.; Loumos, V.: Early and dynamic student achievement prediction in e-learning courses using neural networks (2009) 0.10
    0.09519666 = sum of:
      0.09519666 = product of:
        0.4759833 = sum of:
          0.11540034 = weight(abstract_txt:prediction in 2715) [ClassicSimilarity], result of:
            0.11540034 = score(doc=2715,freq=4.0), product of:
              0.12847191 = queryWeight, product of:
                1.1316972 = boost
                7.1860275 = idf(docFreq=90, maxDocs=44218)
                0.015797526 = queryNorm
              0.89825344 = fieldWeight in 2715, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                7.1860275 = idf(docFreq=90, maxDocs=44218)
                0.0625 = fieldNorm(doc=2715)
          0.026626978 = weight(abstract_txt:were in 2715) [ClassicSimilarity], result of:
            0.026626978 = score(doc=2715,freq=3.0), product of:
              0.0670205 = queryWeight, product of:
                1.1559657 = boost
                3.6700637 = idf(docFreq=3061, maxDocs=44218)
                0.015797526 = queryNorm
              0.39729604 = fieldWeight in 2715, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.6700637 = idf(docFreq=3061, maxDocs=44218)
                0.0625 = fieldNorm(doc=2715)
          0.0867146 = weight(abstract_txt:objectivity in 2715) [ClassicSimilarity], result of:
            0.0867146 = score(doc=2715,freq=1.0), product of:
              0.16855888 = queryWeight, product of:
                1.2962893 = boost
                8.231152 = idf(docFreq=31, maxDocs=44218)
                0.015797526 = queryNorm
              0.514447 = fieldWeight in 2715, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.231152 = idf(docFreq=31, maxDocs=44218)
                0.0625 = fieldNorm(doc=2715)
          0.09679099 = weight(abstract_txt:predict in 2715) [ClassicSimilarity], result of:
            0.09679099 = score(doc=2715,freq=1.0), product of:
              0.22851959 = queryWeight, product of:
                2.134533 = boost
                6.7769065 = idf(docFreq=136, maxDocs=44218)
                0.015797526 = queryNorm
              0.42355666 = fieldWeight in 2715, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.7769065 = idf(docFreq=136, maxDocs=44218)
                0.0625 = fieldNorm(doc=2715)
          0.15045038 = weight(abstract_txt:stage in 2715) [ClassicSimilarity], result of:
            0.15045038 = score(doc=2715,freq=2.0), product of:
              0.27860105 = queryWeight, product of:
                2.8865438 = boost
                6.1096387 = idf(docFreq=266, maxDocs=44218)
                0.015797526 = queryNorm
              0.5400209 = fieldWeight in 2715, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.1096387 = idf(docFreq=266, maxDocs=44218)
                0.0625 = fieldNorm(doc=2715)
        0.2 = coord(5/25)
    
  3. Kishida, K.: High-speed rough clustering for very large document collections (2010) 0.09
    0.088491894 = sum of:
      0.088491894 = product of:
        0.44245946 = sum of:
          0.022175048 = weight(abstract_txt:methods in 3463) [ClassicSimilarity], result of:
            0.022175048 = score(doc=3463,freq=1.0), product of:
              0.08556113 = queryWeight, product of:
                1.306109 = boost
                4.146752 = idf(docFreq=1900, maxDocs=44218)
                0.015797526 = queryNorm
              0.259172 = fieldWeight in 3463, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.146752 = idf(docFreq=1900, maxDocs=44218)
                0.0625 = fieldNorm(doc=3463)
          0.022971341 = weight(abstract_txt:developed in 3463) [ClassicSimilarity], result of:
            0.022971341 = score(doc=3463,freq=1.0), product of:
              0.08759736 = queryWeight, product of:
                1.3215593 = boost
                4.195805 = idf(docFreq=1809, maxDocs=44218)
                0.015797526 = queryNorm
              0.26223782 = fieldWeight in 3463, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.195805 = idf(docFreq=1809, maxDocs=44218)
                0.0625 = fieldNorm(doc=3463)
          0.09757615 = weight(abstract_txt:checked in 3463) [ClassicSimilarity], result of:
            0.09757615 = score(doc=3463,freq=1.0), product of:
              0.18235566 = queryWeight, product of:
                1.3482976 = boost
                8.561393 = idf(docFreq=22, maxDocs=44218)
                0.015797526 = queryNorm
              0.53508705 = fieldWeight in 3463, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.561393 = idf(docFreq=22, maxDocs=44218)
                0.0625 = fieldNorm(doc=3463)
          0.21276897 = weight(abstract_txt:stage in 3463) [ClassicSimilarity], result of:
            0.21276897 = score(doc=3463,freq=4.0), product of:
              0.27860105 = queryWeight, product of:
                2.8865438 = boost
                6.1096387 = idf(docFreq=266, maxDocs=44218)
                0.015797526 = queryNorm
              0.76370484 = fieldWeight in 3463, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.1096387 = idf(docFreq=266, maxDocs=44218)
                0.0625 = fieldNorm(doc=3463)
          0.086967945 = weight(abstract_txt:document in 3463) [ClassicSimilarity], result of:
            0.086967945 = score(doc=3463,freq=2.0), product of:
              0.22921495 = queryWeight, product of:
                3.3801239 = boost
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.015797526 = queryNorm
              0.37941656 = fieldWeight in 3463, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.0625 = fieldNorm(doc=3463)
        0.2 = coord(5/25)
    
  4. Tang, R.; Solomon, P.: Use of relevance criteria across stages of document evaluation : on the complementarity of experimental and naturalistic studies (2001) 0.08
    0.084985375 = sum of:
      0.084985375 = product of:
        0.42492688 = sum of:
          0.034833282 = weight(abstract_txt:functionality in 5213) [ClassicSimilarity], result of:
            0.034833282 = score(doc=5213,freq=1.0), product of:
              0.1003108 = queryWeight, product of:
                6.3497796 = idf(docFreq=209, maxDocs=44218)
                0.015797526 = queryNorm
              0.34725356 = fieldWeight in 5213, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.3497796 = idf(docFreq=209, maxDocs=44218)
                0.0546875 = fieldNorm(doc=5213)
          0.023298606 = weight(abstract_txt:were in 5213) [ClassicSimilarity], result of:
            0.023298606 = score(doc=5213,freq=3.0), product of:
              0.0670205 = queryWeight, product of:
                1.1559657 = boost
                3.6700637 = idf(docFreq=3061, maxDocs=44218)
                0.015797526 = queryNorm
              0.34763405 = fieldWeight in 5213, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.6700637 = idf(docFreq=3061, maxDocs=44218)
                0.0546875 = fieldNorm(doc=5213)
          0.027409863 = weight(abstract_txt:quality in 5213) [ClassicSimilarity], result of:
            0.027409863 = score(doc=5213,freq=1.0), product of:
              0.10772074 = queryWeight, product of:
                1.4655168 = boost
                4.6528544 = idf(docFreq=1145, maxDocs=44218)
                0.015797526 = queryNorm
              0.25445297 = fieldWeight in 5213, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.6528544 = idf(docFreq=1145, maxDocs=44218)
                0.0546875 = fieldNorm(doc=5213)
          0.26328817 = weight(abstract_txt:stage in 5213) [ClassicSimilarity], result of:
            0.26328817 = score(doc=5213,freq=8.0), product of:
              0.27860105 = queryWeight, product of:
                2.8865438 = boost
                6.1096387 = idf(docFreq=266, maxDocs=44218)
                0.015797526 = queryNorm
              0.94503653 = fieldWeight in 5213, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                6.1096387 = idf(docFreq=266, maxDocs=44218)
                0.0546875 = fieldNorm(doc=5213)
          0.07609696 = weight(abstract_txt:document in 5213) [ClassicSimilarity], result of:
            0.07609696 = score(doc=5213,freq=2.0), product of:
              0.22921495 = queryWeight, product of:
                3.3801239 = boost
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.015797526 = queryNorm
              0.3319895 = fieldWeight in 5213, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.0546875 = fieldNorm(doc=5213)
        0.2 = coord(5/25)
    
  5. Gauch, S.; Chandramouli, A.; Ranganathan, S.: Training a hierarchical classifier using inter document relationships (2009) 0.08
    0.083725736 = sum of:
      0.083725736 = product of:
        0.52328587 = sum of:
          0.1044259 = weight(abstract_txt:classifier in 2697) [ClassicSimilarity], result of:
            0.1044259 = score(doc=2697,freq=2.0), product of:
              0.13050053 = queryWeight, product of:
                1.1405971 = boost
                7.24254 = idf(docFreq=85, maxDocs=44218)
                0.015797526 = queryNorm
              0.8001952 = fieldWeight in 2697, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                7.24254 = idf(docFreq=85, maxDocs=44218)
                0.078125 = fieldNorm(doc=2697)
          0.055376288 = weight(abstract_txt:quality in 2697) [ClassicSimilarity], result of:
            0.055376288 = score(doc=2697,freq=2.0), product of:
              0.10772074 = queryWeight, product of:
                1.4655168 = boost
                4.6528544 = idf(docFreq=1145, maxDocs=44218)
                0.015797526 = queryNorm
              0.51407266 = fieldWeight in 2697, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.6528544 = idf(docFreq=1145, maxDocs=44218)
                0.078125 = fieldNorm(doc=2697)
          0.28661415 = weight(abstract_txt:classifiers in 2697) [ClassicSimilarity], result of:
            0.28661415 = score(doc=2697,freq=3.0), product of:
              0.28156897 = queryWeight, product of:
                2.3693736 = boost
                7.5225 = idf(docFreq=64, maxDocs=44218)
                0.015797526 = queryNorm
              1.0179181 = fieldWeight in 2697, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.5225 = idf(docFreq=64, maxDocs=44218)
                0.078125 = fieldNorm(doc=2697)
          0.07686953 = weight(abstract_txt:document in 2697) [ClassicSimilarity], result of:
            0.07686953 = score(doc=2697,freq=1.0), product of:
              0.22921495 = queryWeight, product of:
                3.3801239 = boost
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.015797526 = queryNorm
              0.33536002 = fieldWeight in 2697, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.078125 = fieldNorm(doc=2697)
        0.16 = coord(4/25)
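
The idf and tf values in the content explanations are not independent inputs; they appear to follow from docFreq, maxDocs, and freq. The Python sketch below is a hedged reconstruction: it assumes idf = 1 + ln(maxDocs / (docFreq + 1)) and tf = sqrt(freq), formulas that reproduce the printed values but are inferred from the listing rather than stated in it. It then rebuilds the 0.10 score of content result 1 (doc=2325). The function names are illustrative.

```python
import math

MAX_DOCS = 44218          # maxDocs printed in every idf line
QUERY_NORM = 0.015797526  # queryNorm printed in the content explanations


def idf(doc_freq, max_docs=MAX_DOCS):
    """Assumed inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def tf(freq):
    """Assumed term-frequency factor: square root of the raw count."""
    return math.sqrt(freq)


def term_score(boost, doc_freq, freq, field_norm):
    """queryWeight * fieldWeight for one matching abstract term."""
    w = idf(doc_freq)
    return (boost * w * QUERY_NORM) * (tf(freq) * w * field_norm)


# (boost, docFreq, freq) copied from the explanation for doc=2325.
terms = [
    (2.134533, 136, 2.0),    # predict
    (3.3801239, 1642, 9.0),  # document
    (4.879315, 51, 3.0),     # qualities
]
FIELD_NORM = 0.0625  # fieldNorm(doc=2325)

# coord(3/25): three of the 25 abstract query terms matched.
total = sum(term_score(b, df, f, FIELD_NORM) for b, df, f in terms) * (3 / 25)

print(f"idf(docFreq=136) = {idf(136):.5f}")
print(f"tf(freq=9)       = {tf(9.0):.1f}")
print(f"final score      = {total:.5f}")
```

Because the catalog evidently computes in single precision, the recomputed figures agree with the listing only to about six significant digits.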