Document (#15669)

Goh, A.
Hui, S.C.
TES: a text extraction system
Microcomputers for information management. 13(1996) no.1, pp. 41-55
With the onset of the information explosion arising from digital libraries and access to a wealth of information through the Internet, the need to efficiently determine the relevance of a document becomes even more urgent. Describes a text extraction system (TES), which retrieves a set of sentences from a document to form an indicative abstract. Such an automated process enables information to be filtered more quickly. Discusses the combination of various text extraction techniques. Compares results with manually produced abstracts.
Automatic abstracting

Similar documents (content)

  1. Goh, A.; Hui, S.C.; Chan, S.K.: A text extraction system for news reports (1996) 0.26
    0.25992537 = sum of:
      0.25992537 = product of:
        0.92830485 = sum of:
          0.09878339 = weight(abstract_txt:abstracts in 6601) [ClassicSimilarity], result of:
            0.09878339 = score(doc=6601,freq=4.0), product of:
              0.13251631 = queryWeight, product of:
                1.0642885 = boost
                5.963546 = idf(docFreq=308, maxDocs=44218)
                0.020878794 = queryNorm
              0.7454432 = fieldWeight in 6601, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.963546 = idf(docFreq=308, maxDocs=44218)
                0.0625 = fieldNorm(doc=6601)
          0.09672491 = weight(abstract_txt:manually in 6601) [ClassicSimilarity], result of:
            0.09672491 = score(doc=6601,freq=2.0), product of:
              0.16463251 = queryWeight, product of:
                1.1862671 = boost
                6.6470313 = idf(docFreq=155, maxDocs=44218)
                0.020878794 = queryNorm
              0.5875201 = fieldWeight in 6601, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.6470313 = idf(docFreq=155, maxDocs=44218)
                0.0625 = fieldNorm(doc=6601)
          0.025261965 = weight(abstract_txt:system in 6601) [ClassicSimilarity], result of:
            0.025261965 = score(doc=6601,freq=2.0), product of:
              0.084750995 = queryWeight, product of:
                1.2036829 = boost
                3.3723085 = idf(docFreq=4123, maxDocs=44218)
                0.020878794 = queryNorm
              0.2980728 = fieldWeight in 6601, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.3723085 = idf(docFreq=4123, maxDocs=44218)
                0.0625 = fieldNorm(doc=6601)
          0.112792544 = weight(abstract_txt:sentences in 6601) [ClassicSimilarity], result of:
            0.112792544 = score(doc=6601,freq=2.0), product of:
              0.1823939 = queryWeight, product of:
                1.2486187 = boost
                6.996407 = idf(docFreq=109, maxDocs=44218)
                0.020878794 = queryNorm
              0.6184009 = fieldWeight in 6601, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.996407 = idf(docFreq=109, maxDocs=44218)
                0.0625 = fieldNorm(doc=6601)
          0.17773576 = weight(abstract_txt:indicative in 6601) [ClassicSimilarity], result of:
            0.17773576 = score(doc=6601,freq=2.0), product of:
              0.24698652 = queryWeight, product of:
                1.4529856 = boost
                8.14154 = idf(docFreq=34, maxDocs=44218)
                0.020878794 = queryNorm
              0.71961725 = fieldWeight in 6601, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.14154 = idf(docFreq=34, maxDocs=44218)
                0.0625 = fieldNorm(doc=6601)
          0.046201058 = weight(abstract_txt:text in 6601) [ClassicSimilarity], result of:
            0.046201058 = score(doc=6601,freq=1.0), product of:
              0.18279953 = queryWeight, product of:
                2.1650746 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.020878794 = queryNorm
              0.25274166 = fieldWeight in 6601, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0625 = fieldNorm(doc=6601)
          0.3708052 = weight(abstract_txt:extraction in 6601) [ClassicSimilarity], result of:
            0.3708052 = score(doc=6601,freq=5.0), product of:
              0.4285298 = queryWeight, product of:
                3.3149412 = boost
                6.1915555 = idf(docFreq=245, maxDocs=44218)
                0.020878794 = queryNorm
              0.8652962 = fieldWeight in 6601, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                6.1915555 = idf(docFreq=245, maxDocs=44218)
                0.0625 = fieldNorm(doc=6601)
        0.28 = coord(7/25)
  2. Barrio, P.; Gravano, L.: Sampling strategies for information extraction over the deep web (2017) 0.14
    0.1371735 = sum of:
      0.1371735 = product of:
        0.6858675 = sum of:
          0.04699294 = weight(abstract_txt:enables in 3412) [ClassicSimilarity], result of:
            0.04699294 = score(doc=3412,freq=1.0), product of:
              0.14012526 = queryWeight, product of:
                1.0944172 = boost
                6.1323667 = idf(docFreq=260, maxDocs=44218)
                0.020878794 = queryNorm
              0.3353638 = fieldWeight in 3412, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.1323667 = idf(docFreq=260, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3412)
          0.024534093 = weight(abstract_txt:information in 3412) [ClassicSimilarity], result of:
            0.024534093 = score(doc=3412,freq=8.0), product of:
              0.065516666 = queryWeight, product of:
                1.2961677 = boost
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.020878794 = queryNorm
              0.37447104 = fieldWeight in 3412, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3412)
          0.072082035 = weight(abstract_txt:document in 3412) [ClassicSimilarity], result of:
            0.072082035 = score(doc=3412,freq=5.0), product of:
              0.13731965 = queryWeight, product of:
                1.5321667 = boost
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.020878794 = queryNorm
              0.5249215 = fieldWeight in 3412, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.2926083 = idf(docFreq=1642, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3412)
          0.106956944 = weight(abstract_txt:text in 3412) [ClassicSimilarity], result of:
            0.106956944 = score(doc=3412,freq=7.0), product of:
              0.18279953 = queryWeight, product of:
                2.1650746 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.020878794 = queryNorm
              0.5851051 = fieldWeight in 3412, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3412)
          0.43530148 = weight(abstract_txt:extraction in 3412) [ClassicSimilarity], result of:
            0.43530148 = score(doc=3412,freq=9.0), product of:
              0.4285298 = queryWeight, product of:
                3.3149412 = boost
                6.1915555 = idf(docFreq=245, maxDocs=44218)
                0.020878794 = queryNorm
              1.0158021 = fieldWeight in 3412, product of:
                3.0 = tf(freq=9.0), with freq of:
                  9.0 = termFreq=9.0
                6.1915555 = idf(docFreq=245, maxDocs=44218)
                0.0546875 = fieldNorm(doc=3412)
        0.2 = coord(5/25)
  3. Reeve, L.H.; Han, H.; Brooks, A.D.: The use of domain-specific concepts in biomedical text summarization (2007) 0.12
    0.12214726 = sum of:
      0.12214726 = product of:
        0.43624023 = sum of:
          0.053706218 = weight(abstract_txt:enables in 955) [ClassicSimilarity], result of:
            0.053706218 = score(doc=955,freq=1.0), product of:
              0.14012526 = queryWeight, product of:
                1.0944172 = boost
                6.1323667 = idf(docFreq=260, maxDocs=44218)
                0.020878794 = queryNorm
              0.38327292 = fieldWeight in 955, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.1323667 = idf(docFreq=260, maxDocs=44218)
                0.0625 = fieldNorm(doc=955)
          0.055605665 = weight(abstract_txt:abstract in 955) [ClassicSimilarity], result of:
            0.055605665 = score(doc=955,freq=1.0), product of:
              0.14341 = queryWeight, product of:
                1.1071702 = boost
                6.203826 = idf(docFreq=242, maxDocs=44218)
                0.020878794 = queryNorm
              0.38773912 = fieldWeight in 955, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.203826 = idf(docFreq=242, maxDocs=44218)
                0.0625 = fieldNorm(doc=955)
          0.06839484 = weight(abstract_txt:manually in 955) [ClassicSimilarity], result of:
            0.06839484 = score(doc=955,freq=1.0), product of:
              0.16463251 = queryWeight, product of:
                1.1862671 = boost
                6.6470313 = idf(docFreq=155, maxDocs=44218)
                0.020878794 = queryNorm
              0.41543946 = fieldWeight in 955, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.6470313 = idf(docFreq=155, maxDocs=44218)
                0.0625 = fieldNorm(doc=955)
          0.025261965 = weight(abstract_txt:system in 955) [ClassicSimilarity], result of:
            0.025261965 = score(doc=955,freq=2.0), product of:
              0.084750995 = queryWeight, product of:
                1.2036829 = boost
                3.3723085 = idf(docFreq=4123, maxDocs=44218)
                0.020878794 = queryNorm
              0.2980728 = fieldWeight in 955, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.3723085 = idf(docFreq=4123, maxDocs=44218)
                0.0625 = fieldNorm(doc=955)
          0.112792544 = weight(abstract_txt:sentences in 955) [ClassicSimilarity], result of:
            0.112792544 = score(doc=955,freq=2.0), product of:
              0.1823939 = queryWeight, product of:
                1.2486187 = boost
                6.996407 = idf(docFreq=109, maxDocs=44218)
                0.020878794 = queryNorm
              0.6184009 = fieldWeight in 955, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.996407 = idf(docFreq=109, maxDocs=44218)
                0.0625 = fieldNorm(doc=955)
          0.017170288 = weight(abstract_txt:information in 955) [ClassicSimilarity], result of:
            0.017170288 = score(doc=955,freq=3.0), product of:
              0.065516666 = queryWeight, product of:
                1.2961677 = boost
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.020878794 = queryNorm
              0.26207513 = fieldWeight in 955, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.0625 = fieldNorm(doc=955)
          0.103308715 = weight(abstract_txt:text in 955) [ClassicSimilarity], result of:
            0.103308715 = score(doc=955,freq=5.0), product of:
              0.18279953 = queryWeight, product of:
                2.1650746 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.020878794 = queryNorm
              0.5651476 = fieldWeight in 955, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0625 = fieldNorm(doc=955)
        0.28 = coord(7/25)
  4. Wang, P.; Hao, T.; Yan, J.; Jin, L.: Large-scale extraction of drug-disease pairs from the medical literature (2017) 0.12
    0.12066178 = sum of:
      0.12066178 = product of:
        0.50275743 = sum of:
          0.049391694 = weight(abstract_txt:abstracts in 3927) [ClassicSimilarity], result of:
            0.049391694 = score(doc=3927,freq=1.0), product of:
              0.13251631 = queryWeight, product of:
                1.0642885 = boost
                5.963546 = idf(docFreq=308, maxDocs=44218)
                0.020878794 = queryNorm
              0.3727216 = fieldWeight in 3927, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.963546 = idf(docFreq=308, maxDocs=44218)
                0.0625 = fieldNorm(doc=3927)
          0.06839484 = weight(abstract_txt:manually in 3927) [ClassicSimilarity], result of:
            0.06839484 = score(doc=3927,freq=1.0), product of:
              0.16463251 = queryWeight, product of:
                1.1862671 = boost
                6.6470313 = idf(docFreq=155, maxDocs=44218)
                0.020878794 = queryNorm
              0.41543946 = fieldWeight in 3927, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.6470313 = idf(docFreq=155, maxDocs=44218)
                0.0625 = fieldNorm(doc=3927)
          0.07520167 = weight(abstract_txt:efficiently in 3927) [ClassicSimilarity], result of:
            0.07520167 = score(doc=3927,freq=1.0), product of:
              0.175382 = queryWeight, product of:
                1.2243828 = boost
                6.8606052 = idf(docFreq=125, maxDocs=44218)
                0.020878794 = queryNorm
              0.42878783 = fieldWeight in 3927, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8606052 = idf(docFreq=125, maxDocs=44218)
                0.0625 = fieldNorm(doc=3927)
          0.00991327 = weight(abstract_txt:information in 3927) [ClassicSimilarity], result of:
            0.00991327 = score(doc=3927,freq=1.0), product of:
              0.065516666 = queryWeight, product of:
                1.2961677 = boost
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.020878794 = queryNorm
              0.15130915 = fieldWeight in 3927, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.0625 = fieldNorm(doc=3927)
          0.065338165 = weight(abstract_txt:text in 3927) [ClassicSimilarity], result of:
            0.065338165 = score(doc=3927,freq=2.0), product of:
              0.18279953 = queryWeight, product of:
                2.1650746 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.020878794 = queryNorm
              0.3574307 = fieldWeight in 3927, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0625 = fieldNorm(doc=3927)
          0.2345178 = weight(abstract_txt:extraction in 3927) [ClassicSimilarity], result of:
            0.2345178 = score(doc=3927,freq=2.0), product of:
              0.4285298 = queryWeight, product of:
                3.3149412 = boost
                6.1915555 = idf(docFreq=245, maxDocs=44218)
                0.020878794 = queryNorm
              0.54726136 = fieldWeight in 3927, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.1915555 = idf(docFreq=245, maxDocs=44218)
                0.0625 = fieldNorm(doc=3927)
        0.24 = coord(6/25)
  5. Zhou, G.D.; Zhang, M.: Extracting relation information from text documents by exploring various types of knowledge (2007) 0.12
    0.11763886 = sum of:
      0.11763886 = product of:
        0.5881943 = sum of:
          0.053706218 = weight(abstract_txt:enables in 927) [ClassicSimilarity], result of:
            0.053706218 = score(doc=927,freq=1.0), product of:
              0.14012526 = queryWeight, product of:
                1.0944172 = boost
                6.1323667 = idf(docFreq=260, maxDocs=44218)
                0.020878794 = queryNorm
              0.38327292 = fieldWeight in 927, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.1323667 = idf(docFreq=260, maxDocs=44218)
                0.0625 = fieldNorm(doc=927)
          0.025261965 = weight(abstract_txt:system in 927) [ClassicSimilarity], result of:
            0.025261965 = score(doc=927,freq=2.0), product of:
              0.084750995 = queryWeight, product of:
                1.2036829 = boost
                3.3723085 = idf(docFreq=4123, maxDocs=44218)
                0.020878794 = queryNorm
              0.2980728 = fieldWeight in 927, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.3723085 = idf(docFreq=4123, maxDocs=44218)
                0.0625 = fieldNorm(doc=927)
          0.024282455 = weight(abstract_txt:information in 927) [ClassicSimilarity], result of:
            0.024282455 = score(doc=927,freq=6.0), product of:
              0.065516666 = queryWeight, product of:
                1.2961677 = boost
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.020878794 = queryNorm
              0.3706302 = fieldWeight in 927, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.0625 = fieldNorm(doc=927)
          0.046201058 = weight(abstract_txt:text in 927) [ClassicSimilarity], result of:
            0.046201058 = score(doc=927,freq=1.0), product of:
              0.18279953 = queryWeight, product of:
                2.1650746 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.020878794 = queryNorm
              0.25274166 = fieldWeight in 927, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0625 = fieldNorm(doc=927)
          0.4387426 = weight(abstract_txt:extraction in 927) [ClassicSimilarity], result of:
            0.4387426 = score(doc=927,freq=7.0), product of:
              0.4285298 = queryWeight, product of:
                3.3149412 = boost
                6.1915555 = idf(docFreq=245, maxDocs=44218)
                0.020878794 = queryNorm
              1.0238322 = fieldWeight in 927, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                6.1915555 = idf(docFreq=245, maxDocs=44218)
                0.0625 = fieldNorm(doc=927)
        0.2 = coord(5/25)
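
The score breakdowns above all follow the same arithmetic, which can be sketched in a few lines of Python. This is a minimal sketch inferred from the numbers listed, not the search engine's own code: the helper names `idf`, `term_score` and `document_score` are hypothetical, and the idf formula (1 + ln(maxDocs / (docFreq + 1))) is an assumption that happens to reproduce the idf values shown.

```python
import math


def idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency; assumed form that matches the listed idf values."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(freq: float, doc_freq: int, max_docs: int,
               boost: float, query_norm: float, field_norm: float) -> float:
    """Score of one query term in one document: queryWeight * fieldWeight."""
    tf = math.sqrt(freq)                           # tf(freq) = sqrt(termFreq)
    term_idf = idf(doc_freq, max_docs)
    query_weight = boost * term_idf * query_norm   # "queryWeight, product of"
    field_weight = tf * term_idf * field_norm      # "fieldWeight ..., product of"
    return query_weight * field_weight


def document_score(term_scores: list[float], matched: int, total: int) -> float:
    """Sum of the term scores scaled by coord(matched/total)."""
    return sum(term_scores) * (matched / total)


# Term "abstracts" in doc 6601, using the values listed under result 1.
abstracts = term_score(freq=4.0, doc_freq=308, max_docs=44218,
                       boost=1.0642885, query_norm=0.020878794,
                       field_norm=0.0625)

# The seven term scores listed for doc 6601, combined with coord(7/25).
doc_6601 = document_score(
    [0.09878339, 0.09672491, 0.025261965, 0.112792544,
     0.17773576, 0.046201058, 0.3708052],
    matched=7, total=25,
)
```

Up to floating-point rounding, `abstracts` reproduces the 0.09878339 listed for that term and `doc_6601` reproduces the 0.25992537 total for result 1. The coord factor explains why a document matching few of the 25 query terms ranks low even when one term, such as "extraction", scores highly on its own.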