Document (#34454)

Author
Kang, I.-S.
Na, S.-H.
Lee, S.
Jung, H.
Kim, P.
Sung, W.-K.
Lee, J.-H.
Title
On co-authorship for author disambiguation
Source
Information processing and management. 45(2009) no.1, pp. 84-97
Year
2009
Abstract
Author name disambiguation clusters records bearing the same author name into groups that each correspond to a distinct individual. To attack the problem, many studies have employed a variety of disambiguation features, such as coauthors, titles of papers or publications, topics of articles, and emails or affiliations. Among these, co-authorship is the most easily accessible and influential, since the inter-person acquaintances represented by co-authorship can discriminate the identities of authors more clearly than other features. This study explores the net effects of co-authorship on author clustering in bibliographic data. First, to handle the shortage of explicit coauthors listed in known citations, a web-assisted technique for acquiring implicit coauthors of the target author to be disambiguated is proposed. Then, a coauthor disambiguation hypothesis, that the identity of an author can be determined by his/her coauthors, is examined and confirmed through a variety of author disambiguation experiments.
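
The coauthor disambiguation hypothesis in the abstract can be illustrated with a minimal sketch. This is not the authors' method: it simply assumes that two same-name records belong to one person whenever they share a coauthor, directly or through a chain of records, and merges them with a union-find structure. The function name `cluster_by_coauthors`, the record ids, and all names in the toy data are hypothetical.

```python
from collections import defaultdict

def cluster_by_coauthors(records):
    """Group record ids that share at least one coauthor, transitively.

    records maps a record id to the set of coauthor names on that record.
    """
    parent = {rid: rid for rid in records}

    def find(x):
        # Follow parent links to the cluster representative (path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    # Link each record to the first record seen with the same coauthor.
    first_seen = {}
    for rid, coauthors in records.items():
        for name in coauthors:
            if name in first_seen:
                union(first_seen[name], rid)
            else:
                first_seen[name] = rid

    clusters = defaultdict(set)
    for rid in records:
        clusters[find(rid)].add(rid)
    return sorted(sorted(c) for c in clusters.values())

# Hypothetical toy data: four papers that all carry the same ambiguous author name.
papers = {
    "p1": {"A. Kim", "B. Park"},
    "p2": {"B. Park", "C. Choi"},
    "p3": {"D. Han"},
    "p4": {"D. Han", "E. Yoon"},
}
clusters = cluster_by_coauthors(papers)
print(clusters)
```

Under this assumption, p1 and p2 are merged through the shared coauthor "B. Park", and p3 and p4 through "D. Han". A record with no coauthors would stay in a cluster of its own, which is the shortage of explicit coauthors that the abstract's web-assisted technique is meant to address.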

Similar documents (author)

  1. Jung, R.: Die Reform der alphabetischen Katalogisierung in Deutschland 1908-1976 : eine annotierte Auswahlbibliographie (1976) 2.11
    2.1108723 = sum of:
      2.1108723 = product of:
        4.2217445 = sum of:
          4.2217445 = weight(author_txt:jung in 5323) [ClassicSimilarity], result of:
            4.2217445 = score(doc=5323,freq=1.0), product of:
              0.7457406 = queryWeight, product of:
                1.0579855 = boost
                9.05783 = idf(docFreq=13, maxDocs=44218)
                0.07781869 = queryNorm
              5.661144 = fieldWeight in 5323, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.05783 = idf(docFreq=13, maxDocs=44218)
                0.625 = fieldNorm(doc=5323)
        0.5 = coord(1/2)
    
  2. Jung, V.: Wissen, das produktiv wird : Mit Wissensmanagement zum lernenden Unternehmen (2000) 2.11
    2.1108723 = sum of:
      2.1108723 = product of:
        4.2217445 = sum of:
          4.2217445 = weight(author_txt:jung in 5057) [ClassicSimilarity], result of:
            4.2217445 = score(doc=5057,freq=1.0), product of:
              0.7457406 = queryWeight, product of:
                1.0579855 = boost
                9.05783 = idf(docFreq=13, maxDocs=44218)
                0.07781869 = queryNorm
              5.661144 = fieldWeight in 5057, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.05783 = idf(docFreq=13, maxDocs=44218)
                0.625 = fieldNorm(doc=5057)
        0.5 = coord(1/2)
    
  3. Jung, R.: Bibliographie der Festschriften und Festschriftenbeiträge zum Buch und Bibliothekswesen : Deutschland, Österreich, Schweiz 1976-2000 (2002) 2.11
    2.1108723 = sum of:
      2.1108723 = product of:
        4.2217445 = sum of:
          4.2217445 = weight(author_txt:jung in 1089) [ClassicSimilarity], result of:
            4.2217445 = score(doc=1089,freq=1.0), product of:
              0.7457406 = queryWeight, product of:
                1.0579855 = boost
                9.05783 = idf(docFreq=13, maxDocs=44218)
                0.07781869 = queryNorm
              5.661144 = fieldWeight in 1089, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.05783 = idf(docFreq=13, maxDocs=44218)
                0.625 = fieldNorm(doc=1089)
        0.5 = coord(1/2)
    
  4. Jung, R.: Methodik und Didaktik einer Einführung in die RAK nach vorausgegangenem Unterricht der Titelaufnahme nach den "Preußischen Instruktionen" (1976) 2.11
    2.1108723 = sum of:
      2.1108723 = product of:
        4.2217445 = sum of:
          4.2217445 = weight(author_txt:jung in 1803) [ClassicSimilarity], result of:
            4.2217445 = score(doc=1803,freq=1.0), product of:
              0.7457406 = queryWeight, product of:
                1.0579855 = boost
                9.05783 = idf(docFreq=13, maxDocs=44218)
                0.07781869 = queryNorm
              5.661144 = fieldWeight in 1803, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.05783 = idf(docFreq=13, maxDocs=44218)
                0.625 = fieldNorm(doc=1803)
        0.5 = coord(1/2)
    
  5. Jung, J.J.: Contextualized query sampling to discover semantic resource descriptions on the web (2009) 2.11
    2.1108723 = sum of:
      2.1108723 = product of:
        4.2217445 = sum of:
          4.2217445 = weight(author_txt:jung in 4216) [ClassicSimilarity], result of:
            4.2217445 = score(doc=4216,freq=1.0), product of:
              0.7457406 = queryWeight, product of:
                1.0579855 = boost
                9.05783 = idf(docFreq=13, maxDocs=44218)
                0.07781869 = queryNorm
              5.661144 = fieldWeight in 4216, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.05783 = idf(docFreq=13, maxDocs=44218)
                0.625 = fieldNorm(doc=4216)
        0.5 = coord(1/2)
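
The score traces above all follow the same arithmetic, which can be reproduced with a short sketch. The formulas used here, idf = 1 + ln(maxDocs / (docFreq + 1)) and tf = sqrt(freq), are inferred from the values in the trace and agree with Lucene's ClassicSimilarity as far as I can tell. The function names are hypothetical, and Lucene computes in 32-bit floats, so the last digits may differ slightly.

```python
import math

def idf(doc_freq, max_docs):
    # Inverse document frequency as it appears in the trace.
    return 1.0 + math.log(max_docs / (doc_freq + 1))

def tf(freq):
    # Term frequency factor: square root of the raw count.
    return math.sqrt(freq)

# Values copied from the trace for author_txt:jung.
boost = 1.0579855
query_norm = 0.07781869
field_norm = 0.625
doc_freq, max_docs, freq = 13, 44218, 1.0
coord = 1 / 2  # one of two query clauses matched

term_idf = idf(doc_freq, max_docs)
query_weight = boost * term_idf * query_norm        # about 0.7457
field_weight = tf(freq) * term_idf * field_norm     # about 5.6611
score = coord * query_weight * field_weight         # about 2.1109

print(round(term_idf, 5), round(query_weight, 4), round(field_weight, 4), round(score, 4))
```

All five author matches share these inputs, which is why they receive the identical score of 2.11.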
    

Similar documents (content)

  1. Kim, J.; Diesner, J.: Distortive effects of initial-based name disambiguation on measurements of large-scale coauthorship networks (2016) 0.33
    0.33310962 = sum of:
      0.33310962 = product of:
        1.0409676 = sum of:
          0.061114863 = weight(abstract_txt:identities in 2936) [ClassicSimilarity], result of:
            0.061114863 = score(doc=2936,freq=1.0), product of:
              0.119678676 = queryWeight, product of:
                1.2170691 = boost
                8.1705265 = idf(docFreq=33, maxDocs=44218)
                0.012035149 = queryNorm
              0.5106579 = fieldWeight in 2936, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.1705265 = idf(docFreq=33, maxDocs=44218)
                0.0625 = fieldNorm(doc=2936)
          0.081378035 = weight(abstract_txt:coauthor in 2936) [ClassicSimilarity], result of:
            0.081378035 = score(doc=2936,freq=1.0), product of:
              0.14485173 = queryWeight, product of:
                1.3389634 = boost
                8.988837 = idf(docFreq=14, maxDocs=44218)
                0.012035149 = queryNorm
              0.5618023 = fieldWeight in 2936, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.988837 = idf(docFreq=14, maxDocs=44218)
                0.0625 = fieldNorm(doc=2936)
          0.11508592 = weight(abstract_txt:disambiguated in 2936) [ClassicSimilarity], result of:
            0.11508592 = score(doc=2936,freq=2.0), product of:
              0.14485173 = queryWeight, product of:
                1.3389634 = boost
                8.988837 = idf(docFreq=14, maxDocs=44218)
                0.012035149 = queryNorm
              0.79450846 = fieldWeight in 2936, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.988837 = idf(docFreq=14, maxDocs=44218)
                0.0625 = fieldNorm(doc=2936)
          0.03183321 = weight(abstract_txt:authors in 2936) [ClassicSimilarity], result of:
            0.03183321 = score(doc=2936,freq=2.0), product of:
              0.07747695 = queryWeight, product of:
                1.3848672 = boost
                4.648501 = idf(docFreq=1150, maxDocs=44218)
                0.012035149 = queryNorm
              0.4108733 = fieldWeight in 2936, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.648501 = idf(docFreq=1150, maxDocs=44218)
                0.0625 = fieldNorm(doc=2936)
          0.085037224 = weight(abstract_txt:name in 2936) [ClassicSimilarity], result of:
            0.085037224 = score(doc=2936,freq=4.0), product of:
              0.11838998 = queryWeight, product of:
                1.7119037 = boost
                5.746245 = idf(docFreq=383, maxDocs=44218)
                0.012035149 = queryNorm
              0.7182806 = fieldWeight in 2936, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.746245 = idf(docFreq=383, maxDocs=44218)
                0.0625 = fieldNorm(doc=2936)
          0.053828437 = weight(abstract_txt:clustering in 2936) [ClassicSimilarity], result of:
            0.053828437 = score(doc=2936,freq=1.0), product of:
              0.138549 = queryWeight, product of:
                1.8519257 = boost
                6.2162485 = idf(docFreq=239, maxDocs=44218)
                0.012035149 = queryNorm
              0.38851553 = fieldWeight in 2936, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.2162485 = idf(docFreq=239, maxDocs=44218)
                0.0625 = fieldNorm(doc=2936)
          0.11727209 = weight(abstract_txt:author in 2936) [ClassicSimilarity], result of:
            0.11727209 = score(doc=2936,freq=2.0), product of:
              0.2665359 = queryWeight, product of:
                4.448979 = boost
                4.9778743 = idf(docFreq=827, maxDocs=44218)
                0.012035149 = queryNorm
              0.43998608 = fieldWeight in 2936, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.9778743 = idf(docFreq=827, maxDocs=44218)
                0.0625 = fieldNorm(doc=2936)
          0.49541774 = weight(abstract_txt:disambiguation in 2936) [ClassicSimilarity], result of:
            0.49541774 = score(doc=2936,freq=5.0), product of:
              0.4829475 = queryWeight, product of:
                5.466909 = boost
                7.3401785 = idf(docFreq=77, maxDocs=44218)
                0.012035149 = queryNorm
              1.0258211 = fieldWeight in 2936, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                7.3401785 = idf(docFreq=77, maxDocs=44218)
                0.0625 = fieldNorm(doc=2936)
        0.32 = coord(8/25)
    
  2. Pooja, K.M.; Mondal, S.; Chandra, J.: A graph combination with edge pruning-based approach for author name disambiguation (2020) 0.19
    0.19251987 = sum of:
      0.19251987 = product of:
        0.80216616 = sum of:
          0.049002342 = weight(abstract_txt:handle in 59) [ClassicSimilarity], result of:
            0.049002342 = score(doc=59,freq=2.0), product of:
              0.0819823 = queryWeight, product of:
                1.0073187 = boost
                6.7624135 = idf(docFreq=138, maxDocs=44218)
                0.012035149 = queryNorm
              0.59771854 = fieldWeight in 59, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.7624135 = idf(docFreq=138, maxDocs=44218)
                0.0625 = fieldNorm(doc=59)
          0.03898756 = weight(abstract_txt:authors in 59) [ClassicSimilarity], result of:
            0.03898756 = score(doc=59,freq=3.0), product of:
              0.07747695 = queryWeight, product of:
                1.3848672 = boost
                4.648501 = idf(docFreq=1150, maxDocs=44218)
                0.012035149 = queryNorm
              0.50321496 = fieldWeight in 59, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                4.648501 = idf(docFreq=1150, maxDocs=44218)
                0.0625 = fieldNorm(doc=59)
          0.085037224 = weight(abstract_txt:name in 59) [ClassicSimilarity], result of:
            0.085037224 = score(doc=59,freq=4.0), product of:
              0.11838998 = queryWeight, product of:
                1.7119037 = boost
                5.746245 = idf(docFreq=383, maxDocs=44218)
                0.012035149 = queryNorm
              0.7182806 = fieldWeight in 59, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                5.746245 = idf(docFreq=383, maxDocs=44218)
                0.0625 = fieldNorm(doc=59)
          0.11727209 = weight(abstract_txt:author in 59) [ClassicSimilarity], result of:
            0.11727209 = score(doc=59,freq=2.0), product of:
              0.2665359 = queryWeight, product of:
                4.448979 = boost
                4.9778743 = idf(docFreq=827, maxDocs=44218)
                0.012035149 = queryNorm
              0.43998608 = fieldWeight in 59, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.9778743 = idf(docFreq=827, maxDocs=44218)
                0.0625 = fieldNorm(doc=59)
          0.2903094 = weight(abstract_txt:coauthors in 59) [ClassicSimilarity], result of:
            0.2903094 = score(doc=59,freq=1.0), product of:
              0.53684175 = queryWeight, product of:
                5.155372 = boost
                8.652365 = idf(docFreq=20, maxDocs=44218)
                0.012035149 = queryNorm
              0.5407728 = fieldWeight in 59, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.652365 = idf(docFreq=20, maxDocs=44218)
                0.0625 = fieldNorm(doc=59)
          0.22155756 = weight(abstract_txt:disambiguation in 59) [ClassicSimilarity], result of:
            0.22155756 = score(doc=59,freq=1.0), product of:
              0.4829475 = queryWeight, product of:
                5.466909 = boost
                7.3401785 = idf(docFreq=77, maxDocs=44218)
                0.012035149 = queryNorm
              0.45876116 = fieldWeight in 59, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.3401785 = idf(docFreq=77, maxDocs=44218)
                0.0625 = fieldNorm(doc=59)
        0.24 = coord(6/25)
    
  3. Kim, J.(im); Kim, J.(enna): Effect of forename string on author name disambiguation (2020) 0.19
    0.1898283 = sum of:
      0.1898283 = product of:
        0.9491415 = sum of:
          0.081378035 = weight(abstract_txt:disambiguated in 5930) [ClassicSimilarity], result of:
            0.081378035 = score(doc=5930,freq=1.0), product of:
              0.14485173 = queryWeight, product of:
                1.3389634 = boost
                8.988837 = idf(docFreq=14, maxDocs=44218)
                0.012035149 = queryNorm
              0.5618023 = fieldWeight in 5930, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.988837 = idf(docFreq=14, maxDocs=44218)
                0.0625 = fieldNorm(doc=5930)
          0.02250948 = weight(abstract_txt:authors in 5930) [ClassicSimilarity], result of:
            0.02250948 = score(doc=5930,freq=1.0), product of:
              0.07747695 = queryWeight, product of:
                1.3848672 = boost
                4.648501 = idf(docFreq=1150, maxDocs=44218)
                0.012035149 = queryNorm
              0.2905313 = fieldWeight in 5930, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.648501 = idf(docFreq=1150, maxDocs=44218)
                0.0625 = fieldNorm(doc=5930)
          0.0736444 = weight(abstract_txt:name in 5930) [ClassicSimilarity], result of:
            0.0736444 = score(doc=5930,freq=3.0), product of:
              0.11838998 = queryWeight, product of:
                1.7119037 = boost
                5.746245 = idf(docFreq=383, maxDocs=44218)
                0.012035149 = queryNorm
              0.6220493 = fieldWeight in 5930, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.746245 = idf(docFreq=383, maxDocs=44218)
                0.0625 = fieldNorm(doc=5930)
          0.18542345 = weight(abstract_txt:author in 5930) [ClassicSimilarity], result of:
            0.18542345 = score(doc=5930,freq=5.0), product of:
              0.2665359 = queryWeight, product of:
                4.448979 = boost
                4.9778743 = idf(docFreq=827, maxDocs=44218)
                0.012035149 = queryNorm
              0.69567907 = fieldWeight in 5930, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.9778743 = idf(docFreq=827, maxDocs=44218)
                0.0625 = fieldNorm(doc=5930)
          0.5861862 = weight(abstract_txt:disambiguation in 5930) [ClassicSimilarity], result of:
            0.5861862 = score(doc=5930,freq=7.0), product of:
              0.4829475 = queryWeight, product of:
                5.466909 = boost
                7.3401785 = idf(docFreq=77, maxDocs=44218)
                0.012035149 = queryNorm
              1.2137679 = fieldWeight in 5930, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                7.3401785 = idf(docFreq=77, maxDocs=44218)
                0.0625 = fieldNorm(doc=5930)
        0.2 = coord(5/25)
    
  4. Ferreira, A.A.; Veloso, A.; Gonçalves, M.A.; Laender, A.H.F.: Self-training author name disambiguation for information scarce scenarios (2014) 0.19
    0.18842433 = sum of:
      0.18842433 = product of:
        0.9421217 = sum of:
          0.02250948 = weight(abstract_txt:authors in 1292) [ClassicSimilarity], result of:
            0.02250948 = score(doc=1292,freq=1.0), product of:
              0.07747695 = queryWeight, product of:
                1.3848672 = boost
                4.648501 = idf(docFreq=1150, maxDocs=44218)
                0.012035149 = queryNorm
              0.2905313 = fieldWeight in 1292, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.648501 = idf(docFreq=1150, maxDocs=44218)
                0.0625 = fieldNorm(doc=1292)
          0.0601304 = weight(abstract_txt:name in 1292) [ClassicSimilarity], result of:
            0.0601304 = score(doc=1292,freq=2.0), product of:
              0.11838998 = queryWeight, product of:
                1.7119037 = boost
                5.746245 = idf(docFreq=383, maxDocs=44218)
                0.012035149 = queryNorm
              0.5079011 = fieldWeight in 1292, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                5.746245 = idf(docFreq=383, maxDocs=44218)
                0.0625 = fieldNorm(doc=1292)
          0.18542345 = weight(abstract_txt:author in 1292) [ClassicSimilarity], result of:
            0.18542345 = score(doc=1292,freq=5.0), product of:
              0.2665359 = queryWeight, product of:
                4.448979 = boost
                4.9778743 = idf(docFreq=827, maxDocs=44218)
                0.012035149 = queryNorm
              0.69567907 = fieldWeight in 1292, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                4.9778743 = idf(docFreq=827, maxDocs=44218)
                0.0625 = fieldNorm(doc=1292)
          0.2903094 = weight(abstract_txt:coauthors in 1292) [ClassicSimilarity], result of:
            0.2903094 = score(doc=1292,freq=1.0), product of:
              0.53684175 = queryWeight, product of:
                5.155372 = boost
                8.652365 = idf(docFreq=20, maxDocs=44218)
                0.012035149 = queryNorm
              0.5407728 = fieldWeight in 1292, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.652365 = idf(docFreq=20, maxDocs=44218)
                0.0625 = fieldNorm(doc=1292)
          0.38374895 = weight(abstract_txt:disambiguation in 1292) [ClassicSimilarity], result of:
            0.38374895 = score(doc=1292,freq=3.0), product of:
              0.4829475 = queryWeight, product of:
                5.466909 = boost
                7.3401785 = idf(docFreq=77, maxDocs=44218)
                0.012035149 = queryNorm
              0.7945976 = fieldWeight in 1292, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                7.3401785 = idf(docFreq=77, maxDocs=44218)
                0.0625 = fieldNorm(doc=1292)
        0.2 = coord(5/25)
    
  5. Zhang, C.; Bu, Y.; Ding, Y.; Xu, J.: Understanding scientific collaboration : homophily, transitivity, and preferential attachment (2018) 0.17
    0.16550021 = sum of:
      0.16550021 = product of:
        0.82750106 = sum of:
          0.10172255 = weight(abstract_txt:coauthor in 4011) [ClassicSimilarity], result of:
            0.10172255 = score(doc=4011,freq=1.0), product of:
              0.14485173 = queryWeight, product of:
                1.3389634 = boost
                8.988837 = idf(docFreq=14, maxDocs=44218)
                0.012035149 = queryNorm
              0.7022529 = fieldWeight in 4011, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.988837 = idf(docFreq=14, maxDocs=44218)
                0.078125 = fieldNorm(doc=4011)
          0.026197555 = weight(abstract_txt:features in 4011) [ClassicSimilarity], result of:
            0.026197555 = score(doc=4011,freq=1.0), product of:
              0.07387475 = queryWeight, product of:
                1.3522902 = boost
                4.5391517 = idf(docFreq=1283, maxDocs=44218)
                0.012035149 = queryNorm
              0.35462123 = fieldWeight in 4011, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.5391517 = idf(docFreq=1283, maxDocs=44218)
                0.078125 = fieldNorm(doc=4011)
          0.039791513 = weight(abstract_txt:authors in 4011) [ClassicSimilarity], result of:
            0.039791513 = score(doc=4011,freq=2.0), product of:
              0.07747695 = queryWeight, product of:
                1.3848672 = boost
                4.648501 = idf(docFreq=1150, maxDocs=44218)
                0.012035149 = queryNorm
              0.51359165 = fieldWeight in 4011, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.648501 = idf(docFreq=1150, maxDocs=44218)
                0.078125 = fieldNorm(doc=4011)
          0.14659011 = weight(abstract_txt:author in 4011) [ClassicSimilarity], result of:
            0.14659011 = score(doc=4011,freq=2.0), product of:
              0.2665359 = queryWeight, product of:
                4.448979 = boost
                4.9778743 = idf(docFreq=827, maxDocs=44218)
                0.012035149 = queryNorm
              0.5499826 = fieldWeight in 4011, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.9778743 = idf(docFreq=827, maxDocs=44218)
                0.078125 = fieldNorm(doc=4011)
          0.5131993 = weight(abstract_txt:coauthors in 4011) [ClassicSimilarity], result of:
            0.5131993 = score(doc=4011,freq=2.0), product of:
              0.53684175 = queryWeight, product of:
                5.155372 = boost
                8.652365 = idf(docFreq=20, maxDocs=44218)
                0.012035149 = queryNorm
              0.9559602 = fieldWeight in 4011, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.652365 = idf(docFreq=20, maxDocs=44218)
                0.078125 = fieldNorm(doc=4011)
        0.2 = coord(5/25)
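
For the content-based matches, the final score is the sum of the per-term weights multiplied by the coord factor (matched terms divided by query terms). The sketch below recomputes the first result (doc 2936) from the freq, docFreq, and boost values in its trace. It assumes the same idf and tf formulas the trace values imply, 1 + ln(maxDocs / (docFreq + 1)) and sqrt(freq), and the function names are hypothetical.

```python
import math

MAX_DOCS = 44218
QUERY_NORM = 0.012035149

def term_weight(freq, doc_freq, boost, field_norm):
    # queryWeight * fieldWeight for one matching term.
    idf = 1.0 + math.log(MAX_DOCS / (doc_freq + 1))
    query_weight = boost * idf * QUERY_NORM
    field_weight = math.sqrt(freq) * idf * field_norm
    return query_weight * field_weight

def document_score(matches, field_norm, query_terms):
    # Sum the term weights, then scale by the fraction of query terms matched.
    total = sum(term_weight(f, df, b, field_norm) for f, df, b in matches)
    coord = len(matches) / query_terms
    return total, coord * total

# (freq, docFreq, boost) per matched abstract_txt term, copied from the trace for doc 2936.
doc_2936 = [
    (1.0, 33, 1.2170691),    # identities
    (1.0, 14, 1.3389634),    # coauthor
    (2.0, 14, 1.3389634),    # disambiguated
    (2.0, 1150, 1.3848672),  # authors
    (4.0, 383, 1.7119037),   # name
    (1.0, 239, 1.8519257),   # clustering
    (2.0, 827, 4.448979),    # author
    (5.0, 77, 5.466909),     # disambiguation
]

total, score = document_score(doc_2936, field_norm=0.0625, query_terms=25)
print(round(total, 4), round(score, 4))
```

The recomputed values should land close to the 1.0409676 sum and 0.33310962 score shown in the trace. The term "disambiguation" contributes almost half of the sum because it combines the highest boost with a high idf and five occurrences.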