Document (#43496)

Author
Roy, D.
Bhatia, S.
Jain, P.
Title
Information asymmetry in Wikipedia across different languages : a statistical analysis
Source
Journal of the Association for Information Science and Technology. 73(2022) no.3, pp. 347-361
Year
2022
Abstract
Wikipedia is the largest web-based open encyclopedia covering more than 300 languages. Different language editions of Wikipedia differ significantly in terms of their information coverage. In this article, we compare the information coverage in English Wikipedia (most exhaustive) and Wikipedias in 8 other widely spoken languages, namely Arabic, German, Hindi, Korean, Portuguese, Russian, Spanish, and Turkish. We analyze variations in different language editions of Wikipedia in terms of the number of topics covered as well as the amount of information discussed about different topics. Further, as a step towards bridging the information gap, we present WikiCompare, a browser plugin that allows Wikipedia readers to have a comprehensive overview of topics by incorporating missing information from Wikipedia pages in other languages.
Content
Cf.: https://asistdl.onlinelibrary.wiley.com/doi/10.1002/asi.24553.
Theme
Multilingual problems
Object
Wikipedia

Similar documents (author)

  1. Jain, H.C.: Colon Classification : a review article (1964) 5.87
    5.871439 = sum of:
      5.871439 = weight(author_txt:jain in 1952) [ClassicSimilarity], result of:
        5.871439 = fieldWeight in 1952, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.394302 = idf(docFreq=9, maxDocs=44218)
          0.625 = fieldNorm(doc=1952)
    
  2. Jain, A.K.: Image data compression : a review (1981) 5.87
    5.871439 = sum of:
      5.871439 = weight(author_txt:jain in 8696) [ClassicSimilarity], result of:
        5.871439 = fieldWeight in 8696, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.394302 = idf(docFreq=9, maxDocs=44218)
          0.625 = fieldNorm(doc=8696)
    
  3. Jain, R.: Visual information retrieval in digital libraries (1997) 5.87
    5.871439 = sum of:
      5.871439 = weight(author_txt:jain in 760) [ClassicSimilarity], result of:
        5.871439 = fieldWeight in 760, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.394302 = idf(docFreq=9, maxDocs=44218)
          0.625 = fieldNorm(doc=760)
    
  4. Jain, P.: An empirical study of knowledge management in academic libraries in East and Southern Africa (2007) 5.87
    5.871439 = sum of:
      5.871439 = weight(author_txt:jain in 864) [ClassicSimilarity], result of:
        5.871439 = fieldWeight in 864, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.394302 = idf(docFreq=9, maxDocs=44218)
          0.625 = fieldNorm(doc=864)
    
  5. Das, A.; Jain, A.: Indexing the World Wide Web : the journey so far (2012) 4.70
    4.697151 = sum of:
      4.697151 = weight(author_txt:jain in 95) [ClassicSimilarity], result of:
        4.697151 = fieldWeight in 95, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.394302 = idf(docFreq=9, maxDocs=44218)
          0.5 = fieldNorm(doc=95)
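
The author-match scores above can be reproduced by hand. The following is a minimal sketch, assuming the standard ClassicSimilarity definitions tf = sqrt(freq) and idf = 1 + ln(maxDocs / (docFreq + 1)); the function names are hypothetical helpers, not part of any library, and the fieldNorm values are simply copied from the listing rather than derived.

```python
import math


def classic_tf(freq: float) -> float:
    # Term-frequency factor: square root of the raw frequency.
    return math.sqrt(freq)


def classic_idf(doc_freq: int, max_docs: int) -> float:
    # Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1)).
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    # fieldWeight = tf * idf * fieldNorm, as in the explain output.
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


# Values taken from the listing: author term "jain", docFreq=9, maxDocs=44218.
weight_entries_1_to_4 = field_weight(1.0, 9, 44218, 0.625)  # listed as 5.871439
weight_entry_5 = field_weight(1.0, 9, 44218, 0.5)           # listed as 4.697151

print(weight_entries_1_to_4, weight_entry_5)
```

Entry 5 scores lower only because its fieldNorm is 0.5 instead of 0.625, which is consistent with a longer author field (two authors rather than one).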
    

Similar documents (content)

  1. Callahan, E.S.; Herring, S.C.: Cultural bias in Wikipedia content on famous persons (2011) 0.24
    0.2378172 = sum of:
      0.2378172 = product of:
        0.990905 = sum of:
          0.05980659 = weight(abstract_txt:encyclopedia in 4764) [ClassicSimilarity], result of:
            0.05980659 = score(doc=4764,freq=1.0), product of:
              0.11119206 = queryWeight, product of:
                1.0069661 = boost
                6.8847027 = idf(docFreq=122, maxDocs=44218)
                0.016038869 = queryNorm
              0.5378674 = fieldWeight in 4764, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8847027 = idf(docFreq=122, maxDocs=44218)
                0.078125 = fieldNorm(doc=4764)
          0.056873377 = weight(abstract_txt:language in 4764) [ClassicSimilarity], result of:
            0.056873377 = score(doc=4764,freq=2.0), product of:
              0.12308663 = queryWeight, product of:
                1.8350338 = boost
                4.1820874 = idf(docFreq=1834, maxDocs=44218)
                0.016038869 = queryNorm
              0.46205974 = fieldWeight in 4764, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.1820874 = idf(docFreq=1834, maxDocs=44218)
                0.078125 = fieldNorm(doc=4764)
          0.1296386 = weight(abstract_txt:editions in 4764) [ClassicSimilarity], result of:
            0.1296386 = score(doc=4764,freq=1.0), product of:
              0.23464285 = queryWeight, product of:
                2.0686958 = boost
                7.071914 = idf(docFreq=101, maxDocs=44218)
                0.016038869 = queryNorm
              0.5524933 = fieldWeight in 4764, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.071914 = idf(docFreq=101, maxDocs=44218)
                0.078125 = fieldNorm(doc=4764)
          0.036103867 = weight(abstract_txt:different in 4764) [ClassicSimilarity], result of:
            0.036103867 = score(doc=4764,freq=1.0), product of:
              0.12607537 = queryWeight, product of:
                2.1444855 = boost
                3.6655018 = idf(docFreq=3075, maxDocs=44218)
                0.016038869 = queryNorm
              0.28636733 = fieldWeight in 4764, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6655018 = idf(docFreq=3075, maxDocs=44218)
                0.078125 = fieldNorm(doc=4764)
          0.07677932 = weight(abstract_txt:languages in 4764) [ClassicSimilarity], result of:
            0.07677932 = score(doc=4764,freq=1.0), product of:
              0.18942809 = queryWeight, product of:
                2.276464 = boost
                5.188118 = idf(docFreq=670, maxDocs=44218)
                0.016038869 = queryNorm
              0.40532172 = fieldWeight in 4764, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.188118 = idf(docFreq=670, maxDocs=44218)
                0.078125 = fieldNorm(doc=4764)
          0.63170326 = weight(abstract_txt:wikipedia in 4764) [ClassicSimilarity], result of:
            0.63170326 = score(doc=4764,freq=4.0), product of:
              0.6450537 = queryWeight, product of:
                6.416895 = boost
                6.2675414 = idf(docFreq=227, maxDocs=44218)
                0.016038869 = queryNorm
              0.97930336 = fieldWeight in 4764, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                6.2675414 = idf(docFreq=227, maxDocs=44218)
                0.078125 = fieldNorm(doc=4764)
        0.24 = coord(6/25)
    
  2. Hara, N.; Shachaf, P.; Hew, K.F.: Cross-cultural analysis of the Wikipedia community (2010) 0.22
    0.2203869 = sum of:
      0.2203869 = product of:
        1.1019344 = sum of:
          0.01809422 = weight(abstract_txt:other in 4001) [ClassicSimilarity], result of:
            0.01809422 = score(doc=4001,freq=2.0), product of:
              0.058148835 = queryWeight, product of:
                1.0298251 = boost
                3.5204957 = idf(docFreq=3555, maxDocs=44218)
                0.016038869 = queryNorm
              0.3111708 = fieldWeight in 4001, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5204957 = idf(docFreq=3555, maxDocs=44218)
                0.0625 = fieldNorm(doc=4001)
          0.17775545 = weight(abstract_txt:wikipedias in 4001) [ClassicSimilarity], result of:
            0.17775545 = score(doc=4001,freq=2.0), product of:
              0.21169946 = queryWeight, product of:
                1.3894337 = boost
                9.499662 = idf(docFreq=8, maxDocs=44218)
                0.016038869 = queryNorm
              0.83965945 = fieldWeight in 4001, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                9.499662 = idf(docFreq=8, maxDocs=44218)
                0.0625 = fieldNorm(doc=4001)
          0.028883094 = weight(abstract_txt:different in 4001) [ClassicSimilarity], result of:
            0.028883094 = score(doc=4001,freq=1.0), product of:
              0.12607537 = queryWeight, product of:
                2.1444855 = boost
                3.6655018 = idf(docFreq=3075, maxDocs=44218)
                0.016038869 = queryNorm
              0.22909386 = fieldWeight in 4001, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6655018 = idf(docFreq=3075, maxDocs=44218)
                0.0625 = fieldNorm(doc=4001)
          0.16251118 = weight(abstract_txt:languages in 4001) [ClassicSimilarity], result of:
            0.16251118 = score(doc=4001,freq=7.0), product of:
              0.18942809 = queryWeight, product of:
                2.276464 = boost
                5.188118 = idf(docFreq=670, maxDocs=44218)
                0.016038869 = queryNorm
              0.8579044 = fieldWeight in 4001, product of:
                2.6457512 = tf(freq=7.0), with freq of:
                  7.0 = termFreq=7.0
                5.188118 = idf(docFreq=670, maxDocs=44218)
                0.0625 = fieldNorm(doc=4001)
          0.71469057 = weight(abstract_txt:wikipedia in 4001) [ClassicSimilarity], result of:
            0.71469057 = score(doc=4001,freq=8.0), product of:
              0.6450537 = queryWeight, product of:
                6.416895 = boost
                6.2675414 = idf(docFreq=227, maxDocs=44218)
                0.016038869 = queryNorm
              1.1079552 = fieldWeight in 4001, product of:
                2.828427 = tf(freq=8.0), with freq of:
                  8.0 = termFreq=8.0
                6.2675414 = idf(docFreq=227, maxDocs=44218)
                0.0625 = fieldNorm(doc=4001)
        0.2 = coord(5/25)
    
  3. Zhao, D.; Strotmann, A.: Mapping knowledge domains on Wikipedia : an author bibliographic coupling analysis of traditional Chinese medicine (2022) 0.20
    0.20269075 = sum of:
      0.20269075 = product of:
        0.8445448 = sum of:
          0.009595909 = weight(abstract_txt:other in 608) [ClassicSimilarity], result of:
            0.009595909 = score(doc=608,freq=1.0), product of:
              0.058148835 = queryWeight, product of:
                1.0298251 = boost
                3.5204957 = idf(docFreq=3555, maxDocs=44218)
                0.016038869 = queryNorm
              0.16502324 = fieldWeight in 608, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.5204957 = idf(docFreq=3555, maxDocs=44218)
                0.046875 = fieldNorm(doc=608)
          0.041575354 = weight(abstract_txt:missing in 608) [ClassicSimilarity], result of:
            0.041575354 = score(doc=608,freq=1.0), product of:
              0.12265848 = queryWeight, product of:
                1.057613 = boost
                7.230979 = idf(docFreq=86, maxDocs=44218)
                0.016038869 = queryNorm
              0.33895212 = fieldWeight in 608, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.230979 = idf(docFreq=86, maxDocs=44218)
                0.046875 = fieldNorm(doc=608)
          0.013239345 = weight(abstract_txt:information in 608) [ClassicSimilarity], result of:
            0.013239345 = score(doc=608,freq=2.0), product of:
              0.082494505 = queryWeight, product of:
                2.1245458 = boost
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.016038869 = queryNorm
              0.16048759 = fieldWeight in 608, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.046875 = fieldNorm(doc=608)
          0.021662321 = weight(abstract_txt:different in 608) [ClassicSimilarity], result of:
            0.021662321 = score(doc=608,freq=1.0), product of:
              0.12607537 = queryWeight, product of:
                2.1444855 = boost
                3.6655018 = idf(docFreq=3075, maxDocs=44218)
                0.016038869 = queryNorm
              0.1718204 = fieldWeight in 608, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6655018 = idf(docFreq=3075, maxDocs=44218)
                0.046875 = fieldNorm(doc=608)
          0.0751804 = weight(abstract_txt:topics in 608) [ClassicSimilarity], result of:
            0.0751804 = score(doc=608,freq=3.0), product of:
              0.1820581 = queryWeight, product of:
                2.23174 = boost
                5.086191 = idf(docFreq=742, maxDocs=44218)
                0.016038869 = queryNorm
              0.41294727 = fieldWeight in 608, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.086191 = idf(docFreq=742, maxDocs=44218)
                0.046875 = fieldNorm(doc=608)
          0.6832915 = weight(abstract_txt:wikipedia in 608) [ClassicSimilarity], result of:
            0.6832915 = score(doc=608,freq=13.0), product of:
              0.6450537 = queryWeight, product of:
                6.416895 = boost
                6.2675414 = idf(docFreq=227, maxDocs=44218)
                0.016038869 = queryNorm
              1.0592785 = fieldWeight in 608, product of:
                3.6055512 = tf(freq=13.0), with freq of:
                  13.0 = termFreq=13.0
                6.2675414 = idf(docFreq=227, maxDocs=44218)
                0.046875 = fieldNorm(doc=608)
        0.24 = coord(6/25)
    
  4. Fallis, D.: Toward an epistemology of Wikipedia (2008) 0.18
    0.17971818 = sum of:
      0.17971818 = product of:
        0.89859086 = sum of:
          0.041864607 = weight(abstract_txt:encyclopedia in 2010) [ClassicSimilarity], result of:
            0.041864607 = score(doc=2010,freq=1.0), product of:
              0.11119206 = queryWeight, product of:
                1.0069661 = boost
                6.8847027 = idf(docFreq=122, maxDocs=44218)
                0.016038869 = queryNorm
              0.37650716 = fieldWeight in 2010, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.8847027 = idf(docFreq=122, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2010)
          0.01583244 = weight(abstract_txt:other in 2010) [ClassicSimilarity], result of:
            0.01583244 = score(doc=2010,freq=2.0), product of:
              0.058148835 = queryWeight, product of:
                1.0298251 = boost
                3.5204957 = idf(docFreq=3555, maxDocs=44218)
                0.016038869 = queryNorm
              0.27227443 = fieldWeight in 2010, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5204957 = idf(docFreq=3555, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2010)
          0.016967269 = weight(abstract_txt:terms in 2010) [ClassicSimilarity], result of:
            0.016967269 = score(doc=2010,freq=1.0), product of:
              0.07672326 = queryWeight, product of:
                1.1829231 = boost
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.016038869 = queryNorm
              0.22114895 = fieldWeight in 2010, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                4.0438666 = idf(docFreq=2106, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2010)
          0.026753085 = weight(abstract_txt:information in 2010) [ClassicSimilarity], result of:
            0.026753085 = score(doc=2010,freq=6.0), product of:
              0.082494505 = queryWeight, product of:
                2.1245458 = boost
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.016038869 = queryNorm
              0.32430142 = fieldWeight in 2010, product of:
                2.4494898 = tf(freq=6.0), with freq of:
                  6.0 = termFreq=6.0
                2.4209464 = idf(docFreq=10677, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2010)
          0.79717344 = weight(abstract_txt:wikipedia in 2010) [ClassicSimilarity], result of:
            0.79717344 = score(doc=2010,freq=13.0), product of:
              0.6450537 = queryWeight, product of:
                6.416895 = boost
                6.2675414 = idf(docFreq=227, maxDocs=44218)
                0.016038869 = queryNorm
              1.235825 = fieldWeight in 2010, product of:
                3.6055512 = tf(freq=13.0), with freq of:
                  13.0 = termFreq=13.0
                6.2675414 = idf(docFreq=227, maxDocs=44218)
                0.0546875 = fieldNorm(doc=2010)
        0.2 = coord(5/25)
    
  5. Ménard, E.; Khashman, N.; Kochkina, S.; Torres-Moreno, J.-M.; Velazquez-Morales, P.; Zhou, F.; Jourlin, P.; Rawat, P.; Peinl, P.; Linhares Pontes, E.; Brunetti, I.: A second life for TIIARA : from bilingual to multilingual! (2016) 0.16
    0.15972938 = sum of:
      0.15972938 = product of:
        0.57046205 = sum of:
          0.049447004 = weight(abstract_txt:spanish in 2834) [ClassicSimilarity], result of:
            0.049447004 = score(doc=2834,freq=1.0), product of:
              0.11366003 = queryWeight, product of:
                1.0180799 = boost
                6.9606886 = idf(docFreq=113, maxDocs=44218)
                0.016038869 = queryNorm
              0.43504304 = fieldWeight in 2834, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.9606886 = idf(docFreq=113, maxDocs=44218)
                0.0625 = fieldNorm(doc=2834)
          0.0635958 = weight(abstract_txt:arabic in 2834) [ClassicSimilarity], result of:
            0.0635958 = score(doc=2834,freq=1.0), product of:
              0.13442089 = queryWeight, product of:
                1.1071625 = boost
                7.5697527 = idf(docFreq=61, maxDocs=44218)
                0.016038869 = queryNorm
              0.47310954 = fieldWeight in 2834, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.5697527 = idf(docFreq=61, maxDocs=44218)
                0.0625 = fieldNorm(doc=2834)
          0.07141956 = weight(abstract_txt:russian in 2834) [ClassicSimilarity], result of:
            0.07141956 = score(doc=2834,freq=1.0), product of:
              0.14523096 = queryWeight, product of:
                1.1508205 = boost
                7.8682456 = idf(docFreq=45, maxDocs=44218)
                0.016038869 = queryNorm
              0.49176535 = fieldWeight in 2834, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.8682456 = idf(docFreq=45, maxDocs=44218)
                0.0625 = fieldNorm(doc=2834)
          0.089343995 = weight(abstract_txt:portuguese in 2834) [ClassicSimilarity], result of:
            0.089343995 = score(doc=2834,freq=1.0), product of:
              0.16861312 = queryWeight, product of:
                1.2400056 = boost
                8.478011 = idf(docFreq=24, maxDocs=44218)
                0.016038869 = queryNorm
              0.5298757 = fieldWeight in 2834, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.478011 = idf(docFreq=24, maxDocs=44218)
                0.0625 = fieldNorm(doc=2834)
          0.13042556 = weight(abstract_txt:hindi in 2834) [ClassicSimilarity], result of:
            0.13042556 = score(doc=2834,freq=1.0), product of:
              0.2169816 = queryWeight, product of:
                1.4066609 = boost
                9.617446 = idf(docFreq=7, maxDocs=44218)
                0.016038869 = queryNorm
              0.6010904 = fieldWeight in 2834, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                9.617446 = idf(docFreq=7, maxDocs=44218)
                0.0625 = fieldNorm(doc=2834)
          0.028883094 = weight(abstract_txt:different in 2834) [ClassicSimilarity], result of:
            0.028883094 = score(doc=2834,freq=1.0), product of:
              0.12607537 = queryWeight, product of:
                2.1444855 = boost
                3.6655018 = idf(docFreq=3075, maxDocs=44218)
                0.016038869 = queryNorm
              0.22909386 = fieldWeight in 2834, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.6655018 = idf(docFreq=3075, maxDocs=44218)
                0.0625 = fieldNorm(doc=2834)
          0.13734703 = weight(abstract_txt:languages in 2834) [ClassicSimilarity], result of:
            0.13734703 = score(doc=2834,freq=5.0), product of:
              0.18942809 = queryWeight, product of:
                2.276464 = boost
                5.188118 = idf(docFreq=670, maxDocs=44218)
                0.016038869 = queryNorm
              0.72506154 = fieldWeight in 2834, product of:
                2.236068 = tf(freq=5.0), with freq of:
                  5.0 = termFreq=5.0
                5.188118 = idf(docFreq=670, maxDocs=44218)
                0.0625 = fieldNorm(doc=2834)
        0.28 = coord(7/25)
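
The content-match scores combine the same per-term factors with a query weight and a coordination factor. The sketch below recomputes the total for the first entry (doc 4764) under the assumption that each term score is queryWeight * fieldWeight, with queryWeight = boost * idf * queryNorm, and that the final score is the sum of term scores multiplied by coord (matched terms / query terms). The boost, queryNorm, and fieldNorm values are copied from the listing; the helper names are hypothetical.

```python
import math

MAX_DOCS = 44218
QUERY_NORM = 0.016038869   # copied from the listing
FIELD_NORM = 0.078125      # fieldNorm(doc=4764), copied from the listing


def classic_idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    # idf = 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(boost: float, doc_freq: int, freq: float) -> float:
    idf = classic_idf(doc_freq)
    query_weight = boost * idf * QUERY_NORM
    field_weight = math.sqrt(freq) * idf * FIELD_NORM
    return query_weight * field_weight


# (term, boost, docFreq, termFreq) for the six matched terms of doc 4764.
matched_terms = [
    ("encyclopedia", 1.0069661, 122, 1.0),
    ("language", 1.8350338, 1834, 2.0),
    ("editions", 2.0686958, 101, 1.0),
    ("different", 2.1444855, 3075, 1.0),
    ("languages", 2.276464, 670, 1.0),
    ("wikipedia", 6.416895, 227, 4.0),
]

term_sum = sum(term_score(boost, df, freq) for _, boost, df, freq in matched_terms)
coord = len(matched_terms) / 25   # 6 of the 25 query terms matched
total_score = term_sum * coord    # listed as 0.2378172

print(term_sum, coord, total_score)
```

The term "wikipedia" dominates the sum because it carries the largest boost, while coord scales the score down for documents that match only a few of the 25 query terms.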