Document (#37506)

Author
Doran, D.
Gokhale, S.S.
Title
¬A classification framework for web robots
Source
Journal of the American Society for Information Science and Technology. 63(2012) no.12, p.2549-2554
Year
2012
Series
Brief communication
Abstract
The behavior of modern web robots varies widely when they crawl for different purposes. In this article, we present a framework to classify these web robots from two orthogonal perspectives, namely, their functionality and the types of resources they consume. Applying the classification framework to a year-long access log from the UConn SoE web server, we present trends that point to significant differences in their crawling behavior.
Theme
Internet
Data Mining

Similar documents (author)

  1. Doran, K.: Unified disparity : theory and practice of union listing (1996) 6.19
    6.190705 = sum of:
      6.190705 = weight(author_txt:doran in 4726) [ClassicSimilarity], result of:
        6.190705 = fieldWeight in 4726, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.905128 = idf(docFreq=5, maxDocs=44218)
          0.625 = fieldNorm(doc=4726)
    
  2. Doran, C.; Martin, C.: Measuring success in outsourced cataloging : a data-driven investigation (2017) 4.95
    4.952564 = sum of:
      4.952564 = weight(author_txt:doran in 5150) [ClassicSimilarity], result of:
        4.952564 = fieldWeight in 5150, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.905128 = idf(docFreq=5, maxDocs=44218)
          0.5 = fieldNorm(doc=5150)
    
  3. Rittschof, K.A.; Kulhavy, R.W.; Stock, W.A.; Verdi, M.P.; Doran, J.M.: Thematic maps improve memory for facts and inferences : a test of the stimulus order hypothesis (1994) 3.10
    3.0953524 = sum of:
      3.0953524 = weight(author_txt:doran in 2089) [ClassicSimilarity], result of:
        3.0953524 = fieldWeight in 2089, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.905128 = idf(docFreq=5, maxDocs=44218)
          0.3125 = fieldNorm(doc=2089)
    
  4. Monireh, E.; Sarker, M.K.; Bianchi, F.; Hitzler, P.; Doran, D.; Xie, N.: Reasoning over RDF knowledge bases using deep learning (2018) 2.48
    2.476282 = sum of:
      2.476282 = weight(author_txt:doran in 4553) [ClassicSimilarity], result of:
        2.476282 = fieldWeight in 4553, product of:
          1.0 = tf(freq=1.0), with freq of:
            1.0 = termFreq=1.0
          9.905128 = idf(docFreq=5, maxDocs=44218)
          0.25 = fieldNorm(doc=4553)
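
The author-match scores above can be reproduced by hand. The following is a minimal sketch, assuming the standard Lucene ClassicSimilarity definitions (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1))); the function names are illustrative and not part of the catalog software. Lucene computes in 32-bit floats, so the last digits may differ slightly.

```python
import math


def classic_tf(freq: float) -> float:
    """Term-frequency factor: square root of the raw frequency."""
    return math.sqrt(freq)


def classic_idf(doc_freq: int, max_docs: int) -> float:
    """Inverse document frequency: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def field_weight(freq: float, doc_freq: int, max_docs: int, field_norm: float) -> float:
    """fieldWeight = tf * idf * fieldNorm, as listed in the explain tree."""
    return classic_tf(freq) * classic_idf(doc_freq, max_docs) * field_norm


# Values copied from entry 1 (doc 4726): freq=1.0, docFreq=5, maxDocs=44218, fieldNorm=0.625
idf_doran = classic_idf(5, 44218)
score = field_weight(freq=1.0, doc_freq=5, max_docs=44218, field_norm=0.625)
print(round(idf_doran, 4), round(score, 4))  # roughly 9.9051 and 6.1907
```

Because tf and idf are identical for all four entries, the ranking is driven entirely by fieldNorm, which shrinks as the author field gets longer (0.625 for one author, 0.25 for six).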
    

Similar documents (content)

  1. Byers, D.: Full-text indexing of non-textual resources (1998) 0.13
    0.13175043 = sum of:
      0.13175043 = product of:
        0.8234402 = sum of:
          0.029835528 = weight(abstract_txt:from in 3606) [ClassicSimilarity], result of:
            0.029835528 = score(doc=3606,freq=3.0), product of:
              0.04985899 = queryWeight, product of:
                1.0510787 = boost
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.01716282 = queryNorm
              0.59839815 = fieldWeight in 3606, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.125 = fieldNorm(doc=3606)
          0.090083115 = weight(abstract_txt:server in 3606) [ClassicSimilarity], result of:
            0.090083115 = score(doc=3606,freq=1.0), product of:
              0.11922857 = queryWeight, product of:
                1.1493139 = boost
                6.044398 = idf(docFreq=284, maxDocs=44218)
                0.01716282 = queryNorm
              0.7555497 = fieldWeight in 3606, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.044398 = idf(docFreq=284, maxDocs=44218)
                0.125 = fieldNorm(doc=3606)
          0.04309401 = weight(abstract_txt:they in 3606) [ClassicSimilarity], result of:
            0.04309401 = score(doc=3606,freq=1.0), product of:
              0.09188388 = queryWeight, product of:
                1.4268658 = boost
                3.7520406 = idf(docFreq=2820, maxDocs=44218)
                0.01716282 = queryNorm
              0.46900508 = fieldWeight in 3606, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.7520406 = idf(docFreq=2820, maxDocs=44218)
                0.125 = fieldNorm(doc=3606)
          0.6604275 = weight(abstract_txt:robots in 3606) [ClassicSimilarity], result of:
            0.6604275 = score(doc=3606,freq=1.0), product of:
              0.64894605 = queryWeight, product of:
                4.6442266 = boost
                8.14154 = idf(docFreq=34, maxDocs=44218)
                0.01716282 = queryNorm
              1.0176924 = fieldWeight in 3606, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.14154 = idf(docFreq=34, maxDocs=44218)
                0.125 = fieldNorm(doc=3606)
        0.16 = coord(4/25)
    
  2. Kimmel, S.: Robot-generated databases on the World Wide Web (1996) 0.12
    0.11931661 = sum of:
      0.11931661 = product of:
        0.9943051 = sum of:
          0.01722555 = weight(abstract_txt:from in 4724) [ClassicSimilarity], result of:
            0.01722555 = score(doc=4724,freq=1.0), product of:
              0.04985899 = queryWeight, product of:
                1.0510787 = boost
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.01716282 = queryNorm
              0.34548533 = fieldWeight in 4724, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.125 = fieldNorm(doc=4724)
          0.04309401 = weight(abstract_txt:they in 4724) [ClassicSimilarity], result of:
            0.04309401 = score(doc=4724,freq=1.0), product of:
              0.09188388 = queryWeight, product of:
                1.4268658 = boost
                3.7520406 = idf(docFreq=2820, maxDocs=44218)
                0.01716282 = queryNorm
              0.46900508 = fieldWeight in 4724, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.7520406 = idf(docFreq=2820, maxDocs=44218)
                0.125 = fieldNorm(doc=4724)
          0.93398553 = weight(abstract_txt:robots in 4724) [ClassicSimilarity], result of:
            0.93398553 = score(doc=4724,freq=2.0), product of:
              0.64894605 = queryWeight, product of:
                4.6442266 = boost
                8.14154 = idf(docFreq=34, maxDocs=44218)
                0.01716282 = queryNorm
              1.4392345 = fieldWeight in 4724, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.14154 = idf(docFreq=34, maxDocs=44218)
                0.125 = fieldNorm(doc=4724)
        0.12 = coord(3/25)
    
  3. Moya Anegón, F. de; López-Huertas, M.J.: ¬An automatic model for updating the conceptual structure of a scientific discipline (2000) 0.10
    0.097802654 = sum of:
      0.097802654 = product of:
        0.40751106 = sum of:
          0.015072357 = weight(abstract_txt:from in 126) [ClassicSimilarity], result of:
            0.015072357 = score(doc=126,freq=4.0), product of:
              0.04985899 = queryWeight, product of:
                1.0510787 = boost
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.01716282 = queryNorm
              0.30229968 = fieldWeight in 126, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.0546875 = fieldNorm(doc=126)
          0.03661266 = weight(abstract_txt:applying in 126) [ClassicSimilarity], result of:
            0.03661266 = score(doc=126,freq=1.0), product of:
              0.11351509 = queryWeight, product of:
                1.121438 = boost
                5.8977947 = idf(docFreq=329, maxDocs=44218)
                0.01716282 = queryNorm
              0.32253563 = fieldWeight in 126, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.8977947 = idf(docFreq=329, maxDocs=44218)
                0.0546875 = fieldNorm(doc=126)
          0.015920699 = weight(abstract_txt:their in 126) [ClassicSimilarity], result of:
            0.015920699 = score(doc=126,freq=2.0), product of:
              0.06515396 = queryWeight, product of:
                1.201528 = boost
                3.1594994 = idf(docFreq=5101, maxDocs=44218)
                0.01716282 = queryNorm
              0.24435507 = fieldWeight in 126, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.1594994 = idf(docFreq=5101, maxDocs=44218)
                0.0546875 = fieldNorm(doc=126)
          0.018853629 = weight(abstract_txt:they in 126) [ClassicSimilarity], result of:
            0.018853629 = score(doc=126,freq=1.0), product of:
              0.09188388 = queryWeight, product of:
                1.4268658 = boost
                3.7520406 = idf(docFreq=2820, maxDocs=44218)
                0.01716282 = queryNorm
              0.20518972 = fieldWeight in 126, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.7520406 = idf(docFreq=2820, maxDocs=44218)
                0.0546875 = fieldNorm(doc=126)
          0.03211467 = weight(abstract_txt:classification in 126) [ClassicSimilarity], result of:
            0.03211467 = score(doc=126,freq=2.0), product of:
              0.10401637 = queryWeight, product of:
                1.5181488 = boost
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.01716282 = queryNorm
              0.3087463 = fieldWeight in 126, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.9920752 = idf(docFreq=2218, maxDocs=44218)
                0.0546875 = fieldNorm(doc=126)
          0.28893703 = weight(abstract_txt:robots in 126) [ClassicSimilarity], result of:
            0.28893703 = score(doc=126,freq=1.0), product of:
              0.64894605 = queryWeight, product of:
                4.6442266 = boost
                8.14154 = idf(docFreq=34, maxDocs=44218)
                0.01716282 = queryNorm
              0.44524044 = fieldWeight in 126, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.14154 = idf(docFreq=34, maxDocs=44218)
                0.0546875 = fieldNorm(doc=126)
        0.24 = coord(6/25)
    
  4. Day, R.E.: Indexing it all : the subject in the age of documentation, information, and data (2014) 0.09
    0.092971526 = sum of:
      0.092971526 = product of:
        0.46485764 = sum of:
          0.014917764 = weight(abstract_txt:from in 3024) [ClassicSimilarity], result of:
            0.014917764 = score(doc=3024,freq=3.0), product of:
              0.04985899 = queryWeight, product of:
                1.0510787 = boost
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.01716282 = queryNorm
              0.29919907 = fieldWeight in 3024, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.0625 = fieldNorm(doc=3024)
          0.03823343 = weight(abstract_txt:purposes in 3024) [ClassicSimilarity], result of:
            0.03823343 = score(doc=3024,freq=1.0), product of:
              0.106889136 = queryWeight, product of:
                1.0882164 = boost
                5.723078 = idf(docFreq=392, maxDocs=44218)
                0.01716282 = queryNorm
              0.35769236 = fieldWeight in 3024, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.723078 = idf(docFreq=392, maxDocs=44218)
                0.0625 = fieldNorm(doc=3024)
          0.0686268 = weight(abstract_txt:modern in 3024) [ClassicSimilarity], result of:
            0.0686268 = score(doc=3024,freq=3.0), product of:
              0.10946119 = queryWeight, product of:
                1.1012313 = boost
                5.7915254 = idf(docFreq=366, maxDocs=44218)
                0.01716282 = queryNorm
              0.626951 = fieldWeight in 3024, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                5.7915254 = idf(docFreq=366, maxDocs=44218)
                0.0625 = fieldNorm(doc=3024)
          0.0128658675 = weight(abstract_txt:their in 3024) [ClassicSimilarity], result of:
            0.0128658675 = score(doc=3024,freq=1.0), product of:
              0.06515396 = queryWeight, product of:
                1.201528 = boost
                3.1594994 = idf(docFreq=5101, maxDocs=44218)
                0.01716282 = queryNorm
              0.19746871 = fieldWeight in 3024, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.1594994 = idf(docFreq=5101, maxDocs=44218)
                0.0625 = fieldNorm(doc=3024)
          0.33021376 = weight(abstract_txt:robots in 3024) [ClassicSimilarity], result of:
            0.33021376 = score(doc=3024,freq=1.0), product of:
              0.64894605 = queryWeight, product of:
                4.6442266 = boost
                8.14154 = idf(docFreq=34, maxDocs=44218)
                0.01716282 = queryNorm
              0.5088462 = fieldWeight in 3024, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.14154 = idf(docFreq=34, maxDocs=44218)
                0.0625 = fieldNorm(doc=3024)
        0.2 = coord(5/25)
    
  5. Hidalgo, C.: Why information grows : the evolution of order, from atoms to economies (2015) 0.09
    0.088410035 = sum of:
      0.088410035 = product of:
        0.44205016 = sum of:
          0.011188323 = weight(abstract_txt:from in 2154) [ClassicSimilarity], result of:
            0.011188323 = score(doc=2154,freq=3.0), product of:
              0.04985899 = queryWeight, product of:
                1.0510787 = boost
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.01716282 = queryNorm
              0.2243993 = fieldWeight in 2154, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                2.7638826 = idf(docFreq=7577, maxDocs=44218)
                0.046875 = fieldNorm(doc=2154)
          0.03138228 = weight(abstract_txt:applying in 2154) [ClassicSimilarity], result of:
            0.03138228 = score(doc=2154,freq=1.0), product of:
              0.11351509 = queryWeight, product of:
                1.121438 = boost
                5.8977947 = idf(docFreq=329, maxDocs=44218)
                0.01716282 = queryNorm
              0.27645913 = fieldWeight in 2154, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                5.8977947 = idf(docFreq=329, maxDocs=44218)
                0.046875 = fieldNorm(doc=2154)
          0.013646314 = weight(abstract_txt:their in 2154) [ClassicSimilarity], result of:
            0.013646314 = score(doc=2154,freq=2.0), product of:
              0.06515396 = queryWeight, product of:
                1.201528 = boost
                3.1594994 = idf(docFreq=5101, maxDocs=44218)
                0.01716282 = queryNorm
              0.2094472 = fieldWeight in 2154, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.1594994 = idf(docFreq=5101, maxDocs=44218)
                0.046875 = fieldNorm(doc=2154)
          0.035588674 = weight(abstract_txt:present in 2154) [ClassicSimilarity], result of:
            0.035588674 = score(doc=2154,freq=2.0), product of:
              0.123444505 = queryWeight, product of:
                1.6538624 = boost
                4.348943 = idf(docFreq=1552, maxDocs=44218)
                0.01716282 = queryNorm
              0.28829694 = fieldWeight in 2154, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                4.348943 = idf(docFreq=1552, maxDocs=44218)
                0.046875 = fieldNorm(doc=2154)
          0.35024455 = weight(abstract_txt:robots in 2154) [ClassicSimilarity], result of:
            0.35024455 = score(doc=2154,freq=2.0), product of:
              0.64894605 = queryWeight, product of:
                4.6442266 = boost
                8.14154 = idf(docFreq=34, maxDocs=44218)
                0.01716282 = queryNorm
              0.5397129 = fieldWeight in 2154, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.14154 = idf(docFreq=34, maxDocs=44218)
                0.046875 = fieldNorm(doc=2154)
        0.2 = coord(5/25)
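
The content-match scores combine several term scores with a coordination factor. The following is a minimal sketch, assuming the standard Lucene ClassicSimilarity formulas and using the numbers shown for entry 2 (doc 4724); the class and function names are hypothetical and chosen only for illustration. Lucene works in 32-bit floats, so the final digits may differ slightly.

```python
import math
from dataclasses import dataclass

MAX_DOCS = 44218          # maxDocs from the explain output
QUERY_NORM = 0.01716282   # queryNorm from the explain output
QUERY_TERMS = 25          # denominator of coord(n/25)


@dataclass
class TermMatch:
    term: str
    freq: float    # termFreq in the matched abstract
    doc_freq: int  # docFreq of the term in the index
    boost: float   # per-term boost from the generated query


def idf(doc_freq: int, max_docs: int = MAX_DOCS) -> float:
    """ClassicSimilarity idf: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def term_score(m: TermMatch, field_norm: float) -> float:
    """score = queryWeight * fieldWeight for one matching term."""
    i = idf(m.doc_freq)
    query_weight = m.boost * i * QUERY_NORM
    field_weight = math.sqrt(m.freq) * i * field_norm
    return query_weight * field_weight


def document_score(matches: list[TermMatch], field_norm: float) -> float:
    """Sum of term scores, scaled by coord = matched terms / query terms."""
    coord = len(matches) / QUERY_TERMS
    return coord * sum(term_score(m, field_norm) for m in matches)


# Entry 2 (doc 4724): three of the 25 query terms match, fieldNorm=0.125
kimmel = [
    TermMatch("from", 1.0, 7577, 1.0510787),
    TermMatch("they", 1.0, 2820, 1.4268658),
    TermMatch("robots", 2.0, 34, 4.6442266),
]
total = document_score(kimmel, field_norm=0.125)
print(round(total, 4))  # roughly 0.1193
```

The rare term "robots" (docFreq=34) carries both the largest idf and the largest boost, so it dominates every score in this list; common words such as "from" and "they" contribute only a few percent.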