Document (#43044)

Author
Förster, F.
Title
Zuweisung von Katalogdatensätzen an Personennormdatensätze mittels Wahrscheinlichkeiten
Source
B.I.T. Online. 23(2020) H.2, S.138-148
Year
2020
Abstract
In June 2020, the Tn records in the Gemeinsame Normdatei (GND) will be deleted. Within the person records, only the Tp records for uniquely identifiable persons will remain. This article is intended to encourage the enrichment and cleanup of personal name records by means of probabilities, based on data from the GND and k10plus. For each Tp record, a profile can be built from linked information, e.g. keywords, subject specializations, co-authors, times and places. In the same way, distinguishable profiles for Tn records can be detected algorithmically. In addition, existing links between person records and title records could be used to detect incorrect assignments. Such a procedure would result in a retrospective enrichment of the older holdings and a more precise catalog.
Theme
Normdateien
Object
GND

Similar documents (content)

  1. Vorndran, A.: Hervorholen, was in unseren Daten steckt! : Mehrwerte durch Analysen großer Bibliotheksdatenbestände (2018) 0.10
    0.09807455 = sum of:
      0.09807455 = product of:
        0.49037275 = sum of:
          0.094873115 = weight(abstract_txt:verknüpfungen in 4601) [ClassicSimilarity], result of:
            0.094873115 = score(doc=4601,freq=1.0), product of:
              0.14696726 = queryWeight, product of:
                1.0412332 = boost
                8.2629 = idf(docFreq=30, maxDocs=44218)
                0.017082054 = queryNorm
              0.6455391 = fieldWeight in 4601, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.2629 = idf(docFreq=30, maxDocs=44218)
                0.078125 = fieldNorm(doc=4601)
          0.094873115 = weight(abstract_txt:anreicherung in 4601) [ClassicSimilarity], result of:
            0.094873115 = score(doc=4601,freq=1.0), product of:
              0.14696726 = queryWeight, product of:
                1.0412332 = boost
                8.2629 = idf(docFreq=30, maxDocs=44218)
                0.017082054 = queryNorm
              0.6455391 = fieldWeight in 4601, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.2629 = idf(docFreq=30, maxDocs=44218)
                0.078125 = fieldNorm(doc=4601)
          0.043493822 = weight(abstract_txt:werden in 4601) [ClassicSimilarity], result of:
            0.043493822 = score(doc=4601,freq=4.0), product of:
              0.07938966 = queryWeight, product of:
                1.3255017 = boost
                3.5062556 = idf(docFreq=3606, maxDocs=44218)
                0.017082054 = queryNorm
              0.54785246 = fieldWeight in 4601, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.5062556 = idf(docFreq=3606, maxDocs=44218)
                0.078125 = fieldNorm(doc=4601)
          0.10023707 = weight(abstract_txt:mittels in 4601) [ClassicSimilarity], result of:
            0.10023707 = score(doc=4601,freq=1.0), product of:
              0.19208233 = queryWeight, product of:
                1.6834352 = boost
                6.6796074 = idf(docFreq=150, maxDocs=44218)
                0.017082054 = queryNorm
              0.5218443 = fieldWeight in 4601, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.6796074 = idf(docFreq=150, maxDocs=44218)
                0.078125 = fieldNorm(doc=4601)
          0.15689561 = weight(abstract_txt:personen in 4601) [ClassicSimilarity], result of:
            0.15689561 = score(doc=4601,freq=2.0), product of:
              0.20552549 = queryWeight, product of:
                1.7413478 = boost
                6.9093957 = idf(docFreq=119, maxDocs=44218)
                0.017082054 = queryNorm
              0.7633876 = fieldWeight in 4601, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                6.9093957 = idf(docFreq=119, maxDocs=44218)
                0.078125 = fieldNorm(doc=4601)
        0.2 = coord(5/25)
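
    The explain tree above can be recomputed by hand. The following is a minimal sketch, assuming the standard Lucene ClassicSimilarity formulas (tf = sqrt(freq), idf = 1 + ln(maxDocs / (docFreq + 1)), term score = queryWeight * fieldWeight, document score = coord * sum of term scores). The function and variable names are illustrative and do not come from the catalog system; boost, queryNorm and fieldNorm are copied from the tree, not derived.

    ```python
    import math


    def classic_idf(doc_freq, max_docs):
        """Inverse document frequency as used by Lucene's ClassicSimilarity."""
        return 1.0 + math.log(max_docs / (doc_freq + 1.0))


    def term_score(freq, doc_freq, max_docs, boost, query_norm, field_norm):
        """Score of one matched term: queryWeight * fieldWeight."""
        idf = classic_idf(doc_freq, max_docs)
        query_weight = boost * idf * query_norm            # query side
        field_weight = math.sqrt(freq) * idf * field_norm  # document side
        return query_weight * field_weight


    # Constants copied from the explain tree for doc 4601.
    MAX_DOCS = 44218
    QUERY_NORM = 0.017082054
    FIELD_NORM = 0.078125
    QUERY_TERMS = 25  # denominator of coord(5/25)

    # (term, freq in doc, docFreq, boost), one tuple per matched term.
    matched_terms = [
        ("verknüpfungen", 1.0, 30, 1.0412332),
        ("anreicherung", 1.0, 30, 1.0412332),
        ("werden", 4.0, 3606, 1.3255017),
        ("mittels", 1.0, 150, 1.6834352),
        ("personen", 2.0, 119, 1.7413478),
    ]

    # coord rewards documents that match a larger share of the query terms.
    coord = len(matched_terms) / QUERY_TERMS

    score = coord * sum(
        term_score(freq, doc_freq, MAX_DOCS, boost, QUERY_NORM, FIELD_NORM)
        for _term, freq, doc_freq, boost in matched_terms
    )

    print(round(score, 4))  # approximately 0.0981, the value at the top of the tree
    ```

    Small differences in the last digits are expected, since Lucene computes these values in single precision while the sketch uses Python's double-precision floats.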
    
  2. Damasio, A.R.; Damasio, H.: Sprache und Gehirn (1992) 0.07
    0.06791275 = sum of:
      0.06791275 = product of:
        0.84890944 = sum of:
          0.08908838 = weight(abstract_txt:eine in 4168) [ClassicSimilarity], result of:
            0.08908838 = score(doc=4168,freq=3.0), product of:
              0.078619756 = queryWeight, product of:
                1.3190588 = boost
                3.4892128 = idf(docFreq=3668, maxDocs=44218)
                0.017082054 = queryNorm
              1.1331551 = fieldWeight in 4168, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                3.4892128 = idf(docFreq=3668, maxDocs=44218)
                0.1875 = fieldNorm(doc=4168)
          0.75982106 = weight(abstract_txt:sätze in 4168) [ClassicSimilarity], result of:
            0.75982106 = score(doc=4168,freq=1.0), product of:
              0.47333175 = queryWeight, product of:
                3.236541 = boost
                8.561393 = idf(docFreq=22, maxDocs=44218)
                0.017082054 = queryNorm
              1.6052611 = fieldWeight in 4168, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.561393 = idf(docFreq=22, maxDocs=44218)
                0.1875 = fieldNorm(doc=4168)
        0.08 = coord(2/25)
    
  3. Geisriegler, E.: Enriching electronic texts with semantic metadata : a use case for the historical Newspaper Collection ANNO (Austrian Newspapers Online) of the Austrian National Library (2012) 0.06
    0.0641673 = sum of:
      0.0641673 = product of:
        0.32083648 = sum of:
          0.10733668 = weight(abstract_txt:anreicherung in 595) [ClassicSimilarity], result of:
            0.10733668 = score(doc=595,freq=2.0), product of:
              0.14696726 = queryWeight, product of:
                1.0412332 = boost
                8.2629 = idf(docFreq=30, maxDocs=44218)
                0.017082054 = queryNorm
              0.7303441 = fieldWeight in 595, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                8.2629 = idf(docFreq=30, maxDocs=44218)
                0.0625 = fieldNorm(doc=595)
          0.09020367 = weight(abstract_txt:orte in 595) [ClassicSimilarity], result of:
            0.09020367 = score(doc=595,freq=1.0), product of:
              0.16489772 = queryWeight, product of:
                1.1029226 = boost
                8.752448 = idf(docFreq=18, maxDocs=44218)
                0.017082054 = queryNorm
              0.547028 = fieldWeight in 595, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.752448 = idf(docFreq=18, maxDocs=44218)
                0.0625 = fieldNorm(doc=595)
          0.017145066 = weight(abstract_txt:eine in 595) [ClassicSimilarity], result of:
            0.017145066 = score(doc=595,freq=1.0), product of:
              0.078619756 = queryWeight, product of:
                1.3190588 = boost
                3.4892128 = idf(docFreq=3668, maxDocs=44218)
                0.017082054 = queryNorm
              0.2180758 = fieldWeight in 595, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4892128 = idf(docFreq=3668, maxDocs=44218)
                0.0625 = fieldNorm(doc=595)
          0.017397529 = weight(abstract_txt:werden in 595) [ClassicSimilarity], result of:
            0.017397529 = score(doc=595,freq=1.0), product of:
              0.07938966 = queryWeight, product of:
                1.3255017 = boost
                3.5062556 = idf(docFreq=3606, maxDocs=44218)
                0.017082054 = queryNorm
              0.21914098 = fieldWeight in 595, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.5062556 = idf(docFreq=3606, maxDocs=44218)
                0.0625 = fieldNorm(doc=595)
          0.08875356 = weight(abstract_txt:personen in 595) [ClassicSimilarity], result of:
            0.08875356 = score(doc=595,freq=1.0), product of:
              0.20552549 = queryWeight, product of:
                1.7413478 = boost
                6.9093957 = idf(docFreq=119, maxDocs=44218)
                0.017082054 = queryNorm
              0.43183723 = fieldWeight in 595, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.9093957 = idf(docFreq=119, maxDocs=44218)
                0.0625 = fieldNorm(doc=595)
        0.2 = coord(5/25)
    
  4. Nissen, K.; Reuter, M.: ¬Die neuen Leiden der jungen Wörter : Das aktuelle Wörterbuch zur Rächtschraiprehvorm (1999) 0.06
    0.0629485 = sum of:
      0.0629485 = product of:
        0.39342815 = sum of:
          0.16396365 = weight(abstract_txt:stichwörter in 2859) [ClassicSimilarity], result of:
            0.16396365 = score(doc=2859,freq=1.0), product of:
              0.15471897 = queryWeight, product of:
                1.06834 = boost
                8.478011 = idf(docFreq=24, maxDocs=44218)
                0.017082054 = queryNorm
              1.0597514 = fieldWeight in 2859, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                8.478011 = idf(docFreq=24, maxDocs=44218)
                0.125 = fieldNorm(doc=2859)
          0.03429013 = weight(abstract_txt:eine in 2859) [ClassicSimilarity], result of:
            0.03429013 = score(doc=2859,freq=1.0), product of:
              0.078619756 = queryWeight, product of:
                1.3190588 = boost
                3.4892128 = idf(docFreq=3668, maxDocs=44218)
                0.017082054 = queryNorm
              0.4361516 = fieldWeight in 2859, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.4892128 = idf(docFreq=3668, maxDocs=44218)
                0.125 = fieldNorm(doc=2859)
          0.034795057 = weight(abstract_txt:werden in 2859) [ClassicSimilarity], result of:
            0.034795057 = score(doc=2859,freq=1.0), product of:
              0.07938966 = queryWeight, product of:
                1.3255017 = boost
                3.5062556 = idf(docFreq=3606, maxDocs=44218)
                0.017082054 = queryNorm
              0.43828195 = fieldWeight in 2859, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                3.5062556 = idf(docFreq=3606, maxDocs=44218)
                0.125 = fieldNorm(doc=2859)
          0.16037932 = weight(abstract_txt:mittels in 2859) [ClassicSimilarity], result of:
            0.16037932 = score(doc=2859,freq=1.0), product of:
              0.19208233 = queryWeight, product of:
                1.6834352 = boost
                6.6796074 = idf(docFreq=150, maxDocs=44218)
                0.017082054 = queryNorm
              0.8349509 = fieldWeight in 2859, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                6.6796074 = idf(docFreq=150, maxDocs=44218)
                0.125 = fieldNorm(doc=2859)
        0.16 = coord(4/25)
    
  5. Meyer, R.: Allein, es wär' so schön gewesen : Der Copernic Summarizer kann Internettexte leider nicht befriedigend und sinnvoll zusammenfassen (2002) 0.06
    0.056481082 = sum of:
      0.056481082 = product of:
        0.35300678 = sum of:
          0.04202127 = weight(abstract_txt:wären in 648) [ClassicSimilarity], result of:
            0.04202127 = score(doc=648,freq=1.0), product of:
              0.13555783 = queryWeight, product of:
                7.935687 = idf(docFreq=42, maxDocs=44218)
                0.017082054 = queryNorm
              0.30998778 = fieldWeight in 648, product of:
                1.0 = tf(freq=1.0), with freq of:
                  1.0 = termFreq=1.0
                7.935687 = idf(docFreq=42, maxDocs=44218)
                0.0390625 = fieldNorm(doc=648)
          0.021431332 = weight(abstract_txt:eine in 648) [ClassicSimilarity], result of:
            0.021431332 = score(doc=648,freq=4.0), product of:
              0.078619756 = queryWeight, product of:
                1.3190588 = boost
                3.4892128 = idf(docFreq=3668, maxDocs=44218)
                0.017082054 = queryNorm
              0.27259475 = fieldWeight in 648, product of:
                2.0 = tf(freq=4.0), with freq of:
                  4.0 = termFreq=4.0
                3.4892128 = idf(docFreq=3668, maxDocs=44218)
                0.0390625 = fieldNorm(doc=648)
          0.015377388 = weight(abstract_txt:werden in 648) [ClassicSimilarity], result of:
            0.015377388 = score(doc=648,freq=2.0), product of:
              0.07938966 = queryWeight, product of:
                1.3255017 = boost
                3.5062556 = idf(docFreq=3606, maxDocs=44218)
                0.017082054 = queryNorm
              0.1936951 = fieldWeight in 648, product of:
                1.4142135 = tf(freq=2.0), with freq of:
                  2.0 = termFreq=2.0
                3.5062556 = idf(docFreq=3606, maxDocs=44218)
                0.0390625 = fieldNorm(doc=648)
          0.2741768 = weight(abstract_txt:sätze in 648) [ClassicSimilarity], result of:
            0.2741768 = score(doc=648,freq=3.0), product of:
              0.47333175 = queryWeight, product of:
                3.236541 = boost
                8.561393 = idf(docFreq=22, maxDocs=44218)
                0.017082054 = queryNorm
              0.5792487 = fieldWeight in 648, product of:
                1.7320508 = tf(freq=3.0), with freq of:
                  3.0 = termFreq=3.0
                8.561393 = idf(docFreq=22, maxDocs=44218)
                0.0390625 = fieldNorm(doc=648)
        0.16 = coord(4/25)