Section 2-2 USING PROBABILITY TO MEASURE THREE KEY ATTRIBUTES

The previous section outlined rough definitions for three logical attributes of a record comparison as they concern our study: 1) matched vs. unmatched records (¶ 2-1.1), 2) agreement vs. disagreement in the data values in the fields of the records (¶ 2-1.3 and ¶ 2-1.4), and 3) presence vs. absence of data in the fields of the records (¶ 2-1.5). Our final goal is to define comparisons so as to get a handle on the first attribute, that is, to distinguish those that are matched from those that are unmatched. To do this accurately we must study the necessary relationships among these three attributes and obtain measures of the other two attributes: agreement/disagreement and presence/absence.

2-2.1 Probability.
2-2.2 Two basic relationships between attributes.
2-2.3 Example of conditional probability.
2-2.4 Calculating conditional probabilities.
2-2.5 Independence of events.
2-2.6 The theorem of total probability.
2-2.7 Bayes' theorem.
2-2.8 The strategy of probabilistic record linkage.