Section 3-3 BLOCKING PRECISION

In contrast to blocking recall (equation 3.1), blocking precision is harder to express as an equation. Precision is the proportion of the records in a block that are matched records. Every matched record that is recalled lies in some block, but blocks usually contain unmatched records as well; these unmatched records are the noise. Now suppose we are searching a very large file. If we use the same blocking fields as we did on a small file, agreement on those fields will probably define larger blocks, with more noise in each. The only way to keep the blocks the same size is to choose a field whose number of distinct values grows in proportion to the size of the file.
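The effect of file size on block size can be illustrated with a small numerical sketch. It is not the expression developed later in this section. It assumes, purely for illustration, that records are spread evenly over the values of the blocking field and that each block holds a fixed number of matched records; the function names and the figures used are hypothetical.

```python
def blocking_precision(matched_in_block, unmatched_in_block):
    """Proportion of the records in a block that are matched records."""
    total = matched_in_block + unmatched_in_block
    if total == 0:
        return 0.0
    return matched_in_block / total


def expected_block_size(file_size, distinct_values):
    """Average block size, ASSUMING records spread evenly over the field's values."""
    return file_size / distinct_values


# Hypothetical figures: the same blocking field (500 distinct values)
# applied to a small file and to a file 100 times larger.
small_block = expected_block_size(10_000, 500)         # 20.0 records per block
large_block = expected_block_size(1_000_000, 500)      # 2000.0 records per block

# A field whose distinct values grow in proportion to the file size
# (500 -> 50,000) restores the original block size.
scaled_block = expected_block_size(1_000_000, 50_000)  # 20.0 records per block

# ASSUMPTION: 2 matched records per block in every case; the rest is noise.
matched = 2
p_small = blocking_precision(matched, small_block - matched)    # 0.1
p_large = blocking_precision(matched, large_block - matched)    # 0.001
p_scaled = blocking_precision(matched, scaled_block - matched)  # 0.1
```

Under these assumptions, keeping the same field on the larger file cuts precision by a factor of 100, while a field with proportionally more distinct values keeps it unchanged.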

3-3.1 A definition for blocking precision.
3-3.2 Matched records in blocks.
3-3.3 Unmatched records (noise) in blocks.
3-3.4 A specific expression for blocking precision.
3-3.5 An iterative approach to blocking precision.
3-3.6 Measuring coincidence.