The binary Jaccard coefficient measures the degree of overlap between
two sets and is computed as the ratio of the number of shared
attributes (words) of
AND
to
the number possessed by
OR
. For example, given two sets' binary indicator vectors
and
, the cardinality of their intersect is 1 and the
cardinality of their union is 3, rendering their Jaccard coefficient
1/3. The binary Jaccard coefficient It is often used in retail
market-basket applications. In chapter 3, we extended the binary definition of Jaccard
coefficient to continuous or discrete non-negative features. The
extended Jaccard is computed as
![]() |
(4.4) |
). The Dice coefficient can be obtained from the extended Jaccard
coefficient by adding