Effective validation techniques are an essential prerequisite for segmentation and non-rigid registration methods to enter clinical use. These algorithms can be evaluated by calculating the overlap of corresponding test and gold-standard regions. Standard overlap measures compare pairs of binary labels, but it is now common for multiple labels to exist and for fractional (partial volume) labels to be used to describe multiple tissue types contributing to a single voxel. Evaluation studies may also involve multiple image pairs. In this paper we use results from fuzzy set theory and fuzzy morphology to extend the definitions of existing overlap measures to accommodate multiple fractional labels. Simple formulas are provided that define single figures of merit to quantify the total overlap for ensembles of pairwise or groupwise label comparisons. A quantitative link between overlap and registration error is established by defining the overlap tolerance. Experiments are performed on publicly available labeled brain data to demonstrate the new measures in a comparison of pairwise and groupwise registration.
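
To make the idea of overlap for multiple fractional labels concrete, the following is a minimal sketch rather than the definitions given in the paper. It assumes the standard fuzzy-set operators (minimum for intersection, maximum for union) and accumulates numerator and denominator over all image pairs, labels, and voxels to give a single Jaccard-style and a single Dice-style figure. The function name `fuzzy_overlap`, the optional per-label weights, and the nested-list data layout are hypothetical choices made for illustration.

```python
def fuzzy_overlap(pairs, label_weights=None):
    """Return (jaccard, dice) accumulated over image pairs, labels, and voxels.

    pairs: list of (test, gold) tuples. Each of test and gold is a list
    indexed by label; each entry is a flat list of fractional label values
    in [0, 1], one per voxel.
    label_weights: optional per-label weights (assumed; defaults to 1.0).
    """
    inter = 0.0  # weighted sum of fuzzy intersections, min(a, b)
    union = 0.0  # weighted sum of fuzzy unions, max(a, b)
    total = 0.0  # weighted sum of label volumes, a + b
    for test, gold in pairs:
        for label, (a_vals, b_vals) in enumerate(zip(test, gold)):
            w = 1.0 if label_weights is None else label_weights[label]
            for a, b in zip(a_vals, b_vals):
                inter += w * min(a, b)
                union += w * max(a, b)
                total += w * (a + b)
    if union == 0.0:
        raise ValueError("all labels are empty; overlap is undefined")
    return inter / union, 2.0 * inter / total


# One image pair, two labels, three voxels; the middle voxel is partial volume.
test = [[1.0, 0.5, 0.0], [0.0, 0.5, 1.0]]
gold = [[1.0, 0.25, 0.0], [0.0, 0.75, 1.0]]
jaccard, dice = fuzzy_overlap([(test, gold)])
print(round(jaccard, 3), round(dice, 3))  # prints: 0.846 0.917
```

For binary labels the minimum and maximum reduce to the usual set intersection and union, so the sketch returns the familiar Jaccard and Dice coefficients in that case.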