MinkowskiSize

class frlearn.vector_size_measures.MinkowskiSize(p: float, unrooted: bool = False, scale_by_dimensionality: bool = False)[source]

Family of vector size measures of the form (x1**p + x2**p + … + xm**p)**(1/p) (if unrooted = False), or (x1**p + x2**p + … + xm**p) (if unrooted = True), for 0 < p < ∞, and their limits in 0 and ∞.

For p = 0, the rooted variant evaluates to ∞ if there is more than one non-zero coefficient, to 0 if all coefficients are zero, and to the only non-zero coefficient otherwise. The unrooted variant is equal to the number of non-zero coefficients.

For p = ∞, the rooted variant is the maximum of all coefficients. The unrooted variant evaluates to ∞ if there is at least one coefficient larger than 1, and to the number of coefficients equal to 1 otherwise.

Parameters
p: float = 1

Exponent to use. Must be in [0, ∞].

unrooted: bool = False

Whether to omit the root **(1/p) from the formula. For p = 0, this gives Hamming size. For p = 2, this gives squared Euclidean size.

scale_by_dimensionality: bool = False

If True, values are scaled linearly such that the vector [1, 1, …, 1] has size 1. This can be used to ensure that the range of dissimilarity values in the unit hypercube is [0, 1], which can be useful when working with features scaled to [0, 1].

Notes

The most used parameter combinations have their own name.

  • Hamming size is unrooted p = 0.

  • The Boscovich norm is p = 1. Also known as cityblock, Manhattan or Taxicab norm.

  • The Euclidean norm is rooted p = 2. Also known as Pythagorean norm.

  • Squared Euclidean size is unrooted p = 2.

  • The Chebishev norm is rooted p = ∞. Also known as chessboard or maximum norm.