CD

class frlearn.data_descriptors.CD(measure: str = 'euclidean', threshold_perc: int = 80, preprocessors=(Standardiser(), ))

Implementation of the Centre Distance (CD) data descriptor. Calculates a score based on the distance to a central point of the target data.

This is implemented simply as the vector size of each instance (its distance to the origin), on the expectation that the given preprocessors normalise the data such that a suitable central value of the data is located at the origin and all features have the same scale. The drawback of this approach is that it cannot be used with dissimilarity measures that are not induced by a vector size measure.

With the default preprocessor (standardisation) and the default measure, this is the Euclidean distance to the centroid in the standardised feature space.
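The default behaviour described above can be sketched in plain NumPy. This is an illustration only, not frlearn's code: the toy data are made up, and standardising with the population standard deviation is an assumption (the library's `Standardiser` may differ in detail).

```python
import numpy as np

# Toy target data (hypothetical): 5 instances, 2 features on very different scales.
X = np.array([[1.0, 100.0],
              [2.0, 300.0],
              [3.0, 200.0],
              [4.0, 500.0],
              [5.0, 400.0]])

# Standardise: subtract the per-feature mean and divide by the per-feature
# standard deviation (population std is an assumption made for this sketch).
Z = (X - X.mean(axis=0)) / X.std(axis=0)

# After standardisation the centroid sits at the origin ...
centroid = Z.mean(axis=0)

# ... so the Euclidean vector size of each row is its distance to the centroid.
distances = np.linalg.norm(Z, axis=1)
```

Because both features are rescaled to unit standard deviation, the second feature does not dominate the distance despite its larger raw values.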

Parameters
measure: str or float or (np.array -> float) = 'euclidean'

The vector size measure to use. A float is interpreted as Minkowski size with the corresponding value for p. For convenience, a number of popular measures can be referred to by name.

threshold_perc: int or None, default=80

Threshold percentile for normal instances. Should be in (0, 100] or None. The threshold is the distance value at this percentile of the distances in the target set; all distances below it result in a final score above 0.5. If None, 1 is used as the threshold instead.

preprocessors: iterable = (Standardiser(), )

Preprocessors to apply. The default standardiser places the centroid of the data at the origin, and ensures that all features have the same standard deviation.
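To make the `measure` and `threshold_perc` parameters concrete, here is a minimal NumPy sketch. The helper names `minkowski_size` and `to_scores` are hypothetical, and the mapping `threshold / (threshold + distance)` is an assumption chosen only because it gives a score of exactly 0.5 at the threshold and higher scores below it; the library's actual formula is not stated here and may differ.

```python
import numpy as np

def minkowski_size(X, p):
    """Row-wise Minkowski size: (sum_i |x_i|^p)^(1/p)."""
    X = np.asarray(X, dtype=float)
    return np.sum(np.abs(X) ** p, axis=1) ** (1.0 / p)

def to_scores(distances, threshold):
    """Assumed mapping for illustration: 0.5 at the threshold, higher below it."""
    distances = np.asarray(distances, dtype=float)
    return threshold / (threshold + distances)

# Toy, already-preprocessed target data (values chosen for easy arithmetic).
Z_train = np.array([[0.5, 0.0], [0.0, -1.0], [1.5, 0.0], [0.0, 2.0], [-2.5, 0.0]])
train_sizes = minkowski_size(Z_train, p=2)  # a float measure of 2 is Euclidean size

# threshold_perc=80: take the 80th percentile of the target-set distances.
threshold_perc = 80
threshold = 1.0 if threshold_perc is None else np.percentile(train_sizes, threshold_perc)

# Score two query instances: one close to the origin, one far from it.
Z_query = np.array([[1.0, 0.0], [3.0, 4.0]])
scores = to_scores(minkowski_size(Z_query, p=2), threshold)
```

Under these assumptions, the first query instance lies inside the threshold and scores above 0.5, while the second lies outside and scores below 0.5. Changing `p` changes the shape of the region counted as normal, for example `p=1` gives Manhattan size.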

class Model
