sklearn.manifold.trustworthiness
-
sklearn.manifold.trustworthiness(X, X_embedded, *, n_neighbors=5, metric='euclidean')
[source] -
Expresses to what extent the local structure is retained.
The trustworthiness is within [0, 1]. It is defined as
\[T(k) = 1 - \frac{2}{nk (2n - 3k - 1)} \sum^n_{i=1} \sum_{j \in \mathcal{N}_{i}^{k}} \max(0, (r(i, j) - k))\]where for each sample i, \(\mathcal{N}_{i}^{k}\) are its k nearest neighbors in the output space, and every sample j is its \(r(i, j)\)-th nearest neighbor in the input space. In other words, any unexpected nearest neighbors in the output space are penalised in proportion to their rank in the input space.
- “Neighborhood Preservation in Nonlinear Projection Methods: An Experimental Study” J. Venna, S. Kaski
- “Learning a Parametric Embedding by Preserving Local Structure” L.J.P. van der Maaten
- Parameters
-
-
Xndarray of shape (n_samples, n_features) or (n_samples, n_samples)
-
If the metric is ‘precomputed’ X must be a square distance matrix. Otherwise it contains a sample per row.
-
X_embeddedndarray of shape (n_samples, n_components)
-
Embedding of the training data in low-dimensional space.
-
n_neighborsint, default=5
-
Number of neighbors k that will be considered.
-
metricstr or callable, default=’euclidean’
-
Which metric to use for computing pairwise distances between samples from the original input space. If metric is ‘precomputed’, X must be a matrix of pairwise distances or squared distances. Otherwise, see the documentation of argument metric in sklearn.pairwise.pairwise_distances for a list of available metrics.
New in version 0.20.
-
- Returns
-
-
trustworthinessfloat
-
Trustworthiness of the low-dimensional embedding.
-
© 2007–2020 The scikit-learn developers
Licensed under the 3-clause BSD License.
https://scikit-learn.org/0.24/modules/generated/sklearn.manifold.trustworthiness.html