I am using truncated SVD from the scikit-learn package.
In the definition of SVD, an original matrix A is approximated as a product A ≈ UΣV*, where U and V have orthonormal columns and Σ is a non-negative diagonal matrix.
I need to get the U, Σ and V* matrices.
Looking at the source code, I found out that V* is stored in the
self.components_ field after calling fit_transform.
Is it possible to get the U and Σ matrices?
    import sklearn.decomposition as skd
    import numpy as np

    matrix = np.random.random((20, 20))
    trsvd = skd.TruncatedSVD(n_components=15)
    transformed = trsvd.fit_transform(matrix)
    VT = trsvd.components_
Looking into the source via the link you provided,
TruncatedSVD is basically a wrapper around sklearn.utils.extmath.randomized_svd; you can manually call this yourself like this:
    from sklearn.utils.extmath import randomized_svd

    U, Sigma, VT = randomized_svd(X, n_components=15, n_iter=5, random_state=None)
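To see what the three returned arrays look like, here is a minimal sketch on made-up data. The 20x20 random matrix and the fixed seed are my own assumptions for illustration, not part of the question. It only checks shapes, that U has orthonormal columns, and how close U * Sigma * VT comes to X.

```python
import numpy as np
from sklearn.utils.extmath import randomized_svd

# Hypothetical demo data (assumption): a dense 20x20 random matrix
rng = np.random.RandomState(0)
X = rng.random_sample((20, 20))

U, Sigma, VT = randomized_svd(X, n_components=15, n_iter=5, random_state=0)

# U is (20, 15), Sigma is a 1-D vector of length 15, VT is (15, 20)
print(U.shape, Sigma.shape, VT.shape)

# Rebuild the rank-15 approximation and measure the relative error
X_approx = U @ np.diag(Sigma) @ VT
rel_err = np.linalg.norm(X - X_approx) / np.linalg.norm(X)
print(rel_err)
```

Note that Sigma comes back as a vector, so np.diag(Sigma) is needed whenever the diagonal matrix itself is wanted.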
    import numpy as np
    from scipy.sparse.linalg import svds

    matrix = np.random.random((20, 20))
    num_components = 2
    u, s, vt = svds(matrix, k=num_components)  # the third return value is V transposed
    X = u.dot(np.diag(s))  # output of TruncatedSVD
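One thing to watch with svds: as far as I know it does not promise the largest-first ordering that np.linalg.svd uses, so it is safer to sort the triplets yourself. A small sketch, where the random matrix and seed are assumptions for the demo:

```python
import numpy as np
from scipy.sparse.linalg import svds

# Hypothetical demo data (assumption)
rng = np.random.RandomState(0)
matrix = rng.random_sample((20, 20))

u, s, vt = svds(matrix, k=2)

# Reorder so that the largest singular value comes first,
# permuting the columns of u and the rows of vt to match
order = np.argsort(s)[::-1]
u, s, vt = u[:, order], s[order], vt[order, :]

# Reference: the two largest singular values from the dense solver
s_ref = np.linalg.svd(matrix, compute_uv=False)[:2]
print(s, s_ref)
```

After the reordering, u, s and vt line up with the first two columns, values and rows that np.linalg.svd would give, up to the sign of each singular vector pair.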
If you're working with really big sparse matrices (perhaps you're working with natural text), even
scipy.sparse.linalg.svds might blow up your computer's RAM. In such cases, consider the sparsesvd package, which wraps SVDLIBC and is what
gensim uses under the hood.
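    import scipy.sparse
    from sparsesvd import sparsesvd

    # sparsesvd expects a scipy.sparse matrix in CSC format
    X = scipy.sparse.random(30, 30, density=0.2, format="csc", random_state=0)
    k = 5  # number of singular triplets to keep
    ut, s, vt = sparsesvd(X, k)
    # project the rows of X onto the right singular vectors, rescaled by s
    projected = X.dot(vt.T) / s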
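Just as a note:
svd.fit_transform(X) and svd.transform(X) generate U * Sigma.
svd.singular_values_ generates Sigma in vector form.
Maybe we can multiply svd.transform(X) by the inverse of np.diag(svd.singular_values_)
to get U because U * Sigma * Sigma ^ -1 = U * I = U.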
From the source code, we can see that
X_transformed, which is
U * Sigma (here
Sigma is a vector), is returned by the
fit_transform method. So we can get:
    import numpy as np
    from sklearn.decomposition import TruncatedSVD

    svd = TruncatedSVD(n_components=k)
    X_transformed = svd.fit_transform(X)  # U * Sigma
    U = X_transformed / svd.singular_values_
    Sigma_matrix = np.diag(svd.singular_values_)
    VT = svd.components_
Truncated SVD is an approximation: X ≈ X' = UΣV*. We have X'V = UΣ. But what about XV? An interesting fact is that XV = X'V. This can be proved by comparing the full SVD form of X and the truncated SVD form of X'. Note XV is just
transform(X), so we can also get
U = svd.transform(X) / svd.singular_values_
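Both routes to U can be checked numerically. The sketch below is only an illustration under my own assumptions: made-up 20x20 data, k = 5, and algorithm="arpack" instead of the default randomized solver, so the factors are computed to near machine precision. With the randomized solver the same identities should hold only approximately.

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD

# Hypothetical demo data and rank (assumptions)
rng = np.random.RandomState(0)
X = rng.random_sample((20, 20))
k = 5

# ARPACK is used here so that the comparison below is tight
svd = TruncatedSVD(n_components=k, algorithm="arpack", random_state=0)
X_transformed = svd.fit_transform(X)

U = X_transformed / svd.singular_values_          # from fit_transform
U_alt = svd.transform(X) / svd.singular_values_   # from X V / Sigma
VT = svd.components_

# Rank-k reconstruction, compared against the dense SVD truncated to k
X_k = (U * svd.singular_values_) @ VT
u_ref, s_ref, vh_ref = np.linalg.svd(X, full_matrices=False)
X_k_ref = (u_ref[:, :k] * s_ref[:k]) @ vh_ref[:k, :]

print(np.allclose(U, U_alt, atol=1e-6))
print(np.allclose(U.T @ U, np.eye(k), atol=1e-6))
print(np.allclose(X_k, X_k_ref, atol=1e-6))
```

Dividing by the vector svd.singular_values_ rescales each column, which is the same as multiplying by the inverse of np.diag(svd.singular_values_) but avoids building the matrix.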
If your matrices are not large, this can be computed directly with
np.linalg.svd: since NumPy returns the singular values sorted in descending order, simply take the first k singular values from Σ, the first k columns of U, and the first k rows of Vh. (Use
full_matrices=False to get the thin SVD if one of your dimensions is huge.)
    import numpy as np

    m = np.random.random((5, 5))
    u, s, vh = np.linalg.svd(m)
    u2, s2, vh2 = u[:, :2], s[:2], vh[:2, :]
    m2 = u2 @ np.diag(s2) @ vh2  # rank-2 approx
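A quick way to sanity-check the slicing is the Eckart-Young result: for the best rank-k approximation, the spectral-norm error equals the first discarded singular value, and the Frobenius-norm error equals the root sum of squares of all discarded ones. A minimal sketch, with the seed and k = 2 as assumptions for the demo:

```python
import numpy as np

# Hypothetical demo data (assumption)
rng = np.random.RandomState(0)
m = rng.random_sample((5, 5))
k = 2

u, s, vh = np.linalg.svd(m, full_matrices=False)

# Scale the first k columns of u by s, then multiply by the first k rows of vh
m_k = (u[:, :k] * s[:k]) @ vh[:k, :]

spectral_err = np.linalg.norm(m - m_k, ord=2)
frob_err = np.linalg.norm(m - m_k, ord="fro")

print(spectral_err, s[k])                      # these two should match
print(frob_err, np.sqrt(np.sum(s[k:] ** 2)))   # and so should these
```

If the two numbers on each line disagree, the slices were probably taken from the wrong axis (rows of u or columns of vh).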
If your matrices are large, then the randomized algorithms provided by
sklearn.decomposition.TruncatedSVD will compute truncated SVD more efficiently.
I know this is an older question, but the correct version is:
    U = svd.fit_transform(X) / svd.singular_values_  # fit_transform returns U * Sigma
    Sigma = svd.singular_values_
    VT = svd.components_
However, one thing to keep in mind is that U and VT are truncated, so without the remaining singular values and vectors it is not possible to reconstruct X exactly.
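To get a feel for how much is lost, one can map the reduced data back with inverse_transform and measure the error for a few values of n_components. This is only a sketch on made-up data; the matrix, the seed and the chosen ranks are my assumptions.

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD

# Hypothetical demo data (assumption): a dense 20x20 random matrix
rng = np.random.RandomState(0)
X = rng.random_sample((20, 20))

errors = []
for k in (2, 10, 19):
    svd = TruncatedSVD(n_components=k, random_state=0)
    X_reduced = svd.fit_transform(X)
    # inverse_transform multiplies the reduced data by components_
    X_back = svd.inverse_transform(X_reduced)
    errors.append(np.linalg.norm(X - X_back) / np.linalg.norm(X))

# Relative Frobenius error for each rank; it shrinks as k grows
# but stays above zero while any component is dropped
print(errors)
```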
Let us suppose X is the input matrix on which we want to perform truncated SVD.
The commands below help to find U, Sigma and VT:
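    from sklearn.decomposition import TruncatedSVD

    SVD = TruncatedSVD(n_components=r)  # r is the number of components to keep (the target rank)
    U = SVD.fit_transform(X) / SVD.singular_values_  # fit_transform returns U * Sigma
    Sigma = SVD.singular_values_  # explained_variance_ratio_ holds variance fractions, not singular values
    VT = SVD.components_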
To understand the above terms, please refer to http://scikit-learn.org/stable/modules/generated/sklearn.decomposition.TruncatedSVD.html