Skip to content

Add Randomized SVDs#2999

Open
Intron7 wants to merge 11 commits intorapidsai:mainfrom
Intron7:randomized_svds
Open

Add Randomized SVDs#2999
Intron7 wants to merge 11 commits intorapidsai:mainfrom
Intron7:randomized_svds

Conversation

@Intron7
Copy link
Copy Markdown

@Intron7 Intron7 commented Apr 9, 2026

This PR adds randomized SVDs to raft based on (Halko et al. 2009) and (Tomás et al. 2024). I also added the possibility for a very limited linear operator. This one is C++ and might be useful for sparse PCA in cuml. It's tested in the C++ layer but not exposed in Python. This PR mimics the implementation I did for rapids-singlecell

sparse svd benchmarks (k=50, n_oversamples=10)
GPU: RTX PRO 6000 Blackwell. raft = best of 3; sklearn/scipy = single run.

rows       nnz      raft GPU   sklearn CPU   scipy ARPACK   sk/raft   scipy/raft
50k        101M     0.27s      7.38s         18.69s         27.3×     69.2×
200k       400M     1.06s      25.61s        62.49s         24.1×     58.8×
500k       1.02B    2.75s      64.97s        153.40s        23.6×     55.8×
982k       2.01B    5.48s      126.16s       307.04s        23.0×     56.1×

~23-27x over sklearn, ~56-69x over scipy ARPACK. speedup is remarkably stable across scales — raft stays sub-linear while CPU methods scale linearly with nnz. at ~1M cells raft finishes in 5.5s vs 2 min (sklearn) / 5 min (scipy).

@copy-pr-bot
Copy link
Copy Markdown

copy-pr-bot Bot commented Apr 9, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@aamijar aamijar added non-breaking Non-breaking change feature request New feature or request labels Apr 10, 2026
@aamijar aamijar changed the title add randomized svds Add randomized svds Apr 10, 2026
@aamijar aamijar changed the title Add randomized svds Add Randomized SVDs Apr 10, 2026
@aamijar
Copy link
Copy Markdown
Member

aamijar commented Apr 10, 2026

/ok to test 307d18e

@copy-pr-bot
Copy link
Copy Markdown

copy-pr-bot Bot commented Apr 10, 2026

/ok to test 307d18e

@aamijar, there was an error processing your request: E2

See the following link for more information: https://docs.gha-runners.nvidia.com/cpr/e/2/

@aamijar
Copy link
Copy Markdown
Member

aamijar commented Apr 10, 2026

/ok to test e52ad48

Copy link
Copy Markdown
Contributor

@viclafargue viclafargue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Here are some comments.

Comment thread python/pylibraft/pylibraft/sparse/linalg/svds.pyx
template <typename ValueTypeT>
struct sparse_svd_config {
/** @brief Number of singular values/vectors to compute */
int n_components;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should maybe set a default for n_components. A C++ user that forgets to specify a value would face an undefined behavior. Setting it to 0 would allow it to be catched immediately by parameters validation.

Comment on lines +40 to +41
cdef sparse_svd_config[float] config_float
cdef sparse_svd_config[double] config_double
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using module-level global config is not thread-safe. Please instantiate them in the function where they are needed.

void sparse_randomized_svd(
raft::resources const& handle,
sparse_svd_config<ValueTypeT> const& config,
raft::device_csr_matrix_view<ValueTypeT, int, int, NNZTypeT> A,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
raft::device_csr_matrix_view<ValueTypeT, int, int, NNZTypeT> A,
raft::device_csr_matrix_view<const ValueTypeT, int, int, NNZTypeT> A,

The input matrix should be const.

Also for the doc, we sometimes use the paradigm @param [in], @param [inout], @param [out]. @param [in] is supposed to be usable with const data.

Omega.data_handle(), n, block_size),
Y.view());
} // Omega freed here
cholesky_qr2(handle, Y.view());
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should maybe check the return value of cholesky_qr2 calls and emit a warning when there's a fallback to standard QR.

Comment on lines +54 to +55
int rows() const { return m_; }
int cols() const { return n_; }
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
int rows() const { return m_; }
int cols() const { return n_; }
int rows() const { return A_.structure_view().get_n_rows(); }
int cols() const { return A_.structure_view().get_n_cols(); }

Comment on lines +29 to +31
raft::device_csr_matrix_view<const ValueTypeT, int, int, NNZTypeT> A_;
int m_;
int n_;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A_ already stores the dimensions of the matrix, storing m_ and n_ here is redundant, adds complexity and introduces the possibility of bugs. The number of rows and cols can be derived directly from A_. Also A_ should probably be a private member.

void sparse_randomized_svd(
raft::resources const& handle,
sparse_svd_config<ValueTypeT> const& config,
OperatorT const& op,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even though the csr_linear_operator struct is certainly useful, the user should not ever need to construct this exotic type. The public function should expose an interface with a raft::device_csr_matrix_view and the csr_linear_operator utility should be built inside of the function.

Comment on lines +138 to +140
rmm::device_uvector<ValueTypeT> Q_copy(m * k, stream);
raft::copy(Q_copy.data(), Q.data_handle(), m * k, stream);
raft::linalg::qrGetQ(handle, Q_copy.data(), Q.data_handle(), m, k, stream);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
rmm::device_uvector<ValueTypeT> Q_copy(m * k, stream);
raft::copy(Q_copy.data(), Q.data_handle(), m * k, stream);
raft::linalg::qrGetQ(handle, Q_copy.data(), Q.data_handle(), m, k, stream);
raft::linalg::qrGetQ(handle, Q.data_handle(), Q.data_handle(), m, k, stream);

qrGetQ already does a copy internally. Using Q.data_handle() twice would allow the operation to work inplace (even the copy could be avoided as src==dst). To double check though.

Additionally, m * k has an integer overflow risk.

Comment on lines +154 to +156
rmm::device_uvector<ValueTypeT> Q_copy(m * k, stream);
raft::copy(Q_copy.data(), Q.data_handle(), m * k, stream);
raft::linalg::qrGetQ(handle, Q_copy.data(), Q.data_handle(), m, k, stream);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar comment here.

Copy link
Copy Markdown
Contributor

@achirkin achirkin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! Couple notes about usage of raft primitives below

Comment thread cpp/include/raft/sparse/solver/detail/cholesky_qr.cuh Outdated
Comment thread cpp/include/raft/sparse/solver/detail/cholesky_qr.cuh
@Intron7 Intron7 requested a review from achirkin April 14, 2026 15:51
@aamijar
Copy link
Copy Markdown
Member

aamijar commented Apr 14, 2026

/ok to test a803923

1 similar comment
@aamijar
Copy link
Copy Markdown
Member

aamijar commented Apr 14, 2026

/ok to test a803923

@Intron7 Intron7 requested a review from viclafargue April 15, 2026 11:44
@aamijar
Copy link
Copy Markdown
Member

aamijar commented Apr 20, 2026

/ok to test 5dc8079

Copy link
Copy Markdown
Member

@aamijar aamijar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @Intron7 for the great work! It would be great to see your benchmarks results in the PR description!

I've left my review below:

#include <raft/sparse/solver/randomized_svds.cuh>

#define FUNC_DEF(ValueType, Suffix) \
void sparse_randomized_svd_##Suffix( \
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to have a separate function with a suffix here. You can just have the same function name with float and double overloads.

* @{
*/

void sparse_randomized_svd_float(const raft::resources& handle,
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can remove the suffix here.
Also you can use macros with FUNC_DECL for more concise declarations. See https://github.com/rapidsai/raft/blob/main/cpp/include/raft_runtime/solver/lanczos.hpp

*/
/** @} */

} // namespace raft::sparse::solver
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason to add this empty types file?

struct csr_linear_operator {
/**
* @brief Construct from a const CSR matrix view
*/
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great to see a LinearOperator implementation! It could be useful in the lanczos solver as well (#2705). Is this one similar to https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.linalg.LinearOperator.html

I see this is in a detail file, but it might be nice to have a public interface.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not a full Linear Operator. Its just a very limited version for raft to do PCA with the SVDs

Comment thread cpp/include/raft/sparse/solver/randomized_svds.cuh Outdated
@aamijar aamijar moved this to In Progress in Unstructured Data Processing Apr 21, 2026
/**
* @brief Compute truncated SVD of a sparse CSR matrix using randomized algorithm.
*
* Overload accepting a mutable CSR matrix view (implicitly converted to const).
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this overload at all? If sparse_randomized_svd doesn't modify A then we should just have a const API.

@Intron7
Copy link
Copy Markdown
Author

Intron7 commented Apr 27, 2026

sparse svd benchmarks (k=50, n_oversamples=10)
GPU: RTX PRO 6000 Blackwell. raft = best of 3; sklearn/scipy = single run.

rows       nnz      raft GPU   sklearn CPU   scipy ARPACK   sk/raft   scipy/raft
50k        101M     0.27s      7.38s         18.69s         27.3×     69.2×
200k       400M     1.06s      25.61s        62.49s         24.1×     58.8×
500k       1.02B    2.75s      64.97s        153.40s        23.6×     55.8×
982k       2.01B    5.48s      126.16s       307.04s        23.0×     56.1×

~23-27x over sklearn, ~56-69x over scipy ARPACK. speedup is remarkably stable across scales — raft stays sub-linear while CPU methods scale linearly with nnz. at ~1M cells raft finishes in 5.5s vs 2 min (sklearn) / 5 min (scipy).

@Intron7 Intron7 requested a review from aamijar April 27, 2026 14:22
@aamijar
Copy link
Copy Markdown
Member

aamijar commented Apr 28, 2026

/ok to test b601ea2

Copy link
Copy Markdown
Member

@aamijar aamijar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @Intron7!

}

// Step 9: Truncate S, optionally compute Vt
raft::copy(singular_values.data_handle(), S_full.data_handle(), k, stream);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A nitpick

Suggested change
raft::copy(singular_values.data_handle(), S_full.data_handle(), k, stream);
raft::copy(handle, singular_values, raft::make_const_mdspan(S_full));

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

feature request New feature or request non-breaking Non-breaking change

Projects

Status: In Progress

Development

Successfully merging this pull request may close these issues.

5 participants