I have a bunch of scipy sparse matrices (all with the same number of columns) loaded from disk.

I want to combine them into one scipy sparse matrix.

I am using the `scipy.sparse.vstack` function.

I am able to load the matrices individually (with adequate memory left in RAM), but the vstack operation fails with an out-of-memory error.

Can anyone suggest an alternative to vstack to stack sparse matrices?

SHASHANK GUPTA
  • Note that `vstack` returns a new matrix. The old matrices stay in memory alongside the new concatenated matrix, so memory consumption should approximately double. There might also be some hidden overhead in the `vstack` method. This doesn't solve your problem, of course, but it explains why you get an OOM error even though all the constituent matrices fit in memory. – zachdj Jan 14 '20 at 14:56
  • @zachdj Thanks for your prompt response. Makes sense. Instead of loading all the matrices in one go and stacking them later, I'll stack them into one matrix as I read them sequentially. Will check if that solves the problem. – SHASHANK GUPTA Jan 14 '20 at 15:18
  • @zachdj I tried loading the matrices one at a time and appending each to a single matrix. It worked. – SHASHANK GUPTA Jan 15 '20 at 07:14
  • Glad to hear it! Maybe post your solution as an answer in case other people have similar troubles? – zachdj Jan 15 '20 at 21:00
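
For anyone landing here later, the approach described in the comments can be sketched roughly as below. This is not the asker's actual code, which was never posted. The names `stack_incrementally` and `load_blocks` are hypothetical, and reading `.npz` files with `scipy.sparse.load_npz` is an assumption, since the question does not say how the matrices were saved.

```python
import numpy as np
import scipy.sparse as sp


def stack_incrementally(matrices):
    """Vertically stack sparse matrices one at a time.

    `matrices` is any iterable (ideally a lazy generator) that yields
    scipy sparse matrices with the same number of columns. Only the
    running result and the current block are referenced at any moment.
    Returns None if the iterable is empty.
    """
    stacked = None
    for block in matrices:
        block = sp.csr_matrix(block)  # normalize the format to CSR
        if stacked is None:
            stacked = block
        else:
            # vstack allocates a new matrix; rebinding `stacked` drops
            # the reference to the previous running result.
            stacked = sp.vstack([stacked, block], format="csr")
    return stacked


def load_blocks(paths):
    """Hypothetical loader: yield one matrix per .npz file, lazily."""
    for path in paths:
        yield sp.load_npz(path)


# Small in-memory demo, with random blocks standing in for files on disk.
demo_blocks = (
    sp.random(3, 5, density=0.4, format="csr", random_state=seed)
    for seed in range(4)
)
demo_result = stack_incrementally(demo_blocks)
print(demo_result.shape)
```

With files on disk, the call would look like `stack_incrementally(load_blocks(paths))`, where `paths` is your own list of file names. Each `vstack` call still copies the running result, so how much memory this saves probably depends on the data and the SciPy version; it worked in the case reported above, but it is not guaranteed to help in general.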
