Questions tagged [sparse]

32 questions
3
votes
0 answers

Converting pandas dataframe to scipy sparse arrays

Converting a pandas data frame with mixed column types -- numerical, ordinal as well as categorical -- to SciPy sparse arrays is a central problem in machine learning. Now, if my pandas data frame consists of only numerical data, then I can simply do…
learner
  • 359
  • 1
  • 11
3
votes
1 answer

Convert Pandas Dataframe with mixed datatypes to LibSVM format

I have a pandas data frame with about a million rows and 3 columns. The columns are of 3 different datatypes. NumberOfFollowers is of a numerical datatype, UserName is of a categorical data type, Embeddings is of categorical-set type. df: Index …
learner
  • 359
  • 1
  • 11
2
votes
1 answer

Best metric and hyperparameters in dimension reduction with UMAP for binary sparse data

I am playing with a dimensionality reduction step prior to clustering for a pretty large sparse binary matrix of almost 3000 columns and 50k rows. My idea is to embed the 3000 dimensions into a two-dimensional space with UMAP and then cluster the…
linello
  • 121
  • 4
2
votes
1 answer

Why do my target labels need to begin at 0 for sparse categorical cross entropy to work?

I'm following a guide here to implement image segmentation in Keras. One thing I'm confused about are these lines: # Ground truth labels are 1, 2, 3. Subtract one to make them 0, 1, 2: y[j] -= 1 The ground truth targets are .png files with either…
TomSelleck
  • 115
  • 5
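A minimal NumPy sketch of why the labels are shifted, assuming the loss simply uses each integer label as an index into the predicted probability vector. The function below is a hypothetical re-implementation for illustration, not the Keras code: with three classes the valid indices are 0, 1, 2, so a label of 3 has no column to select.

```python
import numpy as np

def sparse_categorical_crossentropy(probs, labels):
    """Per-sample loss: -log of the probability at the index given by each label."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=int)
    n_classes = probs.shape[1]
    # Labels are used directly as column indices, so they must lie in [0, n_classes).
    if labels.min() < 0 or labels.max() >= n_classes:
        raise ValueError(f"labels must be in [0, {n_classes - 1}]")
    picked = probs[np.arange(labels.shape[0]), labels]
    return -np.log(picked)

# Two samples, three classes (illustrative probabilities).
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.1, 0.8]])
raw_labels = np.array([1, 3])  # 1-based labels, as in the question

# Shifting to 0-based labels makes every label a valid column index.
losses = sparse_categorical_crossentropy(probs, raw_labels - 1)
```

Passing `raw_labels` unshifted raises the `ValueError` in this sketch, because label 3 points past the last of the three columns.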
1
vote
0 answers

Linear regression on sparse matrix?

I have a matrix with sparse data. A small extract from it is seen below. The columns represent years and the rows represent different race tracks. The feature values are velocities on that specific track in a specific year. Generally the velocity…
Hanson
  • 11
  • 1
1
vote
2 answers

Predicting high frequency sparse time series data in python

I have a dataset of a couple of EV charging stations (10 min frequency) over 1 year. This data consists of lots of 0s, since there is no continuous flow of cars coming to charge but rather reoccurring charging events as peaks (for example from 7-9…
1
vote
3 answers

How to create a big data frame in Python

I have a sparse matrix, $X$, created by TfidfVectorizer and its size is $(500000, 200000)$. I want to convert $X$ to a data frame but I'm always getting a memory error. I tried pd.DataFrame(X.toarray(),…
user109341
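One possible way around the dense conversion, sketched under assumptions: pandas can wrap a SciPy sparse matrix column by column with `pd.DataFrame.sparse.from_spmatrix`, so `toarray()` is never called. The small random matrix below is a stand-in for the TF-IDF output, and whether this is fast enough for 200,000 columns is not something the sketch establishes.

```python
import pandas as pd
from scipy import sparse

# Stand-in for the TfidfVectorizer output: a small random CSR matrix (assumed data).
X = sparse.random(1000, 500, density=0.01, format="csr", random_state=0)

# Build a DataFrame backed by sparse columns instead of a dense array.
df = pd.DataFrame.sparse.from_spmatrix(X)

# Fraction of stored (non-fill) values, to confirm the frame stayed sparse.
density = df.sparse.density
```

Operations that silently densify the frame would bring the memory problem back, so it is worth checking `df.sparse.density` and the column dtypes after each transformation.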
1
vote
1 answer

Predicting sparse time series data

I have a dataset of a couple of EV charging stations (10 min frequency) over 1 year. This data consists of lots of 0s, since there is no continuous flow of cars coming to charge, but rather reoccurring charging events as peaks (for example from…
Ruben Kl
  • 11
  • 1
1
vote
1 answer

Anomaly detection on sparse categorical data

I have a big dataset with a column "clientid" and a categorical column "choice". I want to find out which clients have strange (less frequent) combinations of choices, and to be able in the future to identify new strange combinations…
1
vote
1 answer

unsupervised anomaly detection on sparse data

Given that I have a very sparse data matrix with continuous features, like this dataframe for example Feature_A Feature_B Feature_C....Feature_Z 0.3 0 0.1 0 0.5 0.5 0 0 0 0…
1
vote
0 answers

Autoencoder for Extremely Sparse Data

I am attempting to train an autoencoder on data that is extremely sparse. Each datapoint is only zeros and ones and contains ~3% 1s. Because the data is mostly zero, the autoencoder learns to guess zero every time. Is there a way to prevent…
PDPDPDPD
  • 113
  • 2
1
vote
0 answers

Sparse Covariance Selection

I was reading this article https://www.di.ens.fr/~aspremon/PDF/CovSelSIMAX.pdf, whose goal is to estimate the covariance matrix from the sample covariance matrix drawn from a distribution $X$. 'Given a sample covariance matrix, we solve a maximum…
mbz0
  • 11
  • 2
1
vote
0 answers

Custom Loss Function for Mixing Sparse and Dense Features for a Prediction Problem

I have a largely uncorrelated feature space of about 40 dichotomous features, with which I'm trying to predict a continuous target variable. Now, some of these features are very sparse (active less than 10% of the time, with the rest as zeros). But…
1
vote
1 answer

What are sparse and dense vectors? I can't understand them, can you explain them to me, please? Why do we use them?

I am new to neural networks, embeddings, etc. I am struggling to understand things like sparse representation, embeddings, and especially sparse vectors. Could you explain these to me? Why do we need this vector? What is it?
1
vote
0 answers

Scipy Sparse vstack memory error

I have a bunch of SciPy matrices (with the same number of columns) loaded from disk. I want to combine them into one SciPy sparse matrix. I am using the scipy.sparse vstack method. I am able to load the matrices individually (with adequate memory left in RAM),…
SHASHANK GUPTA
  • 3,745
  • 4
  • 18
  • 26