Questions tagged [sparse]
32 questions
3
votes
0 answers
Converting pandas dataframe to scipy sparse arrays
Converting pandas data frame with mixed column types -- numerical, ordinal as well as categorical -- to Scipy sparse arrays is a central problem in machine learning.
Now, if my pandas data frame consists of only numerical data, then I can simply do…
learner
- 359
- 1
- 11
3
votes
1 answer
Convert Pandas Dataframe with mixed datatypes to LibSVM format
I have a pandas data frame with about a million rows and 3 columns. The columns are of 3 different datatypes: NumberOfFollowers is numerical, UserName is categorical, and Embeddings is of categorical-set type.
df:
Index …
learner
- 359
- 1
- 11
2
votes
1 answer
Best metric and hyperparameters in dimension reduction with UMAP for binary sparse data
I am playing with a dimensionality reduction step prior to clustering for a pretty large sparse binary matrix of almost 3000 columns and 50k rows.
My idea is to embed the 3000 dimensions into a two-dimensional space with UMAP and then cluster the…
linello
- 121
- 4
2
votes
1 answer
Why do my target labels need to begin at 0 for sparse categorical cross entropy to work?
I'm following a guide here to implement image segmentation in Keras.
One thing I'm confused about is these lines:
# Ground truth labels are 1, 2, 3. Subtract one to make them 0, 1, 2:
y[j] -= 1
The ground truth targets are .png files with either…
TomSelleck
- 115
- 5
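For the question above about labels starting at 0, the following is a minimal NumPy sketch, not Keras's actual implementation, of how a sparse categorical cross-entropy can use each integer label as a column index into the per-class probabilities. The function name and the sample arrays are hypothetical and chosen only for illustration. Under this assumption, a label equal to the number of classes falls outside the valid index range, which is why labels 1, 2, 3 are shifted to 0, 1, 2.

```python
import numpy as np

def sparse_categorical_crossentropy(labels, probs):
    """Mean negative log-probability of the true class.

    labels: int array of shape (n,), values expected in [0, num_classes).
    probs:  float array of shape (n, num_classes), each row sums to 1.
    """
    labels = np.asarray(labels)
    probs = np.asarray(probs, dtype=float)
    # Each label is used directly as a column index into probs.
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.log(picked).mean())

# Hypothetical predictions for 3 samples over 3 classes.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.2, 0.2, 0.6]])

raw_labels = np.array([1, 2, 3])   # labels as stored, starting at 1
shifted = raw_labels - 1           # 0, 1, 2: valid column indices
loss = sparse_categorical_crossentropy(shifted, probs)
```

In this sketch, passing `raw_labels` unshifted raises an `IndexError`, because index 3 does not exist in a 3-column array.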
1
vote
0 answers
Linear regression on sparse matrix?
I have a matrix with sparse data. A small extract from it is shown below. The columns represent years and the rows represent different race tracks. The feature values are velocities on a specific track in a specific year. Generally the velocity…
Hanson
- 11
- 1
1
vote
2 answers
Predicting high frequency sparse time series data in python
I have a dataset of a couple of EV charging stations (10 min frequency) over 1 year. This data consists of lots of 0s, since there is no continuous flow of cars coming to charge but rather reoccurring charging events as peaks (for example from 7-9…
Ruben Kl
- 11
- 2
1
vote
3 answers
How to create a big data frame in Python
I have a sparse matrix, $X$, created by TfidfVectorizer and its size is $(500000, 200000)$. I want to convert $X$ to a data frame but I'm always getting a memory error.
I tried
pd.DataFrame(X.toarray(),…
user109341
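For the question above about turning a large sparse matrix into a data frame, one option that avoids calling `toarray()` is pandas' sparse accessor. The sketch below is only an illustration on a small random matrix standing in for the TF-IDF output. The sizes and the variable names are assumptions, and it is not guaranteed that a matrix with 200000 columns fits in memory this way, since pandas still keeps per-column overhead.

```python
import pandas as pd
from scipy import sparse

# Small random CSR matrix used as a stand-in for the TF-IDF matrix X.
X = sparse.random(1000, 500, density=0.01, format="csr", random_state=0)

# Build a DataFrame whose columns are sparse arrays, without densifying X.
df = pd.DataFrame.sparse.from_spmatrix(X)
```

Each column of `df` has a sparse dtype, so only the non-zero entries are stored.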
1
vote
1 answer
Predicting sparse time series data
I have a dataset of a couple of EV charging stations (10 min frequency) over 1 year. This data consists of lots of 0s, since there is no continuous flow of cars coming to charge, but rather reoccurring charging events as peaks (for example from…
Ruben Kl
- 11
- 1
1
vote
1 answer
Anomaly detection on sparse categorical data
I have a big dataset with a column "clientid" and a categorical column "choice".
I want to find out which clients have strange combinations of choices (less frequent ones) and be able in the future to identify new strange combinations…
DataLover
- 19
- 3
1
vote
1 answer
unsupervised anomaly detection on sparse data
Given that I have a very sparse data matrix with continuous features, like this dataframe for example
Feature_A Feature_B Feature_C....Feature_Z
0.3 0 0.1 0
0.5 0.5 0 0
0 0…
greghouse1
- 11
- 2
1
vote
0 answers
Autoencoder for Extremely Sparse Data
I am attempting to train an autoencoder on data that is extremely sparse. Each datapoint is only zeros and ones and contains ~3% 1s. Because the data is mostly zeros, the autoencoder learns to guess zero every time. Is there a way to prevent…
PDPDPDPD
- 113
- 2
1
vote
0 answers
Sparse Covariance Selection
I was reading this article https://www.di.ens.fr/~aspremon/PDF/CovSelSIMAX.pdf, whose goal is to estimate the covariance matrix from the sample covariance matrix drawn from a distribution $X$.
' Given a sample covariance matrix, we solve a maximum…
mbz0
- 11
- 2
1
vote
0 answers
Custom Loss Function for Mixing Sparse and Dense Features for a Prediction Problem
I have a largely uncorrelated feature space of about 40 dichotomous features, using which I'm trying to predict a continuous target variable.
Now, some of these features are very sparse (Active less than 10% of the time, with the rest as zeros). But…
Recyclops
- 11
- 2
1
vote
1 answer
What are sparse and dense vectors? I can't understand them; can you explain them to me, please? Why do we use them?
I am new to neural networks, embeddings, etc. I am struggling to understand things like sparse representation, embeddings, and especially sparse vectors. Could you explain these to me? Why do we need this vector? What is it?
Tugba Ozkan
- 11
- 2
1
vote
0 answers
Scipy Sparse vstack memory error
I have a bunch of scipy matrices (with the same number of columns) loaded from disk.
I want to combine them into one scipy sparse matrix.
I am using scipy sparse vstack method.
I am able to load the matrices individually (with adequate memory left in RAM),…
SHASHANK GUPTA
- 3,745
- 4
- 18
- 26
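For reference on the question above, this is a minimal sketch of stacking several sparse matrices with `scipy.sparse.vstack`. The block sizes and densities are made up for illustration. Note that the inputs and the stacked result are all held in memory at the same time, which may be relevant to the memory error described, although the excerpt does not say enough to confirm the cause.

```python
from scipy import sparse

# Three hypothetical CSR blocks with the same number of columns.
blocks = [
    sparse.random(100, 50, density=0.05, format="csr", random_state=i)
    for i in range(3)
]

# Stack the blocks row-wise into one sparse matrix, keeping CSR format.
stacked = sparse.vstack(blocks, format="csr")
```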