Questions tagged [python-3.x]

240 questions
94
votes
10 answers

ValueError: Input contains NaN, infinity or a value too large for dtype('float32')

I got a ValueError when predicting test data using a RandomForest model. My code: clf = RandomForestClassifier(n_estimators=10, max_depth=6, n_jobs=1, verbose=2) clf.fit(X_fit, y_fit) df_test.fillna(df_test.mean()) X_test = df_test.values y_pred =…
Edamame
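One detail in the excerpt above is worth a small sketch: `DataFrame.fillna` returns a new DataFrame by default, so calling it without assigning the result leaves the original untouched. Whether this is the actual cause of the asker's error is an assumption, since the excerpt is truncated; the toy data below is made up for illustration, and infinite or very large values would need separate handling.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's df_test (not their real data).
df_test = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [np.nan, 2.0, 4.0]})

# fillna is not in-place by default: the result is discarded here,
# so df_test still contains NaN afterwards.
df_test.fillna(df_test.mean())
still_has_nan = bool(df_test.isna().any().any())

# Assigning the result keeps the filled values.
df_filled = df_test.fillna(df_test.mean())
X_test = df_filled.values
filled_has_nan = bool(np.isnan(X_test).any())
```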
8
votes
1 answer

What tokenizer does OpenAI's GPT3 API use?

I'm building an application for the API, but I would like to be able to count the number of tokens my prompt will use, before I submit an API call. Currently I often submit prompts that yield a 'too-many-tokens' error. The closest I got to an answer…
Herman Autore
6
votes
2 answers

Python stemmer for Georgian

I am currently working on Georgian text processing. Does anybody know of any stemmers/lemmatizers (or other NLP tools) for Georgian that I could use with Python? Thanks in advance!
5
votes
1 answer

Applying differencing to a time series: before or after the train/test split?

I am attempting to improve my RNN model by making my dependent variable, a stock price, stationary. I am aiming to make the series stationary by removing the trend with a log transformation and then performing moving average differencing to…
5
votes
1 answer

One Hot Encoding for any kind of dataset

How can I one-hot encode an unknown dataset by iterating over its columns, checking each column's dtype, and deciding whether to encode it based on its number of unique values? Also, how do I keep track of the new one-hot encoded data with…
5
votes
2 answers

Beginning my data science journey

I am a Master's student in physics. Lately I have been intrigued by the field of data science. I have beginner-level knowledge of Python, undergraduate-level knowledge of mathematics, and master's-level knowledge of physics. I now want to learn data…
Maaz Khan
4
votes
2 answers

What is the best model for a recommendation system using implicit ratings?

I have a similarity matrix that looks like this: I have a bunch of user vectors with 1s and 0s, with a 1 indicating that someone has clicked on an email (as part of a campaign) and a 0 indicating they haven't. Implicit ratings as I have come to…
4
votes
1 answer

How is the GridSearchCV score calculated?

How is the score of GridSearchCV calculated? Is the score a percentage? Does a higher score mean a better model?
ml_learner
4
votes
1 answer

Pandas dataframe, create columns depending on the row value

I get a CSV that, when read, looks like: import pandas as pd df = pd.DataFrame([['de,ch,fr', '1,2,3'],['fr,ch,dk', '3,4,5']], columns=['countries', 'numbers'], index=['abc', 'bcd']) I want to make it look like this: df =…
4
votes
2 answers

Is it alright to split a GridSearchCV?

Is it OK to split a GridSearchCV run? At first, I would try estimators from 100-300 (100 steps) for a random forest regressor along with some other parameters, and after that, I would start the GridSearchCV with the same parameters and just change the estimators…
4
votes
1 answer

What is the difference between "Python interactive" and "Python 3" kernels?

When creating a new notebook using the web interface, Jupyter gives me two options: 'Python Interactive' and 'Python 3' (see screenshot). However, I've been unable to find any indication of what the differences between these two kernels (which I…
Fritz
4
votes
1 answer

How to make Python code execute on the GPU and not the CPU

I am doing some image pre-processing using Python 3. As far as I know, the code is executed on the CPU, which makes the process a little slow. It would be a good idea to run it on the GPU for faster processing, because the GPU can do graph operations much faster than…
Hunar
3
votes
1 answer

Keras early stopping to a target

I'm really struggling to understand how the parameters of Keras early stopping callbacks play out, especially in the presence of a baseline. What I want is simply for the training to stop within 2 epochs of the validation accuracy hitting 95%. So…
3
votes
4 answers

How to combine and separate test and train data for data cleaning?

I am working on an ML model for which I have been provided the data in 2 files, test.csv and train.csv. I want to perform data cleaning on both files together by concatenating them and then separating them. I know how to concatenate 2 dataframes, but…
Ishan Dutta
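The concatenate-clean-separate idea in the question above can be sketched with `pd.concat` and its `keys` argument, which labels each row with the frame it came from so the two parts can be pulled apart again with `.loc`. The column names (`age`, `target`) and the median-fill cleaning step are hypothetical, chosen only for illustration; the asker's actual files and cleaning steps are not shown in the excerpt.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for train.csv and test.csv.
train = pd.DataFrame({"age": [22.0, np.nan, 35.0], "target": [0, 1, 0]})
test = pd.DataFrame({"age": [np.nan, 41.0]})

# keys= adds an outer index level recording which frame each row came from.
combined = pd.concat([train, test], keys=["train", "test"])

# Example cleaning step applied to both parts at once (illustrative only).
combined["age"] = combined["age"].fillna(combined["age"].median())

# Select each part back out by its key; the test rows never had a target,
# so that column is dropped again for the test part.
train_clean = combined.loc["train"]
test_clean = combined.loc["test"].drop(columns=["target"])
```

Note that statistics computed on the combined frame (such as the median here) include test rows, which may leak test information into training; fitting such statistics on the training part alone is the more cautious choice.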
3
votes
1 answer

No gradients provided for any variable: ['Variable:0', 'Variable:0']

I am using Python 3.6 and TensorFlow 2.0 for the following code for linear regression: tx = tf.constant(x, dtype=tf.float32) ty = tf.constant(y, dtype=tf.float32) tw = tf.Variable(initial_value=np.random.randn(), dtype=tf.float32,…