Highest Voted Questions - Data Science Stack Exchange

41

votes

2 answers

How to prepare/augment images for neural network?

I would like to use a neural network for image classification. I'll start with pre-trained CaffeNet and train it for my application. How should I prepare the input images? In this case, all the images are of the same object but with variations…

neural-network image-classification convolutional-neural-network preprocessing

asked Feb 24 '15 at 11:59

Alex I

3,142
1
21
27

41

votes

2 answers

How to calculate mAP for detection task for the PASCAL VOC Challenge?

How is the mAP (mean Average Precision) calculated for the detection task on the Pascal VOC leaderboards? It is stated on page 11: Average Precision (AP). For the VOC2007 challenge, the interpolated average precision (Salton and McGill 1986) was…

machine-learning neural-network svm computer-vision object-recognition

asked Nov 26 '17 at 19:32

Alex

639
1
7
13
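
As a rough illustration of the quantity this question asks about, here is a minimal sketch of the commonly described 11-point interpolated average precision, assuming per-class precision/recall points have already been computed from ranked detections. The function names and the sample numbers are hypothetical, and the matching of detections to ground truth (IoU thresholds, duplicate handling) is deliberately left out.

```python
def interpolated_ap_11pt(recalls, precisions):
    """11-point interpolated AP for one class.

    For each recall level r in {0.0, 0.1, ..., 1.0}, take the maximum
    precision observed at any recall >= r (0 if there is none), then
    average the 11 values.
    """
    levels = [i / 10 for i in range(11)]
    total = 0.0
    for r in levels:
        candidates = [p for rec, p in zip(recalls, precisions) if rec >= r]
        total += max(candidates) if candidates else 0.0
    return total / len(levels)


def mean_average_precision(ap_per_class):
    """mAP is the plain mean of the per-class AP values."""
    return sum(ap_per_class.values()) / len(ap_per_class)


# Hypothetical precision/recall points for two classes (made-up numbers).
ap_per_class = {
    "cat": interpolated_ap_11pt([0.5, 1.0], [1.0, 1.0]),
    "dog": interpolated_ap_11pt([0.5], [1.0]),
}
print(mean_average_precision(ap_per_class))
```

The interpolation step (taking the maximum precision to the right of each recall level) is what smooths out the zig-zag shape of a raw precision/recall curve.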

41

votes

3 answers

Choosing between CPU and GPU for training a neural network

I've seen discussions about the 'overhead' of a GPU, and that for 'small' networks, it may actually be faster to train on a CPU (or network of CPUs) than a GPU. What is meant by 'small'? For example, would a single-layer MLP with 100 hidden units…

neural-network deep-learning gpu

asked May 25 '17 at 23:48

StatsSorceress

1,981
3
14
30

40

votes

3 answers

When to use what - Machine Learning

Recently, in a Machine Learning class at UPC/Barcelona, Professor Oriol Pujol described the most common algorithms, principles and concepts to use for a wide range of machine-learning-related tasks. Here I share them with you and ask you: is…

machine-learning algorithms

asked Jan 20 '15 at 15:27

Javierfdr

1,490
12
14

40

votes

6 answers

How to set the number of neurons and layers in neural networks

I am a beginner to neural networks and have had trouble grasping two concepts: How does one decide the number of middle layers a given neural network should have? 1 vs. 10 or whatever. How does one decide the number of neurons in each middle layer? Is it…

machine-learning neural-network deep-learning hyperparameter hyperparameter-tuning

asked Jan 13 '18 at 15:26

stk1234

573
1
5
6

40

votes

4 answers

What are the advantages of HDF compared to alternative formats?

What are the advantages of HDF compared to alternative formats? What are the main data science tasks where HDF is really suitable and useful?

data-formats hierarchical-data-format

asked Jun 10 '14 at 09:26

IgorS

5,444
11
31
43

40

votes

10 answers

Do I need to learn Hadoop to be a Data Scientist?

An aspiring data scientist here. I don't know anything about Hadoop, but as I have been reading about Data Science and Big Data, I see a lot of talk about Hadoop. Is it absolutely necessary to learn Hadoop to be a Data Scientist?

bigdata apache-hadoop

asked Jun 10 '14 at 06:20

Pensu

591
1
4
8

40

votes

6 answers

Are there free cloud services to train machine learning models?

I want to train a deep model with a large amount of training data, but my desktop does not have enough power to train such a deep model on so much data. I'd like to know whether there are any free cloud services that can be used for training…

machine-learning neural-network deep-learning cloud-computing

asked Nov 03 '17 at 12:41

Green Falcon

13,868
9
55
98

40

votes

4 answers

Why do we need XGBoost and Random Forest?

I wasn't clear on a couple of concepts: XGBoost converts weak learners to strong learners. What's the advantage of doing this? Combining many weak learners instead of just using a single tree? Random Forest uses various samples from tree to create…

machine-learning data-mining random-forest decision-trees xgboost

asked Oct 14 '17 at 12:33

John Constantine

697
2
8
10

40

votes

1 answer

How to decide neural network architecture?

I was wondering how we should decide how many nodes to put in the hidden layers, and how many hidden layers to use, when we build a neural network architecture. I understand the input and output layers depend on the training set that we have, but how do we…

machine-learning neural-network

asked Jul 06 '17 at 19:05

user7677413

515
1
4
5

40

votes

6 answers

Unbalanced multiclass data with XGBoost

I have 3 classes with this distribution: Class 0: 0.1169, Class 1: 0.7668, Class 2: 0.1163. I am using xgboost for classification. I know that there is a parameter called scale_pos_weight. But how is it handled for the 'multiclass' case, and how can…

classification xgboost multiclass-classification class-imbalance

asked Jan 16 '17 at 12:53

shda

565
1
5
10
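
One commonly suggested workaround for this kind of imbalance is to give each training row a weight inversely proportional to its class frequency. The sketch below only computes such weights from the distribution quoted in the question; the helper name and the `labels` list are hypothetical, and passing the result to a `sample_weight` argument when fitting is an assumption to check against the XGBoost documentation for your version.

```python
def inverse_frequency_weights(class_freqs):
    """Map each class to 1 / (n_classes * frequency).

    With this scaling the weights average to 1 over the whole data set,
    so rare classes are up-weighted and common ones down-weighted.
    """
    k = len(class_freqs)
    return {label: 1.0 / (k * freq) for label, freq in class_freqs.items()}


# Class distribution quoted in the question above.
class_freqs = {0: 0.1169, 1: 0.7668, 2: 0.1163}
class_weights = inverse_frequency_weights(class_freqs)

# Hypothetical label column; each row gets the weight of its class.
labels = [1, 1, 0, 2, 1]
sample_weight = [class_weights[y] for y in labels]
print(class_weights)
```

Whether inverse-frequency weighting actually helps depends on the evaluation metric, so it is worth comparing against an unweighted baseline.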

40

votes

1 answer

The difference between `Dense` and `TimeDistributedDense` of `Keras`

I am still confused about the difference between Dense and TimeDistributedDense of Keras, even though some similar questions have already been asked here and here. People have discussed it a lot, but there is no commonly agreed conclusion. And even though, here,…

machine-learning neural-network keras

asked Mar 22 '16 at 20:04

fluency03

503
1
5
8

40

votes

4 answers

Guidelines for selecting an optimizer for training neural networks

I have been using neural networks for a while now. However, one thing that I constantly struggle with is the selection of an optimizer for training the network (using backprop). What I usually do is just start with one (e.g. standard SGD) and then…

neural-network optimization backpropagation

asked Mar 04 '16 at 09:32

mplappert

501
1
4
4

39

votes

5 answers

When to use Random Forest over SVM and vice versa?

When would one use Random Forest over SVM and vice versa? I understand that cross-validation and model comparison are important aspects of choosing a model, but here I would like to learn more about rules of thumb and heuristics of the two…

machine-learning classification random-forest svm

asked Aug 20 '15 at 04:16

Rohit

545
1
4
7

39

votes

5 answers

What are some standard ways of computing the distance between documents?

When I say "document", I have in mind web pages like Wikipedia articles and news stories. I prefer answers giving either vanilla lexical distance metrics or state-of-the-art semantic distance metrics, with stronger preference for the latter.

machine-learning data-mining nlp text-mining similarity

asked Jul 05 '14 at 16:10

Matt

811
1
7
12
