Questions tagged [deep-learning]

A new area of Machine Learning research concerned with learning hierarchical representations of data, mainly using deep neural networks (i.e., networks with two or more hidden layers), but also certain probabilistic graphical models.

What is Deep Learning?

Deep Learning is an area of machine learning that attempts to learn complex functions using architectures composed of many layers (hence the term "deep").

Deep architectures can learn more complex tasks because each additional layer applies a further transformation to its input, and the stack of layers lets a hierarchical organization of functionality emerge: later layers compose the simpler features learned by earlier ones.

Deep Learning was introduced into machine learning research with the intention of moving machine learning closer to artificial intelligence. A significant impact of deep learning lies in feature learning: it removes much of the manual feature-engineering effort required by shallower approaches.
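
As a concrete (if simplified) illustration, here is a minimal sketch of such a layered architecture in Keras; the layer sizes and the input shape are assumptions for illustration only, not a recommended design:

    # A minimal sketch of a "deep" network: several stacked hidden layers.
    # All sizes here are illustrative.
    from tensorflow import keras
    from tensorflow.keras import layers

    model = keras.Sequential([
        keras.Input(shape=(784,)),               # e.g. flattened 28x28 images
        layers.Dense(256, activation="relu"),    # early layers: low-level features
        layers.Dense(128, activation="relu"),    # deeper layers compose earlier ones
        layers.Dense(64, activation="relu"),
        layers.Dense(10, activation="softmax"),  # output: class probabilities
    ])
    model.compile(optimizer="adam", loss="categorical_crossentropy")
    model.summary()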


New to Deep Learning?

There are a variety of resources including books, tutorials/workshops, etc. for those looking to learn more about Deep Learning.

A popular introductory tutorial is the SciPy 2020 Conference Tutorial.

Some popular introductory books include Deep Learning (I. Goodfellow, A. Courville, and Y. Bengio; MIT Press, 2016).



4825 questions
256 votes · 10 answers

How to set class weights for imbalanced classes in Keras?

I know that there is a possibility in Keras with the class_weights parameter dictionary at fitting, but I couldn't find any example. Would somebody be so kind as to provide one? By the way, in this case the appropriate practice is simply to weight up the…
Hendrik · 8,377
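
Since the question asks for an example, here is a minimal sketch of passing class_weight to Keras's fit; the data and the weight values are made up, with the rare class up-weighted roughly by inverse class frequency:

    import numpy as np
    from tensorflow import keras
    from tensorflow.keras import layers

    # Made-up imbalanced binary data: roughly 10% positives.
    x = np.random.rand(1000, 20)
    y = (np.random.rand(1000) < 0.1).astype(int)

    model = keras.Sequential([
        keras.Input(shape=(20,)),
        layers.Dense(16, activation="relu"),
        layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy")

    # class_weight maps class index -> weight; errors on the minority
    # class now contribute ~9x more to the loss.
    model.fit(x, y, epochs=3, class_weight={0: 1.0, 1: 9.0})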
193 votes · 6 answers

How to draw Deep learning network architecture diagrams?

I have built my model. Now I want to draw the network architecture diagram for my research paper. An example is shown below…
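
One commonly suggested option (among many) is Keras's built-in plot_model utility, which renders the layer graph to an image file; this sketch assumes the pydot package and the Graphviz binaries are installed:

    from tensorflow import keras
    from tensorflow.keras import layers

    model = keras.Sequential([
        keras.Input(shape=(32, 32, 3)),
        layers.Conv2D(16, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(10, activation="softmax"),
    ])

    # Writes a box-and-arrow diagram of the layer graph to model.png.
    keras.utils.plot_model(model, to_file="model.png",
                           show_shapes=True, show_layer_names=True)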
188 votes · 5 answers

What is the "dying ReLU" problem in neural networks?

Referring to the Stanford course notes on Convolutional Neural Networks for Visual Recognition, a paragraph says: "Unfortunately, ReLU units can be fragile during training and can 'die'. For example, a large gradient flowing through a ReLU…
tejaskhot · 3,935
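
A small numerical sketch of the phenomenon (values are made up): once a unit's pre-activation is negative for every input, ReLU outputs zero and its gradient is zero, so gradient descent can never revive it; a leaky variant keeps a small gradient alive:

    import numpy as np

    def relu(z):
        return np.maximum(0.0, z)

    def relu_grad(z):
        return (z > 0).astype(float)        # exactly 0 for z <= 0

    def leaky_relu_grad(z, alpha=0.01):
        return np.where(z > 0, 1.0, alpha)  # small but nonzero for z <= 0

    # Suppose a large update drove this unit's pre-activations negative
    # for every input in the dataset:
    z = np.array([-3.2, -1.5, -0.7, -2.1])

    print(relu(z))             # [0. 0. 0. 0.] -> the unit always outputs 0
    print(relu_grad(z))        # [0. 0. 0. 0.] -> no gradient, no recovery: "dead"
    print(leaky_relu_grad(z))  # [0.01 0.01 0.01 0.01] -> can still learn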
176 votes · 6 answers

When to use GRU over LSTM?

The key difference between a GRU and an LSTM is that a GRU has two gates (reset and update gates) whereas an LSTM has three gates (namely input, output and forget gates). Why do we make use of a GRU when we clearly have more control over the network…
Sayali Sonawane · 2,001
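
One concrete consequence of the gate counts, verifiable directly in Keras (the input and hidden sizes below are arbitrary): for the same hidden size, a GRU layer has fewer parameters than an LSTM layer:

    from tensorflow import keras
    from tensorflow.keras import layers

    # Same input size (32) and hidden size (64) for both layers.
    lstm = keras.Sequential([keras.Input(shape=(None, 32)), layers.LSTM(64)])
    gru  = keras.Sequential([keras.Input(shape=(None, 32)), layers.GRU(64)])

    # LSTM has 4 weight blocks (3 gates + cell candidate); GRU has 3
    # (2 gates + candidate state), so it prints a smaller count. Exact
    # numbers depend on the Keras version's bias conventions.
    print(lstm.count_params())
    print(gru.count_params())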
174 votes · 20 answers

How do you visualize neural network architectures?

When writing a paper or making a presentation about a topic involving neural networks, one usually visualizes the network's architecture. What are good / simple ways to visualize common architectures automatically?
Martin Thoma · 18,630
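
For a quick text-only view (one of many options; the model below is an arbitrary example), Keras's summary() prints each layer with its output shape and parameter count automatically:

    from tensorflow import keras
    from tensorflow.keras import layers

    model = keras.Sequential([
        keras.Input(shape=(28, 28, 1)),
        layers.Conv2D(8, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(10, activation="softmax"),
    ])

    # Prints a table of layers, output shapes, and parameter counts --
    # a simple automatic "visualization" for papers and debugging.
    model.summary()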
114 votes · 10 answers

Choosing a learning rate

I'm currently working on implementing Stochastic Gradient Descent (SGD) for neural nets using back-propagation, and while I understand its purpose, I have some questions about how to choose values for the learning rate. Is the learning rate related…
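
A toy illustration of why the value matters (the function and rates are arbitrary): gradient descent on f(w) = w² converges for a small learning rate but diverges for a large one:

    # Gradient descent on f(w) = w^2, whose gradient is 2w.
    def descend(lr, steps=10, w=1.0):
        for _ in range(steps):
            w -= lr * 2 * w   # w <- w - lr * f'(w)
        return w

    print(descend(lr=0.1))  # ~0.107: each step shrinks w toward the minimum at 0
    print(descend(lr=1.1))  # ~6.19 and growing: each step overshoots, |w| increases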
86 votes · 1 answer

When to use (He or Glorot) normal initialization over uniform init? And what are its effects with Batch Normalization?

I knew that Residual Network (ResNet) made He normal initialization popular. In ResNet, He normal initialization is used, while the first layer uses He uniform initialization. I've looked through the ResNet paper and "Delving Deep into Rectifiers"…
Rizky Luthfianto · 2,176
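
For reference, a sketch of how either initializer family can be selected per layer in Keras (layer sizes are arbitrary):

    from tensorflow import keras
    from tensorflow.keras import layers

    model = keras.Sequential([
        keras.Input(shape=(128,)),
        # He initializers are scaled for ReLU-family activations...
        layers.Dense(64, activation="relu", kernel_initializer="he_normal"),
        layers.Dense(64, activation="relu", kernel_initializer="he_uniform"),
        # ...while Glorot (Xavier) is Keras's default, suited to tanh/sigmoid.
        layers.Dense(10, activation="softmax", kernel_initializer="glorot_uniform"),
    ])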
86 votes · 8 answers

Time series prediction using ARIMA vs LSTM

The problem that I am dealing with is predicting time series values. I am looking at one time series at a time, and based on, for example, 15% of the input data, I would like to predict its future values. So far I have come across two models: LSTM…
ahajib · 1,075
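
For orientation, a minimal sketch of the classical side, assuming the statsmodels package; the series is synthetic and the order (p, d, q) = (2, 1, 2) is an arbitrary illustrative choice:

    import numpy as np
    from statsmodels.tsa.arima.model import ARIMA

    series = np.cumsum(np.random.randn(200))  # synthetic random-walk-like series

    fit = ARIMA(series, order=(2, 1, 2)).fit()
    print(fit.forecast(steps=10))  # next 10 predicted values

    # An LSTM approach would instead frame this as supervised learning:
    # slide a window over the series and learn window -> next value.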
80 votes · 5 answers

What is the difference between "equivariant to translation" and "invariant to translation"?

I'm having trouble understanding the difference between equivariant to translation and invariant to translation. In the book Deep Learning (I. Goodfellow, A. Courville, and Y. Bengio; MIT Press, 2016), one can find on the convolutional…
Aamir · 963
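
The distinction can be stated in one line each: f is equivariant to a translation T when f(T(x)) = T(f(x)), and invariant when f(T(x)) = f(x). A tiny numpy check illustrates both (the signal is made up; the convolution is circular so the property holds exactly at the boundaries):

    import numpy as np

    x = np.array([0., 1., 3., 1., 0.])
    T = lambda a: np.roll(a, 1)  # the translation: shift everything one step

    def f(a):
        # f: circular convolution with kernel [1, -1]
        # (wrap-around avoids boundary effects)
        return a - np.roll(a, 1)

    g = lambda a: a.max()        # g: global max pooling

    # Equivariance: translating the input translates the output the same way.
    print(np.allclose(f(T(x)), T(f(x))))  # True: f(T(x)) == T(f(x))

    # Invariance: the pooled value is unchanged by the translation.
    print(g(T(x)) == g(x))                # True: g(T(x)) == g(x)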
75 votes · 6 answers

What is the difference between Gradient Descent and Stochastic Gradient Descent?

What is the difference between Gradient Descent and Stochastic Gradient Descent? I am not very familiar with these; can you describe the difference with a short example?
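
Since the question asks for a short example, here is a sketch on linear regression with made-up data: batch gradient descent computes one update per pass using all samples, while SGD updates from one randomly drawn sample at a time:

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 3))
    true_w = np.array([1.0, -2.0, 0.5])
    y = X @ true_w + 0.01 * rng.normal(size=100)

    w_gd, w_sgd = np.zeros(3), np.zeros(3)

    # Batch gradient descent: one update per pass, using ALL samples.
    for _ in range(100):
        grad = 2 * X.T @ (X @ w_gd - y) / len(y)
        w_gd -= 0.1 * grad

    # Stochastic gradient descent: one update per randomly drawn sample.
    for _ in range(100):
        for i in rng.permutation(len(y)):
            grad_i = 2 * X[i] * (X[i] @ w_sgd - y[i])
            w_sgd -= 0.01 * grad_i

    print(w_gd)   # both approach true_w; SGD's path is noisier per step
    print(w_sgd)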
73 votes · 6 answers

Cross-entropy loss explanation

Suppose I build a neural network for classification. The last layer is a dense layer with Softmax activation. I have five different classes to classify. Suppose for a single training example, the true label is [1 0 0 0 0] while the predictions be…
enterML · 3,011
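
The computation for a case like the one in the question can be done by hand: with a one-hot true label, cross-entropy reduces to minus the log of the probability assigned to the true class (the prediction vector below is made up):

    import numpy as np

    y_true = np.array([1, 0, 0, 0, 0])              # one-hot label for class 0
    y_pred = np.array([0.6, 0.2, 0.1, 0.05, 0.05])  # illustrative softmax output

    # General form: H(y, p) = -sum_i y_i * log(p_i)
    loss = -np.sum(y_true * np.log(y_pred))

    # With a one-hot label only the true class's term survives:
    print(loss)                # 0.5108...
    print(-np.log(y_pred[0]))  # same value: -log(0.6)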
68 votes · 5 answers

Adding Features To Time Series Model LSTM

I have been reading up a bit on LSTMs and their use for time series, and it's been interesting but difficult at the same time. One thing I have had difficulty understanding is the approach to adding additional features to what is already a list…
Rjay155 · 1,205
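
The key mechanical point, as a sketch with made-up shapes: Keras recurrent layers take input shaped (samples, timesteps, features), so extra per-timestep features simply become additional entries in the last axis:

    import numpy as np
    from tensorflow import keras
    from tensorflow.keras import layers

    # 500 windows of 10 timesteps; each timestep has 3 features, e.g. the
    # series value plus two extra covariates measured at the same time.
    X = np.random.rand(500, 10, 3)   # (samples, timesteps, features)
    y = np.random.rand(500)          # next value to predict

    model = keras.Sequential([
        keras.Input(shape=(10, 3)),  # timesteps=10, features=3
        layers.LSTM(32),
        layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse")
    model.fit(X, y, epochs=2, verbose=0)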
67 votes · 5 answers

In softmax classifier, why use exp function to do normalization?

Why use softmax as opposed to standard normalization? In the comment area of the top answer of this question, @Kilian Batzner raised two questions which also confuse me a lot. It seems no one gives an explanation except for the numerical benefits. I get the…
Hans · 773
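
A quick numeric contrast (the scores are made up): dividing raw scores by their sum can yield negative "probabilities", while exponentiating first keeps every output positive and makes the result depend only on score differences:

    import numpy as np

    z = np.array([2.0, 1.0, -1.0])

    # "Standard" normalization: not a valid probability distribution.
    print(z / z.sum())         # [ 1.   0.5 -0.5]

    def softmax(z):
        e = np.exp(z - z.max())  # subtract max for numerical stability
        return e / e.sum()

    print(softmax(z))          # [0.705 0.259 0.035] -- a valid distribution
    print(softmax(z + 100.0))  # identical: softmax depends only on differences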
67 votes · 4 answers

Why is mini-batch size better than one single "batch" with all training data?

I often read that in the case of Deep Learning models the usual practice is to apply mini-batches (generally small ones, 32/64) over several training epochs. I cannot really fathom the reason behind this. Unless I'm mistaken, the batch size is the…
Hendrik · 8,377
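
A sketch of the mechanics, with arbitrary shapes and batch size: with mini-batches the parameters are updated many times per epoch rather than once, and each update's gradient is a cheap, noisy estimate of the full-batch gradient:

    import numpy as np

    rng = np.random.default_rng(0)
    X, y = rng.normal(size=(1024, 5)), rng.normal(size=1024)
    w, lr, batch_size = np.zeros(5), 0.05, 32

    for epoch in range(3):
        idx = rng.permutation(len(y))              # new order each epoch
        for start in range(0, len(y), batch_size):
            b = idx[start:start + batch_size]      # one mini-batch of 32 rows
            grad = 2 * X[b].T @ (X[b] @ w - y[b]) / len(b)
            w -= lr * grad
    # 1024/32 = 32 parameter updates per epoch, vs. 1 with full-batch descent.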
66 votes · 11 answers

Why should the data be shuffled for machine learning tasks?

In machine learning tasks it is common to shuffle the data and normalize it. The purpose of normalization is clear (to bring feature values into the same range). But, after struggling a lot, I did not find any valuable reason for shuffling data. I have read…
Green Falcon · 13,868
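
One concrete reason, sketched with made-up data: if the data are ordered (say, by class), consecutive mini-batches are biased; a single permutation applied jointly to features and labels removes that ordering:

    import numpy as np

    rng = np.random.default_rng(0)
    X = np.arange(20).reshape(10, 2).astype(float)
    y = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])  # sorted by class: early
                                                  # batches would see only class 0

    perm = rng.permutation(len(y))  # one permutation, applied to BOTH arrays
    X_shuf, y_shuf = X[perm], y[perm]

    print(y_shuf)  # classes now interleaved, so every mini-batch mixes them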