Questions tagged [beginner]

For questions about getting started in Data Science or any of its related subdomains.

Some good beginner resources for getting started in Data Science are:

  1. Data Science wiki of Quora

Some good beginner resources for getting started in Machine Learning are:

  1. Professor Andrew Ng's course on Machine Learning
149 questions
62
votes
5 answers

RNN vs CNN at a high level

I've been thinking about the Recurrent Neural Networks (RNN) and their varieties and Convolutional Neural Networks (CNN) and their varieties. Would these two points be fair to say: Use CNNs to break a component (such as an image) into subcomponents…
Larry Freeman
  • 745
  • 1
  • 6
  • 8
57
votes
8 answers

Why do internet companies prefer Java/Python for data scientist jobs?

I often see job descriptions for data scientists that ask for Python/Java experience and disregard R. Below is a personal email I received from the chief data scientist of a company I applied to through LinkedIn. X, Thanks for connecting and…
StatguyUser
  • 865
  • 1
  • 8
  • 20
27
votes
2 answers

How to deal with time series which change in seasonality or other patterns?

Background: I'm working on a time series data set of energy meter readings. The length of the series varies by meter - for some I have several years, others only a few months, etc. Many display significant seasonality, and often multiple layers -…
Jo Douglass
  • 401
  • 1
  • 5
  • 10
24
votes
3 answers

Keyword/phrase extraction from Text using Deep Learning libraries

Perhaps this is too broad, but I am looking for references on how to use deep learning in a text summarization task. I have already implemented text summarization using standard word-frequency approaches and sentence-ranking, but I'd like to explore…
21
votes
2 answers

Data science without knowledge of a specific topic, is it worth pursuing as a career?

I had a conversation with someone recently and mentioned my interest in data analysis and how I intended to learn the necessary skills and tools. They suggested to me that, while it is great to learn the tools and build the skills, there is little…
user3754366
  • 333
  • 1
  • 2
  • 4
19
votes
5 answers

Open source data science projects to contribute to

Contributing to open source projects is typically a good way for newbies to get some practice, and for experienced data scientists and analysts to try a new area. Which projects do you contribute to? Please provide a short intro and a link on GitHub.
IgorS
  • 5,444
  • 11
  • 31
  • 43
16
votes
3 answers

How to self-learn data science?

I am a self-taught web developer and am interested in teaching myself data science, but I'm unsure of how to begin. In particular, I'm wondering: What fields are there within data science? (e.g., Artificial Intelligence, machine learning, data…
xyhhx
  • 263
  • 2
  • 6
16
votes
1 answer

Do you have to normalize data when building decision trees using R?

So, our data set this week has 14 attributes and each column has very different values. One column has values below 1 while another column has values that go from three to four whole digits. We learned normalization last week and it seems like…
Jae
  • 163
  • 1
  • 1
  • 8
13
votes
8 answers

I am a programmer, how do I get into the field of Data Science?

First of all, this term sounds so obscure. Anyway, I am a software programmer. One of the languages I can code in is Python. Speaking of data, I can use SQL and can do data scraping. What I figured out so far after reading so many articles that Data…
Volatil3
  • 341
  • 3
  • 10
13
votes
3 answers

Unstructured text classification

I'm going to classify unstructured text documents, namely web sites of unknown structure. The number of classes into which I am classifying is limited (at this point, I believe there are no more than three). Does anyone have a suggestion for how I might…
12
votes
7 answers

How to apply class weight to a multi-output model?

I have a model with 2 categorical outputs. The first output layer can predict 2 classes: [0, 1] and the second output layer can predict 3 classes: [0, 1, 2]. How can I apply different class weight dictionaries for each of the outputs? For example,…
12
votes
3 answers

Which supervised learning algorithms are available for matching?

I'm working at a non-profit where we try to help potential university applicants by matching them with alumni who want to share their experience/wisdom and, at the moment, it is happening manually. So I'll have two tables, one with students and one…
11
votes
3 answers

What is a policy in machine learning?

While I was reading the paper "Grounded Action Transformation for Robot Learning in Simulation", I came across the term "policy". Could someone explain to me what that actually is (in general and in the particular context of the paper)?
Ramya Raj
  • 111
  • 1
  • 4
10
votes
4 answers

Online machine learning tutorial

Does anyone know some good tutorials on online machine learning techniques? I.e. how it can be used in real-time environments, what the key differences are compared to normal machine learning methods, etc. UPD: Thank you everyone for the answers, by "online" I…
Igor Bobriakov
  • 1,071
  • 2
  • 9
  • 11
10
votes
4 answers

What initial steps should I use to make sense of large data sets, and what tools should I use?

Caveat: I am a complete beginner when it comes to machine learning, but eager to learn. I have a large dataset and I'm trying to find patterns in it. There may or may not be correlations across the data, either with known variables, or variables that…
user3791372
  • 398
  • 2
  • 14