Questions tagged [best-practice]
5 questions
3 votes, 3 answers
A good way to organize/store a lot of datasets
In machine translation, we often have bilingual datasets, e.g. for German-English and French-English we will have something that looks like this:
/en-de
    train.de
    train.en
    dev.de
    dev.en
    test.de
    test.en
/en-fr
    train.fr
    …
alvas
0 votes, 0 answers
What are best practices for MLOps?
What are the best practices or design patterns for structuring data science projects and MLOps architecture in small teams?
1. Context and Background:
I work in a small data science team (<5). We exclusively develop predictive analytics solutions.…
bayes2021
0 votes, 2 answers
Data Science Project Data Workflow Structure
I'm in the middle of a marketing project on sales prediction with promotions. The client has very complex business processes, so the data needs a lot of preprocessing (joins, filters, etc.). I have organized the code in different…
ru.mp
0 votes, 0 answers
Ensuring independence of data in real life practice - meaning of Type I error
Suppose these two cases:
a. I have access to 100 examples. I then have access to another batch of 100 examples.
b. I have access to 200 examples, all at once.
I am reading that in the first case, I could do a t-test to check the Type I error, by comparing…
user305883
0 votes, 0 answers
Best way to store "small" data
I have a small dataset consisting of around 200 measurements of different light sources. The light is measured with a broadband and an IR diode, and the resulting voltage is sampled at 4 megasamples/second. All the measurements have a length of…
ilja