Questions tagged [best-practice]

5 questions
3
votes
3 answers

A good way to organize/store a lot of datasets

In machine translation, we often have bilingual datasets, e.g. for German-English and French-English we will have something that looks like this: /en-de train.de train.en dev.de dev.en test.de test.en /en-fr train.fr …
alvas
0
votes
0 answers

What are best practices for MLOps?

What are the best practices or design patterns for structuring data science projects and MLOps architecture in small teams? 1. Context and Background: I work in a small data science team (<5). We exclusively develop predictive analytics solutions.…
bayes2021
0
votes
2 answers

Data Science Project Data Workflow Structure

I'm in the middle of a marketing project on sales prediction with promotions. The client has very complex business processes, so the data needs a lot of preprocessing (joins, filters, etc.). I have organized the code in different…
ru.mp
0
votes
0 answers

Ensuring independence of data in real life practice - meaning of Type I error

Suppose these two cases: a. I have access to 100 examples. I then have access to another batch of 100 examples. b. I have access to 200 examples, all at once. I am reading that in the first case, I could do a t-test to check for Type I error, by comparing…
0
votes
0 answers

Best way to store "small" data

I have a small dataset consisting of around 200 measurements of different light sources. The light is measured with a broadband and an IR diode, and the resulting voltage is sampled at 4 megasamples/second. All the measurements have a length of…