I am very new to data science, but I have a use case that I want to solve.
I want to build a data synchronization scheduler that tracks how much data is synced on each scheduled run and automatically adjusts the next schedule.
For example:
Suppose I have 3 jobs to execute. Currently each of them runs at a fixed interval of, say, 5 minutes, but I want that interval to be set automatically.
- At 10 AM, job 1 ran and fetched 10 entries.
- At 10 AM, job 2 ran and fetched 100 entries.
- At 10 AM, job 3 ran and fetched 200 entries.
In this scenario, job 1 received much less data than jobs 2 and 3. The auto-scheduler should then adjust the intervals and recommend the next executions at:
- Job 1: maybe a 10 min interval
- Job 2: maybe a 5 min interval
- Job 3: maybe a 2 min interval
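To make the behaviour I have in mind more concrete, here is a minimal sketch of a simple threshold rule, not an ML approach. The function name `next_interval`, the thresholds `low` and `high`, and the interval bounds are all hypothetical values I picked only so that the sample numbers above come out roughly as described.

```python
def next_interval(current_min, entries, low=50, high=150,
                  min_min=1.0, max_min=60.0):
    """Return a recommended interval in minutes for the next run.

    Assumed rule (illustrative only): double the interval when a run
    fetched fewer than `low` entries, halve it when the run fetched more
    than `high`, and keep it unchanged otherwise.
    """
    if entries < low:
        proposed = current_min * 2
    elif entries > high:
        proposed = current_min / 2
    else:
        proposed = current_min
    # Clamp so that a job is never polled too aggressively or too rarely.
    return max(min_min, min(max_min, proposed))


# Entries fetched by each job in the 10 AM run from the example.
last_run = {"job 1": 10, "job 2": 100, "job 3": 200}
recommended = {job: next_interval(5.0, n) for job, n in last_run.items()}
print(recommended)  # {'job 1': 10.0, 'job 2': 5.0, 'job 3': 2.5}
```

A rule like this only reacts to the most recent run, so it does not capture any time-of-day pattern.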
The scheduler should also learn from time-based historical data. For instance, job 3 might receive more data at 10 AM but less at 1 PM, when job 1 might receive more. The scheduler would then automatically give job 1 a shorter interval than job 3 around 1 PM.
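As a rough sketch of the time-based part, and assuming that an exponentially weighted moving average per job and per hour of day is an acceptable starting point (I am not sure it is the right algorithm), something like the following is what I imagine. The class `HourlyRateModel`, the smoothing factor `alpha`, and the `target_entries` value are hypothetical.

```python
class HourlyRateModel:
    """Keeps a smoothed arrival rate (entries per minute) per (job, hour)."""

    def __init__(self, alpha=0.3):
        self.alpha = alpha  # weight given to the newest observation
        self.rate = {}      # (job, hour) -> smoothed entries per minute

    def update(self, job, hour, entries, interval_min):
        """Record that `job` fetched `entries` over the last `interval_min` minutes."""
        observed = entries / interval_min
        key = (job, hour)
        if key not in self.rate:
            self.rate[key] = observed
        else:
            # Exponentially weighted moving average of the observed rate.
            self.rate[key] = (self.alpha * observed
                              + (1 - self.alpha) * self.rate[key])

    def next_interval(self, job, hour, target_entries=100,
                      default_min=5.0, min_min=1.0, max_min=60.0):
        """Pick an interval so that a run fetches about `target_entries`."""
        rate = self.rate.get((job, hour))
        if rate is None:
            return default_min  # no history yet for this job and hour
        if rate == 0:
            return max_min      # nothing has been arriving, poll rarely
        return max(min_min, min(max_min, target_entries / rate))


model = HourlyRateModel()
# Hypothetical history: job 3 is busy at 10 AM, job 1 is busy at 1 PM.
model.update("job 1", 10, entries=10, interval_min=5)
model.update("job 3", 10, entries=200, interval_min=5)
model.update("job 1", 13, entries=200, interval_min=5)
model.update("job 3", 13, entries=10, interval_min=5)

for hour in (10, 13):
    for job in ("job 1", "job 3"):
        print(hour, job, model.next_interval(job, hour))
```

With this made-up history, job 3 gets the shorter interval at 10 AM and job 1 gets the shorter interval at 1 PM, which is the behaviour described above.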
Can you suggest an algorithm that I could follow for this case? Any guidance on how to approach this with ML would also help me a lot.