
We currently store images and text in AWS S3, and a small percentage of the data comes with annotations. Every week we need to remove most of the annotated data and keep only the data that is relevant for further model training. We currently do this with Airflow and sampling techniques.

We want to migrate fully to the cloud, and we are rewriting our ML pipelines for the Azure stack, using Azure ML and storing data in Azure Blob Storage.

We want to include a sampling pipeline, which downloads the data from Azure Blob Storage, applies some sampling techniques, moves or copies/deletes data to a new location in Azure Blob Storage, and creates new data assets.
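To make the sampling step concrete, here is a rough sketch of what I have in mind. It is not our real code: a plain dict stands in for a blob container, uniform random sampling stands in for our actual sampling techniques, and the names `plan_sampling`, `apply_plan`, `raw/` and `sampled/` are made up for illustration. In the real pipeline the dict operations would be replaced by calls to the Azure storage SDK, and the returned list of names is what would be registered as a new data asset.

```python
import random
from typing import Dict, List, Set, Tuple


def plan_sampling(
    blob_names: List[str],
    annotated: Set[str],
    keep_fraction: float,
    seed: int = 0,
) -> Tuple[List[str], List[str]]:
    """Split the annotated blobs into (keep, drop) lists.

    Uniform random sampling is only a placeholder for the real technique.
    Blobs without annotations are not part of either list.
    """
    candidates = sorted(name for name in blob_names if name in annotated)
    rng = random.Random(seed)  # seeded so a weekly run is reproducible
    n_keep = round(len(candidates) * keep_fraction)
    chosen = set(rng.sample(candidates, n_keep))
    keep = [name for name in candidates if name in chosen]
    drop = [name for name in candidates if name not in chosen]
    return keep, drop


def apply_plan(
    store: Dict[str, bytes],
    keep: List[str],
    drop: List[str],
    src_prefix: str,
    dst_prefix: str,
) -> List[str]:
    """Copy kept blobs under dst_prefix and delete dropped blobs.

    `store` is a hypothetical in-memory stand-in for a blob container.
    Returns the new blob names, i.e. the contents of the new data asset.
    """
    new_names = []
    for name in keep:
        if not name.startswith(src_prefix):
            raise ValueError(f"{name!r} is not under {src_prefix!r}")
        new_name = dst_prefix + name[len(src_prefix):]
        store[new_name] = store[name]  # copy to the new location
        new_names.append(new_name)
    for name in drop:
        del store[name]  # remove annotated data that was not sampled
    return new_names
```

Whichever service runs it, the logic itself is just ordinary Python like this, with the storage access around it being the part that differs.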

Would it make sense to implement the sampling pipeline in Azure ML pipelines, or rather in Azure Data Factory pipelines? Are there any advantages or disadvantages to using either of them?

carak
  • 1
