Azure ML or Azure Data Factory for sampling/data asset pipeline

Asked Jul 11 '23 at 16:52

Active Jul 11 '23 at 16:52

Viewed 12 times

We currently store images and text in AWS S3, and a small percentage of the data comes with annotations. Every week we need to remove most of the annotated data and keep only the data that is relevant for further model training. We currently do this with Airflow and sampling techniques.

We want to migrate fully to the cloud, so we are rewriting our ML pipelines for the Azure stack, using Azure ML and storing the data in Azure Blob Storage.

We want to include a sampling pipeline that downloads the data from Azure Blob Storage, applies some sampling techniques, moves (or copies and deletes) the data to a new location in Azure Blob Storage, and creates new data assets.
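To make the question concrete, the decision part of the sampling step could be sketched roughly as below. This is only an illustration under assumptions: `Item`, `plan_sampling`, and `keep_fraction` are hypothetical names, seeded random sampling stands in for the real relevance-based techniques, and it assumes unannotated data is always kept. The actual download, copy, and delete calls against Blob Storage are left out on purpose.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """One stored object; `name` stands in for a blob path (hypothetical layout)."""
    name: str
    annotated: bool


def plan_sampling(items, keep_fraction, seed=0):
    """Split items into (keep, discard) lists.

    Assumption: unannotated items are always kept, and a seeded random
    subset of the annotated items is kept. Random sampling is only a
    placeholder for whatever technique the real pipeline uses.
    """
    if not 0.0 <= keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be between 0 and 1")

    # Sort annotated items by name so the result does not depend on listing order.
    annotated = sorted((i for i in items if i.annotated), key=lambda i: i.name)
    unannotated = [i for i in items if not i.annotated]

    n_keep = round(len(annotated) * keep_fraction)
    rng = random.Random(seed)  # fixed seed keeps weekly runs reproducible
    kept = set(rng.sample(annotated, n_keep))

    keep = unannotated + [i for i in annotated if i in kept]
    discard = [i for i in annotated if i not in kept]
    return keep, discard


if __name__ == "__main__":
    # Made-up listing: every fourth object carries an annotation.
    items = [Item(f"images/{n:03d}.jpg", annotated=(n % 4 == 0)) for n in range(20)]
    keep, discard = plan_sampling(items, keep_fraction=0.4)
    print(len(keep), "to keep,", len(discard), "to delete")
```

The `keep` list would then be copied to the new location and registered as a data asset, and the `discard` list deleted. That storage-heavy part is what we are unsure where to host.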

Would it make sense to implement the sampling pipeline as an Azure ML pipeline, or rather as an Azure Data Factory pipeline? Are there any advantages or disadvantages to either option?

carak

0 Answers