Questions tagged [aws]

Amazon Web Services (AWS) is a popular cloud computing service offered by Amazon.

It offers clients access to computing power, data storage, and content delivery, among other services. AWS is well known for its flexibility and scalability, which makes it popular among large companies such as Facebook, Netflix, and ESPN.

64 questions
16
votes
5 answers

Downloading a large dataset on the web directly into AWS S3

Does anyone know if it's possible to import a large dataset into Amazon S3 from a URL? Basically, I want to avoid downloading a huge file and then reuploading it to S3 through the web portal. I just want to supply the download URL to S3 and wait…
Will Stedden
  • 163
  • 1
  • 1
  • 5
12
votes
3 answers

Does Amazon RedShift replace Hadoop for ~1XTB data?

There is plenty of hype surrounding Hadoop and its eco-system. However, in practice, where many data sets are in the terabyte range, is it not more reasonable to use Amazon RedShift for querying large data sets, rather than spending time and effort…
trienism
  • 253
  • 2
  • 9
12
votes
3 answers

Instances vs. cores when using EC2

Working on what could often be called "medium data" projects, I've been able to parallelize my code (mostly for modeling and prediction in Python) on a single system across anywhere from 4 to 32 cores. Now I'm looking at scaling up to clusters on…
Therriault
  • 871
  • 1
  • 8
  • 13
7
votes
1 answer

Does AWS Sagemaker continue running when I go offline?

I'm pulling data from web services in an AWS Sagemaker Notebook. I'd like to be able to close my computer for an hour, reconnect my computer and see that the data collection has continued to run. Additionally, I'd like my notebook's kernel…
the_cat_lady
  • 271
  • 2
  • 8
6
votes
2 answers

Processing data stored in Redshift

We're currently using Redshift as our data warehouse, which we're very happy with. However, we now have a requirement to do machine learning against the data in our warehouse. Given the volume of data involved, ideally I'd want to run the…
deanj
  • 63
  • 4
5
votes
3 answers

What technologies are fastest at performing joins on large datasets?

By "large", I mean in the range of 100m to 10b rows. I'm currently using both Hadoop MapReduce and Amazon RedShift. MapReduce has been a little disappointing here. Redshift works very well if the data is distributed well for the given query. Are…
Bill
  • 51
  • 1
4
votes
2 answers

AWS machine learning prediction schema problems

I've trained an AWS Machine Learning model with the training data from here : https://www.kaggle.com/c/titanic/data I'm now trying to run a batch prediction with the test data from the same source but I get the following error when I try to load the…
Monty
  • 41
  • 1
4
votes
2 answers

Keras neural network entirely different results on two different hardwares (one is AWS)

I have a Keras model, which is defined as follows: nn_model = Sequential() nn_model.add(Dense(300, activation="relu",input_shape=(4,))) # because we have 4 features nn_model.add(Dropout(0.3)) nn_model.add(Dense(150,…
Osama Dar
  • 589
  • 2
  • 8
  • 19
4
votes
2 answers

How do I get the name of Sagemaker Estimator's job

I'm having a stumbling block with SageMaker. How do I know what my job name is? For example: mnist_estimator = MXNet(entry_point='/home/ec2-user/sample-notebooks/sagemaker-python-sdk/mxnet_mnist/mnist.py', role=role, …
Paul Cezanne
  • 211
  • 2
  • 7
3
votes
2 answers

upload model to S3

I'm using AWS SageMaker to build my model. I want to store the model in S3 for later use. How do you save your model in S3 with Amazon SageMaker? I know this seems trivial, but I didn't understand the sources/documentation I've read.
Laurent
  • 53
  • 1
  • 4
3
votes
1 answer

AWS : Workflow for deep learning

I am using my company computer since I don't have another one or Linux. Therefore, I am starting to use cloud resources to perform some tasks. I have a very simple question: Since most cloud resources don't have a GUI, how can I perform simple…
FenryrMKIII
  • 143
  • 8
3
votes
2 answers

sklearn and pandas in AWS Lambda

I made a front end from which I would like to make REST calls to an AWS Lambda function exposed through AWS API Gateway. I dumped my model (and my encoders) as pickle files, which I initially trained locally. I then stored these files in an S3 bucket. The…
3nomis
  • 531
  • 6
  • 17
3
votes
1 answer

Why is Amazon P2 instance cost much more than G3?

I'm currently working out the 2018 budget for AWS instances in preparation for more deep neural net training, but I'm a bit confused by the chart below: p2.16xlarge costs 3 times more than g3.16xlarge with the same…
YJ Kim
  • 39
  • 1
  • 3
3
votes
1 answer

Using AWS ML to recommend products

I have millions of user ratings on about 2k products. I want to use machine learning to analyse these ratings and recommend products to users based on other users' ratings of the same and different products, i.e. user A likes 1, 2 & 3, user B likes…
Simon
  • 31
  • 1
2
votes
1 answer

How to include lifecycle configuration in Sagemaker studio's user notebook

I want to use a lifecycle configuration in SageMaker Studio so that it runs when a user's notebook starts. My lifecycle configuration will have a shell script that launches a cron job with a Python script to send…