Questions tagged [web-scraping]

32 questions
5
votes
1 answer

Scraping financial web data

I recently started working as a data scientist and I am starting a web scraping and NLP project using Python. The idea is to create a program that searches for public information on the company's clients. This information can come from various…
JamieA
  • 51
  • 1
2
votes
2 answers

Looking for a public dataset for the stock market

I want to do some modelling and data visualization on historical stock data, including price, volume, financials, etc. Is there a public dataset available for stock price history? I looked at a few, but either they have a high cost, or not sure…
Donald S
  • 1,889
  • 3
  • 7
  • 28
2
votes
1 answer

Getting error while scraping Amazon using Selenium and bs4

I'm working on a class project using BeautifulSoup and webdriver to scrape Disposable Diapers on Amazon for the name of the item, price, reviews, rating. My goal is to have something like this where I will split this info into different columns: …
cesco
  • 29
  • 1
  • 7
2
votes
0 answers

Is there a way to scrape tweets in realtime from a list of specified users?

I am trying to build a scraper that will run continuously and save the tweets from a list of users instantaneously or within seconds of the user tweeting it. It could save the tweet details to a continuously updated csv file.
niusoski
  • 21
  • 2
1
vote
1 answer

Beautifulsoup iterating through scraped data

I have this HTML code that repeats multiple times:
1
vote
0 answers

Web scraping using Beautiful Soup

I have this code and I want to extract holidays, petrol and temperature, but I don't know where the problem is. I need your help as soon as possible, please. I want to add this extraction to my dataset that is based on date columns, so comparing the…
1
vote
1 answer

Web Scraping: Multiple small files or one large file?

I plan to scrape some forums (Reddit, 4chan) for a research project. We will scrape the newest posts, every 10 minutes for around 3 months. I am wondering how best to store the JSON data from each scrape, so that pre-processing (via Python) later…
1
vote
0 answers

Good database/plug-in to scrape for academic paper info?

I am trying to scrape the web to find information about a set of academic papers. Unfortunately, for many of the papers, I only have the author's name, a year, and part of a title. (For example, [BANDURA A, 1997, SELF EFFICACY EXERCI]) I have tried…
Mox
  • 19
  • 1
1
vote
0 answers

Cyber crime - data set

I'm doing a science project at my university to make an app that uses AI to detect cyber crimes. I'm looking for sites to build my data set. Do you know any with ads for things like tobacco, alcohol, prescriptions, drugs, forgery? I'm using LASER embedder, so…
Piotr Rarus
  • 814
  • 4
  • 15
1
vote
1 answer

Scraping mixed elements and passing to SQL

I'm running web scrapes via Python that retrieve data from CSVs hosted on the web. I'd like to pass the data into an MSSQL database. An issue I have is the mixed elements/data types in the CSV. Here is an example of the data Item Val1 …
1
vote
1 answer

Turning Histogram values into Numerical format (Excel-xlsx, Pandas-DataFrame, etc.)

I am trying to do a correlation study about personality traits as described by Hofstede (https://www.hofstede-insights.com/product/compare-countries/). I would like to get the values shown in the bar charts in numerical form in, say, an Excel or…
MSIS
  • 113
  • 4
1
vote
0 answers

Bootstrap Template for a Dash App

Where can I find a Bootstrap template for a Dash app? Are there implemented examples? I cannot find anything about this. I am building a dashboard with Dash/Plotly, but I want to improve the style of the application.
Laura
  • 161
  • 5
1
vote
0 answers

Web scraping with BeautifulSoup is slow

I have developed web scraping code in Python which takes data from Hattrick.org's matches and returns them in a table so it can be mined to determine the likelihood of goals, etc. My difficulty is that it is really slow, returning 12,000 rows in 5…
1
vote
1 answer

What is the NLP problem I am solving called and how should I go about solving it?

I am working on a POC where I am required to write NLP code after web scraping. The prompt to my code is: "How good is the online Data Science degree offered by MIT?" I am required to do web scraping and use other information resources and generate a…
1
vote
1 answer

Detect data (web textual content) age

This is a broad question and maybe does not have an answer but I will try. I have been thinking of some techniques to detect the date of publication of public data in the wild of the internet. Without raising any defamation concerns, admitting data…
bacloud14
  • 453
  • 5
  • 13