Online data collection using Python

Data collection from websites. Article by Wanga Harron. One of the core skills for anyone working with data is data collection. A data scientist, statistician, data analyst, or data engineer can't do anything constructive without data, so that data must first be found and collected. Data can be collected in many ways, depending on the data type and the purpose of the collection. Here I will look at collecting data from websites using the relevant technology. Not long ago, when organizations wanted to collect information from websites, they had to hire a group of people to do it. Those people had to copy and paste a lot of pages by hand, which was tiresome and uneconomical. That changed when a Python library called Beautiful Soup was created. It made it easy to scrape (collect data from a website) many pages with little effort. Beautiful Soup also inspired the creation of an R package called rvest, which is used for web scraping. Another Python library used for scraping is seleniu...