
Are you tired of going to the browser, downloading the data you want, and then saving it to your desired folder? Well, here is your solution! You can download the data from the web using Python! Let everything be automated!

The data used for this tutorial was downloaded from the following source: https://github.com/owid/covid-19-data/blob/master/public/data/vaccinations/vaccinations.csv.

After you have found your file on the web (it can be any file from any website), the first thing to do is right-click the file and copy its link address, as shown in the figure below.

Now, go to your Python file and paste the link address so that Python can read and download the file. Since we are working with a link address, we need to import the request module from the urllib package.

#Importing the request module
from urllib import request

#Storing the link address of the file
file_url = r'https://github.com/owid/covid-19-data/blob/master/public/data/vaccinations/vaccinations.csv?raw=true'

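The letter r in front of the quotation marks makes the text a raw string, which means Python keeps any backslashes exactly as typed instead of treating them as escape characters. It is optional here, because the link contains no backslashes. Note that the link address must be inside the quotation marks (' ').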
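To make the effect of the r prefix visible, here is a small sketch that compares a raw string with a normal one. The folder name is made up for illustration and is not used anywhere else in this tutorial.

```python
# Illustrative strings only; they are not part of the tutorial's code.
normal_text = 'C:\new_folder'   # '\n' is interpreted as a single newline character
raw_text = r'C:\new_folder'     # the backslash and the letter n are kept as typed

# The raw string is one character longer, because '\' and 'n' stay separate
print(len(normal_text))
print(len(raw_text))

# A link address contains no backslashes, so the r prefix changes nothing
url_plain = 'https://github.com/owid/covid-19-data'
url_raw = r'https://github.com/owid/covid-19-data'
print(url_plain == url_raw)
```

In other words, the prefix only matters for text that contains backslashes, such as Windows folder paths.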

Now we will download the file and save its content, line by line, in a text file (which does not exist yet). To do this, we define a function and then, at the end, call it to get the data.

#Defining a function to download the file
def file_info(url):

   #Opening the url file
   file_open = request.urlopen(url)

   #Reading the file (the content arrives as bytes)
   file_content = file_open.read()

   #Converting the bytes into a string
   content = file_content.decode('utf-8')

   #Splitting the content into lines
   lines = content.splitlines()

Notice that the function's name is file_info and its parameter is called url. You can name them differently if you prefer, but if you do, do not forget to change the corresponding names in the code lines that follow!

Once the function is defined, the first thing to do is open the file from the web with the function request.urlopen. Then, for Python to go through the whole file, we call the read method on the opened file.
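The two calls can be tried without an internet connection. The sketch below assumes a small local file, created only for this example, and addresses it with a file:// link, which request.urlopen also accepts. The file name and its content are invented.

```python
import tempfile
from pathlib import Path
from urllib import request

# Create a small local file so the example runs offline.
# The name 'sample.csv' and its content are made up for this sketch.
sample_path = Path(tempfile.mkdtemp()) / 'sample.csv'
sample_path.write_bytes(b'location,date\nExampleland,2021-01-01\n')

# A file:// address stands in for the web link used in the tutorial
sample_url = sample_path.as_uri()

file_open = request.urlopen(sample_url)   # opens the address
file_content = file_open.read()           # reads the whole content at once
file_open.close()

print(type(file_content))   # the content is a bytes object, not text
print(file_content)
```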

The content that Python reads from the web is a bytes object, not text, which is awkward to work with. We therefore convert it to a string. After that, we must split the content into lines; otherwise, the whole file would be one long string.
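There are two common ways to turn bytes into a string, and they do not give the same result. The following sketch compares them on a short, invented bytes value: str returns the printable representation of the bytes object, while decode returns the actual text.

```python
# A short bytes value standing in for downloaded content (made up for this sketch)
file_content = b'location,date\nExampleland,2021-01-01\n'

# str() gives the printable representation of the bytes object: it keeps the
# b'...' wrapper and shows every newline as a backslash followed by the letter n
as_repr = str(file_content)
lines_from_repr = as_repr.split('\\n')

# decode() turns the bytes into real text, and splitlines() splits at real newlines
as_text = file_content.decode('utf-8')
lines_from_text = as_text.splitlines()

print(lines_from_repr)
print(lines_from_text)
```

With str, the first item starts with b' and a stray quotation mark is left over at the end, so decoding is usually the cleaner choice when the file is known to be text.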

Now that Python is able to read the file from the web, we will save it as a new file in the same directory as our Python script file. For this purpose, we just need 4 lines of code!

   #Saving data into a text file
   with open('vaccinations.txt', 'w', encoding='utf-8') as output_file:
      for line in lines:
         #write returns the number of characters written
         save_data = output_file.write(line + '\n')
         print(save_data)

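Python can 'open' a file that does not exist yet, as long as it is opened in write mode. The write mode 'w' means that Python creates the text file (or empties it if it already exists) and makes it ready to be written to. In the first line, output_file is the name of the variable that refers to the opened file. The with statement is similar to the following line, except that it also closes the file automatically at the end of the block: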

output_file = open('vaccinations.txt', 'w')

Then, the second line of the code goes through the lines variable, which contains the content of our web file. For each line, Python writes the text into the created file using the write function. The '\n' added after each line is the newline character: it ends the line, so that every item of lines lands on its own line in the text file. The file is saved and closed when the with block ends.
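The writing step can be tried on its own. This sketch uses two invented lines and a temporary folder (an assumption made so that the example leaves no file next to your script), then reads the file back to check the result.

```python
import tempfile
from pathlib import Path

# Made-up lines standing in for the content of the web file
lines = ['location,date', 'Exampleland,2021-01-01']

# A temporary folder keeps this example away from your own files
output_path = Path(tempfile.mkdtemp()) / 'example.txt'

with open(output_path, 'w', encoding='utf-8') as output_file:
    for line in lines:
        output_file.write(line + '\n')   # '\n' ends each line in the file

# After the with block, Python has already closed the file
print(output_file.closed)

# Reading the file back shows one line of text per item of the list
print(output_path.read_text(encoding='utf-8'))
```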

Once the function is complete, we just need to call our previously created function file_info(url), passing it the link address.

#Calling the function
file_info(file_url)

If we run this code, the text file created by Python will be found in the folder the script is run from, which is normally the same folder as your Python script.

The final code will look like this:

#Importing the request module
from urllib import request

#Storing the link address of the file
file_url = r'https://github.com/owid/covid-19-data/blob/master/public/data/vaccinations/vaccinations.csv?raw=true'


#Defining a function to download the file
def file_info(url):

   #Opening the url file
   file_open = request.urlopen(url)

   #Reading the file (the content arrives as bytes)
   file_content = file_open.read()

   #Converting the bytes into a string
   content = file_content.decode('utf-8')

   #Splitting the content into lines
   lines = content.splitlines()

   #Saving data into a text file
   with open('vaccinations.txt', 'w', encoding='utf-8') as output_file:
      for line in lines:
         #write returns the number of characters written
         save_data = output_file.write(line + '\n')
         print(save_data)


#Calling the function
file_info(file_url)
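As a possible variation on the final code, the file can also be saved exactly as it arrives, without converting it to text first. The sketch below is one way to do that with shutil.copyfileobj from the standard library. The function name download_file and the file names are hypothetical, and a local file:// address stands in for the web link so the example runs offline.

```python
import shutil
import tempfile
from pathlib import Path
from urllib import request


def download_file(url, destination):
    """Copy the content found at url into the file named destination."""
    # Both resources are closed automatically when the with block ends
    with request.urlopen(url) as response, open(destination, 'wb') as output_file:
        # copyfileobj copies the bytes in chunks, so nothing has to be decoded
        shutil.copyfileobj(response, output_file)


# Offline demonstration with made-up content
work_dir = Path(tempfile.mkdtemp())
source_path = work_dir / 'source.csv'
source_path.write_bytes(b'location,date\nExampleland,2021-01-01\n')
target_path = work_dir / 'copy.csv'

download_file(source_path.as_uri(), target_path)
print(target_path.read_bytes())
```

With the link from this tutorial, the call would presumably look like download_file(file_url, 'vaccinations.csv'). Because the bytes are written unchanged, this approach should also work for files that are not text, such as images.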

Congratulations! You just took your first step into huge amounts of data! Keep coding!

joushe info

Looking for new horizons

Python: Downloading data from the web
