Python - Processing CSV Data

Neha Kumawat

2 years ago

Table of Contents
  • Introduction
  • Input as CSV File
  • Reading a CSV File
  • Reading Specific Rows
  • Reading Specific Columns
  • Reading Specific Columns and Rows
  • Reading Specific Columns for a Range of Rows

Introduction

Reading data from CSV (comma-separated values) files is a fundamental need in data science. We often receive data from various sources that can be exported to CSV format so that other systems can use it. The pandas library provides features for reading a CSV file in full, or in part for only a selected group of columns and rows.

Input as CSV File

A CSV file is a text file in which the values in each row are separated by commas. Consider the following data, stored in a file named input.csv.
You can create this file in Windows Notepad by copying and pasting the data. Save the file as input.csv using the Save As option with the All Files (*.*) type in Notepad.

id,name,salary,start_date,dept
1,Rick,623.3,2012-01-01,IT
2,Dan,515.2,2013-09-23,Operations
3,Tusar,611,2014-11-15,IT
4,Ryan,729,2014-05-11,HR
5,Gary,843.25,2015-03-27,Finance
6,Rasmi,578,2013-05-21,IT
7,Pranab,632.8,2013-07-30,Operations
8,Guru,722.5,2014-06-17,Finance
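
As an alternative to Notepad, the same file can be written from Python. This is a minimal sketch that assumes the file should go in the current working directory; the names CSV_TEXT and csv_path are chosen here for illustration only.

```python
from pathlib import Path

# Sample records copied from the listing above.
CSV_TEXT = """id,name,salary,start_date,dept
1,Rick,623.3,2012-01-01,IT
2,Dan,515.2,2013-09-23,Operations
3,Tusar,611,2014-11-15,IT
4,Ryan,729,2014-05-11,HR
5,Gary,843.25,2015-03-27,Finance
6,Rasmi,578,2013-05-21,IT
7,Pranab,632.8,2013-07-30,Operations
8,Guru,722.5,2014-06-17,Finance
"""

# Assumed location: the current working directory.
csv_path = Path("input.csv")
csv_path.write_text(CSV_TEXT, encoding="utf-8")

# Read the file back to confirm it holds the header plus 8 data rows.
lines = csv_path.read_text(encoding="utf-8").splitlines()
print(len(lines))
```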

Reading a CSV File

The read_csv function of the pandas library reads the contents of a CSV file into the Python environment as a pandas DataFrame. The function can read files from the operating system when given an appropriate path to the file.
import pandas as pd
data = pd.read_csv('path/input.csv')
print(data)
When we execute the above code, it produces the following result. Note that the function has created an additional column, starting at zero, as the index.

   id    name  salary  start_date        dept
0   1    Rick  623.30  2012-01-01          IT
1   2     Dan  515.20  2013-09-23  Operations
2   3   Tusar  611.00  2014-11-15          IT
3   4    Ryan  729.00  2014-05-11          HR
4   5    Gary  843.25  2015-03-27     Finance
5   6   Rasmi  578.00  2013-05-21          IT
6   7  Pranab  632.80  2013-07-30  Operations
7   8    Guru  722.50  2014-06-17     Finance
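
If the automatically generated index is not wanted, read_csv can use an existing column as the index instead. The sketch below is one possible variation, not part of the original example: it reads the same records from an in-memory string (so it runs without a file on disk), uses the id column as the index through the index_col parameter, and parses start_date as dates through parse_dates.

```python
import io

import pandas as pd

# Same records as input.csv, held in memory so the example is self-contained.
CSV_TEXT = """id,name,salary,start_date,dept
1,Rick,623.3,2012-01-01,IT
2,Dan,515.2,2013-09-23,Operations
3,Tusar,611,2014-11-15,IT
4,Ryan,729,2014-05-11,HR
5,Gary,843.25,2015-03-27,Finance
6,Rasmi,578,2013-05-21,IT
7,Pranab,632.8,2013-07-30,Operations
8,Guru,722.5,2014-06-17,Finance
"""

# index_col makes the id column the row index instead of 0, 1, 2, ...
# parse_dates converts start_date from text to a datetime column.
data = pd.read_csv(
    io.StringIO(CSV_TEXT),
    index_col="id",
    parse_dates=["start_date"],
)

print(data.dtypes)
print(data.head(3))
```

With this variation, rows are addressed by their id value, so data.loc[3] refers to the record for Tusar.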

Reading Specific Rows

The result of the read_csv function can also be sliced to read specific rows of a given column. The code below slices the DataFrame returned by read_csv to get the first 5 rows of the column named salary.

import pandas as pd
data = pd.read_csv('path/input.csv')

# Slice the result to get the first 5 rows of the salary column
print(data[0:5]['salary'])
When we execute the above code, it produces the following result.

0    623.30
1    515.20
2    611.00
3    729.00
4    843.25
Name: salary, dtype: float64
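
The slice above is applied after the whole file has been loaded. When only the leading rows are needed, read_csv can stop early through its nrows parameter. The following sketch assumes the same records, read from an in-memory string so that it runs on its own.

```python
import io

import pandas as pd

# Same records as input.csv, held in memory so the example is self-contained.
CSV_TEXT = """id,name,salary,start_date,dept
1,Rick,623.3,2012-01-01,IT
2,Dan,515.2,2013-09-23,Operations
3,Tusar,611,2014-11-15,IT
4,Ryan,729,2014-05-11,HR
5,Gary,843.25,2015-03-27,Finance
6,Rasmi,578,2013-05-21,IT
7,Pranab,632.8,2013-07-30,Operations
8,Guru,722.5,2014-06-17,Finance
"""

# nrows=5 tells read_csv to parse only the first 5 data rows.
first_five = pd.read_csv(io.StringIO(CSV_TEXT), nrows=5)

# Select the salary column from the shortened DataFrame.
salaries = first_five["salary"]
print(salaries)
```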

Reading Specific Columns

The result of the read_csv function can also be narrowed to specific columns. We use the multi-axes indexer .loc[] for this purpose. Here we display the salary and name columns for all rows.

import pandas as pd
data = pd.read_csv('path/input.csv')

# Use the multi-axes indexer .loc[] to select all rows and two columns
print(data.loc[:, ['salary', 'name']])
When we execute the above code, it produces the following result.

   salary    name
0  623.30    Rick
1  515.20     Dan
2  611.00   Tusar
3  729.00    Ryan
4  843.25    Gary
5  578.00   Rasmi
6  632.80  Pranab
7  722.50    Guru
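
Columns can also be restricted while the file is being read, using the usecols parameter of read_csv. One detail worth knowing is that usecols ignores the order of the names it is given, so the columns come back in file order; a second selection restores the order shown above. This is a sketch over the same records, read from an in-memory string.

```python
import io

import pandas as pd

# Same records as input.csv, held in memory so the example is self-contained.
CSV_TEXT = """id,name,salary,start_date,dept
1,Rick,623.3,2012-01-01,IT
2,Dan,515.2,2013-09-23,Operations
3,Tusar,611,2014-11-15,IT
4,Ryan,729,2014-05-11,HR
5,Gary,843.25,2015-03-27,Finance
6,Rasmi,578,2013-05-21,IT
7,Pranab,632.8,2013-07-30,Operations
8,Guru,722.5,2014-06-17,Finance
"""

# usecols loads only the listed columns; they arrive in file order (name, salary).
subset = pd.read_csv(io.StringIO(CSV_TEXT), usecols=["salary", "name"])

# Reorder explicitly to get salary first, as in the .loc example.
reordered = subset[["salary", "name"]]
print(reordered)
```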

Reading Specific Columns and Rows

The result of the read_csv function can also be narrowed to specific columns and specific rows. We use the multi-axes indexer .loc[] for this purpose. Here we display the salary and name columns for the rows labeled 1, 3, and 5.

import pandas as pd
data = pd.read_csv('path/input.csv')

# Use the multi-axes indexer .loc[] to select rows 1, 3, 5 and two columns
print(data.loc[[1, 3, 5], ['salary', 'name']])
When we execute the above code, it produces the following result.

   salary   name
1   515.2    Dan
3   729.0   Ryan
5   578.0  Rasmi
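
For larger files it may be preferable to skip unwanted rows at read time rather than select them afterwards. The sketch below uses the skiprows parameter of read_csv with a callable; the names wanted_labels and skip_line are illustrative choices made here. Note that the resulting index is renumbered from zero, unlike the .loc selection, which keeps the labels 1, 3, and 5.

```python
import io

import pandas as pd

# Same records as input.csv, held in memory so the example is self-contained.
CSV_TEXT = """id,name,salary,start_date,dept
1,Rick,623.3,2012-01-01,IT
2,Dan,515.2,2013-09-23,Operations
3,Tusar,611,2014-11-15,IT
4,Ryan,729,2014-05-11,HR
5,Gary,843.25,2015-03-27,Finance
6,Rasmi,578,2013-05-21,IT
7,Pranab,632.8,2013-07-30,Operations
8,Guru,722.5,2014-06-17,Finance
"""

# Row labels (as they would appear in the default index) that we want to keep.
wanted_labels = {1, 3, 5}


def skip_line(line_number):
    """Return True for file lines that read_csv should skip.

    Line 0 is the header, so the data row with default label k sits on line k + 1.
    """
    return line_number != 0 and (line_number - 1) not in wanted_labels


subset = pd.read_csv(
    io.StringIO(CSV_TEXT),
    usecols=["name", "salary"],
    skiprows=skip_line,
)
print(subset)
```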

Reading Specific Columns for a Range of Rows

The result of the read_csv function can also be narrowed to specific columns for a range of rows. We use the multi-axes indexer .loc[] for this purpose. Here we display the salary and name columns for rows 2 through 6. Because .loc[] slices by label, both endpoints of the range are included.

import pandas as pd
data = pd.read_csv('path/input.csv')

# Use the multi-axes indexer .loc[] to select rows 2 through 6 and two columns
print(data.loc[2:6, ['salary', 'name']])
When we execute the above code, it produces the following result.

   salary    name
2  611.00   Tusar
3  729.00    Ryan
4  843.25    Gary
5  578.00   Rasmi
6  632.80  Pranab
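
The output above has five rows because a .loc[] slice is label-based and includes both endpoints. The position-based indexer .iloc[] follows ordinary Python slicing instead and excludes the stop position. The sketch below contrasts the two on the same records, read from an in-memory string; the variable names are chosen here for illustration.

```python
import io

import pandas as pd

# Same records as input.csv, held in memory so the example is self-contained.
CSV_TEXT = """id,name,salary,start_date,dept
1,Rick,623.3,2012-01-01,IT
2,Dan,515.2,2013-09-23,Operations
3,Tusar,611,2014-11-15,IT
4,Ryan,729,2014-05-11,HR
5,Gary,843.25,2015-03-27,Finance
6,Rasmi,578,2013-05-21,IT
7,Pranab,632.8,2013-07-30,Operations
8,Guru,722.5,2014-06-17,Finance
"""

data = pd.read_csv(io.StringIO(CSV_TEXT))

# Label-based slice: labels 2, 3, 4, 5 and 6 are all included.
by_label = data.loc[2:6, ["salary", "name"]]

# Position-based slice: positions 2, 3, 4 and 5 only; the stop (6) is excluded.
# Columns are given by position as well: 2 is salary, 1 is name.
by_position = data.iloc[2:6, [2, 1]]

print(len(by_label), len(by_position))
```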

I hope you enjoyed reading this article and now know how to process CSV data in Python.

For more blogs and courses on data science, machine learning, artificial intelligence, and emerging technologies, visit us at InsideAIML.
Thanks for reading…
Happy Learning…
