Using Pandas to_csv() with Perfection

The pandas to_csv method writes objects such as a DataFrame or a Series to CSV files. Comma-separated values (CSV) files are plain text files in which the values are separated by commas. This type of file is used to store and exchange data. Now let us learn how to export pandas objects like DataFrame and Series into a CSV file.

A CSV file looks something like this-

Name,Age,Address
David,20,Seattle 
Robert,30,Chicago


Exporting Data to CSV file using pandas to_csv method

We can convert objects like a pandas DataFrame and a pandas Series into CSV files. Let us learn how-

We need the pandas library for this purpose, so first, we have to install it on our system using pip install pandas.
Now we have to import it using import pandas as pd.

Converting Data-Frame into CSV

A DataFrame is a two-dimensional data structure containing labelled rows and columns.

To convert a dataframe df into a csv file we use df.to_csv()

import pandas as pd #pd is an alias(nickname) given to pandas 

df = {'Name': ['David', 'Robert'], 'Age': [20, 30], 'Year':[4,3]} 
df = pd.DataFrame(df) 
print(df) 
data_csv = df.to_csv() 
print(data_csv)

Output- 
DataFrame-
     Name  Age  Year
0   David   20     4
1  Robert   30     3

Csv File-
,Name,Age,Year 
0,David,20,4 
1,Robert,30,3

The to_csv() method accepts many parameters. Let us understand some of the important ones.

df.to_csv( 
path_or_buf=None, 
sep=',', 
na_rep='', 
float_format=None, 
columns=None, 
header=True, 
index=True, 
index_label=None, 
mode='w', 
encoding=None, 
compression='infer', 
quoting=None, 
quotechar='"', 
lineterminator=None, # named line_terminator before pandas 1.5
chunksize=None, 
tupleize_cols=None, # deprecated; removed in later pandas releases
date_format=None, 
doublequote=True, 
escapechar=None, 
decimal='.', 
)
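Some parameters in the signature above, such as float_format and date_format, are not discussed in detail in this article. The snippet below is a minimal sketch of how they behave; the Score and Joined columns are made-up sample data for illustration only.

```python
import pandas as pd

# Hypothetical sample data with a float column and a datetime column.
df = pd.DataFrame({
    "Name": ["David", "Robert"],
    "Score": [91.456, 78.9],
    "Joined": pd.to_datetime(["2021-01-15", "2022-03-01"]),
})

# float_format controls how floating point numbers are written,
# date_format controls how datetime values are written.
csv_text = df.to_csv(index=False, float_format="%.1f", date_format="%d/%m/%Y")
print(csv_text)
```

With these settings, the floats are rounded to one decimal place and the dates are written as day/month/year.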

path_or_buf: It is the location (file path or file-like object) where you want to save your CSV file. The default value is None, which means that if no value is given, to_csv returns the CSV output as a string instead of writing a file.

import pandas as pd #pd is an alias(nickname) given to pandas 

df = {'Name': ['David', 'Robert'], 'Age': [20, 18], 'Year':[4,3]} 
df = pd.DataFrame(df) 
df.to_csv(r"C:\Users\Owner\Desktop\sample_csv.csv")  # sample_csv.csv is the name we want to give the file

Here we have placed ‘r’ before the path and file name to avoid this error-

“(unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape”

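In an ordinary string, a backslash (‘\’) starts an escape sequence, so Python tries to read the ‘\U’ in ‘C:\Users’ as a Unicode escape and fails. The ‘r’ prefix makes the string a raw string, which tells Python to read the path exactly as it is written.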
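As a rough illustration of the raw-string point, the sketch below shows that a raw string and a string with doubled backslashes describe the same path. It then writes the CSV through pathlib into a temporary directory, so it runs on any machine; the temporary directory is an assumption made for this example and is not part of the original setup.

```python
from pathlib import Path
import tempfile

import pandas as pd

# Two spellings of the same Windows path: a raw string and escaped backslashes.
raw = r"C:\Users\Owner\Desktop\sample_csv.csv"
escaped = "C:\\Users\\Owner\\Desktop\\sample_csv.csv"
print(raw == escaped)

df = pd.DataFrame({"Name": ["David", "Robert"], "Age": [20, 18], "Year": [4, 3]})

# to_csv also accepts pathlib.Path objects, which avoid backslash issues entirely.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "sample_csv.csv"
    df.to_csv(target)
    written = target.read_text()

print(written)
```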

Bonus-

We can also use pandas to read this csv file.

read_csv=pd.read_csv(r"C:\Users\Owner\Desktop\sample_csv.csv") 
print(read_csv)

   Unnamed: 0    Name  Age  Year
0           0   David   20     4
1           1  Robert   18     3
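The extra unnamed first column in the output above is the DataFrame index that to_csv wrote into the file. Here is a small sketch of two ways to avoid it. It uses io.StringIO in place of a real file so that it is self-contained; that substitution is an assumption of this example.

```python
import io

import pandas as pd

df = pd.DataFrame({"Name": ["David", "Robert"], "Age": [20, 18], "Year": [4, 3]})
csv_text = df.to_csv()

# Reading the text back naively turns the saved index into an extra column.
naive = pd.read_csv(io.StringIO(csv_text))

# Option 1: tell read_csv that the first column holds the index.
restored = pd.read_csv(io.StringIO(csv_text), index_col=0)

# Option 2: do not write the index in the first place.
no_index = pd.read_csv(io.StringIO(df.to_csv(index=False)))

print(naive.columns.tolist())
print(restored.columns.tolist())
print(no_index.columns.tolist())
```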

sep: The default value of this parameter is ‘,’. This parameter decides which character separates the values in each row. If we give the value as ‘-’, the output would be-

csv=df.to_csv(sep="-") 
print(csv)

Output-  
-Name-Age-Year 
0-David-20-4 
1-Robert-18-3
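A common use of sep is writing tab-separated files. The sketch below assumes a tab as the separator and reads the text back with the same sep so that the columns are split correctly. io.StringIO stands in for a file here.

```python
import io

import pandas as pd

df = pd.DataFrame({"Name": ["David", "Robert"], "Age": [20, 18], "Year": [4, 3]})

# Write tab-separated text instead of comma-separated text.
tsv_text = df.to_csv(sep="\t", index=False)

# The reader must be given the same separator.
round_trip = pd.read_csv(io.StringIO(tsv_text), sep="\t")
print(round_trip)
```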

na_rep: The default value is “” (empty string). If there is a null value (no value present), by default, it gets written as an empty string. With na_rep we can choose a different string to write in place of missing values.

import pandas as pd #pd is an alias(nickname) given to pandas 

df = {'Name': ['David', pd.NaT], 'Age': [20, 18], 'Year':[4,3]} 
df = pd.DataFrame(df) 
print(df) 
csv=df.to_csv(na_rep="Anonymous") 
print(csv)

Output-
     Name  Age  Year
0   David   20     4
1     NaT   18     3
,Name,Age,Year
0,David,20,4
1,Anonymous,18,3

columns: Here, we specify the columns of the data frame that we want to include in the CSV file. The CSV file will contain the columns in the same order in which we list them. Suppose we only want to include the columns Name and Age and not Year-

csv=df.to_csv(columns=['Name','Age']) 
print(csv)

Output-
,Name,Age 
0,David,20 
1,Robert,18
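To illustrate the point about column order, here is a short sketch that lists the columns in a different order from the DataFrame. index=False is used only to keep the output compact.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["David", "Robert"], "Age": [20, 18], "Year": [4, 3]})

# The CSV follows the order given in columns, not the DataFrame's own order.
csv_text = df.to_csv(columns=["Year", "Name"], index=False)
print(csv_text)
```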

header: The default value is True. If we do not want to add the header names (columns names) in the CSV file, we set header=False.

csv=df.to_csv(header=False) 
print(csv)

Output-
0,David,20,4 
1,Robert,18,3

index: This parameter accepts boolean values, and the default value is True. When it is set to False, the CSV file does not contain the index.

csv=df.to_csv(index=False) 
print(csv)

Output-
Name,Age,Year 
David,20,4
Robert,18,3

index_label: By default, the value is set to None, and it takes a string as input. Using this parameter, we can give a header to the index column.

csv=df.to_csv(index_label="index")
print(csv)

Output-
index,Name,Age,Year
0,David,20,4 
1,Robert,18,3

Note that index_label only has an effect when both index=True and header=True. If index=False, no index column is written, and if header=False, no header row is written, so in either case there is nothing for index_label to name.
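The interaction between index_label, index and header can be checked with a short sketch like this one. The label "id" is an arbitrary name chosen for the example.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["David", "Robert"], "Age": [20, 18], "Year": [4, 3]})

# index=True and header=True (the defaults): the label appears in the header row.
with_label = df.to_csv(index_label="id")

# index=False: no index column is written, so the label is not used.
no_index = df.to_csv(index=False, index_label="id")

# header=False: no header row is written, so the label is not used.
no_header = df.to_csv(header=False, index_label="id")

print(with_label.splitlines()[0])
print(no_index.splitlines()[0])
print(no_header.splitlines()[0])
```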

Converting a JSON file into a CSV file using Pandas to_csv

Suppose we have a JSON file with this content-

{"Name":{"0":"David","1":"Robert"},"Age":{"0":20,"1":18}}

And it is stored at a location – C:\Users\Owner\Documents\json.json

Converting this json file into a csv file is a single line of code –

pd.read_json(r"C:\Users\Owner\Documents\json.json").to_csv("jsontocsv.csv")

Output-
,Name,Age 
0,David,20 
1,Robert,18
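The same conversion can be tried without any files on disk. The sketch below assumes the JSON content shown above is held in a string and wraps it in io.StringIO; to_csv() is called without a path so that the result comes back as a string.

```python
import io

import pandas as pd

# The JSON content from the example above, held in memory instead of a file.
json_text = '{"Name":{"0":"David","1":"Robert"},"Age":{"0":20,"1":18}}'

df = pd.read_json(io.StringIO(json_text))
csv_text = df.to_csv()
print(csv_text)
```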

Series into CSV File in Python

A Series is a one-dimensional labelled array. It looks something like this-

0 New York
1 Seattle
2 Chicago
3 Boston
4 Washington
5 Vegas
Name: Cities, dtype: object

Converting a Series into a CSV file works the same way as saving a data frame into a CSV file.

Let us suppose the above series is saved in a variable named ‘cities’.

c=cities.to_csv() 
print(c)

Output-
,Cities
0,New York
1,Seattle
2,Chicago
3,Boston
4,Washington
5,Vegas
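For completeness, here is a sketch that builds the ‘cities’ Series from the values shown above and converts it. In recent pandas releases, Series.to_csv writes a header line containing the Series name by default; passing header=False gives only the index,value rows.

```python
import pandas as pd

cities = pd.Series(
    ["New York", "Seattle", "Chicago", "Boston", "Washington", "Vegas"],
    name="Cities",
)

# Default: a header line with the Series name is written first.
with_header = cities.to_csv()

# header=False: only the index,value rows are written.
without_header = cities.to_csv(header=False)

print(with_header)
print(without_header)
```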


Conclusion-

We often have to create a CSV file, for example when we need to store some important data on our computer or share that data with someone else. That is why the pandas to_csv method is so useful.