Key Takeaways
- Python’s built-in csv module provides functions and classes for reading, writing, and handling data in CSV formats. The csv.reader() function can be used to read CSV files, while csv.writer() allows writing data to CSV files.
- CSV files can be converted to JSON format using Python’s csv.DictReader class, which converts a CSV file to a Python dictionary. The dictionary can then be converted to a JSON file using the json.dump() function.
- JSON files can be converted to CSV format by first using the json.load() function to convert the JSON file to a Python dictionary. The csv.DictWriter class methods can then be used to convert the dictionary to a CSV file.
- CSV files are commonly used for data import and export in spreadsheets and databases. Python’s csv module simplifies working with CSV files and converting them to other formats such as JSON.
In this article, we’ll learn how to use Python to read from and write data to CSV files, and how to convert CSV files to JSON format and vice versa. We’ll explore how to use the csv module and also look at examples that help understand how it works.
A CSV (comma-separated values) file is a text file format that allows data to be saved in a tabular structure. This is a popular format used for exporting and importing data from databases and spreadsheets.
As the name suggests, each piece of data in a CSV file is separated by a comma (,
). Sometimes the term “CSV” can be used to describe formats with other types of separators, such as colons (:
), semicolons (;
) and tabs (\t
). For the purposes of this article, we’ll just be dealing with CSV files that use commas as delimiters (known as RFC 4180).
When opened, the content of a CSV file looks like this:
Employee Id,First Name,Gender,Start Date,Last Login Time,Salary,Bonus %,Senior Management,Team
1,Douglas,Male,8/6/1993,12:42 PM,,6.945,TRUE,Marketing
2,Thomas,Male,3/31/1996,6:53 AM,61933,4.17,,
3,Maria,Female,4/23/1993,11:17 AM,,11.858,FALSE,Finance
4,Jerry,Male,3/4/2005,1:00 PM,138705,9.34,,Finance
As seen above, the comma delimiter, ,
, is used to separate each specific piece of data in the file.
The first row of data may optionally serve as the header, identifying each column of data below it. CSV files are commonly saved with a .csv
file extension.
The csv Module
Since spreadsheets and databases like MS SQL can be imported and exported as CSV files, it’s important to know how to handle data served in CSV format programmatically. Most programming languages like Python support handling files in CSV and also transforming them to other formats like JSON.
Python provides the csv module for reading, writing and performing other forms of file handling in CSV formats. The in-built library provides functions and classes that make working with CSV files seamless.
How to Read CSV Files Using Python
The csv module has the csv.reader()
function for reading CSV files. It’s used together with objects (including file objects) such as those produced with Python’s in-built open()
function.
Given a file object from a call to open()
, csv.reader()
will return a reader object. The reader object can be used to iterate over each line of CSV data, where rows are returned as a list of strings.
Let’s take an example:
import csv
with open('employees.csv', newline='') as file_obj:
reader_obj = csv.reader(file_obj)
for row in reader_obj:
print(row)
Here’s the output of the code above:
['Employee Id', 'First Name', 'Gender', 'Start Date', 'Last Login Time', 'Salary', 'Bonus %', 'Senior Management', 'Team']
['1', 'Douglas', 'Male', '8/6/1993', '12:42 PM', '', '6.945', 'TRUE', 'Marketing']
['2', 'Thomas', 'Male', '3/31/1996', '6:53 AM', '61933', '4.17', '', '']
['3', 'Maria', 'Female', '4/23/1993', '11:17 AM', '', '11.858', 'FALSE', 'Finance']
['4', 'Jerry', 'Male', '3/4/2005', '1:00 PM', '138705', '9.34', '', 'Finance']
['5', 'Larry', 'Male', '1/24/1998', '4:47 PM', '101004', '1.389', 'TRUE', 'Client Services']
...
From the first code snippet, the employees.csv
file is opened, after which the csv.reader()
function parses it and returns a reader object. A simple for loop is used to iterate over the reader object, which returns a list of data from the each row from the employees.csv
file, starting from the top.
How to Write to CSV Files Using Python
Besides reading data from CSV files, we can also write data to these files in Python. The csv.writer()
function enables us to write data to CSV format. After opening the file in write mode, the csv.writer()
function returns a writer object, which converts supplied data into delimited strings on the provided file object. The writer object has the writerow()
method for writing a row — an iterable of strings or numbers of comma-separated values per time — while the writerows()
method is used for multiple rows at once. The writerow()
and writerows()
methods are they only two options for writing data to a CSV file.
All the list objects used in the code snippet above could be grouped into a 2D list and passed in as an argument to the writerows()
method of the writer object to achieve the same result.
After the with
statement is executed, a CSV file (products.csv
) is created in the current working directory containing these comma-separated values.
Here’s an example:
import csv
with open('products.csv', 'w', newline='') as file_obj:
writer_obj = csv.writer(file_obj)
writer_obj.writerow(['Product Name', 'Price', 'Quantity', 'SKU Number' ])
writer_obj.writerow(['Rice', 80, 35, 'RI59023'])
writer_obj.writerow(['Curry', 2, 200, 'CY13890'])
writer_obj.writerow(['Milk', 9.5, 315, 'MK10204'])
Here’s the output of the code above:
Product Name,Price,Quantity,SKU Number
Rice,80,35,RI59023
Curry,2,200,CY13890
Milk,9.5,315,MK10204
How to Convert CSV to JSON Using Python
While performing file I/O operations, we might want to convert a CSV file to JSON format — which is popular for receiving and transmitting data between a client and a server. The csv module provides the csv.DictReader
class to help us to achieve this.
The csv.DictReader
class methods help to convert a given CSV file to a Python dictionary before applying the json module’s json.dump()
function to convert the resulting Python dictionary to a JSON file. The csv.DictReader()
class takes an optional fieldnames
argument. Where the field names are omitted, values from the first row will be mapped to the rest of the data as field names.
Let’s take a look at an example:
import csv
import json
my_dict = {}
with open('employees.csv', newline='') as file_obj:
reader_object = csv.DictReader(file_obj)
for row in reader_object:
key = row['Employee Id']
my_dict[key] = row
with open('employee.json', 'w', encoding='utf-8') as file_obj:
json.dump(my_dict, file_obj, indent=4)
Here’s the output of the code above:
"1": {
"Employee Id": "1",
"First Name": "Douglas",
"Gender": "Male",
"Start Date": "8/6/1993",
"Last Login Time": "12:42 PM",
"Salary": "",
"Bonus %": "6.945",
"Senior Management": "TRUE",
"Team": "Marketing"
},
"2": {
"Employee Id": "2",
"First Name": "Thomas",
"Gender": "Male",
"Start Date": "3/31/1996",
"Last Login Time": "6:53 AM",
"Salary": "61933",
"Bonus %": "4.17",
"Senior Management": "",
"Team": ""
},
...
To convert a CSV file to a JSON equivalent, we applied the following steps:
- opened the
employees.csv
file in read mode - created a Python dictionary from the returned file object using the
csv.DictReader
class - opened a JSON file in write mode, such as
employees.json
(if no such file had existed, one would have been created) - used the
dump()
function of the json module to convert the Python dictionary (my_dict
) to a JSON file
How to Convert JSON to CSV Using Python
In this section, we’ll look at how to convert data from a JSON file to CSV format. To achieve this, we’ll use both the in-built csv and json Python modules. The json module’s json.load()
function will help convert a JSON file to a Python dictionary, while the csv module’s csv.DictWiter
class methods will help convert the Python dictionary to a CSV file.
Here’s an example:
import csv
import json
py_dict = {}
# convert json file to python dictionary
with open('employees.json', 'r', encoding='utf-8') as file_obj:
py_dict = json.load(file_obj)
# convert write python dictionary to csv
with open('employees_records.csv', 'w', newline='') as file_obj:
csv_writer = csv.DictWriter(file_obj, fieldnames=py_dict['1'].keys())
csv_writer.writeheader()
for key in py_dict.keys():
csv_writer.writerow(py_dict[key])
To convert a JSON file to a CSV equivalent, we applied the following steps:
- opened the
employees.json
file in read mode - used the
json.load()
function to create a Python dictionarypy_dict
- opened a CSV file
employees_records.csv
in write mode (if no such file had existed, one would have been created) - created a writer object with the
csv.DictWriter
class with necessary arguments - used the writer object methods to map dictionaries into the appropriate number of rows
Conclusion
CSV files are very popular and often used in exporting and importing spreadsheets and databases. This file format is used very often by those working with data. However, while programming with Python there might be need to quickly use CSV files, so it’s important to learn how to perform file I/O operations with CSV.
Python’s csv module is very handy for working with CSV files, as it provides the necessary functions and classes for these sort of tasks.
It’s important to also note that we may need to convert files from one format to another (CSV to JSON) as seen in our examples above.
Frequently Asked Questions (FAQs) about Python CSV Files
How can I handle large CSV files in Python?
Handling large CSV files in Python can be a bit challenging due to memory constraints. However, Python’s built-in CSV module provides a way to read and write CSV files in smaller chunks, thus making it possible to work with large files. You can use the reader
object in a loop to read a specific number of rows at a time. This way, you can process a large file in smaller, more manageable chunks.
How can I write to a CSV file in Python?
Writing to a CSV file in Python is straightforward with the CSV module. You can use the writer
object and its writerow
method to write a single row, or writerows
method to write multiple rows at once. Remember to open the file in write mode (‘w’) before writing to it.
How can I handle CSV files with different delimiters in Python?
The CSV module in Python allows you to specify the delimiter when reading or writing CSV files. The reader
and writer
objects take a delimiter
parameter, which you can set to any character that your CSV file uses as a delimiter.
How can I handle CSV files with headers in Python?
If your CSV file includes a header row, you can use the DictReader
object in the CSV module to read the file. This object treats each row as a dictionary, where the keys are the column names from the header row, and the values are the data in each row.
How can I handle CSV files with quoted fields in Python?
The CSV module in Python provides the QUOTE_MINIMAL
, QUOTE_ALL
, QUOTE_NONNUMERIC
, and QUOTE_NONE
constants to handle quoted fields in CSV files. You can specify these constants as the quoting
parameter when creating a reader
or writer
object.
How can I handle CSV files with special characters in Python?
If your CSV file contains special characters, you can handle them by opening the file in binary mode and using the unicodecsv
module instead of the built-in CSV module. This module works just like the CSV module, but it supports Unicode characters.
How can I handle CSV files with missing values in Python?
Missing values in CSV files can be handled using the pandas
library in Python. You can read the CSV file into a DataFrame, and then use the fillna
method to fill missing values with a specific value or a calculated value.
How can I convert a CSV file to a different format in Python?
Python provides several libraries to convert CSV files to different formats. For example, you can use the pandas
library to convert a CSV file to an Excel file, a SQL database, or a JSON file.
How can I sort a CSV file by a specific column in Python?
You can sort a CSV file by a specific column using the pandas
library in Python. After reading the CSV file into a DataFrame, you can use the sort_values
method to sort the DataFrame by one or more columns.
How can I filter rows in a CSV file based on a condition in Python?
You can filter rows in a CSV file based on a condition using the pandas
library in Python. After reading the CSV file into a DataFrame, you can use boolean indexing to filter the DataFrame based on a condition.
Ini is a startup enthusiast, software engineer and technical writer. Flutter and Django are his favorite tools at the moment for building software solutions. He loves Afrobeats music.