CSV file
You can read CSV directly with numpy, pandas, or native csv module, which depends on the usage
Native module: csv
- Useful if wanna read row by row
By csv.reader
- Read each row as list
from csv import reader
path = "some.csv"
# open file in read mode
with open(path, 'r') as read_obj:
# pass the file object to reader() to get the reader object
csv_reader = reader(read_obj)
header = next(csv_reader)
print(header)
# Check file as empty
if header != None:
# Iterate over each row in the csv using reader object
for row in csv_reader:
# row variable is a list that represents a row in csv
print(row)
By csv.DictReader
- Read each row as a dict, with the csv header as dict keys
from csv import DictReader
path = "some.csv"
# open file in read mode
with open(path, 'r') as read_obj:
# pass the file object to DictReader() to get the DictReader object
csv_dict_reader = DictReader(read_obj)
# iterate over each line as a ordered dictionary
for row in csv_dict_reader:
# row variable is a dictionary that represents a row in csv
print(row)
By Pandas
- Good if the subsequent tasks are Panads involved
- https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html
By numpy
- Not common
- because difficult to set each field type
- https://numpy.org/doc/stable/user/basics.io.genfromtxt.html?highlight=csv