Lecture 10 - Reading CSV and TXT files

Rather than creating Series or DataFrame structures from scratch, the most typical use of pandas is to load data from files or other sources for further exploration.

Reading data with Python -

To read a file in Python we use the built-in open() function. Its only required argument is the path to the file, and it returns a file object.

filepath = 'btc-market-price.csv'

with open(filepath, 'r') as reader:
    # Opens the file; printing the reader shows the file object, not its content
    print(reader)

Once the file is opened, we can read its content as follows:

filepath = 'btc-market-price.csv'

with open(filepath, 'r') as reader:
    for index, line in enumerate(reader.readlines()):
        # Read just the first 10 lines
        if index < 10:
            print(index, line)
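
The loop above loads every line with readlines() before keeping the first 10. As a minimal sketch of an alternative, a file object can be iterated line by line, so itertools.islice can stop after the lines we need. The helper name read_head and the sample rows are hypothetical, written to a temporary file so the example does not depend on btc-market-price.csv being present.

```python
import os
import tempfile
from itertools import islice


def read_head(path, n=10):
    """Return the first n lines of a text file without reading the rest."""
    with open(path, 'r') as reader:
        # A file object yields one line per iteration; islice stops after n lines
        return [line.rstrip('\n') for line in islice(reader, n)]


# Hypothetical stand-in for a CSV file (made-up rows)
with tempfile.NamedTemporaryFile('w', suffix='.csv', delete=False) as tmp:
    tmp.write('2017-04-02,1099.17\n')
    tmp.write('2017-04-03,1141.81\n')
    tmp.write('2017-04-04,1133.08\n')
    sample_path = tmp.name

head = read_head(sample_path, n=2)
for index, line in enumerate(head):
    # Splitting on the separator by hand is the work pandas does for us
    print(index, line.split(','))

os.remove(sample_path)
```

Splitting each line manually works for simple files, but it leaves type conversion, missing values and headers to us, which is the motivation for using pandas.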

Reading data with pandas -

Reading files is probably one of the most recurrent tasks in data analysis: public data sources, logs, historical information tables and exports from databases all arrive as files. The pandas library offers functions to read and write files in multiple formats, such as CSV, JSON, XML and Excel's XLSX.
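
As a rough illustration of reading and writing more than one format, the sketch below writes a small DataFrame to CSV and JSON text and reads each back. It works in memory through io.StringIO rather than on disk, and the column names and values are made up for the example.

```python
import io

import pandas as pd

# Hypothetical sample data, not taken from any real file
df = pd.DataFrame({
    'day': ['2017-04-02', '2017-04-03'],
    'price': [1099.17, 1141.81],
})

# Write to CSV text, then read it back
csv_text = df.to_csv(index=False)
from_csv = pd.read_csv(io.StringIO(csv_text))

# Write to JSON text, then read it back
json_text = df.to_json(orient='records')
from_json = pd.read_json(io.StringIO(json_text))

print(from_csv)
print(from_json)
```

Passing a file path instead of a StringIO object to the same functions reads from or writes to disk.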

The read_csv Method -

Parameter	Description
filepath_or_buffer	Path of the file to be read
sep	Character(s) used as the field separator in the file
header	Index of the row containing the names of the columns
index_col	Index or name of the column, or sequence of them, to use as the row index of the data
names	Sequence containing the names of the columns (used together with header=None)
skiprows	Number of rows, or sequence of row indexes, to skip when loading
na_values	Sequence of values that, if found in the file, should be treated as NaN
dtype	Dictionary whose keys are column names and whose values are the NumPy types to which their content must be converted
parse_dates	Flag that indicates whether pandas should try to parse dates (True tries the index). It also accepts a list of column names to parse as dates; a nested list of columns to be joined into a single date has also been accepted, though recent pandas versions deprecate that form
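
The parameters in the table can be combined in a single call. The following is a minimal sketch, assuming a made-up file layout: one junk line at the top, no header row, ';' as the separator and '?' marking a missing value. The content is supplied through io.StringIO so the example runs without a file on disk, and the column names Timestamp and Price are chosen for illustration.

```python
import io

import pandas as pd

# Hypothetical file content; the layout is an assumption for this example
raw = io.StringIO(
    'exported by a made-up tool\n'
    '2017-04-02;1099.17\n'
    '2017-04-03;?\n'
    '2017-04-04;1133.08\n'
)

df = pd.read_csv(
    raw,
    sep=';',                        # fields are separated by semicolons
    header=None,                    # the file has no row of column names
    names=['Timestamp', 'Price'],   # so we provide the names ourselves
    skiprows=1,                     # ignore the junk first line
    na_values=['?'],                # treat '?' as NaN
    dtype={'Price': 'float64'},     # convert Price to a NumPy float
    parse_dates=['Timestamp'],      # parse this column as dates
    index_col='Timestamp',          # and use it as the row index
)

print(df)
print(df.index.dtype)
```

With the dates parsed into the index, rows can later be selected by date rather than by position.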