import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

%matplotlib inline

pandas can read data stored in many file formats, including CSV, JSON, XML and Excel. Parsing usually means telling pandas about the file's structure, encoding and other details. The read_csv function reads CSV files and accepts many parameters.

pd.read_csv?
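As a rough illustration of the "structure and encoding" point, the sketch below parses a small in-memory file that uses semicolons and Latin-1 encoding. The sample content and the name df_demo are made up for this example and have nothing to do with the Bitcoin file used in this notebook.

```python
import io
import pandas as pd

# Hypothetical sample: semicolon-separated text, encoded as Latin-1.
raw = "name;city\nJosé;Málaga\n".encode('latin-1')

# sep describes the structure, encoding describes how the bytes map to text.
df_demo = pd.read_csv(io.BytesIO(raw), sep=';', encoding='latin-1')

print(df_demo)
```

Without the matching sep and encoding arguments, the same bytes would either end up in a single column or fail to decode.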

df = pd.read_csv('data/btc-market-price.csv')  # Read the CSV file into a DataFrame

df.head()  # Display the first five rows

The CSV file we are reading has only two columns: timestamp and price. It has no header row; it contains whitespace and has values separated by commas. By default, pandas treats the first row of data as the header, which is incorrect here. We can override this behaviour with the header parameter.

df = pd.read_csv('data/btc-market-price.csv', header=None)

df.head()
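To see the effect of the header parameter in isolation, here is a minimal sketch that uses a two-row in-memory CSV instead of the real file. The sample values and the variable names are made up for illustration.

```python
import io
import pandas as pd

# Made-up two-row sample with no header line (not data from the real file).
sample = "2020-01-01 00:00:00,100.0\n2020-01-02 00:00:00,110.5\n"

# Default behaviour: the first data row is consumed as the column names.
with_default = pd.read_csv(io.StringIO(sample))

# header=None: every row is kept as data and columns get integer labels.
with_none = pd.read_csv(io.StringIO(sample), header=None)

print(with_default.shape)
print(with_none.shape)
print(list(with_none.columns))
```

With the default settings one row is lost to the header, while header=None keeps both rows.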

We can then set the column names to make the data easier to read.

df.columns = ['Timestamp', 'Price']  # Column names for the data set

df.shape

df.head()

df.tail(3) #Retrieves last 3 rows

df.dtypes

pd.to_datetime(df['Timestamp']).head()

df['Timestamp'] = pd.to_datetime(df['Timestamp'])  # Overwrite the column with parsed datetimes

df.set_index('Timestamp', inplace=True)  # Use the Timestamp column as the index, modifying df in place

df.loc['2017-09-29']
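The reason for converting the column and making it the index is that a datetime index lets us select rows with date strings. A small sketch on made-up toy data (the toy frame and its values are assumptions, not rows from the real file):

```python
import pandas as pd

# Made-up toy data standing in for the real file.
toy = pd.DataFrame({
    'Timestamp': ['2020-01-01 00:00:00', '2020-01-02 00:00:00', '2020-02-01 00:00:00'],
    'Price': [100.0, 110.5, 105.25],
})

# Parse the text timestamps and use them as the index.
toy['Timestamp'] = pd.to_datetime(toy['Timestamp'])
toy = toy.set_index('Timestamp')

# A partial date string selects everything in that month.
january = toy.loc['2020-01']

# A slice of date strings selects a range; both endpoints are included.
first_two_days = toy.loc['2020-01-01':'2020-01-02']

print(january)
print(first_two_days)
```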

Putting everything together -

These are the steps needed to parse our CSV file into the DataFrame we want:

df = pd.read_csv('data/btc-market-price.csv', header=None)
df.columns = ['Timestamp', 'Price']
df['Timestamp'] = pd.to_datetime(df['Timestamp'])
df.set_index('Timestamp', inplace=True)

#################################################
Final steps for our desired CSV
#################################################

However, this can be quite repetitive. The read_csv function can do all of it in a single call, which is also easier to read:

df = pd.read_csv(
    'data/btc-market-price.csv',
    header=None,
    names=['Timestamp', 'Price'],
    index_col=0,
    parse_dates=True
)
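As a quick sanity check, the sketch below runs both approaches on the same in-memory sample and compares the results. The sample values and variable names are made up for illustration; the comparison is done on the index and the Price values rather than on exact dtypes.

```python
import io
import pandas as pd

# Made-up sample with no header line (not data from the real file).
sample = "2020-01-01 00:00:00,100.0\n2020-01-02 00:00:00,110.5\n"

# Step-by-step version.
step = pd.read_csv(io.StringIO(sample), header=None)
step.columns = ['Timestamp', 'Price']
step['Timestamp'] = pd.to_datetime(step['Timestamp'])
step = step.set_index('Timestamp')

# Single-call version.
one_shot = pd.read_csv(
    io.StringIO(sample),
    header=None,
    names=['Timestamp', 'Price'],
    index_col=0,
    parse_dates=True,
)

# Compare the parsed timestamps and the prices.
same_index = list(step.index) == list(one_shot.index)
same_prices = step['Price'].tolist() == one_shot['Price'].tolist()
print(same_index, same_prices)
```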

Plotting basics -

df.plot()  # Plots every column of the DataFrame against its index


Behind the scenes, this uses the matplotlib.pyplot interface. We can create a similar plot with the plt.plot() function:

plt.plot(df.index, df['Price'])
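One practical consequence is that df.plot() hands back a matplotlib Axes object, so the plot can be customised with ordinary matplotlib calls. A minimal sketch on made-up toy data follows; the Agg backend line is only there so the example also runs outside a notebook and can be dropped when using %matplotlib inline.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend; not needed inside the notebook
import matplotlib.pyplot as plt
import pandas as pd

# Made-up toy series, not data from the real file.
toy = pd.DataFrame(
    {'Price': [100.0, 110.5, 105.25]},
    index=pd.to_datetime(['2020-01-01', '2020-01-02', '2020-01-03']),
)

ax = toy.plot()                    # pandas returns the matplotlib Axes it drew on
ax.set_title('Toy price series')   # regular matplotlib methods work on it
ax.set_ylabel('Price')
```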

x = np.arange(-10, 11)  # 1D array of integers from -10 to 10

plt.plot(x, x ** 2)  # y = x^2, an upward-opening parabola

plt.plot(x, -1 * (x ** 2))  # y = -x^2, a downward-opening parabola