
Data Scientist Program

 


Investigating Netflix Series

In this project, we will take a look at a dataset of The Office episodes, and try to understand how the popularity and quality of the series varied over time.


To do so, we will use the following dataset: datasets/office_episodes.csv, which was downloaded from Here.


After downloading it, let us open and read it in our Jupyter Notebook.


import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
%config InlineBackend.figure_format = 'retina'
pd.options.mode.chained_assignment = None  # default='warn'

df = pd.read_csv('the_office_series.csv', index_col = [0])
df.head(10)


Let us view the details of its columns.

df.info()


Let's remove the unnecessary columns.


# .copy() gives an independent DataFrame, so assigning to its columns later is safe
df1 = df[['Ratings', 'Viewership', 'Date']].copy()
df1.head()
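As a minimal sketch of the column selection above, the snippet below keeps three columns from a small made-up table and takes an explicit copy. Only the column names `Ratings`, `Viewership` and `Date` come from this project; the `EpisodeTitle` column and all values are assumptions for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the episode table; the values are made up.
df = pd.DataFrame({
    "EpisodeTitle": ["A", "B", "C"],  # assumed extra column, dropped below
    "Ratings": [7.5, 8.3, 7.8],
    "Viewership": [11.2, 6.0, 5.8],
    "Date": ["2005-03-24", "2005-03-29", "2005-04-05"],
})

# Keep only the columns of interest; .copy() makes the subset independent of df.
subset = df[["Ratings", "Viewership", "Date"]].copy()

# Changing the subset does not touch the original table.
subset["Ratings"] = 0.0

print(list(subset.columns))
print(df["Ratings"].tolist())
```

Taking a copy is one way to avoid the chained-assignment warning without silencing it globally.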


Let's convert the Date column to the datetime type.


df1['Date'] = pd.to_datetime(df1['Date'])
df1.head()
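To see what the conversion does, here is a small sketch on made-up rows (the dates and values are assumptions, and the real file may store dates in a different text format). After `pd.to_datetime`, the column exposes date parts such as the year through the `.dt` accessor.

```python
import pandas as pd

# Made-up rows; only the column names come from the tutorial.
sample = pd.DataFrame({
    "Ratings": [7.5, 8.3],
    "Viewership": [11.2, 6.0],
    "Date": ["2005-03-24", "2005-03-29"],
})

# Before the conversion the dates are plain text.
print(sample["Date"].dtype)

# Parse the text into real timestamps.
sample["Date"] = pd.to_datetime(sample["Date"])
print(sample["Date"].dtype)

# Date parts are now available through the .dt accessor.
print(sample["Date"].dt.year.tolist())
```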


Let's now analyze our data over time and plot the graphs.


fig, ax = plt.subplots()
fig.set_figheight(5)
fig.set_figwidth(13)
ax.scatter(y= df1['Ratings'], x= df1['Date'], color = 'blue', alpha= 0.3)
ax.scatter(y= df1['Viewership'], x= df1['Date'], color = 'gold', alpha = 0.3)
ax.legend(['Ratings', 'Viewership'])
plt.show()


df1.plot(y= ['Ratings','Viewership'], x= 'Date', figsize = (13, 5))

We can conclude that the `Quality` of the series, as measured by the viewers' `Ratings`, has not been affected by time. The `Popularity`, on the other hand, declined clearly in the last two years according to the `Viewership` values.
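One way to check a conclusion like this numerically, rather than by eye, is to average both columns per calendar year. The sketch below does that on a few made-up episodes; the values are assumptions and do not reproduce the real dataset.

```python
import pandas as pd

# Made-up episodes for illustration; not the real data.
episodes = pd.DataFrame({
    "Date": pd.to_datetime([
        "2010-01-07", "2010-10-14", "2012-02-02", "2012-11-08", "2013-05-16",
    ]),
    "Ratings": [8.0, 8.5, 8.0, 8.0, 9.0],
    "Viewership": [8.0, 7.0, 5.0, 4.0, 6.0],
})

# Group rows by the year of the air date and average each column.
yearly = (
    episodes
    .groupby(episodes["Date"].dt.year)[["Ratings", "Viewership"]]
    .mean()
)

print(yearly)
```

With the real `df1`, the same grouping would give one row per year, which makes a drop in `Viewership` or a flat `Ratings` trend easy to read off.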

 
 
 
