Data Scientist Program

 


Pandas Technique: Summary Statistics

Summary statistics are the part of descriptive statistics that condenses sample data into a few representative numbers. Statisticians commonly describe a set of observations with a measure of location, or central tendency, such as the arithmetic mean.

import pandas as pd
import numpy as np
# read dataset
df = pd.read_csv('Srt_dta.csv')
df

Summarizing numerical data

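df['Height(cm)'].mean()
# the dates are ISO-formatted strings, so min() and max() return the earliest and latest date
df['Date of Birth'].min()
'2011-12-11'
df['Date of Birth'].max()
'2018-02-27'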
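
The same kind of summary can be tried without the CSV file. The sketch below is only an illustration: the weights are recovered from the cumulative sums printed further down this page, and the variable names `weights` and `summary` are assumptions, not part of the original notebook.

```python
import pandas as pd

# Weights recovered from the cumulative sums shown on this page;
# the real Srt_dta.csv file is not needed to run this sketch.
weights = pd.Series([25, 23, 22, 17, 29, 2, 74], name='Weight(kg)')

print(weights.mean())    # arithmetic mean
print(weights.median())  # middle value, less sensitive to outliers
print(weights.min(), weights.max())

# describe() bundles count, mean, std, min, quartiles and max in one call
summary = weights.describe()
print(summary)
```

When a column contains extreme values, comparing the mean with the median is a quick way to see how far they pull the average.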

The .agg() method

The .agg() method applies one or more aggregation functions to a Series, or to each column of a DataFrame. This makes it possible to use custom summary functions alongside the built-in ones. When a list of functions is passed, .agg() returns one result per function.

# custom aggregation: the 30th percentile of a column
def pct30(column):
    return column.quantile(0.3)

df['Weight(kg)'].agg(pct30)
21.0

Summaries on multiple columns

df[['Height(cm)', 'Weight(kg)']].agg(pct30)
Height(cm)    45.4
Weight(kg)    21.0
dtype: float64
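
Besides applying one function to several columns, .agg() also accepts a dictionary that maps each column to its own function. The sketch below assumes a small stand-in DataFrame named `sample`: its heights are invented for illustration, and its weights are recovered from the cumulative sums shown on this page.

```python
import pandas as pd

# Hypothetical stand-in for Srt_dta.csv: the heights are invented,
# the weights are taken from the cumulative sums shown on this page.
sample = pd.DataFrame({
    'Height(cm)': [50, 48, 47, 40, 55, 30, 90],
    'Weight(kg)': [25, 23, 22, 17, 29, 2, 74],
})

# One aggregation per column: keys are column names, values are functions
per_column = sample.agg({'Height(cm)': 'mean', 'Weight(kg)': 'max'})
print(per_column)
```

The result is a Series indexed by column name, the same shape as the output above.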

Multiple summaries

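# custom aggregation: the 40th percentile of a column
def pct40(column):
    return column.quantile(0.4)

# pass a list of functions to get one result per function
df['Height(cm)'].agg([pct30, pct40])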
pct30    45.4
pct40    47.2
Name: Height(cm), dtype: float64
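
A list passed to .agg() can also mix custom functions with the names of built-in aggregations. This is a minimal sketch rather than part of the original notebook: the `weights` Series is rebuilt from the cumulative sums shown on this page, and `mixed` is an illustrative name.

```python
import pandas as pd

weights = pd.Series([25, 23, 22, 17, 29, 2, 74], name='Weight(kg)')

def pct30(column):
    """Return the 30th percentile of a column."""
    return column.quantile(0.3)

# Strings name built-in aggregations; functions are labelled by their name
mixed = weights.agg(['min', 'median', pct30, 'max'])
print(mixed)
```

Each entry in the list becomes one row of the result, labelled with the string or the function name.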

Cumulative sum

# running total of the column; related methods: .cummax(), .cummin(), .cumprod()
df['Weight(kg)'].cumsum()
0     25
1     48
2     70
3     87
4    116
5    118
6    192
Name: Weight(kg), dtype: int64
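
The other cumulative methods work the same way, returning one value per row. The sketch below rebuilds the weights from the running totals above (each weight is the difference between consecutive sums) and places three cumulative statistics side by side; the names `weights` and `running` are illustrative assumptions.

```python
import pandas as pd

# Weights recovered by differencing the cumulative sums printed above
weights = pd.Series([25, 23, 22, 17, 29, 2, 74], name='Weight(kg)')

running = pd.DataFrame({
    'cumsum': weights.cumsum(),  # running total
    'cummax': weights.cummax(),  # largest value seen so far
    'cummin': weights.cummin(),  # smallest value seen so far
})
print(running)
```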
