python commands pandas

Series

You can convert a list,numpy array, or dictionary to a Series:

labels = ['a','b','c']
my_list = [10,20,30]
arr = np.array([10,20,30])

d = {'a':10,'b':20,'c':30}

pd.Series(data=my_list)

pd.Series(data=my_list,index=labels)

pd.Series(d)

#series can even hold function, though its very unlikely that we will use it

Using an index

ser1 = pd.Series([1,2,3,4],index = ['USA', 'Germany','USSR', 'Japan'])

USA        1
Germany    2
USSR       3
Japan      4
dtype: int64

ser2 = pd.Series([1,2,5,4],index = ['USA', 'Germany','Italy', 'Japan'])

ser1 + ser2

Germany    4.0
Italy      NaN
Japan      8.0
USA        2.0
USSR       NaN




Data Frames
We can think of a DataFrame as a bunch of Series objects 




df = pd.DataFrame(randn(5,4),index='A B C D E'.split(),columns='W X Y Z'.split())











W X Y Z

A 2.706850 0.628133 0.907969 0.503826
B 0.651118 -0.319318 -0.848077 0.605965
C -2.018168 0.740122 0.528813 -0.589001
D 0.188695 -0.758872 -0.933237 0.955057
E 0.190794 1.978757 2.605967 0.683509
















df['new'] = df['W'] + df['Y']





W X Y Z new

A 2.706850 0.628133 0.907969 0.503826 3.614819
B 0.651118 -0.319318 -0.848077 0.605965 -0.196959
C -2.018168 0.740122 0.528813 -0.589001 -1.489355
D 0.188695 -0.758872 -0.933237 0.955057 -0.744542
E 0.190794 1.978757 2.605967 0.683509 2.796762






df>0



W X Y Z

A 2.706850 0.628133 0.907969 0.503826
B 0.651118 NaN NaN 0.605965
C NaN 0.740122 0.528813 NaN
D 0.188695 NaN NaN 0.955057
E 0.190794 1.978757 2.605967 0.683509




df[df['W']>0]


W X Y Z

A 2.706850 0.628133 0.907969 0.503826
B 0.651118 -0.319318 -0.848077 0.605965
D 0.188695 -0.758872 -0.933237 0.955057
E 0.190794 1.978757 2.605967 0.683509


df[df['W']>0][['Y','X']]



Y X

A 0.907969 0.628133
B -0.848077 -0.319318
D -0.933237 -0.758872
E 2.605967 1.978757


df[(df['W']>0) & (df['Y'] > 1)]


W X Y Z

E 0.190794 1.978757 2.605967 0.683509




# Reset to default 0,1...n index
df.reset_index()


index W X Y Z

0 A 2.706850 0.628133 0.907969 0.503826
1 B 0.651118 -0.319318 -0.848077 0.605965
2 C -2.018168 0.740122 0.528813 -0.589001
3 D 0.188695 -0.758872 -0.933237 0.955057
4 E 0.190794 1.978757 2.605967 0.683509



newind = 'CA NY WY OR CO'.split()
df['States'] = newind
df


W X Y Z States

A 2.706850 0.628133 0.907969 0.503826 CA
B 0.651118 -0.319318 -0.848077 0.605965 NY
C -2.018168 0.740122 0.528813 -0.589001 WY
D 0.188695 -0.758872 -0.933237 0.955057 OR
E 0.190794 1.978757 2.605967 0.683509 CO


df.set_index('States') # but this is temperoray ... to make permanent change 
df.set_index('States',inplace=True) #for permanent change








W X Y Z
States 

CA 2.706850 0.628133 0.907969 0.503826
NY 0.651118 -0.319318 -0.848077 0.605965
WY -2.018168 0.740122 0.528813 -0.589001
OR 0.188695 -0.758872 -0.933237 0.955057
CO 0.190794 1.978757 2.605967 0.683509











In [215]:














Multi-Index and Index Hierarchy

api ai tutorial

Search This Blog

python commands pandas

Multi-Index and Index Hierarchy

Comments

Post a Comment

Popular posts from this blog

Gui logging in node js and python

fork and sync a github project to bitbucket

opening multiple ports tunnels ngrok in ubuntu

W	X	Y	Z
A	2.706850	0.628133	0.907969	0.503826
B	0.651118	-0.319318	-0.848077	0.605965
C	-2.018168	0.740122	0.528813	-0.589001
D	0.188695	-0.758872	-0.933237	0.955057
E	0.190794	1.978757	2.605967	0.683509

	W	X	Y	Z
States
CA	2.706850	0.628133	0.907969	0.503826
NY	0.651118	-0.319318	-0.848077	0.605965
WY	-2.018168	0.740122	0.528813	-0.589001
OR	0.188695	-0.758872	-0.933237	0.955057
CO	0.190794	1.978757	2.605967	0.683509