Now we want to send the entire marks column through this function, which is what apply()
does:
>>> data["grade"]=data["marks"].apply(rating_function)
>>> data
   names  marks     city  grade
0    ram     90  chennai   good
1    som     80  chennai    bad
2   ravi     85  chennai   good
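The step above can be reproduced end to end with the sketch below. The body of rating_function is not shown in this part of the post, so the version here is a hypothetical stand-in: the 85-mark cutoff is an assumption chosen only because it matches the grades in the output above.

```python
import pandas as pd

# Hypothetical stand-in for rating_function; the cutoff of 85 is an assumption
def rating_function(marks):
    return "good" if marks >= 85 else "bad"

data = pd.DataFrame({
    "names": ["ram", "som", "ravi"],
    "marks": [90, 80, 85],
    "city": ["chennai", "chennai", "chennai"],
})

# Series.apply() calls rating_function once for every value in the marks column
# and returns a new Series, which is stored as the grade column
data["grade"] = data["marks"].apply(rating_function)
print(data)
```

Because apply() returns a new Series, the original marks column is left unchanged.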
DataFrame/Series.head() method
The pandas head() method returns the top n rows (5 by default) of a data frame or series.
>>> data.head(2)
   names  marks     city  grade
0    ram     90  chennai   good
1    som     80  chennai    bad
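As a small sketch of how the n argument behaves, the example below rebuilds a three-row frame like the one above (the variable names are only illustrative) and calls head() on both a data frame and a series. A negative n returns every row except the last |n|.

```python
import pandas as pd

data = pd.DataFrame({
    "names": ["ram", "som", "ravi"],
    "marks": [90, 80, 85],
})

first_two = data.head(2)            # first two rows of the data frame
first_mark = data["marks"].head(1)  # head() works the same way on a Series
all_but_last = data.head(-1)        # negative n: everything except the last row

print(first_two)
print(first_mark)
print(all_but_last)
```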
Pandas DataFrame.describe() method
The pandas describe() method shows basic statistical details such as percentiles, mean and standard deviation for a data frame or a series of numeric values.
>>> data.describe()
       marks
count    3.0
mean    85.0
std      5.0
min     80.0
25%     82.5
50%     85.0
75%     87.5
max     90.0
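The figures above can be checked directly. The sketch below runs describe() on just the marks values and then asks for different percentiles through the percentiles argument. The 10% and 90% cut points are arbitrary choices for illustration.

```python
import pandas as pd

marks = pd.Series([90, 80, 85], name="marks")

# Default summary: count, mean, std, min, 25%, 50%, 75%, max
stats = marks.describe()
print(stats)

# Custom percentiles; the median (50%) is always included
custom = marks.describe(percentiles=[0.1, 0.9])
print(custom)
```

Note that std is the sample standard deviation (dividing by n - 1), which is why three marks spaced 5 apart give exactly 5.0.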
DataFrames Concatenation
The concat() function does all of the heavy lifting of concatenating objects along an axis, with optional set logic (union or intersection) applied to the indexes, if any, on the other axes.
df1 = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3'],
                    'B': ['B0', 'B1', 'B2', 'B3'],
                    'C': ['C0', 'C1', 'C2', 'C3'],
                    'D': ['D0', 'D1', 'D2', 'D3']},
                   index=[0, 1, 2, 3])
df2 = pd.DataFrame({'A': ['A4', 'A5', 'A6', 'A7'],
                    'B': ['B4', 'B5', 'B6', 'B7'],
                    'C': ['C4', 'C5', 'C6', 'C7'],
                    'D': ['D4', 'D5', 'D6', 'D7']},
                   index=[4, 5, 6, 7])
df3 = pd.DataFrame({'A': ['A8', 'A9', 'A10', 'A11'],
                    'B': ['B8', 'B9', 'B10', 'B11'],
                    'C': ['C8', 'C9', 'C10', 'C11'],
                    'D': ['D8', 'D9', 'D10', 'D11']},
                   index=[8, 9, 10, 11])
pd.concat([df1, df2, df3])
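Output:
      A    B    C    D
0    A0   B0   C0   D0
1    A1   B1   C1   D1
2    A2   B2   C2   D2
3    A3   B3   C3   D3
4    A4   B4   C4   D4
5    A5   B5   C5   D5
6    A6   B6   C6   D6
7    A7   B7   C7   D7
8    A8   B8   C8   D8
9    A9   B9   C9   D9
10  A10  B10  C10  D10
11  A11  B11  C11  D11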
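The "set logic" mentioned for concat() only shows up when the frames do not share the same columns. The sketch below uses two small frames invented for illustration (df_a and df_b are not from the example above) to compare join="outer", which is the default and keeps the union of columns, with join="inner", which keeps only the intersection. It also shows ignore_index=True, which discards the original row labels.

```python
import pandas as pd

# Two illustrative frames that share only column B
df_a = pd.DataFrame({"A": ["A0", "A1"], "B": ["B0", "B1"]}, index=[0, 1])
df_b = pd.DataFrame({"B": ["B2", "B3"], "C": ["C2", "C3"]}, index=[0, 1])

# Union of columns; cells with no data are filled with NaN
outer = pd.concat([df_a, df_b])

# Intersection of columns; only B appears in both frames
inner = pd.concat([df_a, df_b], join="inner")

# Row labels 0, 1, 0, 1 are replaced by a fresh 0..3 range
renumbered = pd.concat([df_a, df_b], ignore_index=True)

print(outer)
print(inner)
print(renumbered)
```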
>>> d={'names':['ram','som','kumar','bala','arun'],'marks':[90,90,90,78,56],'sections':['A','B','A','B','A']}
>>> data=pd.DataFrame(d)
>>> data
   names  marks  sections
0    ram     90         A
1    som     90         B
2  kumar     90         A
3   bala     78         B
4   arun     56         A
iloc:
"iloc" in pandas selects rows and columns by integer position, in the order in which they appear in the data frame.
>>> data.iloc[0]
names       ram
marks        90
sections      A
Name: 0, dtype: object
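Selecting by position is not limited to a single row. As a short sketch on the same data, iloc also accepts slices and a row, column pair. Positions start at 0 and slice ends are exclusive, as with Python lists.

```python
import pandas as pd

data = pd.DataFrame({
    "names": ["ram", "som", "kumar", "bala", "arun"],
    "marks": [90, 90, 90, 78, 56],
    "sections": ["A", "B", "A", "B", "A"],
})

first_row = data.iloc[0]           # one row, returned as a Series
first_three = data.iloc[0:3]       # rows at positions 0, 1 and 2
bala_marks = data.iloc[3, 1]       # row position 3, column position 1
last_two_cols = data.iloc[:, -2:]  # every row, last two columns

print(first_three)
print(bala_marks)
print(last_two_cols)
```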
DataFrames Merging
The merge() function joins two data frames on one or more key columns, much like a database join. Here the frames are joined on the Key column:
left = pd.DataFrame({'Key': ['K0', 'K1', 'K2', 'K3'],
                     'A': ['A0', 'A1', 'A2', 'A3'],
                     'B': ['B0', 'B1', 'B2', 'B3']})
right = pd.DataFrame({'Key': ['K0', 'K1', 'K2', 'K3'],
                      'C': ['C0', 'C1', 'C2', 'C3'],
                      'D': ['D0', 'D1', 'D2', 'D3']})
pd.merge(left, right, how='inner', on='Key')
Output:
   Key   A   B   C   D
0   K0  A0  B0  C0  D0
1   K1  A1  B1  C1  D1
2   K2  A2  B2  C2  D2
3   K3  A3  B3  C3  D3
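In the example above every key appears in both frames, so the choice of how makes no visible difference. The sketch below uses smaller frames with only partly overlapping keys (made up for illustration) to show how 'inner', 'left' and 'outer' differ. The indicator=True argument adds a _merge column recording where each row came from.

```python
import pandas as pd

# Illustrative frames: K0 exists only on the left, K3 only on the right
left = pd.DataFrame({"Key": ["K0", "K1", "K2"], "A": ["A0", "A1", "A2"]})
right = pd.DataFrame({"Key": ["K1", "K2", "K3"], "C": ["C1", "C2", "C3"]})

# Only keys present in both frames
inner = pd.merge(left, right, how="inner", on="Key")

# Every key from the left frame; C is NaN where the right frame has no match
left_join = pd.merge(left, right, how="left", on="Key")

# Every key from either frame, with the source of each row in _merge
outer = pd.merge(left, right, how="outer", on="Key", indicator=True)

print(inner)
print(left_join)
print(outer)
```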
Pandas GroupBy:
Groupby refers to a process involving one or more of the following steps:
- Splitting: the data is split into groups according to some condition on the dataset.
- Applying: a function is applied to each group independently.
- Combining: the results from each group are combined into a single data structure.
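The three steps listed above can be written out by hand to see what groupby does for us. This is only a sketch to illustrate the idea, not how pandas implements it internally. It computes the mean marks per section step by step and then compares the result with the one-line groupby call.

```python
import pandas as pd

data = pd.DataFrame({
    "names": ["ram", "som", "kumar", "bala", "arun"],
    "marks": [90, 90, 90, 78, 56],
    "sections": ["A", "B", "A", "B", "A"],
})

# Splitting: one sub-frame per distinct value in the sections column
groups = {section: frame for section, frame in data.groupby("sections")}

# Applying: compute the mean of marks for each group independently
means = {section: frame["marks"].mean() for section, frame in groups.items()}

# Combining: gather the per-group results into a single Series
combined = pd.Series(means, name="marks")

# The same result in one step
direct = data.groupby("sections")["marks"].mean()

print(combined)
print(direct)
```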
>>> d={
... 'names':['ram','som','kumar','bala','arun'],
... 'marks':[90,90,90,78,56],
... 'sections':['A','B','A','B','A']}
>>> data=pd.DataFrame(d)
>>> data
   names  marks  sections
0    ram     90         A
1    som     90         B
2  kumar     90         A
3   bala     78         B
4   arun     56         A
>>> data.groupby('sections')
<pandas.core.groupby.generic.DataFrameGroupBy object at 0x000002B888579D88>
>>> data.groupby('sections').groups
{'A': [0, 2, 4], 'B': [1, 3]}
>>> data.groupby('sections').sum(numeric_only=True)
          marks
sections
A           236
B           168
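A group is not limited to a single summary. As a sketch on the same data, selecting the marks column and calling agg() with a list of function names returns one column per function. The particular functions chosen here are just examples.

```python
import pandas as pd

data = pd.DataFrame({
    "names": ["ram", "som", "kumar", "bala", "arun"],
    "marks": [90, 90, 90, 78, 56],
    "sections": ["A", "B", "A", "B", "A"],
})

# Select the numeric column first so the string column is not aggregated
summary = data.groupby("sections")["marks"].agg(["sum", "mean", "max", "count"])
print(summary)
```

Selecting the column before aggregating also avoids any dependence on how a given pandas version treats non-numeric columns in sum().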