Posts

Showing posts from April, 2021

Pandas Task

  Consider a DataFrame of the following form:

      data = pd.DataFrame({'names': ['tom', 'sam', ...], 'email': ['tom21@gmail.com', 'samdr@yahoo.com', 'jk21456@abc.com', ..], 'Firstweekscore': [], 'secondweekscore': []})

  For example:

      import pandas as pd

      d = {'names': ['tom', 'sam', 'ram', 'kumar'],
           'email': ['tom21@gmail.com', 'samdr@yahoo.com', 'ram@gmail.com', 'k2@gmail.com'],
           'first_weekscore': [90, 90, 90, 96],
           'second_weekscore': [92, 89, 78, 87]}
      data = pd.DataFrame(d)
      print(data)

  1. Write a function that creates a new column holding the average of the two scores.

      data['average'] = (data['first_weekscore'] + data['second_weekscore']) / 2

  2. List comprehension --> create another column in which each second-week score is raised by 3 (for example, 93 --> 96).

      week3 = [i + 3 for i in data['second_weekscore']]
      data['week3'] = week3
      print(data)

  3. 'gmail.com' --> regular expressions in pandas

      print(data[data['email
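The three tasks above can be sketched end to end as follows. This is only one possible reading of the exercise: the helper name `add_average` is hypothetical, and the use of `Series.str.contains` with the pattern `@gmail\.com$` for task 3 is an assumption, since the original expression is cut off.

```python
import pandas as pd

d = {
    'names': ['tom', 'sam', 'ram', 'kumar'],
    'email': ['tom21@gmail.com', 'samdr@yahoo.com', 'ram@gmail.com', 'k2@gmail.com'],
    'first_weekscore': [90, 90, 90, 96],
    'second_weekscore': [92, 89, 78, 87],
}
data = pd.DataFrame(d)


def add_average(df):
    """Return a copy of df with an 'average' column (hypothetical helper name)."""
    out = df.copy()
    out['average'] = (out['first_weekscore'] + out['second_weekscore']) / 2
    return out


# Task 1: average of the two weekly scores, wrapped in a function.
data = add_average(data)

# Task 2: list comprehension that adds 3 to each second-week score.
data['week3'] = [score + 3 for score in data['second_weekscore']]

# Task 3 (assumed approach): keep rows whose email ends in '@gmail.com'.
# The dot is escaped so it matches a literal '.', not any character.
gmail_rows = data[data['email'].str.contains(r'@gmail\.com$', regex=True)]

print(data)
print(gmail_rows)
```

Wrapping the average in a function that returns a copy leaves the caller's DataFrame untouched; assigning the column directly on `data`, as in the original one-liner, would work just as well for a small script.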

Pandas in python

  Pandas is an open-source Python package widely used for data science, data analysis, and machine learning tasks.

  Install and import

      pip install pandas

  Pandas is usually imported under a shorter name because it is used so often:

      import pandas as pd

  Data Structures in Pandas

  The two primary components of pandas are the Series and the DataFrame. A Series is essentially a column, and a DataFrame is a two-dimensional table made up of a collection of Series.

  Pandas Series

  A pandas Series is a one-dimensional labeled array capable of holding data of any type (integer, string, float, Python objects, etc.). The axis labels are collectively called the index. A Series is comparable to a single column in an Excel sheet.

      >>> import pandas as pd
      >>> a = pd.Series([1,2,3,4,5,6,7,8])
      >>> a
      0    1
      1    2
      2    3
      3    4
      4    5
      5    6
      6    7
      7    8
      dtype: int64
      >>> b = pd.Series([1,2,3,4,5,6,7,'hello'])
      >>> b
      0
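As a small sketch of the Series and DataFrame relationship described above, the snippet below builds the same two Series and combines them into a DataFrame. The column names `x` and `y` are made up for illustration. Mixing integers with a string makes pandas fall back to the general-purpose `object` dtype.

```python
import pandas as pd

# A Series of integers gets an integer dtype.
a = pd.Series([1, 2, 3, 4, 5, 6, 7, 8])

# Adding a string forces the generic 'object' dtype for the whole Series.
b = pd.Series([1, 2, 3, 4, 5, 6, 7, 'hello'])

print(a.dtype)
print(b.dtype)

# A DataFrame is a collection of Series that share one index.
# 'x' and 'y' are hypothetical column names.
df = pd.DataFrame({'x': a, 'y': b})

# Selecting a single column returns a Series again.
col = df['x']
print(type(col))
print(list(df.index))
```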