Data Handling Using Pandas - 2 Preeti Arora Class 12 Informatics Practices (IP) Solution



Q1. What do the quantile() and var() functions do?

Q2. What is a quartile? How is it different from quantile?

Q3. How do you create quantiles and quartiles in Python Pandas?
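
A minimal sketch for Q3, assuming a small made-up Series of marks (the values are illustrative and not from the book). Series.quantile() takes a fraction between 0 and 1, and the quartiles are simply the quantiles at 0.25, 0.5 and 0.75.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
marks = pd.Series([10, 20, 30, 40, 50])

# A single quantile: the 0.4 quantile (40th percentile),
# computed with pandas' default linear interpolation
q40 = marks.quantile(0.4)

# Quartiles are the quantiles at 0.25, 0.5 and 0.75
quartiles = marks.quantile([0.25, 0.5, 0.75])

print(q40)
print(quartiles)
```

Passing a list to quantile() returns a Series indexed by the requested fractions, which is a convenient way to get all three quartiles in one call.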

Q4. What is pivoting? How is it useful?

Q5. Which pivoting function can work with duplicate values?

Q6. What is the use of aggregation?

Q7. How useful is sorting and grouping?

Q8. How is pivot_table() different from pivot() when both perform pivoting?
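
One way to see the difference asked about in Q5 and Q8 is to pivot data that contains a duplicated index/column pair. The sketch below uses made-up rows (assumed for illustration): pivot() raises an error on the duplicate, while pivot_table() aggregates it.

```python
import pandas as pd

# Hypothetical data: the (Tutor, Quarter) pair ('Tahira', 1) appears twice
df = pd.DataFrame({
    'Tutor':   ['Tahira', 'Tahira', 'Jacob'],
    'Quarter': [1, 1, 1],
    'Classes': [28, 32, 36],
})

# pivot() cannot reshape when an index/column pair is duplicated
try:
    df.pivot(index='Tutor', columns='Quarter', values='Classes')
    pivot_failed = False
except ValueError:
    pivot_failed = True

# pivot_table() combines the duplicates with an aggregation function
# (mean by default; 'sum' is chosen here)
table = df.pivot_table(index='Tutor', columns='Quarter',
                       values='Classes', aggfunc='sum')

print(pivot_failed)
print(table)
```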


Q9. Write a program to create two dataframes with the following data:
df1
Emp_code    Name
110    Taksh
112    Jeet Arora
114    Shubham Jain
df2
Emp_code    Name    Salary
110    Taksh    45000
112    Jeet Arora    56000
114    Shubham Jain    55000
Store these two dataframes as two separate table files inside the same database.
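
A possible approach to Q9 is sketched below. It assumes an in-memory SQLite database in place of whatever database server the book uses, and the table names emp and emp_salary are invented for the example.

```python
import sqlite3
import pandas as pd

df1 = pd.DataFrame({'Emp_code': [110, 112, 114],
                    'Name': ['Taksh', 'Jeet Arora', 'Shubham Jain']})
df2 = pd.DataFrame({'Emp_code': [110, 112, 114],
                    'Name': ['Taksh', 'Jeet Arora', 'Shubham Jain'],
                    'Salary': [45000, 56000, 55000]})

# Assumption: an in-memory SQLite database stands in for the real one
conn = sqlite3.connect(':memory:')

# Store each dataframe as its own table (table names are illustrative)
df1.to_sql('emp', conn, index=False, if_exists='replace')
df2.to_sql('emp_salary', conn, index=False, if_exists='replace')

# Read one table back to confirm that it was stored
stored = pd.read_sql('SELECT * FROM emp_salary', conn)
print(stored)

conn.close()
```

For a MySQL database, the same to_sql() calls would typically be given an SQLAlchemy engine instead of the sqlite3 connection.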



Q10. What is the use of creating groups?


Q11. How are agg() and transform() similar and different?
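
The contrast in Q11 can be illustrated with a short sketch on made-up data (assumed for the example): agg() returns one value per group, whereas transform() returns a result with the same length as the original data.

```python
import pandas as pd

# Hypothetical data for illustration
df = pd.DataFrame({'Tutor': ['Tahira', 'Jacob', 'Tahira'],
                   'Classes': [28, 36, 32]})
grouped = df.groupby('Tutor')['Classes']

# agg(): reduces each group to a single value (one row per group)
reduced = grouped.agg('mean')

# transform(): broadcasts the group result back to every original row
broadcast = grouped.transform('mean')

print(reduced)
print(broadcast)
```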


Q12. How is reindexing useful?


Q13. How can we print a specific number of rows from a dataframe?


Q14. Write a program to print data from a column and find out the maximum value.


Q15. Give an example that implements the functions pipe(), apply(), aggregation (groupby), transform() and applymap().
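
A compact sketch covering the five functions in Q15 follows. The dataframe and the helper add_bonus() are invented for illustration. Note that applymap() was renamed DataFrame.map() in pandas 2.1, so the sketch uses whichever of the two is available.

```python
import pandas as pd

# Hypothetical data for illustration
df = pd.DataFrame({'House': ['Raman', 'Tagore', 'Raman', 'Tagore'],
                   'Points': [500, 600, 300, 400],
                   'Bonus': [10, 20, 30, 40]})
nums = df[['Points', 'Bonus']]

def add_bonus(frame, extra):
    """Hypothetical helper: add `extra` to every value in the frame."""
    return frame + extra

# pipe(): pass the whole dataframe to a function
piped = nums.pipe(add_bonus, 5)

# apply(): call a function once per column
col_range = nums.apply(lambda col: col.max() - col.min())

# groupby + agg(): one total per house
house_totals = df.groupby('House')['Points'].agg('sum')

# groupby + transform(): group mean repeated for every row
house_mean = df.groupby('House')['Points'].transform('mean')

# applymap() (DataFrame.map() from pandas 2.1): call a function per element
elementwise = nums.map if hasattr(nums, 'map') else nums.applymap
doubled = elementwise(lambda v: v * 2)

print(piped, col_range, house_totals, house_mean, doubled, sep='\n')
```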


Q16. Write a Python program to select the 'name' and 'score' columns from the following dataframe.
Sample Python dictionary data and list labels:
exam_data = {'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily', 'Michael', 'Matthew', 'Laura', 'Kevin', 'Jonas'], 'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19], 'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1], 'qualify': ['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes', 'no', 'no', 'yes']}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']




Q17. Write a Python program to select the specified columns and rows from a given dataframe.
Select 'name' and 'score' columns in rows 1, 3, 5, 6 from the following dataframe.
exam_data = {'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily', 'Michael', 'Matthew', 'Laura', 'Kevin', 'Jonas'], 'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19], 'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1], "qualify": ['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes', 'no', 'no', 'yes']}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
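
One possible answer to Q17, assuming "rows 1, 3, 5, 6" means zero-based row positions. With iloc, both the rows and the columns are chosen by position; 'name' and 'score' are the first two columns.

```python
import numpy as np
import pandas as pd

exam_data = {'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily',
                      'Michael', 'Matthew', 'Laura', 'Kevin', 'Jonas'],
             'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19],
             'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
             'qualify': ['yes', 'no', 'yes', 'no', 'no',
                         'yes', 'yes', 'no', 'no', 'yes']}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
df = pd.DataFrame(exam_data, index=labels)

# Rows at positions 1, 3, 5, 6 and columns at positions 0 ('name'), 1 ('score')
result = df.iloc[[1, 3, 5, 6], [0, 1]]
print(result)
```

Dropping the row selection, df[['name', 'score']] alone would select the two columns for every row, which is what Q16 asks for.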




Q18. Write a Python program to select the rows where the number of attempts in the examination is greater than 2.
exam_data = {'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily', 'Michael', 'Matthew', 'Laura', 'Kevin', 'Jonas'], 'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19], 'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1], 'qualify': ['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes', 'no', 'no', 'yes']}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
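
A sketch for Q18 using a boolean mask. It assumes the labels run from 'a' to 'j' without repeats, as in the other questions that use this data.

```python
import numpy as np
import pandas as pd

exam_data = {'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily',
                      'Michael', 'Matthew', 'Laura', 'Kevin', 'Jonas'],
             'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19],
             'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
             'qualify': ['yes', 'no', 'yes', 'no', 'no',
                         'yes', 'yes', 'no', 'no', 'yes']}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
df = pd.DataFrame(exam_data, index=labels)

# Keep only the rows whose 'attempts' value is greater than 2
result = df[df['attempts'] > 2]
print(result)
```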



Q19. For the given dataframe df, write a Python statement to sort the dataframe in ascending order of Points:

    House    Year    Points
0    Raman    2010    500
1    Tagore    2010    600
2    Raman    2011    300
3    Tagore    2011    400
4    Ashok    2010    500
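
For Q19, the statement is most likely a single sort_values() call. The sketch below rebuilds the dataframe from the table above so that it can be run on its own.

```python
import pandas as pd

df = pd.DataFrame({'House': ['Raman', 'Tagore', 'Raman', 'Tagore', 'Ashok'],
                   'Year': [2010, 2010, 2011, 2011, 2010],
                   'Points': [500, 600, 300, 400, 500]})

# ascending=True is the default, so it does not need to be passed
sorted_df = df.sort_values(by='Points')
print(sorted_df)
```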



Q20. For the following dataframe df, what will be the output of the given statement?
    rollno    name    physics    chem
0    101    Pat    90    75
1    101    Sid    40    80
2    103    Tom    50    60
3    102    Kim    90    85
4    104    Ray    65    60
df_desc = df.sort_values ('physics', ascending = False)


Q21. Predict the output of the following code:
import pandas as pd
d1 = {'rollno': [101, 101, 103, 102, 104], 'name': ['Pat', 'Sid', 'Tom', 'Kim', 'Ray'],
      'physics': [90, 40, 50, 90, 65], 'chem': [75, 80, 60, 85, 60]}
df = pd.DataFrame(d1)
print(df)
print(' ------ Basic aggregate functions min(), max(), sum() and mean()')
print('minimum is:', df['physics'].min())
print('maximum is:', df['physics'].max())
print('sum is:', df['physics'].sum())
print('average is:', df['physics'].mean())




Q22. Consider the following dataframe, df1:

Classes    Country    Quarter    Tutor
28    USA    1    Tahira
36    UK    1    Jacob
41    Japan    2    Venkat
32    USA    2    Tahira
40    USA    3    Venkat
40    UK    3    Tahira

df1 = df.groupby (['Tutor', 'Country'])
print (df1.groups)
print (df1.get_group (('Tahira', 'USA')))
print (df1.size())
print (df1.count())
print (df1['Classes'].head())
print (df1.get_group (('Jacob', 'UK')))

(a) What will be the output of the following statement upon execution:
print (df1.groupby ("Tutor").transform (np.mean))
(b) Differentiate between agg() and transform()
(c) Find the output:
print (df2.groupby ('Tutor') ['Classes'].transform (np.mean))
df2['Classmean'] = df1.groupby('Tutor')['Classes'].transform(np.mean)
print (df2)


Q23. Write a Python program to count the number of rows and columns of a dataframe. Sample data:
exam_data = {'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily', 'Michael', 'Matthew', 'Laura', 'Kevin', 'Jonas'], 'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19], 'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1], 'qualify': ['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes', 'no', 'no', 'yes']}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
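
A short sketch for Q23. The shape attribute gives both counts at once, and len() on the index or the columns gives them separately.

```python
import numpy as np
import pandas as pd

exam_data = {'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily',
                      'Michael', 'Matthew', 'Laura', 'Kevin', 'Jonas'],
             'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19],
             'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
             'qualify': ['yes', 'no', 'yes', 'no', 'no',
                         'yes', 'yes', 'no', 'no', 'yes']}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
df = pd.DataFrame(exam_data, index=labels)

# shape is a (rows, columns) tuple
rows, cols = df.shape
print('Number of rows:', rows)
print('Number of columns:', cols)
```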



Q24. Write a Python program to select the rows where the score is missing, i.e., NaN.
exam_data = {'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily', 'Michael', 'Matthew', 'Laura', 'Kevin', 'Jonas'], 'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19], 'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1], 'qualify': ['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes', 'no', 'no', 'yes']}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
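
For Q24, a minimal sketch using isnull() to build a boolean mask of the missing scores:

```python
import numpy as np
import pandas as pd

exam_data = {'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily',
                      'Michael', 'Matthew', 'Laura', 'Kevin', 'Jonas'],
             'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19],
             'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
             'qualify': ['yes', 'no', 'yes', 'no', 'no',
                         'yes', 'yes', 'no', 'no', 'yes']}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
df = pd.DataFrame(exam_data, index=labels)

# isnull() is True wherever 'score' is NaN; use it to filter the rows
missing = df[df['score'].isnull()]
print(missing)
```

A comparison such as df['score'] == np.nan would not work here, because NaN never compares equal to anything, including itself.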



Q25. Write a Python program to calculate the mean score for each student in the dataframe.
exam_data = {'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily', 'Michael', 'Matthew', 'Laura', 'Kevin', 'Jonas'], 'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19], 'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1], 'qualify': ['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes', 'no', 'no', 'yes']}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']


Q26. Write a Python program to sort the dataframe first by 'name' in descending order, then by 'score' in ascending order.
exam_data = {'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily', 'Michael', 'Matthew', 'Laura', 'Kevin', 'Jonas'], 'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19], 'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1], 'qualify': ['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes', 'no', 'no', 'yes']}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
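
Q26 can be approached by passing lists to sort_values(), one sort direction per column. This is a sketch; because every name in the sample data is unique, the secondary sort on 'score' never has to break a tie here.

```python
import numpy as np
import pandas as pd

exam_data = {'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily',
                      'Michael', 'Matthew', 'Laura', 'Kevin', 'Jonas'],
             'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19],
             'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
             'qualify': ['yes', 'no', 'yes', 'no', 'no',
                         'yes', 'yes', 'no', 'no', 'yes']}
labels = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
df = pd.DataFrame(exam_data, index=labels)

# 'name' descending first, then 'score' ascending within equal names
sorted_df = df.sort_values(by=['name', 'score'], ascending=[False, True])
print(sorted_df)
```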



Q27. Why do you need a connection to an SQL database in order to get data from a table?


Q28. Which libraries do you require in order to interact with databases (and dataframes) from within Python?
