Combine Two Datasets
Create the first DataFrame
import pandas as pd
df1 = pd.DataFrame({'Fruits': ['Mango', 'Banana', 'Grapes', 'Apple', 'Orange'],
                    'Price': [60, 40, 75, 100, 65]})
print(df1)
Output:
   Fruits  Price
0   Mango     60
1  Banana     40
2  Grapes     75
3   Apple    100
4  Orange     65
Create the second DataFrame
df2 = pd.DataFrame({'Fruits': ['Apple', 'Orange', 'Papaya', 'Pineapple', 'Mango'],
                    'Price': [120, 60, 30, 70, 50]})
print(df2)
Output:
      Fruits  Price
0      Apple    120
1     Orange     60
2     Papaya     30
3  Pineapple     70
4      Mango     50
Merge Two DataFrames
A. Left Join
print(pd.merge(df1, df2, how='left', on='Fruits'))
Output:
   Fruits  Price_x  Price_y
0   Mango       60     50.0
1  Banana       40      NaN
2  Grapes       75      NaN
3   Apple      100    120.0
4  Orange       65     60.0
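The Price_x and Price_y columns in the output above come from the default suffixes that pd.merge appends when both frames share a non-key column name. As a minimal sketch, the suffixes parameter can replace them with more descriptive labels; the names '_store1' and '_store2' are hypothetical and chosen only for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({'Fruits': ['Mango', 'Banana', 'Grapes', 'Apple', 'Orange'],
                    'Price': [60, 40, 75, 100, 65]})
df2 = pd.DataFrame({'Fruits': ['Apple', 'Orange', 'Papaya', 'Pineapple', 'Mango'],
                    'Price': [120, 60, 30, 70, 50]})

# Left join on 'Fruits', labelling the two overlapping 'Price' columns
# with illustrative (hypothetical) suffixes instead of the default _x/_y.
left_joined = pd.merge(df1, df2, how='left', on='Fruits',
                       suffixes=('_store1', '_store2'))

print(left_joined.columns.tolist())
# ['Fruits', 'Price_store1', 'Price_store2']
```

The joined rows are the same as in the default left join; only the column labels change.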
B. Right Join
print(pd.merge(df1, df2, how='right', on='Fruits'))
Output:
      Fruits  Price_x  Price_y
0      Apple    100.0      120
1     Orange     65.0       60
2     Papaya      NaN       30
3  Pineapple      NaN       70
4      Mango     60.0       50
C. Inner Join
print(pd.merge(df1, df2, how='inner', on='Fruits'))
Output:
   Fruits  Price_x  Price_y
0   Mango       60       50
1   Apple      100      120
2  Orange       65       60
D. Outer Join
print(pd.merge(df1, df2, how='outer', on='Fruits'))
Output:
      Fruits  Price_x  Price_y
0      Mango     60.0     50.0
1     Banana     40.0      NaN
2     Grapes     75.0      NaN
3      Apple    100.0    120.0
4     Orange     65.0     60.0
5     Papaya      NaN     30.0
6  Pineapple      NaN     70.0
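In an outer join, the NaN values show which side a row was missing from, but reading that off the table can be error-prone. One possible approach is the indicator parameter of pd.merge, which adds a _merge column marking each row as left_only, right_only, or both. The pandas documentation describes outer joins as sorting keys lexicographically, so the row order may differ from the output above depending on the pandas version; the sketch below therefore counts rows rather than relying on their order.

```python
import pandas as pd

df1 = pd.DataFrame({'Fruits': ['Mango', 'Banana', 'Grapes', 'Apple', 'Orange'],
                    'Price': [60, 40, 75, 100, 65]})
df2 = pd.DataFrame({'Fruits': ['Apple', 'Orange', 'Papaya', 'Pineapple', 'Mango'],
                    'Price': [120, 60, 30, 70, 50]})

# indicator=True adds a '_merge' column recording where each row came from.
merged = pd.merge(df1, df2, how='outer', on='Fruits', indicator=True)
print(merged[['Fruits', '_merge']])

# Count rows per origin; cast to str so the result is a plain dict.
origin_counts = merged['_merge'].astype(str).value_counts().to_dict()
print(origin_counts)
```

Here three fruits appear in both frames, two only in df1, and two only in df2.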
Concatenation
A. Row-wise Concatenation having the same column names
data = {'FRUITS': ['Grapes', 'Pineapple'],
        'QUANTITY': [23, 17],
        'PRICE': [60, 30]}
# Create a pandas DataFrame from the dictionary (this reassigns df1)
df1 = pd.DataFrame(data)
# Append the rows of df1 to df, the existing FRUITS/QUANTITY/PRICE
# DataFrame (rows 0-3 of the output), and renumber the index from 0
df2 = pd.concat([df, df1], axis=0, ignore_index=True)
print(df2)
Output:
      FRUITS  QUANTITY  PRICE
0      Mango        40     80
1      Apple        20    100
2     Banana        25     50
3     Orange        10     70
4     Grapes        23     60
5  Pineapple        17     30
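To see what ignore_index=True contributes, the following sketch concatenates the same rows with and without it. The contents of df are an assumption, reconstructed from the first four rows of the output above, and new_rows is a hypothetical name used here to avoid reassigning df1.

```python
import pandas as pd

# Assumed contents of df, inferred from rows 0-3 of the output above.
df = pd.DataFrame({'FRUITS': ['Mango', 'Apple', 'Banana', 'Orange'],
                   'QUANTITY': [40, 20, 25, 10],
                   'PRICE': [80, 100, 50, 70]})
new_rows = pd.DataFrame({'FRUITS': ['Grapes', 'Pineapple'],
                         'QUANTITY': [23, 17],
                         'PRICE': [60, 30]})

# Without ignore_index, each frame keeps its own index labels.
kept = pd.concat([df, new_rows], axis=0)
print(kept.index.tolist())    # [0, 1, 2, 3, 0, 1]

# With ignore_index=True, the result gets a fresh 0..n-1 index.
renumbered = pd.concat([df, new_rows], axis=0, ignore_index=True)
print(renumbered.index.tolist())    # [0, 1, 2, 3, 4, 5]
```

Duplicate index labels, as in kept, can make label-based lookups such as kept.loc[0] return more than one row, which is usually why the index is renumbered.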
B. Column-wise Concatenation having different column names
data = {'DISCOUNT': [5, 7, 10, 8, 6]}
# Create a pandas DataFrame from the dictionary
discount = pd.DataFrame(data)
# Concatenate df2 and discount side by side; rows are aligned on the index,
# so row 5, which has no counterpart in discount, gets NaN
df = pd.concat([df2, discount], axis=1)
print(df)
Output:
      FRUITS  QUANTITY  PRICE  DISCOUNT
0      Mango        40     80       5.0
1      Apple        20    100       7.0
2     Banana        25     50      10.0
3     Orange        10     70       8.0
4     Grapes        23     60       6.0
5  Pineapple        17     30       NaN
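The NaN in the last row arises because pd.concat with axis=1 aligns rows on their index labels and, by default, keeps the union of those labels. As a minimal sketch, join='inner' keeps only the labels present in every frame instead. The name fruits is hypothetical and stands in for the six-row frame shown above.

```python
import pandas as pd

# Stand-in for the six-row frame from the previous step (hypothetical name).
fruits = pd.DataFrame({'FRUITS': ['Mango', 'Apple', 'Banana', 'Orange',
                                  'Grapes', 'Pineapple'],
                       'QUANTITY': [40, 20, 25, 10, 23, 17],
                       'PRICE': [80, 100, 50, 70, 60, 30]})
discount = pd.DataFrame({'DISCOUNT': [5, 7, 10, 8, 6]})

# Default join='outer': union of index labels, missing cells become NaN.
outer = pd.concat([fruits, discount], axis=1)
print(int(outer['DISCOUNT'].isna().sum()))    # 1

# join='inner': only index labels 0-4, which exist in both frames, are kept.
inner = pd.concat([fruits, discount], axis=1, join='inner')
print(len(inner))    # 5
```

With the inner join the Pineapple row is dropped entirely, so whether this is preferable depends on whether rows without a discount should be kept.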
Pandas Cheat Sheet for Data Science in Python
Pandas is a powerful and versatile library that allows you to work with data in Python. It offers a range of features and functions that make data analysis fast, easy, and efficient. Whether you are a data scientist, analyst, or engineer, Pandas can help you handle large datasets, perform complex operations, and visualize your results.
This Pandas Cheat Sheet is designed to help you master the basics of Pandas and boost your data skills. It covers the most common and useful commands and methods that you need to know when working with data in Python. You will learn how to create, manipulate, and explore data frames, how to apply various functions and calculations, how to deal with missing values and duplicates, how to merge and reshape data, and much more.
If you are new to Data Science using Python and Pandas, or if you want to refresh your memory, this cheat sheet is a handy reference that you can use anytime. It will save you time and effort by providing you with clear and concise examples of how to use Pandas effectively.