How to Construct a Visualization of the Palmer Penguins Data Set

pandas
seaborn
Author

Jade Liang

Published

October 14, 2024

1. Load the Data Set

Before we can create visualizations of the Palmer Penguins data set, we first need to load it into a pandas DataFrame.

import pandas as pd
url = "https://raw.githubusercontent.com/pic16b-ucla/24W/main/datasets/palmer_penguins.csv"
penguins = pd.read_csv(url)

Here are the first five rows of the Palmer Penguins data set:

|   | studyName | Sample Number | Species | Region | Island | Stage | Individual ID | Clutch Completion | Date Egg | Culmen Length (mm) | Culmen Depth (mm) | Flipper Length (mm) | Body Mass (g) | Sex | Delta 15 N (o/oo) | Delta 13 C (o/oo) | Comments |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | PAL0708 | 1 | Adelie Penguin (Pygoscelis adeliae) | Anvers | Torgersen | Adult, 1 Egg Stage | N1A1 | Yes | 11/11/07 | 39.1 | 18.7 | 181.0 | 3750.0 | MALE | NaN | NaN | Not enough blood for isotopes. |
| 1 | PAL0708 | 2 | Adelie Penguin (Pygoscelis adeliae) | Anvers | Torgersen | Adult, 1 Egg Stage | N1A2 | Yes | 11/11/07 | 39.5 | 17.4 | 186.0 | 3800.0 | FEMALE | 8.94956 | -24.69454 | NaN |
| 2 | PAL0708 | 3 | Adelie Penguin (Pygoscelis adeliae) | Anvers | Torgersen | Adult, 1 Egg Stage | N2A1 | Yes | 11/16/07 | 40.3 | 18.0 | 195.0 | 3250.0 | FEMALE | 8.36821 | -25.33302 | NaN |
| 3 | PAL0708 | 4 | Adelie Penguin (Pygoscelis adeliae) | Anvers | Torgersen | Adult, 1 Egg Stage | N2A2 | Yes | 11/16/07 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | Adult not sampled. |
| 4 | PAL0708 | 5 | Adelie Penguin (Pygoscelis adeliae) | Anvers | Torgersen | Adult, 1 Egg Stage | N3A1 | Yes | 11/16/07 | 36.7 | 19.3 | 193.0 | 3450.0 | FEMALE | 8.76651 | -25.32426 | NaN |
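
A preview like the one above is usually produced with DataFrame.head(), and the NaN entries in it suggest checking for missing values before plotting. The following is a minimal sketch of that check. To keep it runnable without a download, it assumes a small stand-in DataFrame named preview, built by hand from a subset of the columns and the five rows shown above; with the real data you would call the same methods on penguins.

```python
import pandas as pd

nan = float("nan")

# Stand-in for the real data: a subset of columns from the five rows shown above.
preview = pd.DataFrame({
    "Species": ["Adelie Penguin (Pygoscelis adeliae)"] * 5,
    "Island": ["Torgersen"] * 5,
    "Culmen Length (mm)": [39.1, 39.5, 40.3, nan, 36.7],
    "Culmen Depth (mm)": [18.7, 17.4, 18.0, nan, 19.3],
    "Flipper Length (mm)": [181.0, 186.0, 195.0, nan, 193.0],
    "Body Mass (g)": [3750.0, 3800.0, 3250.0, nan, 3450.0],
    "Sex": ["MALE", "FEMALE", "FEMALE", None, "FEMALE"],
})

# head() returns the first five rows by default.
print(preview.head())

# Count missing values in each column.
missing_per_column = preview.isna().sum()
print(missing_per_column)

# Keep only the rows that have both measurements needed for a scatter plot.
complete_rows = preview.dropna(subset=["Body Mass (g)", "Flipper Length (mm)"])
print(len(complete_rows))
```

Dropping incomplete rows is optional for plotting, but counting the missing values first makes it clear how many penguins will not appear in a plot.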

2. Import the Seaborn Package and Create a Visualization

After loading the data set, we can import the seaborn package. Then, use seaborn.relplot() to create a scatter plot of flipper length (mm) against body mass (g), with each point colored by the penguin’s sex. Notice that the legend shows a third category for Sex, ".". This is not a parameter of the plot but a value in the data: there is one entry in the Palmer Penguins data set where the sex of the penguin isn’t specified and is recorded as ".".

import seaborn as sns

fgrid = sns.relplot(
    data=penguins,
    x="Body Mass (g)",
    y="Flipper Length (mm)",
    hue="Sex",  # color each point by Sex
)

# FacetGrid.figure replaces the deprecated FacetGrid.fig in recent seaborn releases
fgrid.figure.suptitle("Body Mass (g) vs. Flipper Length (mm)")  # add a title to the plot
fgrid.figure.subplots_adjust(top=0.9)  # leave room above the axes for the title
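
If you would rather not show the "." category in the legend, one option is to filter those rows out before plotting. The sketch below is one possible approach, not part of the original walkthrough. It assumes a small stand-in DataFrame named toy; its last row, including the "." entry, uses made-up measurements purely for illustration. With the real data you would apply the same filter to penguins.

```python
import pandas as pd

# Stand-in data; the final row is hypothetical and only mimics the "." entry.
toy = pd.DataFrame({
    "Body Mass (g)": [3750.0, 3800.0, 3250.0, 4000.0],
    "Flipper Length (mm)": [181.0, 186.0, 195.0, 200.0],
    "Sex": ["MALE", "FEMALE", "FEMALE", "."],
})

# See which values the Sex column contains, including missing ones.
print(toy["Sex"].value_counts(dropna=False))

# Keep only rows whose Sex is MALE or FEMALE.
# Note that isin() also drops rows where Sex is missing (NaN).
known_sex = toy[toy["Sex"].isin(["MALE", "FEMALE"])]
print(known_sex)
```

Passing known_sex instead of the full DataFrame as the data argument of sns.relplot() would then leave only MALE and FEMALE in the legend.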

And that’s how you can create a simple scatter plot using seaborn! You can adjust the arguments of sns.relplot() to create different scatter plots using the Palmer Penguins data set.
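
As one example of adjusting the arguments, the sketch below splits the plot into one panel per sex with col and changes the panel size with height and aspect. It assumes a small stand-in DataFrame named toy, built from four of the rows shown earlier, and it selects the non-interactive "Agg" backend so that it runs without a display; both choices are for illustration only.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display

import pandas as pd
import seaborn as sns

# Stand-in data taken from four of the preview rows shown earlier.
toy = pd.DataFrame({
    "Body Mass (g)": [3750.0, 3800.0, 3250.0, 3450.0],
    "Flipper Length (mm)": [181.0, 186.0, 195.0, 193.0],
    "Sex": ["MALE", "FEMALE", "FEMALE", "FEMALE"],
})

fgrid = sns.relplot(
    data=toy,
    x="Body Mass (g)",
    y="Flipper Length (mm)",
    col="Sex",   # one panel per value of Sex
    height=4,    # height of each panel in inches
    aspect=1.2,  # width of each panel = aspect * height
)

fgrid.set_axis_labels("Body mass (g)", "Flipper length (mm)")
fgrid.figure.suptitle("Flipper length vs. body mass, by sex")
fgrid.figure.subplots_adjust(top=0.85)  # leave room for the title
```

Other arguments, such as style or size, can be set in the same way to encode additional columns.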