Python Data Visualization
- Find The Graphic You Need
- The Data Visualisation Catalogue
- Plot With Pandas
- Chartjunk, data-ink ratio: effective data visualization
- Tableau Color Blind 10
- Kernel density estimation (KDE): better histogram
- Small multiple: series of similar graphs or charts using the same scale and axes
- Matplotlib styles
Pandas
- df.plot(x='col', y='col', kind='scatter')
- c='color', color='color'
- figsize=(,), ax=ax1, grid=True
- label='', legend=True, title=''
- xlim=(,), xticks=[]
- rot=30, alpha=1
- autopct='%.1f%%' # pie charts only; %-style format string: .1 = one decimal place, f = fixed point, %% = literal percent sign
- secondary_y=False, marker='o'
- df.plot.bar(x='col', y='col')
- df.plot.kde()
- df.hist(bins=, range=(,), histtype='step')
- df.plot.box() or df.boxplot(column='col', by='col')
- df.plot.<kind>()
- from pandas.plotting import scatter_matrix
- scatter_matrix(df[cols], figsize=(,))
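The pandas calls above can be combined roughly as follows. This is a minimal sketch: the DataFrame, its column names (`height`, `weight`, `age`) and the random data are made up for illustration, and the `Agg` backend is chosen only so the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption: no display available)
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from pandas.plotting import scatter_matrix

# Hypothetical data, generated only for this example
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "height": rng.normal(170, 10, 200),
    "weight": rng.normal(70, 12, 200),
    "age": rng.integers(18, 65, 200),
})

# Two pandas plots drawn on axes we create ourselves (ax=...)
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
df.plot(x="height", y="weight", kind="scatter", ax=ax1,
        alpha=0.5, grid=True, title="Height vs. weight")
df["age"].plot.hist(bins=10, ax=ax2, title="Age distribution")

# Pairwise scatter plots with histograms on the diagonal
axes_grid = scatter_matrix(df[["height", "weight", "age"]], figsize=(6, 6))

# The autopct format string is ordinary %-formatting
pct_label = "%.1f%%" % 12.345
```

Passing `ax=` keeps each pandas plot on a specific subplot, which is usually easier to control than letting pandas create a new figure per call.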
Seaborn
| Seaborn Basics | Syntax |
|---|---|
| Import module | import seaborn as sns |
| Set background style | sns.set_style('darkgrid') (also 'whitegrid', 'dark', 'white', 'ticks') |
| Remove spines | sns.despine(left=True, bottom=True) |
| Histogram w/ KDE | sns.histplot(y_values, kde=True) (replaces the deprecated sns.distplot) |
| Kernel Density Plot | sns.kdeplot(y_values, fill=True) (fill replaces the deprecated shade) |
| Countplot (Clustered Bar) | sns.countplot(x='x_col', hue='y_col', data=df, order=[], hue_order=[]) |
| Strip Plot (Narrow Scatter) | sns.stripplot(x='x_col', y='y_col', data=df, jitter=True) |
| Box Plot (And Whisker) | sns.boxplot(x='x_col', y='y_col', data=df, whis=4, orient='vertical', width=.15) |
| Heatmap | sns.heatmap(data, cmap='Blues', cbar=False, annot=False, yticklabels=False) |
| Small Multiple | g = sns.FacetGrid(df, col='col1', row='col2', hue='col3', height=height) |
| Fill Small Multiple | g.map(sns.kdeplot, 'y_col', fill=True) |
Matplotlib
| Matplotlib Basics | Syntax |
|---|---|
| Import module | import matplotlib.pyplot as plt |
| Jupyter inline | %matplotlib inline |
| Set plot style | plt.style.use('fivethirtyeight') |
| Plot style list | plt.style.available |
| Show plot | plt.show() |
| Save plot | plt.savefig('file') or fig.savefig('file') |
| Create figure | fig = plt.figure(figsize=(width_in, height_in)) (size in inches) |
| Add plot to the figure | ax = fig.add_subplot(nrows, ncols, plot_number) |
| Create figure and subplots | fig, axes = plt.subplots(nrows, ncols, figsize=(width, height)) |
| Disable spines | ax.spines['side'].set_visible(False), where side is 'right', 'bottom', 'top' or 'left' |
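As a rough illustration of the figure and axes calls above, the sketch below creates subplots both ways, hides two spines, and saves the result. Writing to an in-memory buffer rather than a file is a choice made here so the example leaves nothing on disk; a filename string works the same way.

```python
import io
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption: no display available)
import matplotlib.pyplot as plt

# Create a figure and a 2x2 grid of axes in one call; figsize is in inches
fig, axes = plt.subplots(2, 2, figsize=(8, 6))

# Hide the top and right spines on every subplot
for ax in axes.flat:
    for side in ("top", "right"):
        ax.spines[side].set_visible(False)

# Alternative: create the figure first, then add subplots one at a time
fig2 = plt.figure(figsize=(6, 3))
ax_left = fig2.add_subplot(1, 2, 1)   # nrows, ncols, plot_number (1-based)
ax_right = fig2.add_subplot(1, 2, 2)

# Save the first figure as PNG into an in-memory buffer
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
```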
| Matplotlib Charts | Syntax |
|---|---|
| Line chart | plt.plot(x_values, y_values, c='color', label='', linewidth=3) |
| Bar plot | plt.bar(bar_positions, bar_heights, [bar_width]) |
| Horizontal bar plot | plt.barh(bar_positions, bar_widths, [bar_height]) |
| Scatter plot | plt.scatter(x, y) |
| Histogram | plt.hist(y_values, bins=int, range=(min, max)) |
| Box plot | plt.boxplot(values) |
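The chart types in the table can be tried side by side. The sketch below uses the equivalent `Axes` methods (`ax.plot`, `ax.bar`, and so on) so that each chart lands in its own subplot; the `x` and `y` values are arbitrary sample numbers.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption: no display available)
import matplotlib.pyplot as plt

# Arbitrary sample values for illustration
x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 7, 11]

fig, axes = plt.subplots(2, 3, figsize=(12, 6))

axes[0, 0].plot(x, y, c="tab:blue", label="sample", linewidth=3)  # line chart
axes[0, 1].bar(x, y, 0.5)     # bar positions, bar heights, bar width
axes[0, 2].barh(x, y, 0.5)    # bar positions, bar widths, bar height
axes[1, 0].scatter(x, y)      # scatter plot

# Histogram: 5 equal-width bins over the range 0..15
counts, edges, _ = axes[1, 1].hist(y, bins=5, range=(0, 15))

axes[1, 2].boxplot(y)         # box plot of the same values

fig.tight_layout()
```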
| Matplotlib Plot and Axis | plt | ax | Arguments |
|---|---|---|---|
| Set Title | title | set_title | 'text' |
| Add Legend | legend | legend | ['labels'], loc='upper left', fontsize=12 |
| Set Axis Labels | xlabel | set_xlabel | 'text', size=12 |
| Ticks and Their Labels | xticks | set_xticks, set_xticklabels | [ticks], [labels], rotation=90, size=12 |
| Batch Tick Parameters | tick_params | tick_params | bottom=False, top=False, left=False, right=False, labelbottom=False, labelsize=12 |
| Set Axis Limit Range | xlim | set_xlim | min, max |
| Add H/V Lines | axhline, axvline | axhline, axvline | y (or x), label='', c='color', alpha=1 |
| Add Text | text | text | x, y, 'text' |
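The axis-level methods in the `ax` column can be applied to a single plot as sketched below. The years, values, and label text are placeholder data chosen for the example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption: no display available)
import matplotlib.pyplot as plt

# Placeholder data for illustration
years = [2019, 2020, 2021, 2022]
values = [3, 5, 4, 6]

fig, ax = plt.subplots(figsize=(6, 4))
ax.plot(years, values, label="series A")

ax.set_title("Example series")
ax.set_xlabel("Year", size=12)
ax.set_ylabel("Value", size=12)

# Fix the tick positions first, then label them
ax.set_xticks(years)
ax.set_xticklabels([str(year) for year in years], rotation=90, size=12)

# Hide the tick marks on all four sides but keep the labels
ax.tick_params(bottom=False, top=False, left=False, right=False, labelsize=12)

ax.set_xlim(2018.5, 2022.5)

# Horizontal reference line at the mean of the values (4.5)
ax.axhline(sum(values) / len(values), label="mean", c="gray", alpha=0.5)
ax.text(2019, 5.5, "note")

# With labelled artists already on the axes, legend() needs no label list
ax.legend(loc="upper left", fontsize=12)
```

Setting `label=` on each artist and calling `legend()` without a label list avoids keeping a separate list in sync with the order in which lines were drawn.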