Computing C.I. given the underlying distribution using regplot()

The seaborn.regplot() function plots data together with a linear regression model fit. It can also draw a confidence interval around the fitted line.

Syntax:

seaborn.regplot(x, y, data=None, x_estimator=None, x_bins=None, x_ci='ci', scatter=True, fit_reg=True, ci=95, n_boot=1000, units=None, order=1, logistic=False, lowess=False, robust=False, logx=False, x_partial=None, y_partial=None, truncate=False, dropna=True, x_jitter=None, y_jitter=None, label=None, color=None, marker='o', scatter_kws=None, line_kws=None, ax=None)

Parameters: Descriptions of the main parameters are given below:

  • x, y: The input variables. If strings, they should correspond to column names in “data”. When pandas objects are used, the axes are labeled with the series names.
  • data: A dataframe where each column is a variable and each row is an observation.
  • lowess: (optional) A boolean value. If “True”, “statsmodels” is used to estimate a nonparametric lowess model (locally weighted linear regression).
  • color: (optional) Color to apply to all plot elements.
  • marker: (optional) Marker to use for the scatterplot glyphs.

Return: The Axes object containing the plot.
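As a rough sketch of how these parameters fit together, the snippet below passes column names for x and y along with a dataframe, and checks that the returned Axes picks up the column names as axis labels. The column names "hours" and "score" and the synthetic data are made up for illustration, and the "Agg" backend is only selected so the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; assumption: no display is available

import numpy as np
import pandas as pd
import seaborn as sns

# hypothetical data: two columns with a roughly linear relationship
rng = np.random.default_rng(0)
df = pd.DataFrame({"hours": rng.uniform(0, 10, 50)})
df["score"] = 2 * df["hours"] + rng.normal(0, 2, 50)

# x and y are strings, so they are looked up as column names in `data`
ax = sns.regplot(data=df, x="hours", y="score", color="green", marker="+")

# the returned object is a matplotlib Axes, labeled with the column names
print(type(ax).__name__, ax.get_xlabel(), ax.get_ylabel())
```

Because the return value is an ordinary matplotlib Axes, it can be customized further, for example with ax.set_title().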

In short, regplot() adds a regression line to a scatterplot, which helps reveal any linear relationship between two variables. The example below shows how it can also be used to plot a confidence interval.

Example:

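# import libraries
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

# create random data (seeded for reproducibility)
np.random.seed(0)
x = np.random.randint(0, 10, 10)
y = x + np.random.normal(0, 1, 10)

# create regression plot with an 80% confidence interval;
# x and y are passed by keyword, which recent seaborn versions require
ax = sns.regplot(x=x, y=y, ci=80)
plt.show()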


Like lineplot(), regplot() draws a 95% confidence interval by default. The interval can be changed through the ‘ci’ parameter, which takes a value in the range [0, 100]; passing ci=None skips the interval altogether. Here I have passed ci=80, so an 80% confidence interval is plotted instead of the default 95% one.

The light blue shaded band around the regression line is the confidence interval. The wider the band, the greater the uncertainty about the fitted line at that value of x.
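seaborn estimates this band with a bootstrap (controlled by n_boot). To make the idea concrete, here is a minimal NumPy-only sketch of a percentile bootstrap band for a straight-line fit. It is an illustration of the general technique, not seaborn's actual implementation; the helper name bootstrap_band and its arguments are hypothetical.

```python
import numpy as np

def bootstrap_band(x, y, grid, ci=80, n_boot=1000, seed=0):
    """Percentile bootstrap band for a straight-line fit, evaluated on `grid`.

    Hypothetical helper for illustration; not part of seaborn.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    preds = np.empty((n_boot, len(grid)))
    for b in range(n_boot):
        idx = rng.integers(0, n, n)  # resample the points with replacement
        slope, intercept = np.polyfit(x[idx], y[idx], 1)  # refit the line
        preds[b] = intercept + slope * grid  # predicted line for this resample
    tail = (100 - ci) / 2  # e.g. ci=80 keeps the 10th to 90th percentiles
    lower, upper = np.percentile(preds, [tail, 100 - tail], axis=0)
    return lower, upper

# same data as in the example above
np.random.seed(0)
x = np.random.randint(0, 10, 10)
y = x + np.random.normal(0, 1, 10)
grid = np.linspace(x.min(), x.max(), 100)

lower80, upper80 = bootstrap_band(x, y, grid, ci=80)
lower95, upper95 = bootstrap_band(x, y, grid, ci=95)
print("mean width of 80% band:", (upper80 - lower80).mean())
print("mean width of 95% band:", (upper95 - lower95).mean())
```

With the same resamples, the 95% band always encloses the 80% band, which is why raising ci makes the shaded region wider.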

How to Plot a Confidence Interval in Python?

A confidence interval is a type of estimate, computed from the statistics of the observed data, that gives a range of values likely to contain a population parameter with a particular level of confidence.

A confidence interval for the mean is a range of values within which the population mean is likely to lie. If I predicted tomorrow’s temperature to be somewhere between -100 degrees and +100 degrees, I could be 100% sure of being correct. However, if I predicted it to be between 20.4 and 20.5 degrees Celsius, I would be far less confident. Note how the confidence decreases as the interval narrows. The same applies to statistical confidence intervals, but they also depend on other factors.

A 95% confidence interval means that if we drew samples from the population over and over and calculated the interval each time, about 95% of those intervals would contain the true population mean. So, from one sample we can calculate the sample mean and build an interval around it that will most likely contain the true population mean.
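This repeated-sampling interpretation can be checked with a small simulation. The sketch below assumes a normal population with a known standard deviation, so each interval is simply the sample mean plus or minus 1.96 standard errors; the population values (mean 20, standard deviation 2) and the sample size are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# assumed population and experiment settings (illustrative values only)
true_mean, sigma, n, trials = 20.0, 2.0, 30, 10_000
z = 1.96  # two-sided 95% critical value of the standard normal

# draw `trials` independent samples of size n and compute each sample mean
samples = rng.normal(true_mean, sigma, size=(trials, n))
means = samples.mean(axis=1)

# 95% interval for each sample: mean +/- z * sigma / sqrt(n)
half_width = z * sigma / np.sqrt(n)
covered = (means - half_width <= true_mean) & (true_mean <= means + half_width)

coverage = covered.mean()
print(f"share of intervals containing the true mean: {coverage:.3f}")
```

The printed share should land close to 0.95, while any single interval either contains the true mean or does not.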

Area under the two black lines shows the 95% confidence interval

The confidence interval as a concept was put forth by Jerzy Neyman in a paper published in 1937. There are various types of confidence intervals; some of the most commonly used are the CI for the mean, CI for the median, CI for the difference between means, CI for a proportion, and CI for the difference in proportions.

Let’s have a look at how this goes with Python.
