_plot_added_variable_doc = """\
    Create an added variable plot for a fitted regression model.

    Parameters
    ----------
    %(extra_params_doc)sfocus_exog : int or string
        The column index of exog, or a variable name, indicating the
        variable whose role in the regression is to be assessed.
    resid_type : str
        The type of residuals to use for the dependent variable.  If
        None, uses `resid_deviance` for GLM/GEE and `resid` otherwise.
    use_glm_weights : bool
        Only used if the model is a GLM or GEE.  If True, the
        residuals for the focus predictor are computed using WLS, with
        the weights obtained from the IRLS calculations for fitting
        the GLM. If False, unweighted regression is used.
    fit_kwargs : dict, optional
        Keyword arguments to be passed to fit when refitting the
        model.
    ax : Axes
        Matplotlib Axes instance.

    Returns
    -------
    Figure
        A matplotlib figure instance.
"""

_plot_partial_residuals_doc = """\
    Create a partial residual, or 'component plus residual' plot for a
    fitted regression model.

    Parameters
    ----------
    %(extra_params_doc)sfocus_exog : int or string
        The column index of exog, or variable name, indicating the
        variable whose role in the regression is to be assessed.
    ax : Axes
        Matplotlib Axes instance.

    Returns
    -------
    Figure
        A matplotlib figure instance.
"""

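# Illustrative sketch, not part of statsmodels: one way the
# ``%(extra_params_doc)s`` placeholder in the templates above could be filled
# with printf-style formatting.  The helper and the ``_demo_*`` names are
# hypothetical and exist only for this example.  In the two templates above the
# placeholder is fused to ``focus_exog``, so the inserted text has to end with
# a newline plus the four-space parameter indentation.
_demo_percent_template = """\
    Parameters
    ----------
    %(extra_params_doc)sfocus_exog : int or string
        The focus variable.
"""

# Hypothetical extra-parameter text; note the trailing newline and indentation.
_demo_extra_params = (
    "results : object\n"
    "        Hypothetical results instance, used only in this sketch.\n"
    "    "
)


def _fill_percent_template_sketch(template, extra_params_doc):
    """Hypothetical helper: fill a %-style docstring template."""
    return template % {"extra_params_doc": extra_params_doc}
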
_plot_ceres_residuals_doc = """\
    Conditional Expectation Partial Residuals (CERES) plot.

    Produce a CERES plot for a fitted regression model.

    Parameters
    ----------
    %(extra_params_doc)s
    focus_exog : {int, str}
        The column index of results.model.exog, or the variable name,
        indicating the variable whose role in the regression is to be
        assessed.
    frac : float
        Lowess tuning parameter for the adjusted model used in the
        CERES analysis.  Not used if `cond_means` is provided.
    cond_means : array_like, optional
        If provided, the columns of this array span the space of the
        conditional means E[exog | focus exog], where exog ranges over
        some or all of the columns of exog (other than the focus exog).
    ax : matplotlib.Axes instance, optional
        The axes on which to draw the plot. If not provided, a new
        axes instance is created.

    Returns
    -------
    Figure
        The figure on which the partial residual plot is drawn.

    Notes
    -----
    `cond_means` is intended to capture the behavior of E[x1 |
    x2], where x2 is the focus exog and x1 are all the other exog
    variables.  If all the conditional mean relationships are
    linear, it is sufficient to set cond_means equal to the focus
    exog.  Alternatively, cond_means may consist of one or more
    columns containing functional transformations of the focus
    exog (e.g. x2^2) that are thought to capture E[x1 | x2].

    If nothing is known or suspected about the form of E[x1 | x2],
    set `cond_means` to None, and it will be estimated by
    smoothing each non-focus exog against the focus exog.  The
    value of `frac` controls these lowess smooths.

    If cond_means contains only the focus exog, the results are
    equivalent to a partial residual plot.

    If the focus variable is believed to be independent of the
    other exog variables, `cond_means` can be set to an (empty)
    nx0 array.

    References
    ----------
    .. [1] RD Cook and R Croos-Dabrera (1998).  Partial residual plots
       in generalized linear models.  Journal of the American
       Statistical Association, 93:442.

    .. [2] RD Cook (1993). Partial residual plots.  Technometrics 35:4.

    Examples
    --------
    Using a model built from the state crime dataset, make a CERES plot
    with the rate of Poverty as the focus variable.

    >>> import statsmodels.api as sm
    >>> import matplotlib.pyplot as plt
    >>> import statsmodels.formula.api as smf
    >>> from statsmodels.graphics.regressionplots import plot_ceres_residuals

    >>> crime_data = sm.datasets.statecrime.load_pandas()
    >>> results = smf.ols('murder ~ hs_grad + urban + poverty + single',
    ...                   data=crime_data.data).fit()
    >>> plot_ceres_residuals(results, 'poverty')
    >>> plt.show()

    .. plot:: plots/graphics_regression_ceres_residuals.py
"""

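# Illustrative sketch, not part of statsmodels: building a ``cond_means`` array
# of the kinds described in the CERES notes above.  It assumes NumPy is
# available; the helper name and its ``powers`` argument are hypothetical.
# powers=(1,) gives only the focus exog (the partial-residual-plot case),
# powers=(1, 2) adds a squared term, and powers=() gives the empty n x 0 array
# suggested when the focus variable is believed independent of the others.
import numpy as np


def _cond_means_sketch(focus, powers=(1, 2)):
    """Hypothetical helper: stack powers of the focus exog as columns."""
    focus = np.asarray(focus, dtype=float)
    if len(powers) == 0:
        # np.column_stack cannot stack an empty list, so build n x 0 directly.
        return np.empty((focus.shape[0], 0))
    return np.column_stack([focus ** p for p in powers])
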

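# Illustrative sketch, not part of statsmodels: the template that follows uses
# ``{extra_params_doc}`` and doubled braces, which suggests ``str.format``
# rather than %-formatting.  The helper and the ``_demo_*`` names are
# hypothetical.  Doubled braces come out as literal single braces, and because
# the placeholder sits on its own line the inserted text needs no trailing
# newline.
_demo_format_template = """\
    Parameters
    ----------
    {extra_params_doc}
    criterion : str {{'DFFITS', 'Cooks'}}
        Which criterion to use.
"""

# Hypothetical extra-parameter text for the demo template.
_demo_format_extra_params = (
    "results : object\n"
    "        Hypothetical results instance, used only in this sketch."
)


def _fill_format_template_sketch(template, extra_params_doc):
    """Hypothetical helper: fill a str.format-style docstring template."""
    return template.format(extra_params_doc=extra_params_doc)
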
_plot_influence_doc = """\
    Plot of influence in regression. Plots studentized residuals vs. leverage.

    Parameters
    ----------
    {extra_params_doc}
    external : bool
        Whether to use externally or internally studentized residuals. It is
        recommended to leave external as True.
    alpha : float
        The alpha value to identify large studentized residuals. Large means
        abs(resid_studentized) > t.ppf(1-alpha/2, df=results.df_resid)
    criterion : str {{'DFFITS', 'Cooks'}}
        Which criterion to base the size of the points on. Options are
        DFFITS or Cook's D.
    size : float
        The range of `criterion` is mapped to the interval from 10**2 to
        size**2, in points.
    plot_alpha : float
        The `alpha` of the plotted points.
    ax : AxesSubplot
        An instance of a matplotlib Axes.
    **kwargs
        Additional parameters passed through to `plot`.

    Returns
    -------
    Figure
        The matplotlib figure that contains the Axes.

    Notes
    -----
    Row labels are shown for the observations whose leverage, measured by
    the diagonal of the hat matrix, is high or whose residuals are large,
    since the combination of a large residual and high leverage indicates an
    influential point. The threshold for large residuals is controlled by
    the `alpha` parameter. Large leverage points are identified as
    hat_i > 2 * (df_model + 1)/nobs.

    Examples
    --------
    Using a model built from the state crime dataset, plot the influence in
    regression.  Observations with high leverage or large residuals will be
    labeled in the plot to show potential influence points.

    >>> import statsmodels.api as sm
    >>> import matplotlib.pyplot as plt
    >>> import statsmodels.formula.api as smf

    >>> crime_data = sm.datasets.statecrime.load_pandas()
    >>> results = smf.ols('murder ~ hs_grad + urban + poverty + single',
    ...                   data=crime_data.data).fit()
    >>> sm.graphics.influence_plot(results)
    >>> plt.show()

    .. plot:: plots/graphics_regression_influence.py
    """


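# Illustrative sketch, not part of statsmodels: the leverage rule quoted in the
# influence-plot notes above, hat_i > 2 * (df_model + 1)/nobs, worked through
# for the special case of one predictor plus an intercept, where the hat-matrix
# diagonal has the closed form 1/n + (x_i - mean(x))**2 / Sxx.  Both helper
# names are hypothetical and the data in any call are made up.
def _simple_regression_leverage_sketch(x):
    """Hypothetical helper: hat-matrix diagonal for y ~ 1 + x."""
    n = len(x)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return [1.0 / n + (xi - x_bar) ** 2 / sxx for xi in x]


def _high_leverage_flags_sketch(leverage, df_model):
    """Hypothetical helper: flag points above 2 * (df_model + 1) / nobs."""
    nobs = len(leverage)
    cutoff = 2.0 * (df_model + 1) / nobs
    return [h > cutoff for h in leverage]
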
_plot_leverage_resid2_doc = """\
    Plot leverage statistics vs. normalized residuals squared.

    Parameters
    ----------
    results : results instance
        A regression results instance
    alpha : float
        Specifies the cut-off for large standardized residuals. Residuals
        are assumed to be distributed N(0, 1), with `alpha` as the
        significance level.
    ax : Axes
        Matplotlib Axes instance
    **kwargs
        Additional parameters passed to the plot command.

    Returns
    -------
    Figure
        A matplotlib figure instance.

    Examples
    --------
    Using a model built from the state crime dataset, plot the leverage
    statistics vs. normalized residuals squared.  Observations with
    large standardized residuals will be labeled in the plot.

    >>> import statsmodels.api as sm
    >>> import matplotlib.pyplot as plt
    >>> import statsmodels.formula.api as smf

    >>> crime_data = sm.datasets.statecrime.load_pandas()
    >>> results = smf.ols('murder ~ hs_grad + urban + poverty + single',
    ...                   data=crime_data.data).fit()
    >>> sm.graphics.plot_leverage_resid2(results)
    >>> plt.show()

    .. plot:: plots/graphics_regression_leverage_resid2.py
    """