This part of the project documentation focuses on an information-oriented approach. Use it as a reference for the technical implementation of the mlpForecaster project code.
CorrelationAnalyzer
A class to calculate and visualize the correlation between variables in a data frame.
corr
staticmethod
corr(
data,
variable_col,
target_col,
method="scatter",
ties="auto",
hue_col=None,
n_sample=None,
)
Calculate the correlation between the target column and other variables in the data frame.
Parameters:
- data (DataFrame) – The data frame containing the data.
- variable_col (list of str) – List of column names to be used as independent variables.
- target_col (str) – The name of the dependent variable column.
- method (str, default: 'scatter') – The method to use for calculating the correlation:
  - 'scatter' (default): Scatter plot.
  - 'pearson': Pearson correlation.
  - 'kendall': Kendall rank correlation.
  - 'spearman': Spearman rank correlation.
  - 'ppscore': Predictive Power Score (PPS).
  - 'xicor': Xi correlation.
- ties (str or bool, default: 'auto') – How to handle ties in Xi correlation calculation:
  - 'auto' (default): Decide based on the uniqueness of y values.
  - True: Assume ties are present.
  - False: Assume no ties are present.
- hue_col (str, default: None) – The column in data to use for color grouping.
- n_sample (int, default: None) – The number of samples to use for the scatter
Returns:
- DataFrame – DataFrame containing the correlation between the target column and each variable.
Raises:
- ValueError – If the method is not supported.
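To make the method and ties options, the returned table, and the ValueError concrete, here is a minimal pure-Python sketch. It is not the code in mlpforecast/stats/corr.py. The helper names (pearson, xi_correlation, correlation_table), the dict-of-lists stand-in for a DataFrame, and the output keys variable, target and value are all assumptions made for illustration. The Xi function follows Chatterjee's published formula, and 'auto' switches to the tie-aware variant when the y values are not all unique.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def xi_correlation(x, y, ties="auto"):
    """Chatterjee's Xi coefficient of y as a function of x.

    Assumes y is not constant. Ties in x keep their input order here,
    which is a simplification of the random tie-breaking in the paper.
    """
    n = len(x)
    if ties == "auto":
        # Decide from the data: ties exist if some y value repeats.
        ties = len(set(y)) < n
    # Reorder y by increasing x (sorted() is stable).
    y_sorted = [b for _, b in sorted(zip(x, y), key=lambda pair: pair[0])]
    # r[i] = number of observations with y <= y_sorted[i].
    r = [sum(1 for v in y_sorted if v <= b) for b in y_sorted]
    diff = sum(abs(r[i + 1] - r[i]) for i in range(n - 1))
    if ties:
        # l[i] = number of observations with y >= y_sorted[i].
        l = [sum(1 for v in y_sorted if v >= b) for b in y_sorted]
        return 1 - n * diff / (2 * sum(li * (n - li) for li in l))
    return 1 - 3 * diff / (n ** 2 - 1)


# Hypothetical registry: only two of the documented methods are sketched.
METHODS = {"pearson": pearson, "xicor": xi_correlation}


def correlation_table(columns, variable_col, target_col, method="pearson"):
    """Return one row per variable: (variable, target, value)."""
    if method not in METHODS:
        raise ValueError(f"Unsupported method: {method!r}")
    func = METHODS[method]
    return [
        {
            "variable": name,
            "target": target_col,
            "value": func(columns[name], columns[target_col]),
        }
        for name in variable_col
    ]


# Illustrative data only.
columns = {
    "temperature": [1.0, 2.0, 3.0, 4.0, 5.0],
    "load": [2.0, 4.0, 6.0, 8.0, 10.0],
}
table = correlation_table(columns, ["temperature"], "load", method="pearson")
```

The long format, with one row per variable and target pair, matches the three-column layout that plot expects for corr_df. For a perfectly monotone relationship without ties, Xi equals 1 - 3/(n + 1), so it only approaches 1 as the sample grows.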
Source code in mlpforecast/stats/corr.py
plot
staticmethod
plot(ax, corr_df)
Plot the correlation data using a heatmap.
Parameters:
- ax (Axes) – The axes on which to plot the heatmap.
- corr_df (DataFrame) – DataFrame containing the correlation data with three columns: two for the pairs of items and one for the correlation values.
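A heatmap is drawn from a two-dimensional grid, so three-column data like corr_df generally has to be pivoted at some point. The source does not say whether plot does this internally. The sketch below shows one plausible version of that step in pure Python, using a list of dicts as a stand-in for the DataFrame. The function name pivot_long_to_matrix and the keys variable, target and value are assumptions.

```python
def pivot_long_to_matrix(rows, row_key="variable", col_key="target",
                         value_key="value"):
    """Pivot long-format rows into (row_labels, col_labels, matrix)."""
    row_labels = sorted({r[row_key] for r in rows})
    col_labels = sorted({r[col_key] for r in rows})
    lookup = {(r[row_key], r[col_key]): r[value_key] for r in rows}
    # Missing pairs become NaN so the grid stays rectangular.
    matrix = [
        [lookup.get((a, b), float("nan")) for b in col_labels]
        for a in row_labels
    ]
    return row_labels, col_labels, matrix


# Illustrative values only.
rows = [
    {"variable": "a", "target": "y", "value": 0.5},
    {"variable": "b", "target": "y", "value": -0.2},
]
row_labels, col_labels, matrix = pivot_long_to_matrix(rows)
```

The resulting matrix and labels are in the shape that heatmap functions usually accept, with row_labels and col_labels serving as the tick labels.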
Source code in mlpforecast/stats/corr.py