eda_report
- eda_report.get_word_report(data: Iterable, *, title: str = 'Exploratory Data Analysis Report', graph_color: str = 'cyan', groupby_variable: str | int = None, output_filename: str = 'eda-report.docx', table_style: str = 'Table Grid') -> ReportDocument
Analyze data, and generate a report document in Word (.docx) format.
- Parameters:
data (Iterable) – The data to analyze.
title (str, optional) – The title to assign the report. Defaults to “Exploratory Data Analysis Report”.
graph_color (str, optional) – The color to apply to the graphs. Defaults to “cyan”.
groupby_variable (Union[str, int], optional) – The label/index for the column to use to group values. Defaults to None.
output_filename (str, optional) – The file name or path under which to save the report document. Defaults to “eda-report.docx”.
table_style (str, optional) – The style to apply to the tables created. Defaults to “Table Grid”.
- Returns:
Document object with analysis results.
- Return type:
ReportDocument
Example
>>> import eda_report
>>> eda_report.get_word_report(iris_data)
Analyze variables: 100%|███████████████████████████████████| 5/5
Plot variables: 100%|███████████████████████████████████| 5/5
Bivariate analysis: 100%|███████████████████████████████████| 6/6 pairs.
[INFO 16:14:53.648] Done. Results saved as 'eda-report.docx'
<eda_report.document.ReportDocument object at 0x7f196753bd60>
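The optional arguments are keyword-only (they follow the `*` in the signature), so they must be passed by name. The following is a minimal sketch of a customized call, not an official recipe. The helper names `build_sample_data` and `write_grouped_report`, the made-up measurements, the `'teal'` color, and the output file name are all illustrative assumptions. It also assumes that a dict of equal-length lists is an acceptable `data` argument.

```python
from typing import Iterable


def build_sample_data() -> dict:
    """Return a small, made-up column-oriented dataset (hypothetical values)."""
    return {
        "height_cm": [150.0, 162.5, 171.0, 158.0, 180.5, 166.0],
        "weight_kg": [52.0, 61.5, 70.0, 55.5, 82.0, 64.0],
        "group": ["a", "b", "a", "b", "a", "b"],
    }


def write_grouped_report(data: Iterable, filename: str = "grouped-report.docx"):
    """Generate a Word report, grouping values by the 'group' column."""
    # Imported lazily so this module can be loaded without eda_report installed.
    import eda_report

    # Every option after `data` is keyword-only, so each is passed by name.
    return eda_report.get_word_report(
        data,
        title="Sample Measurements",
        graph_color="teal",  # assumed to be an accepted color name
        groupby_variable="group",  # column label used to group values
        output_filename=filename,
        table_style="Table Grid",  # the documented default, shown explicitly
    )
```

With eda_report installed, `write_grouped_report(build_sample_data())` would be expected to save the report under the given file name and return the ReportDocument object.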
- eda_report.summarize(data: Iterable) -> Variable | Dataset
Get summary statistics for the supplied data.
- Parameters:
data (Iterable) – The data to analyze.
- Returns:
Analysis results.
- Return type:
Variable | Dataset
Example
>>> eda_report.summarize(iris_data)
Summary Statistics for Numeric features (4)
-------------------------------------------
              count     avg  stddev  min  25%   50%  75%  max  skewness  kurtosis
sepal_length    150  5.8433  0.8281  4.3  5.1  5.80  6.4  7.9    0.3149   -0.5521
sepal_width     150  3.0573  0.4359  2.0  2.8  3.00  3.3  4.4    0.3190    0.2282
petal_length    150  3.7580  1.7653  1.0  1.6  4.35  5.1  6.9   -0.2749   -1.4021
petal_width     150  1.1993  0.7622  0.1  0.3  1.30  1.8  2.5   -0.1030   -1.3406

Summary Statistics for Categorical features (1)
-----------------------------------------------
         count  unique     top  freq  relative freq
species    150       3  setosa    50         33.33%

Pearson's Correlation (Top 20)
------------------------------
petal_length & petal_width -> very strong positive correlation (0.96)
sepal_length & petal_length -> very strong positive correlation (0.87)
sepal_length & petal_width -> very strong positive correlation (0.82)
sepal_width & petal_length -> moderate negative correlation (-0.43)
sepal_width & petal_width -> weak negative correlation (-0.37)
sepal_length & sepal_width -> very weak negative correlation (-0.12)
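The correlation lines pair each Pearson coefficient with a verbal strength and a direction. The sketch below shows one way such labels could be derived. The function name `describe_correlation` is hypothetical, and the cutoffs (0.2, 0.4, 0.6, 0.8) are an assumption chosen to agree with the six pairs shown above, so the library's actual thresholds may differ.

```python
def describe_correlation(first: str, second: str, r: float) -> str:
    """Format one correlation line in the style of the summary output."""
    # Assumed cutoffs on |r|, checked from strongest to weakest.
    cutoffs = [
        (0.8, "very strong"),
        (0.6, "strong"),
        (0.4, "moderate"),
        (0.2, "weak"),
    ]
    magnitude = abs(r)
    strength = "very weak"  # fallback when |r| is below every cutoff
    for cutoff, label in cutoffs:
        if magnitude >= cutoff:
            strength = label
            break
    # A coefficient of exactly zero is labeled positive here for simplicity.
    direction = "positive" if r >= 0 else "negative"
    return f"{first} & {second} -> {strength} {direction} correlation ({r:.2f})"


if __name__ == "__main__":
    print(describe_correlation("sepal_width", "petal_length", -0.43))
```

Under these assumed cutoffs, a coefficient of -0.43 falls in the "moderate" band and -0.37 in the "weak" band, matching the labels in the output above.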