Data analysis toolkit covering descriptive statistics, outlier detection, correlation analysis, and data preprocessing for machine learning workflows.
Detailed documentation for each data analysis node available in Bioshift.
Generate comprehensive statistical summaries of numerical data
DataFrame with numerical columns
Statistical summary DataFrame
Detailed descriptive statistics
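As a rough illustration of what a statistical summary of numerical columns can look like, here is a minimal pandas sketch. It is not Bioshift's implementation: the function name `summarize`, the example column names, and the choice of extra statistics (skewness, kurtosis, missing counts) are assumptions made for the example.

```python
import pandas as pd


def summarize(df: pd.DataFrame) -> pd.DataFrame:
    """Return one row of summary statistics per numerical column (hypothetical helper)."""
    numeric = df.select_dtypes(include="number")  # ignore non-numerical columns
    summary = numeric.describe().T                # count, mean, std, min, quartiles, max
    summary["skew"] = numeric.skew()              # asymmetry of each distribution
    summary["kurtosis"] = numeric.kurt()          # tail heaviness (excess kurtosis)
    summary["missing"] = numeric.isna().sum()     # number of missing values
    return summary


# Example data (made up for illustration)
df = pd.DataFrame(
    {
        "a": [1.0, 2.0, 3.0, 4.0],
        "b": [10.0, 20.0, 30.0, 40.0],
        "label": ["w", "x", "y", "z"],
    }
)
summary = summarize(df)
print(summary)
```

The non-numerical `label` column is dropped before summarizing, which mirrors the stated input of a DataFrame with numerical columns.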
Compute correlation matrices and analyze relationships between variables
DataFrame with numerical columns
Correlation coefficient matrix
Heatmap visualization
Highly correlated variable pairs
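The correlation matrix and the list of highly correlated pairs can be sketched with pandas as follows. This is an assumption-laden example rather than the node's actual logic: the helper name `correlated_pairs`, the Pearson method, and the 0.9 threshold are all chosen for illustration, and the heatmap output is left out.

```python
import pandas as pd


def correlated_pairs(df: pd.DataFrame, threshold: float = 0.9):
    """Return the Pearson correlation matrix and the pairs with |r| >= threshold."""
    corr = df.select_dtypes(include="number").corr()  # Pearson by default
    cols = list(corr.columns)
    pairs = []
    # Visit each unordered pair once (upper triangle, excluding the diagonal)
    for i in range(len(cols)):
        for j in range(i + 1, len(cols)):
            r = corr.iloc[i, j]
            if abs(r) >= threshold:
                pairs.append((cols[i], cols[j], float(r)))
    return corr, pairs


# Example data (made up): y is an exact multiple of x, z is only loosely related
df = pd.DataFrame(
    {
        "x": [1, 2, 3, 4, 5],
        "y": [2, 4, 6, 8, 10],
        "z": [5, 3, 4, 1, 2],
    }
)
corr, pairs = correlated_pairs(df)
print(corr.round(2))
print(pairs)
```

Only the `x`/`y` pair passes the threshold here; `z` correlates with both at -0.8, which falls below the assumed cutoff.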
Identify and handle outliers using multiple statistical methods
DataFrame with numerical columns
Data with outliers removed or replaced
Indices of detected outliers
Detailed outlier analysis report
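The source does not say which statistical methods the node offers, so the sketch below assumes one common choice, the interquartile range (IQR) rule with a multiplier of 1.5. The function name `iqr_outliers` and the sample values are hypothetical. It shows the three kinds of output listed above: cleaned data, outlier indices, and the bounds a report could include.

```python
import pandas as pd


def iqr_outliers(series: pd.Series, k: float = 1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (assumed method, not Bioshift's)."""
    q1 = series.quantile(0.25)
    q3 = series.quantile(0.75)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    mask = (series < lower) | (series > upper)  # True where the value is an outlier
    return mask, lower, upper


# Example data (made up): one obviously extreme value at index 6
values = pd.Series([10, 12, 11, 13, 12, 11, 100])
mask, lower, upper = iqr_outliers(values)

outlier_indices = values.index[mask].tolist()  # indices of detected outliers
removed = values[~mask]                        # option 1: drop the outliers
clipped = values.clip(lower, upper)            # option 2: replace with the nearest bound

print(outlier_indices, lower, upper)
```

Removing rows shortens the data, while clipping keeps its length; which is preferable depends on the downstream step.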
Normalize and scale data using various transformation methods
DataFrame with numerical columns
Normalized data
Scaling parameters for inverse transform
Normalization method report
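To show why scaling parameters are returned alongside the normalized data, here is a minimal z-score sketch in pandas. The function names and the parameter dictionary layout are assumptions for the example, and z-score standardization is only one of the transformation methods such a node might support.

```python
import pandas as pd


def fit_standard(df: pd.DataFrame) -> dict:
    """Compute per-column mean and population standard deviation."""
    return {"mean": df.mean(), "std": df.std(ddof=0)}


def transform(df: pd.DataFrame, params: dict) -> pd.DataFrame:
    """Scale each column to zero mean and unit variance."""
    return (df - params["mean"]) / params["std"]


def inverse_transform(scaled: pd.DataFrame, params: dict) -> pd.DataFrame:
    """Map scaled values back to the original units using the stored parameters."""
    return scaled * params["std"] + params["mean"]


# Example data (made up)
df = pd.DataFrame(
    {
        "height": [150.0, 160.0, 170.0, 180.0],
        "weight": [50.0, 60.0, 70.0, 80.0],
    }
)
params = fit_standard(df)
scaled = transform(df, params)
restored = inverse_transform(scaled, params)

print(scaled)
print(restored)
```

A constant column has a standard deviation of zero and would produce a division by zero here, so a real workflow would need to handle that case separately.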
Create new features through mathematical transformations
DataFrame with numerical columns
Data with new features
Feature importance scores
Applied transformations log
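A small sketch of feature creation through mathematical transformations, together with a log of what was applied. The specific transformations (square and `log1p`), the naming scheme, and the helper `add_features` are assumptions for illustration. Feature importance scores are not shown, because they depend on a target variable and a scoring method that the source does not specify.

```python
import numpy as np
import pandas as pd


def add_features(df: pd.DataFrame):
    """Append squared and log1p versions of each numerical column (hypothetical helper)."""
    out = df.copy()
    applied = []  # human-readable log of applied transformations
    for col in df.select_dtypes(include="number").columns:
        out[f"{col}_squared"] = df[col] ** 2
        applied.append(f"{col}_squared = {col} ** 2")
        # log1p is only defined for values greater than -1, so skip other columns
        if (df[col] > -1).all():
            out[f"{col}_log1p"] = np.log1p(df[col])
            applied.append(f"{col}_log1p = log(1 + {col})")
    return out, applied


# Example data (made up): "delta" contains a value below -1
df = pd.DataFrame({"mass": [1.0, 2.0, 4.0], "delta": [-2.0, 0.0, 2.0]})
features, applied = add_features(df)

print(features.columns.tolist())
print(applied)
```

Keeping the log next to the data makes it possible to reproduce the same transformations on new data later.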
Common data analysis workflows you can build with these nodes.
Complete workflow for analyzing data quality and preparing for ML
Comprehensive EDA workflow for understanding dataset characteristics
Automated feature selection and engineering workflow
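The data quality portion of such a workflow might look like the following pandas sketch. It is only an assumption about what "analyzing data quality" could involve: the helpers `quality_report` and `prepare`, the checks performed, and the cleaning order are all hypothetical, and the actual workflows are built by connecting nodes rather than by writing code.

```python
import pandas as pd


def quality_report(df: pd.DataFrame) -> pd.DataFrame:
    """Summarize missing values and variability per column."""
    return pd.DataFrame(
        {
            "missing_fraction": df.isna().mean(),  # share of missing values
            "n_unique": df.nunique(),              # distinct non-missing values
            "is_constant": df.nunique() <= 1,      # columns that carry no information
        }
    )


def prepare(df: pd.DataFrame) -> pd.DataFrame:
    """Drop duplicate rows, rows with missing values, and constant columns."""
    cleaned = df.drop_duplicates().dropna()
    report = quality_report(cleaned)
    keep = report.index[~report["is_constant"]]
    return cleaned[keep]


# Example data (made up): one duplicate row, one missing value, one constant column
df = pd.DataFrame(
    {
        "a": [1.0, 2.0, 2.0, None, 5.0],
        "b": [1.0, 1.0, 1.0, 1.0, 1.0],
        "c": [3.0, 4.0, 4.0, 6.0, 7.0],
    }
)
report = quality_report(df)
prepared = prepare(df)

print(report)
print(prepared)
```

The prepared frame could then feed into outlier handling and normalization before model training.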