Data manipulation toolkit with six specialized nodes for selecting, filtering, slicing, deduplicating, sorting, and merging tabular data in scientific workflows.
Detailed documentation for each data processing node available in Bioshift.
Select specific columns from DataFrames
Input DataFrame
List of column names to select
DataFrame with selected columns
Number of selected columns
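As a rough illustration of what column selection amounts to, here is a minimal pandas sketch. The helper name `select_columns` and the sample data are hypothetical and are not part of Bioshift's API; the sketch only assumes the node behaves like ordinary DataFrame column indexing.

```python
import pandas as pd


def select_columns(df: pd.DataFrame, columns: list) -> tuple:
    """Hypothetical helper: return the selected columns and their count."""
    # Fail early with a clear message if a requested column is absent.
    missing = [c for c in columns if c not in df.columns]
    if missing:
        raise KeyError(f"Columns not found: {missing}")
    selected = df[columns].copy()
    return selected, len(columns)


# Illustrative data only.
samples = pd.DataFrame({
    "sample_id": ["s1", "s2", "s3"],
    "ph": [7.1, 6.8, 7.4],
    "temp_c": [21.0, 22.5, 20.1],
})
selected, n_columns = select_columns(samples, ["sample_id", "ph"])
```

The two return values mirror the node's two outputs: the reduced DataFrame and the number of selected columns.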
Filter DataFrame rows based on conditions
Input DataFrame
Filter condition expression
Filtered DataFrame
Number of filtered rows
Filtering operation summary
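The filter condition could be pictured as an expression of the kind accepted by pandas' `DataFrame.query`. This is an assumption: the expression syntax Bioshift actually accepts is not stated here. The helper `filter_rows` and the summary keys are hypothetical, and the row count is interpreted as the number of rows kept.

```python
import pandas as pd


def filter_rows(df: pd.DataFrame, condition: str) -> tuple:
    """Hypothetical helper: keep the rows for which `condition` holds."""
    filtered = df.query(condition)
    # Assumed shape for the "operation summary" output.
    summary = {
        "condition": condition,
        "rows_in": len(df),
        "rows_out": len(filtered),
        "rows_dropped": len(df) - len(filtered),
    }
    return filtered, len(filtered), summary


# Illustrative data only.
readings = pd.DataFrame({
    "sample_id": ["s1", "s2", "s3", "s4"],
    "ph": [7.1, 6.8, 7.4, 5.9],
})
filtered, n_rows, summary = filter_rows(readings, "ph >= 6.5 and ph <= 7.2")
```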
Extract rows by position or range
Input DataFrame
Starting row index
Ending row index
Sliced DataFrame
Number of sliced rows
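Positional slicing can be sketched with pandas' `iloc`. Whether the ending index is inclusive or exclusive in Bioshift is not specified above; the sketch assumes Python-style half-open ranges (end excluded). The helper name `slice_rows` is hypothetical.

```python
import pandas as pd


def slice_rows(df: pd.DataFrame, start: int, end: int) -> tuple:
    """Hypothetical helper: rows at positions start..end-1 (end excluded)."""
    sliced = df.iloc[start:end]
    return sliced, len(sliced)


# Illustrative data only.
measurements = pd.DataFrame({"value": [10, 20, 30, 40, 50]})
sliced, n_rows = slice_rows(measurements, 1, 4)
```

If the node treats the ending index as inclusive, the equivalent slice would be `df.iloc[start:end + 1]`.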
Remove duplicate rows from DataFrames
Input DataFrame
Columns to consider for duplicates
DataFrame without duplicates
Number of duplicates removed
Removed duplicate rows
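The three outputs of duplicate removal map naturally onto pandas' `DataFrame.duplicated`. The sketch below assumes the first occurrence of each duplicate group is the one kept, which is not confirmed above; `remove_duplicates` is a hypothetical helper name.

```python
import pandas as pd


def remove_duplicates(df: pd.DataFrame, subset=None) -> tuple:
    """Hypothetical helper: split df into unique rows and removed duplicates."""
    # True for every row that repeats an earlier row on the `subset` columns.
    is_duplicate = df.duplicated(subset=subset, keep="first")
    deduplicated = df[~is_duplicate]
    removed = df[is_duplicate]
    return deduplicated, int(is_duplicate.sum()), removed


# Illustrative data only.
runs = pd.DataFrame({
    "sample_id": ["s1", "s2", "s1", "s3", "s2"],
    "signal": [0.91, 0.75, 0.93, 0.60, 0.74],
})
deduplicated, n_removed, removed = remove_duplicates(runs, subset=["sample_id"])
```

Passing `subset=None` compares entire rows, which corresponds to leaving the "columns to consider" input empty under this assumption.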
Sort DataFrame rows by one or more columns
Input DataFrame
Columns to sort by
Sort order for each column
Sorted DataFrame
Sorting operation details
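Multi-column sorting with a per-column order corresponds to pandas' `sort_values` with list arguments. The helper `sort_rows` and the keys of the returned details dictionary are hypothetical, chosen only to echo the node's outputs.

```python
import pandas as pd


def sort_rows(df: pd.DataFrame, by: list, ascending: list) -> tuple:
    """Hypothetical helper: sort by several columns, one order flag per column."""
    if len(by) != len(ascending):
        raise ValueError("Provide exactly one sort order per sort column")
    sorted_df = df.sort_values(by=by, ascending=ascending)
    # Assumed shape for the "operation details" output.
    details = {"by": by, "ascending": ascending, "rows": len(sorted_df)}
    return sorted_df, details


# Illustrative data only.
results = pd.DataFrame({
    "group": ["b", "a", "b", "a"],
    "value": [1, 3, 2, 4],
})
# Group ascending, then value descending within each group.
sorted_df, details = sort_rows(results, ["group", "value"], [True, False])
```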
Combine two DataFrames (left and right) using various join operations
Left DataFrame
Right DataFrame
Left DataFrame join columns
Right DataFrame join columns
Merged DataFrame
Merge operation statistics
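A join on differently named key columns can be sketched with pandas' `DataFrame.merge`. The `how` parameter is an assumption, since the inputs listed above do not say how the join type is chosen. The helper `merge_frames` and the statistics keys are likewise hypothetical.

```python
import pandas as pd


def merge_frames(left, right, left_on, right_on, how="inner") -> tuple:
    """Hypothetical helper: join two DataFrames and report basic statistics."""
    # indicator=True adds a "_merge" column marking each row's origin.
    merged = left.merge(
        right, left_on=left_on, right_on=right_on, how=how, indicator=True
    )
    stats = {
        "how": how,
        "left_rows": len(left),
        "right_rows": len(right),
        "merged_rows": len(merged),
        "matched_rows": int((merged["_merge"] == "both").sum()),
    }
    return merged.drop(columns="_merge"), stats


# Illustrative data only.
sites = pd.DataFrame({"sample_id": ["s1", "s2", "s3"], "site": ["A", "B", "C"]})
assays = pd.DataFrame({"sample": ["s2", "s3", "s4"], "result": [0.4, 0.7, 0.9]})
merged, stats = merge_frames(sites, assays, ["sample_id"], ["sample"], how="outer")
```

With an outer join, unmatched rows from either side are kept and filled with missing values, so comparing `matched_rows` with `merged_rows` gives a quick check on join quality.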
Common data processing workflows you can build with these nodes.
Complete workflow for cleaning and preparing datasets
Combine multiple datasets with join operations
Extract representative samples from large datasets
Typical workflow for data preprocessing in scientific applications.
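To make the cleaning and preparation workflow concrete, the following sketch chains the pandas counterparts of four nodes: duplicate removal, row filtering, column selection, and sorting. It is only an approximation under the assumption that each node behaves like the corresponding pandas call; the data and the threshold are illustrative.

```python
import pandas as pd

# Illustrative raw data with one fully duplicated row and one out-of-range reading.
raw = pd.DataFrame({
    "sample_id": ["s3", "s1", "s2", "s1", "s4"],
    "ph": [7.4, 7.1, 6.8, 7.1, 5.9],
    "temp_c": [20.1, 21.0, 22.5, 21.0, 19.8],
    "operator": ["ann", "bob", "ann", "bob", "cy"],
})

cleaned = (
    raw.drop_duplicates()                        # remove duplicate rows
    .query("ph >= 6.5")                          # filter rows by condition
    .loc[:, ["sample_id", "ph", "temp_c"]]       # select columns
    .sort_values("sample_id")                    # sort rows
    .reset_index(drop=True)
)
```

In a node graph, each chained call would correspond to one node whose output DataFrame feeds the next node's input.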