Normality & Levene module
This module evaluates whether your data meet the assumptions of normality and homogeneity of variances (homoscedasticity) that underlie parametric tests such as One-way ANOVA. It applies Shapiro-Wilk, Lilliefors (Kolmogorov-Smirnov), and Levene's test per group, generates Q-Q plots, and provides an overall interpretive summary with a recommendation.
Accepted file format
Upload an .xlsx file where:
- One column is a categorical factor (treatment, group, medium...) → set as Factor.
- One or more columns are numeric response variables → set as Response.
- Other columns can be set to Ignore.
Sidebar options
Step 1 — Upload data
Click Browse .xlsx to load an Excel file. The file is read by readxl::read_excel(); only the first sheet is imported. Character/factor columns are auto-classified as Factor and numeric columns as Response.
Step 2 — Classify variables
Each column is assigned one of three roles via a dropdown:
Factor (grouping):
The categorical column that defines the groups
(e.g. treatment, medium, genotype). Exactly one factor must be selected.
Response variable:
A numeric column with the measurements to test.
Ignore:
Columns excluded from the analysis.
Step 3 — Analysis setup
Grouping factor:
Select which Factor column defines the groups.
Response variable:
Select which Response column to test for normality.
Significance level (α):
The threshold for declaring significance
(default: 0.05). Applied uniformly to Shapiro-Wilk, Lilliefors, and Levene's test.
If p < α, the null hypothesis is rejected.
Tests included
Shapiro-Wilk (1965):
One of the most powerful tests of normality for small-to-moderate samples (3 ≤ n ≤ 5000). Tests the null hypothesis that the sample was drawn from a normal distribution. Applied independently to each group. Implemented via stats::shapiro.test().
Lilliefors / KS (1967):
A variant of the Kolmogorov-Smirnov test that does not require the population mean and variance to be specified a priori (parameters are estimated from the data). Useful as a complement to Shapiro-Wilk, especially for larger n. Implemented via nortest::lillie.test().
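The two per-group checks can be sketched in a few lines of R; the data frame and column names below are hypothetical, not the app's internal code:

```r
# Per-group Shapiro-Wilk and Lilliefors tests (hypothetical example data)
library(nortest)

set.seed(1)
df <- data.frame(
  treatment = rep(c("Control", "T1", "T2"), each = 12),
  length    = c(rnorm(12, 5), rnorm(12, 6), rnorm(12, 5.5))
)

per_group <- lapply(split(df$length, df$treatment), function(x) {
  sw <- shapiro.test(x)          # Shapiro-Wilk (stats)
  lf <- nortest::lillie.test(x)  # Lilliefors / KS (nortest)
  data.frame(n = length(x),
             W = unname(sw$statistic), p_shapiro = sw$p.value,
             D = unname(lf$statistic), p_lillie  = lf$p.value)
})
do.call(rbind, per_group)  # one row per group
```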
Q-Q Plots:
Normal quantile-quantile plots are generated per group using ggplot2::stat_qq() and stat_qq_line(). Points falling along the diagonal indicate normality; systematic deviations suggest non-normal distributions.
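A faceted Q-Q plot in the style used by the module can be built like this (the groups and values are made up for illustration):

```r
# Faceted Q-Q plot, one panel per group (hypothetical example data)
library(ggplot2)

set.seed(2)
df <- data.frame(group = rep(c("A", "B"), each = 30),
                 value = c(rnorm(30), rexp(30)))  # B is deliberately non-normal

p <- ggplot(df, aes(sample = value)) +
  stat_qq() +        # sample quantiles vs. theoretical normal quantiles
  stat_qq_line() +   # reference diagonal
  facet_wrap(~ group)
p
```

Group A's points should hug the line; group B's should curve away from it at the upper tail.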
Levene's test — median-based (Brown-Forsythe, 1974):
Tests the null hypothesis that all group variances are equal (homoscedasticity). The median-based version (Brown-Forsythe) is more robust to non-normality than the original mean-based formulation. Implemented via car::leveneTest(center = 'median'). This is the recommended check before running ANOVA.
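The homoscedasticity check amounts to a single call; the data below are hypothetical, with one group given a deliberately larger spread:

```r
# Brown-Forsythe (median-centred) Levene's test (hypothetical example data)
library(car)

set.seed(3)
df <- data.frame(
  group = factor(rep(c("A", "B", "C"), each = 15)),
  value = c(rnorm(15, sd = 1), rnorm(15, sd = 1), rnorm(15, sd = 3))
)

# p < alpha would indicate unequal variances (heteroscedasticity)
car::leveneTest(value ~ group, data = df, center = "median")
```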
Summary: The Summary tab provides an overall interpretation based on the results of all three tests. It counts the proportion of groups passing/failing Shapiro-Wilk and Lilliefors, evaluates Levene's test, and issues a recommendation (e.g. proceed with parametric ANOVA vs. consider a non-parametric alternative).
Output tabs
Shapiro-Wilk:
Table with W statistic, p-value, and significance verdict per group.
Lilliefors (K-S):
Table with D statistic, p-value, and significance verdict per group.
Q-Q Plots:
Faceted Q-Q plot for visual inspection of normality per group.
Levene's Test:
F-statistic, p-value, and significance verdict, plus a variance-per-group table.
Summary:
Integrated verdict and recommendation.
Downloads
Download Results (.xlsx): An Excel workbook with sheets for Analysis_Info, Shapiro-Wilk, Lilliefors, Levene, and Variance_Groups.
R packages used
stats (shapiro.test), nortest (lillie.test), car (leveneTest), ggplot2 (Q-Q plots), readxl (data import), writexl (Excel export), DT (interactive tables).
How to cite in your manuscript
Shapiro-Wilk:
Shapiro, S. S., & Wilk, M. B. (1965). An analysis of variance test for normality (complete samples). Biometrika, 52(3–4), 591–611.
Lilliefors:
Lilliefors, H. W. (1967). On the Kolmogorov-Smirnov test for normality with mean and variance unknown. Journal of the American Statistical Association, 62(318), 399–402.
Levene / Brown-Forsythe:
Brown, M. B., & Forsythe, A. B. (1974). Robust tests for the equality of variances. Journal of the American Statistical Association, 69(346), 364–367.
Correlations module
This module computes pairwise correlation coefficients between two or more numeric variables, tests their statistical significance, and provides visual summaries (heatmap and scatter plot). Both parametric (Pearson) and non-parametric (Spearman) methods are available.
Accepted file format
Upload an .xlsx file where:
- Two or more columns are numeric variables → set as Numeric.
- Other columns can be set to Ignore.
Sidebar options
Step 1 — Upload data
Click Browse .xlsx to load an Excel file. Numeric columns are auto-classified as Numeric; text/factor columns as Ignore.
Step 2 — Classify variables
Each column is assigned one of two roles:
Numeric variable:
Included in the pairwise correlation matrix.
At least two numeric variables are required.
Ignore:
Excluded from the analysis.
Step 3 — Analysis setup
Correlation method:
• Pearson — Measures linear association between two continuous variables. Assumes bivariate normality. Best when the relationship is approximately linear.
• Spearman — A rank-based non-parametric measure of monotonic association. Robust to non-normality and outliers. Recommended when normality cannot be assumed.
Handle missing values:
• Complete observations only — Only rows with no missing values across all selected variables are used. All pairs share the same sample size.
• Pairwise complete — Each pair uses all rows where both variables are non-missing. Maximises data usage, but pairs may have different sample sizes.
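Base R's cor() illustrates the difference between the two strategies on a toy matrix with scattered NAs (the values here are made up):

```r
# Two missing-value strategies, illustrated with base R (toy data)
m <- data.frame(x = c(1, 2, NA, 4, 5, 6),
                y = c(2, NA, 3, 5, 6, 8),
                z = c(1, 3, 4, NA, 7, 9))

cor(m, use = "complete.obs")           # drops every row with an NA, for all pairs
cor(m, use = "pairwise.complete.obs")  # each pair keeps its own non-missing rows
```

With "complete.obs" only the three fully observed rows survive, so every pair shares n = 3; with "pairwise.complete.obs" each pair uses its own four complete rows.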
Significance level (α):
The threshold for declaring a correlation statistically significant (default: 0.05). Each pair is tested individually with cor.test(). Significance codes: *** p < 0.001, ** p < 0.01, * p < 0.05, . p < 0.10, ns otherwise.
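A single pairwise test looks like this (hypothetical vectors; the app repeats the call for every variable pair):

```r
# One pairwise correlation test (hypothetical vectors)
set.seed(4)
x <- rnorm(40)
y <- 0.6 * x + rnorm(40, sd = 0.5)

ct <- cor.test(x, y, method = "pearson")  # method = "spearman" for ranks
c(r = unname(ct$estimate), p = ct$p.value, n = length(x))
```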
Output tabs
Correlation Table:
A long-format table showing all pairwise combinations
with columns for Variable_1, Variable_2, n, r (or ρ), p-value, confidence interval
(Pearson only), and significance code. Rows with p < α are highlighted.
Heatmap:
A colour-coded matrix of correlation coefficients (red = negative,
green = positive). Significant pairs are annotated with their r value and significance code.
Scatter Plot:
A bivariate scatter plot with regression line for any
selected pair. The controls above the plot allow choosing the X and Y variables.
The annotation shows r, p-value, and n.
Downloads
Download Results (.xlsx):
An Excel workbook with sheets for Analysis_Info and Correlations (full pairwise table).
Download Heatmap (.pptx):
An editable PowerPoint slide containing the correlation heatmap as a vector graphic (rvg::dml).
R packages used
stats (cor.test), ggplot2 (heatmap, scatter plot), readxl (data import), writexl (Excel export), officer + rvg (PowerPoint export), DT (interactive tables).
How to cite in your manuscript
Pearson:
Pearson, K. (1895). Notes on regression and inheritance in the case of two parents. Proceedings of the Royal Society of London, 58, 240–242.
Spearman:
Spearman, C. (1904). The proof and measurement of association between two things. The American Journal of Psychology, 15(1), 72–101.
One-way ANOVA module
This module performs a One-way Analysis of Variance (ANOVA) to test whether the means of a numeric response variable differ significantly across the levels of a single categorical factor. If the ANOVA is significant, a post-hoc test (Tukey HSD or Bonferroni) identifies which specific group pairs differ. Results are presented as a publication-ready bar plot with compact letter display (CLD), an ANOVA summary table, descriptive statistics, and a pairwise comparison table.
Accepted file format
Upload an .xlsx file where:
- One (or more) column(s) are categorical factors (treatment, medium, genotype...) → set as Factor.
- The remaining columns are numeric response variables (measurements) → set as Response.
- Columns you do not need can be set to Ignore.
Sidebar options
Step 1 — Upload data
Click Browse .xlsx to load an Excel file. Character/factor columns are auto-classified as Factor and numeric columns as Response.
Step 2 — Classify variables
Each column is assigned one of three roles:
Factor (grouping):
The categorical column defining groups.
Response variable:
A numeric column with measurements.
Ignore:
Excluded from the analysis.
Step 3 — Analysis setup
Grouping factor:
Which Factor column defines the groups to compare.
Response variable:
Which numeric column contains the measurements.
Y-axis label:
Custom label for the bar plot Y-axis. Auto-filled with the
response variable name; editable to any text (e.g. 'Shoot length (cm)').
Significance level (α):
Threshold for ANOVA and post-hoc significance
(default: 0.05). Affects the CLD grouping letters and the Tukey/Bonferroni significance flag.
Post-hoc test:
• Tukey HSD — Honestly Significant Difference. Controls the family-wise error rate for all pairwise comparisons. Computed via agricolae::HSD.test(). This is the standard post-hoc choice after ANOVA.
• Bonferroni — Adjusts α by the number of comparisons. More conservative than Tukey HSD; appropriate when multiple comparisons must be controlled strictly.
Color palette:
• AgroBio (green) — The institutional green palette.
• Viridis — A perceptually uniform, colourblind-friendly palette.
• Colorblind safe — The Okabe-Ito palette, optimised for deuteranopia and protanopia.
Bar plot order:
• Alphabetical (A → Z / Z → A) — Groups sorted by name.
• Mean (high → low / low → high) — Groups sorted by their mean response value.
• Custom — A text area appears where you define the exact order (one group per line).
What is One-way ANOVA?
One-way Analysis of Variance tests the null hypothesis that all group means are equal (H₀: μ₁ = μ₂ = … = μₖ) against the alternative that at least one differs. The F-statistic is the ratio of between-group variance to within-group variance. If p < α, at least one pair of groups differs significantly. ANOVA is computed via stats::aov().
The bar plot displays group means with error bars (± standard error) and compact letter display (CLD). Groups sharing the same letter are not significantly different. CLD letters are derived from the selected post-hoc test.
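The ANOVA-plus-CLD workflow can be sketched as follows (treatment labels, group means, and sample sizes are hypothetical):

```r
# One-way ANOVA followed by Tukey HSD with CLD letters (hypothetical data)
library(agricolae)

set.seed(5)
df <- data.frame(
  treatment = factor(rep(c("T1", "T2", "T3"), each = 10)),
  response  = c(rnorm(10, 5), rnorm(10, 7), rnorm(10, 7.2))
)

fit <- aov(response ~ treatment, data = df)
summary(fit)  # Df, Sum Sq, Mean Sq, F value, Pr(>F)

hsd <- agricolae::HSD.test(fit, "treatment", alpha = 0.05)
hsd$groups    # group means with compact letter display (CLD)
```

Groups sharing a letter in hsd$groups are not significantly different at the chosen α.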
Output tabs
Bar Plot:
Publication-ready bar chart with group means, SE error bars,
significance letters (CLD), and customisable colours/order.
ANOVA Results:
The ANOVA summary table (Df, Sum Sq, Mean Sq, F, p)
plus descriptive statistics per group (n, mean, SD, SE, CV%).
Post-hoc Pairwise:
Full pairwise comparison table from the selected
post-hoc test (difference, CI, adjusted p-value, significance flag).
Downloads
Download Plot (.pptx):
A PowerPoint slide with the bar plot as a vector graphic (rvg::dml), fully editable in PowerPoint.
Download Results (.xlsx):
An Excel workbook with sheets for ANOVA_Table, Descriptive_Stats, Post-hoc pairwise comparisons, CLD letters, and Analysis_Info.
R packages used
stats (aov), agricolae (HSD.test for Tukey HSD and CLD), ggplot2 (bar plot), scales (colour palettes), readxl (data import), writexl (Excel export), officer + rvg (PowerPoint export), DT (interactive tables).
How to cite in your manuscript
For One-way ANOVA:
Fisher, R. A. (1925). Statistical Methods for Research Workers. Oliver & Boyd.
For Tukey HSD:
Tukey, J. W. (1949). Comparing individual means in the analysis of variance. Biometrics, 5(2), 99–114.
For agricolae (CLD implementation):
de Mendiburu, F. (2023). agricolae: Statistical Procedures for Agricultural Research. R package.
For Bonferroni correction:
Dunn, O. J. (1961). Multiple comparisons among means. Journal of the American Statistical Association, 56(293), 52–64.
Conditional Inference Tree module
This module fits a Conditional Inference Tree (CTree) to identify the predictor variables and split thresholds that best partition the data with respect to a target response variable. CTree avoids variable selection bias by embedding permutation-based hypothesis testing directly into the tree-building procedure.
Accepted file format
Upload an .xlsx file where:
- One (or more) column(s) are categorical factors (treatment, medium, genotype...) → set as Factor or Predictor.
- The remaining columns are numeric response variables (measurements) → set as Target or Predictor.
- Columns you do not need can be set to Ignore.
Sidebar options
Step 1 — Upload data
Click Browse .xlsx to load an Excel file. Numeric columns are auto-classified as Predictor; text/factor columns also as Predictor (CTree handles both numeric and categorical predictors natively).
Step 2 — Classify variables
Each column is assigned one of three roles:
Target (response):
The variable to model (numeric or categorical).
Exactly one target must be selected.
Predictor:
Input variables used to split the tree. Multiple predictors
can (and should) be selected. Numeric predictors with ≤ 10 unique values are
automatically converted to factors.
Ignore:
Excluded from the analysis.
Step 3 — Analysis setup
Target variable:
The response variable to model.
Predictors (select multiple):
Multi-select list of input variables.
All predictors are selected by default; deselect any you wish to exclude.
Significance level (α):
The threshold for splitting (default: 0.05).
At each node, the algorithm tests independence between each predictor and the response
using permutation tests. A split is only performed if the smallest adjusted p-value
is below α. Lower values produce simpler trees.
Max tree depth:
Maximum number of levels from root to terminal node
(default: 4). Depth 3–4 is recommended for biological interpretability.
Depth ≥ 6 may produce overly specific splits.
Min observations per node:
Minimum number of observations required in
a terminal node (default: 10, minimum: 3). Prevents the tree from making splits
based on very few data points. Increase for noisy datasets.
Node annotation (continuous target):
Controls the statistics displayed inside terminal-node boxplots:
• None — Standard boxplot only.
• Median — Annotates each node with Md = value.
• Mean ± SD — Annotates with x̅ = value ± SD.
• Median + Mean — Shows both Md and x̅ side by side.
What is CTree?
The Conditional Inference Tree (CTree) is a non-parametric decision tree algorithm based on the conditional inference framework proposed by Hothorn et al. (2006). Unlike classical recursive partitioning methods such as CART or CHAID, CTree avoids variable selection bias by embedding statistical hypothesis testing directly into the tree-building procedure. At each node, the algorithm tests the null hypothesis of independence between each predictor and the response variable using permutation-based significance tests, and only proceeds with a split if the association is statistically significant after correction for multiple comparisons. The splitting variable is selected as the one with the smallest corrected p-value, and the optimal split point is determined by maximising the test statistic. This approach ensures that variable selection and split criteria are not influenced by the number of categories or the scale of measurement of the predictors.
In AgroBioSTAT, CTree is implemented via the ctree() function from the partykit R package.
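A minimal call with the module's default controls looks like this; the predictors, target, and effect sizes are invented for illustration:

```r
# Conditional inference tree with the module's default controls (hypothetical data)
library(partykit)

set.seed(6)
df <- data.frame(
  dose   = runif(120, 0, 10),
  medium = factor(sample(c("MS", "B5"), 120, replace = TRUE))
)
df$shoots <- ifelse(df$dose > 5, 8, 4) + (df$medium == "MS") + rnorm(120)

fit <- ctree(shoots ~ dose + medium, data = df,
             control = ctree_control(alpha = 0.05,    # split significance level
                                     maxdepth = 4,    # max tree depth
                                     minbucket = 10)) # min obs per terminal node
plot(fit)   # inner nodes: variable + p-value; terminal nodes: boxplots
width(fit)  # number of terminal nodes
```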
Output tabs
Tree:
The full conditional inference tree plot. Inner nodes show the
splitting variable, the test p-value, and the split threshold. Terminal nodes show
boxplots (continuous target) or bar plots (categorical target), optionally annotated
with summary statistics. Full-screen mode is available.
Node Statistics:
A table with one row per terminal node, showing
N, Mean, SD, Min, Max (continuous targets) or class distribution (categorical targets).
Model Info:
A summary table with the target variable, predictors used,
number of observations, terminal/internal node counts, and parameter settings.
Downloads
Download Results (.xlsx):
An Excel workbook with sheets for Model_Info, Node_Statistics, and Variable_Importance (split frequency count).
Download Tree (.pptx):
An editable PowerPoint slide with the CTree plot as a vector graphic (rvg::dml).
R packages used
partykit (ctree, ctree_control, plotting), ggplot2 (auxiliary plots), readxl (data import), writexl (Excel export), officer + rvg (PowerPoint export), DT (interactive tables).
How to cite CTree in your manuscript
For the algorithm:
Hothorn, T., Hornik, K., & Zeileis, A. (2006). Unbiased recursive partitioning: A conditional inference framework. Journal of Computational and Graphical Statistics, 15(3), 651–674. https://doi.org/10.1198/106186006X133933
For the R implementation:
Hothorn, T., & Zeileis, A. (2015). partykit: A modular toolkit for recursive partytioning in R. Journal of Machine Learning Research, 16, 3905–3909.
Decision Rules (ANN) module
This module trains an Artificial Neural Network (ANN) and then extracts human-readable IF–THEN rules via a surrogate decision tree. It combines the predictive power of ANNs with the interpretability of rule-based models. Additionally, it computes variable importance using connection-weight algorithms and provides nested cross-validation for robust model evaluation.
Accepted file format
Upload an .xlsx file where:
- One or more columns are Predictors (inputs, numeric or categorical).
- One column is the Target (response variable, numeric).
- Other columns can be set to Ignore.
Note: Categorical predictors are automatically one-hot encoded before training. The Target must always be numeric.
Sidebar options
Step 1 — Upload data
Click Browse .xlsx to load an Excel file. Numeric columns are auto-classified as Predictor (numeric); text/factor columns as Predictor (categorical).
Step 2 — Classify variables
Each column is assigned one of the available roles:
Target (response):
The numeric variable to predict. Exactly one must be selected.
Predictor (numeric):
A continuous input variable.
Predictor (categorical):
A categorical input variable. Automatically one-hot encoded via model.matrix() (full-rank encoding) before scaling and training.
Ignore:
Excluded from the analysis.
Note: text/factor columns can only be assigned as Predictor (categorical)
or Ignore; numeric columns can be Target, Predictor (numeric), or Ignore.
Step 3 — Analysis setup
Target variable:
The numeric response to model.
Predictors (select multiple):
Multi-select list of input variables
(both numeric and categorical). All predictors are selected by default.
Hidden layers
Defines the topology of the ANN. Enter comma-separated integers, e.g. 8,5,3 for three hidden layers with 8, 5, and 3 neurons. A single number (e.g. 5) creates one hidden layer. The default is 8,5,3. Simpler datasets may work with 5,3 or even 3; more complex datasets may benefit from wider layers. Avoid very deep architectures with small datasets (n < 100).
Activation function
The non-linear function applied at each hidden neuron.
• Logistic (sigmoid) — outputs in [0, 1]. The classical default for most applications.
• Tanh — outputs in [−1, 1]. Often converges faster than logistic for standardised data.
Learning algorithm
The optimisation algorithm used for training.
• Rprop (Resilient Propagation) — the recommended default. Adapts the learning rate per weight, solving convergence issues common with standard backpropagation.
• SCG (Scaled Conjugate Gradient) — a second-order method that can be faster for medium-sized datasets.
• Backprop + Momentum — classic backpropagation with a momentum term. May require tuning the learning rate.
• Standard Backprop — basic gradient descent without momentum. Generally slower and less robust than Rprop or SCG.
Max iterations (maxit)
Maximum number of training iterations. The algorithm may converge before reaching this limit. The default (1000) is appropriate for most plant tissue culture datasets with Rprop. If the model underfits, increase to 2000–5000. Very high values increase computation time.
Training repetitions
Number of independent ANN training runs, each starting from different random initial weights. AgroBioSTAT automatically selects the replicate with the lowest error. More repetitions increase the probability of finding a good solution, at the cost of longer computation time. The default (3) is adequate for most cases; increase to 5–10 for complex datasets.
Surrogate tree max. depth
Controls the maximum depth of the surrogate decision tree. A deeper tree generates more rules with more conditions (conjunctions); a shallower tree produces fewer, simpler rules. Depth 2–4 is recommended for biological interpretability; depth ≥ 5 tends to generate overly specific rules. Default: 2.
Surrogate tree complexity (cp)
The complexity parameter for rpart. A split is only performed if it improves the overall fit by at least cp (relative to the root-node error). Lower values allow more splits (a more complex tree with more rules); higher values produce simpler trees. Default: 0.01. Range: 0.001–0.1.
Validation (nested)
AgroBioSTAT uses a nested validation strategy: a Train/Test split (external validation) combined with internal cross-validation within the training set.
Training set (%):
Slider to set the proportion of data used for training (default: 70%). The remaining observations form the held-out test set. A donut chart shows the split visually. Range: 50–90%.
Internal CV method:
• k-fold CV — Splits the training set into k folds; trains on k−1 folds and validates on the held-out fold, rotating through all k. The default k = 5 is suitable for most datasets; increase to 10 for smaller datasets.
• LOOCV (Leave-One-Out CV) — k = n(train). Each observation is used as validation once. Computationally expensive for large datasets; consider k-fold if n > 50.
Number of folds (k):
Only visible when k-fold CV is selected. Default: 5. Range: 3–20.
Random seed:
Ensures reproducibility of the Train/Test split, fold assignment, and ANN weight initialisation. Default: 42.
Decision Rules from Artificial Neural Networks
Artificial Neural Networks (ANNs) are powerful non-linear models that can capture complex relationships between inputs and outputs. However, they are typically considered black-box models: it is difficult to understand why they make a particular prediction.
This module uses a Surrogate Modelling approach to extract interpretable IF–THEN rules from a trained ANN:
- An ANN is trained on the user data (RSNNS::mlp). Inputs are automatically standardised (z-score) and categorical predictors are one-hot encoded.
- The ANN generates predicted values for the entire dataset.
- A regression tree (rpart) is trained using the original inputs to predict the ANN predictions — not the raw experimental observations.
- The paths from root to terminal node in this surrogate tree define interpretable IF–THEN rules that approximate the behaviour of the neural network.
- Variable importance is computed from connection weights using the Olden (signed) and Garson (relative) algorithms from NeuralNetTools.
All thresholds in the rules are reported in original experimental units (the inverse z-score transformation is applied automatically). The surrogate tree will never perfectly replicate the ANN, but it provides a human-readable approximation of the most important decision boundaries learned by the network.
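The ANN-to-rules pipeline can be sketched end-to-end as below; the dataset, column names, and sizes are invented for illustration and this is not the app's exact code:

```r
# ANN -> surrogate-tree sketch (hypothetical data; not the app's exact code)
library(RSNNS)
library(rpart)

set.seed(7)
X <- data.frame(auxin   = runif(100, 0, 5),
                sucrose = runif(100, 10, 60))
y <- 2 * X$auxin + 0.1 * X$sucrose + rnorm(100)

Xs <- scale(X)   # z-score the inputs, as the module does
ys <- scale(y)

net  <- mlp(Xs, ys, size = c(8, 5, 3), maxit = 1000, learnFunc = "Rprop")
yhat <- as.numeric(predict(net, Xs))   # ANN predictions (scaled units)

# Surrogate: a regression tree fitted to the ANN's predictions, not to y
sur <- rpart(pred ~ auxin + sucrose,
             data    = cbind(X, pred = yhat),
             control = rpart.control(maxdepth = 2, cp = 0.01))
print(sur)  # each root-to-leaf path is one IF-THEN rule
```

The app additionally maps the scaled thresholds back to original units and feeds the trained network to NeuralNetTools for Olden/Garson importance.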
Output tabs
ANN Summary:
Displays the model topology (input → hidden → output),
the number of connection weights, the final training error (SSE), and overall fit
metrics (R², RMSE) on the full training set.
Surrogate Tree:
The regression tree trained to approximate the ANN
predictions. Shows splitting variables, thresholds (in original units), and terminal
node predicted values. Full-screen mode is available.
Rules:
Extracted IF–THEN rules from the surrogate tree. Each rule
shows the conditions (variable, operator, threshold), the predicted value (mean of the
ANN predictions in that node), and the number of observations covered. Rules can be
filtered and sorted. A legend at the bottom shows the level meanings for categorical
predictors (mapping one-hot encoded columns back to original factor levels).
Fit:
Scatter plot of ANN-predicted vs. observed values with R² and RMSE
annotations. The 1:1 line is shown for reference.
Variable Importance:
Bar plot of connection-weight-based variable importance.
Two algorithms are available: Olden (signed, the default, suitable for multi-layer networks)
and Garson (relative, 0–100%, only for single hidden layer). The table below shows
exact numeric values. Use the dropdown above the plot to switch between Olden and Garson.
Validation:
Summary of the nested validation: R² and RMSE for the
external test set (Train/Test split) and for the internal CV (mean across folds).
Two scatter plots show predictions vs. observed for both validation layers.
A per-fold metrics table shows R² and RMSE for each individual fold.
Model Info:
Complete parameter summary: target, predictors, topology,
activation, learning algorithm, iterations, repetitions, surrogate tree depth/cp,
number of rules extracted, surrogate fidelity R², validation method, and seed.
Downloads
Download Results (.xlsx): An Excel workbook with sheets for: Rules, Raw_Rules, Predicted_vs_Obs, Model_Info, Variable_Importance, Validation (nested metrics), Ext_Test_Preds, Int_CV_Preds, and Fold_Metrics.
R packages used
RSNNS (mlp — ANN training), NeuralNetTools (olden, garson — variable importance), rpart (surrogate regression tree), ggplot2 (fit plots, variable importance bar plots), gridExtra (multi-panel layouts), readxl (data import), writexl (Excel export), DT (interactive tables).
How to cite in your manuscript
For RSNNS (ANN engine):
Bergmeir, C., & Benítez, J. M. (2012). Neural Networks in R Using the Stuttgart Neural Network Simulator: RSNNS. Journal of Statistical Software, 46(7), 1–26.
For NeuralNetTools (variable importance):
Beck, M. W. (2018). NeuralNetTools: Visualization and Analysis Tools for Neural Networks. Journal of Statistical Software, 85(11), 1–20.
For Olden variable importance:
Olden, J. D., Joy, M. K., & Death, R. G. (2004). An accurate comparison of methods for quantifying variable importance in artificial neural networks using simulated data. Ecological Modelling, 178(3–4), 389–397.
For rpart (surrogate tree):
Therneau, T. M., & Atkinson, E. J. (2019). An introduction to recursive partitioning using the RPART routines. Mayo Foundation.
For the surrogate modelling approach:
Craven, M. W., & Shavlik, J. W. (1996). Extracting tree-structured representations of trained networks. Advances in Neural Information Processing Systems, 8, 24–30.
AgroBioSTAT
AgroBioSTAT is a data analysis tool developed by the AgroBioTech for Health group at the University of Vigo.
What does it do?
It allows researchers to perform comprehensive and rigorous data analysis without any programming knowledge. Specifically, it:
- Assesses Data Quality: Checks whether the data meet the necessary requirements (normality and homoscedasticity) before applying any statistical test.
- Compares Groups: Performs ANOVA and post-hoc tests to detect significant differences, visually indicating which groups differ from one another using compact letter display (CLD).
- Identifies Key Drivers: Uses multivariate tools like Conditional Inference Trees to find which variables best explain a given outcome, even when relationships are complex.
- Simplifies Artificial Intelligence: Trains AI models capable of making predictions and translates them into simple IF–THEN rules so that results remain transparent and understandable.
How does it work?
AgroBioSTAT integrates several advanced statistical packages from the R ecosystem — such as RSNNS, NeuralNetTools, agricolae, partykit, and ggplot2 — into a single intuitive interface. The user simply loads their data and obtains ready-to-interpret results, without writing code or managing any of the technical processes (such as one-hot encoding or z-score transformations) running in the background.
AgroBioSTAT — AgroBioTech for Health Group, University of Vigo · 2.0.0 — 14 March 2026
AgroBioSTAT — A Comprehensive Interface for Life Sciences