ENVIRONMENTAL MONITORING AND ASSESSMENT, vol.194, no.5, 2022 (SCI-Expanded)
Computer-based tools have become increasingly popular in the field of produce safety. Various algorithms have been applied to predict the populations and presence of indicator microorganisms and pathogens in agricultural water sources. The purpose of this study was to improve the success of a deep feed-forward neural network (DFNN) in predicting Salmonella in agricultural surface waters by using a correlation value determined from selected features. Datasets were collected from six agricultural ponds in Central Florida. The most successful physicochemical and environmental features were selected by gain ratio and used to predict the generic Escherichia coli population with machine learning algorithms (decision tree, random forest, support vector machine). The Salmonella prediction success of the DFNN was then evaluated on a dataset that combined the selected environmental and physicochemical features with the predicted E. coli populations, both with and without the correlation value. The performance of the correlation value was evaluated on all possible combinations (nCr) of the six pond datasets. Across all dataset combinations, DFNN accuracy ranged from 88.89% to 98.41% with the correlation value, compared with 83.68% to 96.99% without it. The findings emphasize the success of the determined correlation value for predicting Salmonella presence in agricultural surface waters.
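As an illustration of evaluating on all possible combinations (nCr) of six pond datasets, the enumeration can be sketched in Python as follows. This is a minimal sketch, not the authors' code: the pond labels and the helper name pond_combinations are hypothetical, and it assumes that subsets of every size from 1 to 6 are included, which the abstract does not state explicitly.

```python
from itertools import combinations

# Hypothetical pond labels; the source only states that six ponds were sampled.
PONDS = ["pond_1", "pond_2", "pond_3", "pond_4", "pond_5", "pond_6"]


def pond_combinations(ponds, min_size=1):
    """Return every subset of `ponds` containing at least `min_size` ponds.

    Assumption: all subset sizes r = min_size..n are evaluated (the nCr terms
    summed over r). The abstract does not say which sizes were actually used.
    """
    subsets = []
    for r in range(min_size, len(ponds) + 1):
        # combinations() yields each r-sized subset once, ignoring order.
        subsets.extend(combinations(ponds, r))
    return subsets


all_subsets = pond_combinations(PONDS)
print(len(all_subsets))  # 63, i.e. 2**6 - 1 non-empty subsets
```

Each tuple in all_subsets names the ponds whose records would be pooled into one dataset before training and scoring a model with and without the correlation value. Raising min_size restricts the evaluation to larger pooled datasets.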