
Abstract

Aflatoxin is a carcinogenic toxin produced by mold that affects thousands of corn-producing farms in the United States every year, so being able to predict when a county will have harmful aflatoxin levels could benefit insurance companies. Our objective in this research was to create a machine learning model capable of predicting whether a county will have aflatoxin insurance claims in a particular year based on its daily average temperatures for that year. To build such a model, we used day-by-day temperature data for all U.S. counties from 2010 through 2020, along with the aflatoxin insurance claims those counties reported during those years. We trained an extra-trees binary classification model using scikit-learn, with an oversampled version of the temperature data from 2010 through 2019 as the model’s inputs and whether or not the county had aflatoxin insurance claims as its output. After training, the model’s average accuracy, precision, and recall on the testing dataset were 98.15%, 97.26%, and 99.14%, respectively. We then performed Welch’s t-test on the model’s predictions for counties that grew corn and reported aflatoxin losses in 2020 versus counties that grew corn and did not, and found a statistically significant difference between the two groups of predictions (p ≈ 1.18 × 10^-6, significant beyond the 99.99% confidence level). As this experiment demonstrates, training an extra-trees binary classifier on temperature data alone can be an effective way of predicting aflatoxin insurance claims in the United States. Presented at the 2022 UNC Charlotte Undergraduate Research Conference.
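
The workflow described in the abstract (oversample, train an extra-trees classifier, score it, then compare two groups of predictions with Welch's t-test) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' code: the temperature matrix and claim labels are synthetic stand-ins, the `random_oversample` helper is hypothetical (the abstract does not say which oversampling method was used), oversampling is applied to the training split only, and the t-test is run on predicted probabilities, which may differ from the form of predictions the authors compared.

```python
import numpy as np
from scipy.stats import ttest_ind
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in data (assumption): one row per county-year, one column
# per day of daily average temperature; label 1 means "had aflatoxin claims".
n_rows, n_days = 600, 365
y = (rng.random(n_rows) < 0.1).astype(int)
seasonal = 15 + 10 * np.sin(np.linspace(0, 2 * np.pi, n_days))
X = seasonal + rng.normal(0, 3, size=(n_rows, n_days)) + 4.0 * y[:, None]

# Hold out a test split; it stands in for the unseen 2020 data here.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)


def random_oversample(X, y, rng):
    """Hypothetical helper: resample every smaller class with replacement
    until all classes have as many rows as the largest class."""
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    parts = []
    for cls, count in zip(classes, counts):
        idx = np.flatnonzero(y == cls)
        extra = rng.choice(idx, size=target - count, replace=True)
        parts.append(np.concatenate([idx, extra]))
    keep = np.concatenate(parts)
    return X[keep], y[keep]


X_bal, y_bal = random_oversample(X_train, y_train, rng)

# Extra-trees binary classifier; n_estimators is an arbitrary choice.
model = ExtraTreesClassifier(n_estimators=200, random_state=0)
model.fit(X_bal, y_bal)

y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred, zero_division=0)
recall = recall_score(y_test, y_pred, zero_division=0)

# Welch's t-test (unequal variances) on the predicted claim probabilities
# of the two groups: rows with claims vs. rows without claims.
proba = model.predict_proba(X_test)[:, 1]
t_stat, p_value = ttest_ind(
    proba[y_test == 1], proba[y_test == 0], equal_var=False
)

print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f}")
print(f"Welch t={t_stat:.2f}, p={p_value:.3g}")
```

Passing `equal_var=False` is what makes `ttest_ind` a Welch's test rather than Student's t-test, which matters when the two groups differ in size and spread, as counties with and without claims typically would.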
