Publication

Who is at risk of developing breast cancer-related fatigue - a prediction study.

Beenhakker, L., Wijlens, K.A.E., Witteveen, A., Heins, M., Bode, C., Siesling, S., Vollenbroek-Hutten, M.M.R. Who is at risk of developing breast cancer-related fatigue - a prediction study. European Journal of Cancer: 2022, 175(Suppl. 1), p. S52-S53. Meeting Abstract of the 13th European Breast Cancer Conference (EBCC), Barcelona, Spain, 16-18 November 2022
ABSTRACT:

Background
Cancer-related fatigue (CRF) is still experienced by 20% of breast cancer patients ten years after diagnosis. Although interventions against CRF exist, they should be started in time to prevent CRF from becoming chronic. It is therefore important to identify patients at risk of developing CRF so that they can be monitored actively. The goal of this study is to explore whether the risk that breast cancer patients have of developing CRF can be determined.

Methods
To assess the risk of CRF, the Dutch Primary Secondary Cancer Care Registry (PSCCR) was used. This registry consists of a part with patient-reported outcomes (PSCCR-PROFIEL) and a part that links data from General Practitioners (GPs) to the Netherlands Cancer Registry (PSCCR). Both contain information on breast cancer patient, tumor and treatment characteristics and on late effects. In PSCCR-PROFIEL, 23 input variables were available for 390 patients, and the patient-reported outcomes included the late effect fatigue (yes/no, n = 254). In PSCCR, 12,813 patients were included and GP visits for fatigue were extracted (n = 2,224). Fifty-three input variables were used, including information on complaints before diagnosis. Missing data were imputed using Multiple Imputation by Chained Equations. Risk was predicted using machine learning, comparing several models: Random Forest Classifier, Logistic Regression, Gaussian Naïve Bayes, K-Nearest Neighbors and Multi-Layer Perceptron. For additional comparison, a statistical logistic regression model was developed. Nested 5-fold cross-validation was used to optimize hyperparameters. The area under the receiver operating characteristic curve (AUC) was calculated to compare model performance.
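
The nested cross-validation with AUC scoring described above can be illustrated with a minimal sketch. It is not the study's code: the abstract does not name the software used, the data here are synthetic, and a toy one-feature k-nearest-neighbors scorer stands in for the models that were compared. All function names (`auc`, `stratified_folds`, `knn_scores`, `nested_cv`) are hypothetical, and summarizing the outer folds as mean ± standard deviation is an assumption about how such results are commonly reported.

```python
import random
from statistics import mean, stdev


def auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, y_true) if y == 1]
    neg = [s for s, y in zip(scores, y_true) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_folds(indices, y, k, rng):
    """Split indices into k folds that each hold a similar share of both classes."""
    folds = [[] for _ in range(k)]
    for label in (0, 1):
        members = [i for i in indices if y[i] == label]
        rng.shuffle(members)
        for j, i in enumerate(members):
            folds[j % k].append(i)
    return folds


def knn_scores(x, y, train, test, k):
    """Score each test index by the share of positives among its k nearest training points."""
    scores = []
    for t in test:
        nearest = sorted(train, key=lambda i: abs(x[i] - x[t]))[:k]
        scores.append(sum(y[i] for i in nearest) / len(nearest))
    return scores


def mean_auc_over_folds(x, y, folds, k):
    """Average AUC for one hyperparameter value across the given folds."""
    aucs = []
    for fold in folds:
        train = [i for f in folds if f is not fold for i in f]
        aucs.append(auc([y[i] for i in fold], knn_scores(x, y, train, fold, k)))
    return mean(aucs)


def nested_cv(x, y, k_grid, n_folds=5, seed=0):
    """Return one AUC per outer fold; hyperparameters are tuned on inner folds only."""
    rng = random.Random(seed)
    outer = stratified_folds(range(len(y)), y, n_folds, rng)
    outer_aucs = []
    for test in outer:
        train = [i for f in outer if f is not test for i in f]
        # Inner loop: pick the hyperparameter without touching the outer test fold.
        inner = stratified_folds(train, y, n_folds, rng)
        best_k = max(k_grid, key=lambda k: mean_auc_over_folds(x, y, inner, k))
        # Outer loop: evaluate the chosen setting on the held-out test fold.
        scores = knn_scores(x, y, train, test, best_k)
        outer_aucs.append(auc([y[i] for i in test], scores))
    return outer_aucs


# Synthetic stand-in data (assumption): one weakly informative feature, balanced classes.
data_rng = random.Random(42)
y = [i % 2 for i in range(100)]
x = [data_rng.gauss(label, 1.0) for label in y]

fold_aucs = nested_cv(x, y, k_grid=[3, 7, 15])
print(f"AUC {mean(fold_aucs):.3f} ± {stdev(fold_aucs):.3f}")
```

The point of the nesting is that the outer test fold is never used to choose a hyperparameter, so the reported AUC is not inflated by the tuning step.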

Results
For PSCCR-PROFIEL, the Logistic Regression machine learning model performed best, with an AUC of 0.669 ± 0.040. The statistical logistic regression model did not do better, with an AUC of 0.629 ± 0.040. For PSCCR, the best AUC was 0.561 ± 0.006, also for the Logistic Regression model; the statistical logistic regression model performed about the same, with an AUC of 0.551 ± 0.008. The predicted probabilities were plotted and visually compared with the true values. This showed no difference between the fatigued and non-fatigued patients.
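
As a reading aid for the AUCs reported here, and stated as a general property of the AUC rather than something taken from the abstract: the AUC equals the probability that a randomly chosen fatigued patient receives a higher predicted risk than a randomly chosen non-fatigued patient, with ties counted as one half. The symbols $s$ (predicted risk), $X^{+}$ (a fatigued patient) and $X^{-}$ (a non-fatigued patient) are notation introduced for this illustration.

```latex
\mathrm{AUC} = P\bigl(s(X^{+}) > s(X^{-})\bigr) + \tfrac{1}{2}\, P\bigl(s(X^{+}) = s(X^{-})\bigr)
```

Under this reading, an AUC of 0.561 means the model ranks the fatigued patient above the non-fatigued one in roughly 56% of such pairs, only slightly better than the 50% expected from chance.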

Conclusion
When calculating the risk patients have for CRF, we found relatively low AUCs, meaning that the models have low discriminative ability. It could be that the variables present in the datasets are not predictive of fatigue and that more information is needed (e.g. lifestyle factors). Another reason could be that the binary way in which fatigue is reported in both datasets is not detailed enough to predict CRF, because CRF is a multidimensional and complex long-term effect. Future studies should include lifestyle factors and measure CRF multidimensionally in order to better predict an individual's risk of developing CRF.