Machine learning is one of the hottest topics in radiology and all of healthcare, but reading the latest ML research can be difficult, even for experienced medical professionals. A new analysis by a team at Northern Ireland’s Belfast City Hospital, published in the American Journal of Roentgenology, was written with that very problem in mind.
Here’s a quick primer:
- Cross Validation: How many ML algorithms generate their performance measures. When researchers begin work on their algorithm, they separate subjects into two groups: a training dataset and a testing dataset. The training dataset is used to create the algorithm, training it so that it can make predictions. The testing dataset is used as an initial test of the algorithm’s accuracy. Repeating this split with different groupings lets researchers compare the results, see what works best, adjust the algorithm’s overall predictive capability and “improve the generalizability of the results,” the authors wrote.
- ROC Curve: By “plotting the effect of different levels of sensitivity on specificity,” researchers can help readers understand the performance of their algorithm. “Algorithms that perform better will have a higher sensitivity and specificity and thus the area under the plotted line will be greater than those that perform worse. The metric termed the ‘area under the ROC curve’ or ‘AUROC’ is commonly quoted and offers a quick way to compare algorithms.”
- Confusion Matrix: A table that compares an algorithm’s predictions with the actual outcomes. It helps readers locate information about a specific metric and compare an algorithm with others. The metrics drawn from it include true-positive and false-positive rates, specificity, accuracy, positive predictive value, likelihood ratios and diagnostic odds ratio. A study may mention an algorithm’s accuracy, but what if other metrics are more important than accuracy in a specific instance? The confusion matrix helps the reader locate those other metrics.
- Mean squared error and mean absolute error: In ML, the relationship between variables (regression) is expressed through an equation that minimizes the distance between a fitted line and the data points. How well the regression fits, and how reliable its predictions are, is represented by the mean squared error (MSE) or the mean absolute error (MAE). “Smaller is better” for these error measures; the exception is the coefficient of determination (R2), where a higher value indicates a better fit.
- Image Segmentation Evaluation: When an algorithm is designed to detect the presence of something, it’s not just about detecting the finding; it’s also about its location and size. “The predicted area of interest generated by the algorithm is compared against an ideal or completely accurate evaluation image,” the authors wrote.
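For readers who want to see the cross-validation idea in concrete terms, here is a minimal Python sketch. It assumes the common "k-fold" variant, which the analysis does not specifically name; the function name `k_fold_splits`, the subject count and the number of folds are all hypothetical choices made for illustration.

```python
import random


def k_fold_splits(n_subjects, k=5, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation.

    Assumption: subjects are identified by the integers 0..n_subjects-1.
    """
    indices = list(range(n_subjects))
    random.Random(seed).shuffle(indices)  # shuffle so folds are not ordered
    # Deal the shuffled subjects into k roughly equal folds.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        # Every fold except the i-th one is used for training.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Hypothetical study with 10 subjects, split five different ways.
for train, test in k_fold_splits(n_subjects=10, k=5):
    print("train on", sorted(train), "test on", sorted(test))
```

Each subject lands in the testing group exactly once, so a performance figure averaged over the five runs depends less on any single lucky or unlucky split.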
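The metrics listed under the confusion matrix can all be computed from four counts: true positives, false positives, false negatives and true negatives. The sketch below uses the standard textbook definitions; the function name `confusion_metrics` and the example counts are made up for illustration and do not come from the analysis.

```python
def confusion_metrics(tp, fp, fn, tn):
    """Derive common diagnostic metrics from the four confusion-matrix counts.

    Assumption: all four counts are greater than zero, so no division by zero.
    """
    sensitivity = tp / (tp + fn)          # true-positive rate
    false_positive_rate = fp / (fp + tn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "false_positive_rate": false_positive_rate,
        "specificity": specificity,
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "positive_predictive_value": tp / (tp + fp),
        "positive_likelihood_ratio": sensitivity / false_positive_rate,
        "negative_likelihood_ratio": (fn / (tp + fn)) / specificity,
        "diagnostic_odds_ratio": (tp * tn) / (fp * fn),
    }


# Hypothetical counts: 100 patients with the finding, 100 without.
metrics = confusion_metrics(tp=80, fp=10, fn=20, tn=90)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these invented counts the accuracy is 0.85, yet one in five patients with the finding is still missed (sensitivity 0.80), which is the kind of detail a single accuracy figure can hide.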
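The ROC curve and its area can also be sketched in a few lines. This example assumes the usual construction (sweep a decision threshold, record sensitivity against the false-positive rate, then add up the area with the trapezoid rule); the names `roc_points` and `auroc` and the four example scores are hypothetical.

```python
def roc_points(labels, scores):
    """Return (false_positive_rate, sensitivity) points, one per threshold.

    Assumption: labels are 0 or 1, with at least one of each.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    # Lower the threshold step by step; each step calls more cases positive.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auroc(labels, scores):
    """Area under the ROC curve, computed with the trapezoid rule."""
    points = roc_points(labels, scores)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical data: 1 = finding present, score = algorithm's confidence.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(auroc(labels, scores))
```

An AUROC of 1.0 would mean every positive case scored higher than every negative case, while 0.5 is what random guessing would be expected to produce.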
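For the regression metrics, a short sketch may help show why MSE and MAE are "smaller is better" while R2 is not. The formulas are the standard definitions; the function name `regression_errors` and the numbers are invented for illustration.

```python
def regression_errors(actual, predicted):
    """Return (MSE, MAE, R2) for paired actual and predicted values.

    Assumption: the actual values are not all identical, so R2 is defined.
    """
    n = len(actual)
    residuals = [a - p for a, p in zip(actual, predicted)]
    squared_error_sum = sum(r * r for r in residuals)
    mse = squared_error_sum / n                 # penalizes large misses more
    mae = sum(abs(r) for r in residuals) / n    # average size of a miss
    mean_actual = sum(actual) / n
    total_variation = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1 - squared_error_sum / total_variation  # closer to 1 is better
    return mse, mae, r2


# Hypothetical measurements and the values an algorithm predicted for them.
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.5, 5.0, 7.5, 10.0]
mse, mae, r2 = regression_errors(actual, predicted)
print(f"MSE={mse:.3f}  MAE={mae:.3f}  R2={r2:.3f}")
```

A perfect prediction would give an MSE and MAE of 0 and an R2 of 1, so the first two shrink as the fit improves while R2 grows.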
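The comparison of a predicted area against a reference image is often summarized with an overlap score. The analysis quoted above does not say which score is meant, so the sketch below assumes two widely used ones, the Dice coefficient and intersection over union (IoU). The function name `overlap_scores` and the tiny masks are hypothetical.

```python
def overlap_scores(predicted_mask, reference_mask):
    """Return (Dice, IoU) for two binary masks given as lists of rows.

    Assumption: both masks have the same shape and at least one marked pixel.
    """
    def marked_pixels(mask):
        return {(r, c) for r, row in enumerate(mask)
                for c, value in enumerate(row) if value}

    predicted = marked_pixels(predicted_mask)
    reference = marked_pixels(reference_mask)
    intersection = len(predicted & reference)
    union = len(predicted | reference)
    dice = 2 * intersection / (len(predicted) + len(reference))
    iou = intersection / union
    return dice, iou


# Hypothetical 3x4 masks: 1 marks pixels inside the area of interest.
predicted_mask = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
reference_mask = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
]
dice, iou = overlap_scores(predicted_mask, reference_mask)
print(f"Dice={dice:.3f}  IoU={iou:.3f}")
```

Both scores run from 0 (no overlap) to 1 (the predicted area matches the reference exactly), so they capture location and size rather than detection alone.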