Deep learning for detection of keratoconus and prediction of CXL efficacy

A study sought to implement and test a trained deep learning algorithm that uses pre-operative corneal topography scans to distinguish keratoconic from normal corneas, stage the disease, and predict whether a patient is likely to benefit from corneal collagen crosslinking (CXL) treatment.

Reviewed by Henry Liu, MD

In traditional machine learning, a programmer tells the machine the features of whatever it is supposed to identify. In deep learning, the features are picked out by the neural network, without human intervention.

The program is instead given a large sample of appropriately labeled images and learns to recognize patterns by itself, without pre-existing rules set by the programmer.

In recent years, convolutional neural networks (CNNs) have been developed that provide higher accuracy in image detection tasks by using multiple layers to progressively extract higher-level features.
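
As a rough illustration of what “multiple layers” means, the Python sketch below applies two convolution steps to a toy image. It is not the study’s code: the image, the hand-picked kernels, and the helper names `conv2d` and `relu` are all assumptions made for illustration, and in a real CNN the kernel values are learned from the labeled images rather than chosen by hand.

```python
def conv2d(image, kernel):
    """Slide a small kernel over a 2D image (no padding) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map):
    """Keep positive responses and set negative ones to zero."""
    return [[max(0, v) for v in row] for row in feature_map]


# Toy 5x5 "image": dark on the left, bright on the right (hypothetical data).
image = [[0, 0, 1, 1, 1] for _ in range(5)]

# Layer 1: a hand-picked kernel that responds to dark-to-bright vertical edges.
edge_kernel = [[-1, 1], [-1, 1]]
layer1 = relu(conv2d(image, edge_kernel))

# Layer 2: a hand-picked kernel that pools neighbouring layer 1 responses.
pool_kernel = [[1, 1], [1, 1]]
layer2 = relu(conv2d(layer1, pool_kernel))

print(layer1)
print(layer2)
```

The first layer responds only where dark pixels meet bright ones; the second combines those responses over a larger patch, which is the sense in which deeper layers capture higher-level features.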

These CNNs can be used to automatically grade retinal images of patients with diabetic retinopathy and have also been used in glaucoma, but there is a “paucity of research” in the field of cornea.

Keratoconus is a progressive disease that leads to thinning of the cornea. If it is caught early, disease progression can be halted with corneal collagen crosslinking.

Despite treatment, approximately 10% of patients continue to progress, eventually requiring corneal transplantation.

The goal of this study was to implement and test a trained deep learning algorithm that uses pre-operative corneal topography scans to 1) distinguish keratoconic from normal corneas, 2) stage the disease according to the Amsler-Krumeich scale, and 3) predict whether a patient is likely to benefit from crosslinking treatment.

Oculus Pentacam images were processed and standardized for comparison. The algorithm used is an 11-layer convolutional neural network programmed using TensorFlow software with the assistance of a data scientist (Figure 1).

The model was tested with a method called cross-validation, in which the data are split into two parts: a training set and a validation set.

The training set comprises 80% of the images and is used to train the model. The validation set comprises the remaining 20%, which the network has not seen before, and is used to evaluate the model’s performance.
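
A minimal Python sketch of such an 80/20 split follows. The function name, the scan identifiers, and the fixed random seed are assumptions for illustration; the only figure taken from the study is the stage 1 total of 2450 scans, and the study’s actual splitting procedure is not described in this report.

```python
import random


def split_train_validation(items, validation_fraction=0.2, seed=0):
    """Shuffle a copy of the items, then hold out a fraction for validation."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_validation = round(len(shuffled) * validation_fraction)
    return shuffled[n_validation:], shuffled[:n_validation]


# Hypothetical identifiers standing in for the 2450 stage 1 scans.
scan_ids = [f"scan_{i:04d}" for i in range(2450)]
train_ids, validation_ids = split_train_validation(scan_ids)

print(len(train_ids), len(validation_ids))  # 1960 490
```

Shuffling before splitting keeps the held-out images from being, for example, all from the most recent patients; no scan appears in both sets.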

Final accuracies were reported after 50 epochs (an epoch being one cycle through the full training dataset) for all three stages of the study.

In total, 2450 scans were included (1215 keratoconus and 1235 controls) for stage 1, 985 keratoconus scans were classified according to the Amsler-Krumeich scale for stage 2, and 138 keratoconus scans (69 progressed and 69 stabilized) were labeled for stage 3.

In stage 1, differentiating eyes with keratoconus from normal eyes, a very high validation accuracy rate (99.3%) was achieved. For stage 2, grading keratoconus, the validation accuracy was 73.5%.

One possible reason for the lower accuracy is that the grading system used does not incorporate manifest refraction data. By implementing a marker system, the authors were able to improve the accuracy by about 14 percentage points, to 87.8%.

Because more images were used in stage 1 than in stage 2, the authors repeated the stage 1 testing with fewer images and achieved a validation accuracy of 83.6%, which is closer to that found in stage 2.

For stage 3, which assessed the accuracy of the CNN in predicting whether a patient with keratoconus will stabilize or continue to progress following crosslinking, the images were split into stabilized and progressed groups.

The program was unable to learn any useful parameters that would improve its accuracy in categorizing the scans, and achieved a validation accuracy of only 53.6%, close to the roughly 50% expected by chance with two equally sized groups.
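
To put a figure like 53.6% in context, the Python sketch below asks how often pure coin-flip guessing would score at least as well. The validation-set size is not given in this report, so the numbers used are assumptions: 28 scans (about 20% of the 138 stage 3 scans) with 15 classified correctly, which rounds to 53.6%. The helper name `binomial_tail` is also hypothetical.

```python
from math import comb


def binomial_tail(successes, trials, p=0.5):
    """Probability of at least `successes` correct answers in `trials` guesses."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


n_validation = 28  # assumption: about 20% of the 138 stage 3 scans
n_correct = 15     # assumption: 15 of 28 rounds to the reported 53.6%

accuracy = n_correct / n_validation
p_value = binomial_tail(n_correct, n_validation)

print(f"accuracy = {accuracy:.1%}")
print(f"probability of guessing at least this well = {p_value:.2f}")
```

Under these assumptions the probability comes out at roughly 0.4, meaning a result this good would be unremarkable for a model that had learned nothing.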

The sample size may have limited this study. Another limitation is that with deep learning algorithms it is not possible to visualize which parameters the algorithm is using for classification.

However, researchers have now developed class activation maps, which highlight the image regions that most influence a classification, to better elucidate the inner workings of the deep learning black box.
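
In its original formulation, a class activation map is a weighted sum of the network’s final convolutional feature maps, using the weights that connect each map to the class of interest. The Python sketch below shows that calculation on made-up numbers; the function name, feature maps, and weights are hypothetical and do not come from the study.

```python
def class_activation_map(feature_maps, class_weights):
    """Weighted sum of feature maps, computed at each spatial location."""
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    return [
        [
            sum(w * fmap[r][c] for w, fmap in zip(class_weights, feature_maps))
            for c in range(width)
        ]
        for r in range(height)
    ]


# Two hypothetical 2x2 feature maps from a network's last convolutional layer.
feature_maps = [
    [[0.0, 1.0], [0.0, 2.0]],
    [[3.0, 0.0], [1.0, 0.0]],
]
# Hypothetical weights linking each feature map to one class.
class_weights = [2.0, -1.0]

cam = class_activation_map(feature_maps, class_weights)
print(cam)  # [[-3.0, 2.0], [-1.0, 4.0]]
```

The largest values mark the locations that pushed the network most strongly toward that class; in practice the map is upsampled and overlaid on the input image as a heat map.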

Last, the authors noted that the results are limited to Pentacam images and, without more images, cannot be generalized to other topographers such as the OrbScan.


The authors built and tested a deep learning model that diagnosed keratoconus with an accuracy of 99.3% and staged the disease with an accuracy of 87.8%.

They said that even though the model was not able to predict disease progression, the applications of convolutional neural networks are encouraging and could be used to solve practical clinical problems and serve as an adjunct tool in clinical decision making in the near future.


Henry Liu, MD
This article is based on Dr. Liu’s presentation at the ARVO 2021 virtual annual meeting. Dr. Liu has no financial disclosures.