Nazlee Zebardast, MD, MSc, sat down with David Hutton, Managing Editor, Ophthalmology Times®, at this year's ARVO meeting to discuss her research on using machine learning methods to identify image-based, specifically OCT-based, structural phenotypes for glaucoma.
Editor’s note: This transcript has been edited for clarity.
I'm David Hutton of Ophthalmology Times. The Association for Research in Vision and Ophthalmology recently held its annual meeting in New Orleans. At the meeting, Dr. Nazlee Zebardast presented "Genome Wide Discovery via Feature Space Mapping of Deep Learning Derived Clinical OCT Phenotypes to the UK Biobank." Thank you so much for joining us today. Tell us a little bit about your ARVO presentation.
Thank you so much for having me, David, and for the opportunity to talk about my work. We all know that glaucoma is the leading cause of irreversible vision loss worldwide, and that primary open-angle glaucoma is a complex and heterogeneous disease. It has long been recognized, including by Dr. Steven Grant, that there are structural and functional variations in glaucoma.
We also know that glaucoma is highly heritable, with over 120 genetic risk loci identified to date. But the genetic variants that we know of explain only a limited portion of the disease's heritability. So clearly, there remains a large gap. The question is, how do we fill this gap? There are things called endophenotypes, which are quantitative traits that provide a continuum of risk and increase our ability to detect genetic variants. Examples include intraocular pressure and cup-to-disc ratio.
But recently, machine learning approaches have come forth that allow us to make sense of unstructured and high-dimensional data. So what we did in this study was use unsupervised machine learning methods to identify image-based, specifically OCT-based, structural phenotypes for glaucoma.
The problem is that not all datasets that we have available have all the information we need. That is, genetic data and information about phenotypes, such as glaucoma, the disease state. So what we did in this study was use the large clinical dataset that we have available at Mass Eye and Ear: over 18,000 high-quality OCTs from over 8,000 patients who had at least one diagnosis code for glaucoma. We used them in two machine learning approaches. One is an unsupervised autoencoder approach, and the other is a contrastive learning approach.
We used these to discover OCT-based phenotypes, specifically for two layers: the ganglion cell complex and the retinal nerve fiber layer. Once we had these machine learning models trained, we transferred our trained models onto the UK Biobank. The UK Biobank has OCTs, but it's a population-based study, so for the most part it has data from people who don't have disease. There's a small number of people who do have glaucoma. We used 80,000 images from 40,000 people in the UK Biobank who had genetic information available, not all of whom had glaucoma. We found those same phenotypes in the UK Biobank.
We then used those same phenotypes that we found in the UK Biobank to perform genome-wide discovery. We identified quite a number of variants and quite a number of loci. What is really interesting is that we found a very nice overlap with what is already known about glaucoma and its phenotypes. We found variants that have already been associated with primary open-angle glaucoma, cup-to-disc ratio, optic nerve size, even RNFL thickness. We also found genes that have been associated with these different traits.
But we also found novel genes and novel loci. We looked at expression-level data from the crowd, and we saw that they're highly expressed in the retinal ganglion cells. So a lot of new hypotheses are being generated. There is a lot of work to be done, but it is a very exciting approach that I think we have utilized here.
What is the next step for this research?
So the next step here is to really understand the functional correlates of the loci and genes we discovered, and what the potential causal genes are. That means doing further genetic analysis, like enrichment analysis, co-localization analysis, and pathway analysis. There are a lot of different things we can do to really home in on which of the novel variants and loci we have discovered with our approach are meaningful and truly disease related.
And then, whenever you use machine learning, there is this black box where people don't quite understand what is happening. What we really want to be able to quantify better with our patterns is how they are related clinically. For example, is one pattern more associated with visual field loss, with different types of glaucoma, with intraocular pressure, or with need for eye surgery? We have all that data already from Mount Sinai. We also have a lot of phenotype data from the UK Biobank, as well as systemic associations.
So deep phenotyping our patterns will be one of our major next steps.