
Artificial Intelligence: Quest to Outperform Human Expertise

As the worldwide incidence of glaucoma climbs, placing pressure on already constrained eye health resources, automated disease diagnosis and monitoring will become increasingly important to patient management. Fortunately, machine learning and deep learning classifiers have rapidly reached human-level benchmarks on specific tests, and may one day fully automate risk assessment for glaucoma.

Glaucoma comprises a group of eye diseases that cause optic nerve damage with loss of retinal nerve fibres, leading to a visual field deficit. It is a leading cause of irreversible blindness, with a prevalence of 3%, yet around 50% of affected people remain undiagnosed. The condition is chronic and progressive, and results in blindness if left untreated. By the time symptoms are apparent, a large amount of irreversible visual field loss has already occurred. Preventing blindness therefore requires early detection and treatment, with ongoing monitoring for disease progression.

It is the lack of transparency around the data processing… that has led to the biggest criticism of deep learning

Figure 1. Relationship between artificial intelligence, machine learning and deep learning.

The incidence of glaucoma increases with age, and our ageing population is increasing the number of people with glaucoma. Specsavers, the largest group of optometrists in Australia, recently reported that they are referring 16% more people per annum for ophthalmology review. Efficiency in diagnosing and managing the growing number of people with glaucoma is becoming a priority, as this increased workload is occurring in an environment of limited resources and specially trained healthcare providers. Recent advances in computer technology mean that truly functional, automated systems to aid the diagnosis and monitoring of glaucoma are fast becoming a present-day reality.


For as long as we have had machines, the concept of independently thinking machines has been ever present, and it was a subject for science fiction writers as far back as the 1800s. It was in 1950 that Alan Turing first asked the question, “Can machines think?”

Advances in technology and miniaturisation have given us computing power in hand-held devices that was unthinkable just a few decades ago. This increase in processing capability has brought numerous applications based on artificial intelligence (AI) technology to consumers, many of which we use on a daily basis. Digital assistants such as Alexa and Siri, and applications such as language translation and automated facial tagging of photos, are all based on AI.


The field of AI is filled with many terms (for example, machine learning, artificial neural networks, deep learning, and convolutional neural networks) which are often used interchangeably but have important differences in definition (Figure 1). AI can be defined as an automated system capable of undertaking tasks that are generally considered to require human intelligence. These include visual perception, decision making and feature recognition. Within this broad definition are various sub-types of technologies, such as machine learning, deep learning, and artificial neural networks.

At the basis of AI are algorithms that attempt to mimic human thought and reasoning. Early work in AI required explicit programming in which a machine would process data through a sequence of ‘if-then’ routines. The real revolution in AI came with the advent of machine learning.

Wen et al3 reported on a deep learning network that was able to use a single 24-2 Humphrey visual field to predict the visual field up to 5.5 years later

Machine Learning 

Machine learning is a type of AI in which a machine is able to learn without explicit instructions. In machine learning, algorithms are applied to input data in order to classify or grade the data. Examples of these algorithms are K-nearest neighbour classifiers, random forest algorithms, support vector machines, Naïve Bayes classifiers and Gaussian mixture models. These algorithms function in either a supervised or unsupervised manner, are based on theories of clustering, and are extensions of classical statistical modelling. With supervised machine learning, the algorithms are guided by human intervention towards correct classification. This human ‘input’ to define correct classification is done via initial training of the algorithm with labelled examples; for example, optic disc photos are labelled with a diagnosis such as ‘glaucoma’ and ‘no glaucoma’ to create an output definition before being applied to unknown data. In contrast, unsupervised learning requires no human involvement or training to classify data since the algorithm clusters similarities within the provided data to suggest a new classification. This can lead to new insights previously unknown to experts.
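The supervised workflow described above can be sketched with a toy nearest-neighbour classifier. The feature values, labels and scaling below are purely illustrative assumptions, not clinical data:

```python
import math

# Hypothetical labelled training examples: each optic disc is summarised by
# two illustrative features (cup-to-disc ratio, average RNFL thickness in um)
# together with the expert-assigned label used for supervised training.
training_data = [
    ((0.8, 62.0), "glaucoma"),
    ((0.7, 70.0), "glaucoma"),
    ((0.3, 98.0), "no glaucoma"),
    ((0.4, 92.0), "no glaucoma"),
]

def classify_nearest_neighbour(features, labelled_examples):
    """Assign the label of the closest labelled example (1-nearest neighbour)."""
    def distance(a, b):
        # Crude, assumed feature scaling so both features contribute comparably.
        return math.hypot(a[0] - b[0], (a[1] - b[1]) / 100.0)
    _, label = min(labelled_examples, key=lambda ex: distance(features, ex[0]))
    return label

print(classify_nearest_neighbour((0.75, 65.0), training_data))  # → glaucoma
```

A real classifier would be trained on thousands of labelled examples with many more features; the point here is only that the labels supplied by experts are what supervise the classification.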

Deep Learning 

Deep learning is the newest advance in machine learning and the latest evolution in artificial neural networks. Unlike earlier-generation algorithms, which were based on clustering and statistical theories, neural networks take inspiration from biology, using algorithms that mimic the human brain. Deep learning refers to algorithms that comprise more than three layers of neural networks and share many architectural and organisational similarities with the visual cortex. Like the visual cortex, a deep learning network comprises a series of artificial neurons, or nodes, each responding to one aspect of the input data, analogous to the receptive fields of cortical neurons. The three or more layers of deep learning are the input layer, the output layer, and one or more hidden layers between them. It is the lack of transparency around the data processing within these hidden layers that has led to the biggest criticism of deep learning: the ‘black box’ nature in which it is difficult to explain the reasoning behind the algorithm’s output classification of input data.
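The input/hidden/output structure described above can be made concrete with a minimal forward pass. The network below is untrained and its weights are arbitrary assumptions, shown only to illustrate how data flows through the layers:

```python
import math

def forward(inputs, hidden_weights, output_weights):
    """One forward pass: input layer -> hidden layer -> output layer."""
    # Hidden layer: each node computes a weighted sum of all inputs,
    # squashed by a sigmoid activation (loosely analogous to a neuron firing).
    hidden = [1.0 / (1.0 + math.exp(-sum(w * x for w, x in zip(ws, inputs))))
              for ws in hidden_weights]
    # Output layer: a weighted sum of the hidden activations,
    # again squashed into a probability-like value in (0, 1).
    score = sum(w * h for w, h in zip(output_weights, hidden))
    return 1.0 / (1.0 + math.exp(-score))

# Illustrative (untrained) weights: 3 inputs -> 2 hidden nodes -> 1 output.
hidden_weights = [[0.5, -0.2, 0.1], [-0.3, 0.8, 0.4]]
output_weights = [1.2, -0.7]
print(forward([0.6, 0.9, 0.2], hidden_weights, output_weights))
```

Training consists of adjusting those weights against labelled examples; the ‘black box’ criticism arises because, in a deep network with millions of such weights, no single weight has an interpretable clinical meaning.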


There are already numerous applications of AI in clinical medicine and ophthalmology. In ophthalmology, early work focused on retinal diseases such as diabetic retinopathy, as retinal photographs were particularly well suited to the strengths of machine learning in image processing. In 2018, the Food and Drug Administration in the United States approved the IDx-DR (Digital Diagnostics, USA) software, which used machine learning algorithms to automate the classification of retinal photographs based on the presence of more than moderate diabetic retinopathy.

AI technology has great potential to revolutionise glaucoma diagnosis and management. In the face of an increased workload with constraints on suitably qualified clinicians, expert systems using AI promise to make specialist level glaucoma care more readily and easily accessible to more people. Diagnosing glaucoma and monitoring progression relies heavily on imaging (optic disc appearance and optical coherence tomography [OCT] imaging) and visual field testing.

Visual Fields and Progression 

In 1994, Goldbaum et al1 were the first to publish the use of a two-layer neural network machine learning classifier to analyse visual fields. The algorithm classified visual fields as either glaucomatous or nonglaucomatous with a sensitivity of 65% and specificity of 72%, comparable to two glaucoma specialists.
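Sensitivity and specificity are simple ratios from a confusion matrix. The counts below are hypothetical, chosen only so the arithmetic reproduces the percentages reported by Goldbaum et al:

```python
def sensitivity_specificity(true_pos, false_neg, true_neg, false_pos):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity

# Hypothetical counts: of 100 glaucomatous fields, 65 were flagged (TP) and
# 35 missed (FN); of 100 normal fields, 72 were cleared (TN) and 28 flagged (FP).
sens, spec = sensitivity_specificity(65, 35, 72, 28)
print(sens, spec)  # → 0.65 0.72
```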

More recent studies have also shown that machine learning and deep learning classifiers can at least match, and in many cases outperform, glaucoma specialists in differentiating glaucomatous from nonglaucomatous visual fields.

Visual field progression is a major indicator that glaucoma treatment needs to be escalated. However, the variability introduced by the subjective nature of visual field testing makes detection of progression a challenge. Saeedi et al2 found that six commonly used indices for visual field progression (mean deviation slope, visual field index slope, pointwise linear regression, permutation of pointwise linear regression, Advanced Glaucoma Intervention Study, and Collaborative Initial Glaucoma Treatment Study) showed only limited agreement. Since the Goldbaum et al1 study in 1994, there have been numerous reports of the use of various machine learning algorithms to detect glaucomatous visual field progression. These algorithms have generally been able to detect visual field progression more reliably, and often earlier, than traditional methods. Wen et al3 reported on a deep learning network that was able to use a single 24-2 Humphrey visual field to predict the visual field appearance up to 5.5 years later.
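One of the traditional indices mentioned above, the mean deviation slope, is simply an ordinary least-squares fit of mean deviation against time. A minimal sketch, with hypothetical field values and a purely illustrative progression cut-off:

```python
def md_slope(years, mean_deviation_db):
    """Ordinary least-squares slope of mean deviation (dB) against time (years)."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(mean_deviation_db) / n
    num = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(years, mean_deviation_db))
    den = sum((x - mean_x) ** 2 for x in years)
    return num / den

# Hypothetical annual 24-2 mean deviation values for one eye.
years = [0, 1, 2, 3, 4]
md = [-2.0, -2.6, -3.1, -3.5, -4.2]
slope = md_slope(years, md)
print(round(slope, 2))  # → -0.53 (dB per year)
print(slope < -0.5)     # → True, against an illustrative cut-off
```

The test-retest variability of perimetry means a slope like this can take several visits to distinguish from noise, which is exactly the gap machine learning approaches aim to close.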


Glaucomatous structural damage of the optic disc can be assessed by colour fundus photographs or imaging devices. Early applications of machine learning used confocal scanning laser ophthalmoscopy and scanning laser polarimetry, which are no longer in common clinical use. OCT has now become the de facto standard for objective measurement owing to its excellent reproducibility in measuring retinal nerve fibre layer (RNFL) thickness.

Despite current limitations, there is no doubt that the progress and breakthroughs in AI promise to revolutionise the diagnosis and management of glaucoma

Newer machine learning algorithms and deep learning networks work particularly well with images but require very large training datasets. Ting et al4 used deep learning in a landmark study of around 500,000 colour fundus photographs, including 125,000 glaucoma images, and reported a high area under the receiver operating characteristic curve (AUC) of 0.942, with a sensitivity of 96% and specificity of 87% for referable glaucoma suspects. Subsequent studies have reported similar findings. Medeiros et al5 even reported using deep learning to predict the spectral-domain OCT average RNFL thickness from fundus photographs, with a mean error of less than 10 μm.
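The area under the receiver operating curve reported in studies like these has a useful rank interpretation: it is the probability that a randomly chosen diseased eye receives a higher classifier score than a randomly chosen healthy eye. A sketch with hypothetical classifier outputs:

```python
def auc(scores_diseased, scores_healthy):
    """Rank-based AUC: fraction of (diseased, healthy) pairs ranked correctly,
    counting ties as half. Equivalent to the Mann-Whitney U statistic."""
    wins = 0.0
    for d in scores_diseased:
        for h in scores_healthy:
            if d > h:
                wins += 1.0
            elif d == h:
                wins += 0.5
    return wins / (len(scores_diseased) * len(scores_healthy))

# Hypothetical classifier outputs (higher = more suspicious of glaucoma).
glaucoma_scores = [0.9, 0.8, 0.75, 0.6]
healthy_scores = [0.4, 0.3, 0.65, 0.2]
print(auc(glaucoma_scores, healthy_scores))  # → 0.9375
```

An AUC of 0.5 corresponds to guessing; the 0.942 reported by Ting et al4 indicates that their network ranks a glaucomatous image above a healthy one about 94% of the time.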

A number of studies have reported successful use of machine learning and artificial neural networks to differentiate not only glaucomatous from healthy eyes, but also early from advanced glaucoma. The biggest limitation in more automated, AI-guided classification of OCT images for glaucoma lies in the accuracy of OCT measurements and the potential for artefacts. Examples include the presence of blood vessel shadows and the position of the vascular arcades, both of which affect the measured thickness of the retinal nerve fibre layer.

At Stanford University, researchers including Dr Chang (co-author of this article) are working to solve some of the problems in AI for glaucoma. The Stanford Centre for Artificial Intelligence in Medicine and Imaging (AIMI) is developing, evaluating, and disseminating AI systems to benefit patients. Building on work in radiology, dermatology, and pathology, it is expanding into ophthalmology, including hosting public, de-identified, labelled datasets. Such public datasets are an essential source of training data for advancing the field. Dr Chang and colleagues have published on 3D deep learning algorithms for glaucoma using OCT (arxiv.org/abs/1910.06302), are using clinical information, including longitudinal photos and visual fields, to find glaucomatous optic nerve changes in high myopia, and intend to release a public dataset.

Two of the main problems slowing AI algorithm development for glaucoma are:

1) the lack of an established consensus standard as to what type/stage of glaucoma the algorithm should detect, and

2) the limited availability of sufficiently diverse, multimodal data for training and for establishing generalisability across many different situations, such as myopic tilted nerves.

This is because the term ‘glaucoma’ is a complicated label for a group of diseases with differing natural histories, although they are essentially all managed by lowering eye pressure or by observation. Many datasets come from tertiary hospital systems, where the referral pattern for glaucoma cases may be less representative than case identification in the general population. Given privacy regulations and the expected commercial value of health data, data sharing remains a challenging process. Thus, depending on the objective of the deep learning algorithm, such as automated test result classification versus an actual diagnostic predictive tool, the field will likely need aggregated datasets from many sites throughout the world, more than is accessible today.


Despite current limitations, there is no doubt that the progress and breakthroughs in AI promise to revolutionise the diagnosis and management of glaucoma. Newer deep learning algorithms are already performing at least as well as glaucoma specialists in differentiating glaucoma from normal in visual fields, fundus photographs, and OCT imaging. More widespread availability of such expertise will ultimately reduce the historical 50% rate of undiagnosed glaucoma in the community, without excessive burden on limited health resources.

Dr Jonathon Ng is Clinical Associate Professor at the University of Western Australia and currently serves on the Ophthalmology Committees of Glaucoma Australia and the Australian and New Zealand Glaucoma Society. He was a former member of the RANZCO Glaucoma Curriculum Standards Review Panel and the RANZCO Clinical Audit Working Group. Dr Ng is actively engaged in research at the Eye and Vision Epidemiology Research Group and is an investigator on research grants worth more than $2 million and author of 57 peer-reviewed papers. As a comprehensive ophthalmologist with subspecialty glaucoma training, Dr Ng consults with Perth Eye Surgeons in Midland and Geraldton and operates at the Perth Eye Hospital and St John of God Hospitals in Midland and Geraldton. 

Dr Robert Chang is an Associate Professor of Ophthalmology at the Stanford University Medical Centre and runs a busy glaucoma and cataract surgical practice with special expertise in minimally invasive glaucoma surgery and complex cataract removal. His current research interests include: high myopia and glaucoma, glaucoma biobanking, virtual reality visual fields and remote IOP checks, improving glaucoma therapy compliance, validating eye care delivery systems leveraging mobile devices, developing computer vision machine learning algorithms, and aggregating big data with differential privacy features. 


  1. Goldbaum MH, Sample PA, White H, Colt B, Raphaelian P, Fechtner RD, et al. Interpretation of automated perimetry for glaucoma by neural network. Invest Ophthalmol Vis Sci. 1994;35:3362–73. 
  2. Saeedi OJ, Elze T, D’Acunto L, Swamy R, Hegde V, Gupta S, et al. Agreement and predictors of discordance of six visual field progression algorithms. Ophthalmology. 2019;126:822–8. 
  3. Wen JC, Lee CS, Keane PA, et al. Forecasting future Humphrey visual fields using deep learning. PLoS One. 2019;14:e0214875. 
  4. Ting DSW, Cheung CY-L, Lim G, et al. Development and validation of a deep learning system for diabetic retinopathy and related eye diseases using retinal images from multiethnic populations with diabetes. JAMA. 2017;318:2211–23. 
  5. Medeiros FA, Jammal AA, Thompson AC. From machine to machine: an OCT-trained deep learning algorithm for objective quantification of glaucomatous damage in fundus photographs. Ophthalmology. 2019;126:513–21.