
Hey, AI! Think You’re so Smart. Where’s Waldo?

Lorry Weaver, MT(ASCP), CLS(NCA)

Medical devices and In-vitro Diagnostics (IVDs) are rapidly adopting Artificial Intelligence (AI) and Machine Learning (ML) to recognize anomalies in data quickly and accurately. Regulators are tasked with assuring safety and efficacy, yet manufacturers often do not fully understand how their AI really works, certainly not to a level they can clearly document. So how do manufacturers satisfy regulators in assuring the safety and efficacy of AI/ML software? Transparency. A big focus for regulators has always been transparency: what the device does and how it works. That has not changed with AI/ML algorithms, which are difficult to describe but essential to navigate in a regulated environment. Moreover, transparency is required for trust, and trust is needed for adoption.

Let’s illustrate the nature of the challenge with Waldo.

Can you describe how you know where Waldo is? Oh, he always has a striped shirt (vertical, horizontal, broad, narrow, etc.). Yes, and he always wears glasses (horned, pilot, sun, etc.). Oh yeah, and he always looks like a doofus. What does a doofus look like? You'll know it when you see it.

 

What about Wilma (kind of a Waldette)?

What if there are a few Wilmas in the crowd with various hairdos? Or perhaps Waldo is in an environment that sort of masks him?

Hmm. It seems simple to us, but it is difficult to really explain what is going on.

An example of an identification algorithm that is easier to explain comes from a previous life when I designed motion detectors: passive infrared (with a Fresnel lens) paired with an active microwave transceiver (Doppler). They needed to be low cost (very high volumes) but not cause false alarms, the bane of the industry! The infrared detector was great at detecting heat signals moving across the field of view, while the microwave was excellent at detecting mass moving towards the detector. The figure below shows the two patterns separated, but they would actually emanate from a single sensor, and hence the patterns overlap.

These technologies were what we call orthogonal, and their voting scheme helped us determine whether there was truly an intruder (there are lots of false alarm sources for each technology, but far fewer when the two are combined in one sensor). The task then became: can you tell a human from a pet? The level of information these two technologies provided was very coarse. I tried to explain to the CEO, via a picture grid, that the information I was working with to distinguish a pet from a human put me somewhere between cases B and C below.
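To make the voting idea concrete, here is a minimal Python sketch of a two-technology AND vote. The function name and the boolean interface are my own invention for illustration, not the actual detector firmware:

```python
# Toy sketch of a two-technology "voting" alarm, assuming simple
# boolean detections from each orthogonal sensor (illustrative only).

def intruder_alarm(pir_detected: bool, microwave_detected: bool) -> bool:
    """Trigger only when BOTH orthogonal technologies agree.

    Each technology alone has many false-alarm sources (heat drafts
    for PIR, moving curtains for Doppler microwave), but those sources
    rarely overlap, so requiring agreement suppresses false alarms.
    """
    return pir_detected and microwave_detected

# A draft of warm air fools the PIR but not the microwave: no alarm.
print(intruder_alarm(True, False))   # False
# A person crossing the field trips both sensors: alarm.
print(intruder_alarm(True, True))    # True
```

The real product's voting was far richer (weighting factors, frequencies, lens patterns), but the principle is the same: independent failure modes combined to raise confidence.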

 

Was the animal a cat that jumped up on furniture near the sensor, or a small woman walking 20 feet from the sensor? Was it moving towards the sensor, across the zone field, etc.? You can imagine the permutations! Working with various lenses, microwave frequencies, weighting factors, voting schemes, etc., we could improve the probabilities and confidence levels, but that was it.

Fortunately, we have many high-resolution imaging technologies (MRI, ultrasound, CT, etc.). Image analysis (along with many other diagnostics) is now starting to employ Machine Learning (ML) to aid with analysis. Like all signal processing, ML depends on the signal-to-noise ratio (SNR). Think of having a conversation in a large, crowded room with lots of people talking loudly, background music, and HVAC. The conversation is the signal; the rest is noise. (Unless you happen to like the song, in which case the song becomes the signal and the conversation becomes part of the noise.) All signal (image) processing deals with signal to noise (image artifacts, EMC noise, circuit noise, quantization noise, etc.). On top of the noise is the difficulty of trying to figure out (recognize) what is truly characteristic of the target you are looking for.
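As a toy illustration of the signal-to-noise idea (nothing specific to medical imaging), SNR in decibels can be computed from signal and noise power; the crowded-room numbers below are invented:

```python
import math

def snr_db(signal_power: float, noise_power: float) -> float:
    """SNR (dB) = 10 * log10(P_signal / P_noise)."""
    return 10 * math.log10(signal_power / noise_power)

# The "conversation" at power 1.0 against crowd/HVAC noise at 0.01:
print(round(snr_db(1.0, 0.01), 1))  # 20.0 dB: easy to follow
# Same conversation when the room noise swamps it:
print(round(snr_db(1.0, 2.0), 1))   # -3.0 dB: hard to follow
```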

A very common way to describe the capability of a given diagnostic is to use terms like sensitivity and specificity. 

Let's apply our Where's Waldo example. Sensitivity is how often we identify Waldo (true positive) when he is truly in the mix, versus how often we miss him (false negative) when he is truly in the mix. Specificity, alternatively, is how often we correctly conclude Waldo is absent (true negative) when he truly isn't there, versus how often we say he is present (false positive) when he really isn't. (COVID testing is also characterized this way.) Accuracy is how closely our results match reality (true positives and true negatives over all cases: beach scenes, ski scenes, city scenes, stadium scenes, etc.). With Waldo, we use our very sophisticated eyes and brains to analyze the scene. It is amazing how early in life we can find him. Sometimes youngsters have false negatives and false positives, but not for long. Most of the time they somehow develop their skills and, Eureka, there he is!
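These definitions can be sketched in a few lines of Python; the Waldo counts below are made up purely for illustration:

```python
# Standard diagnostic metrics from confusion-matrix counts,
# phrased in Where's Waldo terms (counts are hypothetical).

def sensitivity(tp: int, fn: int) -> float:
    """Of the scenes where Waldo truly is present, how often we find him."""
    return tp / (tp + fn)

def specificity(tn: int, fp: int) -> float:
    """Of the scenes where Waldo truly is absent, how often we say so."""
    return tn / (tn + fp)

def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Correct calls over all scenes: beach, ski, city, stadium..."""
    return (tp + tn) / (tp + tn + fp + fn)

# 90 finds, 10 misses, 80 correct rejections, 20 false sightings:
print(sensitivity(90, 10))        # 0.9
print(specificity(80, 20))        # 0.8
print(accuracy(90, 80, 20, 10))   # 0.85
```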

Machine learning (ML) has been around for some time. The motion detector mentioned above employed fuzzy logic to estimate whether it was sensing a true intruder, using rules derived from research rather than a simple binary yes/no. These kinds of algorithms are explainable.


Artificial Intelligence is the general term that includes Machine Learning. Machine learning utilizes algorithms that parse data, learn from that data, and then apply what they’ve learned to make informed decisions. 

 

 

These algorithms are trained on data sets, and the resulting coefficients, weights, etc. are programmed in so the algorithm can process future data sets. They are trained during development and then frozen prior to release. Such algorithms are often explainable, like the motion detector example.

Deep learning is a subfield of machine learning that structures algorithms in layers to create an artificial neural network that can learn and make intelligent decisions on its own (like our two-year-old searching for Waldo).

Exposing these networks to the problem space and to what is "true" causes the internal connections to hone themselves (some become stronger, some become weaker or inhibited, etc.). This is not unlike how the two-year-old recognizes Waldo. We don't normally explain to the two-year-old why Waldo is who he is (the shape of his face, eye spacing, clothes, etc.). Instead, we point to him, and their internal networks adjust. This deep learning is often hard to explain, unlike a rules-based algorithm. This is the process behind recognizing: to "recognize" means to identify something or someone having encountered it before, to know again. (Did you know dogs have an area in their brain dedicated to recognizing faces?) The networks themselves adjust based on their exposure to true/not-true data.
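A hypothetical, minimal perceptron can illustrate "pointing at Waldo" instead of explaining him: we supply only true/not-true labels and let the weights adjust themselves. The features, examples, and training loop below are invented for this sketch and are vastly simpler than any real deep network:

```python
# Minimal perceptron: weights strengthen/weaken only from labeled
# exposure, never from an explicit rule (all data here is invented).

def train(examples, labels, epochs=20, lr=0.1):
    """Per-example update: nudge weights whenever a prediction is wrong."""
    n = len(examples[0])
    w, b = [0.0] * n, 0.0
    for _ in range(epochs):
        for x, y in zip(examples, labels):
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
            err = y - pred                       # 0 when already correct
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
            b += lr * err
    return w, b

# Features: [striped_shirt, glasses, bobble_hat]. We just "point":
X = [[1, 1, 1], [1, 1, 0], [0, 1, 0], [0, 0, 1], [0, 0, 0]]
y = [1, 1, 0, 0, 0]                      # true / not-true exposure
w, b = train(X, y)
pred = lambda x: 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
print([pred(x) for x in X])              # [1, 1, 0, 0, 0]
```

A single perceptron like this is still explainable (you can read the weights); stack many layers of them and that readability is what gets lost.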

Deep learning is often treated as a black box, the term used for not exposing the internal mechanisms from input to output, and regulators are not prone to accepting the black box. Moreover, algorithms are often considered proprietary, so how do we make them transparent without losing our intellectual property (IP)?

Deep learning networks are "trained" on data sets, and the network can either be frozen prior to release or continue to learn on new data during its software lifecycle, post-release. If the network is trained during development and then prevented from adapting after release, it is termed "locked". If it is allowed to keep adjusting itself with each new data set, it is termed "continuously learning". (A good example of continuously learning AI is what Google and Amazon are doing with your mouse clicks and keystrokes.)
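A toy sketch of the locked vs. continuously learning distinction, using an invented threshold "model" (nothing like a real diagnostic algorithm):

```python
# Hypothetical sketch: the same model deployed "locked" versus
# "continuously learning". All names and values are illustrative.

class ThresholdModel:
    def __init__(self, threshold: float, locked: bool):
        self.threshold = threshold
        self.locked = locked
        self.seen = []

    def predict(self, value: float) -> bool:
        return value > self.threshold

    def observe(self, value: float):
        """Post-release field data: a locked model ignores it; a
        continuously learning model re-derives its threshold."""
        if self.locked:
            return
        self.seen.append(value)
        self.threshold = sum(self.seen) / len(self.seen)

locked = ThresholdModel(threshold=0.5, locked=True)
learner = ThresholdModel(threshold=0.5, locked=False)
for v in [0.9, 0.8, 1.0]:               # new data after release
    locked.observe(v)
    learner.observe(v)
print(locked.threshold)                  # 0.5 (frozen at release)
print(round(learner.threshold, 2))       # 0.9 (drifted with the data)
```

The regulatory question is exactly this drift: the locked model's validated performance still applies in the field, while the learner's behavior has moved away from what was tested.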

Both ML and AI are finding their place in medical devices, including image analysis, in-vitro diagnostics, physiological signal monitoring (mobile platforms), etc. AI/ML is being implemented both in software in a medical device (SiMD) and in software as a medical device (SaMD). Regulators are highly motivated to facilitate these new applications reaching the market, but they are even more highly motivated to assure safety and effectiveness. There is the rub! Advance technology to help people, yet not harm them.

The FDA recently held a seminar on transparency in AI [https://www.fda.gov/medical-devices/workshops-conferences-medical-devices/virtual-public-workshop-transparency-artificial-intelligencemachine-learning-enabled-medical-devices]. Think about it: your doctor gives you the big "C" diagnosis based on some image or diagnostic test you have had, and your world has changed. What, how, why? Does the doctor understand? Well, the diagnostic test says so; we will run some more tests to confirm. What is the nature of this test? What does it base its decisions on? Well, it's a new diagnostic software tool, it uses Artificial Intelligence, and it says you have a malignancy in a vital organ. How does it work, doc? What is its accuracy, sensitivity, and specificity?

A big issue with transparency is trust. How do doctors, patients, insurance providers, and hospital technology purchasing departments trust that the algorithm has come up with the correct result (sensitivity and specificity) for their intended use of this technology? These metrics are all highly dependent on how the algorithms work and on how much representative data they were trained on. Think: if you only knew of beach scenes with Waldo and were then exposed to complicated city scenes with many non-Waldo look-alikes in a whole new environment.

Regulators are working to develop frameworks for assessing AI-based medical devices. There is a lot of collaborative discussion going on between regulators, physicians, industry, academia, and patients. AI/ML is implemented in software and is hence beholden to software standards (e.g. IEC 62304) and guidance. Make no mistake: however cool, seductive, innovative, and cutting-edge a technology is, it still requires good, solid engineering methodologies. Technologies tend to get hyped up, and we all want to believe in new breakthroughs.

Regulators are prone to probe into fundamental specifications and rigorous verifications that support subsequent validations. Validations need statistically significant data sets representing the diverse patient populations, platforms, and users. What are your user needs (intended use, users, use environment)? What are your specifications? How are errors handled? How did you verify against your specifications (known inputs such as simulated data, phantoms)? What data did you train with versus test with? AI/ML relies on lots of data from lots of sources (age, race, gender, size, shape, medical condition) in order to meet the needs of the general public. Regulators need to understand both how it works and what data was used to achieve the claimed results.

Besides general software development processes (i.e. IEC 62304), risk management (i.e. ISO 14971), and usability (i.e. IEC 62366), some slightly modified risk classifications (IMDRF) are being presented. For AI/ML, the same risk model as Software as a Medical Device (SaMD) has been proposed.

 

What is the state of the condition and what is the AI trying to do: diagnose/treat, drive or merely inform?

Regulators are proposing that manufacturers create SaMD Pre-Specifications (SPS). An SPS contains anticipated modifications to performance or input, or changes related to the intended use. Also proposed are Algorithm Change Protocols (ACP): the methods manufacturers have in place to achieve, and appropriately control the risks of, the anticipated types of modifications delineated in the SPS.

Action plans are in process to harmonize globally how to regulate AI/ML. Regulators are working collaboratively with industry, academics, physicians, and patients on how to support transparency, address algorithm bias, and promote algorithm robustness and real-world performance. Pre-market processes such as pre-submission and pre-certification are being refined. The FDA has recently released a list of approved AI/ML-enabled devices to provide public awareness of what is currently on the market across medical disciplines: https://www.fda.gov/medical-devices/software-medical-device-samd/artificial-intelligence-and-machine-learning-aiml-enabled-medical-devices. Qserve has a team of global experts (regulatory, quality, technical advisors) to assist in the grey area of getting safe and effective AI/ML devices on the global market.

One of the greatest potential benefits of ML resides in its ability to create new and important insights from the vast amount of data generated during the delivery of health care every day.

 

The subtle nuances AI/ML is working on far exceed searching for Waldo. Personalized medicine, mobile platforms, IVDs, etc. are all beginning to incorporate AI/ML. Transparency, good engineering practices, and wise regulatory processes will be key to facilitating trust, safety, efficacy, and device adoption. Qserve is here to help with navigating this evolving and maturing area.

 
