1. Introduction
Anterior eye slit-lamp findings are commonly graded using illustrative grading scales, which should offer high discriminability and reliability and be simple to administer [1,2]. When grading scales for the eye first gained popularity in the 1990s, wordy descriptions were used [3]. Verbal descriptive scales can be overly subjective and arbitrary when determining selected levels of severity [4]. What one clinician finds to be unusual, another clinician may find to be typical, depending on the practice setting. Numeric scales are commonly used to indicate a stage of development, progression and regression as the clinician observes, evaluates, and then assigns a numeric grade. This grade then serves as a standard and baseline against which future assessments are compared. Such grades can be used to record deviations from normality and as a factor contributing to treatment decisions [5]. Numeric scales can be created using standard reference photographs, artistic renderings or computer-generated images and can be combined with qualitative terminology for ease of use [6]. Visual scales may afford more consistency among clinicians and, when combined with a fine incremental scale, can be very sensitive in detecting clinical change [1,5]. Visual scales are now commonplace in the assessment of the anterior eye. Their use is considered best practice to reduce inconsistencies between examiners and to encourage a more uniform grading approach [7]. Common examples at present are the Brien Holden Vision Institute scale and the Efron Grading Scale [8,9].
Wolffsohn et al. found that 84.5 % of respondents to a worldwide survey regarding the examination of the anterior eye (680 out of 809 practitioners) regularly used grading scales in practice [3]. The illustrations in such scales can be derived from real photographs of patients’ eyes or can be artistic renderings [10-12]. These scales allow the clinician to directly classify and compare a condition based on the referenced levels of severity depicted by the visual images in the scales [4], improving standardization and reducing subjectivity [13]. Construction of photographic scales is reliant upon images being available in the needed breadth of severity that is reflective of the population. Naturally, magnification, lighting, and other photographic conditions need to be standardized [2]. Optimal scales should have adequate grading precision, reliability, and inter- and intra-observer agreement.
At present, the most commonly used grading scale for lid wiper epitheliopathy (LWE) does not include visual representations [14,15]. Instead, the grading protocol proposed by Korb [16,17] requires observers to visualize and mentally measure the curvilinear width and length of staining on the everted eyelid and then average these measurements to derive a severity assessment. Yamamoto et al. used a modified Korb scale in conjunction with four representative images when grading the presence of LWE in the upper and lower eyelid margins [18]. Kunnen et al. compared the subjective assessment of LWE using Korb’s protocol with a semi-automated image analysis system and found that observers overestimated the height and underestimated the width of LWE staining [19]. To overcome the challenges associated with grading LWE staining, the present work aims to develop and validate a novel photographic LWE grading scale (referred to as the PLWE) and to assess graders’ preference between the Korb and the PLWE grading scales.
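The averaging step in the Korb protocol described above can be illustrated with a minimal sketch. The text states only that length and width assessments are averaged; the numeric cut-offs below are an assumption based on values commonly attributed to the Korb protocol and should be checked against refs. 16 and 17 before any use. The function names are hypothetical.

```python
from bisect import bisect_right

# ASSUMED cut-offs (not stated in this text; verify against refs. 16 and 17).
LENGTH_CUTS_MM = (2.0, 5.0, 10.0)    # horizontal staining length, in mm
WIDTH_CUTS_PCT = (25.0, 50.0, 75.0)  # staining width, as % of lid wiper width


def sub_grade(value: float, cuts: tuple) -> int:
    """Return 0-3: the number of cut-offs that `value` reaches or exceeds."""
    return bisect_right(cuts, value)


def korb_style_grade(length_mm: float, width_pct: float) -> float:
    """Average the length and width sub-grades into one severity grade."""
    length_grade = sub_grade(length_mm, LENGTH_CUTS_MM)
    width_grade = sub_grade(width_pct, WIDTH_CUTS_PCT)
    return (length_grade + width_grade) / 2


# Hypothetical eye: 3 mm of staining covering 60 % of the lid wiper width.
print(korb_style_grade(3.0, 60.0))
```

The sketch makes explicit that the observer must estimate two separate quantities before arriving at a grade, which is the mental step a photographic scale is intended to remove.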
4. Discussion
This study aimed to develop and validate a new grading scale to support recording of LWE staining in clinical practice. The present study used digital image analysis in the selection of images and proposed a scale with a linear progression of LWE. All of the images included in this study were sourced from the same bank of images, taken by the same camera/instrumentation to prevent issues with magnification, lighting and image quality. In selecting images for the PLWE scale, the goal was to identify eyes in which the severity of LWE differed while the other image characteristics remained as consistent as possible.
The agreement of the PLWE was compared against the commonly used Korb scale for the subjective evaluation of LWE [15]. Grading precision and reliability indicated that the novel PLWE scale was comparable to the Korb scale. The reliability (i.e. standard deviation of the discrepancy) was found to be >0.1 for both scales (see Table 2). Based on this, the authors suggest that clinicians should use a 0.5 grading step to monitor change when using the PLWE. Clinicians should also be instructed to extrapolate their grading estimates beyond the illustrated limit of the PLWE scale if LWE appears to be greater than grade 3.0. For example, if a patient presents with a continuous band of LWE encompassing the whole length and width of the lid wiper area, the grade should be recorded as 3.5. Similarly, Vianya-Estopa et al. [26] recently suggested that a 0.5 grading step is adequately precise in evaluating hyperemia in the anterior eye using a visual grading scale, and previous work has also graded LWE using 0.5 steps [21]. It is noteworthy that even when previous studies have encouraged graders to use 0.1 steps, there was a noticeable aggregation of scoring around whole-integer or half-integer steps [2]. Kunnen et al. noted that objective analysis was more accurate than human observers when grading LWE; however, objective analysis is more costly, and clinicians still rely on grading scales in contact lens practice [27].
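The 0.5 grading step and the extrapolation to 3.5 described above could be applied when recording grades electronically, as in the following sketch. The function name and the decision to cap recorded values at 3.5 are assumptions made for illustration; the text specifies only the 0.5 step and the 3.5 example.

```python
import math


def snap_to_step(estimate: float, step: float = 0.5,
                 lowest: float = 0.0, highest: float = 3.5) -> float:
    """Round a grading estimate to the nearest step (halves round up),
    then clamp it to the assumed recordable range [lowest, highest]."""
    snapped = math.floor(estimate / step + 0.5) * step
    return min(max(snapped, lowest), highest)


print(snap_to_step(2.3))  # nearest 0.5 step
print(snap_to_step(3.8))  # beyond the assumed cap, recorded at the cap
```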
Previous studies have not reported intra-observer or inter-observer LWE reliability data. The grading reliability reported in this study (Table 2) is similar to that reported by Efron et al. [2] for grading conjunctival redness using the Brien Holden Vision Institute scale. Moreover, the COR values reported in this study also appear to be in line with the 95 % confidence limits reported by Efron et al. for corneal staining, conjunctival redness and papillary conjunctivitis (±1.2 grading scale units when using 0.1 increments) in a group of inexperienced graders [2]. In contrast, Huntjens et al. reported a slightly lower COR of 0.78 for palpebral hyperemia for an experienced grader [8]. The COR indicates the size of the change or difference in severity that can be considered statistically significant and is dependent on clinician experience and/or training with the grading system. Typically, when using anterior eye grading scales, a change of 1.0 grading scale unit represents a clinically significant change [11]. In this case, when using either of the LWE scales, a change of 1.0 should be taken to be both clinically and statistically significant. Additionally, it is worthwhile to note that the Korb scale uses 1-point increments, whereas the PLWE allows for 0.5-point increments. As such, a slightly larger COR is expected with the Korb scale than with the PLWE.
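To make the role of the COR concrete, the sketch below computes it from repeated grades, assuming that COR denotes the coefficient of repeatability and taking one common definition, 1.96 times the standard deviation of the test-retest differences. This section does not state the formula the study used, so the definition, the function names, and the example grades are all assumptions for illustration.

```python
from statistics import stdev


def coefficient_of_repeatability(first_session, second_session):
    """COR under the assumed definition: 1.96 x SD of paired differences."""
    differences = [a - b for a, b in zip(first_session, second_session)]
    return 1.96 * stdev(differences)


def exceeds_cor(change: float, cor: float) -> bool:
    """True if an observed change in grade is larger than the COR."""
    return abs(change) > cor


# Hypothetical grades given to the same four images on two occasions.
session_1 = [1.0, 2.0, 3.0, 1.5]
session_2 = [1.5, 2.0, 2.5, 1.5]
cor = coefficient_of_repeatability(session_1, session_2)
print(round(cor, 2), exceeds_cor(1.0, cor))
```

Under this definition, a grader whose repeated scores differ more widely has a larger COR and therefore needs to see a larger change before treating it as real.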
A primary goal of the present study was to determine any difference between the Korb and the novel PLWE scale. Unsurprisingly, the greatest agreement between the scales was shown at the extremes (around scores of 0.0 and 3.0, as shown in the Bland and Altman plots). This suggests that graders accurately assess a lack of LWE (with only the line of Marx picking up lissamine green staining) and a large extent of LWE. Disparity in grading scores was noted mostly near the middle of both scales, where graders underestimated the amount of LWE when using the Korb scale as compared to the PLWE scale. Clinicians should therefore be aware that grades are not interchangeable between the PLWE and Korb scales. In support of these findings, Kunnen et al. noted that observers tended to overestimate the height and underestimate the width of LWE staining. Because the lid wiper region is not well defined, it is difficult for human observers to judge the stained region as a proportion of the total lid wiper region [19]. In particular, the Korb grading protocol requires the grader to mentally measure the length and width of the curvilinear ocular anatomy (whilst ignoring the line of Marx), which can pose a challenge and can explain differences when comparing the two methods. In contrast, the PLWE scale shows real ocular images, in which the line of Marx is present in all severity grades, including grade 0. As such, the line of Marx does not have to be mentally subtracted and the images can be assessed in their natural state. The use of the Korb protocol showed that clinicians underestimated LWE compared to PLWE, particularly for greater levels of severity. This might be partly explained by the fact that graders were encouraged to use a 0.5 step when using the PLWE method, which included extrapolation to 3.5.
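The Bland and Altman comparison referred to above rests on two quantities: the mean difference between the scales (bias) and the 95 % limits of agreement around it. A minimal sketch follows, using the standard bias ± 1.96 SD form; the grades are invented for illustration and are not the study's data.

```python
from statistics import mean, stdev


def bland_altman(scale_a, scale_b):
    """Return (bias, lower limit, upper limit) for paired grades."""
    differences = [a - b for a, b in zip(scale_a, scale_b)]
    bias = mean(differences)
    half_width = 1.96 * stdev(differences)
    return bias, bias - half_width, bias + half_width


# Hypothetical paired grades for four images: agreement at the extremes,
# lower Korb grades in the middle of the range.
korb_grades = [0.0, 1.0, 1.5, 3.0]
plwe_grades = [0.0, 1.5, 2.5, 3.0]
bias, lower, upper = bland_altman(korb_grades, plwe_grades)
print(round(bias, 3), round(lower, 3), round(upper, 3))
```

A negative bias in this arrangement would indicate that Korb grades tend to fall below PLWE grades for the same eye, which is the pattern of underestimation described in the text.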
The format of the PLWE is similar to other anterior segment visual grading scales, and the graders in this study showed a clear preference for the PLWE over the Korb scale. The presence of LWE has been associated with dry eye symptomatology [28], and the addition of this new visual scale might facilitate the examination of LWE in clinical practice. The linear design of the PLWE affords an appropriate level of discrimination for the clinician to determine LWE severity. Visual scales like this one are simple to administer and require little to no instruction. Since the average time to report the entirety of anterior eye health has been reported as 6.8 ± 5.7 min, and only a few seconds are reportedly required for precise grading of complications, efficiency and ease of use are key [3,29]. Grading scales remain the most widely used tool in clinical practice to gauge findings and change over time [4]. Delaveris et al. proposed an alternative to Korb with a simplified visual grading for LWE, but their work lacked a full validation analysis [21]. In line with the present work, Delaveris et al. also suggested that a simplified scoring process using four images might be sufficient to assess LWE when using lissamine green vital dye [21].
Limitations of the present study include the use of images rather than viewing an eye on the slit lamp and potential grader fatigue. However, these should not compromise the validity of the findings, as similar limitations have also been highlighted by Efron et al. in the validation of grading scales for contact lens complications [2]. As mentioned previously, differences in clinicians’ experience and/or familiarity with clinical grading systems might affect grading reliability [2,26]. To overcome this limitation, the present study included a balanced number of graders with different levels of clinical experience. Despite this attempt, all of them reported that they did not routinely assess the lid wiper, and for this reason they might be considered ‘inexperienced’ in LWE assessment. Future work should find ways to help novice and experienced clinicians establish their grading reliability, as this is invaluable in clinical decision making and in sharing clinical data with colleagues. Inexperienced graders should familiarize themselves with the line of Marx and LWE staining patterns as much as possible prior to implementing the PLWE to provide the greatest level of consistency. LWE has also been reported to have varied clinical presentations [30]. The PLWE images chosen had continuous patterns. Non-continuous patterns could be more difficult to judge, as they would have to be mentally added together prior to matching to the visual scale. Contact lens wearers and dry eye patients were neither specifically enrolled nor excluded in the photographic process of image collection. A separate investigation using the PLWE in these patient groups would be of interest to determine the scale’s utility. Lower-LWE was not assessed in the present study, and the PLWE is primarily intended to be used to assess the severity of upper-LWE.
In conclusion, this study proposed and validated a novel PLWE. The scale was found to be a repeatable and reliable method to assess the severity of LWE when compared to the existing Korb protocol. In addition, graders with a range of clinical experience showed a strong preference for the use of this photographic scale when evaluating LWE staining.
Article info
Publication history
Published online: October 25, 2022
Accepted: October 18, 2022
Received in revised form: October 3, 2022
Received: June 16, 2022
Publication stage
In Press Corrected Proof
Copyright
© 2022 The Author(s). Published by Elsevier Ltd on behalf of British Contact Lens Association.